rcu: Prevent rcu_barrier() from starting needless grace periods

Currently rcu_barrier() uses call_rcu() to enqueue new callbacks
on each CPU with a non-empty callback list.  This works, but means
that rcu_barrier() forces grace periods that are not otherwise needed.
The key point is that rcu_barrier() never needs to wait for a grace
period, but instead only for all pre-existing callbacks to be invoked.
This means that rcu_barrier()'s new callbacks should be placed in
the callback-list segment containing the last pre-existing callback.

This commit makes this change using the new rcu_segcblist_entrain()
function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index e3facb3..91dc089 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -742,6 +742,7 @@ TRACE_EVENT(rcu_torture_read,
  *	"OnlineQ": _rcu_barrier() found online CPU with callbacks.
  *	"OnlineNQ": _rcu_barrier() found online CPU, no callbacks.
  *	"IRQ": An rcu_barrier_callback() callback posted on remote CPU.
+ *	"IRQNQ": An rcu_barrier_callback() callback found no callbacks.
  *	"CB": An rcu_barrier_callback() invoked a callback, not the last.
  *	"LastCB": An rcu_barrier_callback() invoked the last callback.
  *	"Inc2": _rcu_barrier() piggyback check counter incremented.