perf: Fix forgotten preempt_enable by nested writers

A writer that gets a reference to the buffer handle disables
preemption. When we put that reference, we check whether we are
the outermost writer; if not, we simply return and defer the
head update to the outermost writer. The problem is that
preemption is then only re-enabled by the outermost writer,
which produces a preemption count imbalance for every nested
writer that exits.

So always re-enable preemption when we put the buffer
reference, whether we are a nested or the outermost writer.

This fixes lots of "sleeping in atomic" warnings, visible when
recording lock events.
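
As a rough userspace model of the imbalance (not the kernel code:
preempt_count, nest, get_handle(), put_handle() and
nested_write_leak() are hypothetical stand-ins, and the assumption
is that getting the handle both disables preemption and bumps the
nesting count), a nested put only balances the count if it reaches
the final preempt_enable():

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the preempt count and data->nest. */
static int preempt_count;
static int nest;

static void get_handle(void)
{
	preempt_count++;	/* models preempt_disable() */
	nest++;			/* assumed: nesting count bumped on get */
}

static void put_handle(bool fixed)
{
	if (--nest != 0) {	/* models !local_dec_and_test(&data->nest) */
		if (!fixed)
			return;	/* old code: preemption stays disabled */
		goto out;	/* new code: skip only the head update */
	}

	/* outermost writer: head update and wakeup would happen here */
out:
	preempt_count--;	/* models preempt_enable() */
}

/* Returns the preempt count left over after one nested write. */
static int nested_write_leak(bool fixed)
{
	preempt_count = 0;
	nest = 0;

	get_handle();		/* outer writer */
	get_handle();		/* nested writer */
	put_handle(fixed);	/* nested put */
	put_handle(fixed);	/* outer put */

	assert(nest == 0);	/* every reference has been dropped */
	return preempt_count;
}
```

In this model, nested_write_leak(false) leaves one unbalanced
preempt_disable() per nested writer, while nested_write_leak(true)
returns to zero.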

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 2a060be..45b7aec 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2933,7 +2933,7 @@
 	 */
 
 	if (!local_dec_and_test(&data->nest))
-		return;
+		goto out;
 
 	/*
 	 * Publish the known good head. Rely on the full barrier implied
@@ -2954,6 +2954,7 @@
 	if (handle->wakeup != local_read(&data->wakeup))
 		perf_output_wakeup(handle);
 
+ out:
 	preempt_enable();
 }