[SPARC64] mm: context switch ptlock

sparc64 is unique among architectures in taking the page_table_lock in
its context switch (well, cris does too, but erroneously, and it's not
yet SMP anyway).

This seems to be a private affair between switch_mm and activate_mm,
using page_table_lock as a per-mm lock, without any relation to its uses
elsewhere.  That's fine, but comment it as such; and unlock sooner in
switch_mm, more like in activate_mm (preemption is disabled here).

There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
have liked to rely on the page_table_lock, in switch_mm and elsewhere;
but its comment explains how dup_mmap's flush_tlb_mm defeated it.  And
though that could have been changed at any time over the past few years,
now the chance vanishes as we push the page_table_lock downwards, and
perhaps split it per page table page.  Just delete that block of code.

Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
in kernel/fork.c copy_mm.  Textual analysis (supported by Nick Piggin)
suggests that the comment was written by DaveM, and that it relates to
the defeated approach in the sparc64 smp_flush_tlb_pending.  Just delete
this block too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/arch/sparc64/kernel/smp.c b/arch/sparc64/kernel/smp.c
index b137fd6..a9089e2 100644
--- a/arch/sparc64/kernel/smp.c
+++ b/arch/sparc64/kernel/smp.c
@@ -883,34 +883,13 @@
 	u32 ctx = CTX_HWBITS(mm->context);
 	int cpu = get_cpu();
 
-	if (mm == current->active_mm && atomic_read(&mm->mm_users) == 1) {
+	if (mm == current->active_mm && atomic_read(&mm->mm_users) == 1)
 		mm->cpu_vm_mask = cpumask_of_cpu(cpu);
-		goto local_flush_and_out;
-	} else {
-		/* This optimization is not valid.  Normally
-		 * we will be holding the page_table_lock, but
-		 * there is an exception which is copy_page_range()
-		 * when forking.  The lock is held during the individual
-		 * page table updates in the parent, but not at the
-		 * top level, which is where we are invoked.
-		 */
-		if (0) {
-			cpumask_t this_cpu_mask = cpumask_of_cpu(cpu);
+	else
+		smp_cross_call_masked(&xcall_flush_tlb_pending,
+				      ctx, nr, (unsigned long) vaddrs,
+				      mm->cpu_vm_mask);
 
-			/* By virtue of running under the mm->page_table_lock,
-			 * and mmu_context.h:switch_mm doing the same, the
-			 * following operation is safe.
-			 */
-			if (cpus_equal(mm->cpu_vm_mask, this_cpu_mask))
-				goto local_flush_and_out;
-		}
-	}
-
-	smp_cross_call_masked(&xcall_flush_tlb_pending,
-			      ctx, nr, (unsigned long) vaddrs,
-			      mm->cpu_vm_mask);
-
-local_flush_and_out:
 	__flush_tlb_pending(ctx, nr, vaddrs);
 
 	put_cpu();