[PATCH] can_share_swap_page: use page_mapcount
Remember that ironic get_user_pages race? when the raised page_count on a
page swapped out led do_wp_page to decide that it had to copy on write, so
substituted a different page into userspace. 2.6.7 onwards have Andrea's
solution, where try_to_unmap_one backs out if it finds page_count raised.
Which works, but is unsatisfying (rmap.c has no other page_count heuristics),
and was found a few months ago to hang an intensive page migration test. A
year ago I was hesitant to engage page_mapcount, now it seems the right fix.
So remove the page_count hack from try_to_unmap_one; and use activate_page in
unuse_mm when dropping lock, to replace its secondary effect of helping
swapoff to make progress in that case.
Simplify can_share_swap_page (now called only on anonymous pages) to check
page_mapcount + page_swapcount == 1: still needs the page lock to stabilize
their (pessimistic) sum, but does not need swapper_space.tree_lock for that.
In do_swap_page, move swap_free and unlock_page below page_add_anon_rmap, to
keep sum on the high side, and correct when can_share_swap_page called.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/mm/memory.c b/mm/memory.c
index 1c0a3db..da91b7b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1686,10 +1686,6 @@
}
/* The page isn't present yet, go ahead with the fault. */
-
- swap_free(entry);
- if (vm_swap_full())
- remove_exclusive_swap_page(page);
inc_mm_counter(mm, rss);
pte = mk_pte(page, vma->vm_page_prot);
@@ -1697,12 +1693,16 @@
pte = maybe_mkwrite(pte_mkdirty(pte), vma);
write_access = 0;
}
- unlock_page(page);
flush_icache_page(vma, page);
set_pte_at(mm, address, page_table, pte);
page_add_anon_rmap(page, vma, address);
+ swap_free(entry);
+ if (vm_swap_full())
+ remove_exclusive_swap_page(page);
+ unlock_page(page);
+
if (write_access) {
if (do_wp_page(mm, vma, address,
page_table, pmd, pte) == VM_FAULT_OOM)