mac80211: mesh: fix crash in mesh_path_timer
The mesh_path_reclaim() function, called from an rcu callback, cancels
the mesh_path_timer associated with a mesh path. Unfortunately, this
call can happen much later, perhaps after the hash table itself is
destroyed.
Such a situation led to the following crash in mesh_path_send_to_gates()
when dereferencing the tbl pointer:
[ 23.901661] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 23.905516] IP: [<ffffffff814c910b>] mesh_path_send_to_gates+0x2b/0x740
[ 23.908757] PGD 99ca067 PUD 99c4067 PMD 0
[ 23.910789] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 23.913485] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.5.0-rc6-wt+ #43
[ 23.916675] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 23.920471] task: ffffffff81685500 ti: ffffffff81678000 task.ti: ffffffff81678000
[ 23.922619] RIP: 0010:[<ffffffff814c910b>] [<ffffffff814c910b>] mesh_path_send_to_gates+0x2b/0x740
[ 23.925237] RSP: 0018:ffff88000b403d30 EFLAGS: 00010286
[ 23.926739] RAX: 0000000000000000 RBX: ffff880009bc0d20 RCX: 0000000000000102
[ 23.928796] RDX: 000000000000002e RSI: 0000000000000001 RDI: ffff880009bc0d20
[ 23.930895] RBP: ffff88000b403e18 R08: 0000000000000001 R09: 0000000000000001
[ 23.932917] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880009c20940
[ 23.936370] R13: ffff880009bc0e70 R14: ffff880009c21c40 R15: ffff880009bc0d20
[ 23.939823] FS: 0000000000000000(0000) GS:ffff88000b400000(0000) knlGS:0000000000000000
[ 23.943688] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 23.946429] CR2: 0000000000000008 CR3: 00000000099c5000 CR4: 00000000000006b0
[ 23.949861] Stack:
[ 23.950840] 000000000000002e ffff880009c20940 ffff88000b403da8 ffffffff8109e551
[ 23.954467] ffffffff82711be2 000000000000002e 0000000000000000 ffffffff8166a5f5
[ 23.958141] 0000000000685ce8 0000000000000246 ffff880009bc0d20 ffff880009c20940
[ 23.961801] Call Trace:
[ 23.962987] <IRQ>
[ 23.963963] [<ffffffff8109e551>] ? vprintk_emit+0x351/0x5e0
[ 23.966782] [<ffffffff8109e8ff>] ? vprintk_default+0x1f/0x30
[ 23.969529] [<ffffffff810ffa41>] ? printk+0x48/0x50
[ 23.971956] [<ffffffff814ceef3>] mesh_path_timer+0x133/0x160
[ 23.974707] [<ffffffff814cedc0>] ? mesh_nexthop_resolve+0x230/0x230
[ 23.977775] [<ffffffff810b04ee>] call_timer_fn+0xce/0x330
[ 23.980448] [<ffffffff810b0425>] ? call_timer_fn+0x5/0x330
[ 23.983126] [<ffffffff814cedc0>] ? mesh_nexthop_resolve+0x230/0x230
[ 23.986091] [<ffffffff810b097c>] run_timer_softirq+0x22c/0x390
Instead of cancelling in the RCU callback, set a new flag to prevent the
timer from being rearmed, and then cancel the timer synchronously when
freeing the mesh path. This leaves mesh_path_reclaim() doing nothing
but kfree, so switch to kfree_rcu().
Fixes: 3b302ada7f0a ("mac80211: mesh: move path tables into if_mesh")
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index 5b6aec1..2748cf6 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -1012,6 +1012,10 @@
goto enddiscovery;
spin_lock_bh(&mpath->state_lock);
+ if (mpath->flags & MESH_PATH_DELETED) {
+ spin_unlock_bh(&mpath->state_lock);
+ goto enddiscovery;
+ }
mpath->flags &= ~MESH_PATH_REQ_QUEUED;
if (preq_node->flags & PREQ_Q_F_START) {
if (mpath->flags & MESH_PATH_RESOLVING) {