md: be extra careful not to take a reference to a Faulty device.
It is important that we never increment rdev->nr_pending on a Faulty
device as ->hot_remove_disk() assumes that once the Faulty flag is visible
no code will take a new reference.
Some places take a new reference after only check In_sync. This should
be safe as the two are changed together. However to make the code more
obviously safe, add checks for 'Faulty' as well.
Note: the actual rule is:
Never increment nr_pending if Faulty is set and Blocked is clear,
never clear Faulty, and never set Blocked without holding a reference
through nr_pending.
fix build error (Shaohua)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c07b22e..f6a191a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3080,7 +3080,8 @@
struct md_rdev *rdev;
rcu_read_lock();
rdev = rcu_dereference(conf->disks[i].rdev);
- if (rdev && test_bit(In_sync, &rdev->flags))
+ if (rdev && test_bit(In_sync, &rdev->flags) &&
+ !test_bit(Faulty, &rdev->flags))
atomic_inc(&rdev->nr_pending);
else
rdev = NULL;