md/raid5: allow each slot to have an extra replacement device
Just enhance data structures to record a second device per slot to be
used as a 'replacement' device, replacing the original.
We also have a second bio in each slot in each stripe_head. This will
only be used when writing to the array - we need to write to both the
original and the replacement at the same time, so will need two bios.
For now, only try using the replacement drive for aligned-reads.
In this case, we prefer the replacement if it has been recovered far
enough, otherwise use the original.
This includes a small enhancement. Previously we would only do
aligned reads if the target device was fully recovered. Now we also
do them if it has recovered far enough.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6b9fc58..94bc35b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3594,6 +3594,7 @@
int dd_idx;
struct bio* align_bi;
struct md_rdev *rdev;
+ sector_t end_sector;
if (!in_chunk_boundary(mddev, raid_bio)) {
pr_debug("chunk_aligned_read : non aligned\n");
@@ -3618,9 +3619,19 @@
0,
&dd_idx, NULL);
+ end_sector = align_bi->bi_sector + (align_bi->bi_size >> 9);
rcu_read_lock();
- rdev = rcu_dereference(conf->disks[dd_idx].rdev);
- if (rdev && test_bit(In_sync, &rdev->flags)) {
+ rdev = rcu_dereference(conf->disks[dd_idx].replacement);
+ if (!rdev || test_bit(Faulty, &rdev->flags) ||
+ rdev->recovery_offset < end_sector) {
+ rdev = rcu_dereference(conf->disks[dd_idx].rdev);
+ if (rdev &&
+ (test_bit(Faulty, &rdev->flags) ||
+ !(test_bit(In_sync, &rdev->flags) ||
+ rdev->recovery_offset >= end_sector)))
+ rdev = NULL;
+ }
+ if (rdev) {
sector_t first_bad;
int bad_sectors;