md/r5cache: handle FLUSH and FUA
With raid5 cache, we committing data from journal device. When
there is flush request, we need to flush journal device's cache.
This was not needed in raid5 journal, because we will flush the
journal before committing data to raid disks.
This is similar to FUA, except that we also need flush journal for
FUA. Otherwise, corruptions in earlier meta data will stop recovery
from reaching FUA data.
slightly changed the code by Shaohua
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index aa4968c..dbab8c7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5248,6 +5248,7 @@ static void raid5_make_request(struct mddev *mddev, struct bio * bi)
int remaining;
DEFINE_WAIT(w);
bool do_prepare;
+ bool do_flush = false;
if (unlikely(bi->bi_opf & REQ_PREFLUSH)) {
int ret = r5l_handle_flush_request(conf->log, bi);
@@ -5259,6 +5260,11 @@ static void raid5_make_request(struct mddev *mddev, struct bio * bi)
return;
}
/* ret == -EAGAIN, fallback */
+ /*
+ * if r5l_handle_flush_request() didn't clear REQ_PREFLUSH,
+ * we need to flush journal device
+ */
+ do_flush = bi->bi_opf & REQ_PREFLUSH;
}
md_write_start(mddev, bi);
@@ -5398,6 +5404,12 @@ static void raid5_make_request(struct mddev *mddev, struct bio * bi)
do_prepare = true;
goto retry;
}
+ if (do_flush) {
+ set_bit(STRIPE_R5C_PREFLUSH, &sh->state);
+ /* we only need flush for one stripe */
+ do_flush = false;
+ }
+
set_bit(STRIPE_HANDLE, &sh->state);
clear_bit(STRIPE_DELAYED, &sh->state);
if ((!sh->batch_head || sh == sh->batch_head) &&