md/raid10: fix deadlock with unaligned read during resync

If the 'bio_split' path in raid10-read is used while
resync/recovery is happening it is possible to deadlock.
Fix this by elevating ->nr_waiting for the duration of both
parts of the split request.

This fixes a bug that has been present since 2.6.22
but has only started manifesting recently for unknown reasons.
It is suitable for any -stable since then.

Reported-by: Justin Bronder <jsbronder@gentoo.org>
Tested-by: Justin Bronder <jsbronder@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org

commit: 51e9ac77035a3dfcb6fc0a88a0d80b6f99b5edb1
author: NeilBrown <neilb@suse.de> Sat Aug 07 21:17:00 2010 +1000
committer: NeilBrown <neilb@suse.de> Sat Aug 07 21:17:00 2010 +1000
tree: 94167223c5711c47169db672a0ec0d23a36208b9
parent: 69e51b449d383e97b1b9f890f8378c96e9e17346
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 42e64e4..d1d6891 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c

@@ -825,11 +825,29 @@
 		 */
 		bp = bio_split(bio,
 			       chunk_sects - (bio->bi_sector & (chunk_sects - 1)) );
+
+		/* Each of these 'make_request' calls will call 'wait_barrier'.
+		 * If the first succeeds but the second blocks due to the resync
+		 * thread raising the barrier, we will deadlock because the
+		 * IO to the underlying device will be queued in generic_make_request
+		 * and will never complete, so will never reduce nr_pending.
+		 * So increment nr_waiting here so no new raise_barriers will
+		 * succeed, and so the second wait_barrier cannot block.
+		 */
+		spin_lock_irq(&conf->resync_lock);
+		conf->nr_waiting++;
+		spin_unlock_irq(&conf->resync_lock);
+
 		if (make_request(mddev, &bp->bio1))
 			generic_make_request(&bp->bio1);
 		if (make_request(mddev, &bp->bio2))
 			generic_make_request(&bp->bio2);
 
+		spin_lock_irq(&conf->resync_lock);
+		conf->nr_waiting--;
+		wake_up(&conf->wait_barrier);
+		spin_unlock_irq(&conf->resync_lock);
+
 		bio_pair_release(bp);
 		return 0;
 	bad_map:
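
The interleaving described in the patch comment can be traced with a toy, single-threaded model. This is only a sketch under stated assumptions: the field names mirror the patch, but struct conf_model and the helpers try_raise_barrier, try_wait_barrier and split_read_deadlocks are hypothetical, and the gating rules are a simplification taken from the comment (a raise only succeeds while nr_waiting is zero, and normal IO blocks while the barrier is up), not the kernel's actual functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the barrier counters. Field names follow the patch;
 * everything else here is hypothetical and is not kernel code. */
struct conf_model {
	int barrier;     /* raised by the resync thread */
	int nr_pending;  /* normal IO admitted past the barrier */
	int nr_waiting;  /* normal IO waiting for the barrier to drop */
};

/* Resync side (simplified): a new barrier may only be raised while
 * no normal IO is counted as waiting. */
static bool try_raise_barrier(struct conf_model *c)
{
	if (c->nr_waiting != 0)
		return false;
	c->barrier++;
	return true;
}

/* Normal IO side (simplified): returns false at the point where the
 * real wait_barrier would sleep; otherwise admits the request. */
static bool try_wait_barrier(struct conf_model *c)
{
	if (c->barrier != 0)
		return false;
	c->nr_pending++;
	return true;
}

/* Replays the race: first half admitted, resync tries to raise the
 * barrier, second half tries to get in. Returns true if the second
 * half would block, which is the deadlock case because the first
 * half's IO is still queued and nr_pending can never drop. */
bool split_read_deadlocks(bool elevate_nr_waiting)
{
	struct conf_model c = {0, 0, 0};
	bool first_ok, second_ok;

	if (elevate_nr_waiting)
		c.nr_waiting++;          /* the fix: hold off new barriers */

	first_ok = try_wait_barrier(&c);
	assert(first_ok);                /* no barrier is up yet */
	(void)first_ok;

	(void)try_raise_barrier(&c);     /* resync races in between halves */

	second_ok = try_wait_barrier(&c);

	if (elevate_nr_waiting) {
		assert(c.nr_waiting > 0);
		c.nr_waiting--;          /* the fix: release after both halves */
	}
	return !second_ok;
}
```

In this model, split_read_deadlocks(false) reports that the second half would block, while split_read_deadlocks(true) does not, because the elevated nr_waiting keeps the resync side from raising the barrier between the two halves.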