btrfs: proper error handling when invalid device is found in find_next_devid
In a corrupted tree, if search for next devid finds the device with
devid = -1, then report the error -EUCLEAN back to the parent function
to fail gracefully.
The tree checker will not catch this in case the devids are created
using the following script:
umount /btrfs
dev1=/dev/sdb
dev2=/dev/sdc
mkfs.btrfs -fq -dsingle -msingle $dev1
mount $dev1 /btrfs
_fail()
{
echo $1
exit 1
}
while true; do
btrfs dev add -f $dev2 /btrfs || _fail "add failed"
btrfs dev del $dev1 /btrfs || _fail "del failed"
dev_tmp=$dev1
dev1=$dev2
dev2=$dev_tmp
done
With output:
BTRFS critical (device sdb): corrupt leaf: root=3 block=313739198464 slot=1 devid=1 invalid devid: has=507 expect=[0, 506]
BTRFS error (device sdb): block=313739198464 write time tree block corruption detected
BTRFS: error (device sdb) in btrfs_commit_transaction:2268: errno=-5 IO failure (Error while writing out transaction)
BTRFS warning (device sdb): Skipping commit of aborted transaction.
BTRFS: error (device sdb) in cleanup_transaction:1827: errno=-5 IO failure
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ add script and messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e2de7c7..c7a08fe 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1849,7 +1849,12 @@ static noinline int find_next_devid(struct btrfs_fs_info *fs_info,
if (ret < 0)
goto error;
- BUG_ON(ret == 0); /* Corruption */
+ if (ret == 0) {
+ /* Corruption */
+ btrfs_err(fs_info, "corrupted chunk tree devid -1 matched");
+ ret = -EUCLEAN;
+ goto error;
+ }
ret = btrfs_previous_item(fs_info->chunk_root, path,
BTRFS_DEV_ITEMS_OBJECTID,