Btrfs: cleanup the read failure record after write or when the inode is freeing
After the data is written successfully, we should cleanup the read failure record
in that range because
- If we set data COW for the file, the range that the failure record pointed to is
mapped to a new place, so it is invalid.
- If we set no data COW for the file, and if there is no error during writting,
the corrupted data is corrected, so the failure record can be removed. And if
some errors happen on the mirrors, we also needn't worry about it because the
failure record will be recreated if we read the same place again.
Sometimes, we may fail to correct the data, so the failure records will be left
in the tree, we need free them when we free the inode or the memory leak happens.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c3c3269..3c16a14 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2703,6 +2703,10 @@
goto out;
}
+ btrfs_free_io_failure_record(inode, ordered_extent->file_offset,
+ ordered_extent->file_offset +
+ ordered_extent->len - 1);
+
if (test_bit(BTRFS_ORDERED_TRUNCATED, &ordered_extent->flags)) {
truncated = true;
logical_len = ordered_extent->truncated_len;
@@ -4799,6 +4803,8 @@
/* do we really want it for ->i_nlink > 0 and zero btrfs_root_refs? */
btrfs_wait_ordered_range(inode, 0, (u64)-1);
+ btrfs_free_io_failure_record(inode, 0, (u64)-1);
+
if (root->fs_info->log_root_recovering) {
BUG_ON(test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
&BTRFS_I(inode)->runtime_flags));