btrfs: Fix memory barriers usage with device stats counters

Commit addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev
stats is cleared") reworked the way device stats changes are tracked. A
new atomic dev_stats_ccnt counter was introduced which is incremented
every time any of the device stats counters are changed. This serves as
a flag whether there are any pending stats changes. However, this patch
only partially implemented the correct memory barriers necessary:

- It only ordered the stores to the counters but not the reads e.g.
  btrfs_run_dev_stats
- It completely omitted any comments documenting the intended design and
  how the memory barriers pair with each-other

This patch provides the necessary comments as well as adds a missing
smp_rmb in btrfs_run_dev_stats. Furthermore since dev_stats_cnt is only
a snapshot at best there was no point in reading the counter twice -
once in btrfs_dev_stats_dirty and then again when assigning stats_cnt.
Just collapse both reads into 1.

Fixes: addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index baf047b..c96635a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7080,10 +7080,24 @@ int btrfs_run_dev_stats(struct btrfs_trans_handle *trans,
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
-		if (!device->dev_stats_valid || !btrfs_dev_stats_dirty(device))
+		stats_cnt = atomic_read(&device->dev_stats_ccnt);
+		if (!device->dev_stats_valid || stats_cnt == 0)
 			continue;
 
-		stats_cnt = atomic_read(&device->dev_stats_ccnt);
+
+		/*
+		 * There is a LOAD-LOAD control dependency between the value of
+		 * dev_stats_ccnt and updating the on-disk values which requires
+		 * reading the in-memory counters. Such control dependencies
+		 * require explicit read memory barriers.
+		 *
+		 * This memory barriers pairs with smp_mb__before_atomic in
+		 * btrfs_dev_stat_inc/btrfs_dev_stat_set and with the full
+		 * barrier implied by atomic_xchg in
+		 * btrfs_dev_stats_read_and_reset
+		 */
+		smp_rmb();
+
 		ret = update_dev_stat_item(trans, fs_info, device);
 		if (!ret)
 			atomic_sub(stats_cnt, &device->dev_stats_ccnt);