=============
DM statistics
=============

Device Mapper supports the collection of I/O statistics on user-defined
regions of a DM device.  If no regions are defined, no statistics are
collected, so there is no performance impact.  Only bio-based DM
devices are currently supported.

Each user-defined region specifies a starting sector, length and step.
Individual statistics will be collected for each step-sized area within
the range specified.
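
As a rough illustration of how a region maps onto step-sized areas, the
sketch below lists the areas of a region given in 512-byte sectors.  The
function name ``split_region`` is hypothetical, and the sketch assumes that
a final area shorter than the step is simply truncated at the end of the
region.

```python
def split_region(start_sector, length, step):
    """Return (start, length) pairs, one per step-sized area of a region."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    end = start_sector + length
    areas = []
    pos = start_sector
    while pos < end:
        # Assumption: the last area is cut short at the end of the region.
        area_len = min(step, end - pos)
        areas.append((pos, area_len))
        pos += area_len
    return areas


if __name__ == "__main__":
    # Print each area in the <start_sector>+<length> notation.
    for area_start, area_len in split_region(0, 1000, 300):
        print(f"{area_start}+{area_len}")
```

Each pair corresponds to one area for which a separate set of counters
would be kept.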

The I/O statistics counters for each step-sized area of a region are
in the same format as `/sys/block/*/stat` or `/proc/diskstats` (see:
Documentation/admin-guide/iostats.rst).  In addition, two extra counters
(12 and 13) are provided: the total time spent reading and the total time
spent writing.  When the histogram argument is used, a 14th field is
reported that contains the histogram of latencies.  All these counters may
be accessed by sending the @stats_print message to the appropriate DM
device via dmsetup.

The reported times are in milliseconds and the granularity depends on
the kernel ticks.  When the option precise_timestamps is used, the
reported times are in nanoseconds.

Each region has a corresponding unique identifier, which we call a
region_id, that is assigned when the region is created.  The region_id
must be supplied when querying statistics about the region, deleting the
region, etc.  Unique region_ids enable multiple userspace programs to
request and process statistics for the same DM device without stepping
on each other's data.

The creation of DM statistics will allocate memory via kmalloc or
fall back to using vmalloc space.  At most, 1/4 of the overall system
memory may be allocated by DM statistics.  The admin can see how much
memory is used by reading:

    /sys/module/dm_mod/parameters/stats_current_allocated_bytes
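
A monitoring script might read this value as sketched below.  The helper
name ``read_allocated_bytes`` is hypothetical, and the sketch assumes the
file holds a single decimal integer.

```python
from pathlib import Path

STATS_ALLOCATED_PATH = (
    "/sys/module/dm_mod/parameters/stats_current_allocated_bytes"
)


def read_allocated_bytes(path=STATS_ALLOCATED_PATH):
    """Return the number of bytes currently allocated by DM statistics."""
    # Assumption: the parameter file contains one decimal integer.
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        print(read_allocated_bytes(), "bytes allocated by DM statistics")
    except OSError as exc:
        # dm_mod may not be loaded, or the file may not be readable.
        print("could not read", STATS_ALLOCATED_PATH, "-", exc)
```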

Messages
========

    @stats_create <range> <step> [<number_of_optional_arguments> <optional_arguments>...] [<program_id> [<aux_data>]]
        Create a new region and return the region_id.

        <range>
          "-"
                whole device
          "<start_sector>+<length>"
                a range of <length> 512-byte sectors
                starting with <start_sector>.

        <step>
          "<area_size>"
                the range is subdivided into areas each containing
                <area_size> sectors.
          "/<number_of_areas>"
                the range is subdivided into the specified
                number of areas.

        <number_of_optional_arguments>
          The number of optional arguments

        <optional_arguments>
          The following optional arguments are supported:

          precise_timestamps
                use precise timer with nanosecond resolution
                instead of the "jiffies" variable.  When this argument is
                used, the resulting times are in nanoseconds instead of
                milliseconds.  Precise timestamps are a little bit slower
                to obtain than jiffies-based timestamps.
          histogram:n1,n2,n3,n4,...
                collect histogram of latencies.  The
                numbers n1, n2, etc. are times that represent the boundaries
                of the histogram.  If precise_timestamps is not used, the
                times are in milliseconds, otherwise they are in
                nanoseconds.  For each range, the kernel will report the
                number of requests that completed within this range.  For
                example, if we use "histogram:10,20,30", the kernel will
                report four numbers a:b:c:d.  a is the number of requests
                that took 0-10 ms to complete, b is the number of requests
                that took 10-20 ms to complete, c is the number of requests
                that took 20-30 ms to complete and d is the number of
                requests that took more than 30 ms to complete.

        <program_id>
          An optional parameter.  A name that uniquely identifies
          the userspace owner of the range.  This groups ranges together
          so that userspace programs can identify the ranges they
          created and ignore those created by others.
          The kernel returns this string back in the output of
          @stats_list message, but it doesn't use it for anything else.
          If the number of optional arguments is omitted, <program_id> must
          not be a number, otherwise it would be interpreted as the number
          of optional arguments.

        <aux_data>
          An optional parameter.  A word that provides auxiliary data
          that is useful to the client program that created the range.
          The kernel returns this string back in the output of
          @stats_list message, but it doesn't use this value for anything.
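
The argument order of @stats_create can be sketched as a small helper that
assembles the message words.  The name ``build_stats_create`` and its
keyword parameters are hypothetical.  Refusing a numeric program_id when no
optional arguments are given is one possible way to respect the rule above,
not something the kernel requires of callers.

```python
def build_stats_create(range_spec="-", step="/1", optional_args=(),
                       program_id=None, aux_data=None):
    """Return the words of a @stats_create message as a list of strings."""
    if aux_data is not None and program_id is None:
        raise ValueError("aux_data can only follow a program_id")
    optional_args = list(optional_args)
    words = ["@stats_create", range_spec, step]
    if optional_args:
        # The count comes first, then the optional arguments themselves.
        words.append(str(len(optional_args)))
        words.extend(optional_args)
    elif (program_id is not None and program_id.isascii()
          and program_id.isdigit()):
        # A bare number here would be read as the optional-argument count.
        raise ValueError("numeric program_id needs an explicit argument count")
    if program_id is not None:
        words.append(program_id)
    if aux_data is not None:
        words.append(aux_data)
    return words


if __name__ == "__main__":
    message = build_stats_create(
        "0+2048", "512",
        optional_args=["precise_timestamps", "histogram:10,20,30"],
        program_id="myprog",
    )
    print(" ".join(message))
```

The joined string is what would follow ``dmsetup message <device> 0`` on
the command line.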

    @stats_delete <region_id>
        Delete the region with the specified id.

        <region_id>
          region_id returned from @stats_create

    @stats_clear <region_id>
        Clear all the counters except the in-flight I/O counters.

        <region_id>
          region_id returned from @stats_create

    @stats_list [<program_id>]
        List all regions registered with @stats_create.

        <program_id>
          An optional parameter.
          If this parameter is specified, only matching regions
          are returned.
          If it is not specified, all regions are returned.

        Output format:
          <region_id>: <start_sector>+<length> <step> <program_id> <aux_data>
                precise_timestamps histogram:n1,n2,n3,...

        The strings "precise_timestamps" and "histogram" are printed only
        if they were specified when creating the region.
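
A client could split one line of @stats_list output into its fields roughly
as follows.  This is only a sketch: the function name
``parse_stats_list_line`` and the sample line are made up, and it assumes
that fields are separated by whitespace, that <program_id> and <aux_data>
are always present, and that neither contains whitespace.

```python
def parse_stats_list_line(line):
    """Parse one @stats_list output line into a dictionary."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected at least five fields")
    start, length = (int(part) for part in fields[1].split("+"))
    entry = {
        "region_id": int(fields[0].rstrip(":")),
        "start_sector": start,
        "length": length,
        "step": int(fields[2]),
        "program_id": fields[3],
        "aux_data": fields[4],
        "precise_timestamps": False,
        "histogram": None,
    }
    # The remaining words appear only if given at region creation time.
    for word in fields[5:]:
        if word == "precise_timestamps":
            entry["precise_timestamps"] = True
        elif word.startswith("histogram:"):
            bounds = word[len("histogram:"):]
            entry["histogram"] = [int(n) for n in bounds.split(",")]
    return entry


if __name__ == "__main__":
    # Hypothetical sample line, not captured from a real device.
    sample = "0: 0+2048 512 myprog mydata precise_timestamps histogram:10,20,30"
    print(parse_stats_list_line(sample))
```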

    @stats_print <region_id> [<starting_line> <number_of_lines>]
        Print counters for each step-sized area of a region.

        <region_id>
          region_id returned from @stats_create

        <starting_line>
          The index of the starting line in the output.
          If omitted, all lines are returned.

        <number_of_lines>
          The number of lines to include in the output.
          If omitted, all lines are returned.

        Output format for each step-sized area of a region:

          <start_sector>+<length>
                counters

          The first 11 counters have the same meaning as
          `/sys/block/*/stat` or `/proc/diskstats`.

          Please refer to Documentation/admin-guide/iostats.rst for details.

          1. the number of reads completed
          2. the number of reads merged
          3. the number of sectors read
          4. the number of milliseconds spent reading
          5. the number of writes completed
          6. the number of writes merged
          7. the number of sectors written
          8. the number of milliseconds spent writing
          9. the number of I/Os currently in progress
          10. the number of milliseconds spent doing I/Os
          11. the weighted number of milliseconds spent doing I/Os

          Additional counters:

          12. the total time spent reading in milliseconds
          13. the total time spent writing in milliseconds
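
The per-area output can be turned into named counters along these lines.
The names in ``COUNTER_NAMES`` and both function names are hypothetical
labels chosen for this sketch, and the sample line is invented.  The sketch
assumes whitespace-separated fields and, when a histogram was requested, a
14th field holding colon-separated counts as in the a:b:c:d example given
for the histogram argument.

```python
COUNTER_NAMES = [
    "reads_completed", "reads_merged", "sectors_read", "read_time",
    "writes_completed", "writes_merged", "sectors_written", "write_time",
    "in_flight", "io_time", "weighted_io_time",
    "total_read_time", "total_write_time",
]


def parse_stats_print_line(line):
    """Parse one @stats_print line into a dictionary of counters."""
    fields = line.split()
    start, length = (int(part) for part in fields[0].split("+"))
    values = fields[1:]
    if len(values) < len(COUNTER_NAMES):
        raise ValueError("expected at least 13 counters")
    area = {"start_sector": start, "length": length}
    for name, value in zip(COUNTER_NAMES, values):
        area[name] = int(value)
    # Assumption: an optional 14th field carries the histogram counts.
    if len(values) > len(COUNTER_NAMES):
        area["histogram"] = [int(n) for n in values[13].split(":")]
    else:
        area["histogram"] = None
    return area


def label_histogram(boundaries, counts):
    """Pair each histogram count with the latency range it covers."""
    if not boundaries or len(counts) != len(boundaries) + 1:
        raise ValueError("need one more count than boundaries")
    edges = [0] + list(boundaries)
    labels = [f"{low}-{high}" for low, high in zip(edges, edges[1:])]
    labels.append(f">{boundaries[-1]}")
    return list(zip(labels, counts))


if __name__ == "__main__":
    # Hypothetical sample line for a region created with histogram:10,20,30.
    area = parse_stats_print_line("0+1000 5 0 40 12 3 0 24 7 0 15 19 12 7 4:3:1:0")
    print(area["reads_completed"], area["total_write_time"])
    print(label_histogram([10, 20, 30], area["histogram"]))
```

Whether the time values are milliseconds or nanoseconds depends on whether
the region was created with precise_timestamps.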

    @stats_print_clear <region_id> [<starting_line> <number_of_lines>]
        Atomically print and then clear all the counters except the
        in-flight I/O counters.  Useful when the client consuming the
        statistics does not want to lose any statistics (those updated
        between printing and clearing).

        <region_id>
          region_id returned from @stats_create

        <starting_line>
          The index of the starting line in the output.
          If omitted, all lines are printed and then cleared.

        <number_of_lines>
          The number of lines to process.
          If omitted, all lines are printed and then cleared.
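
A client that samples a region periodically might assemble the command
like this.  The helper name ``stats_print_clear_argv`` is hypothetical.
The sector argument "0" follows the dmsetup examples in this document, and
the sketch treats <starting_line> and <number_of_lines> as a pair because
the syntax above brackets them together.

```python
def stats_print_clear_argv(device, region_id,
                           starting_line=None, number_of_lines=None):
    """Build the dmsetup command line for a @stats_print_clear message."""
    if (starting_line is None) != (number_of_lines is None):
        raise ValueError("starting_line and number_of_lines go together")
    argv = ["dmsetup", "message", device, "0",
            "@stats_print_clear", str(region_id)]
    if starting_line is not None:
        argv += [str(starting_line), str(number_of_lines)]
    return argv


if __name__ == "__main__":
    # Only prints the command; running it needs root and a real DM device.
    print(" ".join(stats_print_clear_argv("vol", 0, 10, 5)))
```

The resulting list could be handed to ``subprocess.run`` by a privileged
process.  Because the counters are cleared on every call, the caller would
add successive samples together to obtain running totals.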

    @stats_set_aux <region_id> <aux_data>
        Store auxiliary data aux_data for the specified region.

        <region_id>
          region_id returned from @stats_create

        <aux_data>
          The string that identifies data which is useful to the client
          program that created the range.  The kernel returns this
          string back in the output of @stats_list message, but it
          doesn't use this value for anything.

Examples
========

Subdivide the DM device 'vol' into 100 pieces and start collecting
statistics on them::

  dmsetup message vol 0 @stats_create - /100

Set the auxiliary data string to "foo bar baz" (the backslash that escapes
each space must itself be escaped, otherwise the shell will consume it)::

  dmsetup message vol 0 @stats_set_aux 0 foo\\ bar\\ baz
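
When the message is built by a program rather than typed into a shell, the
escaping could be done as sketched below.  The function name
``escape_dm_word`` is hypothetical.  Escaping spaces follows the example
above; escaping literal backslashes as well is an assumption about how the
message is split into words.

```python
import shlex


def escape_dm_word(text):
    """Backslash-escape backslashes and spaces so text stays one word."""
    # Assumption: a backslash protects the character that follows it.
    return text.replace("\\", "\\\\").replace(" ", "\\ ")


if __name__ == "__main__":
    word = escape_dm_word("foo bar baz")
    # One level of escaping is enough when no shell is involved.
    print(["dmsetup", "message", "vol", "0", "@stats_set_aux", "0", word])
    # For a shell command line, quote the already escaped word.
    print("dmsetup message vol 0 @stats_set_aux 0", shlex.quote(word))
```

The shell example above doubles each backslash because the shell removes
one level before dmsetup sees the argument.  An argument list passed
without a shell only needs the single level produced here.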

List the statistics::

  dmsetup message vol 0 @stats_list

Print the statistics::

  dmsetup message vol 0 @stats_print 0

Delete the statistics::

  dmsetup message vol 0 @stats_delete 0