zram: Compressed RAM based block devices
----------------------------------------

* Introduction

The zram module creates RAM based block devices named /dev/zram<id>
(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
in memory itself. These disks allow very fast I/O, and compression provides
a good amount of memory savings. Some of the use cases include /tmp storage,
use as swap disks, various caches under /var and maybe many more :)

Statistics for individual zram devices are exported through sysfs nodes at
/sys/block/zram<id>/

* Usage

There are several ways to configure and manage zram device(-s):
a) using zram and zram_control sysfs attributes
b) using zramctl utility, provided by util-linux (util-linux@vger.kernel.org).

In this document we will describe only 'manual' zram configuration steps,
IOW, zram and zram_control sysfs attributes.

In order to get a better idea about zramctl please consult util-linux
documentation, zramctl man-page or `zramctl --help'. Please be informed
that zram maintainers do not develop/maintain util-linux or zramctl; should
you have any questions, please contact util-linux@vger.kernel.org

The following shows a typical sequence of steps for using zram.

WARNING
=======
For the sake of simplicity we skip error checking parts in most of the
examples below. However, it is your sole responsibility to handle errors.

zram sysfs attributes always return negative values in case of errors.
The list of possible return codes:
-EBUSY  -- an attempt to modify an attribute that cannot be changed once
           the device has been initialised. Please reset device first;
-ENOMEM -- zram was not able to allocate enough memory to fulfil your
           needs;
-EINVAL -- invalid input has been provided.

If you use 'echo', the returned value is the one set by the 'echo' utility
(its exit status), and, in the general case, something like:

        echo 3 > /sys/block/zram0/max_comp_streams
        if [ $? -ne 0 ]; then
                handle_error
        fi

should suffice.
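
The check above can be wrapped in a small helper so that every attribute
write is verified the same way. This is only a sketch: the function name
zram_set is hypothetical and not part of zram or util-linux.

```shell
# Hypothetical helper: write VALUE to a sysfs attribute and report failure.
# Usage: zram_set <attribute-path> <value>
zram_set() {
    # Both the redirection (e.g. missing node) and the write itself (e.g. the
    # kernel rejecting the value) can fail; either one gives a non-zero status.
    if ! { printf '%s\n' "$2" > "$1"; } 2>/dev/null; then
        echo "zram: failed to write '$2' to $1" >&2
        return 1
    fi
}
```

For example, `zram_set /sys/block/zram0/max_comp_streams 3 || handle_error`
replaces the explicit `$?` test.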

1) Load Module:
        modprobe zram num_devices=4
        This creates 4 devices: /dev/zram{0,1,2,3}

num_devices parameter is optional and tells zram how many devices should be
pre-created. Default: 1.

2) Set max number of compression streams
        Regardless of the value passed to this attribute, ZRAM will always
        allocate multiple compression streams - one per online CPU - thus
        allowing several concurrent compression operations. The number of
        allocated compression streams goes down when some of the CPUs
        go offline. There is no single-compression-stream mode anymore,
        unless you are running a UP system or have only 1 CPU online.

        To find out how many streams are currently available:
        cat /sys/block/zram0/max_comp_streams

3) Select compression algorithm
        Using the comp_algorithm device attribute one can see the available
        and currently selected (shown in square brackets) compression
        algorithms, and change the selected compression algorithm (once the
        device is initialised there is no way to change the compression
        algorithm).

        Examples:
        #show supported compression algorithms
        cat /sys/block/zram0/comp_algorithm
        lzo [lz4]

        #select lzo compression algorithm
        echo lzo > /sys/block/zram0/comp_algorithm

        For the time being, the `comp_algorithm' content does not necessarily
        show every compression algorithm supported by the kernel. We keep this
        list primarily to simplify device configuration and one can configure
        a new device with a compression algorithm that is not listed in
        `comp_algorithm'. The thing is that, internally, ZRAM uses Crypto API
        and, if some of the algorithms were built as modules, it's impossible
        to list all of them using, for instance, /proc/crypto or any other
        method. This, however, has an advantage of permitting the usage of
        custom crypto compression modules (implementing S/W or H/W compression).
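
Since the selected algorithm is the one shown in square brackets, a script
can extract it with a short filter. The sketch below assumes the output
format shown in the example above; the function name is hypothetical.

```shell
# Hypothetical helper: print the currently selected algorithm, i.e. the
# word inside [ ]. Reads the comp_algorithm content from standard input.
zram_selected_alg() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example with the sample output from this document:
#   echo 'lzo [lz4]' | zram_selected_alg     # prints: lz4
```

On a live system the input would come from the attribute itself, e.g.
`zram_selected_alg < /sys/block/zram0/comp_algorithm`.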

4) Set Disksize
        Set disk size by writing the value to sysfs node 'disksize'.
        The value can be either in bytes or you can use mem suffixes.
        Examples:
            # Initialize /dev/zram0 with 50MB disksize
            echo $((50*1024*1024)) > /sys/block/zram0/disksize

            # Using mem suffixes
            echo 256K > /sys/block/zram0/disksize
            echo 512M > /sys/block/zram0/disksize
            echo 1G > /sys/block/zram0/disksize

Note:
There is little point creating a zram of greater than twice the size of memory
since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
size of the disk when not in use so a huge zram is wasteful.
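
To make the numbers in the note concrete, the two rules of thumb (at most
twice the memory size, roughly 0.1% idle overhead) can be written as shell
arithmetic. The function names are hypothetical and the results are only
as precise as the approximations in the note.

```shell
# Upper bound suggested by the note: twice the size of memory.
# Argument: memory size in bytes.
zram_max_useful_disksize() {
    echo $(( $1 * 2 ))
}

# Approximate memory used by an idle device: about 0.1% of the disk size.
# Argument: disksize in bytes.
zram_idle_overhead() {
    echo $(( $1 / 1000 ))
}

# Example for a machine with 4 GiB of memory (4294967296 bytes):
#   zram_max_useful_disksize 4294967296   # prints: 8589934592
#   zram_idle_overhead 8589934592         # prints: 8589934
```

In other words, an 8 GiB zram device on such a machine would cost roughly
8 MB of memory even while it is empty.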

5) Set memory limit: Optional
        Set memory limit by writing the value to sysfs node 'mem_limit'.
        The value can be either in bytes or you can use mem suffixes.
        In addition, you can change the value at runtime.
        Examples:
            # limit /dev/zram0 with 50MB memory
            echo $((50*1024*1024)) > /sys/block/zram0/mem_limit

            # Using mem suffixes
            echo 256K > /sys/block/zram0/mem_limit
            echo 512M > /sys/block/zram0/mem_limit
            echo 1G > /sys/block/zram0/mem_limit

            # To disable memory limit
            echo 0 > /sys/block/zram0/mem_limit
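
When a script needs the byte value behind a mem suffix (for 'disksize' or
'mem_limit'), it can expand the suffix itself. The sketch below assumes
the suffixes are binary multiples (K = 1024 bytes, and so on), which is
how the kernel's memparse() helper treats them; the function name is
hypothetical and the input is not validated.

```shell
# Hypothetical helper: convert a value with an optional K/M/G suffix to bytes.
zram_to_bytes() {
    case $1 in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;          # no suffix: already in bytes
    esac
}

# Examples:
#   zram_to_bytes 256K    # prints: 262144
#   zram_to_bytes 512M    # prints: 536870912
#   zram_to_bytes 1G      # prints: 1073741824
```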

6) Activate:
        mkswap /dev/zram0
        swapon /dev/zram0

        mkfs.ext4 /dev/zram1
        mount /dev/zram1 /tmp

7) Add/remove zram devices

zram provides a control interface, which enables dynamic (on-demand) device
addition and removal.

In order to add a new /dev/zramX device, perform a read operation on the
hot_add attribute. This will return either the new device's device id
(meaning that you can use /dev/zram<id>) or an error code.

Example:
        cat /sys/class/zram-control/hot_add
        1

To remove the existing /dev/zramX device (where X is a device id)
execute
        echo X > /sys/class/zram-control/hot_remove
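
Because reading hot_add both creates the device and returns its id, a
script should read it exactly once and keep the result. A minimal sketch,
assuming the control directory shown above; the function name and the
optional directory argument are hypothetical.

```shell
# Hypothetical helper: create a new zram device and print its path.
# Optional argument: control directory (default /sys/class/zram-control).
zram_hot_add() (
    ctl=${1:-/sys/class/zram-control}
    # Each read of hot_add creates a device, so read it only once.
    id=$(cat "$ctl/hot_add") || exit 1
    case $id in
        ''|*[!0-9]*)
            echo "zram: hot_add failed: $id" >&2
            exit 1 ;;
    esac
    echo "/dev/zram$id"
)
```

A caller would then use something like `dev=$(zram_hot_add) || handle_error`
and later pass the numeric id to hot_remove.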

8) Stats:
Per-device statistics are exported as various nodes under /sys/block/zram<id>/

A brief description of exported device attributes. For more details please
read Documentation/ABI/testing/sysfs-block-zram.

Name              access  description
----              ------  -----------
disksize          RW      show and set the device's disk size
initstate         RO      shows the initialization state of the device
reset             WO      trigger device reset
mem_used_max      WO      reset the `mem_used_max' counter (see later)
mem_limit         WO      specifies the maximum amount of memory ZRAM can use
                          to store the compressed data
max_comp_streams  RW      the number of possible concurrent compress operations
comp_algorithm    RW      show and change the compression algorithm
compact           WO      trigger memory compaction
debug_stat        RO      this file is used for zram debugging purposes
backing_dev       RW      set up backend storage for zram to write out

User space is advised to use the following files to read the device statistics.

File /sys/block/zram<id>/stat

Represents block layer statistics. Read Documentation/block/stat.txt for
details.

File /sys/block/zram<id>/io_stat

The io_stat file represents the device's I/O statistics not accounted by the
block layer and, thus, not available in the zram<id>/stat file. It consists of
a single line of text and contains the following stats separated by
whitespace:
 failed_reads     the number of failed reads
 failed_writes    the number of failed writes
 invalid_io       the number of non-page-size-aligned I/O requests
 notify_free      Depending on device usage scenario it may account
                  a) the number of pages freed because of swap slot free
                  notifications or b) the number of pages freed because of
                  REQ_DISCARD requests sent by bio. The former ones are
                  sent to a swap block device when a swap slot is freed,
                  which implies that this disk is being used as a swap disk.
                  The latter ones are sent by a filesystem mounted with the
                  discard option, whenever some data blocks are getting
                  discarded.
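
As io_stat is a single whitespace-separated line, the shell's `read'
builtin is enough to split it into the four fields listed above. A small
sketch with a hypothetical function name; the sample values in the
comment are made up.

```shell
# Hypothetical helper: label the fields of an io_stat line read from stdin.
zram_io_stat() (
    # Field order follows the list above; _rest absorbs anything extra.
    read -r failed_reads failed_writes invalid_io notify_free _rest || exit 1
    printf 'failed_reads=%s\n'  "$failed_reads"
    printf 'failed_writes=%s\n' "$failed_writes"
    printf 'invalid_io=%s\n'    "$invalid_io"
    printf 'notify_free=%s\n'   "$notify_free"
)

# Example with made-up values:
#   echo '0 0 0 12' | zram_io_stat
# On a live system:
#   zram_io_stat < /sys/block/zram0/io_stat
```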

File /sys/block/zram<id>/mm_stat

The mm_stat file represents the device's mm statistics. It consists of a
single line of text and contains the following stats separated by whitespace:
 orig_data_size   uncompressed size of data stored in this disk.
                  This excludes same-element-filled pages (same_pages) since
                  no memory is allocated for them.
                  Unit: bytes
 compr_data_size  compressed size of data stored in this disk
 mem_used_total   the amount of memory allocated for this disk. This
                  includes allocator fragmentation and metadata overhead,
                  allocated for this disk. So, allocator space efficiency
                  can be calculated using compr_data_size and this statistic.
                  Unit: bytes
 mem_limit        the maximum amount of memory ZRAM can use to store
                  the compressed data
 mem_used_max     the maximum amount of memory zram has consumed to
                  store the data
 same_pages       the number of same element filled pages written to this disk.
                  No memory is allocated for such pages.
 pages_compacted  the number of pages freed during compaction
 huge_pages       the number of incompressible pages
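
The first three mm_stat fields are enough to derive the compression ratio
and the allocator space efficiency mentioned above. The following is a
sketch only: the function name is hypothetical and the sample line in the
comment uses made-up values.

```shell
# Hypothetical helper: derive two ratios from an mm_stat line on stdin.
#   $1 = orig_data_size, $2 = compr_data_size, $3 = mem_used_total
zram_mm_ratio() {
    awk '{
        if ($2 > 0) printf "compression ratio: %.2f\n", $1 / $2
        if ($3 > 0) printf "allocator efficiency: %.2f\n", $2 / $3
    }'
}

# Example with made-up values (100 MiB stored, 40 MiB compressed,
# 42 MiB actually allocated):
#   echo '104857600 41943040 44040192 0 44040192 10 0 5' | zram_mm_ratio
#   compression ratio: 2.50
#   allocator efficiency: 0.95
```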

9) Deactivate:
        swapoff /dev/zram0
        umount /dev/zram1

10) Reset:
        Write any positive value to 'reset' sysfs node
        echo 1 > /sys/block/zram0/reset
        echo 1 > /sys/block/zram1/reset

        This frees all the memory allocated for the given device and
        resets the disksize to zero. You must set the disksize again
        before reusing the device.

* Optional Feature

= writeback

With incompressible pages, there is no memory saving with zram.
Instead, with CONFIG_ZRAM_WRITEBACK, zram can write incompressible pages
to backing storage rather than keeping them in memory.
The user should set up the backing device via /sys/block/zramX/backing_dev
before setting disksize.
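
The ordering requirement (backing_dev first, then disksize) can be
sketched as follows. The function name is hypothetical, and /dev/sdX in
the usage line is a placeholder for whatever block device is used as
backing storage.

```shell
# Hypothetical helper: configure writeback for one zram device.
#   $1 = sysfs directory of the device, e.g. /sys/block/zram0
#   $2 = backing block device (placeholder, e.g. /dev/sdX)
#   $3 = disksize, in bytes or with a mem suffix
zram_setup_writeback() (
    dev=$1 backing=$2 size=$3
    # backing_dev has to be written before disksize.
    echo "$backing" > "$dev/backing_dev" || exit 1
    echo "$size" > "$dev/disksize" || exit 1
)

# Usage (not run here):
#   zram_setup_writeback /sys/block/zram0 /dev/sdX 1G || handle_error
```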

= memory tracking

With CONFIG_ZRAM_MEMORY_TRACKING, the user can obtain information about
each zram block. It could be useful to catch cold or incompressible
pages of the process with pagemap.
If you enable the feature, you can see the block state via
/sys/kernel/debug/zram/zram0/block_state. The output is as follows:

          300    75.033841 .wh
          301    63.806904 s..
          302    63.806919 ..h

The first column is zram's block index.
The second column is the access time since the system was booted.
The third column is the state of the block.
(s: same page
w: written page to backing store
h: huge page)

The first line of the above example says that the 300th block was accessed
at 75.033841 sec and the block's state is huge, so it was written back to
the backing storage. It's a debugging feature, so no one should rely on it
to work properly.
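
Given the three-column format described above, the state flags can be
tallied with awk. This is a sketch for inspection only; the function name
is hypothetical and, as noted, the file format is a debugging aid that
may change.

```shell
# Hypothetical helper: count blocks per state flag in block_state output
# read from stdin. The third column holds the flags (s, w, h).
zram_block_state_summary() {
    awk '{
        if (index($3, "s")) same++
        if (index($3, "w")) written++
        if (index($3, "h")) huge++
    } END {
        printf "same=%d written_back=%d huge=%d\n", same, written, huge
    }'
}

# With the three sample lines from this document as input, this prints:
#   same=1 written_back=1 huge=2
```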

Nitin Gupta
ngupta@vflare.org