.. _zsmalloc:

========
zsmalloc
========

This allocator is designed for use with zram and is therefore expected
to work well under low-memory conditions. In particular, it never
attempts higher-order page allocation, which is very likely to fail
under memory pressure. On the other hand, using only single (0-order)
pages would cause very high fragmentation: any object of size
PAGE_SIZE/2 or larger would occupy an entire page. This was one of the
major issues with its predecessor (xvmalloc).

To overcome these issues, zsmalloc allocates a bunch of 0-order pages
and links them together using various 'struct page' fields. These linked
pages act as a single higher-order page, i.e. an object can span 0-order
page boundaries. The code refers to these linked pages as a single entity
called a zspage.

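The fragmentation argument above can be made concrete with a little
arithmetic. The following is a minimal userspace sketch, not the kernel
implementation: it assumes a 4096-byte page and a cap of 4 pages per
zspage, and the helper names (``wasted_bytes``, ``best_pages_per_zspage``)
are hypothetical. It computes the bytes left over when objects of one size
are packed into a chain of pages, and picks the chain length with the
highest whole-percent utilization.

```c
#include <assert.h>

/* Assumptions for illustration only: 4 KiB pages, at most 4 pages chained. */
#define PAGE_SIZE_BYTES      4096UL
#define MAX_PAGES_PER_ZSPAGE 4UL

/* Bytes left unused when objects of `size` are packed into `pages` pages. */
static unsigned long wasted_bytes(unsigned long size, unsigned long pages)
{
	assert(size > 0 && size <= PAGE_SIZE_BYTES);
	return (pages * PAGE_SIZE_BYTES) % size;
}

/*
 * Pick the chain length (1..MAX_PAGES_PER_ZSPAGE) with the highest
 * utilization, measured in whole percent. Ties keep the shorter chain.
 */
static unsigned long best_pages_per_zspage(unsigned long size)
{
	unsigned long best = 1, best_usedpc = 0;

	for (unsigned long k = 1; k <= MAX_PAGES_PER_ZSPAGE; k++) {
		unsigned long total = k * PAGE_SIZE_BYTES;
		unsigned long usedpc = (total - wasted_bytes(size, k)) * 100 / total;

		if (usedpc > best_usedpc) {
			best_usedpc = usedpc;
			best = k;
		}
	}
	return best;
}
```

Under these assumptions, a 2448-byte object wastes 1648 bytes in a single
page, while five such objects packed into three linked pages waste only 48
bytes in total.
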
For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
since this satisfies the requirements of all its current users (in the
worst case, a page is incompressible and is thus stored "as-is", i.e. in
uncompressed form). Allocation requests larger than this size fail
(see zs_malloc()).

Additionally, zs_malloc() does not return a dereferenceable pointer.
Instead, it returns an opaque handle (unsigned long) which encodes the
actual location of the allocated object. The reason for this indirection
is that zsmalloc does not keep zspages permanently mapped, since that would
cause issues on 32-bit systems where the VA region for kernel space
mappings is very small. So, before using the allocated memory, the object
has to be mapped using zs_map_object() to get a usable pointer and
subsequently unmapped using zs_unmap_object().

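To illustrate what "an opaque handle which encodes a location" can mean,
here is a toy userspace sketch. It is not zsmalloc's actual encoding: the
bit layout, the ``OBJ_INDEX_BITS`` value and the function names are all
assumptions made for this example. It packs a page frame number and an
object index into one unsigned long and unpacks them again.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical layout: the low OBJ_INDEX_BITS bits hold the object index,
 * the remaining high bits hold a page frame number.
 */
#define OBJ_INDEX_BITS 12UL
#define OBJ_INDEX_MASK ((1UL << OBJ_INDEX_BITS) - 1)

/* Pack a (pfn, obj_idx) pair into a single opaque value. */
static unsigned long encode_location(unsigned long pfn, unsigned long obj_idx)
{
	assert(obj_idx <= OBJ_INDEX_MASK);
	assert(pfn <= (ULONG_MAX >> OBJ_INDEX_BITS));
	return (pfn << OBJ_INDEX_BITS) | obj_idx;
}

/* Recover the (pfn, obj_idx) pair from a value built by encode_location(). */
static void decode_location(unsigned long handle,
			    unsigned long *pfn, unsigned long *obj_idx)
{
	*pfn = handle >> OBJ_INDEX_BITS;
	*obj_idx = handle & OBJ_INDEX_MASK;
}
```

A caller holding only such a value cannot dereference it. It has to ask the
allocator to translate the value into a temporary mapping, which is the role
zs_map_object() and zs_unmap_object() play for real handles.
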
stat
====

With CONFIG_ZSMALLOC_STAT enabled, zsmalloc internal information is
available via ``/sys/kernel/debug/zsmalloc/<user name>``. Here is a sample
of stat output::

    # cat /sys/kernel/debug/zsmalloc/zram0/classes

    class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
       ...
       ...
        9   176           0            1           186        129          8                4
       10   192           1            0          2880       2872        135                3
       11   208           0            1           819        795         42                2
       12   224           0            1           219        159         12                4
       ...
       ...

class
    index
size
    object size zspage stores
almost_empty
    the number of ZS_ALMOST_EMPTY zspages (see below)
almost_full
    the number of ZS_ALMOST_FULL zspages (see below)
obj_allocated
    the number of objects allocated
obj_used
    the number of objects allocated to the user
pages_used
    the number of pages allocated for the class
pages_per_zspage
    the number of 0-order pages to make a zspage

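The columns are related to each other by simple arithmetic. The sketch
below assumes a 4096-byte page and reads ``obj_allocated`` as the number of
object slots in all zspages of the class. The sample rows above are
consistent with that reading, but both points are assumptions, and the
helper names are hypothetical.

```c
#include <assert.h>

/* Assumption for illustration only: 4 KiB pages. */
#define PAGE_SIZE_BYTES 4096UL

/* Object slots that fit into one zspage of the given class. */
static unsigned long objs_per_zspage(unsigned long size,
				     unsigned long pages_per_zspage)
{
	assert(size > 0 && size <= pages_per_zspage * PAGE_SIZE_BYTES);
	return pages_per_zspage * PAGE_SIZE_BYTES / size;
}

/* Number of zspages implied by the obj_allocated column. */
static unsigned long zspage_count(unsigned long obj_allocated,
				  unsigned long size,
				  unsigned long pages_per_zspage)
{
	return obj_allocated / objs_per_zspage(size, pages_per_zspage);
}

/* Slots that exist but are not currently handed out to the user. */
static unsigned long unused_slots(unsigned long obj_allocated,
				  unsigned long obj_used)
{
	assert(obj_used <= obj_allocated);
	return obj_allocated - obj_used;
}
```

For class 10 in the sample (size 192, 3 pages per zspage), this gives 64
slots per zspage, 45 zspages for 2880 allocated slots, 45 * 3 = 135 pages
(matching ``pages_used``), and 8 unused slots.
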
We assign a zspage to the ZS_ALMOST_EMPTY fullness group when n <= N / f,
where

* n = number of allocated objects
* N = total number of objects zspage can store
* f = fullness_threshold_frac (i.e., 4 at the moment)

Similarly, we assign zspage to:

* ZS_ALMOST_FULL when n > N / f
* ZS_EMPTY when n == 0
* ZS_FULL when n == N
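
The rules above can be written as one small function. This is a sketch of
the conditions as stated in this document, not a copy of the kernel code.
It assumes integer division for N / f and that the ZS_EMPTY and ZS_FULL
cases are checked first, since n == 0 would otherwise also satisfy
n <= N / f. The enum ordering and the function name are hypothetical.

```c
#include <assert.h>

/* Enum values are illustrative; only the names come from the text. */
enum fullness_group {
	ZS_EMPTY,
	ZS_ALMOST_EMPTY,
	ZS_ALMOST_FULL,
	ZS_FULL,
};

/*
 * n: number of allocated objects
 * N: total number of objects the zspage can store
 * f: fullness threshold fraction (4 in the text above)
 */
static enum fullness_group classify_fullness(unsigned int n, unsigned int N,
					     unsigned int f)
{
	assert(f > 0 && n <= N);

	if (n == 0)
		return ZS_EMPTY;
	if (n == N)
		return ZS_FULL;
	if (n <= N / f)		/* integer division, an assumption */
		return ZS_ALMOST_EMPTY;
	return ZS_ALMOST_FULL;
}
```

With N = 64 and f = 4, a zspage holding 16 objects is classified as
ZS_ALMOST_EMPTY and one holding 17 objects as ZS_ALMOST_FULL.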