.. _page_owner:

============================================
page owner: Tracking who allocated each page
============================================

Introduction
============

page owner tracks who allocated each page.
It can be used to debug memory leaks or to find a memory hog.
When an allocation happens, information about it, such as the call stack
and the order of the pages, is stored in dedicated storage for each page.
When we need to know the status of all pages, we can retrieve and analyze
this information.

Although tracepoints for tracing page allocation and freeing already exist,
using them to analyze who allocated each page is rather complex. The trace
buffer must be enlarged so that it is not overwritten before the userspace
program is launched. The launched program must then continually dump the
trace buffer for later analysis, which is more likely to change system
behaviour than simply keeping the information in memory, and so is bad for
debugging.

page owner can also be used for other purposes. For example, accurate
fragmentation statistics can be obtained from the gfp flag information of
each page. This is already implemented and is activated when page owner is
enabled. Other uses are more than welcome.

page owner is disabled by default, so to use it you need to add
"page_owner=on" to your boot cmdline. If the kernel is built with page
owner but page owner is disabled at runtime because the boot option is not
given, the runtime overhead is marginal. When disabled at runtime, it
requires no memory to store owner information, so there is no runtime
memory overhead. In addition, page owner inserts just two unlikely branches
into the page allocator hotpath; if it is not enabled, allocation proceeds
as in a kernel built without page owner. These two unlikely branches should
not affect allocation performance, especially if the static keys jump
label patching functionality is available. The following shows the kernel's
code size change due to this facility.

- Without page owner::

   text    data     bss     dec     hex filename
   48392   2333     644   51369    c8a9 mm/page_alloc.o

- With page owner::

   text    data     bss     dec     hex filename
   48800   2445     644   51889    cab1 mm/page_alloc.o
    6662    108      29    6799    1a8f mm/page_owner.o
    1025      8       8    1041     411 mm/page_ext.o

Although roughly 8 KB of code is added in total, page_alloc.o increases by
only 520 bytes, and less than half of that is in the hotpath. Building the
kernel with page owner and turning it on when needed is a good option for
debugging kernel memory problems.

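The figures quoted above can be re-derived from the two ``size`` listings.
The short Python sketch below only repeats that arithmetic; the dictionary
and function names are illustrative and not part of any kernel tooling.

.. code-block:: python

    # (text, data, bss) in bytes, copied from the size listings above.
    without_owner = {"mm/page_alloc.o": (48392, 2333, 644)}
    with_owner = {
        "mm/page_alloc.o": (48800, 2445, 644),
        "mm/page_owner.o": (6662, 108, 29),
        "mm/page_ext.o": (1025, 8, 8),
    }


    def total(objects):
        """Sum text + data + bss over all listed object files."""
        return sum(sum(sizes) for sizes in objects.values())


    page_alloc_growth = (sum(with_owner["mm/page_alloc.o"])
                         - sum(without_owner["mm/page_alloc.o"]))
    total_growth = total(with_owner) - total(without_owner)

    print(page_alloc_growth)  # 520
    print(total_growth)       # 8360 bytes, roughly 8 KB
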
One caveat arises from an implementation detail. page owner stores its
information in memory provided by the struct page extension. On sparse
memory systems, this memory is initialized some time after the page
allocator starts, so many pages can be allocated before initialization and
would have no owner information. To fix this, these early allocated pages
are examined and marked as allocated during the initialization phase.
Although this does not give them the right owner information, it at least
lets us tell more accurately whether a page is allocated or not. On an
x86-64 VM with 2GB of memory, 13343 early allocated pages are caught and
marked, although most of them are allocated by the struct page extension
feature itself. After that, no page is left in an untracked state.

Usage
=====

1) Build user-space helper::

	cd tools/vm
	make page_owner_sort

2) Enable page owner: add "page_owner=on" to boot cmdline.

3) Do the job that you want to debug.

4) Analyze information from page owner::

	cat /sys/kernel/debug/page_owner > page_owner_full.txt
	./page_owner_sort page_owner_full.txt sorted_page_owner.txt

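Before collecting data it can help to confirm that step 2 took effect. The
sketch below is a minimal Python check and is not part of the kernel tree:
it assumes the standard ``/proc/cmdline`` file is readable, and the helper
name ``page_owner_requested`` is hypothetical. It only tells whether
``page_owner=on`` was passed at boot, not whether the kernel was built with
page owner support.

.. code-block:: python

    def page_owner_requested(cmdline: str) -> bool:
        """Return True if the boot command line contains page_owner=on."""
        return "page_owner=on" in cmdline.split()


    if __name__ == "__main__":
        try:
            with open("/proc/cmdline") as f:
                print(page_owner_requested(f.read()))
        except OSError:
            print("cannot read /proc/cmdline")
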
The general output of ``page_owner_full.txt`` is as follows::

   Page allocated via order XXX, ...
   PFN XXX ...
   // Detailed stack

   Page allocated via order XXX, ...
   PFN XXX ...
   // Detailed stack

The ``page_owner_sort`` tool ignores ``PFN`` rows, puts the remaining rows
in buf, uses a regexp to extract the page order value, counts the number of
occurrences (times) and the pages of each buf, and finally sorts them by
times.

The result showing who allocated each page is written to
``sorted_page_owner.txt``. General output::

   XXX times, XXX pages:
   Page allocated via order XXX, ...
   // Detailed stack

By default, ``page_owner_sort`` sorts by the times of buf.
To sort by the number of pages of buf instead, use the ``-m`` parameter.
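
The aggregation performed by ``page_owner_sort`` can be sketched in Python.
This is an illustration under stated assumptions, not the actual tool
source: it assumes records are separated by blank lines, that each record
starts with a ``Page allocated via order N`` line, and that a record of
order N covers ``2**N`` pages. The function name ``summarize`` and its
``by_pages`` argument (mirroring ``-m``) are hypothetical.

.. code-block:: python

    import re
    from collections import Counter

    ORDER_RE = re.compile(r"order\s+(\d+)")


    def summarize(text, by_pages=False):
        """Group identical records; return (times, pages, record) tuples."""
        times = Counter()
        pages = Counter()
        for block in text.strip().split("\n\n"):
            # Drop PFN rows so records differing only in PFN compare equal.
            rows = [r for r in block.splitlines()
                    if r.strip() and not r.lstrip().startswith("PFN")]
            if not rows:
                continue
            match = ORDER_RE.search(rows[0])
            if match is None:
                continue  # not an allocation record
            key = "\n".join(rows)
            times[key] += 1
            pages[key] += 1 << int(match.group(1))  # assumed 2**order pages
        result = [(times[k], pages[k], k) for k in times]
        # Sort descending by times (default) or by pages (like -m).
        result.sort(key=lambda t: t[1] if by_pages else t[0], reverse=True)
        return result

Calling ``summarize(open("page_owner_full.txt").read())`` would then yield
one tuple per distinct stack, most frequent first.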