.. _page_owner:

==================================================
page owner: Tracking about who allocated each page
==================================================

Introduction
============

page owner tracks who allocated each page.
It can be used to debug memory leaks or to find a memory hog.
When an allocation happens, information about it, such as the call stack
and the order of the pages, is stored in dedicated storage for each page.
When we need to know the status of all pages, we can retrieve and analyze
this information.

Although we already have tracepoints for tracing page allocation and
freeing, using them to analyze who allocated each page is rather complex.
We need to enlarge the trace buffer to prevent it from being overwritten
before the userspace program is launched. Once launched, that program has
to continually dump the trace buffer for later analysis, which is more
likely to change system behaviour than simply keeping the information in
memory, and that is bad for debugging.

page owner can also be used for other purposes. For example, accurate
fragmentation statistics can be obtained from the gfp flag information of
each page. This is already implemented and is activated if page owner is
enabled. Other usages are more than welcome.

page owner is disabled by default. So, if you'd like to use it, you need
to add "page_owner=on" to your boot cmdline. If the kernel is built
with page owner but page owner is disabled at runtime because the boot
option is not given, the runtime overhead is marginal. When disabled at
runtime, it doesn't require memory to store owner information, so there is
no runtime memory overhead. In addition, page owner inserts just two
unlikely branches into the page allocator hotpath, and if it is not
enabled, allocation is done just as in a kernel without page owner. These
two unlikely branches should not affect allocation performance, especially
if the static keys jump label patching functionality is available. The
following shows the kernel's code size change due to this facility.

- Without page owner::

      text    data     bss     dec     hex filename
     40662    1493     644   42799    a72f mm/page_alloc.o

- With page owner::

      text    data     bss     dec     hex filename
     40892    1493     644   43029    a815 mm/page_alloc.o
      1427      24       8    1459     5b3 mm/page_ext.o
      2722      50       0    2772     ad4 mm/page_owner.o

Although roughly 4 KB of code is added in total, page_alloc.o increases by
only 230 bytes, and only half of that is in the hotpath. Building the
kernel with page owner and turning it on when needed is a good way to
debug kernel memory problems.
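
The figures quoted above can be rechecked from the ``dec`` columns of the
two size listings. The following is only a small sketch in shell
arithmetic; the variable names are made up for illustration, and the
values are copied from the listings.

```shell
#!/bin/sh
# "dec" column values copied from the size listings in this document.
# All variable names here are illustrative, not part of any kernel tool.
without_page_alloc=42799   # mm/page_alloc.o without page owner
with_page_alloc=43029      # mm/page_alloc.o with page owner
page_ext=1459              # mm/page_ext.o
page_owner=2772            # mm/page_owner.o

# Growth of page_alloc.o alone, then the total including the two new objects.
page_alloc_delta=$((with_page_alloc - without_page_alloc))
total_delta=$((page_alloc_delta + page_ext + page_owner))

echo "page_alloc.o grows by $page_alloc_delta bytes"   # 230
echo "total added: $total_delta bytes"                 # 4461
```

The total of 4461 bytes is what the text rounds to roughly 4 KB.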

There is one caveat that is caused by an implementation detail. page owner
stores its information in memory provided by the struct page extension.
On sparse memory systems, this memory is initialized some time after the
page allocator starts, so many pages can be allocated before
initialization and they would have no owner information. To fix this up,
these early allocated pages are examined and marked as allocated during
the initialization phase. Although this doesn't mean that they have the
right owner information, at least we can tell more accurately whether a
page is allocated or not. On a 2GB memory x86-64 VM box, 13343 early
allocated pages are caught and marked, although they are mostly allocated
by the struct page extension feature itself. In any case, after that, no
page is left in an untracked state.

Usage
=====

1) Build user-space helper::

	cd tools/vm
	make page_owner_sort

2) Enable page owner: add "page_owner=on" to boot cmdline.

3) Run the workload that you want to debug.

4) Analyze information from page owner::

	cat /sys/kernel/debug/page_owner > page_owner_full.txt
	grep -v ^PFN page_owner_full.txt > page_owner.txt
	./page_owner_sort page_owner.txt sorted_page_owner.txt

   The result, showing who allocated each page, is in
   ``sorted_page_owner.txt``.
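
Before running step 4, it can help to confirm that page owner is actually
enabled. The sketch below assumes a POSIX shell. ``has_page_owner_on`` and
``check_page_owner`` are hypothetical helper names, not part of tools/vm,
and the only paths used are /proc/cmdline and the debugfs file from step 4.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line string
# contains the exact option "page_owner=on" as a separate word.
has_page_owner_on() {
    case " $1 " in
        *" page_owner=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# debugfs file read in step 4; override it if debugfs is mounted elsewhere.
PAGE_OWNER_FILE=${PAGE_OWNER_FILE:-/sys/kernel/debug/page_owner}

# Hypothetical helper: report why page owner data may be unavailable.
# $1 is the kernel command line to inspect.
check_page_owner() {
    if ! has_page_owner_on "$1"; then
        echo "page_owner=on is missing from the boot cmdline"
        return 1
    fi
    if [ ! -r "$PAGE_OWNER_FILE" ]; then
        echo "$PAGE_OWNER_FILE is not readable"
        return 1
    fi
    echo "page owner looks enabled"
}

# Usage on a live system (typically as root, with debugfs mounted):
#   check_page_owner "$(cat /proc/cmdline)"
```

An unreadable file with the boot option present usually means that
debugfs is not mounted, that the command was not run as root, or that the
kernel was built without page owner.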