================================================================
VMCOREINFO
================================================================

===========
What is it?
===========

VMCOREINFO is a special ELF note section. It contains various
information from the kernel like structure size, page size, symbol
values, field offsets, etc. These data are packed into an ELF note
section and used by user-space tools like crash and makedumpfile to
analyze a kernel's memory layout.
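
As an illustration of how a tool might consume this data, here is a
minimal Python sketch. It assumes the note payload is plain text with
one KEY=VALUE entry per line, using keys such as SYMBOL(name),
SIZE(name), OFFSET(struct.member), LENGTH(name) and NUMBER(name); that
layout is an assumption of the sketch and is not described in this
document. The sample values and the helper name parse_vmcoreinfo are
invented for the example.

```python
import re

# Hypothetical sample text; real values come from the note in a vmcore.
SAMPLE = """\
OSRELEASE=5.0.0-example
PAGESIZE=4096
SYMBOL(swapper_pg_dir)=ffffffff8200a000
SIZE(page)=64
OFFSET(page.flags)=0
LENGTH(zone.free_area)=11
NUMBER(NR_FREE_PAGES)=0
"""

# Either "KIND(name)=value" or a plain "KEY=value" entry.
ENTRY = re.compile(
    r"^(?:(SYMBOL|SIZE|OFFSET|LENGTH|NUMBER)\((.+)\)|([^=]+))=(.*)$"
)


def parse_vmcoreinfo(text):
    """Return a dict keyed by (kind, name) tuples or by plain key strings."""
    info = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue  # skip anything that does not look like an entry
        kind, name, plain, value = match.groups()
        if kind == "SYMBOL":
            info[(kind, name)] = int(value, 16)  # assumed hexadecimal
        elif kind:
            info[(kind, name)] = int(value)      # assumed decimal
        else:
            info[plain] = value                  # kept as a string
    return info


info = parse_vmcoreinfo(SAMPLE)
```

With the sample above, info[("SIZE", "page")] is the integer 64 and
info["OSRELEASE"] is the release string.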

================
Common variables
================

init_uts_ns.name.release
------------------------

The version of the Linux kernel. Used to find the corresponding source
code from which the kernel has been built. For example, crash uses it to
find the corresponding vmlinux in order to process vmcore.

PAGE_SIZE
---------

The size of a page. It is the smallest unit of data used by the memory
management facilities. It is usually 4096 bytes, and pages are aligned
on a PAGE_SIZE boundary. Used for computing page addresses.

init_uts_ns
-----------

The UTS namespace which is used to isolate two specific elements of the
system that relate to the uname(2) system call. It is named after the
data structure used to store information returned by the uname(2) system
call.

User-space tools can get the kernel name, host name, kernel release
number, kernel version, architecture name and OS type from it.

node_online_map
---------------

An array node_states[N_ONLINE] which represents the set of online nodes
in a system, one bit position per node number. Used to keep track of
which nodes are in the system and online.

swapper_pg_dir
--------------

The global page directory pointer of the kernel. Used to translate
virtual to physical addresses.

_stext
------

Defines the beginning of the text section. In general, _stext indicates
the kernel start address. Used to convert a virtual address from the
direct kernel map to a physical address.

vmap_area_list
--------------

Stores the virtual area list. makedumpfile gets the vmalloc start value
from this variable and its value is necessary for vmalloc translation.

mem_map
-------

Physical addresses are translated to struct pages by treating them as
an index into the mem_map array. Right-shifting a physical address
PAGE_SHIFT bits converts it into a page frame number which is an index
into that mem_map array.

Used to map an address to the corresponding struct page.
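
The mem_map arithmetic can be sketched in Python as follows. This is
only an illustration: it assumes a flat memory model in which mem_map[0]
describes page frame 0, a PAGE_SHIFT of 12, and a struct page size taken
from the exported size of page. The function name and the example
addresses are invented.

```python
PAGE_SHIFT = 12  # assumption: PAGE_SIZE == 4096


def phys_to_page(paddr, mem_map, sizeof_page, page_shift=PAGE_SHIFT):
    """Return the virtual address of the struct page describing paddr.

    mem_map     -- value of the mem_map symbol (address of the array)
    sizeof_page -- exported size of struct page
    """
    pfn = paddr >> page_shift          # page frame number
    return mem_map + pfn * sizeof_page  # address of mem_map[pfn]


# Invented example values.
page_addr = phys_to_page(0x12345678, mem_map=0xFFFFEA0000000000,
                         sizeof_page=64)
```

Configurations where the first valid page frame is not 0, or which use
the sparse memory model, need a different lookup.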

contig_page_data
----------------

Makedumpfile gets the pglist_data structure from this symbol, which is
used to describe the memory layout.

User-space tools use this to exclude free pages when dumping memory.

mem_section|(mem_section, NR_SECTION_ROOTS)|(mem_section, section_mem_map)
--------------------------------------------------------------------------

The address of the mem_section array, its length, structure size, and
the section_mem_map offset.

It exists in the sparse memory mapping model and is somewhat similar to
the mem_map variable: both are used to translate an address.

page
----

The size of a page structure. struct page is an important data structure
and it is widely used to compute contiguous memory.

pglist_data
-----------

The size of a pglist_data structure. This value is used to check if the
pglist_data structure is valid. It is also used for checking the memory
type.

zone
----

The size of a zone structure. This value is used to check if the zone
structure has been found. It is also used for excluding free pages.

free_area
---------

The size of a free_area structure. It indicates whether the free_area
structure is valid or not. Useful when excluding free pages.

list_head
---------

The size of a list_head structure. Used when iterating lists in a
post-mortem analysis session.

nodemask_t
----------

The size of a nodemask_t type. Used to compute the number of online
nodes.

(page, flags|_refcount|mapping|lru|_mapcount|private|compound_dtor|
 compound_order|compound_head)
-------------------------------------------------------------------

User-space tools compute their values based on the offset of these
variables. The variables are used when excluding unnecessary pages.

(pglist_data, node_zones|nr_zones|node_mem_map|node_start_pfn|
 node_spanned_pages|node_id)
-------------------------------------------------------------------

On NUMA machines, each NUMA node has a pg_data_t to describe its memory
layout. On UMA machines there is a single pglist_data which describes the
whole memory.

These values are used to check the memory type and to compute the
virtual address for memory map.

(zone, free_area|vm_stat|spanned_pages)
---------------------------------------

Each node is divided into a number of blocks called zones which
represent ranges within memory. A zone is described by a structure zone.

User-space tools compute required values based on the offset of these
variables.

(free_area, free_list)
----------------------

Offset of the free_list member. This value is used to compute the number
of free pages.

Each zone has a free_area structure array called free_area[MAX_ORDER].
The free_list represents a linked list of free page blocks.

(list_head, next|prev)
----------------------

Offsets of the list_head's members. list_head is used to define a
circular linked list. User-space tools need these in order to traverse
lists.
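
A traversal based on the next offset might look like the Python sketch
below. It is not taken from crash or makedumpfile. The read_mem callback,
the 8-byte little-endian pointer format and the fake memory contents are
all assumptions made for the example.

```python
import struct


def walk_list(read_mem, head, next_offset, ptr_size=8, max_nodes=1_000_000):
    """Yield the address of every list_head linked after `head`.

    read_mem(addr, size) must return `size` bytes of dump memory.
    next_offset is the exported offset of list_head.next.
    max_nodes guards against a corrupted (never closing) list.
    """
    fmt = "<Q" if ptr_size == 8 else "<I"  # little-endian assumed
    node = struct.unpack(fmt, read_mem(head + next_offset, ptr_size))[0]
    for _ in range(max_nodes):
        if node == head:   # the list is circular: back at the head
            return
        yield node
        node = struct.unpack(fmt, read_mem(node + next_offset, ptr_size))[0]


# Fake memory for the demo: address -> value of the `next` pointer there.
FAKE_MEMORY = {0x1000: 0x2000, 0x2000: 0x3000, 0x3000: 0x1000}


def fake_read(addr, size):
    return struct.pack("<Q", FAKE_MEMORY[addr])[:size]


nodes = list(walk_list(fake_read, head=0x1000, next_offset=0))
```

Each yielded address points at the embedded list_head, so a caller would
subtract the offset of that member within the containing structure to
reach the structure itself.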

(vmap_area, va_start|list)
--------------------------

Offsets of the vmap_area's members. They carry vmalloc-specific
information. Makedumpfile gets the start address of the vmalloc region
from this.

(zone.free_area, MAX_ORDER)
---------------------------

Free areas descriptor. User-space tools use this value to iterate the
free_area ranges. MAX_ORDER is used by the zone buddy allocator.

log_first_idx
-------------

Index of the first record stored in the buffer log_buf. Used by
user-space tools to read the strings in the log_buf.

log_buf
-------

Console output is written to the ring buffer log_buf. The first record
still stored in it is found at index log_first_idx. Used to get the
kernel log.

log_buf_len
-----------

log_buf's length.

clear_idx
---------

The index of the next printk() record to read after the last clear
command. It indicates the first record after the last
SYSLOG_ACTION_CLEAR, such as one issued by 'dmesg -c'. Used by
user-space tools to dump the dmesg log.

log_next_idx
------------

The index of the next record to store in the buffer log_buf. Used to
compute the index of the current buffer position.

printk_log
----------

The size of a structure printk_log. Used to compute the size of
messages, and extract dmesg log. It encapsulates header information for
log_buf, such as timestamp, syslog level, etc.

(printk_log, ts_nsec|len|text_len|dict_len)
-------------------------------------------

It represents field offsets in struct printk_log. User-space tools
parse it and check whether the values of printk_log's members have been
changed.
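
To show how the log variables fit together, the following Python sketch
walks records from log_first_idx to log_next_idx. It rests on several
assumptions that this document does not state: ts_nsec is a 64-bit
field, len and text_len are 16-bit fields, the message text directly
follows the header, a record length of zero marks a wrap to the start of
the buffer, and the dump is little-endian. The helper names and the demo
buffer are invented.

```python
import struct


def iter_log_records(log_buf, first_idx, next_idx, size, off):
    """Yield (ts_nsec, text) for each record between the two indices.

    size -- exported size of struct printk_log
    off  -- dict of exported printk_log member offsets
    """
    idx = first_idx
    while idx != next_idx:
        (length,) = struct.unpack_from("<H", log_buf, idx + off["len"])
        if length == 0:
            if idx == 0:
                break      # corrupt buffer; avoid looping forever
            idx = 0        # assumed wrap marker: restart at the beginning
            continue
        (ts_nsec,) = struct.unpack_from("<Q", log_buf, idx + off["ts_nsec"])
        (text_len,) = struct.unpack_from("<H", log_buf, idx + off["text_len"])
        start = idx + size  # text is assumed to follow the header
        text = log_buf[start:start + text_len].decode("utf-8", "replace")
        yield ts_nsec, text
        idx += length


def make_record(ts_nsec, text):
    """Build one fake record for the demo (16-byte header assumed)."""
    length = (16 + len(text) + 3) & ~3  # pad to a 4-byte boundary
    header = struct.pack("<QHHHBB", ts_nsec, length, len(text), 0, 0, 0)
    return (header + text).ljust(length, b"\0")


# Invented offsets and buffer contents, for illustration only.
OFFSETS = {"ts_nsec": 0, "len": 8, "text_len": 10, "dict_len": 12}
demo_buf = make_record(1000, b"hello") + make_record(2000, b"world!")
records = list(iter_log_records(demo_buf, 0, len(demo_buf), 16, OFFSETS))
```

A real tool would take size and the offsets from VMCOREINFO rather than
hard-coding them, which is what lets it notice when the structure layout
has changed.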

(free_area.free_list, MIGRATE_TYPES)
------------------------------------

The number of migrate types for pages. The free_list is described by the
array. Used by tools to compute the number of free pages.

NR_FREE_PAGES
-------------

On linux-2.6.21 or later, the number of free pages is in
vm_stat[NR_FREE_PAGES]. Used to get the number of free pages.

PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|PG_hwpoison
|PG_head_mask|PAGE_BUDDY_MAPCOUNT_VALUE(~PG_buddy)
|PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline)
-----------------------------------------------------------------

Page attributes. These flags are used to filter out various pages that
are unnecessary for dumping.

HUGETLB_PAGE_DTOR
-----------------

The HUGETLB_PAGE_DTOR flag denotes hugetlbfs pages. Makedumpfile
excludes these pages.

======
x86_64
======

phys_base
---------

Used to convert the virtual address of an exported kernel symbol to its
corresponding physical address.
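
A rough Python sketch of that conversion follows. It assumes the x86_64
kernel text mapping starts at the virtual address 0xffffffff80000000
(called START_KERNEL_MAP here); that constant, the function name and
the example values are assumptions of the sketch and are not exported
through VMCOREINFO.

```python
START_KERNEL_MAP = 0xFFFFFFFF80000000  # assumed base of the kernel text mapping


def kernel_symbol_to_phys(vaddr, phys_base):
    """Translate a kernel-text virtual address to a physical address."""
    if vaddr < START_KERNEL_MAP:
        raise ValueError("address is not inside the kernel text mapping")
    return vaddr - START_KERNEL_MAP + phys_base


# Invented example values.
paddr = kernel_symbol_to_phys(0xFFFFFFFF81000000, phys_base=0x1000000)
```

Addresses outside the kernel text mapping, such as direct-map or vmalloc
addresses, need a different translation.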

init_top_pgt
------------

Used to walk through the whole page table and convert virtual addresses
to physical addresses. The init_top_pgt is somewhat similar to
swapper_pg_dir, but it is only used in x86_64.

pgtable_l5_enabled
------------------

User-space tools need to know whether the crash kernel was in 5-level
paging mode.

node_data
---------

This is a struct pglist_data array and stores all NUMA nodes
information. Makedumpfile gets the pglist_data structure from it.

(node_data, MAX_NUMNODES)
-------------------------

The maximum number of nodes in the system.

KERNELOFFSET
------------

The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.

KERNEL_IMAGE_SIZE
-----------------

Currently unused by Makedumpfile. Used to compute the module virtual
address by Crash.

sme_mask
--------

AMD-specific with SME support: it indicates the secure memory encryption
mask. Makedumpfile tools need to know whether the crash kernel was
encrypted. If SME is enabled in the first kernel, the crash kernel's
page table entries (pgd/pud/pmd/pte) contain the memory encryption
mask. This is used to remove the SME mask and obtain the true physical
address.

Currently, sme_mask stores the value of the C-bit position. If needed,
additional SME-relevant info can be placed in that variable.

For example:
[ misc             ][ enc bit ][ other misc SME info     ]
0000_0000_0000_0000_1000_0000_0000_0000_0000_0000_..._0000
63   59   55   51   47   43   39   35   31   27   ... 3
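
Removing the mask from a page table entry can be sketched in Python as
below. The sketch assumes a 64-bit entry that maps a 4 KiB page with the
physical address held in bits 12 to 51; the constant and function names
and the example entry are invented, and the C-bit at position 47 simply
mirrors the example above.

```python
ENTRY_ADDR_MASK = 0x000FFFFFFFFFF000  # assumed: address bits 12..51 of an entry


def entry_to_phys(entry, sme_mask):
    """Clear the SME encryption bit, then keep only the address bits."""
    return entry & ~sme_mask & ENTRY_ADDR_MASK


sme_mask = 1 << 47                       # C-bit position from the example
entry = sme_mask | 0x12345000 | 0x63     # invented entry with low flag bits
paddr = entry_to_phys(entry, sme_mask)
```

When SME is disabled the mask is zero and the same function leaves the
address bits unchanged.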

======
x86_32
======

X86_PAE
-------

Denotes whether physical address extensions are enabled. It has the cost
of a higher page table lookup overhead, and also consumes more page
table space per process. Used to check whether PAE was enabled in the
crash kernel when converting virtual addresses to physical addresses.

====
ia64
====

pgdat_list|(pgdat_list, MAX_NUMNODES)
-------------------------------------

pg_data_t array storing all NUMA nodes information. MAX_NUMNODES
indicates the number of the nodes.

node_memblk|(node_memblk, NR_NODE_MEMBLKS)
------------------------------------------

List of node memory chunks. Filled when parsing the SRAT table to obtain
information about memory nodes. NR_NODE_MEMBLKS indicates the number of
node memory chunks.

These values are used to compute the number of nodes the crashed kernel used.

node_memblk_s|(node_memblk_s, start_paddr)|(node_memblk_s, size)
----------------------------------------------------------------

The size of a struct node_memblk_s and the offsets of the
node_memblk_s's members. Used to compute the number of nodes.

PGTABLE_3|PGTABLE_4
-------------------

User-space tools need to know whether the crash kernel was in 3-level or
4-level paging mode. Used to distinguish the page table.

=====
ARM64
=====

VA_BITS
-------

The maximum number of bits for virtual addresses. Used to compute the
virtual memory ranges.

kimage_voffset
--------------

The offset between the kernel virtual and physical mappings. Used to
translate virtual to physical addresses.
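
For a kernel image address the translation is a single subtraction, as
in the Python sketch below. The function name and the example values are
invented, and the sketch assumes the address lies inside the kernel
image mapping rather than the linear map.

```python
def kimage_to_phys(vaddr, kimage_voffset):
    """Translate a kernel-image virtual address to a physical address."""
    return vaddr - kimage_voffset


# Invented example values.
paddr = kimage_to_phys(0xFFFF000010080000, kimage_voffset=0xFFFEFFFFD0000000)
```

With these example values the result is 0x40080000.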

PHYS_OFFSET
-----------

Indicates the physical address of the start of memory. Like
kimage_voffset, it is used to translate virtual to physical addresses.

KERNELOFFSET
------------

The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.

====
arm
====

ARM_LPAE
--------

It indicates whether the crash kernel supports large physical address
extensions. Used to translate virtual to physical addresses.

====
s390
====

lowcore_ptr
-----------

An array with a pointer to the lowcore of every CPU. Used to print the
psw and all registers information.

high_memory
-----------

Used to get the vmalloc_start address from the high_memory symbol.

(lowcore_ptr, NR_CPUS)
----------------------

The maximum number of CPUs.

=======
powerpc
=======

node_data|(node_data, MAX_NUMNODES)
-----------------------------------

See above.

contig_page_data
----------------

See above.

vmemmap_list
------------

The vmemmap_list maintains the entire vmemmap physical mapping. Used
to get vmemmap list count and populated vmemmap regions info. If the
vmemmap address translation information is stored in the crash kernel,
it is used to translate vmemmap kernel virtual addresses.

mmu_vmemmap_psize
-----------------

The size of a page. Used to translate virtual to physical addresses.

mmu_psize_defs
--------------

Page size definitions, e.g. 4k, 64k, or 16M.

Used to make vtop translations.

vmemmap_backing|(vmemmap_backing, list)|(vmemmap_backing, phys)|
(vmemmap_backing, virt_addr)
----------------------------------------------------------------

The vmemmap virtual address space management does not have a traditional
page table to track which virtual struct pages are backed by a physical
mapping. The virtual to physical mappings are tracked in a simple linked
list format.

User-space tools need to know the offset of list, phys and virt_addr
when computing the count of vmemmap regions.

mmu_psize_def|(mmu_psize_def, shift)
------------------------------------

The size of a struct mmu_psize_def and the offset of mmu_psize_def's
member.

Used in vtop translations.

==
sh
==

node_data|(node_data, MAX_NUMNODES)
-----------------------------------

See above.

X2TLB
-----

Indicates whether the crashed kernel enabled SH extended mode.