=====================
BPF Type Format (BTF)
=====================

1. Introduction
***************

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signatures, etc. The
function signature enables better bpf program/function kernel symbols. The
line info helps generate source-annotated translated byte code, jited code and
verifier log.

The BTF specification contains two parts,
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
*******************************

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The beginning of the data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big- or
little-endian target. The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.
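
The magic check described above can be sketched in C as follows. This is a
minimal illustration, not kernel or libbpf code: the helper
``btf_blob_endianness`` and the ``btf_endian`` enum are hypothetical names,
and only the magic value ``0xeB9F`` comes from this document.

```c
#include <stddef.h>
#include <stdint.h>

#define BTF_MAGIC 0xeB9F

/* Hypothetical result type, used only in this sketch. */
enum btf_endian { BTF_ENDIAN_UNKNOWN, BTF_ENDIAN_LITTLE, BTF_ENDIAN_BIG };

/*
 * Look at the first two bytes of a BTF blob and report which byte order
 * the 16-bit magic was written in.
 */
static inline enum btf_endian btf_blob_endianness(const uint8_t *blob,
                                                  size_t len)
{
    const uint8_t lo = BTF_MAGIC & 0xff;         /* 0x9f */
    const uint8_t hi = (BTF_MAGIC >> 8) & 0xff;  /* 0xeb */

    if (len < 2)
        return BTF_ENDIAN_UNKNOWN;
    if (blob[0] == lo && blob[1] == hi)
        return BTF_ENDIAN_LITTLE;   /* least significant byte first */
    if (blob[0] == hi && blob[1] == lo)
        return BTF_ENDIAN_BIG;      /* most significant byte first */
    return BTF_ENDIAN_UNKNOWN;      /* not a BTF blob */
}
```

Reading the magic byte by byte avoids depending on the byte order of the
machine that runs the check.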

2.1 String Encoding
===================

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
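
Given that layout, resolving a ``name_off`` amounts to a bounds-checked
pointer into the string section. The sketch below assumes the section has
already been located in memory; ``btf_str_at`` is a hypothetical helper name,
not an existing API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the string at byte offset `off` in a string section that is
 * `str_len` bytes long. Returns NULL if the offset is out of range or the
 * string is not null-terminated within the section.
 */
static inline const char *btf_str_at(const char *strs, uint32_t str_len,
                                     uint32_t off)
{
    if (off >= str_len)
        return NULL;
    if (memchr(strs + off, '\0', str_len - off) == NULL)
        return NULL;
    return strs + off;
}
```

Offset ``0`` always yields the empty string, because the first string in the
section must be a null string.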

2.2 Type Encoding
=================

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and a type id is assigned to each recognized type starting from
id ``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration  */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type; it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-27: kind (e.g. int, ptr, array...etc)
         * bits 28-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union and fwd
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO and VAR.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail encoding of each kind.
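
The ``info`` bit layout documented in the comment above can be unpacked with
a few shifts and masks. The accessor names below are hypothetical and chosen
for this sketch only; the bit positions are the ones listed in
``struct btf_type``.

```c
#include <stdint.h>

/* bits 0-15: number of kind-specific entries (e.g. struct members) */
static inline uint32_t btf_info_vlen(uint32_t info)
{
    return info & 0xffff;
}

/* bits 24-27: one of the BTF_KIND_* values */
static inline uint32_t btf_info_kind(uint32_t info)
{
    return (info >> 24) & 0x0f;
}

/* bit 31: kind_flag */
static inline uint32_t btf_info_kind_flag(uint32_t info)
{
    return info >> 31;
}
```

For instance, an ``info`` value of ``0x84000003`` decodes to kind ``4``
(``BTF_KIND_STRUCT``), ``vlen`` 3, and ``kind_flag`` 1.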

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

    #define BTF_INT_SIGNED  (1 << 0)
    #define BTF_INT_CHAR    (1 << 1)
    #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encodings are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield is encoded with ``BTF_INT_BITS()`` equal
to 4. The ``btf_type.size * 8`` must be equal to or greater than
``BTF_INT_BITS()`` for the type. The maximum value of ``BTF_INT_BITS()`` is
128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bit ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.
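
The two equivalent bitfield encodings above can be checked with a small
sketch. The three macros are the ones quoted in this section; the helpers
``btf_int_first_bit`` and ``btf_int_bits_valid`` are hypothetical names used
only for illustration.

```c
#include <stdint.h>

#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

/* First bit occupied by a member, counted from the start of the struct. */
static inline uint32_t btf_int_first_bit(uint32_t member_bit_offset,
                                         uint32_t int_data)
{
    return member_bit_offset + BTF_INT_OFFSET(int_data);
}

/*
 * Check the two size rules stated above: BTF_INT_BITS() is at most 128
 * and does not exceed btf_type.size * 8.
 */
static inline int btf_int_bits_valid(uint32_t size_bytes, uint32_t int_data)
{
    uint32_t bits = BTF_INT_BITS(int_data);

    return bits <= 128 && bits <= size_bytes * 8;
}
```

With member bit offset 100 and int data ``(2 << 16) | 4``, the first occupied
bit is 102, which matches member bit offset 102 with int data ``4``.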

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_PTR
 * ``info.vlen``: 0
 * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_ARRAY
 * ``info.vlen``: 0
 * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
 * ``type``: the element type
 * ``index_type``: the index type
 * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through the element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

 * [1]: int
 * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
 * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse a multidimensional array into a
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out, so a one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate a
proper chained representation for multidimensional arrays.
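
To see that the chained and the collapsed representations describe the same
number of elements, the chain can be walked while multiplying ``nelems``. The
sketch below uses a simplified, hypothetical in-memory table
(``struct toy_type``) rather than the wire format, and the depth limit of 32
is an arbitrary guard chosen for this example.

```c
#include <stdint.h>

enum { TOY_KIND_INT = 1, TOY_KIND_ARRAY = 3 };

/* Simplified view of one type, indexed by type id; not the BTF encoding. */
struct toy_type {
    uint32_t kind;
    uint32_t elem_type;  /* btf_array.type, meaningful for arrays only */
    uint32_t nelems;     /* btf_array.nelems, meaningful for arrays only */
};

/*
 * Multiply nelems along the chain of array types starting at type id `id`.
 * The walk stops at the first non-array type, at an out-of-range id, or
 * after 32 levels (a guard against malformed, cyclic input).
 */
static inline uint64_t toy_array_total_elems(const struct toy_type *types,
                                             uint32_t ntypes, uint32_t id)
{
    uint64_t total = 1;
    uint32_t depth = 0;

    while (id < ntypes && types[id].kind == TOY_KIND_ARRAY && depth++ < 32) {
        total *= types[id].nelems;
        id = types[id].elem_type;
    }
    return total;
}
```

For the ``int a[5][6]`` example, starting at type ``[3]`` gives
``5 * 6 = 30``, the same value a collapsed one-dimensional array stores in
``btf_array.nelems``.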

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0 or offset to a valid C identifier
 * ``info.kind_flag``: 0 or 1
 * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
 * ``info.vlen``: the number of struct/union members
 * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
 * ``name_off``: offset to a valid C identifier
 * ``type``: the member type
 * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only the bit
offset of the member. Note that the base type of the bitfield can only be int
or enum type. If the bitfield size is 32, the base type can be either int or
enum type. If the bitfield size is not 32, the base type must be int, and the
int type's ``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both the
member bitfield size and the bit offset. The bitfield size and bit offset are
calculated as below.::

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

 * ``BTF_INT_OFFSET()`` must be 0.
 * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
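
The two interpretations of ``btf_member.offset`` can be summarized in one
decoding routine. This is a sketch only: ``struct member_layout`` and
``decode_member_offset`` are hypothetical names, while the two macros are the
ones quoted above.

```c
#include <stdint.h>

#define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
#define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

/* Hypothetical result type for this sketch. */
struct member_layout {
    uint32_t bit_offset;     /* from the start of the struct/union */
    uint32_t bitfield_size;  /* only filled in when kind_flag is set */
};

static inline struct member_layout decode_member_offset(uint32_t offset,
                                                        int kind_flag)
{
    struct member_layout m;

    if (kind_flag) {
        /* Upper 8 bits: bitfield size; lower 24 bits: bit offset. */
        m.bit_offset = BTF_MEMBER_BIT_OFFSET(offset);
        m.bitfield_size = BTF_MEMBER_BITFIELD_SIZE(offset);
    } else {
        /* The whole field is the bit offset; any bitfield size comes
         * from BTF_INT_BITS() of the member's int type instead. */
        m.bit_offset = offset;
        m.bitfield_size = 0;
    }
    return m;
}
```

A 4-bit member at bit 102 is therefore stored as ``(4 << 24) | 102`` when
``kind_flag`` is set, and as plain ``102`` when it is not.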

The following kernel patch introduced ``kind_flag`` and explained why both
modes exist:

  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0 or offset to a valid C identifier
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_ENUM
 * ``info.vlen``: number of enum values
 * ``size``: 4

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
 * ``name_off``: offset to a valid C identifier
 * ``val``: any value

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a valid C identifier
 * ``info.kind_flag``: 0 for struct, 1 for union
 * ``info.kind``: BTF_KIND_FWD
 * ``info.vlen``: 0
 * ``type``: 0

No additional type data follow ``btf_type``.

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a valid C identifier
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_TYPEDEF
 * ``info.vlen``: 0
 * ``type``: the type which can be referred to by the name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_VOLATILE
 * ``info.vlen``: 0
 * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_CONST
 * ``info.vlen``: 0
 * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_RESTRICT
 * ``info.vlen``: 0
 * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a valid C identifier
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FUNC
 * ``info.vlen``: 0
 * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: 0
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FUNC_PROTO
 * ``info.vlen``: # of parameters
 * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a valid C identifier
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_VAR
 * ``info.vlen``: 0
 * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``struct btf_var`` encoding:
 * ``linkage``: currently only static variable 0, or globally allocated
   variable in ELF sections 1

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

 * static variables with or without section attributes
 * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a valid name associated with a variable or
   one of .data/.bss/.rodata
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_DATASEC
 * ``info.vlen``: # of variables
 * ``size``: total section size in bytes (0 at compilation time, patched
   to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``.::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
 * ``type``: the type of the BTF_KIND_VAR variable
 * ``offset``: the in-section offset of the variable
 * ``size``: the size of the variable in bytes
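
Because the type section is parsed sequentially, a parser has to know how
many bytes each record occupies before it can find the next one and assign
the next type id. The sketch below derives that size from the kind-specific
data listed in sections 2.2.1 through 2.2.15. The function name
``btf_record_size`` and the ``K_*`` constants are hypothetical; the byte
counts follow from the struct definitions shown above (three ``__u32``
fields in ``struct btf_type``, and so on).

```c
#include <stddef.h>
#include <stdint.h>

enum {
    K_INT = 1, K_PTR, K_ARRAY, K_STRUCT, K_UNION, K_ENUM, K_FWD,
    K_TYPEDEF, K_VOLATILE, K_CONST, K_RESTRICT, K_FUNC, K_FUNC_PROTO,
    K_VAR, K_DATASEC,
};

/*
 * Return the total size in bytes of one type record (common data plus
 * kind-specific data), or 0 if the kind is not one of the kinds above.
 */
static inline size_t btf_record_size(uint32_t info)
{
    const size_t base = 12;                  /* struct btf_type */
    size_t vlen = info & 0xffff;             /* bits 0-15 */
    uint32_t kind = (info >> 24) & 0x0f;     /* bits 24-27 */

    switch (kind) {
    case K_INT:
        return base + 4;                     /* one u32 */
    case K_ARRAY:
        return base + 12;                    /* struct btf_array */
    case K_STRUCT:
    case K_UNION:
        return base + vlen * 12;             /* struct btf_member[] */
    case K_ENUM:
        return base + vlen * 8;              /* struct btf_enum[] */
    case K_FUNC_PROTO:
        return base + vlen * 8;              /* struct btf_param[] */
    case K_VAR:
        return base + 4;                     /* struct btf_var */
    case K_DATASEC:
        return base + vlen * 12;             /* struct btf_var_secinfo[] */
    case K_PTR:
    case K_FWD:
    case K_TYPEDEF:
    case K_VOLATILE:
    case K_CONST:
    case K_RESTRICT:
    case K_FUNC:
        return base;                         /* no additional data */
    default:
        return 0;
    }
}
```

A walker would start at the beginning of the type section with type id 1,
add the returned size to its position, and increment the id for each record
until it reaches ``type_len``.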

3. BTF Kernel API
*****************

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.

3.1 BPF_BTF_LOAD
================

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to user space.

3.2 BPF_MAP_CREATE
==================

A map can be created with ``btf_fd`` and specified key/value type id.::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct bpf_map_def SEC("maps") btf_map = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(int),
        .value_size = sizeof(struct ipv_counts),
        .max_entries = 4,
    };
    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);

Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
value types for the map. During ELF parsing, libbpf is able to extract
key/value type_id's and assign them to BPF_MAP_CREATE attributes
automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
=================

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are arrays of the structures below,
respectively.::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to kernel makes it possible to extend the record itself in the future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
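
The two macros imply that ``line_col`` keeps the column in its low 10 bits
and the line number in the remaining upper bits. A round trip can be sketched
as below; ``bpf_line_col_pack`` is a hypothetical helper written for this
example, not a kernel or libbpf function.

```c
#include <stdint.h>

#define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
#define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

/*
 * Hypothetical inverse of the two macros above. Columns are truncated to
 * 10 bits; line numbers that do not fit in the upper 22 bits lose their
 * high bits.
 */
static inline uint32_t bpf_line_col_pack(uint32_t line, uint32_t col)
{
    return (line << 10) | (col & 0x3ff);
}
```

For example, line 42 and column 7 pack to ``(42 << 10) | 7``, and the macros
recover 42 and 7 from that value.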

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
==============================

In kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns all id's, one for
each command, to user space, for bpf programs or maps, respectively, so an
inspection tool can inspect all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
===============================

An introspection tool cannot use an id to get details about a program or map.
A file descriptor needs to be obtained first for reference-counting purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
==========================

Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which is BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
========================

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
****************************

4.1 .BTF section
================

The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
====================

The .BTF.ext section encodes func_info and line_info, which need loader
manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;
    };

It is very similar to the .BTF section. Instead of type/string sections, it
contains func_info and line_info sections. See :ref:`BPF_Prog_Load` for
details about the func_info and line_info record formats.

The func_info is organized as below.::

     func_info_rec_size
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section.::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.
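
The layout above (a ``u32`` record size followed by back-to-back
``btf_ext_info_sec`` entries) can be walked as in the following sketch. It
assumes the caller passes exactly the func_info or line_info area, including
the leading record size, in the byte order of the machine running the code.
``count_ext_info_records`` and ``struct ext_info_sec_hdr`` are hypothetical
names; libbpf's own parser is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of struct btf_ext_info_sec, without the trailing data[]. */
struct ext_info_sec_hdr {
    uint32_t sec_name_off;
    uint32_t num_info;
};

/*
 * Count the records in all btf_ext_info_sec entries of one info area.
 * Returns the total number of records, or -1 if the area is malformed.
 */
static inline long count_ext_info_records(const uint8_t *data, size_t len)
{
    uint32_t rec_size;
    size_t pos = sizeof(rec_size);
    long total = 0;

    if (len < sizeof(rec_size))
        return -1;
    memcpy(&rec_size, data, sizeof(rec_size));
    if (rec_size == 0)
        return -1;

    while (pos < len) {
        struct ext_info_sec_hdr hdr;
        size_t payload;

        if (len - pos < sizeof(hdr))
            return -1;                  /* truncated section header */
        memcpy(&hdr, data + pos, sizeof(hdr));
        pos += sizeof(hdr);

        if (hdr.num_info == 0)
            return -1;                  /* num_info must be greater than 0 */
        payload = (size_t)hdr.num_info * rec_size;
        if (len - pos < payload)
            return -1;                  /* truncated records */
        pos += payload;
        total += hdr.num_info;
    }
    return total;
}
```

Using the record size read from the area, rather than
``sizeof(struct bpf_func_info)``, is what lets a reader skip over records
that are larger than the structure it was compiled against.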

The line_info is organized as below.::

     line_info_rec_size
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).
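
Part of the loader manipulation mentioned earlier is therefore a unit
conversion. The sketch below converts a byte offset within a section into an
instruction index, relying on the fact that ``struct bpf_insn`` is 8 bytes.
The helper name is hypothetical, and a real loader additionally has to
account for where the section's instructions end up in the final program,
which is outside the scope of this example.

```c
#include <stdint.h>

#define BPF_INSN_SIZE 8u   /* sizeof(struct bpf_insn) */

/*
 * Convert an ELF-style insn_off (bytes from the start of the section) to a
 * kernel-style insn_off (index in units of struct bpf_insn).
 * Returns 0 on success, -1 if the byte offset is not instruction-aligned.
 */
static inline int insn_off_bytes_to_index(uint32_t byte_off,
                                          uint32_t *insn_idx)
{
    if (byte_off % BPF_INSN_SIZE)
        return -1;
    *insn_idx = byte_off / BPF_INSN_SIZE;
    return 0;
}
```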

4.3 .BTF_ids section
====================

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in the ``include/linux/btf_ids.h`` header file. Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the set.

The ``typeX`` name can be one of the following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of the kernel build by the
``resolve_btfids`` tool.

5. Using BTF
************

5.1 bpftool map pretty print
============================

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map,::

      enum A { A1, A2, A3, A4, A5 };
      typedef enum A ___A;
      struct tmp_t {
           char a1:4;
           int  a2:4;
           int  :4;
           __u32 a3:4;
           int b;
           ___A b1:4;
           enum A b2:4;
      };
      struct bpf_map_def SEC("maps") tmpmap = {
           .type = BPF_MAP_TYPE_ARRAY,
           .key_size = sizeof(__u32),
           .value_size = sizeof(struct tmp_t),
           .max_entries = 1,
      };
      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);

bpftool is able to pretty print like below:
::

      [{
            "key": 0,
            "value": {
                "a1": 0x2,
                "a2": 0x4,
                "a3": 0x6,
                "b": 7,
                "b1": 0x8,
                "b2": 0xa
            }
        }
      ]

5.2 bpftool prog dump
=====================

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information.::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
================

The following is an example of how line_info can help with debugging a
verification failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet
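
The store is rejected because, with the bounds check commented out, nothing
proves that four bytes are available at ``data`` (the ``r=0`` in the log
appears to reflect that no packet range has been verified). The check that
makes such a store safe can be sketched in plain user-space C as below. This
is only an analogue: ``store_u32_checked`` and its return convention are
hypothetical, and an ordinary buffer stands in for the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store val at data only if [data, data_end) holds at least 4 bytes.
 * Returns 0 on success, -1 if the store would run past data_end. */
static int store_u32_checked(void *data, const void *data_end, uint32_t val)
{
    const char *start = data;
    const char *end = data_end;

    assert(start != NULL && end != NULL && start <= end);
    if ((size_t)(end - start) < sizeof(val))
        return -1;    /* the XDP program would return XDP_DROP here */

    memcpy(data, &val, sizeof(val));
    return 0;
}
```

The sketch compares the remaining length instead of computing ``data + 4``, so
that it never forms a pointer beyond the end of the buffer in user space; the
intent is the same as the commented-out check.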

6. BTF Generation
*****************

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It does not
support .BTF.ext or the BTF_KIND_FUNC type yet. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
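
When ``kind_flag`` is 1, as in the struct above, each member's 32-bit offset
field carries both the bitfield size and the bit offset. The following is a
minimal sketch of that packing, assuming the bitfield size occupies the upper
8 bits and the bit offset the lower 24 bits, as in the
``BTF_MEMBER_BITFIELD_SIZE`` and ``BTF_MEMBER_BIT_OFFSET`` macros of
``include/uapi/linux/btf.h``. The lower-case helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a bitfield size and a bit offset the way a kind_flag=1 struct
 * member is assumed to store them in its 32-bit offset field. */
static uint32_t member_offset_pack(uint32_t bitfield_size, uint32_t bit_offset)
{
    assert(bitfield_size <= 0xff && bit_offset <= 0xffffff);
    return (bitfield_size << 24) | bit_offset;
}

/* Upper 8 bits: size of the bitfield in bits (0 for a non-bitfield member). */
static uint32_t member_bitfield_size(uint32_t offset)
{
    return offset >> 24;
}

/* Lower 24 bits: offset of the member from the start of the struct, in bits. */
static uint32_t member_bit_offset(uint32_t offset)
{
    return offset & 0xffffff;
}
```

For member ``b`` above (``bitfield_size=3 bits_offset=2``), the packed value
under this assumption is ``0x03000002``.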

llvm is able to generate .BTF and .BTF.ext directly with -g, for the bpf
target only. The assembly code (-S) shows the BTF encoding in assembly
format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14
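
The numeric words in the listing can be decoded by hand. The following is a
minimal sketch, assuming the ``info`` word of a type keeps its kind in bits
24-27 and ``vlen`` in bits 0-15, and that ``line_col`` stores the line number
in the upper 22 bits and the column in the lower 10 bits. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Kind of a BTF type, assumed to live in bits 24-27 of the info word. */
static uint32_t btf_info_kind(uint32_t info)
{
    return (info >> 24) & 0x0f;
}

/* Number of trailing entries (vlen), assumed to live in bits 0-15. */
static uint32_t btf_info_vlen(uint32_t info)
{
    return info & 0xffff;
}

/* Line number: upper 22 bits of line_col. */
static uint32_t line_info_line(uint32_t line_col)
{
    return line_col >> 10;
}

/* Column number: lower 10 bits of line_col. */
static uint32_t line_info_col(uint32_t line_col)
{
    return line_col & 0x3ff;
}

/* Inverse of the two helpers above, useful for checking a listing. */
static uint32_t line_col_pack(uint32_t line, uint32_t col)
{
    assert(line <= 0x3fffff && col <= 0x3ff);
    return (line << 10) | col;
}
```

Applied to the listing, ``218103808`` (0xd000000) has kind 13 and vlen 0,
which agrees with the ``BTF_KIND_FUNC_PROTO`` comment, and ``7182`` decodes to
line 7, column 14.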

7. Testing
**********

The kernel BPF selftest ``test_btf.c`` provides an extensive set of
BTF-related tests.