==============
BPF Design Q&A
==============

BPF extensibility and applicability to networking, tracing, security
in the Linux kernel and several user space implementations of BPF
virtual machine led to a number of misunderstandings on what BPF actually is.
This short Q&A is an attempt to address that and outline a direction
of where BPF is heading long term.

.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================

Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.

Q: Is BPF a generic virtual machine?
------------------------------------
A: NO.

BPF is generic instruction set *with* C calling convention.
-----------------------------------------------------------

Q: Why C calling convention was chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as a return value.

Q: Can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).

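The five-register limit restricts the number of arguments, not the amount
of data a function can receive. A common C workaround is to pack the values
into a struct and pass a single pointer. The following is a minimal sketch
in plain user-space C, not a loadable BPF program; ``struct flow_args`` and
``flow_score()`` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical bundle of six values. Passed one by one they would need
 * six argument registers, which is one more than R1-R5 provide. */
struct flow_args {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
    uint8_t  proto;
    uint32_t mark;
};

/* A single pointer argument (it would travel in R1); the result comes
 * back in the single return register (R0). */
uint64_t flow_score(const struct flow_args *a)
{
    return (uint64_t)a->saddr + a->daddr + a->sport + a->dport +
           a->proto + a->mark;
}
```

The caller fills the struct on its own stack and passes its address, so
only one argument register is consumed regardless of the number of fields.
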
Q: Can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.

Q: Can BPF programs access stack pointer?
-----------------------------------------
A: NO.

Only the frame pointer (register R10) is accessible.
From the compiler's point of view it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does C-calling convention diminish possible use cases?
---------------------------------------------------------
A: YES.

BPF design forces the addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps, with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs, which are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.

At least for now, until BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.

Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.

BPF developers are trying to find a way to
support bounded loops.

Q: What are the verifier limits?
--------------------------------
A: The only limit known to the user space is BPF_MAXINSNS (4096).
It's the maximum number of instructions that an unprivileged bpf
program can have. The verifier has various internal limits, such as
the maximum number of instructions that can be explored during
program analysis. Currently, that limit is set to 1 million,
which essentially means that the largest program can consist
of 1 million NOP instructions. There is a limit to the maximum number
of subsequent branches, a limit to the number of nested bpf-to-bpf
calls, a limit to the number of the verifier states per instruction,
and a limit to the number of maps used by the program.
All these limits can be hit with a sufficiently complex program.
There are also non-numerical limits that can cause the program
to be rejected. The verifier used to recognize only pointer + constant
expressions. Now it can recognize pointer + bounded_register.
bpf_map_lookup_elem(key) had a requirement that 'key' must be
a pointer to the stack. Now, 'key' can be a pointer to map value.
The verifier is steadily getting 'smarter'. The limits are
being removed. The only way to know that the program is going to
be accepted by the verifier is to try to load it.
The bpf development process guarantees that the future kernel
versions will accept all bpf programs that were accepted by
the earlier versions.


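As an illustration of the pointer + bounded_register case, the sketch below
shows the kind of source-level pattern involved: a variable offset is
compared against a constant before it is added to a pointer. This is plain
user-space C under stated assumptions, not a loadable BPF program;
``VALUE_SIZE`` and ``read_at()`` are hypothetical names. Whether a given
program is accepted still depends on the kernel version and on the code
the compiler actually emits.

```c
#include <stdint.h>

#define VALUE_SIZE 64   /* hypothetical size of the buffer being indexed */

/* Return the byte at a variable offset, or -1 when the offset is out of
 * range. The explicit comparison is what gives 'off' a known upper bound
 * before it is combined with the pointer. */
int read_at(const uint8_t value[VALUE_SIZE], uint64_t off)
{
    if (off >= VALUE_SIZE)
        return -1;
    return value[off];   /* pointer + bounded offset */
}
```
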
Instruction level questions
---------------------------

Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?

A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.

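For readers unfamiliar with the term, direct packet access means loading
packet bytes through ordinary pointers after an explicit bounds check
against the end of the packet. The sketch below imitates that pattern in
plain user-space C; ``struct pkt`` and ``load_be16()`` are hypothetical
stand-ins and not the kernel's context structures or helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet context exposing start/end pointers. */
struct pkt {
    const uint8_t *data;
    const uint8_t *data_end;
};

/* Load a big-endian 16-bit field at byte offset 'off'.
 * Returns 0 on success, -1 if the field would cross data_end. */
int load_be16(const struct pkt *p, size_t off, uint16_t *out)
{
    size_t len = (size_t)(p->data_end - p->data);

    if (off > len || len - off < 2)
        return -1;
    *out = (uint16_t)((p->data[off] << 8) | p->data[off + 1]);
    return 0;
}
```

The bounds check is ordinary C, so no builtin intrinsic is needed to
express the load.
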
Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example, why are BPF_JNE and other compare-and-jump instructions
not CPU-like?

A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.

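To make the difference concrete, here is a minimal sketch of a fused
compare-and-jump step as an interpreter might execute it. The comparison
and the branch happen in one instruction, so no condition flags are
written or read. ``struct jne_insn`` and ``step_jne()`` are hypothetical,
heavily simplified names and do not reproduce the kernel's instruction
encoding.

```c
#include <stdint.h>

/* Hypothetical, simplified instruction: "if dst != src goto +off". */
struct jne_insn {
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t off;      /* relative to the following instruction */
};

/* Return the next program counter for a JNE-style instruction at 'pc'. */
int step_jne(const uint64_t regs[11], const struct jne_insn *insn, int pc)
{
    if (regs[insn->dst_reg] != regs[insn->src_reg])
        return pc + 1 + insn->off;   /* branch taken */
    return pc + 1;                   /* fall through */
}
```
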
Q: Why BPF_DIV instruction doesn't map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because a one-to-one relationship to x64 would have made
it more complicated to support on arm64 and other archs. Also, it
needs a div-by-zero runtime check.

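Native divide instructions disagree on a zero divisor (x64 ``div`` raises
an exception, arm64 ``udiv`` returns zero), which is one reason an explicit
check is needed. A minimal sketch of such a check follows.
``checked_udiv64()`` is a hypothetical name, and what the runtime does when
the check fails is deliberately left to the caller rather than assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned 64-bit divide with an explicit zero-divisor check.
 * Returns false and leaves *dst untouched when src is zero. */
bool checked_udiv64(uint64_t *dst, uint64_t src)
{
    if (src == 0)
        return false;
    *dst /= src;
    return true;
}
```
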
Q: Why there is no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. LLVM errors in such a case and
prints a suggestion to use unsigned divide instead.

Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows, and in general
there are enough subtle differences between architectures that a naive
"store the return address on the stack" approach won't work. Another
reason is that BPF has to be safe from division by zero (and the legacy
exception path of the LD_ABS insn). Those instructions need to invoke
the epilogue and return implicitly.

Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and BPF authors felt that a compiler
workaround would be acceptable. It turned out that programs lose performance
due to the lack of these compare instructions, and they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.

Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of
BPF registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?

A: NO.

But some optimizations on zero-ing the upper 32 bits for BPF registers are
available, and can be leveraged to improve the performance of JITed BPF
programs for 32-bit architectures.

Starting with version 7, LLVM is able to generate instructions that operate
on 32-bit subregisters, provided the option -mattr=+alu32 is passed for
compiling a program. Furthermore, the verifier can now mark the
instructions for which zero-ing the upper bits of the destination register
is required, and insert an explicit zero-extension (zext) instruction
(a mov32 variant). This means that for architectures without zext hardware
support, the JIT back-ends do not need to clear the upper bits for
subregisters written by alu32 instructions or narrow loads. Instead, the
back-ends simply need to support code generation for that mov32 variant,
and to override bpf_jit_needs_zext() to make it return "true" (in order to
enable zext insertion in the verifier).

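The zero-extension requirement itself can be modelled in a few lines of
user-space C. The sketch below shows what a 32-bit add must leave in a
64-bit register; ``alu32_add()`` is a hypothetical name, not a kernel or
LLVM symbol.

```c
#include <stdint.h>

/* 32-bit add on a 64-bit register: the low 32 bits wrap around and the
 * upper 32 bits of the result are zero, whatever 'dst' held before. */
uint64_t alu32_add(uint64_t dst, uint64_t src)
{
    uint32_t lo = (uint32_t)dst + (uint32_t)src;

    return (uint64_t)lo;   /* zero-extension of the 32-bit result */
}
```

On hardware where a 32-bit operation does not clear the upper half by
itself, the final conversion corresponds to the extra work that the
explicit zext instruction makes visible.
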
Note that it is possible for a JIT back-end to have partial hardware
support for zext. In that case, if verifier zext insertion is enabled,
it could lead to the insertion of unnecessary zext instructions. Such
instructions could be removed by creating a simple peephole inside the JIT
back-end: if one instruction has hardware support for zext and if the next
instruction is an explicit zext, then the latter can be skipped when doing
the code generation.

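The peephole described above could look roughly like the following sketch.
It uses a hypothetical, simplified instruction view (``struct sketch_insn``)
rather than the kernel's ``struct bpf_insn``, and it assumes that an
explicit zext always targets the destination register of the instruction
immediately before it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an instruction as seen by a JIT. */
struct sketch_insn {
    bool hw_zero_extends;    /* native code already clears the upper bits */
    bool is_explicit_zext;   /* verifier-inserted mov32 zext */
};

/* Count the instructions that would be emitted when an explicit zext
 * directly following a hardware-zero-extending instruction is skipped. */
size_t count_emitted(const struct sketch_insn *prog, size_t n)
{
    size_t emitted = 0;

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && prog[i].is_explicit_zext &&
            prog[i - 1].hw_zero_extends)
            continue;   /* redundant zext: skip code generation */
        emitted++;
    }
    return emitted;
}
```
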
Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, there is one specific exception for tracing programs
that use helpers like bpf_probe_read() to walk kernel internal
data structures and are compiled with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels
such that the program needs to be adapted accordingly.

Q: Are tracepoints part of the stable ABI?
------------------------------------------
A: NO. Tracepoints are tied to internal implementation details, hence they are
subject to change and can break with newer kernels. BPF programs need to change
accordingly when this happens.

Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used,
and both the interpreter and most JITed code consume only the
necessary amount.

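One way to notice early that local data is approaching the 512-byte limit
is a compile-time size check on the structures a program keeps on its
stack. The sketch below is plain C11 with hypothetical names
(``SKETCH_BPF_STACK_LIMIT``, ``struct scratch``). It only bounds the listed
struct; the stack use computed by the verifier also depends on what the
compiler spills.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BPF_STACK_LIMIT 512   /* the 512-byte limit stated above */

/* Hypothetical scratch data a program might keep in local variables. */
struct scratch {
    uint8_t  key[16];
    uint8_t  payload[256];
    uint64_t counters[8];
};

/* Fail the build if this struct alone would exceed the limit. */
_Static_assert(sizeof(struct scratch) <= SKETCH_BPF_STACK_LIMIT,
               "struct scratch does not fit in the BPF stack");

size_t scratch_stack_bytes(void)
{
    return sizeof(struct scratch);
}
```
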
Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.

Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.

Tracing bpf programs can *read* arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.

Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?

A: NO.

Q: Directly calling kernel function is an ABI?
----------------------------------------------
Q: Some kernel functions (e.g. tcp_slow_start) can be called
by BPF programs. Do these kernel functions become an ABI?

A: NO.

The kernel function prototypes will change, and the bpf programs will
then be rejected by the verifier. Also, for example, some of the
bpf-callable kernel functions have already been used by other kernel tcp
cc (congestion-control) implementations. If any of these kernel
functions changes, both the in-tree and out-of-tree kernel tcp cc
implementations have to be changed. The same goes for the bpf
programs, which have to be adjusted accordingly.