=====================================
LINUX KERNEL MEMORY CONSISTENCY MODEL
=====================================

============
INTRODUCTION
============

This directory contains the memory consistency model (memory model, for
short) of the Linux kernel, written in the "cat" language and executable
by the externally provided "herd7" simulator, which exhaustively explores
the state space of small litmus tests.

In addition, the "klitmus7" tool (also externally provided) may be used
to convert a litmus test to a Linux kernel module, which in turn allows
that litmus test to be exercised within the Linux kernel.


============
REQUIREMENTS
============

Version 7.52 or higher of the "herd7" and "klitmus7" tools must be
downloaded separately:

  https://github.com/herd/herdtools7

See "herdtools7/INSTALL.md" for installation instructions.

Note that although these tools usually provide backwards compatibility,
this is not absolutely guaranteed. Therefore, if a later version does
not work, please try using the exact version called out above.


==================
BASIC USAGE: HERD7
==================

The memory model is used, in conjunction with "herd7", to exhaustively
explore the state space of small litmus tests.

For example, to run SB+fencembonceonces.litmus against the memory model:

  $ herd7 -conf linux-kernel.cfg litmus-tests/SB+fencembonceonces.litmus

Here is the corresponding output:

  Test SB+fencembonceonces Allowed
  States 3
  0:r0=0; 1:r0=1;
  0:r0=1; 1:r0=0;
  0:r0=1; 1:r0=1;
  No
  Witnesses
  Positive: 0 Negative: 3
  Condition exists (0:r0=0 /\ 1:r0=0)
  Observation SB+fencembonceonces Never 0 3
  Time SB+fencembonceonces 0.01
  Hash=d66d99523e2cac6b06e66f4c995ebb48

The "Positive: 0 Negative: 3" and the "Never 0 3" each indicate that
this litmus test's "exists" clause cannot be satisfied.

See "herd7 -help" or "herdtools7/doc/" for more information.
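
As a rough illustration of what exhaustive exploration means, here is a
toy sketch in plain C. It is not herd7's algorithm and it does not
implement the memory model: it simply assumes sequential consistency,
enumerates every interleaving of a two-process store-buffering test, and
records which final (0:r0, 1:r0) pairs are reachable. All names in it
(sb_explore(), sb_enumerate(), and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Final states reached, indexed as seen[0:r0][1:r0]. */
struct sb_result {
	bool seen[2][2];
	int interleavings;
};

/*
 * Toy model, assuming sequential consistency (NOT the LKMM).
 * pc0 and pc1 count the steps each process has already executed:
 *   P0: step 0 stores 1 to x, step 1 loads y into r0
 *   P1: step 0 stores 1 to y, step 1 loads x into r1 (printed as 1:r0)
 */
void sb_explore(int pc0, int pc1, int x, int y, int r0, int r1,
		struct sb_result *res)
{
	if (pc0 == 2 && pc1 == 2) {
		/* Both processes finished: record this final state. */
		res->seen[r0][r1] = true;
		res->interleavings++;
		return;
	}
	/* Let P0 take its next step, if it has one left. */
	if (pc0 == 0)
		sb_explore(1, pc1, 1, y, r0, r1, res);
	else if (pc0 == 1)
		sb_explore(2, pc1, x, y, y, r1, res);
	/* Let P1 take its next step, if it has one left. */
	if (pc1 == 0)
		sb_explore(pc0, 1, x, 1, r0, r1, res);
	else if (pc1 == 1)
		sb_explore(pc0, 2, x, y, r0, x, res);
}

/* Explore from the initial state x = y = 0 with no steps executed. */
struct sb_result sb_enumerate(void)
{
	struct sb_result res = { { { false, false }, { false, false } }, 0 };

	sb_explore(0, 0, 0, 0, 0, 0, &res);
	return res;
}

/* Count the distinct final states that were reached. */
int sb_count_states(const struct sb_result *res)
{
	int a, b, n = 0;

	for (a = 0; a < 2; a++)
		for (b = 0; b < 2; b++)
			n += res->seen[a][b] ? 1 : 0;
	return n;
}

/* Print the reachable states in a herd7-like layout. */
void sb_print(const struct sb_result *res)
{
	int a, b;

	printf("States %d\n", sb_count_states(res));
	for (a = 0; a < 2; a++)
		for (b = 0; b < 2; b++)
			if (res->seen[a][b])
				printf("0:r0=%d; 1:r0=%d;\n", a, b);
}

#ifdef SKETCH_DEMO_MAIN	/* build with -DSKETCH_DEMO_MAIN for a demo */
int main(void)
{
	struct sb_result res = sb_enumerate();

	sb_print(&res);
	return 0;
}
#endif
```

Under this simplifying assumption the sketch visits six interleavings and
reaches the same three final states listed in the herd7 output above,
never reaching 0:r0=0 /\ 1:r0=0. That agreement is specific to this test,
in which each process places a full barrier between its store and its
load. It should not be expected for tests that rely on weaker ordering.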


=====================
BASIC USAGE: KLITMUS7
=====================

The "klitmus7" tool converts a litmus test into a Linux kernel module,
which may then be loaded and run.

For example, to run SB+fencembonceonces.litmus against hardware:

  $ mkdir mymodules
  $ klitmus7 -o mymodules litmus-tests/SB+fencembonceonces.litmus
  $ cd mymodules ; make
  $ sudo sh run.sh

The corresponding output includes:

  Test SB+fencembonceonces Allowed
  Histogram (3 states)
  644580 :>0:r0=1; 1:r0=0;
  644328 :>0:r0=0; 1:r0=1;
  711092 :>0:r0=1; 1:r0=1;
  No
  Witnesses
  Positive: 0, Negative: 2000000
  Condition exists (0:r0=0 /\ 1:r0=0) is NOT validated
  Hash=d66d99523e2cac6b06e66f4c995ebb48
  Observation SB+fencembonceonces Never 0 2000000
  Time SB+fencembonceonces 0.16

The "Positive: 0, Negative: 2000000" and the "Never 0 2000000" indicate
that during two million trials, the state specified in this litmus
test's "exists" clause was not reached.

And, as with "herd7", please see "klitmus7 -help" or "herdtools7/doc/"
for more information.


====================
DESCRIPTION OF FILES
====================

Documentation/cheatsheet.txt
        Quick-reference guide to the Linux-kernel memory model.

Documentation/explanation.txt
        Describes the memory model in detail.

Documentation/recipes.txt
        Lists common memory-ordering patterns.

Documentation/references.txt
        Provides background reading.

linux-kernel.bell
        Categorizes the relevant instructions, including memory
        references, memory barriers, atomic read-modify-write operations,
        lock acquisition/release, and RCU operations.

        More formally, this file (1) lists the subtypes of the various
        event types used by the memory model and (2) performs RCU
        read-side critical section nesting analysis.

linux-kernel.cat
        Specifies what reorderings are forbidden by memory references,
        memory barriers, atomic read-modify-write operations, and RCU.

        More formally, this file specifies what executions are forbidden
        by the memory model. Allowed executions are those that
        satisfy the model's "coherence", "atomic", "happens-before",
        "propagation", and "rcu" axioms, which are defined in the file.

linux-kernel.cfg
        Convenience file that gathers the common-case herd7 command-line
        arguments.

linux-kernel.def
        Maps from C-like syntax to herd7's internal litmus-test
        instruction-set architecture.

litmus-tests
        Directory containing a few representative litmus tests, which
        are listed in litmus-tests/README. Many more litmus tests
        are available at https://github.com/paulmckrcu/litmus.

lock.cat
        Provides a front-end analysis of lock acquisition and release,
        for example, associating a lock acquisition with the preceding
        and following releases and checking for self-deadlock.

        More formally, this file defines a performance-enhanced scheme
        for generation of the possible reads-from and coherence order
        relations on the locking primitives.

README
        This file.

scripts
        Various scripts; see scripts/README.


===========
LIMITATIONS
===========

The Linux-kernel memory model (LKMM) has the following limitations:

1.      Compiler optimizations are not accurately modeled. Of course,
        the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
        ability to optimize, but under some circumstances it is possible
        for the compiler to undermine the memory model. For more
        information, see Documentation/explanation.txt (in particular,
        the "THE PROGRAM ORDER RELATION: po AND po-loc" and "A WARNING"
        sections).

        Note that this limitation in turn limits LKMM's ability to
        accurately model address, control, and data dependencies.
        For example, if the compiler can deduce the value of some variable
        carrying a dependency, then the compiler can break that dependency
        by substituting a constant of that value.

2.      Multiple access sizes for a single variable are not supported,
        and neither are misaligned or partially overlapping accesses.

3.      Exceptions and interrupts are not modeled. In some cases,
        this limitation can be overcome by modeling the interrupt or
        exception with an additional process.

4.      I/O such as MMIO or DMA is not supported.

5.      Self-modifying code (such as that found in the kernel's
        alternatives mechanism, function tracer, Berkeley Packet Filter
        JIT compiler, and module loader) is not supported.

6.      Complete modeling of all variants of atomic read-modify-write
        operations, locking primitives, and RCU is not provided.
        For example, call_rcu() and rcu_barrier() are not supported.
        However, a substantial amount of support is provided for these
        operations, as shown in the linux-kernel.def file.

        a.      When rcu_assign_pointer() is passed NULL, the Linux
                kernel provides no ordering, but LKMM models this
                case as a store release.

        b.      The "unless" RMW operations are not currently modeled:
                atomic_long_add_unless(), atomic_add_unless(),
                atomic_inc_unless_negative(), and
                atomic_dec_unless_positive(). These can be emulated
                in litmus tests, for example, by using atomic_cmpxchg().

        c.      The call_rcu() function is not modeled. It can be
                emulated in litmus tests by adding another process that
                invokes synchronize_rcu() and the body of the callback
                function, with (for example) a release-acquire from
                the site of the emulated call_rcu() to the beginning
                of the additional process.

        d.      The rcu_barrier() function is not modeled. It can be
                emulated in litmus tests emulating call_rcu() via
                (for example) a release-acquire from the end of each
                additional call_rcu() process to the site of the
                emulated rcu_barrier().

        e.      Although sleepable RCU (SRCU) is now modeled, there
                are some subtle differences between its semantics and
                those in the Linux kernel. For example, the kernel
                might interpret the following sequence as two partially
                overlapping SRCU read-side critical sections:

                 1  r1 = srcu_read_lock(&my_srcu);
                 2  do_something_1();
                 3  r2 = srcu_read_lock(&my_srcu);
                 4  do_something_2();
                 5  srcu_read_unlock(&my_srcu, r1);
                 6  do_something_3();
                 7  srcu_read_unlock(&my_srcu, r2);

                In contrast, LKMM will interpret this as a nested pair of
                SRCU read-side critical sections, with the outer critical
                section spanning lines 1-7 and the inner critical section
                spanning lines 3-5.

                This difference would be more of a concern had anyone
                identified a reasonable use case for partially overlapping
                SRCU read-side critical sections. For more information,
                please see: https://paulmck.livejournal.com/40593.html

        f.      Reader-writer locking is not modeled. It can be
                emulated in litmus tests using atomic read-modify-write
                operations.
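
Item 6b above suggests emulating the "unless" operations with
atomic_cmpxchg(). The following userspace sketch shows the general idea,
using C11 atomics as stand-ins for the kernel primitives. It is an
illustration under stated assumptions rather than a litmus test: the
names sketch_cmpxchg() and sketch_add_unless() are hypothetical, and the
default C11 ordering used here says nothing about the ordering that the
kernel operations provide. An actual litmus test would call
atomic_cmpxchg() itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the kernel's atomic_cmpxchg(): returns the
 * value found in *v, which equals old exactly when the exchange happened.
 */
int sketch_cmpxchg(atomic_int *v, int old, int new_val)
{
	/* On failure, C11 writes the value actually found back into old. */
	atomic_compare_exchange_strong(v, &old, new_val);
	return old;
}

/* Add a to *v unless *v == u; return true if the addition was done. */
bool sketch_add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		int seen = sketch_cmpxchg(v, cur, cur + a);

		if (seen == cur)
			return true;	/* exchange succeeded */
		cur = seen;		/* value changed; retry with it */
	}
	return false;			/* forbidden value: do nothing */
}

#ifdef SKETCH_DEMO_MAIN	/* build with -DSKETCH_DEMO_MAIN for a demo */
int main(void)
{
	atomic_int v;

	atomic_init(&v, 3);
	printf("added: %d\n", sketch_add_unless(&v, 2, 0));
	printf("value: %d\n", atomic_load(&v));
	return 0;
}
#endif
```

The key point is that the comparison against the excluded value happens
on a plain read, and the compare-and-exchange then applies the update
only if the variable still holds the value that was read.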

The "herd7" tool has some additional limitations of its own, apart from
the memory model:

1.      Non-trivial data structures such as arrays or structures are
        not supported. However, pointers are supported, allowing trivial
        linked lists to be constructed.

2.      Dynamic memory allocation is not supported, although this can
        be worked around in some cases by supplying multiple statically
        allocated variables.
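
The workaround in item 2 above can be pictured as follows. In this
userspace sketch, two statically allocated variables stand in for an old
object and a freshly "allocated" one, and a pointer is switched from the
first to the second. C11 atomics are assumed here as stand-ins for the
kernel's primitives, and every name is hypothetical. A litmus test would
instead declare the variables and the initial pointer value in its
initialization section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Two statically allocated variables replace two dynamic allocations. */
int slot_old;			/* the object visible initially */
int slot_new;			/* stands in for a newly allocated object */

/* Pointer through which readers find the current object. */
int *_Atomic current_obj = &slot_old;

/* "Allocate" the second variable, initialize it, then publish it. */
void sketch_publish_new(int value)
{
	slot_new = value;
	/* Release store: initialization is ordered before publication. */
	atomic_store_explicit(&current_obj, &slot_new, memory_order_release);
}

/* Follow the pointer and read whichever object is current. */
int sketch_read_current(void)
{
	int *p = atomic_load_explicit(&current_obj, memory_order_acquire);

	return *p;
}

#ifdef SKETCH_DEMO_MAIN	/* build with -DSKETCH_DEMO_MAIN for a demo */
int main(void)
{
	printf("before: %d\n", sketch_read_current());
	sketch_publish_new(42);
	printf("after: %d\n", sketch_read_current());
	return 0;
}
#endif
```

Because both variables exist from the start, no allocator is needed: the
pointer update alone models the moment at which the new object becomes
reachable.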

Some of these limitations may be overcome in the future, but others are
more likely to be addressed by incorporating the Linux-kernel memory model
into other tools.

Finally, please note that LKMM is subject to change as hardware, use cases,
and compilers evolve.