Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame^] | 1 | Review Checklist for RCU Patches |
| 2 | |
| 3 | |
| 4 | This document contains a checklist for producing and reviewing patches |
| 5 | that make use of RCU. Violating any of the rules listed below will |
| 6 | result in the same sorts of problems that leaving out a locking primitive |
| 7 | would cause. This list is based on experiences reviewing such patches |
| 8 | over a rather long period of time, but improvements are always welcome! |
| 9 | |
| 10 | 0. Is RCU being applied to a read-mostly situation? If the data |
| 11 | structure is updated more than about 10% of the time, then |
| 12 | you should strongly consider some other approach, unless |
| 13 | detailed performance measurements show that RCU is nonetheless |
| 14 | the right tool for the job. |
| 15 | |
| 16 | The other exception would be where performance is not an issue, |
| 17 | and RCU provides a simpler implementation. An example of this |
| 18 | situation is the dynamic NMI code in the Linux 2.6 kernel, |
| 19 | at least on architectures where NMIs are rare. |
| 20 | |
| 21 | 1. Does the update code have proper mutual exclusion? |
| 22 | |
| 23 | RCU does allow -readers- to run (almost) naked, but -writers- must |
| 24 | still use some sort of mutual exclusion, such as: |
| 25 | |
| 26 | a. locking, |
| 27 | b. atomic operations, or |
| 28 | c. restricting updates to a single task. |
| 29 | |
| 30 | If you choose #b, be prepared to describe how you have handled |
| 31 | memory barriers on weakly ordered machines (pretty much all of |
| 32 | them -- even x86 allows reads to be reordered), and be prepared |
| 33 | to explain why this added complexity is worthwhile. If you |
| 34 | choose #c, be prepared to explain how this single task does not |
| 35 | become a major bottleneck on big multiprocessor machines. |
| 36 | |
| 37 | 2. Do the RCU read-side critical sections make proper use of |
| 38 | rcu_read_lock() and friends? These primitives are needed |
| 39 | to suppress preemption (or bottom halves, in the case of |
| 40 | rcu_read_lock_bh()) in the read-side critical sections, |
| 41 | and are also an excellent aid to readability. |
| 42 | |
| 43 | 3. Does the update code tolerate concurrent accesses? |
| 44 | |
| 45 | The whole point of RCU is to permit readers to run without |
| 46 | any locks or atomic operations. This means that readers will |
| 47 | be running while updates are in progress. There are a number |
| 48 | of ways to handle this concurrency, depending on the situation: |
| 49 | |
| 50 | a. Make updates appear atomic to readers. For example, |
| 51 | pointer updates to properly aligned fields will appear |
| 52 | atomic, as will individual atomic primitives. Operations |
| 53 | performed under a lock and sequences of multiple atomic |
| 54 | primitives will -not- appear to be atomic. |
| 55 | |
| 56 | This is almost always the best approach. |
| 57 | |
| 58 | b. Carefully order the updates and the reads so that |
| 59 | readers see valid data at all phases of the update. |
| 60 | This is often more difficult than it sounds, especially |
| 61 | given modern CPUs' tendency to reorder memory references. |
| 62 | One must usually liberally sprinkle memory barriers |
| 63 | (smp_wmb(), smp_rmb(), smp_mb()) through the code, |
| 64 | making it difficult to understand and to test. |
| 65 | |
| 66 | It is usually better to group the changing data into |
| 67 | a separate structure, so that the change may be made |
| 68 | to appear atomic by updating a pointer to reference |
| 69 | a new structure containing updated values. |
| 70 | |
| 71 | 4. Weakly ordered CPUs pose special challenges. Almost all CPUs |
| 72 | are weakly ordered -- even i386 CPUs allow reads to be reordered. |
| 73 | RCU code must take all of the following measures to prevent |
| 74 | memory-corruption problems: |
| 75 | |
| 76 | a. Readers must maintain proper ordering of their memory |
| 77 | accesses. The rcu_dereference() primitive ensures that |
| 78 | the CPU picks up the pointer before it picks up the data |
| 79 | that the pointer points to. This really is necessary |
| 80 | on Alpha CPUs. If you don't believe me, see: |
| 81 | |
| 82 | http://www.openvms.compaq.com/wizard/wiz_2637.html |
| 83 | |
| 84 | The rcu_dereference() primitive is also an excellent |
| 85 | documentation aid, letting the person reading the code |
| 86 | know exactly which pointers are protected by RCU. |
| 87 | |
| 88 | The rcu_dereference() primitive is used by the various |
| 89 | "_rcu()" list-traversal primitives, such as the |
| 90 | list_for_each_entry_rcu(). |
| 91 | |
| 92 | b. If the list macros are being used, the list_del_rcu(), |
| 93 | list_add_tail_rcu(), and list_del_rcu() primitives must |
| 94 | be used in order to prevent weakly ordered machines from |
| 95 | misordering structure initialization and pointer planting. |
| 96 | Similarly, if the hlist macros are being used, the |
| 97 | hlist_del_rcu() and hlist_add_head_rcu() primitives |
| 98 | are required. |
| 99 | |
| 100 | c. Updates must ensure that initialization of a given |
| 101 | structure happens before pointers to that structure are |
| 102 | publicized. Use the rcu_assign_pointer() primitive |
| 103 | when publicizing a pointer to a structure that can |
| 104 | be traversed by an RCU read-side critical section. |
| 105 | |
| 106 | [The rcu_assign_pointer() primitive is in process.] |
| 107 | |
| 108 | 5. If call_rcu(), or a related primitive such as call_rcu_bh(), |
| 109 | is used, the callback function must be written to be called |
| 110 | from softirq context. In particular, it cannot block. |
| 111 | |
| 112 | 6. Since synchronize_kernel() blocks, it cannot be called from |
| 113 | any sort of irq context. |
| 114 | |
| 115 | 7. If the updater uses call_rcu(), then the corresponding readers |
| 116 | must use rcu_read_lock() and rcu_read_unlock(). If the updater |
| 117 | uses call_rcu_bh(), then the corresponding readers must use |
| 118 | rcu_read_lock_bh() and rcu_read_unlock_bh(). Mixing things up |
| 119 | will result in confusion and broken kernels. |
| 120 | |
| 121 | One exception to this rule: rcu_read_lock() and rcu_read_unlock() |
| 122 | may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() |
| 123 | in cases where local bottom halves are already known to be |
| 124 | disabled, for example, in irq or softirq context. Commenting |
| 125 | such cases is a must, of course! And the jury is still out on |
| 126 | whether the increased speed is worth it. |
| 127 | |
| 128 | 8. Although synchronize_kernel() is a bit slower than is call_rcu(), |
| 129 | it usually results in simpler code. So, unless update performance |
| 130 | is important or the updaters cannot block, synchronize_kernel() |
| 131 | should be used in preference to call_rcu(). |
| 132 | |
| 133 | 9. All RCU list-traversal primitives, which include |
| 134 | list_for_each_rcu(), list_for_each_entry_rcu(), |
| 135 | list_for_each_continue_rcu(), and list_for_each_safe_rcu(), |
| 136 | must be within an RCU read-side critical section. RCU |
| 137 | read-side critical sections are delimited by rcu_read_lock() |
| 138 | and rcu_read_unlock(), or by similar primitives such as |
| 139 | rcu_read_lock_bh() and rcu_read_unlock_bh(). |
| 140 | |
| 141 | Use of the _rcu() list-traversal primitives outside of an |
| 142 | RCU read-side critical section causes no harm other than |
| 143 | a slight performance degradation on Alpha CPUs and some |
| 144 | confusion on the part of people trying to read the code. |
| 145 | |
| 146 | Another way of thinking of this is "If you are holding the |
| 147 | lock that prevents the data structure from changing, why do |
| 148 | you also need RCU-based protection?" That said, there may |
| 149 | well be situations where use of the _rcu() list-traversal |
| 150 | primitives while the update-side lock is held results in |
| 151 | simpler and more maintainable code. The jury is still out |
| 152 | on this question. |
| 153 | |
| 154 | 10. Conversely, if you are in an RCU read-side critical section, |
| 155 | you -must- use the "_rcu()" variants of the list macros. |
| 156 | Failing to do so will break Alpha and confuse people reading |
| 157 | your code. |