.. _NMI_rcu_doc:

Using RCU to Protect Dynamic NMI Handlers
=========================================


Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers.  This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
work in "arch/x86/kernel/traps.c".

The relevant pieces of code are listed below, each followed by a
brief explanation::

	static int dummy_nmi_callback(struct pt_regs *regs, int cpu)
	{
		return 0;
	}

The dummy_nmi_callback() function is a "dummy" NMI handler that does
nothing but return zero, thereby indicating that it did nothing and
allowing NMI processing to take the default machine-specific action::

	static nmi_callback_t nmi_callback = dummy_nmi_callback;

This nmi_callback variable is a global function pointer to the current
NMI handler::

	void do_nmi(struct pt_regs *regs, long error_code)
	{
		int cpu;

		nmi_enter();

		cpu = smp_processor_id();
		++nmi_count(cpu);

		if (!rcu_dereference_sched(nmi_callback)(regs, cpu))
			default_do_nmi(regs);

		nmi_exit();
	}

The do_nmi() function processes each NMI.  It first disables preemption
in the same way that a hardware irq would, then increments the per-CPU
count of NMIs.  It then invokes the NMI handler stored in the nmi_callback
function pointer.  If this handler returns zero, do_nmi() invokes the
default_do_nmi() function to handle a machine-specific NMI.  Finally,
preemption is restored.
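
A handler registered through this interface follows the same convention
as dummy_nmi_callback(): it returns nonzero if it handled the NMI and
zero if the default action should be taken instead.  As a purely
illustrative sketch (my_nmi_callback() and the contents of my_nmi_data
are hypothetical; my_nmi_data itself is the name used in the cleanup
example later in this document)::

	struct my_nmi_data {
		atomic_t events;
	};

	static struct my_nmi_data *my_nmi_data;

	static int my_nmi_callback(struct pt_regs *regs, int cpu)
	{
		struct my_nmi_data *data = my_nmi_data;

		if (!data)
			return 0;	/* Not ours; fall back to default_do_nmi(). */

		atomic_inc(&data->events);	/* Record the event. */
		return 1;			/* Handled; skip the default action. */
	}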

In theory, rcu_dereference_sched() is not needed here, since this code
runs only on i386, which does not reorder dependent loads and thus does
not strictly require it.  In practice, however, it is a good
documentation aid, particularly for anyone attempting to do something
similar on Alpha or on systems with aggressive optimizing compilers.

Quick Quiz:
	Why might rcu_dereference_sched() be necessary on Alpha, given
	that the code referenced by the pointer is read-only?

:ref:`Answer to Quick Quiz <answer_quick_quiz_NMI>`

Back to the discussion of NMI and RCU::

	void set_nmi_callback(nmi_callback_t callback)
	{
		rcu_assign_pointer(nmi_callback, callback);
	}

The set_nmi_callback() function registers an NMI handler.  Note that
any data that is to be used by the callback must be initialized
-before- the call to set_nmi_callback().  On architectures that do not
order writes, rcu_assign_pointer() ensures that the NMI handler sees
the initialized values::

	void unset_nmi_callback(void)
	{
		rcu_assign_pointer(nmi_callback, dummy_nmi_callback);
	}

This function unregisters an NMI handler, restoring the original
dummy_nmi_callback().  However, there may well be an NMI handler
currently executing on some other CPU.  We therefore cannot free
up any data structures used by the old NMI handler until its
execution completes on all other CPUs.

One way to accomplish this is via synchronize_rcu(), perhaps as
follows::

	unset_nmi_callback();
	synchronize_rcu();
	kfree(my_nmi_data);

This works because (as of v4.20) synchronize_rcu() blocks until all
CPUs complete any preemption-disabled segments of code that they were
executing.  Since NMI handlers disable preemption, synchronize_rcu()
is guaranteed not to return until all ongoing NMI handlers exit.  It
is therefore safe to free up the handler's data as soon as
synchronize_rcu() returns.
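
Putting the pieces together, the registration and unregistration of a
handler with associated data might look roughly as follows.  This is
only a sketch: my_nmi_callback(), the layout of my_nmi_data, and the
use of kmalloc() for the data are illustrative assumptions rather than
part of the original example::

	/* Registration: initialize the data -before- publishing the handler. */
	my_nmi_data = kmalloc(sizeof(*my_nmi_data), GFP_KERNEL);
	if (!my_nmi_data)
		return -ENOMEM;	/* Assumes an int-returning setup function. */
	atomic_set(&my_nmi_data->events, 0);
	set_nmi_callback(my_nmi_callback);

	/* NMIs arriving from this point on may invoke my_nmi_callback(). */

	/* Unregistration: wait for in-flight handlers before freeing. */
	unset_nmi_callback();
	synchronize_rcu();
	kfree(my_nmi_data);
	my_nmi_data = NULL;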

Important note: for this to work, the architecture in question must
invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.

.. _answer_quick_quiz_NMI:

Answer to Quick Quiz:
	Why might rcu_dereference_sched() be necessary on Alpha, given
	that the code referenced by the pointer is read-only?

	The caller of set_nmi_callback() might well have
	initialized some data that is to be used by the new NMI
	handler.  In this case, rcu_dereference_sched() would
	be needed, because otherwise a CPU that received an NMI
	just after the new handler was set might see the pointer
	to the new NMI handler, but the handler's data as it was
	before initialization.
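
	The problematic ordering can be pictured roughly as follows,
	reusing the hypothetical my_nmi_data and my_nmi_callback()
	from the sketches above::

		CPU 0 (registering)              CPU 1 (taking an NMI)
		-----------------------------    -----------------------------
		initialize my_nmi_data
		set_nmi_callback(my_nmi_callback)
		                                 load nmi_callback
		                                   (sees my_nmi_callback)
		                                 my_nmi_callback() loads
		                                   my_nmi_data, but without
		                                   rcu_dereference_sched() it
		                                   may see the data as it was
		                                   before initialization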

	This same sad story can happen on other CPUs when using
	a compiler with aggressive pointer-value speculation
	optimizations.

	More important, rcu_dereference_sched() makes it
	clear to someone reading the code that the pointer is
	being protected by RCU-sched.