Paul E. McKenney | 1930605 | 2005-09-06 15:16:35 -0700 | [diff] [blame] | 1 | Using RCU to Protect Dynamic NMI Handlers |
| 2 | |
| 3 | |
| 4 | Although RCU is usually used to protect read-mostly data structures, |
| 5 | it is possible to use RCU to provide dynamic non-maskable interrupt |
| 6 | handlers, as well as dynamic irq handlers. This document describes |
| 7 | how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer |
| 8 | work in "arch/i386/oprofile/nmi_timer_int.c" and in |
| 9 | "arch/i386/kernel/traps.c". |
| 10 | |
| 11 | The relevant pieces of code are listed below, each followed by a |
| 12 | brief explanation. |
| 13 | |
| 14 | static int dummy_nmi_callback(struct pt_regs *regs, int cpu) |
| 15 | { |
| 16 | return 0; |
| 17 | } |
| 18 | |
| 19 | The dummy_nmi_callback() function is a "dummy" NMI handler that does |
| 20 | nothing, but returns zero, thus saying that it did nothing, allowing |
| 21 | the NMI handler to take the default machine-specific action. |
| 22 | |
| 23 | static nmi_callback_t nmi_callback = dummy_nmi_callback; |
| 24 | |
| 25 | This nmi_callback variable is a global function pointer to the current |
| 26 | NMI handler. |
| 27 | |
Harvey Harrison | b5606c2 | 2008-02-13 15:03:16 -0800 | [diff] [blame] | 28 | void do_nmi(struct pt_regs * regs, long error_code) |
Paul E. McKenney | 1930605 | 2005-09-06 15:16:35 -0700 | [diff] [blame] | 29 | { |
| 30 | int cpu; |
| 31 | |
| 32 | nmi_enter(); |
| 33 | |
| 34 | cpu = smp_processor_id(); |
| 35 | ++nmi_count(cpu); |
| 36 | |
| 37 | if (!rcu_dereference(nmi_callback)(regs, cpu)) |
| 38 | default_do_nmi(regs); |
| 39 | |
| 40 | nmi_exit(); |
| 41 | } |
| 42 | |
| 43 | The do_nmi() function processes each NMI. It first disables preemption |
| 44 | in the same way that a hardware irq would, then increments the per-CPU |
| 45 | count of NMIs. It then invokes the NMI handler stored in the nmi_callback |
| 46 | function pointer. If this handler returns zero, do_nmi() invokes the |
| 47 | default_do_nmi() function to handle a machine-specific NMI. Finally, |
| 48 | preemption is restored. |
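
The dispatch logic just described can be modeled outside the kernel. What follows is a minimal userspace sketch, not the kernel code: every name ending in _model, plus the default_actions and claimed_nmis counters, is hypothetical, and a C11 acquire load is assumed as a rough stand-in for rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct pt_regs and nmi_callback_t. */
struct pt_regs_model { int unused; };
typedef int (*nmi_callback_model_t)(struct pt_regs_model *regs, int cpu);

static int default_actions;	/* times the default path ran */
static int claimed_nmis;	/* times a registered handler claimed the NMI */

/* Mirrors dummy_nmi_callback(): returns zero, meaning "not handled". */
static int dummy_callback_model(struct pt_regs_model *regs, int cpu)
{
	(void)regs;
	(void)cpu;
	return 0;
}

/* A sample handler that claims every NMI by returning nonzero. */
static int claiming_callback_model(struct pt_regs_model *regs, int cpu)
{
	(void)regs;
	(void)cpu;
	claimed_nmis++;
	return 1;
}

static _Atomic(nmi_callback_model_t) nmi_callback_model = dummy_callback_model;

/* Models do_nmi(): call the current handler, fall back if it returns zero. */
static void do_nmi_model(struct pt_regs_model *regs, int cpu)
{
	nmi_callback_model_t cb =
		atomic_load_explicit(&nmi_callback_model, memory_order_acquire);

	assert(cb != NULL);	/* the dummy is installed whenever nothing else is */
	if (!cb(regs, cpu))
		default_actions++;	/* stands in for default_do_nmi() */
}
```

With the dummy installed, every call to do_nmi_model() takes the default path. Once claiming_callback_model is stored into nmi_callback_model, the default path is skipped.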
| 49 | |
| 50 | Strictly speaking, rcu_dereference() is not needed, since this code runs |
| 51 | only on i386, which does not need rcu_dereference() anyway. However, |
| 52 | it is a good documentation aid, particularly for anyone attempting to |
| 53 | do something similar on Alpha. |
| 54 | |
| 55 | Quick Quiz: Why might the rcu_dereference() be necessary on Alpha, |
| 56 | given that the code referenced by the pointer is read-only? |
| 57 | |
| 58 | |
| 59 | Back to the discussion of NMI and RCU... |
| 60 | |
| 61 | void set_nmi_callback(nmi_callback_t callback) |
| 62 | { |
| 63 | rcu_assign_pointer(nmi_callback, callback); |
| 64 | } |
| 65 | |
| 66 | The set_nmi_callback() function registers an NMI handler. Note that any |
| 67 | data that is to be used by the callback must be initialized -before- |
| 68 | the call to set_nmi_callback(). On architectures that do not order |
| 69 | writes, the rcu_assign_pointer() ensures that the NMI handler sees the |
| 70 | initialized values. |
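
The initialize-then-publish rule can be illustrated with a small userspace sketch. This is an assumption-laden model, not kernel code: struct my_nmi_data_model, my_data, and all _model functions are hypothetical, a C11 release store stands in for rcu_assign_pointer(), and an acquire load stands in for rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*nmi_callback_model_t)(int cpu);

/* Hypothetical data used by the handler; not taken from the kernel source. */
struct my_nmi_data_model {
	int max_cpu;	/* handler claims NMIs only on CPUs up to this one */
	int hits;	/* number of NMIs the handler has claimed */
};

static struct my_nmi_data_model my_data;

static int dummy_callback_model(int cpu)
{
	(void)cpu;
	return 0;
}

static _Atomic(nmi_callback_model_t) nmi_callback_model = dummy_callback_model;

/* Handler that depends on my_data having been initialized beforehand. */
static int my_callback_model(int cpu)
{
	if (cpu > my_data.max_cpu)
		return 0;
	my_data.hits++;
	return 1;
}

/* Release store: earlier writes become visible before the new pointer. */
static void set_nmi_callback_model(nmi_callback_model_t callback)
{
	assert(callback != NULL);
	atomic_store_explicit(&nmi_callback_model, callback,
			      memory_order_release);
}

/* Step 1: initialize the data.  Step 2: publish the handler. */
static void register_my_handler_model(int max_cpu)
{
	my_data.max_cpu = max_cpu;
	my_data.hits = 0;
	set_nmi_callback_model(my_callback_model);
}

/* Acquire load pairs with the release store in the setter. */
static int deliver_nmi_model(int cpu)
{
	nmi_callback_model_t cb =
		atomic_load_explicit(&nmi_callback_model, memory_order_acquire);

	return cb(cpu);
}
```

Reversing the two steps in register_my_handler_model() would let a concurrent deliver_nmi_model() run my_callback_model() against uninitialized fields, which is the hazard the paragraph above warns about.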
| 71 | |
| 72 | void unset_nmi_callback(void) |
| 73 | { |
| 74 | rcu_assign_pointer(nmi_callback, dummy_nmi_callback); |
| 75 | } |
| 76 | |
| 77 | This function unregisters an NMI handler, restoring the original |
| 78 | dummy_nmi_callback(). However, there may well be an NMI handler |
| 79 | currently executing on some other CPU. We therefore cannot free |
| 80 | up any data structures used by the old NMI handler until execution |
| 81 | of it completes on all other CPUs. |
| 82 | |
| 83 | One way to accomplish this is via synchronize_sched(), perhaps as |
| 84 | follows: |
| 85 | |
| 86 | unset_nmi_callback(); |
| 87 | synchronize_sched(); |
| 88 | kfree(my_nmi_data); |
| 89 | |
| 90 | This works because synchronize_sched() blocks until all CPUs complete |
| 91 | any preemption-disabled segments of code that they were executing. |
| 92 | Since NMI handlers disable preemption, synchronize_sched() is guaranteed |
| 93 | not to return until all ongoing NMI handlers exit. It is therefore safe |
| 94 | to free up the handler's data as soon as synchronize_sched() returns. |
| 95 | |
Paul E. McKenney | 3230075 | 2008-05-12 21:21:05 +0200 | [diff] [blame^] | 96 | Important note: for this to work, the architecture in question must |
| 97 | invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively. |
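
The unregister, wait, then free sequence can be sketched in userspace as follows. This is only an illustration under loud assumptions: a simple count of in-flight handlers stands in for the grace-period wait, which is not how synchronize_sched() is implemented, and all _model names and handlers_in_flight are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef int (*nmi_callback_model_t)(int cpu);

struct my_nmi_data_model { int hits; };

static struct my_nmi_data_model *my_nmi_data;	/* heap data used by the handler */
static atomic_int handlers_in_flight;		/* crude reader count (assumption) */

static int dummy_callback_model(int cpu)
{
	(void)cpu;
	return 0;
}

static int my_callback_model(int cpu)
{
	(void)cpu;
	assert(my_nmi_data != NULL);	/* must not run after the data is freed */
	my_nmi_data->hits++;
	return 1;
}

static _Atomic(nmi_callback_model_t) nmi_callback_model = dummy_callback_model;

static int do_nmi_model(int cpu)
{
	nmi_callback_model_t cb;
	int handled;

	atomic_fetch_add(&handlers_in_flight, 1);	/* entry marker */
	cb = atomic_load(&nmi_callback_model);
	handled = cb(cpu);
	atomic_fetch_sub(&handlers_in_flight, 1);	/* exit marker */
	return handled;
}

/* Stand-in for synchronize_sched(): spin until no handler is running. */
static void wait_for_handlers_model(void)
{
	while (atomic_load(&handlers_in_flight) != 0)
		;
}

static int register_my_handler_model(void)
{
	my_nmi_data = calloc(1, sizeof(*my_nmi_data));
	if (!my_nmi_data)
		return -1;
	atomic_store(&nmi_callback_model, my_callback_model);
	return 0;
}

static void unregister_my_handler_model(void)
{
	atomic_store(&nmi_callback_model, dummy_callback_model);	/* unset */
	wait_for_handlers_model();					/* wait  */
	free(my_nmi_data);						/* free  */
	my_nmi_data = NULL;
}
```

The order inside unregister_my_handler_model() is the point of the sketch: the pointer is swapped first so that no new invocation can pick up the old handler, and the data is freed only after every invocation that might still be using it has finished.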
| 98 | |
Paul E. McKenney | 1930605 | 2005-09-06 15:16:35 -0700 | [diff] [blame] | 99 | |
| 100 | Answer to Quick Quiz |
| 101 | |
| 102 | Why might the rcu_dereference() be necessary on Alpha, given |
| 103 | that the code referenced by the pointer is read-only? |
| 104 | |
| 105 | Answer: The caller of set_nmi_callback() might well have |
| 106 | initialized some data that is to be used by the |
| 107 | new NMI handler. In this case, the rcu_dereference() |
| 108 | would be needed, because otherwise a CPU that received |
| 109 | an NMI just after the new handler was set might see |
| 110 | the pointer to the new NMI handler, but the old |
| 111 | not-yet-initialized version of the handler's data. |
| 112 | |
| 113 | More important, the rcu_dereference() makes it clear |
| 114 | to someone reading the code that the pointer is being |
| 115 | protected by RCU. |