.. SPDX-License-Identifier: GPL-2.0

.. include:: <isonum.txt>

===============================
Bus lock detection and handling
===============================

:Copyright: |copy| 2021 Intel Corporation
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Tony Luck <tony.luck@intel.com>

Problem
=======

A split lock is any atomic operation whose operand crosses two cache lines.
Since the operand spans two cache lines and the operation must be atomic,
the system locks the bus while the CPU accesses the two cache lines.

A bus lock is acquired through either a split locked access to writeback (WB)
memory or any locked access to non-WB memory. This is typically thousands of
cycles slower than an atomic operation within a cache line. It also blocks
other cores from accessing memory while the bus is locked, which degrades
performance across the whole system.
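
The cache line crossing condition can be sketched in C. This is only an
illustration: the 64-byte line size and the helper name
``crosses_cache_line()`` are assumptions made for the example, not something
defined by the kernel or by this document.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed cache line size in bytes; real hardware may differ. */
    #define CACHE_LINE_SIZE 64u

    /*
     * Return true if an operand of `size` bytes starting at `addr` spans
     * two cache lines, i.e. a locked access to it would be a split lock.
     */
    static bool crosses_cache_line(uintptr_t addr, size_t size)
    {
        uintptr_t first_line, last_line;

        if (size == 0)
            return false;

        /* Index of the cache line holding the first and the last byte. */
        first_line = addr / CACHE_LINE_SIZE;
        last_line = (addr + size - 1) / CACHE_LINE_SIZE;

        return first_line != last_line;
    }

Under these assumptions, an 8-byte operand at offset 60 covers bytes 60..67
and crosses from the first line into the second, while the same operand at
offset 56 stays within one line.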

Detection
=========

Intel processors may support either or both of the following hardware
mechanisms to detect split locks and bus locks.

#AC exception for split lock detection
--------------------------------------

Beginning with the Tremont Atom CPU, an Alignment Check (#AC) exception
may be raised when a split lock operation is attempted.

#DB exception for bus lock detection
------------------------------------

Some CPUs can notify the kernel with a #DB trap after a user instruction
that acquires a bus lock has executed. This allows the kernel to
terminate the application or to enforce throttling.

Software handling
=================

The kernel #AC and #DB handlers handle bus locks based on the kernel
parameter "split_lock_detect". Here is a summary of the different options:

+------------------+----------------------------+-----------------------+
|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
+------------------+----------------------------+-----------------------+
|off               |Do nothing                  |Do nothing             |
+------------------+----------------------------+-----------------------+
|warn              |Kernel OOPs                 |Warn once per task and |
|(default)         |Warn once per task and      |continue to run.       |
|                  |disable future checking     |                       |
|                  |When both features are      |                       |
|                  |supported, warn in #AC      |                       |
+------------------+----------------------------+-----------------------+
|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
|                  |Send SIGBUS to user         |                       |
|                  |When both features are      |                       |
|                  |supported, fatal in #AC     |                       |
+------------------+----------------------------+-----------------------+
|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
|(0 < N <= 1000)   |                            |N bus locks per second |
|                  |                            |system wide and warn on|
|                  |                            |bus locks.             |
+------------------+----------------------------+-----------------------+
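
The accepted option values can be illustrated with a small user-space
parser. This is a hedged sketch only: ``enum sld_mode`` and ``parse_sld()``
are hypothetical names invented for this example and do not reflect how the
kernel itself parses the parameter.

.. code-block:: c

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical modes mirroring the rows of the table above. */
    enum sld_mode { SLD_OFF, SLD_WARN, SLD_FATAL, SLD_RATELIMIT, SLD_INVALID };

    /*
     * Parse the value given to split_lock_detect=. For "ratelimit:N" the
     * limit N is stored in *rate; N must satisfy 0 < N <= 1000.
     */
    static enum sld_mode parse_sld(const char *arg, long *rate)
    {
        char *end;
        long n;

        if (strcmp(arg, "off") == 0)
            return SLD_OFF;
        if (strcmp(arg, "warn") == 0)
            return SLD_WARN;
        if (strcmp(arg, "fatal") == 0)
            return SLD_FATAL;
        if (strncmp(arg, "ratelimit:", 10) != 0)
            return SLD_INVALID;

        arg += 10;
        /* Require a leading digit: rejects empty input, signs, spaces. */
        if (*arg < '0' || *arg > '9')
            return SLD_INVALID;

        n = strtol(arg, &end, 10);
        if (*end != '\0' || n <= 0 || n > 1000)
            return SLD_INVALID;

        *rate = n;
        return SLD_RATELIMIT;
    }

With this sketch, ``ratelimit:1000`` is accepted, while ``ratelimit:0`` and
``ratelimit:1001`` are rejected because they fall outside 0 < N <= 1000.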

Usages
======

Detecting and handling bus locks is useful in several areas:

It is critical for real time system designers who build consolidated real
time systems. These systems run hard real time code on some cores and run
"untrusted" user processes on other cores. The hard real time code cannot
afford to have any bus lock from the untrusted processes hurt real time
performance. To date the designers have been unable to deploy these
solutions as they have no way to prevent the "untrusted" user code from
generating split locks and bus locks that block the hard real time code
from accessing memory during bus locking.

It's also useful for general computing to prevent guests or user
applications from slowing down the overall system by executing instructions
with bus lock.

Guidance
========

off
---

Disable checking for split lock and bus lock. This option can be useful if
there are legacy applications that trigger these events at a low rate so
that mitigation is not needed.

warn
----

A warning is emitted when a bus lock is detected, which makes it possible
to identify the offending application. This is the default behavior.

fatal
-----

In this case, the bus lock is not tolerated and the process is killed.

ratelimit
---------

A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
allows a bus lock rate up to N bus locks per second. When the bus lock rate
is exceeded, any task which is caught via the bus lock #DB exception is
throttled by enforced sleeps until the rate goes under the limit again.

This is an effective mitigation in cases where a minimal impact can be
tolerated, but an eventual Denial of Service attack has to be prevented. It
makes it possible to identify the offending processes and analyze whether
they are malicious or just badly written.

Selecting a rate limit of 1000 allows the bus to be locked for up to about
seven million cycles each second (assuming 7000 cycles for each bus
lock). On a 2 GHz processor that would be about 0.35% system slowdown.
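
The estimate above is simply N times the cycles per bus lock, divided by the
clock rate. A minimal C sketch of that arithmetic follows; the function name
``bus_lock_slowdown()`` is an assumption for illustration, and the 7000
cycle cost is the same rough figure assumed in the text.

.. code-block:: c

    /*
     * Worst-case fraction of each second during which the bus is locked:
     * (bus locks per second * cycles per bus lock) / CPU cycles per second.
     */
    static double bus_lock_slowdown(double locks_per_sec,
                                    double cycles_per_lock,
                                    double cpu_hz)
    {
        return locks_per_sec * cycles_per_lock / cpu_hz;
    }

For N = 1000, 7000 cycles per lock and a 2 GHz clock this gives
7,000,000 / 2,000,000,000 = 0.0035, i.e. the 0.35% quoted above. Lowering N
scales the bound linearly under the same assumptions.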