blob: fed13eaead8911614a595d0eae318fce2378f320 [file] [log] [blame]
Changbin Dued657c12018-02-17 13:39:47 +08001===================================
2In-kernel memory-mapped I/O tracing
3===================================
Pekka Paalanenc6c67c12008-05-12 21:20:59 +02004
5
6Home page and links to optional user space tools:
7
Alexander A. Klimov93431e02020-05-26 08:05:44 +02008 https://nouveau.freedesktop.org/wiki/MmioTrace
Pekka Paalanenc6c67c12008-05-12 21:20:59 +02009
10MMIO tracing was originally developed by Intel around 2003 for their Fault
11Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
12Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
13project in mind. Since then many people have contributed.
14
15Mmiotrace was built for reverse engineering any memory-mapped IO device with
16the Nouveau project as the first real user. Only x86 and x86_64 architectures
17are supported.
18
19Out-of-tree mmiotrace was originally modified for mainline inclusion and
20ftrace framework by Pekka Paalanen <pq@iki.fi>.
21
22
23Preparation
24-----------
25
26Mmiotrace feature is compiled in by the CONFIG_MMIOTRACE option. Tracing is
27disabled by default, so it is safe to have this set to yes. SMP systems are
28supported, but tracing is unreliable and may miss events if more than one CPU
29is on-line, therefore mmiotrace takes all but one CPU off-line during run-time
Pekka Paalanen6f6f3942008-05-12 21:21:03 +020030activation. You can re-enable CPUs by hand, but you have been warned, there
31is no way to automatically detect if you are losing events due to CPUs racing.
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020032
33
34Usage Quick Reference
35---------------------
Changbin Dued657c12018-02-17 13:39:47 +080036::
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020037
Changbin Dued657c12018-02-17 13:39:47 +080038 $ mount -t debugfs debugfs /sys/kernel/debug
39 $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
40 $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
41 Start X or whatever.
42 $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
43 $ echo nop > /sys/kernel/debug/tracing/current_tracer
44 Check for lost events.
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020045
46
47Usage
48-----
49
Randy Dunlap360b6e52009-12-18 15:16:56 -080050Make sure debugfs is mounted to /sys/kernel/debug.
Changbin Dued657c12018-02-17 13:39:47 +080051If not (requires root privileges)::
52
53 $ mount -t debugfs debugfs /sys/kernel/debug
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020054
55Check that the driver you are about to trace is not loaded.
56
Changbin Dued657c12018-02-17 13:39:47 +080057Activate mmiotrace (requires root privileges)::
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020058
Changbin Dued657c12018-02-17 13:39:47 +080059 $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
60
61Start storing the trace::
62
63 $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
64
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020065The 'cat' process should stay running (sleeping) in the background.
66
67Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
68accesses to areas that are ioremapped while mmiotrace is active.
69
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020070During tracing you can place comments (markers) into the trace by
GeunSik Lim156f5a72009-06-02 15:01:37 +090071$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020072This makes it easier to see which part of the (huge) trace corresponds to
73which action. It is recommended to place descriptive markers about what you
74do.
75
Changbin Dued657c12018-02-17 13:39:47 +080076Shut down mmiotrace (requires root privileges)::
77
78 $ echo nop > /sys/kernel/debug/tracing/current_tracer
79
Pekka Paalanen6f6f3942008-05-12 21:21:03 +020080The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
81pressing ctrl+c.
Pekka Paalanenc6c67c12008-05-12 21:20:59 +020082
Changbin Dued657c12018-02-17 13:39:47 +080083Check that mmiotrace did not lose events due to a buffer filling up. Either::
84
85 $ grep -i lost mydump.txt
86
87which tells you exactly how many events were lost, or use::
88
89 $ dmesg
90
Pekka Paalanen6f6f3942008-05-12 21:21:03 +020091to view your kernel log and look for "mmiotrace has lost events" warning. If
92events were lost, the trace is incomplete. You should enlarge the buffers and
93try again. Buffers are enlarged by first seeing how large the current buffers
Changbin Dued657c12018-02-17 13:39:47 +080094are::
95
96 $ cat /sys/kernel/debug/tracing/buffer_size_kb
97
Pekka Paalanen6f6f3942008-05-12 21:21:03 +020098gives you a number. Approximately double this number and write it back, for
Changbin Dued657c12018-02-17 13:39:47 +080099instance::
100
101 $ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb
102
Pekka Paalanen6f6f3942008-05-12 21:21:03 +0200103Then start again from the top.
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200104
105If you are doing a trace for a driver project, e.g. Nouveau, you should also
Changbin Dued657c12018-02-17 13:39:47 +0800106do the following before sending your results::
107
108 $ lspci -vvv > lspci.txt
109 $ dmesg > dmesg.txt
110 $ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
111
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200112and then send the .tar.gz file. The trace compresses considerably. Replace
113"pciid" and "nick" with the PCI ID or model name of your piece of hardware
Randy Dunlap360b6e52009-12-18 15:16:56 -0800114under investigation and your nickname.
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200115
116
117How Mmiotrace Works
118-------------------
119
120Access to hardware IO-memory is gained by mapping addresses from PCI bus by
121calling one of the ioremap_*() functions. Mmiotrace is hooked into the
122__ioremap() function and gets called whenever a mapping is created. Mapping is
Randy Dunlap360b6e52009-12-18 15:16:56 -0800123an event that is recorded into the trace log. Note that ISA range mappings
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200124are not caught, since the mapping always exists and is returned directly.
125
126MMIO accesses are recorded via page faults. Just before __ioremap() returns,
127the mapped pages are marked as not present. Any access to the pages causes a
128fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
129marks the page present, sets TF flag to achieve single stepping and exits the
130fault handler. The instruction that faulted is executed and debug trap is
131entered. Here mmiotrace again marks the page as not present. The instruction
132is decoded to get the type of operation (read/write), data width and the value
133read or written. These are stored to the trace log.
134
135Setting the page present in the page fault handler has a race condition on SMP
136machines. During the single stepping other CPUs may run freely on that page
137and events can be missed without a notice. Re-enabling other CPUs during
138tracing is discouraged.
139
140
141Trace Log Format
142----------------
143
144The raw log is text and easily filtered with e.g. grep and awk. One record is
Randy Dunlap360b6e52009-12-18 15:16:56 -0800145one line in the log. A record starts with a keyword, followed by keyword-
146dependent arguments. Arguments are separated by a space, or continue until the
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200147end of line. The format for version 20070824 is as follows:
148
Randy Dunlap360b6e52009-12-18 15:16:56 -0800149Explanation Keyword Space-separated arguments
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200150---------------------------------------------------------------------------
151
152read event R width, timestamp, map id, physical, value, PC, PID
153write event W width, timestamp, map id, physical, value, PC, PID
154ioremap event MAP timestamp, map id, physical, virtual, length, PC, PID
155iounmap event UNMAP timestamp, map id, PC, PID
156marker MARK timestamp, text
157version VERSION the string "20070824"
158info for reader LSPCI one line from lspci -v
Randy Dunlap360b6e52009-12-18 15:16:56 -0800159PCI address map PCIDEV space-separated /proc/bus/pci/devices data
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200160unk. opcode UNKNOWN timestamp, map id, physical, data, PC, PID
161
162Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
163is a kernel virtual address. Width is the data width in bytes and value is the
164data value. Map id is an arbitrary id number identifying the mapping that was
165used in an operation. PC is the program counter and PID is process id. PC is
166zero if it is not recorded. PID is always zero as tracing MMIO accesses
167originating in user space memory is not yet supported.
168
169For instance, the following awk filter will pass all 32-bit writes that target
Changbin Dued657c12018-02-17 13:39:47 +0800170physical addresses in the range [0xfb73ce40, 0xfb800000]
171::
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200172
Changbin Dued657c12018-02-17 13:39:47 +0800173 $ awk '/W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
174 adr < 0xfb800000) print; }'
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200175
176
177Tools for Developers
178--------------------
179
180The user space tools include utilities for:
Changbin Dued657c12018-02-17 13:39:47 +0800181 - replacing numeric addresses and values with hardware register names
182 - replaying MMIO logs, i.e., re-executing the recorded writes
Pekka Paalanenc6c67c12008-05-12 21:20:59 +0200183
184