Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 1 | perf-trace(1) |
| 2 | ============= |
| 3 | |
| 4 | NAME |
| 5 | ---- |
| 6 | perf-trace - strace inspired tool |
| 7 | |
| 8 | SYNOPSIS |
| 9 | -------- |
| 10 | [verse] |
| 11 | 'perf trace' |
David Ahern | 5e2485b | 2013-09-28 13:13:01 -0600 | [diff] [blame] | 12 | 'perf trace record' |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 13 | |
| 14 | DESCRIPTION |
| 15 | ----------- |
| 16 | This command will show the events associated with the target, initially |
| 17 | syscalls, but other system events like pagefaults, task lifetime events, |
| 18 | scheduling events, etc. |
| 19 | |
David Ahern | 5e2485b | 2013-09-28 13:13:01 -0600 | [diff] [blame] | 20 | This is a live mode tool in addition to working with perf.data files like |
| 21 | the other perf tools. Files can be generated using the 'perf record' command |
| 22 | but the session needs to include the raw_syscalls events (-e 'raw_syscalls:*'). |
Masanari Iida | 96355f2 | 2014-09-10 00:18:50 +0900 | [diff] [blame] | 23 | Alternatively, 'perf trace record' can be used as a shortcut to |
David Ahern | 5e2485b | 2013-09-28 13:13:01 -0600 | [diff] [blame] | 24 | automatically include the raw_syscalls events when writing events to a file. |
| 25 | |
| 26 | The following options apply to perf trace; options to perf trace record are |
| 27 | found in the perf record man page. |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 28 | |
| 29 | OPTIONS |
| 30 | ------- |
| 31 | |
David Ahern | ac9be8e | 2013-08-20 11:15:45 -0600 | [diff] [blame] | 32 | -a:: |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 33 | --all-cpus:: |
| 34 | System-wide collection from all CPUs. |
| 35 | |
Arnaldo Carvalho de Melo | 2ae3a31 | 2013-08-09 12:28:31 -0300 | [diff] [blame] | 36 | -e:: |
| 37 | --expr:: |
Arnaldo Carvalho de Melo | 017037f | 2017-01-09 17:26:26 -0300 | [diff] [blame] | 38 | --event:: |
| 39 | List of syscalls and other perf events (tracepoints, HW cache events, |
Arnaldo Carvalho de Melo | 27702bc | 2017-08-31 11:50:04 -0300 | [diff] [blame] | 40 | etc) to show. Globbing is supported, e.g.: "epoll_*", "*msg*", etc. |
Arnaldo Carvalho de Melo | 017037f | 2017-01-09 17:26:26 -0300 | [diff] [blame] | 41 | See 'perf list' for a complete list of events. |
Arnaldo Carvalho de Melo | b059efd | 2013-08-21 12:56:21 -0300 | [diff] [blame] | 42 | Prefixing with ! shows all syscalls but the ones specified. You may |
| 43 | need to escape it. |
Arnaldo Carvalho de Melo | 2ae3a31 | 2013-08-09 12:28:31 -0300 | [diff] [blame] | 44 | |
Alexis Berlemont | e36b782 | 2016-10-10 07:43:28 +0200 | [diff] [blame] | 45 | -D msecs:: |
| 46 | --delay msecs:: |
| 47 | After starting the program, wait msecs before measuring. This is useful to |
| 48 | filter out the startup phase of the program, which is often very different. |
| 49 | |
Arnaldo Carvalho de Melo | c24ff99 | 2013-08-19 12:01:10 -0300 | [diff] [blame] | 50 | -o:: |
| 51 | --output=:: |
| 52 | Output file name. |
| 53 | |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 54 | -p:: |
| 55 | --pid=:: |
| 56 | Record events on existing process ID (comma separated list). |
| 57 | |
David Ahern | ac9be8e | 2013-08-20 11:15:45 -0600 | [diff] [blame] | 58 | -t:: |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 59 | --tid=:: |
| 60 | Record events on existing thread ID (comma separated list). |
| 61 | |
David Ahern | ac9be8e | 2013-08-20 11:15:45 -0600 | [diff] [blame] | 62 | -u:: |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 63 | --uid=:: |
| 64 | Record events in threads owned by uid. Name or number. |
| 65 | |
Arnaldo Carvalho de Melo | 9ea42ba | 2018-03-06 16:30:51 -0300 | [diff] [blame] | 66 | -G:: |
| 67 | --cgroup:: |
| 68 | Record events in threads in a cgroup. |
| 69 | |
| 70 | Look for cgroups to set at the /sys/fs/cgroup/perf_event directory, then |
| 71 | remove the /sys/fs/cgroup/perf_event/ part and try: |
| 72 | |
| 73 | perf trace -G A -e sched:*switch |
| 74 | |
| 75 | Will set all raw_syscalls:sys_{enter,exit}, pgfault, vfs_getname, etc |
| 76 | _and_ sched:sched_switch to the 'A' cgroup, while: |
| 77 | |
| 78 | perf trace -e sched:*switch -G A |
| 79 | |
| 80 | will only set the sched:sched_switch event to the 'A' cgroup, all the |
| 81 | other events (raw_syscalls:sys_{enter,exit}, etc are left "without" |
| 82 | a cgroup (on the root cgroup, sys wide, etc). |
| 83 | |
| 84 | Multiple cgroups: |
| 85 | |
| 86 | perf trace -G A -e sched:*switch -G B |
| 87 | |
| 88 | the syscall ones go to the 'A' cgroup, the sched:sched_switch goes |
| 89 | to the 'B' cgroup. |
| 90 | |
Arnaldo Carvalho de Melo | f078c38 | 2015-02-21 11:36:52 -0800 | [diff] [blame] | 91 | --filter-pids=:: |
| 92 | Filter out events for these pids and for 'trace' itself (comma separated list). |
| 93 | |
Arnaldo Carvalho de Melo | 7c304ee0 | 2013-08-22 16:49:54 -0300 | [diff] [blame] | 94 | -v:: |
| 95 | --verbose=:: |
| 96 | Verbosity level. |
| 97 | |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 98 | --no-inherit:: |
| 99 | Child tasks do not inherit counters. |
| 100 | |
David Ahern | ac9be8e | 2013-08-20 11:15:45 -0600 | [diff] [blame] | 101 | -m:: |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 102 | --mmap-pages=:: |
Jiri Olsa | 27050f5 | 2013-09-01 12:36:13 +0200 | [diff] [blame] | 103 | Number of mmap data pages (must be a power of two) or size |
| 104 | specification with appended unit character - B/K/M/G. The |
| 105 | size is rounded up to have nearest pages power of two value. |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 106 | |
David Ahern | ac9be8e | 2013-08-20 11:15:45 -0600 | [diff] [blame] | 107 | -C:: |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 108 | --cpu:: |
| 109 | Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a |
| 110 | comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. |
| 111 | In per-thread mode with inheritance mode on (default), Events are captured only when |
| 112 | the thread executes on the designated CPUs. Default is to monitor all CPUs. |
| 113 | |
Sihyeon Jang | 9b9d28a | 2017-11-16 23:26:03 +0900 | [diff] [blame] | 114 | --duration:: |
Arnaldo Carvalho de Melo | ae9ed03 | 2012-10-08 09:56:00 -0300 | [diff] [blame] | 115 | Show only events that had a duration greater than N.M ms. |
| 116 | |
Sihyeon Jang | 9b9d28a | 2017-11-16 23:26:03 +0900 | [diff] [blame] | 117 | --sched:: |
Arnaldo Carvalho de Melo | 1302d88 | 2012-10-17 17:13:12 -0300 | [diff] [blame] | 118 | Accrue thread runtime and provide a summary at the end of the session. |
| 119 | |
Arnaldo Carvalho de Melo | 0a6545b | 2018-03-29 12:22:59 -0300 | [diff] [blame] | 120 | --failure:: |
| 121 | Show only syscalls that failed, i.e. that returned < 0. |
| 122 | |
Sihyeon Jang | 9b9d28a | 2017-11-16 23:26:03 +0900 | [diff] [blame] | 123 | -i:: |
| 124 | --input:: |
David Ahern | 6810fc9 | 2013-08-28 22:29:52 -0600 | [diff] [blame] | 125 | Process events from a given perf data file. |
| 126 | |
Sihyeon Jang | 9b9d28a | 2017-11-16 23:26:03 +0900 | [diff] [blame] | 127 | -T:: |
| 128 | --time:: |
David Ahern | 4bb0919 | 2013-09-04 12:37:43 -0600 | [diff] [blame] | 129 | Print full timestamp rather time relative to first sample. |
| 130 | |
Arnaldo Carvalho de Melo | 50c95cb | 2013-09-12 12:35:21 -0300 | [diff] [blame] | 131 | --comm:: |
| 132 | Show process COMM right beside its ID, on by default, disable with --no-comm. |
| 133 | |
David Ahern | fd2eaba | 2013-11-12 09:31:15 -0700 | [diff] [blame] | 134 | -s:: |
David Ahern | bf2575c | 2013-10-08 21:26:53 -0600 | [diff] [blame] | 135 | --summary:: |
David Ahern | fd2eaba | 2013-11-12 09:31:15 -0700 | [diff] [blame] | 136 | Show only a summary of syscalls by thread with min, max, and average times |
| 137 | (in msec) and relative stddev. |
| 138 | |
| 139 | -S:: |
| 140 | --with-summary:: |
| 141 | Show all syscalls followed by a summary by thread with min, max, and |
| 142 | average times (in msec) and relative stddev. |
David Ahern | bf2575c | 2013-10-08 21:26:53 -0600 | [diff] [blame] | 143 | |
Arnaldo Carvalho de Melo | c522739 | 2013-09-27 18:06:19 -0300 | [diff] [blame] | 144 | --tool_stats:: |
| 145 | Show tool stats such as number of times fd->pathname was discovered thru |
| 146 | hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc. |
| 147 | |
Sihyeon Jang | 9b9d28a | 2017-11-16 23:26:03 +0900 | [diff] [blame] | 148 | -f:: |
| 149 | --force:: |
| 150 | Don't complain, do it. |
| 151 | |
Stanislav Fomichev | 598d02c | 2014-06-26 20:14:25 +0400 | [diff] [blame] | 152 | -F=[all|min|maj]:: |
| 153 | --pf=[all|min|maj]:: |
| 154 | Trace pagefaults. Optionally, you can specify whether you want minor, |
| 155 | major or all pagefaults. Default value is maj. |
| 156 | |
Stanislav Fomichev | e281a96 | 2014-06-26 20:14:28 +0400 | [diff] [blame] | 157 | --syscalls:: |
Ravi Bangoria | 739cf30 | 2017-04-13 11:32:12 +0530 | [diff] [blame] | 158 | Trace system calls. This options is enabled by default, disable with |
| 159 | --no-syscalls. |
Stanislav Fomichev | e281a96 | 2014-06-26 20:14:28 +0400 | [diff] [blame] | 160 | |
Milian Wolff | 566a088 | 2016-04-08 13:34:15 +0200 | [diff] [blame] | 161 | --call-graph [mode,type,min[,limit],order[,key][,branch]]:: |
| 162 | Setup and enable call-graph (stack chain/backtrace) recording. |
| 163 | See `--call-graph` section in perf-record and perf-report |
| 164 | man pages for details. The ones that are most useful in 'perf trace' |
| 165 | are 'dwarf' and 'lbr', where available, try: 'perf trace --call-graph dwarf'. |
| 166 | |
Arnaldo Carvalho de Melo | f3e459d | 2016-04-15 17:52:34 -0300 | [diff] [blame] | 167 | Using this will, for the root user, bump the value of --mmap-pages to 4 |
| 168 | times the maximum for non-root users, based on the kernel.perf_event_mlock_kb |
| 169 | sysctl. This is done only if the user doesn't specify a --mmap-pages value. |
| 170 | |
Arnaldo Carvalho de Melo | 4462181 | 2016-04-11 15:49:11 -0300 | [diff] [blame] | 171 | --kernel-syscall-graph:: |
| 172 | Show the kernel callchains on the syscall exit path. |
| 173 | |
Arnaldo Carvalho de Melo | 5067a8c | 2018-10-18 16:38:27 -0300 | [diff] [blame] | 174 | --max-events=N:: |
| 175 | Stop after processing N events. Note that strace-like events are considered |
| 176 | only at exit time or when a syscall is interrupted, i.e. in those cases this |
| 177 | option is equivalent to the number of lines printed. |
| 178 | |
Arnaldo Carvalho de Melo | c6d4a49 | 2016-04-14 18:29:08 -0300 | [diff] [blame] | 179 | --max-stack:: |
| 180 | Set the stack depth limit when parsing the callchain, anything |
| 181 | beyond the specified depth will be ignored. Note that at this point |
| 182 | this is just about the presentation part, i.e. the kernel is still |
| 183 | not limiting, the overhead of callchains needs to be set via the |
| 184 | knobs in --call-graph dwarf. |
| 185 | |
Arnaldo Carvalho de Melo | 0561499 | 2016-04-15 16:41:19 -0300 | [diff] [blame] | 186 | Implies '--call-graph dwarf' when --call-graph not present on the |
| 187 | command line, on systems where DWARF unwinding was built in. |
| 188 | |
Arnaldo Carvalho de Melo | fe17608 | 2016-05-19 11:34:06 -0300 | [diff] [blame] | 189 | Default: /proc/sys/kernel/perf_event_max_stack when present for |
| 190 | live sessions (without --input/-i), 127 otherwise. |
Arnaldo Carvalho de Melo | c6d4a49 | 2016-04-14 18:29:08 -0300 | [diff] [blame] | 191 | |
Arnaldo Carvalho de Melo | 5cf9c84 | 2016-04-15 11:10:31 -0300 | [diff] [blame] | 192 | --min-stack:: |
| 193 | Set the stack depth limit when parsing the callchain, anything |
| 194 | below the specified depth will be ignored. Disabled by default. |
| 195 | |
Arnaldo Carvalho de Melo | 0561499 | 2016-04-15 16:41:19 -0300 | [diff] [blame] | 196 | Implies '--call-graph dwarf' when --call-graph not present on the |
| 197 | command line, on systems where DWARF unwinding was built in. |
| 198 | |
Arnaldo Carvalho de Melo | 591421e | 2018-01-22 11:38:54 -0300 | [diff] [blame] | 199 | --print-sample:: |
| 200 | Print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info for the |
| 201 | raw_syscalls:sys_{enter,exit} tracepoints, for debugging. |
| 202 | |
Kan Liang | 9d9cad7 | 2015-06-17 09:51:11 -0400 | [diff] [blame] | 203 | --proc-map-timeout:: |
| 204 | When processing pre-existing threads /proc/XXX/mmap, it may take a long time, |
| 205 | because the file may be huge. A time out is needed in such cases. |
| 206 | This option sets the time out limit. The default value is 500 ms. |
| 207 | |
Stanislav Fomichev | 598d02c | 2014-06-26 20:14:25 +0400 | [diff] [blame] | 208 | PAGEFAULTS |
| 209 | ---------- |
| 210 | |
| 211 | When tracing pagefaults, the format of the trace is as follows: |
| 212 | |
| 213 | <min|maj>fault [<ip.symbol>+<ip.offset>] => <addr.dso@addr.offset> (<map type><addr level>). |
| 214 | |
| 215 | - min/maj indicates whether fault event is minor or major; |
| 216 | - ip.symbol shows symbol for instruction pointer (the code that generated the |
| 217 | fault); if no debug symbols available, perf trace will print raw IP; |
| 218 | - addr.dso shows DSO for the faulted address; |
| 219 | - map type is either 'd' for non-executable maps or 'x' for executable maps; |
| 220 | - addr level is either 'k' for kernel dso or '.' for user dso. |
| 221 | |
| 222 | For symbols resolution you may need to install debugging symbols. |
| 223 | |
| 224 | Please be aware that duration is currently always 0 and doesn't reflect actual |
| 225 | time it took for fault to be handled! |
| 226 | |
| 227 | When --verbose specified, perf trace tries to print all available information |
| 228 | for both IP and fault address in the form of dso@symbol+offset. |
| 229 | |
| 230 | EXAMPLES |
| 231 | -------- |
| 232 | |
Stanislav Fomichev | e281a96 | 2014-06-26 20:14:28 +0400 | [diff] [blame] | 233 | Trace only major pagefaults: |
| 234 | |
| 235 | $ perf trace --no-syscalls -F |
| 236 | |
Stanislav Fomichev | 598d02c | 2014-06-26 20:14:25 +0400 | [diff] [blame] | 237 | Trace syscalls, major and minor pagefaults: |
| 238 | |
| 239 | $ perf trace -F all |
| 240 | |
| 241 | 1416.547 ( 0.000 ms): python/20235 majfault [CRYPTO_push_info_+0x0] => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.) |
| 242 | |
| 243 | As you can see, there was major pagefault in python process, from |
| 244 | CRYPTO_push_info_ routine which faulted somewhere in libcrypto.so. |
| 245 | |
Arnaldo Carvalho de Melo | 5067a8c | 2018-10-18 16:38:27 -0300 | [diff] [blame] | 246 | Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here): |
| 247 | |
| 248 | $ perf trace -e open* --max-events 4 |
| 249 | [root@jouet perf]# trace -e open* --max-events 4 |
| 250 | 2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31 |
| 251 | 2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65 |
| 252 | 3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65 |
| 253 | 4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 |
| 254 | $ |
| 255 | |
| 256 | Trace the first minor page fault when running a workload: |
| 257 | |
| 258 | # perf trace -F min --max-stack=7 --max-events 1 sleep 1 |
| 259 | 0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k) |
| 260 | __clear_user ([kernel.kallsyms]) |
| 261 | load_elf_binary ([kernel.kallsyms]) |
| 262 | search_binary_handler ([kernel.kallsyms]) |
| 263 | __do_execve_file.isra.33 ([kernel.kallsyms]) |
| 264 | __x64_sys_execve ([kernel.kallsyms]) |
| 265 | do_syscall_64 ([kernel.kallsyms]) |
| 266 | entry_SYSCALL_64 ([kernel.kallsyms]) |
| 267 | # |
| 268 | |
| 269 | Trace the next min page page fault to take place on the first CPU: |
| 270 | |
| 271 | # perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0 |
| 272 | 0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.) |
| 273 | js::gc::FreeSpan::initAsEmpty (inlined) |
| 274 | js::gc::Arena::setAsNotAllocated (inlined) |
| 275 | js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so) |
| 276 | js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so) |
| 277 | js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so) |
| 278 | js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so) |
| 279 | js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined) |
| 280 | js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so) |
| 281 | js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined) |
| 282 | JSThinInlineString::new_<(js::AllowGC)1> (inlined) |
| 283 | AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined) |
| 284 | js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so) |
| 285 | [0x18b26e6bc2bd] (/tmp/perf-17136.map) |
| 286 | # |
| 287 | |
Arnaldo Carvalho de Melo | a9c5e6c | 2018-10-22 14:14:16 -0300 | [diff] [blame] | 288 | Trace the next two sched:sched_switch events, four block:*_plug events, the |
| 289 | next block:*_unplug and the next three net:*dev_queue events, this last one |
| 290 | with a backtrace of at most 16 entries, system wide: |
| 291 | |
| 292 | # perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/ |
| 293 | 0.000 :0/0 sched:sched_switch:swapper/2:0 [120] S ==> rcu_sched:10 [120] |
| 294 | 0.015 rcu_sched/10 sched:sched_switch:rcu_sched:10 [120] R ==> swapper/2:0 [120] |
| 295 | 254.198 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=66 |
| 296 | __dev_queue_xmit ([kernel.kallsyms]) |
| 297 | 273.977 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=78 |
| 298 | __dev_queue_xmit ([kernel.kallsyms]) |
| 299 | 274.007 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ff00 len=78 |
| 300 | __dev_queue_xmit ([kernel.kallsyms]) |
| 301 | 2930.140 kworker/u16:58/2722 block:block_plug:[kworker/u16:58] |
| 302 | 2930.162 kworker/u16:58/2722 block:block_unplug:[kworker/u16:58] 1 |
| 303 | 4466.094 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8] |
| 304 | 8050.123 kworker/u16:30/2694 block:block_plug:[kworker/u16:30] |
| 305 | 8050.271 kworker/u16:30/2694 block:block_plug:[kworker/u16:30] |
| 306 | # |
| 307 | |
Arnaldo Carvalho de Melo | 514f1c6 | 2012-09-26 20:05:56 -0300 | [diff] [blame] | 308 | SEE ALSO |
| 309 | -------- |
| 310 | linkperf:perf-record[1], linkperf:perf-script[1] |