Intel Processor Trace
=====================

Overview
========

Intel Processor Trace (Intel PT) is an extension of Intel Architecture that
collects information about software execution such as control flow, execution
modes and timings and formats it into highly compressed binary packets.
Technical details are documented in the Intel 64 and IA-32 Architectures
Software Developer Manuals, Chapter 36 Intel Processor Trace.

Intel PT is first supported in Intel Core M and 5th generation Intel Core
processors that are based on the Intel micro-architecture code name Broadwell.

Trace data is collected by 'perf record' and stored within the perf.data file.
See below for options to 'perf record'.

Trace data must be 'decoded' which involves walking the object code and matching
the trace data packets. For example a TNT packet only tells whether a
conditional branch was taken or not taken, so to make use of that packet the
decoder must know precisely which instruction was being executed.

Decoding is done on-the-fly. The decoder outputs samples in the same format as
samples output by perf hardware events, for example as though the "instructions"
or "branches" events had been recorded. Presently 3 tools support this:
'perf script', 'perf report' and 'perf inject'. See below for more information
on using those tools.

The main distinguishing feature of Intel PT is that the decoder can determine
the exact flow of software execution. Intel PT can be used to understand why
and how software got to a certain point, or behaved a certain way. The
software does not have to be recompiled, so Intel PT works with debug or release
builds. However, the executed images are needed, which makes use in JIT-compiled
environments, or with self-modified code, a challenge. Also symbols need to be
provided to make sense of addresses.

A limitation of Intel PT is that it produces huge amounts of trace data
(hundreds of megabytes per second per core) which takes a long time to decode,
for example two or three orders of magnitude longer than it took to collect.
Another limitation is the performance impact of tracing, something that will
vary depending on the use-case and architecture.


Quickstart
==========

It is important to start small. That is because it is easy to capture vastly
more data than can possibly be processed.

The simplest thing to do with Intel PT is userspace profiling of small programs.
Data is captured with 'perf record' e.g. to trace 'ls' userspace-only:

	perf record -e intel_pt//u ls

And profiled with 'perf report' e.g.

	perf report

Tracing kernel space as well presents a problem, namely kernel self-modifying
code. A fairly good kernel image is available in /proc/kcore but to get an
accurate image a copy of /proc/kcore needs to be made under the same conditions
as the data capture. A script perf-with-kcore can do that, but beware that the
script makes use of 'sudo' to copy /proc/kcore. If you have perf installed
locally from the source tree you can do:

	~/libexec/perf-core/perf-with-kcore record pt_ls -e intel_pt// -- ls

which will create a directory named 'pt_ls' and put the perf.data file and
copies of /proc/kcore, /proc/kallsyms and /proc/modules into it. Then to use
'perf report' becomes:

	~/libexec/perf-core/perf-with-kcore report pt_ls

Because samples are synthesized after-the-fact, the sampling period can be
selected for reporting. e.g. to sample every microsecond:

	~/libexec/perf-core/perf-with-kcore report pt_ls --itrace=i1usge

See the sections below for more information about the --itrace option.

Beware that the smaller the period, the more samples are produced and the
longer it takes to process them.

Also note that the coarseness of Intel PT timing information will start to
distort the statistical value of the sampling as the sampling period becomes
smaller.

To represent software control flow, "branches" samples are produced. By default
a branch sample is synthesized for every single branch. To get an idea what
data is available you can use the 'perf script' tool with no parameters, which
will list all the samples.

	perf record -e intel_pt//u ls
	perf script

An interesting field that is not printed by default is 'flags' which can be
displayed as follows:

	perf script -Fcomm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,flags

The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
in transaction, respectively.
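
The letter-to-meaning mapping above can be written down as a small lookup
table. The following is only a sketch: it assumes the 'flags' field has already
been extracted from 'perf script' output as a plain string of these letters,
and the names FLAG_NAMES and describe_flags are made up for illustration.

```python
# Letter-to-meaning table, in the order given in the text ("bcrosyiABEx").
FLAG_NAMES = {
    "b": "branch",
    "c": "call",
    "r": "return",
    "o": "conditional",
    "s": "system",
    "y": "asynchronous",
    "i": "interrupt",
    "A": "transaction abort",
    "B": "trace begin",
    "E": "trace end",
    "x": "in transaction",
}


def describe_flags(flags):
    """Expand a flags string such as "bc" into a list of readable names.

    Characters that are not in the table (for example padding) are skipped.
    """
    return [FLAG_NAMES[letter] for letter in flags if letter in FLAG_NAMES]


if __name__ == "__main__":
    print(describe_flags("bc"))
```

For example, a sample whose flags are "bc" is a branch that is also a call.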

While it is possible to create scripts to analyze the data, an alternative
approach is available to export the data to a postgresql database. Refer to
script export-to-postgresql.py for more details, and to script
call-graph-from-postgresql.py for an example of using the database.

As mentioned above, it is easy to capture too much data. One way to limit the
data captured is to use 'snapshot' mode. Refer to 'new snapshot option' and
'Intel PT modes of operation' further below.

Another problem that will be experienced is decoder errors. They can be caused
by inability to access the executed image, self-modified or JIT-ed code, or the
inability to match side-band information (such as context switches and mmaps)
which results in the decoder not knowing what code was executed.

There is also the problem of perf not being able to copy the data fast enough,
resulting in data being lost because the buffer was full. See 'Buffer handling'
below for more details.


perf record
===========

new event
---------

The Intel PT kernel driver creates a new PMU for Intel PT. PMU events are
selected by providing the PMU name followed by the "config" separated by slashes.
An enhancement has been made to allow default "config" e.g. the option

	-e intel_pt//

will use a default config value. Currently that is the same as

	-e intel_pt/tsc,noretcomp=0/

which is the same as

	-e intel_pt/tsc=1,noretcomp=0/

Note there are now new config terms - see section 'config terms' further below.

The config terms are listed in /sys/devices/intel_pt/format. They are bit
fields within the config member of the struct perf_event_attr which is
passed to the kernel by the perf_event_open system call. They correspond to bit
fields in the IA32_RTIT_CTL MSR. Here is a list of them and their definitions:

	$ grep -H . /sys/bus/event_source/devices/intel_pt/format/*
	/sys/bus/event_source/devices/intel_pt/format/cyc:config:1
	/sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22
	/sys/bus/event_source/devices/intel_pt/format/mtc:config:9
	/sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17
	/sys/bus/event_source/devices/intel_pt/format/noretcomp:config:11
	/sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27
	/sys/bus/event_source/devices/intel_pt/format/tsc:config:10

Note that the default config must be overridden for each term i.e.

	-e intel_pt/noretcomp=0/

is the same as:

	-e intel_pt/tsc=1,noretcomp=0/

So, to disable TSC packets use:

	-e intel_pt/tsc=0/

It is also possible to specify the config value explicitly:

	-e intel_pt/config=0x400/
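
To illustrate how the config terms relate to the explicit config value, here is
a minimal sketch that packs term values into a number using the bit ranges from
the format listing above. The names FORMAT and build_config are hypothetical,
and the bit positions are simply copied from that listing rather than read from
sysfs, so they may differ on another kernel.

```python
# Bit ranges copied from the format listing above: name -> (low bit, high bit).
FORMAT = {
    "cyc": (1, 1),
    "cyc_thresh": (19, 22),
    "mtc": (9, 9),
    "mtc_period": (14, 17),
    "noretcomp": (11, 11),
    "psb_period": (24, 27),
    "tsc": (10, 10),
}


def build_config(**terms):
    """Pack config term values into a single integer config value."""
    config = 0
    for name, value in terms.items():
        low, high = FORMAT[name]
        width = high - low + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        config |= value << low
    return config


if __name__ == "__main__":
    # tsc=1,noretcomp=0 sets only bit 10.
    print(hex(build_config(tsc=1, noretcomp=0)))
```

With these bit positions, tsc=1,noretcomp=0 packs to 0x400, which is the value
used in the explicit example above.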

Note that, as with all events, the event is suffixed with event modifiers:

	u	userspace
	k	kernel
	h	hypervisor
	G	guest
	H	host
	p	precise ip

'h', 'G' and 'H' are for virtualization which is not supported by Intel PT.
'p' is also not relevant to Intel PT. So only options 'u' and 'k' are
meaningful for Intel PT.

perf_event_attr is displayed if the -vv option is used e.g.

	------------------------------------------------------------
	perf_event_attr:
	  type                             6
	  size                             112
	  config                           0x400
	  { sample_period, sample_freq }   1
	  sample_type                      IP|TID|TIME|CPU|IDENTIFIER
	  read_format                      ID
	  disabled                         1
	  inherit                          1
	  exclude_kernel                   1
	  exclude_hv                       1
	  enable_on_exec                   1
	  sample_id_all                    1
	------------------------------------------------------------
	sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
	------------------------------------------------------------


config terms
------------

The June 2015 version of Intel 64 and IA-32 Architectures Software Developer
Manuals, Chapter 36 Intel Processor Trace, defined new Intel PT features.
Some of the features are reflected in new config terms. All the config terms are
described below.

tsc		Always supported. Produces TSC timestamp packets to provide
		timing information. In some cases it is possible to decode
		without timing information, for example a per-thread context
		that does not overlap executable memory maps.

		The default config selects tsc (i.e. tsc=1).

noretcomp	Always supported. Disables "return compression" so a TIP packet
		is produced when a function returns. Causes more packets to be
		produced but might make decoding more reliable.

		The default config does not select noretcomp (i.e. noretcomp=0).

psb_period	Allows the frequency of PSB packets to be specified.

		The PSB packet is a synchronization packet that provides a
		starting point for decoding or recovery from errors.

		Support for psb_period is indicated by:

		/sys/bus/event_source/devices/intel_pt/caps/psb_cyc

		which contains "1" if the feature is supported and "0"
		otherwise.

		Valid values are given by:

		/sys/bus/event_source/devices/intel_pt/caps/psb_periods

		which contains a hexadecimal value, the bits of which represent
		valid values e.g. bit 2 set means value 2 is valid.

		The psb_period value is converted to the approximate number of
		trace bytes between PSB packets as:

			2 ^ (value + 11)

		e.g. value 3 means 16KiB between PSBs

		If an invalid value is entered, the error message
		will give a list of valid values e.g.

			$ perf record -e intel_pt/psb_period=15/u uname
			Invalid psb_period for intel_pt. Valid values are: 0-5

		If MTC packets are selected, the default config selects a value
		of 3 (i.e. psb_period=3) or the nearest lower value that is
		supported (0 is always supported). Otherwise the default is 0.

		If decoding is expected to be reliable and the buffer is large
		then a large PSB period can be used.

		Because a TSC packet is produced with PSB, the PSB period can
		also affect the granularity of timing information in the absence
		of MTC or CYC.
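
As a worked example of the two calculations described for psb_period, the
sketch below decodes a capability bitmask into the list of valid values and
converts a psb_period value into the approximate byte spacing. The function
names are hypothetical, and the mask "3f" is an assumed value chosen because it
corresponds to the "0-5" range in the error message above; the real mask must be
read from the caps/psb_periods file.

```python
def valid_values(caps_hex):
    """Return the values whose bit is set in a hexadecimal capability mask."""
    mask = int(caps_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def psb_period_bytes(value):
    """Approximate number of trace bytes between PSB packets: 2 ^ (value + 11)."""
    return 2 ** (value + 11)


if __name__ == "__main__":
    for value in valid_values("3f"):  # "3f" is an assumed example mask
        print(value, psb_period_bytes(value))
```

The same bitmask decoding applies to the mtc_periods and cycle_thresholds
capability files described below.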

mtc		Produces MTC timing packets.

		MTC packets provide finer grain timestamp information than TSC
		packets. MTC packets record time using the hardware crystal
		clock (CTC) which is related to TSC packets using a TMA packet.

		Support for this feature is indicated by:

		/sys/bus/event_source/devices/intel_pt/caps/mtc

		which contains "1" if the feature is supported and
		"0" otherwise.

		The frequency of MTC packets can also be specified - see
		mtc_period below.

mtc_period	Specifies how frequently MTC packets are produced - see mtc
		above for how to determine if MTC packets are supported.

		Valid values are given by:

		/sys/bus/event_source/devices/intel_pt/caps/mtc_periods

		which contains a hexadecimal value, the bits of which represent
		valid values e.g. bit 2 set means value 2 is valid.

		The mtc_period value is converted to the MTC frequency as:

			CTC-frequency / (2 ^ value)

		e.g. value 3 means one eighth of CTC-frequency

		Where CTC is the hardware crystal clock, the frequency of which
		can be related to TSC via values provided in cpuid leaf 0x15.

		If an invalid value is entered, the error message
		will give a list of valid values e.g.

			$ perf record -e intel_pt/mtc_period=15/u uname
			Invalid mtc_period for intel_pt. Valid values are: 0,3,6,9

		The default value is 3 or the nearest lower value
		that is supported (0 is always supported).
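
The mtc_period arithmetic can be sketched as follows. The function names are
hypothetical, and the frequencies and cpuid ratio in the example are assumed
numbers picked only to make the arithmetic easy to follow. The sketch takes
cpuid leaf 0x15 to report the TSC to crystal clock ratio as EBX / EAX.

```python
def ctc_frequency_hz(tsc_hz, cpuid15_eax, cpuid15_ebx):
    """Derive the crystal clock (CTC) frequency from the TSC frequency.

    Assumes TSC frequency = CTC frequency * EBX / EAX (cpuid leaf 0x15).
    """
    return tsc_hz * cpuid15_eax / cpuid15_ebx


def mtc_frequency_hz(ctc_hz, mtc_period):
    """MTC packet frequency: CTC-frequency / (2 ^ value)."""
    return ctc_hz / (2 ** mtc_period)


if __name__ == "__main__":
    # Assumed example: 2.4 GHz TSC with a ratio of 200 / 2.
    ctc = ctc_frequency_hz(2_400_000_000, 2, 200)
    print(ctc, mtc_frequency_hz(ctc, 3))
```

With a 24 MHz crystal clock, mtc_period=3 gives one MTC packet per 8 crystal
clock ticks, i.e. an MTC frequency of 3 MHz.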

cyc		Produces CYC timing packets.

		CYC packets provide even finer grain timestamp information than
		MTC and TSC packets. A CYC packet contains the number of CPU
		cycles since the last CYC packet. Unlike MTC and TSC packets,
		CYC packets are only sent when another packet is also sent.

		Support for this feature is indicated by:

		/sys/bus/event_source/devices/intel_pt/caps/psb_cyc

		which contains "1" if the feature is supported and
		"0" otherwise.

		The number of CYC packets produced can be reduced by specifying
		a threshold - see cyc_thresh below.

cyc_thresh	Specifies how frequently CYC packets are produced - see cyc
		above for how to determine if CYC packets are supported.

		Valid cyc_thresh values are given by:

		/sys/bus/event_source/devices/intel_pt/caps/cycle_thresholds

		which contains a hexadecimal value, the bits of which represent
		valid values e.g. bit 2 set means value 2 is valid.

		The cyc_thresh value represents the minimum number of CPU cycles
		that must have passed before a CYC packet can be sent. The
		number of CPU cycles is:

			2 ^ (value - 1)

		e.g. value 4 means 8 CPU cycles must pass before a CYC packet
		can be sent. Note a CYC packet is still only sent when another
		packet is sent, not at, e.g. every 8 CPU cycles.

		If an invalid value is entered, the error message
		will give a list of valid values e.g.

			$ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
			Invalid cyc_thresh for intel_pt. Valid values are: 0-12

		CYC packets are not requested by default.
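
The cyc_thresh formula can be checked with a few lines of code. This is a
sketch only: the function name is hypothetical, and it deliberately rejects
value 0 because the formula as written above does not yield a whole number of
cycles for it.

```python
def cyc_thresh_cycles(value):
    """Minimum CPU cycles before a CYC packet can be sent: 2 ^ (value - 1)."""
    if value < 1:
        # Value 0 is listed as valid above, but 2 ^ (0 - 1) is not a whole
        # number of cycles, so this sketch does not try to interpret it.
        raise ValueError("formula is only applied here for value >= 1")
    return 2 ** (value - 1)


if __name__ == "__main__":
    for value in (1, 4, 12):
        print(value, cyc_thresh_cycles(value))
```

This reproduces the example in the text: value 4 corresponds to 8 CPU cycles.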

no_force_psb	This is a driver option and is not in the IA32_RTIT_CTL MSR.

		It stops the driver resetting the byte count to zero whenever
		enabling the trace (for example on context switches) which in
		turn results in no PSB being forced. However some processors
		will produce a PSB anyway.

		In any case, there is still a PSB when the trace is enabled for
		the first time.

		no_force_psb can be used to slightly decrease the trace size but
		may make it harder for the decoder to recover from errors.

		no_force_psb is not selected by default.


new snapshot option
-------------------

The difference between full trace and snapshot from the kernel's perspective is
that in full trace we don't overwrite trace data that the user hasn't collected
yet (and indicated that by advancing aux_tail), whereas in snapshot mode we let
the trace run and overwrite older data in the buffer so that whenever something
interesting happens, we can stop it and grab a snapshot of what was going on
around that interesting moment.

To select snapshot mode a new option has been added:

	-S

Optionally it can be followed by the snapshot size e.g.

	-S0x100000

The default snapshot size is the auxtrace mmap size. If neither auxtrace mmap size
nor snapshot size is specified, then the default is 4MiB for privileged users
(or if /proc/sys/kernel/perf_event_paranoid < 0), 128KiB for unprivileged users.
If an unprivileged user does not specify mmap pages, the mmap pages will be
reduced as described in the 'new auxtrace mmap size option' section below.

The snapshot size is displayed if the option -vv is used e.g.

	Intel PT snapshot size: %zu


new auxtrace mmap size option
-----------------------------

Intel PT buffer size is specified by an addition to the -m option e.g.

	-m,16

selects a buffer size of 16 pages i.e. 64KiB.

Note that the existing functionality of -m is unchanged. The auxtrace mmap size
is specified by the optional addition of a comma and the value.

The default auxtrace mmap size for Intel PT is 4MiB/page_size for privileged users
(or if /proc/sys/kernel/perf_event_paranoid < 0), 128KiB for unprivileged users.
If an unprivileged user does not specify mmap pages, the mmap pages will be
reduced from the default 512KiB/page_size to 256KiB/page_size, otherwise the
user is likely to get an error as they exceed their mlock limit (Max locked
memory as shown in /proc/self/limits). Note that perf does not count the first
512KiB (actually /proc/sys/kernel/perf_event_mlock_kb minus 1 page) per cpu
against the mlock limit so an unprivileged user is allowed 512KiB per cpu plus
their mlock limit (which defaults to 64KiB but is not multiplied by the number
of cpus).

In full-trace mode, powers of two are allowed for buffer size, with a minimum
size of 2 pages. In snapshot mode, it is the same but the minimum size is
1 page.
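
The size rules above can be expressed as a small check. This is an illustrative
sketch, not code taken from perf: the function names are made up, and the 4096
byte page size is an assumption that matches the "16 pages i.e. 64KiB" example.

```python
def aux_pages_allowed(pages, snapshot=False):
    """Check a buffer size in pages against the rules described above.

    Sizes must be a power of two; the minimum is 2 pages in full-trace mode
    and 1 page in snapshot mode.
    """
    minimum = 1 if snapshot else 2
    is_power_of_two = pages > 0 and (pages & (pages - 1)) == 0
    return is_power_of_two and pages >= minimum


def aux_bytes(pages, page_size=4096):
    """Convert a page count to bytes (page_size of 4096 is an assumption)."""
    return pages * page_size


if __name__ == "__main__":
    for pages in (1, 2, 16, 24):
        print(pages, aux_pages_allowed(pages), aux_pages_allowed(pages, snapshot=True))
```

For example, -m,16 passes the check in both modes, whereas 24 pages is rejected
because it is not a power of two.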

The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.

	mmap length 528384
	auxtrace mmap length 4198400


Intel PT modes of operation
---------------------------

Intel PT can be used in 2 modes:
	full-trace mode
	snapshot mode

Full-trace mode traces continuously e.g.

	perf record -e intel_pt//u uname

Snapshot mode captures the available data when a signal is sent e.g.

	perf record -v -e intel_pt//u -S ./loopy 1000000000 &
	[1] 11435
	kill -USR2 11435
	Recording AUX area tracing snapshot

Note that the signal sent is SIGUSR2.
Note that "Recording AUX area tracing snapshot" is displayed because the -v
option is used.

The 2 modes cannot be used together.
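
The snapshot sequence shown above (start 'perf record -S', then send SIGUSR2)
could be driven from a script along these lines. This is a sketch under stated
assumptions: the helper names are hypothetical, the command simply mirrors the
example above, and the caller is assumed to have started perf itself and to
know its process id.

```python
import os
import signal


def snapshot_record_command(workload, snapshot_size=None):
    """Build a 'perf record' command line for snapshot mode.

    workload is a list such as ["./loopy", "1000000000"]. If snapshot_size is
    given it is appended to -S in hexadecimal, as in the -S0x100000 example.
    """
    option = "-S" if snapshot_size is None else f"-S{snapshot_size:#x}"
    return ["perf", "record", "-e", "intel_pt//u", option, *workload]


def request_snapshot(perf_pid):
    """Ask a running 'perf record -S' process to take a snapshot."""
    os.kill(perf_pid, signal.SIGUSR2)


if __name__ == "__main__":
    print(" ".join(snapshot_record_command(["./loopy", "1000000000"], 0x100000)))
```

The command list can be passed to subprocess.Popen, whose pid attribute then
supplies the argument for request_snapshot.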


Buffer handling
---------------

There may be buffer limitations (i.e. single ToPa entry) which means that actual
buffer sizes are limited to powers of 2 up to 4MiB (MAX_ORDER). In order to
provide other sizes, and in particular an arbitrarily large size, multiple
buffers are logically concatenated. However an interrupt must be used to switch
between buffers. That has two potential problems:
	a) the interrupt may not be handled in time so that the current buffer
	becomes full and some trace data is lost.
	b) the interrupts may slow the system and affect the performance
	results.

If trace data is lost, the driver sets 'truncated' in the PERF_RECORD_AUX event
which the tools report as an error.

In full-trace mode, the driver waits for data to be copied out before allowing
the (logical) buffer to wrap-around. If data is not copied out quickly enough,
again 'truncated' is set in the PERF_RECORD_AUX event. If the driver has to
wait, the intel_pt event gets disabled. Because it is difficult to know when
that happens, perf tools always re-enable the intel_pt event after copying out
data.


Intel PT and build ids
----------------------

By default "perf record" post-processes the event stream to find all build ids
for executables for all addresses sampled. Deliberately, Intel PT is not
decoded for that purpose (it would take too long). Instead the build ids for
all executables encountered (due to mmap, comm or task events) are included
in the perf.data file.

To see buildids included in the perf.data file use the command:

	perf buildid-list

If the perf.data file contains Intel PT data, that is the same as:

	perf buildid-list --with-hits


Snapshot mode and event disabling
---------------------------------

In order to make a snapshot, the intel_pt event is disabled using an IOCTL,
namely PERF_EVENT_IOC_DISABLE. However doing that can also disable the
collection of side-band information. In order to prevent that, a dummy
software event has been introduced that permits tracking events (like mmaps) to
continue to be recorded while intel_pt is disabled. That is important to ensure
there is complete side-band information to allow the decoding of subsequent
snapshots.

A test has been created for that. To find the test:

	perf test list
	...
	23: Test using a dummy software event to keep tracking

To run the test:

	perf test 23
	23: Test using a dummy software event to keep tracking     : Ok


perf record modes (nothing new here)
------------------------------------

perf record essentially operates in one of three modes:
	per thread
	per cpu
	workload only

"per thread" mode is selected by -t or by --per-thread (with -p or -u or just a
workload).
"per cpu" is selected by -C or -a.
"workload only" mode is selected by not using the other options but providing a
command to run (i.e. the workload).

In per-thread mode an exact list of threads is traced. There is no inheritance.
Each thread has its own event buffer.

In per-cpu mode all processes (or processes from the selected cgroup i.e. -G
option, or processes selected with -p or -u) are traced. Each cpu has its own
buffer. Inheritance is allowed.

In workload-only mode, the workload is traced but with per-cpu buffers.
Inheritance is allowed. Note that you can now trace a workload in per-thread
mode by using the --per-thread option.


Privileged vs non-privileged users
----------------------------------

Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users
have memory limits imposed upon them. That affects what buffer sizes they can
have as outlined above.

Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users are
not permitted to use tracepoints which means there is insufficient side-band
information to decode Intel PT in per-cpu mode, and potentially workload-only
mode too if the workload creates new processes.

Note also, that to use tracepoints, read-access to debugfs is required. So if
debugfs is not mounted or the user does not have read-access, it will again not
be possible to decode Intel PT in per-cpu mode.


sched_switch tracepoint
-----------------------

The sched_switch tracepoint is used to provide side-band data for Intel PT
decoding. sched_switch events are automatically added. e.g. the second event
shown below

	$ perf record -vv -e intel_pt//u uname
	------------------------------------------------------------
	perf_event_attr:
	  type                             6
	  size                             112
	  config                           0x400
	  { sample_period, sample_freq }   1
	  sample_type                      IP|TID|TIME|CPU|IDENTIFIER
	  read_format                      ID
	  disabled                         1
	  inherit                          1
	  exclude_kernel                   1
	  exclude_hv                       1
	  enable_on_exec                   1
	  sample_id_all                    1
	------------------------------------------------------------
	sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
	------------------------------------------------------------
	perf_event_attr:
	  type                             2
	  size                             112
	  config                           0x108
	  { sample_period, sample_freq }   1
	  sample_type                      IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER
	  read_format                      ID
	  inherit                          1
	  sample_id_all                    1
	  exclude_guest                    1
	------------------------------------------------------------
	sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
	sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
	sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
	sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
	------------------------------------------------------------
	perf_event_attr:
	  type                             1
	  size                             112
	  config                           0x9
	  { sample_period, sample_freq }   1
	  sample_type                      IP|TID|TIME|IDENTIFIER
	  read_format                      ID
	  disabled                         1
	  inherit                          1
	  exclude_kernel                   1
	  exclude_hv                       1
	  mmap                             1
	  comm                             1
	  enable_on_exec                   1
	  task                             1
	  sample_id_all                    1
	  mmap2                            1
	  comm_exec                        1
	------------------------------------------------------------
	sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
	sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
	mmap size 528384B
	AUX area mmap length 4194304
	perf event ring buffer mmapped per cpu
	Synthesizing auxtrace information
	Linux
	[ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 0.042 MB perf.data ]

Note, the sched_switch event is only added if the user is permitted to use it
and only in per-cpu mode.

Note also, the sched_switch event is only added if TSC packets are requested.
That is because, in the absence of timing information, the sched_switch events
cannot be matched against the Intel PT trace.


perf script
===========

By default, perf script will decode trace data found in the perf.data file.
This can be further controlled by new option --itrace.


New --itrace option
-------------------

Having no option is the same as

	--itrace

which, in turn, is the same as

	--itrace=ibxe

The letters are:

	i	synthesize "instructions" events
	b	synthesize "branches" events
	x	synthesize "transactions" events
	c	synthesize branches events (calls only)
	r	synthesize branches events (returns only)
	e	synthesize tracing error events
	d	create a debug log
	g	synthesize a call chain (use with i or x)

"Instructions" events look like they were recorded by "perf record -e
instructions".

"Branches" events look like they were recorded by "perf record -e branches". "c"
and "r" can be combined to get calls and returns.

"Transactions" events correspond to the start or end of transactions. The
'flags' field can be used in perf script to determine whether the event is a
transaction start, commit or abort.

Error events are new. They show where the decoder lost the trace. Error events
are quite important. Users must know if what they are seeing is a complete
picture or not.

The "d" option will cause the creation of a file "intel_pt.log" containing all
decoded packets and instructions. Note that this option slows down the decoder
and that the resulting file may be very large.

In addition, the period of the "instructions" event can be specified. e.g.

	--itrace=i10us

sets the period to 10us i.e. one instruction sample is synthesized for each 10
microseconds of trace. Alternatives to "us" are "ms" (milliseconds),
"ns" (nanoseconds), "t" (TSC ticks) or "i" (instructions).

"ms", "us" and "ns" are converted to TSC ticks.
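
To give a feel for the conversion of time units to TSC ticks, here is a rough
sketch based purely on a nominal TSC frequency. The function name and the
2.4 GHz figure are assumptions for illustration; the perf tools may derive the
conversion differently (for example from time conversion parameters provided by
the kernel), so treat the numbers as approximate.

```python
# Nanoseconds per unit for the time-based period suffixes named in the text.
NS_PER_UNIT = {"ms": 1_000_000, "us": 1_000, "ns": 1}


def period_to_tsc_ticks(period, unit, tsc_hz):
    """Convert a period such as (10, "us") to TSC ticks at tsc_hz."""
    return period * NS_PER_UNIT[unit] * tsc_hz // 1_000_000_000


if __name__ == "__main__":
    # --itrace=i10us with an assumed 2.4 GHz TSC
    print(period_to_tsc_ticks(10, "us", 2_400_000_000))
```

With the assumed 2.4 GHz TSC, a 10us period corresponds to 24000 ticks.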

The timing information included with Intel PT does not give the time of every
instruction. Consequently, for the purpose of sampling, the decoder estimates
the time since the last timing packet based on 1 tick per instruction. The time
on the sample is not adjusted and reflects the last known value of TSC.

For Intel PT, the default period is 100us.

Also the call chain size (default 16, max. 1024) for instructions or
transactions events can be specified. e.g.

	--itrace=ig32
	--itrace=xg32

To disable trace decoding entirely, use the option --no-itrace.


dump option
-----------

perf script has an option (-D) to "dump" the events i.e. display the binary
data.

When -D is used, Intel PT packets are displayed. The packet decoder does not
pay attention to PSB packets, but just decodes the bytes - so the packets seen
by the actual decoder may not be identical in places where the data is corrupt.
One example of that would be when the buffer-switching interrupt has been too
slow, and the buffer has been filled completely. In that case, the last packet
in the buffer might be truncated and immediately followed by a PSB as the trace
continues in the next buffer.

To disable the display of Intel PT packets, combine the -D option with
--no-itrace.


perf report
===========

By default, perf report will decode trace data found in the perf.data file.
This can be further controlled by new option --itrace exactly the same as
perf script, with the exception that the default is --itrace=igxe.


perf inject
===========

perf inject also accepts the --itrace option in which case tracing data is
removed and replaced with the synthesized events. e.g.

	perf inject --itrace -i perf.data -o perf.data.new