bbeba3e58f040a4297a5ba88ebf6e2b16adc3657 - SHIFTPHONES/mainline/linux

commit	bbeba3e58f040a4297a5ba88ebf6e2b16adc3657
author	Steven Rostedt (VMware) <rostedt@goodmis.org>	Tue Jun 30 13:05:29 2020 -0400
committer	Steven Rostedt (VMware) <rostedt@goodmis.org>	Wed Jul 01 22:12:07 2020 -0400
tree	9561961bfd549a73c8324672c502b4e0b7847294
parent	74e879373b377f15d4ecb45bf8316b77e8badc49

ring-buffer: Call trace_clock_local() directly for RETPOLINE kernels

After doing some benchmarks and examining the code, I found that the ring
buffer clock calls were quite expensive, and noticed that they go through
retpolines. This is because the ring buffer clock is programmable and is
called through a function pointer. But in most cases it is simply set to
trace_clock_local(), the fastest nanosecond-resolution clock. For RETPOLINE
builds, checking if the ring buffer clock is set to trace_clock_local() and
then calling it directly has brought the average time of an event on my i7
box from 93 nanoseconds down to 83 nanoseconds, and the minimum time from
81 nanoseconds to 68 nanoseconds!
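
The check described above can be illustrated with a small user-space sketch.
It is not the kernel code from this commit: sketch_buffer, sketch_clock_local(),
sketch_clock_counter(), sketch_set_clock(), sketch_time_stamp() and the
SKETCH_RETPOLINE macro are hypothetical names that stand in for the ring
buffer, trace_clock_local(), an alternative clock, and the RETPOLINE build
option. The idea is only that comparing the stored pointer against the common
default lets the hot path use a direct call, while any other clock still goes
through the pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the RETPOLINE build option. */
#define SKETCH_RETPOLINE 1

typedef uint64_t (*sketch_clock_fn)(void);

/* Hypothetical stand-in for trace_clock_local(): wall time in nanoseconds. */
static uint64_t sketch_clock_local(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical alternative clock: a simple event counter. */
static uint64_t sketch_clock_counter(void)
{
	static uint64_t count;

	return ++count;
}

/* Hypothetical buffer with a programmable clock. */
struct sketch_buffer {
	sketch_clock_fn clock;
};

static void sketch_set_clock(struct sketch_buffer *buf, sketch_clock_fn clock)
{
	assert(clock != NULL);
	buf->clock = clock;
}

static uint64_t sketch_time_stamp(const struct sketch_buffer *buf)
{
	/* Common case: the default clock is set, so call it directly. */
	if (SKETCH_RETPOLINE && buf->clock == sketch_clock_local)
		return sketch_clock_local();

	/* Any other clock still needs the indirect call. */
	return buf->clock();
}
```

The pointer comparison is cheap and well predicted when the default clock is
in use, which is why it can cost less than an indirect call routed through a
retpoline. Whether it pays off on a given machine has to be measured.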

Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

kernel/trace/ring_buffer.c

1 file changed
