drm/i915/tgl: Add perf support on TGL

The design of the OA unit has been split into several units. We now
have a global unit (OAG) and a render specific unit (OAR). This leads
to some changes on how we program things. Some details :

OAR:
  - has its own set of counter registers, they are per-context
    saved/restored
  - counters are not written to the circular OA buffer
  - a snapshot of the counters can be acquired with
    MI_RECORD_PERF_COUNT, or a single counter can be read with
    MI_STORE_REGISTER_MEM.

OAG:
  - has global counters that increment across context switches
  - counters are written into the circular OA buffer (if requested)

v2: Fix checkpatch warnings on code style (Lucas)
v3: (Umesh)
  - Update register from which tail, status and head are read
  - Update logic to sample context reports
  - Update whitelist mux and b counter regs
v4: Fix a bug when updating context image for new contexts (Umesh)
v5: Squash patch enabling save/restore of counters into context image

    We want this so we can preempt performance queries and keep the
    system responsive even when long running queries are ongoing. We
    avoid doing it for all contexts.

    - use LRI to modify context control (Chris)
    - use MASKED_FIELD to program just the masked bits (Chris)
    - disable save/restore of counters on cleanup (Chris)
v6: Do not use implicit parameters (Chris)

BSpec: 28727, 30021

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Chris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-2-umesh.nerlige.ramappa@intel.com
6 files changed