Coresight CPU Debug Module
==========================

Author: Leo Yan <leo.yan@linaro.org>
Date: April 5th, 2017

Introduction
------------

The Coresight CPU debug module is defined in the ARMv8-a architecture
reference manual (ARM DDI 0487A.k), 'Part H: External debug'. A CPU can
integrate a debug module, which is mainly used in two modes: self-hosted
debug and external debug. External debug is the better-known mode, in which
an external debugger connects to the SoC through the JTAG port; a program
can, on the other hand, use debugging methods that rely on the self-hosted
debug mode. This document focuses on the latter.

The debug module provides the sample-based profiling extension, which can be
used to sample the CPU program counter, secure state, exception level, etc.;
usually every CPU has one dedicated debug module connected to it. Based on
the self-hosted debug mechanism, the Linux kernel can access the related
registers through the mmio region when a kernel panic happens. The callback
notifier for kernel panic dumps the related registers for every CPU, which
helps when analysing the panic.


Implementation
--------------

- During driver registration, the driver uses two device ID registers,
  EDDEVID and EDDEVID1, to decide whether sample-based profiling is
  implemented. On some platforms this hardware feature is fully or partially
  implemented; if the feature is not supported, registration fails.

- At the time this documentation was written, the debug driver mainly relies
  on information gathered by the kernel panic callback notifier from three
  sampling registers: EDPCSR, EDVIDSR and EDCIDSR. EDPCSR gives the program
  counter; EDVIDSR holds information on secure state, exception level, bit
  width, etc.; EDCIDSR is the context ID value, which contains the sampled
  value of CONTEXTIDR_EL1.

- The driver supports a CPU running in either AArch64 or AArch32 mode. The
  register naming convention differs slightly between them: AArch64 uses
  'ED' as the register prefix (ARM DDI 0487A.k, chapter H9.1) and AArch32
  uses 'DBG' as the prefix (ARM DDI 0487A.k, chapter G5.1). The driver uses
  the AArch64 naming convention throughout.

- ARMv8-a (ARM DDI 0487A.k) and ARMv7-a (ARM DDI 0406C.b) define some
  register bits differently, so the driver consolidates two differences:

  If PCSROffset=0b0000, on ARMv8-a the EDPCSR feature is not implemented,
  but ARMv7-a defines "PCSR samples are offset by a value that depends on the
  instruction set state". For ARMv7-a, the driver additionally checks whether
  the CPU runs the ARM or Thumb instruction set and calibrates the PCSR value
  accordingly; the detailed description of the offset is in the ARMv7-a ARM
  (ARM DDI 0406C.b), chapter C11.11.34 "DBGPCSR, Program Counter Sampling
  Register".

  If PCSROffset=0b0010, ARMv8-a defines "EDPCSR implemented, and samples have
  no offset applied and do not sample the instruction set state in AArch32
  state". So on ARMv8, if EDDEVID1.PCSROffset is 0b0010 and the CPU operates
  in AArch32 state, EDPCSR is not sampled; when the CPU operates in AArch64
  state, EDPCSR is sampled and no offset is applied.
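
The PCSR calibration for ARMv7-a described above can be sketched in plain C as
follows. This is an illustrative sketch only, not the driver's code: the helper
name adjust_pc_sample() and the macro names are hypothetical, and the offsets
(8 bytes in ARM state, 4 bytes in Thumb state, with bit 0 of the sample
flagging Thumb state) are assumed from the ARMv7-a DBGPCSR description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of an ARMv7-a PC sample: bit 0 set means Thumb state. */
#define PCSR_THUMB            (UINT32_C(1) << 0)
#define PCSR_THUMB_INST_MASK  (~UINT32_C(0x1))  /* bits [31:1] */
#define PCSR_ARM_INST_MASK    (~UINT32_C(0x3))  /* bits [31:2] */

/*
 * Hypothetical helper: turn a raw 32-bit PC sample into an instruction
 * address. has_offset is true when samples carry the instruction-set
 * dependent offset (the ARMv7-a PCSROffset=0b0000 case).
 */
static uint32_t adjust_pc_sample(uint32_t sample, bool has_offset)
{
    uint32_t arm_offset = has_offset ? 8 : 0;
    uint32_t thumb_offset = has_offset ? 4 : 0;
    uint32_t pc;

    if (sample & PCSR_THUMB) {
        pc = (sample & PCSR_THUMB_INST_MASK) - thumb_offset;
        /* Thumb instruction addresses are halfword aligned. */
        assert((pc & 1) == 0);
        return pc;
    }

    /*
     * Bit 1 set without bit 0 is not a valid ARM-state sample in this
     * sketch; keep the raw value rather than guess.
     */
    if (sample & (UINT32_C(1) << 1))
        return sample;

    return (sample & PCSR_ARM_INST_MASK) - arm_offset;
}
```

With has_offset set to false the helper only strips the state bits, which
corresponds to the case where no offset is applied to the samples.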


Clock and power domain
----------------------

Before accessing the debug registers, we should ensure that the clock and
power domain have been enabled properly. According to the ARMv8-a ARM (ARM
DDI 0487A.k), chapter 'H9.1 Debug registers', the debug registers are spread
across two domains: the debug domain and the CPU domain.

                                +---------------+
                                |               |
                                |               |
                     +----------+--+            |
        dbg_clock -->|          |**|            |<-- cpu_clock
                     |    Debug |**|   CPU      |
 dbg_power_domain -->|          |**|            |<-- cpu_power_domain
                     +----------+--+            |
                                |               |
                                |               |
                                +---------------+

For the debug domain, the user uses the DT bindings "clocks" and
"power-domains" to specify the corresponding clock source and power supply
for the debug logic. The driver calls the pm_runtime_{put|get} operations as
needed to handle the debug power domain.

For the CPU domain, different SoC designs have different power management
schemes, and this heavily impacts the external debug module. We can divide
them into the following cases:

- On systems with a sane power controller which behaves correctly with
  respect to the CPU power domain, the CPU power domain can be controlled
  through register EDPRCR in the driver. The driver first writes bit
  EDPRCR.COREPURQ to power up the CPU, and then writes bit EDPRCR.CORENPDRQ
  to emulate CPU power down. As a result, the CPU power domain stays powered
  on while the debug related registers are accessed;

- Some designs will power down an entire cluster if all CPUs on the cluster
  are powered down, including the parts of the debug registers that should
  remain powered in the debug power domain. The bits in EDPRCR are not
  respected in these cases, so these designs do not support debug over
  power down in the way that the CoreSight / Debug designers anticipated.
  This means that even checking EDPRSR has the potential to cause a bus hang
  if the target register is unpowered.

  In this case, accessing the debug registers while they are not powered is
  a recipe for disaster, so we need to prevent CPU low power states at boot
  time or when the user enables the module at run time. Please see chapter
  "How to use the module" for detailed usage info.
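
For the first case only (a power controller that honours EDPRCR), the power-up
sequence can be sketched as below. This is a simulation under stated
assumptions, not the driver's implementation: struct dbg_power_regs and
force_cpu_powered_up() are hypothetical names, the registers are modelled as
plain memory instead of an mmio mapping, and the bit positions are assumed
from the ARMv8-a EDPRCR and EDPRSR descriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions (ARMv8-a EDPRCR / EDPRSR). */
#define EDPRCR_CORENPDRQ  (UINT32_C(1) << 0)  /* core no powerdown request */
#define EDPRCR_COREPURQ   (UINT32_C(1) << 3)  /* core powerup request */
#define EDPRSR_PU         (UINT32_C(1) << 0)  /* core is powered up */

/*
 * Hypothetical stand-in for the two registers. Kernel code would use
 * readl()/writel() on the mapped debug region instead.
 */
struct dbg_power_regs {
    volatile uint32_t edprcr;
    volatile uint32_t edprsr;
};

/*
 * Request power-up, poll EDPRSR.PU a bounded number of times, then ask
 * the power controller to emulate power down so the core stays powered.
 * Returns false if the core never reports being powered up.
 */
static bool force_cpu_powered_up(struct dbg_power_regs *regs,
                                 unsigned int max_polls)
{
    assert(regs != NULL);

    regs->edprcr |= EDPRCR_COREPURQ;

    while (!(regs->edprsr & EDPRSR_PU)) {
        if (max_polls-- == 0)
            return false;
    }

    regs->edprcr |= EDPRCR_CORENPDRQ;
    return true;
}
```

On designs of the second kind this sequence must not be relied on, because
even the EDPRSR read in the polling loop may hang the bus.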


Device Tree Bindings
--------------------

See Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt for details.


How to use the module
---------------------

If you want to enable debugging functionality at boot time, you can add
"coresight_cpu_debug.enable=1" to the kernel command line.

The driver can also be built as a module, so debugging can be enabled when
the module is inserted:
  # insmod coresight_cpu_debug.ko enable=1

If debugging was not enabled at boot time or when the module was inserted,
the driver provides a debugfs knob to enable or disable debugging
dynamically:

To enable it, write a '1' into /sys/kernel/debug/coresight_cpu_debug/enable:
  # echo 1 > /sys/kernel/debug/coresight_cpu_debug/enable

To disable it, write a '0' into /sys/kernel/debug/coresight_cpu_debug/enable:
  # echo 0 > /sys/kernel/debug/coresight_cpu_debug/enable

As explained in chapter "Clock and power domain", if you are working on a
platform which has idle states that power off the debug logic and whose power
controller does not honour requests from EDPRCR, then you should first
constrain the CPU idle states before enabling the CPU debugging feature, so
that the debug logic stays accessible.

If you want to limit idle states at boot time, you can use "nohlt" or
"cpuidle.off=1" in the kernel command line.

At run time you can disable idle states with the methods below:

It is possible to disable CPU idle states by way of the PM QoS
subsystem, more specifically by using the "/dev/cpu_dma_latency"
interface (see Documentation/power/pm_qos_interface.txt for more
details). As specified in the PM QoS documentation, the requested
parameter stays in effect until the file descriptor is released.
For example:

  # exec 3<> /dev/cpu_dma_latency; echo 0 >&3
  ...
  Do some work...
  ...
  # exec 3>&-

The same can also be done from an application program.
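
A minimal sketch of such an application program is shown below. The helper
name pm_qos_request_latency() is hypothetical; the sketch assumes, as the
PM QoS documentation describes, that the request is made by writing a binary
32-bit signed value to the open device node and that it lasts only while the
file descriptor stays open.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a PM QoS latency node (normally
 * "/dev/cpu_dma_latency") and request latency_us microseconds.
 * Returns the open file descriptor, which the caller must keep open
 * for as long as the request should stay in effect, or -1 on error.
 */
static int pm_qos_request_latency(const char *path, int32_t latency_us)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The value is written as a raw 32-bit signed integer. */
    if (write(fd, &latency_us, sizeof(latency_us)) !=
        (ssize_t)sizeof(latency_us)) {
        close(fd);
        return -1;
    }

    return fd;
}
```

A caller would request a latency of 0 on "/dev/cpu_dma_latency" before
enabling CPU debugging, keep the returned descriptor open while the debug
registers may be accessed, and close() it afterwards to release the request.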

Disable a specific idle state of a specific CPU through cpuidle sysfs (see
Documentation/cpuidle/sysfs.txt):
  # echo 1 > /sys/devices/system/cpu/cpu$cpu/cpuidle/state$state/disable


Output format
-------------

Here is an example of the debugging output format:

ARM external debug module:
coresight-cpu-debug 850000.debug: CPU[0]:
coresight-cpu-debug 850000.debug:  EDPRSR:  00000001 (Power:On DLK:Unlock)
coresight-cpu-debug 850000.debug:  EDPCSR:  [<ffff00000808e9bc>] handle_IPI+0x174/0x1d8
coresight-cpu-debug 850000.debug:  EDCIDSR: 00000000
coresight-cpu-debug 850000.debug:  EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
coresight-cpu-debug 852000.debug: CPU[1]:
coresight-cpu-debug 852000.debug:  EDPRSR:  00000001 (Power:On DLK:Unlock)
coresight-cpu-debug 852000.debug:  EDPCSR:  [<ffff0000087fab34>] debug_notifier_call+0x23c/0x358
coresight-cpu-debug 852000.debug:  EDCIDSR: 00000000
coresight-cpu-debug 852000.debug:  EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
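
To show how the annotation on an EDVIDSR line relates to the raw value, the
sketch below decodes the register into the same fields. It is only an
illustration: the type and function names are hypothetical, and the bit
positions (NS bit 31, E2 bit 30, E3 bit 29, HV bit 28, VMID bits [7:0]) are
assumed from the ARMv8-a EDVIDSR description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed EDVIDSR bit positions (ARMv8-a). */
#define EDVIDSR_NS    (UINT32_C(1) << 31)  /* Non-secure state */
#define EDVIDSR_E2    (UINT32_C(1) << 30)  /* sampled at EL2 */
#define EDVIDSR_E3    (UINT32_C(1) << 29)  /* sampled at EL3 */
#define EDVIDSR_HV    (UINT32_C(1) << 28)  /* 64-bit sample */
#define EDVIDSR_VMID  UINT32_C(0xff)       /* VMID, bits [7:0] */

/* Hypothetical decoded view of one EDVIDSR sample. */
enum sample_mode { MODE_EL1_0, MODE_EL2, MODE_EL3 };

struct edvidsr_info {
    bool non_secure;       /* "State:Non-secure" vs "State:Secure" */
    enum sample_mode mode; /* "Mode:EL1/0", "Mode:EL2" or "Mode:EL3" */
    unsigned int width;    /* "Width:64bits" or "Width:32bits" */
    uint32_t vmid;         /* "VMID:<n>" */
};

static void decode_edvidsr(uint32_t edvidsr, struct edvidsr_info *out)
{
    assert(out != NULL);

    out->non_secure = (edvidsr & EDVIDSR_NS) != 0;

    /* EL3 takes precedence over EL2; neither bit set means EL1/0. */
    if (edvidsr & EDVIDSR_E3)
        out->mode = MODE_EL3;
    else if (edvidsr & EDVIDSR_E2)
        out->mode = MODE_EL2;
    else
        out->mode = MODE_EL1_0;

    out->width = (edvidsr & EDVIDSR_HV) ? 64 : 32;
    out->vmid = edvidsr & EDVIDSR_VMID;
}
```

Under these assumptions, the value 90000000 from the example has bits 31 and
28 set and decodes to Non-secure, EL1/0, 64 bits and VMID 0.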