.. SPDX-License-Identifier: GPL-2.0

===============================================
How to Implement a new CPUFreq Processor Driver
===============================================

Authors:


 - Dominik Brodowski <linux@brodo.de>
 - Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 - Viresh Kumar <viresh.kumar@linaro.org>

.. Contents

   1.   What To Do?
   1.1  Initialization
   1.2  Per-CPU Initialization
   1.3  verify
   1.4  target or target_index or setpolicy or fast_switch?
   1.5  target/target_index
   1.6  fast_switch
   1.7  setpolicy
   1.8  get_intermediate and target_intermediate
   2.   Frequency Table Helpers



1. What To Do?
==============

So, you just got a brand-new CPU / chipset with datasheets and want to
add cpufreq support for this CPU / chipset? Great. Here are some hints
on what is necessary:


1.1 Initialization
------------------

First of all, in an __initcall level 7 (module_init()) or later
function, check whether this kernel runs on the right CPU and the right
chipset. If so, register a struct cpufreq_driver with the CPUfreq core
using cpufreq_register_driver().

What shall this struct cpufreq_driver contain?

 .name - The name of this driver.

 .init - A pointer to the per-policy initialization function.

 .verify - A pointer to a "verification" function.

 .setpolicy _or_ .fast_switch _or_ .target _or_ .target_index - See
 below on the differences.

And optionally

 .flags - Hints for the cpufreq core.

 .driver_data - cpufreq driver specific data.

 .resolve_freq - Returns the most appropriate frequency for a target
 frequency. Doesn't change the frequency though.

 .get_intermediate and .target_intermediate - Used to switch to a stable
 frequency while changing the CPU frequency.

 .get - Returns the current frequency of the CPU.

 .bios_limit - Returns HW/BIOS max frequency limitations for the CPU.

 .exit - A pointer to a per-policy cleanup function called during the
 CPU_POST_DEAD phase of the CPU hotplug process.

 .stop_cpu - A pointer to a per-policy stop function called during the
 CPU_DOWN_PREPARE phase of the CPU hotplug process.

 .suspend - A pointer to a per-policy suspend function which is called
 with interrupts disabled and _after_ the governor is stopped for the
 policy.

 .resume - A pointer to a per-policy resume function which is called
 with interrupts disabled and _before_ the governor is started again.

 .ready - A pointer to a per-policy ready function which is called after
 the policy is fully initialized.

 .attr - A pointer to a NULL-terminated list of "struct freq_attr" which
 allows values to be exported to sysfs.

 .boost_enabled - If set, boost frequencies are enabled.

 .set_boost - A pointer to a per-policy function to enable/disable boost
 frequencies.

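As a rough illustration of the mandatory fields described in this section,
the following userspace sketch models a driver description and checks that
it names exactly one frequency-setting style. The names
``model_cpufreq_driver``, ``model_policy`` and ``model_driver_is_valid()``
are hypothetical stand-ins invented for this example; they are not the
kernel's definitions, and the checks actually performed by
cpufreq_register_driver() may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy object; opaque in this sketch. */
struct model_policy;

/* Hypothetical, heavily reduced model of struct cpufreq_driver. */
struct model_cpufreq_driver {
	const char *name;
	int (*init)(struct model_policy *policy);
	int (*verify)(struct model_policy *policy);
	int (*setpolicy)(struct model_policy *policy);
	int (*target_index)(struct model_policy *policy, unsigned int index);
};

/*
 * Assumed rule for this sketch: .name, .init and .verify are mandatory,
 * and the driver provides either .setpolicy or .target_index, not both.
 */
static bool model_driver_is_valid(const struct model_cpufreq_driver *drv)
{
	if (!drv || !drv->name || !drv->init || !drv->verify)
		return false;
	/* Exactly one of the two frequency-setting styles must be set. */
	return (drv->setpolicy != NULL) != (drv->target_index != NULL);
}

/* Trivial callbacks so the model can be exercised. */
static int demo_init(struct model_policy *policy)
{
	(void)policy;
	return 0;
}

static int demo_verify(struct model_policy *policy)
{
	(void)policy;
	return 0;
}

static int demo_target_index(struct model_policy *policy, unsigned int index)
{
	(void)policy;
	(void)index;
	return 0;
}

static const struct model_cpufreq_driver demo_driver = {
	.name = "demo",
	.init = demo_init,
	.verify = demo_verify,
	.target_index = demo_target_index,
};
```

In a real driver the equivalent structure would be handed to
cpufreq_register_driver() from the module's init function; the model only
shows which fields have to be filled in beforehand.
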

1.2 Per-CPU Initialization
--------------------------

Whenever a new CPU is registered with the device model, or after the
cpufreq driver registers itself, the per-policy initialization function
cpufreq_driver.init is called if no cpufreq policy existed for the CPU.
Note that the .init() and .exit() routines are called only once for the
policy and not for each CPU managed by the policy. The .init() routine
takes a ``struct cpufreq_policy *policy`` as argument. What to do now?

If necessary, activate the CPUfreq support on your CPU.

Then, the driver must fill in the following values:

+-----------------------------------+--------------------------------------+
|policy->cpuinfo.min_freq _and_     | The minimum and maximum frequency    |
|policy->cpuinfo.max_freq           | (in kHz) supported by this CPU.      |
+-----------------------------------+--------------------------------------+
|policy->cpuinfo.transition_latency | The time it takes on this CPU to     |
|                                   | switch between two frequencies, in   |
|                                   | nanoseconds (if appropriate, else    |
|                                   | specify CPUFREQ_ETERNAL).            |
+-----------------------------------+--------------------------------------+
|policy->cur                        | The current operating frequency of   |
|                                   | this CPU (if appropriate).           |
+-----------------------------------+--------------------------------------+
|policy->min,                       | Must contain the "default policy" for|
|policy->max,                       | this CPU. A few moments later,       |
|policy->policy and, if necessary,  | cpufreq_driver.verify and either     |
|policy->governor                   | cpufreq_driver.setpolicy or          |
|                                   | cpufreq_driver.target/target_index   |
|                                   | is called with these values.         |
+-----------------------------------+--------------------------------------+
|policy->cpus                       | Update this with the masks of the    |
|                                   | (online + offline) CPUs that do DVFS |
|                                   | along with this CPU (i.e. that share |
|                                   | clock/voltage rails with it).        |
+-----------------------------------+--------------------------------------+

For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the
frequency table helpers might be helpful. See section 2 for more information
on them.

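The table of values above can be pictured with a small userspace model.
This is only a sketch: ``model_policy``, ``model_freq_entry``,
``model_init()`` and the ``MODEL_*`` sentinels are hypothetical stand-ins
for the kernel types and constants, the policy fields are flattened, and
the 100 us transition latency is an assumed figure, not a recommendation.

```c
#include <stdint.h>

#define MODEL_ENTRY_INVALID	(~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define MODEL_TABLE_END		(~1u)	/* stand-in for CPUFREQ_TABLE_END */

struct model_freq_entry {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

/* Flattened, hypothetical model of the fields .init must fill in. */
struct model_policy {
	unsigned int cpuinfo_min_freq;		/* kHz */
	unsigned int cpuinfo_max_freq;		/* kHz */
	unsigned int transition_latency;	/* ns */
	unsigned int min, max, cur;		/* kHz */
	uint64_t cpus;		/* bit n set: CPU n shares this policy */
};

static int model_init(struct model_policy *policy,
		      const struct model_freq_entry *table,
		      unsigned int cur_khz, uint64_t shared_cpus)
{
	const struct model_freq_entry *pos;
	unsigned int lo = ~0u, hi = 0;

	/* Derive the hardware limits from the valid table entries. */
	for (pos = table; pos->frequency != MODEL_TABLE_END; pos++) {
		if (pos->frequency == MODEL_ENTRY_INVALID)
			continue;
		if (pos->frequency < lo)
			lo = pos->frequency;
		if (pos->frequency > hi)
			hi = pos->frequency;
	}
	if (hi == 0)
		return -1;	/* no usable frequency in the table */

	policy->cpuinfo_min_freq = lo;
	policy->cpuinfo_max_freq = hi;
	policy->transition_latency = 100000;	/* assumed: 100 us */
	policy->cur = cur_khz;
	policy->min = lo;	/* default policy: the full range */
	policy->max = hi;
	policy->cpus = shared_cpus;
	return 0;
}

static const struct model_freq_entry demo_table[] = {
	{ 0, 800000 },
	{ 1, MODEL_ENTRY_INVALID },
	{ 2, 1600000 },
	{ 3, 1200000 },
	{ 0, MODEL_TABLE_END },
};
```

Calling ``model_init(&policy, demo_table, 1200000, 0x3)`` on this table
yields limits of 800000 kHz and 1600000 kHz and marks CPUs 0 and 1 as
sharing the policy.
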

1.3 verify
----------

When the user decides a new policy (consisting of
"policy,governor,min,max") shall be set, this policy must be validated
so that incompatible values can be corrected. For verifying these
values, the cpufreq_verify_within_limits(``struct cpufreq_policy *policy``,
``unsigned int min_freq``, ``unsigned int max_freq``) function might be helpful.
See section 2 for details on frequency table helpers.

You need to make sure that at least one valid frequency (or operating
range) is within policy->min and policy->max. If necessary, increase
policy->max first, and only if this is no solution, decrease policy->min.

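The kind of correction a limits check performs can be sketched in
userspace as follows. ``model_policy`` and ``model_verify_within_limits()``
are hypothetical names made up for this illustration; the function only
mirrors the idea of clamping policy->min and policy->max into a supported
range and is not a copy of the kernel helper.

```c
/* Hypothetical two-field model of the limits held in a policy. */
struct model_policy {
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
};

/* Clamp both limits into [min_freq, max_freq] and keep min <= max. */
static void model_verify_within_limits(struct model_policy *policy,
				       unsigned int min_freq,
				       unsigned int max_freq)
{
	if (policy->min < min_freq)
		policy->min = min_freq;
	if (policy->max < min_freq)
		policy->max = min_freq;
	if (policy->min > max_freq)
		policy->min = max_freq;
	if (policy->max > max_freq)
		policy->max = max_freq;
	if (policy->min > policy->max)
		policy->min = policy->max;
}
```

For example, a request of 500000..3000000 kHz against hardware limits of
800000..2000000 kHz is corrected to 800000..2000000 kHz.
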

1.4 target or target_index or setpolicy or fast_switch?
-------------------------------------------------------

Most cpufreq drivers or even most cpu frequency scaling algorithms
only allow the CPU frequency to be set to predefined fixed values. For
these, you use the ->target(), ->target_index() or ->fast_switch()
callbacks.

Some cpufreq capable processors switch the frequency between certain
limits on their own. These shall use the ->setpolicy() callback.


1.5 target/target_index
-----------------------

The target_index call has two arguments: ``struct cpufreq_policy *policy``,
and ``unsigned int`` index (into the exposed frequency table).

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined by freq_table[index].frequency.

The driver should always restore the earlier frequency (i.e.
policy->restore_freq) in case of errors, even if it switched to an
intermediate frequency earlier.

Deprecated
----------
The target call has three arguments: ``struct cpufreq_policy *policy``,
``unsigned int target_frequency``, ``unsigned int relation``.

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined using the following rules:

- keep close to "target_freq"
- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!)
- if relation==CPUFREQ_RELATION_L, try to select a new_freq higher than or
  equal to target_freq. ("L for lowest, but no lower than")
- if relation==CPUFREQ_RELATION_H, try to select a new_freq lower than or
  equal to target_freq. ("H for highest, but no higher than")

Here again the frequency table helper might assist you - see section 2
for details.

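The selection rules above can be sketched as a plain table search. This is
a userspace model written for this document, not the kernel's lookup code:
``model_table_target()``, ``model_relation`` and the ``MODEL_*`` sentinels
are hypothetical names, and the fallback used when no entry satisfies the
relation (the closest entry on the other side of the target) is an
assumption of the sketch.

```c
#define MODEL_ENTRY_INVALID	(~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define MODEL_TABLE_END		(~1u)	/* stand-in for CPUFREQ_TABLE_END */

enum model_relation {
	MODEL_RELATION_L,	/* lowest frequency at or above the target */
	MODEL_RELATION_H,	/* highest frequency at or below the target */
};

struct model_freq_entry {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

/* Returns the chosen table index, or -1 if nothing lies in [min, max]. */
static int model_table_target(const struct model_freq_entry *table,
			      unsigned int min, unsigned int max,
			      unsigned int target, enum model_relation rel)
{
	int best = -1, fallback = -1;
	int i;

	/* The result must respect the policy limits, so clamp the target. */
	if (target < min)
		target = min;
	if (target > max)
		target = max;

	for (i = 0; table[i].frequency != MODEL_TABLE_END; i++) {
		unsigned int f = table[i].frequency;

		if (f == MODEL_ENTRY_INVALID || f < min || f > max)
			continue;

		if (rel == MODEL_RELATION_L) {
			if (f >= target) {
				if (best < 0 || f < table[best].frequency)
					best = i;
			} else if (fallback < 0 || f > table[fallback].frequency) {
				fallback = i;
			}
		} else {
			if (f <= target) {
				if (best < 0 || f > table[best].frequency)
					best = i;
			} else if (fallback < 0 || f < table[fallback].frequency) {
				fallback = i;
			}
		}
	}
	return best >= 0 ? best : fallback;
}

static const struct model_freq_entry demo_table[] = {
	{ 0, 800000 },
	{ 1, MODEL_ENTRY_INVALID },
	{ 2, 1600000 },
	{ 3, 1200000 },
	{ 0, MODEL_TABLE_END },
};
```

With limits of 800000..1600000 kHz and a target of 1000000 kHz, the
"L" relation picks the 1200000 kHz entry and the "H" relation picks the
800000 kHz entry.
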

1.6 fast_switch
---------------

This function is used for frequency switching from the scheduler's context.
Not all drivers are expected to implement it, as sleeping from within
this callback isn't allowed. This callback must be highly optimized to
do switching as fast as possible.

This function has two arguments: ``struct cpufreq_policy *policy`` and
``unsigned int target_frequency``.


1.7 setpolicy
-------------

The setpolicy call only takes a ``struct cpufreq_policy *policy`` as
argument. You need to set the lower limit of the in-processor or
in-chipset dynamic frequency switching to policy->min, the upper limit
to policy->max, and -if supported- select a performance-oriented
setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a
powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check
the reference implementation in drivers/cpufreq/longrun.c.

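A minimal userspace sketch of what a setpolicy implementation does with
its argument is shown below. Everything here is hypothetical: ``model_hw``
stands for whatever registers a self-scaling processor exposes, and
``model_policy``, ``model_policy_kind`` and ``model_setpolicy()`` are
invented names, not kernel definitions.

```c
/* Stand-ins for CPUFREQ_POLICY_POWERSAVE / CPUFREQ_POLICY_PERFORMANCE. */
enum model_policy_kind {
	MODEL_POLICY_POWERSAVE,
	MODEL_POLICY_PERFORMANCE,
};

struct model_policy {
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
	enum model_policy_kind policy;
};

/* Hypothetical register image of hardware that scales on its own. */
struct model_hw {
	unsigned int low_khz;
	unsigned int high_khz;
	int prefer_performance;
};

static int model_setpolicy(const struct model_policy *policy,
			   struct model_hw *hw)
{
	if (policy->min > policy->max)
		return -1;	/* should have been fixed up by verify */

	/* Hand the allowed range to the hardware... */
	hw->low_khz = policy->min;
	hw->high_khz = policy->max;
	/* ...and tell it which way to lean inside that range. */
	hw->prefer_performance =
		(policy->policy == MODEL_POLICY_PERFORMANCE);
	return 0;
}
```

The point of the sketch is that the driver never picks a single frequency
here; it only programs the range and the bias and leaves the actual
switching to the hardware.
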

1.8 get_intermediate and target_intermediate
--------------------------------------------

Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.

get_intermediate() should return a stable intermediate frequency the platform
wants to switch to, and target_intermediate() should set the CPU to that
frequency, before jumping to the frequency corresponding to 'index'. The core
will take care of sending notifications, and the driver doesn't have to handle
them in target_intermediate() or target_index().

Drivers can return '0' from get_intermediate() in case they don't wish to switch
to an intermediate frequency for some target frequency. In that case the core
will directly call ->target_index().

NOTE: ->target_index() should restore to policy->restore_freq in case of
failures, as the core would send notifications for that.

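The calling order described in this section can be modelled in userspace
as below. This is a simplified sketch under stated assumptions, not the
core's implementation: ``model_core_switch()``, ``model_driver`` and
``model_policy`` are hypothetical names, notifications are left out
entirely, and ``cur`` simply stands for the frequency the modelled
hardware is running at.

```c
struct model_policy {
	unsigned int cur;		/* kHz the modelled hardware runs at */
	unsigned int restore_freq;	/* kHz to fall back to on failure */
};

struct model_driver {
	unsigned int (*get_intermediate)(struct model_policy *p,
					 unsigned int index);
	int (*target_intermediate)(struct model_policy *p, unsigned int index);
	int (*target_index)(struct model_policy *p, unsigned int index);
};

/* Model of the core's sequencing for one frequency change. */
static int model_core_switch(const struct model_driver *drv,
			     struct model_policy *p, unsigned int index)
{
	int ret;

	p->restore_freq = p->cur;

	/* A return value of 0 means "no intermediate step for this target". */
	if (drv->get_intermediate && drv->get_intermediate(p, index)) {
		ret = drv->target_intermediate(p, index);
		if (ret)
			return ret;
	}
	return drv->target_index(p, index);
}

/* Demo driver: parks on 800 MHz before any switch away from index 0. */
static const unsigned int demo_khz[] = { 800000, 1200000, 1600000 };
static int demo_intermediate_hits;

static unsigned int demo_get_intermediate(struct model_policy *p,
					  unsigned int index)
{
	(void)p;
	return index == 0 ? 0 : demo_khz[0];
}

static int demo_target_intermediate(struct model_policy *p, unsigned int index)
{
	(void)index;
	p->cur = demo_khz[0];
	demo_intermediate_hits++;
	return 0;
}

static int demo_target_index(struct model_policy *p, unsigned int index)
{
	p->cur = demo_khz[index];
	return 0;
}

/* Failing variant: undoes the intermediate step before reporting the error. */
static int demo_target_index_fail(struct model_policy *p, unsigned int index)
{
	(void)index;
	p->cur = p->restore_freq;
	return -1;
}

static const struct model_driver demo_driver = {
	demo_get_intermediate, demo_target_intermediate, demo_target_index,
};

static const struct model_driver demo_failing_driver = {
	demo_get_intermediate, demo_target_intermediate, demo_target_index_fail,
};
```

The failing variant shows why the restore matters: after the intermediate
step the hardware is no longer at the frequency it started from, so a
target_index() that fails without restoring would leave the CPU parked at
the intermediate frequency.
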

2. Frequency Table Helpers
==========================

As most cpufreq processors only allow for being set to a few specific
frequencies, a "frequency table" with some functions might assist in
some work of the processor driver. Such a "frequency table" consists of
an array of struct cpufreq_frequency_table entries, with driver specific
values in "driver_data", the corresponding frequency in "frequency" and
flags set. At the end of the table, you need to add a
cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END.
And if you want to skip one entry in the table, set the frequency to
CPUFREQ_ENTRY_INVALID. The entries don't need to be sorted in any
particular order, but if they are, the cpufreq core will do DVFS a bit
more quickly for them, as the search for the best match is faster.

The cpufreq table is verified automatically by the core if the policy contains a
valid pointer in its policy->freq_table field.

cpufreq_frequency_table_verify() ensures that at least one valid
frequency is within policy->min and policy->max, and all other criteria
are met. This is helpful for the ->verify call.

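To make the "at least one valid frequency within policy->min and
policy->max" requirement concrete, here is a userspace sketch that widens
policy->max when the requested window contains no table entry. It is
loosely modelled on the behaviour described in this document and is not
the kernel helper itself; ``model_table_verify()``, ``model_clamp()``,
``model_policy`` and the ``MODEL_*`` sentinels are hypothetical names.

```c
#define MODEL_ENTRY_INVALID	(~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define MODEL_TABLE_END		(~1u)	/* stand-in for CPUFREQ_TABLE_END */

struct model_freq_entry {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

struct model_policy {
	unsigned int min, max;				/* kHz */
	unsigned int cpuinfo_min_freq, cpuinfo_max_freq;	/* kHz */
};

/* Clamp the policy limits into [lo, hi] and keep min <= max. */
static void model_clamp(struct model_policy *p,
			unsigned int lo, unsigned int hi)
{
	if (p->min < lo)
		p->min = lo;
	if (p->max < lo)
		p->max = lo;
	if (p->min > hi)
		p->min = hi;
	if (p->max > hi)
		p->max = hi;
	if (p->min > p->max)
		p->min = p->max;
}

static void model_table_verify(struct model_policy *p,
			       const struct model_freq_entry *table)
{
	const struct model_freq_entry *pos;
	unsigned int next_larger = ~0u;
	int found = 0;

	model_clamp(p, p->cpuinfo_min_freq, p->cpuinfo_max_freq);

	for (pos = table; pos->frequency != MODEL_TABLE_END; pos++) {
		unsigned int f = pos->frequency;

		if (f == MODEL_ENTRY_INVALID)
			continue;
		if (f >= p->min && f <= p->max) {
			found = 1;
			break;
		}
		/* Remember the closest entry above the window. */
		if (f > p->max && f < next_larger)
			next_larger = f;
	}

	if (!found) {
		/* Raise max first, as section 1.3 asks, then re-clamp. */
		p->max = next_larger;
		model_clamp(p, p->cpuinfo_min_freq, p->cpuinfo_max_freq);
	}
}

static const struct model_freq_entry demo_table[] = {
	{ 0, 800000 },
	{ 1, MODEL_ENTRY_INVALID },
	{ 2, 1600000 },
	{ 3, 1200000 },
	{ 0, MODEL_TABLE_END },
};
```

A request of 900000..1000000 kHz contains no entry of ``demo_table``, so
the sketch raises the upper limit to 1200000 kHz and leaves the lower
limit alone.
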

cpufreq_frequency_table_target() is the corresponding frequency table
helper for the ->target stage. Just pass the values to this function,
and this function returns the index of the frequency table entry which
contains the frequency the CPU shall be set to.

The following macros can be used as iterators over cpufreq_frequency_table:

cpufreq_for_each_entry(pos, table) - iterates over all entries of the
frequency table.

cpufreq_for_each_valid_entry(pos, table) - iterates over all entries,
excluding CPUFREQ_ENTRY_INVALID frequencies.

Use arguments "pos" - a ``cpufreq_frequency_table *`` as a loop cursor and
"table" - the ``cpufreq_frequency_table *`` you want to iterate over.

For example::

	struct cpufreq_frequency_table *pos, *driver_freq_table;

	cpufreq_for_each_entry(pos, driver_freq_table) {
		/* Do something with pos */
		pos->frequency = ...
	}

If you need to work with the position of pos within driver_freq_table,
do not subtract the pointers, as it is quite costly. Instead, use the
macros cpufreq_for_each_entry_idx() and cpufreq_for_each_valid_entry_idx().

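
The idea behind the index-carrying iterators can be sketched in userspace
like this. The macro ``model_for_each_valid_entry_idx()`` is a stand-in
written for this example with an assumed ``(pos, table, idx)`` argument
order; it is not copied from the kernel headers, and
``model_index_of()`` and the ``MODEL_*`` sentinels are likewise
hypothetical.

```c
#define MODEL_ENTRY_INVALID	(~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define MODEL_TABLE_END		(~1u)	/* stand-in for CPUFREQ_TABLE_END */

struct model_freq_entry {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

/*
 * Walk the table while counting positions in idx, skipping invalid
 * entries. idx keeps counting across skipped entries, so it is always
 * the real array position of pos.
 */
#define model_for_each_valid_entry_idx(pos, table, idx)			\
	for ((pos) = (table), (idx) = 0;				\
	     (pos)->frequency != MODEL_TABLE_END; (pos)++, (idx)++)	\
		if ((pos)->frequency == MODEL_ENTRY_INVALID)		\
			continue;					\
		else

/* Return the table index holding khz, or -1 if it is not a valid entry. */
static int model_index_of(const struct model_freq_entry *table,
			  unsigned int khz)
{
	const struct model_freq_entry *pos;
	int idx;

	model_for_each_valid_entry_idx(pos, table, idx) {
		if (pos->frequency == khz)
			return idx;
	}
	return -1;
}

static const struct model_freq_entry demo_table[] = {
	{ 0, 800000 },
	{ 1, MODEL_ENTRY_INVALID },
	{ 2, 1600000 },
	{ 3, 1200000 },
	{ 0, MODEL_TABLE_END },
};
```

Because the index is maintained by the loop itself, no pointer
subtraction is needed to learn where ``pos`` sits in the table.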