.. SPDX-License-Identifier: GPL-2.0

===============================================
How to Implement a new CPUFreq Processor Driver
===============================================

Authors:


 - Dominik Brodowski <linux@brodo.de>
 - Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 - Viresh Kumar <viresh.kumar@linaro.org>

.. Contents

   1.   What To Do?
   1.1  Initialization
   1.2  Per-CPU Initialization
   1.3  verify
   1.4  target or target_index or setpolicy or fast_switch?
   1.5  target/target_index
   1.6  fast_switch
   1.7  setpolicy
   1.8  get_intermediate and target_intermediate
   2.   Frequency Table Helpers



1. What To Do?
==============

So, you just got a brand-new CPU / chipset with datasheets and want to
add cpufreq support for this CPU / chipset? Great. Here are some hints
on what is necessary:


1.1 Initialization
------------------

First of all, in an __initcall level 7 (module_init()) or later
function, check whether this kernel runs on the right CPU and the right
chipset. If so, register a struct cpufreq_driver with the CPUfreq core
using cpufreq_register_driver().

What shall this struct cpufreq_driver contain?

 .name - The name of this driver.

 .init - A pointer to the per-policy initialization function.

 .verify - A pointer to a "verification" function.

 .setpolicy _or_ .fast_switch _or_ .target _or_ .target_index - See
 below on the differences.

And optionally

 .flags - Hints for the cpufreq core.

 .driver_data - cpufreq driver specific data.

 .get_intermediate and .target_intermediate - Used to switch to a stable
 frequency while changing the CPU frequency.

 .get - Returns the current frequency of the CPU.

 .bios_limit - Returns HW/BIOS max frequency limitations for the CPU.

 .exit - A pointer to a per-policy cleanup function called during the
 CPU_POST_DEAD phase of the CPU hotplug process.

 .suspend - A pointer to a per-policy suspend function which is called
 with interrupts disabled and _after_ the governor is stopped for the
 policy.

 .resume - A pointer to a per-policy resume function which is called
 with interrupts disabled and _before_ the governor is started again.

 .attr - A pointer to a NULL-terminated list of "struct freq_attr" which
 allows values to be exported to sysfs.

 .boost_enabled - If set, boost frequencies are enabled.

 .set_boost - A pointer to a per-policy function to enable/disable boost
 frequencies.

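As a rough illustration of the field list above, the following user-space
sketch fills in a reduced stand-in for struct cpufreq_driver and checks that
the mandatory callbacks are present. The struct layouts, the sample_*
callbacks and sketch_check_driver() are assumptions made for this example;
a real driver uses the definitions from <linux/cpufreq.h> and hands the
struct to cpufreq_register_driver(). The .target and .fast_switch
alternatives are left out for brevity.

```c
#include <errno.h>

/* Stand-in for the kernel's struct cpufreq_policy; only what is needed here. */
struct cpufreq_policy {
	unsigned int cpu;
};

/* Reduced stand-in for struct cpufreq_driver (not the kernel definition). */
struct cpufreq_driver {
	const char *name;
	int (*init)(struct cpufreq_policy *policy);
	int (*verify)(struct cpufreq_policy *policy);
	int (*setpolicy)(struct cpufreq_policy *policy);
	int (*target_index)(struct cpufreq_policy *policy, unsigned int index);
};

/* Hypothetical callbacks that do nothing but succeed. */
int sample_init(struct cpufreq_policy *policy)
{
	(void)policy;
	return 0;
}

int sample_verify(struct cpufreq_policy *policy)
{
	(void)policy;
	return 0;
}

int sample_target_index(struct cpufreq_policy *policy, unsigned int index)
{
	(void)policy;
	(void)index;
	return 0;
}

/* A driver for hardware with fixed frequencies: it provides .target_index. */
struct cpufreq_driver sample_driver = {
	.name		= "sample",
	.init		= sample_init,
	.verify		= sample_verify,
	.target_index	= sample_target_index,
};

/*
 * Hypothetical check in the spirit of the list above: .name, .init and
 * .verify are required, plus one way of actually setting the frequency.
 */
int sketch_check_driver(const struct cpufreq_driver *drv)
{
	if (!drv || !drv->name || !drv->init || !drv->verify)
		return -EINVAL;
	if (!drv->setpolicy && !drv->target_index)
		return -EINVAL;
	return 0;
}
```

In this sketch, a driver that only sets .name is rejected with -EINVAL,
while sample_driver passes the check.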

1.2 Per-CPU Initialization
--------------------------

Whenever a new CPU is registered with the device model, or after the
cpufreq driver registers itself, the per-policy initialization function
cpufreq_driver.init is called if no cpufreq policy existed for the CPU.
Note that the .init() and .exit() routines are called only once for the
policy and not for each CPU managed by the policy. The .init() routine
takes a ``struct cpufreq_policy *policy`` as argument. What to do now?

If necessary, activate the CPUfreq support on your CPU.

Then, the driver must fill in the following values:

+-----------------------------------+--------------------------------------+
|policy->cpuinfo.min_freq _and_     |                                      |
|policy->cpuinfo.max_freq           | the minimum and maximum frequency    |
|                                   | (in kHz) which is supported by       |
|                                   | this CPU                             |
+-----------------------------------+--------------------------------------+
|policy->cpuinfo.transition_latency | the time it takes on this CPU to     |
|                                   | switch between two frequencies in    |
|                                   | nanoseconds (if appropriate, else    |
|                                   | specify CPUFREQ_ETERNAL)             |
+-----------------------------------+--------------------------------------+
|policy->cur                        | The current operating frequency of   |
|                                   | this CPU (if appropriate)            |
+-----------------------------------+--------------------------------------+
|policy->min,                       |                                      |
|policy->max,                       |                                      |
|policy->policy and, if necessary,  |                                      |
|policy->governor                   | must contain the "default policy" for|
|                                   | this CPU. A few moments later,       |
|                                   | cpufreq_driver.verify and either     |
|                                   | cpufreq_driver.setpolicy or          |
|                                   | cpufreq_driver.target/target_index is|
|                                   | called with these values.            |
+-----------------------------------+--------------------------------------+
|policy->cpus                       | Update this with the masks of the    |
|                                   | (online + offline) CPUs that do DVFS |
|                                   | along with this CPU (i.e. that share |
|                                   | clock/voltage rails with it).        |
+-----------------------------------+--------------------------------------+

For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the
frequency table helpers might be helpful. See section 2 for more information
on them.

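The values in the table above could be filled in along these lines. This is
a user-space sketch with simplified stand-in structs, not kernel code: the
frequency table, the 100000 ns latency and the assumption that the CPU
starts at its highest frequency are all made up for the example, and
policy->cpus is omitted because it needs the kernel's cpumask API.

```c
#define SKETCH_TABLE_END (~1u)	/* stand-in for CPUFREQ_TABLE_END */

/* Simplified stand-ins for the kernel structures. */
struct cpufreq_frequency_table {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

struct cpufreq_cpuinfo {
	unsigned int max_freq;		/* kHz */
	unsigned int min_freq;		/* kHz */
	unsigned int transition_latency;	/* ns */
};

struct cpufreq_policy {
	struct cpufreq_cpuinfo cpuinfo;
	unsigned int min, max, cur;	/* kHz */
	const struct cpufreq_frequency_table *freq_table;
};

/* Hypothetical operating points of the example hardware. */
static const struct cpufreq_frequency_table sample_table[] = {
	{ 0, 600000 },
	{ 1, 800000 },
	{ 2, 1200000 },
	{ 0, SKETCH_TABLE_END },
};

int sample_cpu_init(struct cpufreq_policy *policy)
{
	const struct cpufreq_frequency_table *pos;
	unsigned int lo = ~0u, hi = 0;

	/* Derive the hardware limits from the table. */
	for (pos = sample_table; pos->frequency != SKETCH_TABLE_END; pos++) {
		if (pos->frequency < lo)
			lo = pos->frequency;
		if (pos->frequency > hi)
			hi = pos->frequency;
	}

	policy->cpuinfo.min_freq = lo;
	policy->cpuinfo.max_freq = hi;
	policy->cpuinfo.transition_latency = 100000;	/* assumed: 100 us */

	/* Default policy: allow the full hardware range. */
	policy->min = lo;
	policy->max = hi;
	policy->cur = hi;	/* assumed: the CPU boots at its highest speed */
	policy->freq_table = sample_table;
	return 0;
}
```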

1.3 verify
----------

When the user decides a new policy (consisting of
"policy,governor,min,max") shall be set, this policy must be validated
so that incompatible values can be corrected. For verifying these
values, the cpufreq_verify_within_limits(``struct cpufreq_policy *policy``,
``unsigned int min_freq``, ``unsigned int max_freq``) function might be
helpful. See section 2 for details on frequency table helpers.

You need to make sure that at least one valid frequency (or operating
range) is within policy->min and policy->max. If necessary, increase
policy->max first, and only if this is no solution, decrease policy->min.

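A minimal sketch of the kind of clamping a limits check performs, written
as plain user-space C. The stand-in struct and the name
sketch_verify_within_limits() are assumptions for illustration; a real
driver would call cpufreq_verify_within_limits() from its ->verify
callback instead of open-coding this.

```c
/* Stand-in policy with only the two user-visible limits (kHz). */
struct cpufreq_policy {
	unsigned int min;
	unsigned int max;
};

/*
 * Pull policy->min and policy->max into [min_freq, max_freq] and make
 * sure that min does not end up above max.
 */
void sketch_verify_within_limits(struct cpufreq_policy *policy,
				 unsigned int min_freq,
				 unsigned int max_freq)
{
	if (policy->min < min_freq)
		policy->min = min_freq;
	if (policy->max < min_freq)
		policy->max = min_freq;
	if (policy->min > max_freq)
		policy->min = max_freq;
	if (policy->max > max_freq)
		policy->max = max_freq;
	if (policy->min > policy->max)
		policy->min = policy->max;
}
```

For hardware limits of 600000 to 1200000 kHz, a requested range of 100000
to 3000000 kHz comes out as 600000 to 1200000 kHz.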

1.4 target or target_index or setpolicy or fast_switch?
-------------------------------------------------------

Most cpufreq drivers or even most CPU frequency scaling algorithms
only allow the CPU frequency to be set to predefined fixed values. For
these, you use the ->target(), ->target_index() or ->fast_switch()
callbacks.

Some cpufreq capable processors switch the frequency between certain
limits on their own. These shall use the ->setpolicy() callback.


1.5 target/target_index
-----------------------

The target_index call has two arguments: ``struct cpufreq_policy *policy``,
and ``unsigned int`` index (into the exposed frequency table).

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined by freq_table[index].frequency.

The driver should always restore the earlier frequency (i.e.
policy->restore_freq) in case of errors, even if it already switched to
the intermediate frequency earlier.

Deprecated
----------
The target call has three arguments: ``struct cpufreq_policy *policy``,
``unsigned int target_freq`` and ``unsigned int relation``.

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined using the following rules:

- keep close to "target_freq"
- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!)
- if relation==CPUFREQ_REL_L, try to select a new_freq higher than or equal
  to target_freq. ("L for lowest, but no lower than")
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
  to target_freq. ("H for highest, but no higher than")

Here again the frequency table helpers might assist you - see section 2
for details.

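The selection rules above can be sketched as a table lookup. The code below
is a user-space illustration only: the SKETCH_* names, the stand-in table
struct and sketch_pick_index() are assumptions for this example, and the
fallback to the closest entry on the other side of the target is one
possible reading of "try to select". In a real driver the frequency table
helpers from section 2 do this work.

```c
#define SKETCH_TABLE_END (~1u)	/* stand-in for CPUFREQ_TABLE_END */

/* Stand-ins for CPUFREQ_REL_L and CPUFREQ_REL_H. */
enum sketch_relation { SKETCH_REL_L, SKETCH_REL_H };

struct cpufreq_frequency_table {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

/*
 * Return the index of the entry to switch to, or -1 if the table has no
 * entry within [min, max].
 */
int sketch_pick_index(const struct cpufreq_frequency_table *table,
		      unsigned int min, unsigned int max,
		      unsigned int target_freq,
		      enum sketch_relation relation)
{
	int best = -1;		/* entry on the preferred side of the target */
	int fallback = -1;	/* closest entry on the other side */
	int i;

	/* The result must stay within the policy limits. */
	if (target_freq < min)
		target_freq = min;
	if (target_freq > max)
		target_freq = max;

	for (i = 0; table[i].frequency != SKETCH_TABLE_END; i++) {
		unsigned int f = table[i].frequency;

		if (f < min || f > max)
			continue;

		if (relation == SKETCH_REL_L) {
			/* lowest frequency at or above the target */
			if (f >= target_freq) {
				if (best < 0 || f < table[best].frequency)
					best = i;
			} else if (fallback < 0 ||
				   f > table[fallback].frequency) {
				fallback = i;
			}
		} else {
			/* highest frequency at or below the target */
			if (f <= target_freq) {
				if (best < 0 || f > table[best].frequency)
					best = i;
			} else if (fallback < 0 ||
				   f < table[fallback].frequency) {
				fallback = i;
			}
		}
	}

	return best >= 0 ? best : fallback;
}
```

With a table of 600000, 800000 and 1200000 kHz and a target of 700000 kHz,
SKETCH_REL_L selects the 800000 kHz entry and SKETCH_REL_H selects the
600000 kHz entry.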
1.6 fast_switch
---------------

This function is used for frequency switching from the scheduler's context.
Not all drivers are expected to implement it, as sleeping from within
this callback isn't allowed. This callback must be highly optimized to
do switching as fast as possible.

This function has two arguments: ``struct cpufreq_policy *policy`` and
``unsigned int target_frequency``.


1.7 setpolicy
-------------

The setpolicy call only takes a ``struct cpufreq_policy *policy`` as
argument. You need to set the lower limit of the in-processor or
in-chipset dynamic frequency switching to policy->min, the upper limit
to policy->max, and, if supported, select a performance-oriented
setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a
powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check
the reference implementation in drivers/cpufreq/longrun.c.

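A ->setpolicy() callback might look roughly like the sketch below. It is a
user-space illustration with invented hardware state: struct sketch_hw,
sketch_hw_state and the SKETCH_POLICY_* values are assumptions standing in
for real registers and for CPUFREQ_POLICY_POWERSAVE and
CPUFREQ_POLICY_PERFORMANCE. For a real implementation, the longrun driver
mentioned above is the place to look.

```c
#include <errno.h>

/* Stand-ins for the two policy values named in the text. */
enum {
	SKETCH_POLICY_POWERSAVE = 1,
	SKETCH_POLICY_PERFORMANCE = 2,
};

/* Stand-in policy with the fields ->setpolicy() reads. */
struct cpufreq_policy {
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
	unsigned int policy;
};

/* Invented image of the hardware's own frequency scaling settings. */
struct sketch_hw {
	unsigned int lower_khz;
	unsigned int upper_khz;
	int prefer_performance;
};

struct sketch_hw sketch_hw_state;

int sample_setpolicy(struct cpufreq_policy *policy)
{
	int prefer_performance;

	if (policy->min > policy->max)
		return -EINVAL;

	switch (policy->policy) {
	case SKETCH_POLICY_PERFORMANCE:
		prefer_performance = 1;
		break;
	case SKETCH_POLICY_POWERSAVE:
		prefer_performance = 0;
		break;
	default:
		return -EINVAL;
	}

	/* Only touch the hardware state once everything is validated. */
	sketch_hw_state.lower_khz = policy->min;
	sketch_hw_state.upper_khz = policy->max;
	sketch_hw_state.prefer_performance = prefer_performance;
	return 0;
}
```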
1.8 get_intermediate and target_intermediate
--------------------------------------------

Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.

get_intermediate() should return a stable intermediate frequency that the
platform wants to switch to, and target_intermediate() should set the CPU
to that frequency, before jumping to the frequency corresponding to
'index'. The core will take care of sending notifications and the driver
doesn't have to handle them in target_intermediate() or target_index().

Drivers can return '0' from get_intermediate() in case they don't wish to
switch to an intermediate frequency for some target frequency. In that
case the core will directly call ->target_index().

NOTE: ->target_index() should restore to policy->restore_freq in case of
failures, as the core would send notifications for that.

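The sequence described above can be modelled in user space as follows. This
is only a rough model and not the core's actual code: struct sketch_driver,
the sample_* callbacks, the two-entry frequency list and the 300000 kHz
intermediate frequency are assumptions for the example, and notifications
are left out entirely.

```c
#include <errno.h>

/* Stand-in policy: only the fields this sketch touches (kHz). */
struct cpufreq_policy {
	unsigned int cur;
	unsigned int restore_freq;
};

/* Stand-in driver ops with the three callbacks discussed above. */
struct sketch_driver {
	unsigned int (*get_intermediate)(struct cpufreq_policy *policy,
					 unsigned int index);
	int (*target_intermediate)(struct cpufreq_policy *policy,
				   unsigned int index);
	int (*target_index)(struct cpufreq_policy *policy, unsigned int index);
};

/* Invented hardware state and operating points (kHz). */
#define SKETCH_INTERMEDIATE_KHZ 300000
unsigned int sketch_hw_khz = 600000;
int sketch_fail_next_switch;
const unsigned int sketch_freqs[] = { 600000, 1200000 };

unsigned int sample_get_intermediate(struct cpufreq_policy *policy,
				     unsigned int index)
{
	(void)policy;
	/* Assumed: only the jump to the highest entry needs a stable step. */
	return index == 1 ? SKETCH_INTERMEDIATE_KHZ : 0;
}

int sample_target_intermediate(struct cpufreq_policy *policy,
			       unsigned int index)
{
	(void)policy;
	(void)index;
	sketch_hw_khz = SKETCH_INTERMEDIATE_KHZ;
	return 0;
}

int sample_target_index(struct cpufreq_policy *policy, unsigned int index)
{
	if (sketch_fail_next_switch) {
		/* On failure, go back to the frequency the core recorded. */
		sketch_hw_khz = policy->restore_freq;
		return -EIO;
	}
	sketch_hw_khz = sketch_freqs[index];
	return 0;
}

/* Rough model of the order in which the core invokes the callbacks. */
int sketch_core_switch(const struct sketch_driver *drv,
		       struct cpufreq_policy *policy, unsigned int index)
{
	unsigned int intermediate = 0;
	int ret;

	policy->restore_freq = policy->cur;

	if (drv->get_intermediate)
		intermediate = drv->get_intermediate(policy, index);

	/* A return value of 0 means: skip the intermediate step. */
	if (intermediate) {
		ret = drv->target_intermediate(policy, index);
		if (ret)
			return ret;
		policy->cur = intermediate;
	}

	ret = drv->target_index(policy, index);
	policy->cur = ret ? policy->restore_freq : sketch_freqs[index];
	return ret;
}
```

If sketch_fail_next_switch is set, the model ends up back at the frequency
it started from, even though it passed through the intermediate frequency
on the way.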

2. Frequency Table Helpers
==========================

As most cpufreq processors only allow for being set to a few specific
frequencies, a "frequency table" with some functions might assist in
some work of the processor driver. Such a "frequency table" consists of
an array of struct cpufreq_frequency_table entries, with driver specific
values in "driver_data", the corresponding frequency in "frequency" and
any flags in "flags". At the end of the table, you need to add a
cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END.
And if you want to skip one entry in the table, set the frequency to
CPUFREQ_ENTRY_INVALID. The entries don't need to be sorted in any
particular order, but if they are, the cpufreq core will do DVFS a bit
more quickly for them, as the search for the best match is faster.

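As a sketch of what such a table might look like, the following user-space
example defines one with a skipped entry and a terminator, then counts the
usable entries. The SKETCH_* constants are stand-ins for
CPUFREQ_ENTRY_INVALID and CPUFREQ_TABLE_END, and the driver_data values and
frequencies are invented for illustration.

```c
#define SKETCH_ENTRY_INVALID (~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define SKETCH_TABLE_END     (~1u)	/* stand-in for CPUFREQ_TABLE_END */

/* Stand-in with the three fields described above. */
struct cpufreq_frequency_table {
	unsigned int flags;
	unsigned int driver_data;	/* e.g. a register value */
	unsigned int frequency;		/* kHz, or one of the markers */
};

/* Sorted in ascending order, with one entry marked as skipped. */
const struct cpufreq_frequency_table sample_freq_table[] = {
	{ 0, 0x0a, 600000 },
	{ 0, 0x0b, SKETCH_ENTRY_INVALID },
	{ 0, 0x0c, 1000000 },
	{ 0, 0x0d, 1200000 },
	{ 0, 0,    SKETCH_TABLE_END },
};

/* Count the entries that carry a usable frequency. */
unsigned int sketch_count_valid(const struct cpufreq_frequency_table *table)
{
	const struct cpufreq_frequency_table *pos;
	unsigned int count = 0;

	for (pos = table; pos->frequency != SKETCH_TABLE_END; pos++) {
		if (pos->frequency == SKETCH_ENTRY_INVALID)
			continue;
		count++;
	}
	return count;
}
```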
The cpufreq table is verified automatically by the core if the policy contains a
valid pointer in its policy->freq_table field.

cpufreq_frequency_table_verify() ensures that at least one valid
frequency is within policy->min and policy->max, and all other criteria
are met. This is helpful for the ->verify call.

cpufreq_frequency_table_target() is the corresponding frequency table
helper for the ->target stage. Just pass the values to this function,
and this function returns the index of the frequency table entry which
contains the frequency the CPU shall be set to.

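The idea behind the table-based verify step can be sketched as follows: if
no table frequency lies within policy->min and policy->max, raise
policy->max to the next larger table frequency. This is a user-space
approximation with stand-in types and an invented function name,
sketch_table_verify(); it skips the clamping against the hardware limits
that a complete verify step would also do.

```c
#define SKETCH_ENTRY_INVALID (~0u)	/* stand-in for CPUFREQ_ENTRY_INVALID */
#define SKETCH_TABLE_END     (~1u)	/* stand-in for CPUFREQ_TABLE_END */

struct cpufreq_frequency_table {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

struct cpufreq_policy {
	unsigned int min;		/* kHz */
	unsigned int max;		/* kHz */
};

void sketch_table_verify(struct cpufreq_policy *policy,
			 const struct cpufreq_frequency_table *table)
{
	const struct cpufreq_frequency_table *pos;
	unsigned int next_larger = ~0u;
	int found = 0;

	for (pos = table; pos->frequency != SKETCH_TABLE_END; pos++) {
		unsigned int freq = pos->frequency;

		if (freq == SKETCH_ENTRY_INVALID)
			continue;

		if (freq >= policy->min && freq <= policy->max) {
			found = 1;
			break;
		}

		/* Remember the smallest frequency above policy->max. */
		if (freq > policy->max && freq < next_larger)
			next_larger = freq;
	}

	/* Nothing in range: widen the range upwards, if possible. */
	if (!found && next_larger != ~0u)
		policy->max = next_larger;
}
```

For a table of 600000, 1000000 and 1200000 kHz and a requested range of
700000 to 900000 kHz, the sketch raises policy->max to 1000000 kHz.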
The following macros can be used as iterators over cpufreq_frequency_table:

cpufreq_for_each_entry(pos, table) - iterates over all entries of the
frequency table.

cpufreq_for_each_valid_entry(pos, table) - iterates over all entries,
excluding CPUFREQ_ENTRY_INVALID frequencies.

Use the arguments "pos", a ``cpufreq_frequency_table *`` as a loop cursor,
and "table", the ``cpufreq_frequency_table *`` you want to iterate over.

For example::

	struct cpufreq_frequency_table *pos, *driver_freq_table;

	cpufreq_for_each_entry(pos, driver_freq_table) {
		/* Do something with pos */
		pos->frequency = ...
	}

If you need to work with the position of pos within driver_freq_table,
do not subtract the pointers, as it is quite costly. Instead, use the
macros cpufreq_for_each_entry_idx() and cpufreq_for_each_valid_entry_idx().
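
To illustrate the index-carrying variant, here is a user-space sketch. The
macro sketch_for_each_entry_idx() is a stand-in written for this example,
assuming the kernel macro walks the table with a cursor and a running
index; the struct and the SKETCH_TABLE_END marker are likewise simplified
stand-ins.

```c
#define SKETCH_TABLE_END (~1u)	/* stand-in for CPUFREQ_TABLE_END */

struct cpufreq_frequency_table {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz */
};

/* Stand-in iterator: advances the cursor and the index together. */
#define sketch_for_each_entry_idx(pos, table, idx)			\
	for (pos = table, idx = 0;					\
	     pos->frequency != SKETCH_TABLE_END;			\
	     pos++, idx++)

/* Return the table index of freq, or -1 if it is not in the table. */
int sketch_index_of(const struct cpufreq_frequency_table *table,
		    unsigned int freq)
{
	const struct cpufreq_frequency_table *pos;
	int idx;

	sketch_for_each_entry_idx(pos, table, idx) {
		if (pos->frequency == freq)
			return idx;	/* no pointer subtraction needed */
	}
	return -1;
}
```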