cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy
If the current P-state selection algorithm is set to "performance"
in intel_pstate_set_policy(), the limits may be initialized from
scratch, but only if no_turbo is not set and the maximum frequency
allowed for the given CPU (i.e. the policy object representing it)
is at least equal to the max frequency supported by the CPU. In all
of the other cases, the limits will not be updated.
For example, the following can happen:
# cat intel_pstate/status
active
# echo performance > cpufreq/policy0/scaling_governor
# cat intel_pstate/min_perf_pct
100
# echo 94 > intel_pstate/min_perf_pct
# cat intel_pstate/min_perf_pct
100
# cat cpufreq/policy0/scaling_max_freq
3100000
echo 3000000 > cpufreq/policy0/scaling_max_freq
# cat intel_pstate/min_perf_pct
94
# echo 95 > intel_pstate/min_perf_pct
# cat intel_pstate/min_perf_pct
95
That is confusing for two reasons. First, the initial attempt to
change min_perf_pct to 94 seems to have no effect, even though
setting the global limits should always work. Second, after
changing scaling_max_freq for policy0 the global min_perf_pct
attribute shows 94, even though it should have not been affected
by that operation in principle.
Moreover, the final attempt to change min_perf_pct to 95 worked
as expected, because scaling_max_freq for the only policy with
scaling_governor equal to "performance" was different from the
maximum at that time.
To make all that confusion go away, modify intel_pstate_set_policy()
so that it doesn't reinitialize the limits at all.
At the same time, change intel_pstate_set_performance_limits() to
set min_sysfs_pct to 100 in the "performance" limits set so that
switching the P-state selection algorithm to "performance" causes
intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value
min_sysfs_pct in the "performance" limits is set to later).
That requires per-CPU limits to be initialized explicitly rather
than by copying the global limits to avoid setting min_sysfs_pct
in the per-CPU limits to 100.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 3dc5466..f1f8fbe 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -382,6 +382,7 @@ static void intel_pstate_set_performance_limits(struct perf_limits *limits)
intel_pstate_init_limits(limits);
limits->min_perf_pct = 100;
limits->min_perf = int_ext_tofp(1);
+ limits->min_sysfs_pct = 100;
}
static DEFINE_MUTEX(intel_pstate_driver_lock);
@@ -2146,16 +2147,11 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
mutex_lock(&intel_pstate_limits_lock);
if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ pr_debug("set performance\n");
if (!perf_limits) {
limits = &performance_limits;
perf_limits = limits;
}
- if (policy->max >= policy->cpuinfo.max_freq &&
- !limits->no_turbo) {
- pr_debug("set performance\n");
- intel_pstate_set_performance_limits(perf_limits);
- goto out;
- }
} else {
pr_debug("set powersave\n");
if (!perf_limits) {
@@ -2166,7 +2162,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
}
intel_pstate_update_perf_limits(policy, perf_limits);
- out:
+
if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) {
/*
* NOHZ_FULL CPUs need this as the governor callback may not
@@ -2257,13 +2253,8 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
cpu = all_cpu_data[policy->cpu];
- /*
- * We need sane value in the cpu->perf_limits, so inherit from global
- * perf_limits limits, which are seeded with values based on the
- * CONFIG_CPU_FREQ_DEFAULT_GOV_*, during boot up.
- */
if (per_cpu_limits)
- memcpy(cpu->perf_limits, limits, sizeof(struct perf_limits));
+ intel_pstate_init_limits(cpu->perf_limits);
policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling;
policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling;