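sched: skip updating rq's next_balance under null SD

While experimenting with sched_smt_power_savings/sched_mc_power_savings, I
found that while the scheduler domains are being rebuilt after a sysfs
setting changes, rebalance_domains() can get triggered on other cpus that
are attached to the null domain at that moment. With no domain to walk,
next_balance keeps its initial value of jiffies + 60*HZ, and that value is
then stored in rq->next_balance. The result is no idle/busy balancing on
that cpu for 60 seconds.

Fix this by writing rq->next_balance only when some domain actually
supplied an earlier deadline.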
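
For illustration only, here is a minimal user-space sketch of the old and
new behaviour. It is not the kernel code: toy_domain, toy_rq,
toy_rebalance_old() and toy_rebalance_new() are hypothetical stand-ins, HZ
is an assumed value, and the real load balancing work is left out. Only the
next_balance bookkeeping is modelled.

```c
#include <stddef.h>

#define HZ 250	/* assumed tick rate, for the sketch only */

/* Wrap-safe "a is later than b", same idea as the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

struct toy_domain {		/* hypothetical stand-in for struct sched_domain */
	unsigned long last_balance;
	unsigned long interval;
	struct toy_domain *parent;
};

struct toy_rq {			/* hypothetical stand-in for struct rq */
	unsigned long next_balance;
};

/* Old behaviour: always store, even when no domain was visited. */
static void toy_rebalance_old(struct toy_rq *rq, struct toy_domain *sd,
			      unsigned long now)
{
	unsigned long next_balance = now + 60 * HZ;

	for (; sd; sd = sd->parent)
		if (time_after(next_balance, sd->last_balance + sd->interval))
			next_balance = sd->last_balance + sd->interval;

	rq->next_balance = next_balance;
}

/* Patched behaviour: store only if some domain supplied a deadline. */
static void toy_rebalance_new(struct toy_rq *rq, struct toy_domain *sd,
			      unsigned long now)
{
	unsigned long next_balance = now + 60 * HZ;
	int update_next_balance = 0;

	for (; sd; sd = sd->parent) {
		if (time_after(next_balance, sd->last_balance + sd->interval)) {
			next_balance = sd->last_balance + sd->interval;
			update_next_balance = 1;
		}
	}

	if (update_next_balance)
		rq->next_balance = next_balance;
}
```

Passing sd == NULL models a cpu attached to the null domain: the old
variant pushes rq->next_balance 60*HZ into the future, the patched variant
leaves the previous value alone.
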
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
diff --git a/kernel/sched.c b/kernel/sched.c
index d96030d..a4b22d9 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3043,6 +3043,7 @@
struct sched_domain *sd;
/* Earliest time when we have to do rebalance again */
unsigned long next_balance = jiffies + 60*HZ;
+ int update_next_balance = 0;
for_each_domain(cpu, sd) {
if (!(sd->flags & SD_LOAD_BALANCE))
@@ -3079,8 +3080,10 @@
if (sd->flags & SD_SERIALIZE)
spin_unlock(&balancing);
out:
- if (time_after(next_balance, sd->last_balance + interval))
+ if (time_after(next_balance, sd->last_balance + interval)) {
next_balance = sd->last_balance + interval;
+ update_next_balance = 1;
+ }
/*
* Stop the load balance at this level. There is another
@@ -3090,7 +3093,14 @@
if (!balance)
break;
}
- rq->next_balance = next_balance;
+
+ /*
+ * next_balance will be updated only when there is a need.
+ * When the cpu is attached to null domain for ex, it will not be
+ * updated.
+ */
+ if (likely(update_next_balance))
+ rq->next_balance = next_balance;
}
/*