Diff - f9f9ffc237dd924f048204e8799da74f9ecf40cf^! - SHIFTPHONES/android_kernel_shift_sdm845

commit	f9f9ffc237dd924f048204e8799da74f9ecf40cf
author	Ben Segall <bsegall@google.com>	Wed Oct 16 11:16:32 2013 -0700
committer	Ingo Molnar <mingo@kernel.org>	Tue Oct 29 12:02:32 2013 +0100
tree	81ed0c3435dfe54781d0f120d3a5938d571bacd1
parent	0ac9b1c21874d2490331233b3242085f8151e166

sched: Avoid throttle_cfs_rq() racing with period_timer stopping

throttle_cfs_rq() doesn't check to make sure that period_timer is running,
and while update_curr/assign_cfs_runtime does, a concurrently running
period_timer on another cpu could cancel itself between this cpu's
update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running
in the tg to restart the timer, this causes the cfs_rq to be stranded
forever.

Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
inactive.

(Also add some sched_debug lines for cfs_bandwidth.)

Tested: make a run/sleep task in a cgroup, loop switching the cgroup
between 1ms/100ms quota and unlimited, checking for timer_active=0 and
throttled=1 as a failure. With the throttle_cfs_rq() change commented out
this fails; with the full patch it passes.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0923ab2..41c02b6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3112,6 +3112,8 @@
 	cfs_rq->throttled_clock = rq_clock(rq);
 	raw_spin_lock(&cfs_b->lock);
 	list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
+	if (!cfs_b->timer_active)
+		__start_cfs_bandwidth(cfs_b);
 	raw_spin_unlock(&cfs_b->lock);
 }
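
The interleaving described in the commit message can be replayed in a
small userspace model. This is only a sketch under stated assumptions,
not kernel code: struct model_cfs_b and every model_* function are
hypothetical stand-ins invented for illustration, the two cpus are
replayed sequentially rather than run concurrently, and re-arming the
timer is reduced to setting a flag where the real patch calls
__start_cfs_bandwidth().

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct cfs_bandwidth (illustration only). */
struct model_cfs_b {
	bool timer_active;	/* is the period timer armed? */
	int nr_throttled;	/* throttled cfs_rqs waiting for a refill */
};

/* Models the period timer on another cpu cancelling itself. */
static void model_period_timer_stops(struct model_cfs_b *b)
{
	b->timer_active = false;
}

/* Models throttle_cfs_rq(): queue the cfs_rq and, with the fix, re-arm. */
static void model_throttle(struct model_cfs_b *b, bool with_fix)
{
	b->nr_throttled++;
	if (with_fix && !b->timer_active)
		b->timer_active = true;	/* stands in for __start_cfs_bandwidth() */
}

/* A throttled cfs_rq with no armed timer is never unthrottled. */
static bool model_stranded(const struct model_cfs_b *b)
{
	return b->nr_throttled > 0 && !b->timer_active;
}

/* Replays the race: runtime check passes, timer stops, then throttle runs. */
static bool model_race(bool with_fix)
{
	struct model_cfs_b b = { .timer_active = true, .nr_throttled = 0 };

	/* cpu A: update_curr() observed the timer as active at this point */
	model_period_timer_stops(&b);	/* cpu B: period timer cancels itself */
	model_throttle(&b, with_fix);	/* cpu A: throttle_cfs_rq() */
	return model_stranded(&b);
}
```

In this model, model_race(false) returns true (the cfs_rq is throttled
while the timer is inactive, the failure condition from the test
description), and model_race(true) returns false.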