diff options
author | Steven Rostedt <srostedt@redhat.com> | 2012-07-16 08:07:43 (GMT) |
---|---|---|
committer | Scott Wood <scottwood@freescale.com> | 2015-02-13 22:21:03 (GMT) |
commit | 46dbcd09f7eca4b77dc7f663c009b14a89ca30ac (patch) | |
tree | 4ada8a16def204635befcc042540177e91c12319 /kernel/sched/core.c | |
parent | 9cbe0e5c2400442c006717861b504244b2b16c0e (diff) | |
download | linux-fsl-qoriq-46dbcd09f7eca4b77dc7f663c009b14a89ca30ac.tar.xz |
cpu/rt: Rework cpu down for PREEMPT_RT
Bringing a CPU down is a pain with the PREEMPT_RT kernel because
tasks can be preempted in many more places than in non-RT. In
order to handle per_cpu variables, tasks may be pinned to a CPU
for a while, and even sleep. But these tasks need to be off the CPU
if that CPU is going down.
Several synchronization methods have been tried, but when stressed
they failed. This is a new approach.
A sync_tsk thread is still created and tasks may still block on a
lock when the CPU is going down, but how that works is a bit different.
When cpu_down() starts, it will create the sync_tsk and wait on it
to inform that current tasks that are pinned on the CPU are no longer
pinned. But new tasks that are about to be pinned will still be allowed
to do so at this time.
Then the notifiers are called. Several notifiers will bring down tasks
that will enter these locations. Some of these tasks will take locks
of other tasks that are on the CPU. If we don't let those other tasks
continue, but make them block until CPU down is done, the tasks that
the notifiers are waiting on will never complete as they are waiting
for the locks held by the tasks that are blocked.
Thus we still let the task pin the CPU until the notifiers are done.
After the notifiers run, we then make new tasks entering the pinned
CPU sections grab a mutex and wait. This mutex is now a per CPU mutex
in the hotplug_pcp descriptor.
To help things along, a new function in the scheduler code is created
called migrate_me(). This function will try to migrate the current task
off the CPU this is going down if possible. When the sync_tsk is created,
all tasks will then try to migrate off the CPU going down. There are
several cases that this wont work, but it helps in most cases.
After the notifiers are called and if a task can't migrate off but enters
the pin CPU sections, it will be forced to wait on the hotplug_pcp mutex
until the CPU down is complete. Then the scheduler will force the migration
anyway.
Also, I found that THREAD_BOUND need to also be accounted for in the
pinned CPU, and the migrate_disable no longer treats them special.
This helps fix issues with ksoftirqd and workqueue that unbind on CPU down.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Diffstat (limited to 'kernel/sched/core.c')
-rw-r--r-- | kernel/sched/core.c | 82 |
1 files changed, 80 insertions, 2 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9165f01..c8c8ac1 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2371,7 +2371,7 @@ void migrate_disable(void) { struct task_struct *p = current; - if (in_atomic() || p->flags & PF_NO_SETAFFINITY) { + if (in_atomic()) { #ifdef CONFIG_SCHED_DEBUG p->migrate_disable_atomic++; #endif @@ -2401,7 +2401,7 @@ void migrate_enable(void) unsigned long flags; struct rq *rq; - if (in_atomic() || p->flags & PF_NO_SETAFFINITY) { + if (in_atomic()) { #ifdef CONFIG_SCHED_DEBUG p->migrate_disable_atomic--; #endif @@ -4440,6 +4440,84 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) cpumask_copy(&p->cpus_allowed, new_mask); } +static DEFINE_PER_CPU(struct cpumask, sched_cpumasks); +static DEFINE_MUTEX(sched_down_mutex); +static cpumask_t sched_down_cpumask; + +void tell_sched_cpu_down_begin(int cpu) +{ + mutex_lock(&sched_down_mutex); + cpumask_set_cpu(cpu, &sched_down_cpumask); + mutex_unlock(&sched_down_mutex); +} + +void tell_sched_cpu_down_done(int cpu) +{ + mutex_lock(&sched_down_mutex); + cpumask_clear_cpu(cpu, &sched_down_cpumask); + mutex_unlock(&sched_down_mutex); +} + +/** + * migrate_me - try to move the current task off this cpu + * + * Used by the pin_current_cpu() code to try to get tasks + * to move off the current CPU as it is going down. + * It will only move the task if the task isn't pinned to + * the CPU (with migrate_disable, affinity or NO_SETAFFINITY) + * and the task has to be in a RUNNING state. Otherwise the + * movement of the task will wake it up (change its state + * to running) when the task did not expect it. + * + * Returns 1 if it succeeded in moving the current task + * 0 otherwise. + */ +int migrate_me(void) +{ + struct task_struct *p = current; + struct migration_arg arg; + struct cpumask *cpumask; + struct cpumask *mask; + unsigned long flags; + unsigned int dest_cpu; + struct rq *rq; + + /* + * We can not migrate tasks bounded to a CPU or tasks not + * running. The movement of the task will wake it up. + */ + if (p->flags & PF_NO_SETAFFINITY || p->state) + return 0; + + mutex_lock(&sched_down_mutex); + rq = task_rq_lock(p, &flags); + + cpumask = &__get_cpu_var(sched_cpumasks); + mask = &p->cpus_allowed; + + cpumask_andnot(cpumask, mask, &sched_down_cpumask); + + if (!cpumask_weight(cpumask)) { + /* It's only on this CPU? */ + task_rq_unlock(rq, p, &flags); + mutex_unlock(&sched_down_mutex); + return 0; + } + + dest_cpu = cpumask_any_and(cpu_active_mask, cpumask); + + arg.task = p; + arg.dest_cpu = dest_cpu; + + task_rq_unlock(rq, p, &flags); + + stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); + tlb_migrate_finish(p->mm); + mutex_unlock(&sched_down_mutex); + + return 1; +} + /* * This is how migration works: * |