path: root/kernel
Age  Commit message  Author
2014-04-10rtmutex-lock-killable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10futex: Ensure lock/unlock symetry versus pi_lock and hash bucket lockThomas Gleixner
In exit_pi_state_list() we have the following locking construct:

    spin_lock(&hb->lock);
    raw_spin_lock_irq(&curr->pi_lock);
    ...
    spin_unlock(&hb->lock);

In !RT this works, but on RT the migrate_enable() function, which is called from spin_unlock(), sees atomic context due to the held pi_lock and just decrements the migrate_disable_atomic counter of the task. The next call to migrate_disable() then sees the counter being negative and issues a warning. That check should be in migrate_enable() already.

Fix this by dropping pi_lock before unlocking hb->lock and reacquiring pi_lock afterwards. This is safe as the loop code reevaluates head again under the pi_lock.

Reported-by: Yong Zhang <yong.zhang@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
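The unlock ordering described above can be illustrated in user space. This is only a sketch under stated assumptions: pthread mutexes stand in for hb->lock and pi_lock, and the types `pi_state_sketch` and `task_sketch` as well as the function name are hypothetical, not the kernel's futex code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real futex code. */
struct pi_state_sketch {
    struct pi_state_sketch *next;
    pthread_mutex_t *hb_lock;       /* lock of the bucket this entry lives in */
};

struct task_sketch {
    pthread_mutex_t pi_lock;
    struct pi_state_sketch *head;   /* list protected by pi_lock */
};

/* Walks the list and returns the number of entries released. */
static int exit_pi_state_list_sketch(struct task_sketch *curr)
{
    int released = 0;

    pthread_mutex_lock(&curr->pi_lock);
    while (curr->head) {
        struct pi_state_sketch *ps = curr->head;
        pthread_mutex_t *hb = ps->hb_lock;

        /* Lock order is hb lock first, then pi_lock, so drop pi_lock. */
        pthread_mutex_unlock(&curr->pi_lock);
        pthread_mutex_lock(hb);
        pthread_mutex_lock(&curr->pi_lock);

        /* Only touch the list if it did not change while pi_lock was dropped. */
        if (curr->head == ps) {
            curr->head = ps->next;
            released++;
        }

        /* Symmetric release: inner lock (pi_lock) first, then the hb lock. */
        pthread_mutex_unlock(&curr->pi_lock);
        pthread_mutex_unlock(hb);

        /* Retake pi_lock; the loop condition re-reads head under it. */
        pthread_mutex_lock(&curr->pi_lock);
    }
    pthread_mutex_unlock(&curr->pi_lock);
    return released;
}
```

The point of the sketch is that every unlock of the outer lock happens with the inner lock already released, which is the symmetry the commit restores.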
2014-04-10futex: Fix bug on when a requeued RT task times outSteven Rostedt
Requeue with timeout causes a bug with PREEMPT_RT_FULL.

The bug comes from a timed out condition.

    TASK 1                              TASK 2
    ------                              ------
    futex_wait_requeue_pi()
        futex_wait_queue_me()
        <timed out>

                                        double_lock_hb();

        raw_spin_lock(pi_lock);
        if (current->pi_blocked_on) {
        } else {
            current->pi_blocked_on = PI_WAKE_INPROGRESS;
            raw_spin_unlock(pi_lock);
            spin_lock(hb->lock); <-- blocked!

                                        plist_for_each_entry_safe(this) {
                                            rt_mutex_start_proxy_lock();
                                                task_blocks_on_rt_mutex();
                                                BUG_ON(task->pi_blocked_on)!!!!

The BUG_ON() actually has a check for PI_WAKE_INPROGRESS, but the problem is that, after TASK 1 sets PI_WAKE_INPROGRESS, it then tries to grab the hb->lock, which it fails to do. As the hb->lock is a mutex, it will block and set "pi_blocked_on" to the hb->lock.

When TASK 2 goes to requeue it, the check for PI_WAKE_INPROGRESS fails because TASK 1's pi_blocked_on is no longer set to that, but is instead set to the hb->lock.

The fix:

When calling rt_mutex_start_proxy_lock(), a check is made to see if the proxy task's pi_blocked_on is set. If so, exit out early. Otherwise set it to a new flag, PI_REQUEUE_INPROGRESS, which notifies the proxy task that it is being requeued, so that it can handle things appropriately.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
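The early-exit check from "The fix" can be sketched with sentinel pointer values. The numeric values of the sentinels, the struct layouts, the return codes and the function name below are assumptions made for illustration; in the kernel the check would run under the task's pi_lock.

```c
#include <stddef.h>

struct waiter_sketch { int unused; };

/* Sentinel "pointers" named after the flags in the commit message.
 * The numeric values are assumptions for this sketch. */
#define PI_WAKE_INPROGRESS    ((struct waiter_sketch *)1)
#define PI_REQUEUE_INPROGRESS ((struct waiter_sketch *)2)

struct proxy_task_sketch {
    struct waiter_sketch *pi_blocked_on;  /* NULL, a sentinel, or a real waiter */
};

/*
 * Returns 0 if the requeue may proceed and marks the task as being
 * requeued. Returns -1 if the task is already waking up or is blocked
 * on some other lock (for example hb->lock), so the caller backs out
 * instead of hitting the BUG_ON().
 */
static int start_proxy_lock_sketch(struct proxy_task_sketch *task)
{
    if (task->pi_blocked_on != NULL)
        return -1;

    task->pi_blocked_on = PI_REQUEUE_INPROGRESS;
    return 0;
}
```

Testing for any non-NULL value, rather than only for PI_WAKE_INPROGRESS, is what covers the case where the timed-out task has already moved on to blocking on hb->lock.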
2014-04-10rtmutex-futex-prepare-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10genirq: Allow disabling of softirq processing in irq thread contextThomas Gleixner
The processing of softirqs in irq thread context is a performance gain for the non-rt workloads of a system, but it's counterproductive for interrupts which are explicitly related to the realtime workload. Allow such interrupts to prevent softirq processing in their thread context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2014-04-10tasklet: Prevent tasklets from going into infinite spin in RTIngo Molnar
When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads, and spinlocks turn into mutexes. But this can cause issues with tasks disabling tasklets. A tasklet runs under ksoftirqd, and if tasklets are disabled with tasklet_disable(), the tasklet count is increased. When a tasklet runs, it checks this counter and if it is set, it adds itself back on the softirq queue and returns.

The problem arises in RT because ksoftirqd will see that a softirq is ready to run (the tasklet softirq just re-armed itself), and will not sleep, but instead run the softirqs again. The tasklet softirq will still see that the count is non-zero, will not execute the tasklet, and will requeue itself on the softirq again, which will cause ksoftirqd to run it again and again and again.

It gets worse because ksoftirqd runs as a real-time thread. If it preempted the task that disabled tasklets, and that task has migration disabled, or can't run for other reasons, the tasklet softirq will never run because the count will never be zero, and ksoftirqd will go into an infinite loop. As ksoftirqd is an RT task, this becomes a big problem.

This is a hack solution to have tasklet_disable() stop tasklets, and when a tasklet runs, instead of requeueing the tasklet softirq it delays it. When tasklet_enable() is called, and tasklets are waiting, then tasklet_enable() will kick the tasklets to continue. This prevents the lock up from ksoftirqd going into an infinite loop.

[ rostedt@goodmis.org: ported to 3.0-rt ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
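The "delay instead of requeue" idea can be modelled with a few counters. This is a single-threaded toy model, not the kernel's tasklet implementation; the struct, its fields and the `_sketch` functions are hypothetical names chosen for illustration.

```c
/* Toy model of a tasklet; all names are hypothetical. */
struct tasklet_sketch {
    int disable_count;  /* raised by disable, lowered by enable */
    int delayed;        /* set when a run was attempted while disabled */
    int runs;           /* how often the tasklet body actually ran */
};

static void tasklet_disable_sketch(struct tasklet_sketch *t)
{
    t->disable_count++;
}

/*
 * Softirq side. Returns 1 if the softirq would have to be raised again.
 * In the scheme described above a disabled tasklet is parked, so this
 * never asks for an immediate re-run and the softirq thread can sleep.
 */
static int tasklet_action_sketch(struct tasklet_sketch *t)
{
    if (t->disable_count > 0) {
        t->delayed = 1;     /* park it; do not requeue */
        return 0;
    }
    t->delayed = 0;
    t->runs++;
    return 0;
}

/* Returns 1 if the caller must kick the softirq to run a parked tasklet. */
static int tasklet_enable_sketch(struct tasklet_sketch *t)
{
    t->disable_count--;
    return t->disable_count == 0 && t->delayed;
}
```

In this model the busy loop cannot occur because a disabled tasklet never reports pending work; the work is resumed only from the enable path.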
2014-04-10softirq-make-fifo.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10softirq-local-lock.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10mutex-no-spin-on-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10lockdep-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10softirq: Sanitize softirq pending for NOHZ/RTThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-clear-pf-thread-bound-on-fallback-rq.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched: dont calculate hweight in update_migrate_disable()Nicholas Mc Guire
Proposal for a minor optimization in update_migrate_disable(). It only saves a few instructions, but those are in the hot path of locks, so it might be worth it.

When being scheduled out while migrate_disable > 0 and migrate_disabled_updated is not yet set, we end up here (kernel/sched/core.c):

    static inline void update_migrate_disable(struct task_struct *p)
    {
    ...
            mask = tsk_cpus_allowed(p);

            if (p->sched_class->set_cpus_allowed)
                    p->sched_class->set_cpus_allowed(p, mask);
            p->nr_cpus_allowed = cpumask_weight(mask);

As we can only get here if migrate_disable > 0, there is no need to calculate cpumask_weight(mask): tsk_cpus_allowed() in that case returns cpumask_of(task_cpu(p)), which can only have a Hamming weight of 1 anyway. So we can simply do:

    p->nr_cpus_allowed = 1;

without changing the behavior.

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
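The reasoning that a single-CPU mask always has weight 1 can be checked with a small user-space model. A 64-bit integer stands in for a cpumask here, and `mask_of_cpu()` and `mask_weight()` are hypothetical helpers, not the kernel's cpumask API.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
static uint64_t mask_of_cpu(unsigned int cpu)
{
    return (uint64_t)1 << cpu;
}

/* Hamming weight: clear the lowest set bit until none are left. */
static unsigned int mask_weight(uint64_t mask)
{
    unsigned int weight = 0;

    while (mask) {
        mask &= mask - 1;
        weight++;
    }
    return weight;
}
```

For every valid CPU number the weight of `mask_of_cpu(cpu)` is 1, which is why the computed value can be replaced by the constant.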
2014-04-10sched: Have migrate_disable ignore bounded threadsPeter Zijlstra
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Clark Williams <williams@redhat.com> Link: http://lkml.kernel.org/r/20110927124423.567944215@goodmis.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched: Do not compare cpu masks in schedulerPeter Zijlstra
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Clark Williams <williams@redhat.com> Link: http://lkml.kernel.org/r/20110927124423.128129033@goodmis.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10allow preemption in recursive migrate_disable callNicholas Mc Guire
Minor cleanup in migrate_disable/migrate_enable. The recursive case does not need to disable preemption as it is "pinned" to the current cpu any way so it is safe to preempt it. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2014-04-10sched: Postpone actual migration disalbe to scheduleSteven Rostedt
migrate_disable() can cause a bit of overhead in the RT kernel, as changing the affinity is expensive to do at every lock encountered. As a running task cannot migrate, the actual disabling of migration does not need to occur until the task is about to schedule out. In most cases, a task that disables migration will enable it again before it schedules, so this change improves performance tremendously. [ Frank Rowand: UP compile fix ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Clark Williams <williams@redhat.com> Link: http://lkml.kernel.org/r/20110927124422.779693167@goodmis.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
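The deferral can be modelled with a nesting counter and a flag that records whether the expensive affinity change was actually performed. This is a toy model; the struct, its field names and the `_sketch` functions are hypothetical.

```c
/* Toy model of a task; field names are hypothetical. */
struct lazy_task_sketch {
    int migrate_disable;    /* nesting depth of migrate_disable() */
    int pinned;             /* 1 once the affinity was really changed */
    int affinity_changes;   /* counts the expensive operation */
};

/* Cheap path: only bump the counter. */
static void migrate_disable_sketch(struct lazy_task_sketch *t)
{
    t->migrate_disable++;
}

/* Called when the task is about to be scheduled out. */
static void schedule_out_sketch(struct lazy_task_sketch *t)
{
    if (t->migrate_disable > 0 && !t->pinned) {
        t->affinity_changes++;  /* pin to the current CPU now */
        t->pinned = 1;
    }
}

/* Undo the pinning only if it actually happened. */
static void migrate_enable_sketch(struct lazy_task_sketch *t)
{
    if (--t->migrate_disable == 0 && t->pinned) {
        t->affinity_changes++;  /* restore the original affinity */
        t->pinned = 0;
    }
}
```

A disable/enable pair with no schedule in between performs no affinity change at all in this model, which is the common case the commit message refers to.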
2014-04-10sched: teach migrate_disable about atomic contextsPeter Zijlstra
    <NMI> [<ffffffff812dafd8>] spin_bug+0x94/0xa8
    [<ffffffff812db07f>] do_raw_spin_lock+0x43/0xea
    [<ffffffff814fa9be>] _raw_spin_lock_irqsave+0x6b/0x85
    [<ffffffff8106ff9e>] ? migrate_disable+0x75/0x12d
    [<ffffffff81078aaf>] ? pin_current_cpu+0x36/0xb0
    [<ffffffff8106ff9e>] migrate_disable+0x75/0x12d
    [<ffffffff81115b9d>] pagefault_disable+0xe/0x1f
    [<ffffffff81047027>] copy_from_user_nmi+0x74/0xe6
    [<ffffffff810489d7>] perf_callchain_user+0xf3/0x135

Now clearly we can't go around taking locks from NMI context, cure this by short-circuiting migrate_disable() when we're in an atomic context already.

Add some extra debugging to avoid things like:

    preempt_disable()
    migrate_disable();

    preempt_enable();
    migrate_enable();

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314967297.1301.14.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-wbot4vsmwhi8vmbf83hsclk6@git.kernel.org
2014-04-10sched, rt: Fix migrate_enable() thinkoMike Galbraith
Assigning mask = tsk_cpus_allowed(p) after p->migrate_disable = 0 ensures that we won't see a mask change.. no push/pull, we stack tasks on one CPU. Also add a couple fields to sched_debug for the next guy. [ Build fix from Stratos Psomadakis <psomas@gentoo.org> ] Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1314108763.6689.4.camel@marge.simson.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched: Generic migrate_disablePeter Zijlstra
Make migrate_disable() be a preempt_disable() for !rt kernels. This allows generic code to use it but still enforces that these code sections stay relatively small. A preemptible migrate_disable() accessible for general use would allow people to grow arbitrary per-cpu crap instead of cleaning these things up. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-275i87sl8e1jcamtchmehonm@git.kernel.org
2014-04-10sched: Optimize migrate_disablePeter Zijlstra
Change from task_rq_lock() to raw_spin_lock(&rq->lock) to avoid a few atomic ops. See comment on why it should be safe. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-cbz6hkl5r5mvwtx5s3tor2y6@git.kernel.org
2014-04-10tracing: Show padding as unsigned shortSteven Rostedt
RT added two bytes to trace migrate disable counting to the trace events and used two bytes of the padding to make the change. The structures and all were updated correctly, but the display in the event formats was not:

    cat /debug/tracing/events/sched/sched_switch/format

    name: sched_switch
    ID: 51
    format:
            field:unsigned short common_type;       offset:0;       size:2; signed:0;
            field:unsigned char common_flags;       offset:2;       size:1; signed:0;
            field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
            field:int common_pid;   offset:4;       size:4; signed:1;
            field:unsigned short common_migrate_disable;    offset:8;       size:2; signed:0;
            field:int common_padding;       offset:10;      size:2; signed:0;

The field for common_padding has the correct size and offset, but the use of "int" might confuse some parsers (and people that are reading it). This needs to be changed to "unsigned short".

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1321467575.4181.36.camel@frodo
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
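The offsets and sizes in the format listing can be reproduced with a plain C struct. The struct name is hypothetical and the sketch assumes a common ABI with a 2-byte `unsigned short` and a 4-byte, 4-byte-aligned `int`; it is not the kernel's trace entry definition.

```c
#include <stddef.h>

/*
 * Hypothetical mirror of the common fields from the listing, with
 * common_padding declared as unsigned short so that the declared type
 * matches the reported size of 2.
 */
struct trace_common_sketch {
    unsigned short common_type;             /* offset 0,  size 2 */
    unsigned char  common_flags;            /* offset 2,  size 1 */
    unsigned char  common_preempt_count;    /* offset 3,  size 1 */
    int            common_pid;              /* offset 4,  size 4 */
    unsigned short common_migrate_disable;  /* offset 8,  size 2 */
    unsigned short common_padding;          /* offset 10, size 2 */
};

/* Helper so callers can query a member's size without an instance. */
#define MEMBER_SIZE(type, member) sizeof(((type *)0)->member)
```

With these types the size reported for `common_padding` agrees with its declaration, whereas an `int` declaration with `size:2` is self-contradictory for a parser that trusts the type name.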
2014-04-10ftrace-migrate-disable-tracing.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hotplug: Call cpu_unplug_begin() before DOWN_PREPAREYong Zhang
cpu_unplug_begin() should be called before CPU_DOWN_PREPARE, because at CPU_DOWN_PREPARE cpu_active is cleared and sched_domain is rebuilt. Otherwise the 'sync_unplug' thread will be running on the cpu on which it's created and not bound on the cpu which is about to go down.

I found that by an incorrect warning on smp_processor_id() called by sync_unplug/1, and the trace shows:

    (echo 1 > /sys/device/system/cpu/cpu1/online)
    bash-1664 [000] 83.136620: _cpu_down: Bind sync_unplug to cpu 1
    bash-1664 [000] 83.136623: sched_wait_task: comm=sync_unplug/1 pid=1724 prio=120
    bash-1664 [000] 83.136624: _cpu_down: Wake sync_unplug
    bash-1664 [000] 83.136629: sched_wakeup: comm=sync_unplug/1 pid=1724 prio=120 success=1 target_cpu=000

Wants to be folded back....

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-3-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hotplug-use-migrate-disable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-migrate-disable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hotplug: Reread hotplug_pcp on pin_current_cpu() retryYong Zhang
When a retry happens, it's likely that the task has been migrated to another cpu (unless the unplug failed), but it still dereferences the original hotplug_pcp per cpu data. Update the pointer to hotplug_pcp in the retry path, so it points to the current cpu. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110728031600.GA338@windriver.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hotplug: sync_unplug: No "\n" in task nameYong Zhang
Otherwise the output will look a little odd. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Link: http://lkml.kernel.org/r/1318762607-2261-2-git-send-email-yong.zhang0@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hotplug: Lightweight get online cpusThomas Gleixner
get_online_cpus() is a heavyweight function which involves a global mutex. migrate_disable() wants a simpler construct which only prevents a CPU from going down while a task is in a migrate disabled section. Implement a per cpu lockless mechanism, which serializes only in the real unplug case on a global mutex. That serialization affects only tasks on the cpu which should be brought down. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10stomp_machine: Use mutex_trylock when called from inactive cpuThomas Gleixner
If the stop machinery is called from an inactive CPU we cannot use mutex_lock, because some other stomp machine invocation might be in progress and the mutex can be contended. We cannot schedule from this context, so trylock and loop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
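"Trylock and loop" can be sketched in user space with a pthread mutex. This is an analogy only: `lock_by_polling()` is a hypothetical helper, and kernel code would use mutex_trylock() with a relax hint in the loop rather than the pthread API.

```c
#include <pthread.h>

/*
 * Acquire a mutex without ever sleeping on it: keep trying until the
 * trylock succeeds. Returns how many attempts failed, which is 0 when
 * the mutex was uncontended.
 */
static int lock_by_polling(pthread_mutex_t *m)
{
    int failed_attempts = 0;

    while (pthread_mutex_trylock(m) != 0)
        failed_attempts++;  /* a real spin loop would relax the CPU here */

    return failed_attempts;
}
```

The trade-off is deliberate: polling burns CPU while the mutex is contended, but it never calls into the scheduler, which is the constraint the commit message describes for an inactive CPU.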
2014-04-10stomp-machine-raw-lock.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10stop_machine: convert stop_machine_run() to PREEMPT_RTIngo Molnar
Instead of playing with non-preemption, introduce explicit startup serialization. This is more robust and cleaner as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched/workqueue: Only wake up idle workers if not blocked on sleeping spin lockSteven Rostedt
In -rt, most spin_locks() turn into mutexes. One of these spin_lock conversions is performed on the workqueue gcwq->lock. When the idle worker is woken, the first thing it will do is grab that same lock and it too will block, possibly jumping into the same code, but because nr_running would already be decremented it prevents an infinite loop. But this is still a waste of CPU cycles, and it doesn't follow the method of mainline, as new workers should only be woken when a worker thread is truly going to sleep, and not just blocked on a spin_lock(). Check the saved_state too before waking up new workers. Cc: stable-rt@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2014-04-10sched: ttwu: Return success when only changing the saved_state valueThomas Gleixner
When a task blocks on a rt lock, it saves the current state in p->saved_state, so a lock related wake up will not destroy the original state. When a real wakeup happens, while the task is running due to a lock wakeup already, we update p->saved_state to TASK_RUNNING, but we do not return success, which might cause another wakeup in the waitqueue code and the task remains in the waitqueue list. Return success in that case as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
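The saved_state handling can be modelled as follows. The state constants, the struct and `ttwu_sketch()` are simplified, hypothetical stand-ins for the scheduler code; only the decision about the return value is meant to match the description above.

```c
/* Simplified task states; the values are assumptions for this sketch. */
enum {
    SK_TASK_RUNNING         = 0,
    SK_TASK_INTERRUPTIBLE   = 1,
    SK_TASK_UNINTERRUPTIBLE = 2
};

struct ttwu_task_sketch {
    unsigned int state;        /* current state */
    unsigned int saved_state;  /* state saved when blocking on an rt lock */
};

/* Returns 1 if the wakeup took effect, 0 otherwise. */
static int ttwu_sketch(struct ttwu_task_sketch *p, unsigned int wake_mask)
{
    if (!(p->state & wake_mask)) {
        /*
         * The task is not in a state this wakeup targets, for example
         * because a lock wakeup already made it runnable. The state it
         * saved before blocking may still match.
         */
        if (p->saved_state & wake_mask) {
            p->saved_state = SK_TASK_RUNNING;
            return 1;   /* report success for this case as well */
        }
        return 0;
    }

    p->state = SK_TASK_RUNNING;
    return 1;
}
```

Returning 1 in the saved_state branch tells the waitqueue code that this waiter has been handled, so it does not issue another wakeup on the waiter's behalf.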
2014-04-10sched-disable-ttwu-queue.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10cond-resched-softirq-fix.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-cond-resched.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-might-sleep-do-not-account-rcu-depth.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-rt-mutex-wakeup.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-mmdrop-delayed.patchThomas Gleixner
Needs thread context (pgd_lock) -> ifdeffed. workqueues won't work with RT Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-limit-nr-migrate.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10sched-delay-put-task.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10posix-timers: Avoid wakeups when no timers are activeThomas Gleixner
Waking the thread even when no timers are scheduled is useless. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10posix-timers: Shorten posix_cpu_timers/<CPU> kernel thread namesArnaldo Carvalho de Melo
Shorten the softirq kernel thread names because they always overflow the limited comm length, appearing as "posix_cpu_timer" CPU# times. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10posix-timers: thread posix-cpu-timers on -rtJohn Stultz
posix-cpu-timer code takes non -rt safe locks in hard irq context. Move it to a thread. [ 3.0 fixes from Peter Zijlstra <peterz@infradead.org> ] Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10hrtimer: Move schedule_work call to helper threadYang Shi
When running the LTP leapsec_timer test, the following call trace is caught:

    BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
    in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
    Preemption disabled at:[<ffffffff810857f3>] cpu_startup_entry+0x133/0x310

    CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.10-rt3 #2
    Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
    ffffffff81c2f800 ffff880076843e40 ffffffff8169918d ffff880076843e58
    ffffffff8106db31 ffff88007684b4a0 ffff880076843e70 ffffffff8169d9c0
    ffff88007684b4a0 ffff880076843eb0 ffffffff81059da1 0000001876851200
    Call Trace:
    <IRQ> [<ffffffff8169918d>] dump_stack+0x19/0x1b
    [<ffffffff8106db31>] __might_sleep+0xf1/0x170
    [<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
    [<ffffffff81059da1>] queue_work_on+0x61/0x100
    [<ffffffff81065aa1>] clock_was_set_delayed+0x21/0x30
    [<ffffffff810883be>] do_timer+0x40e/0x660
    [<ffffffff8108f487>] tick_do_update_jiffies64+0xf7/0x140
    [<ffffffff8108fe42>] tick_check_idle+0x92/0xc0
    [<ffffffff81044327>] irq_enter+0x57/0x70
    [<ffffffff816a040e>] smp_apic_timer_interrupt+0x3e/0x9b
    [<ffffffff8169f80a>] apic_timer_interrupt+0x6a/0x70
    <EOI> [<ffffffff8155ea1c>] ? cpuidle_enter_state+0x4c/0xc0
    [<ffffffff8155eb68>] cpuidle_idle_call+0xd8/0x2d0
    [<ffffffff8100b59e>] arch_cpu_idle+0xe/0x30
    [<ffffffff8108585e>] cpu_startup_entry+0x19e/0x310
    [<ffffffff8168efa2>] start_secondary+0x1ad/0x1b0

clock_was_set_delayed() is called in the hard IRQ handler (timer interrupt) and calls schedule_work(). Under PREEMPT_RT_FULL, schedule_work() takes spinlocks which can sleep, so it's not safe to call schedule_work() in interrupt context.

Reference upstream commit b68d61c705ef02384c0538b8d9374545097899ca (rt,ntp: Move call to schedule_delayed_work() to helper thread) from git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git, which makes a similar change.

Add a helper thread which does the call to schedule_work() and wake up that thread instead of calling schedule_work() directly.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2014-04-10hrtimer: Raise softirq if hrtimer irq stalledWatanabe
When the hrtimer stall detection hits the softirq is not raised. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2014-04-10hrtimer: fixup hrtimer callback changes for preempt-rtThomas Gleixner
In preempt-rt we cannot call the callbacks which take sleeping locks from the timer interrupt context. Bring back the softirq split for now, until we have fixed the signal delivery problem for real. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2014-04-10hrtimers: prepare full preemptionIngo Molnar
Make cancellation of a running callback in softirq context safe against preemption. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10timers: Avoid the switch timers base set to NULL trick on RTThomas Gleixner
On RT that code is preemptible, so we cannot assign NULL to timers base as a preempter would spin forever in lock_timer_base(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de>