|
With the migration pushdown a few of the do { } while (0) loops became
obsolete but were left over - this patch only removes that fallout.
Patch applies on top of 3.12.9-rt13
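For illustration only - a hypothetical wrapper showing the kind of leftover
being removed; once the pushdown leaves a single statement inside, the
do { } while (0) grouping adds nothing:

    /* hypothetical names - before the pushdown two statements needed grouping */
    #define spin_lock_local_old(lock)       do { migrate_disable(); rt_spin_lock(lock); } while (0)

    /* after the pushdown only one statement remains, the loop is fallout ... */
    #define spin_lock_local_leftover(lock)  do { rt_spin_lock(lock); } while (0)

    /* ... and this cleanup reduces it to a plain mapping */
    #define spin_lock_local(lock)           rt_spin_lock(lock)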
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
pushdown of migrate_disable/enable from read_*lock* to the rt_read_*lock*
API level

general mapping to mutexes:

 read_*lock*
  `-> rt_read_*lock*
          `-> __spin_lock (the sleeping spin locks)
                  `-> rt_mutex

The real read_lock* mapping:

 read_lock_irqsave -.
 read_lock_irq      `-> rt_read_lock_irqsave()
 read_lock ---------.          \
 read_lock_bh ------+           \
                    `--> rt_read_lock()
                           if (rt_mutex_owner(lock) != current) {
                                   `-> __rt_spin_lock()
                                           rt_spin_lock_fastlock()
                                                   `-> rt_mutex_cmpxchg()
                                   migrate_disable()
                           }
                           rwlock->read_depth++;

read_trylock mapping:

 read_trylock
  `-> rt_read_trylock
         if (rt_mutex_owner(lock) != current) {
                 `-> rt_mutex_trylock()
                         rt_mutex_fasttrylock()
                                 rt_mutex_cmpxchg()
                 migrate_disable()
         }
         rwlock->read_depth++;

read_unlock* mapping:

 read_unlock_bh --------+
 read_unlock_irq -------+
 read_unlock_irqrestore +
 read_unlock -----------+
                        `-> rt_read_unlock()
                               if (--rwlock->read_depth == 0) {
                                       `-> __rt_spin_unlock()
                                               rt_spin_lock_fastunlock()
                                                       `-> rt_mutex_cmpxchg()
                                       migrate_enable()
                               }
So calls to migrate_disable/enable() are better placed at the rt_read_*
level of lock/trylock/unlock as all of the read_*lock* API has this as a
common path. In the rt_read* API of lock/trylock/unlock the nesting level
is already being recorded in rwlock->read_depth, so we can push down the
migrate_disable/enable to that level and condition it on the read_depth
going from 0 to 1 -> migrate_disable and 1 to 0 -> migrate_enable. This
eliminates the recursive calls that were needed when migrate_disable/enable
was done at the read_*lock* level.
This approach to read_*_bh also eliminates the concerns raised with
regard to API imbalances (read_lock_bh -> read_unlock + local_bh_enable).
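A minimal sketch (simplified from the call chains above, not the exact kernel
code) of what the pushdown amounts to at the rt_read_* level - migration is
only disabled on the outermost 0 -> 1 read_depth transition and re-enabled on
the final 1 -> 0 transition:

    /* Sketch only: assumes the RT rwlock_t embeds an rt_mutex 'lock' and an
     * 'int read_depth' as shown above; lockdep and debug code omitted. */
    void rt_read_lock(rwlock_t *rwlock)
    {
            struct rt_mutex *lock = &rwlock->lock;

            /* a recursive read by the owner neither blocks nor pins again */
            if (rt_mutex_owner(lock) != current) {
                    __rt_spin_lock(lock);
                    migrate_disable();
            }
            rwlock->read_depth++;
    }

    void rt_read_unlock(rwlock_t *rwlock)
    {
            /* only the final unlock releases the mutex and unpins the task */
            if (--rwlock->read_depth == 0) {
                    __rt_spin_unlock(&rwlock->lock);
                    migrate_enable();
            }
    }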
Tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
pushdown of migrate_disable/enable from write_*lock* to the rt_write_*lock*
API level

general mapping of write_*lock* to mutexes:

 write_*lock*
  `-> rt_write_*lock*
          `-> __spin_lock (the sleeping __spin_lock)
                  `-> rt_mutex

write_*lock*s are non-recursive so we have two lock chains to consider
 - write_trylock*/write_unlock
 - write_lock*/write_unlock
for both paths the migrate_disable/enable must be balanced.

write_trylock* mapping:

 write_trylock_irqsave -.
                        `-> rt_write_trylock_irqsave
 write_trylock ---------.        \
                        `--------> rt_write_trylock
                                      ret = rt_mutex_trylock
                                              rt_mutex_fasttrylock
                                                      rt_mutex_cmpxchg
                                      if (ret)
                                              migrate_disable

write_lock* mapping:

 write_lock_irqsave -.
                     `-> rt_write_lock_irqsave
 write_lock_irq -> write_lock ----.   \
 write_lock_bh -------------------+    \
                                  `-> rt_write_lock
                                          __rt_spin_lock()
                                                  rt_spin_lock_fastlock()
                                                          rt_mutex_cmpxchg()
                                          migrate_disable()

write_unlock* mapping:

 write_unlock_irqrestore --.
 write_unlock_bh ----------+
 write_unlock_irq -> write_unlock -----+
                                       `-> rt_write_unlock()
                                               __rt_spin_unlock()
                                                       rt_spin_lock_fastunlock()
                                                               rt_mutex_cmpxchg()
                                               migrate_enable()
So calls to migrate_disable/enable() are better placed at the rt_write_*
level of lock/trylock/unlock as all of the write_*lock* API has this as a
common path.
This approach to write_*_bh also eliminates the concerns raised with
regard to API imbalances (write_lock_bh -> write_unlock + local_bh_enable).
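A minimal sketch of the balanced writer side, following the chains above
(simplified; lockdep annotations omitted) - every successful lock or trylock
disables migration exactly once and every unlock re-enables it:

    /* Sketch only: rwlock_t is assumed to embed an rt_mutex 'lock' on RT. */
    int rt_write_trylock(rwlock_t *rwlock)
    {
            int ret = rt_mutex_trylock(&rwlock->lock);

            if (ret)                /* only a successful trylock pins the task */
                    migrate_disable();
            return ret;
    }

    void rt_write_lock(rwlock_t *rwlock)
    {
            __rt_spin_lock(&rwlock->lock);
            migrate_disable();
    }

    void rt_write_unlock(rwlock_t *rwlock)
    {
            __rt_spin_unlock(&rwlock->lock);
            migrate_enable();
    }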
Tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Map spinlocks, rwlocks, rw_semaphores and semaphores to the rt_mutex
based locking functions for preempt-rt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The processing of softirqs in irq thread context is a performance gain
for the non-rt workloads of a system, but it's counterproductive for
interrupts which are explicitly related to the realtime
workload. Allow such interrupts to prevent softirq processing in their
thread context.
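Hedged usage sketch - assuming the opt-out is exposed as a request flag named
IRQF_NO_SOFTIRQ_CALL (the flag name, device and handler here are assumptions;
only the usage pattern is illustrated):

    #include <linux/interrupt.h>

    static irqreturn_t my_rt_irq_thread(int irq, void *data)
    {
            /* realtime work; softirq processing is not piggybacked here */
            return IRQ_HANDLED;
    }

    static int my_rt_request_irq(unsigned int irq, void *data)
    {
            return request_threaded_irq(irq, NULL, my_rt_irq_thread,
                                        IRQF_ONESHOT | IRQF_NO_SOFTIRQ_CALL,
                                        "my-rt-dev", data);
    }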
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads,
and spinlocks turn into mutexes. But this can cause issues with
tasks disabling tasklets. A tasklet runs under ksoftirqd, and
if a tasklet is disabled with tasklet_disable(), the tasklet
count is increased. When a tasklet runs, it checks this counter
and if it is set, it adds itself back on the softirq queue and
returns.
The problem arises in RT because ksoftirqd will see that a softirq
is ready to run (the tasklet softirq just re-armed itself), and will
not sleep, but instead run the softirqs again. The tasklet softirq
will still see that the count is non-zero and will not execute
the tasklet, but requeue itself on the softirq again, which will
cause ksoftirqd to run it again and again and again.
It gets worse because ksoftirqd runs as a real-time thread.
If it preempted the task that disabled tasklets, and that task
has migration disabled, or can't run for other reasons, the tasklet
softirq will never run because the count will never be zero, and
ksoftirqd will go into an infinite loop. As ksoftirqd is an RT task,
this becomes a big problem.
This is a hack solution to have tasklet_disable() stop tasklets, and
when a tasklet runs, instead of requeueing the tasklet on the softirq,
it is delayed. When tasklet_enable() is called and tasklets are
waiting, tasklet_enable() will kick the tasklets to continue.
This prevents the lockup caused by ksoftirqd going into an infinite loop.
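A rough sketch of the enable-side kick as described (the TASKLET_STATE_PENDING
flag name and the exact deferral bookkeeping are assumptions, not the verbatim
patch):

    /* If the tasklet was deferred while disabled, kick it when the disable
     * count drops to zero instead of letting ksoftirqd spin on a softirq
     * that keeps re-arming itself. */
    static inline void tasklet_enable(struct tasklet_struct *t)
    {
            if (!atomic_dec_and_test(&t->count))
                    return;
            if (test_and_clear_bit(TASKLET_STATE_PENDING, &t->state))
                    tasklet_schedule(t);
    }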
[ rostedt@goodmis.org: ported to 3.0-rt ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
<NMI> [<ffffffff812dafd8>] spin_bug+0x94/0xa8
[<ffffffff812db07f>] do_raw_spin_lock+0x43/0xea
[<ffffffff814fa9be>] _raw_spin_lock_irqsave+0x6b/0x85
[<ffffffff8106ff9e>] ? migrate_disable+0x75/0x12d
[<ffffffff81078aaf>] ? pin_current_cpu+0x36/0xb0
[<ffffffff8106ff9e>] migrate_disable+0x75/0x12d
[<ffffffff81115b9d>] pagefault_disable+0xe/0x1f
[<ffffffff81047027>] copy_from_user_nmi+0x74/0xe6
[<ffffffff810489d7>] perf_callchain_user+0xf3/0x135
Now clearly we can't go around taking locks from NMI context; cure
this by short-circuiting migrate_disable() when we're already in an
atomic context.
Add some extra debugging to avoid things like:
preempt_disable()
migrate_disable();
preempt_enable();
migrate_enable();
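A minimal sketch of the cure (simplified; the per-task counters follow the
series, but the debug counter name is an assumption): bail out when already
atomic, and count those bail-outs so imbalances like the sequence above can
be detected.

    void migrate_disable(void)
    {
            struct task_struct *p = current;

            if (in_atomic()) {
    #ifdef CONFIG_SCHED_DEBUG
                    p->migrate_disable_atomic++;    /* assumed debug counter */
    #endif
                    return;         /* no locks from NMI/atomic context */
            }

            if (p->migrate_disable) {       /* nested: just bump the count */
                    p->migrate_disable++;
                    return;
            }

            preempt_disable();
            pin_current_cpu();              /* per-CPU hotplug pin */
            p->migrate_disable = 1;
            preempt_enable();
    }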
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314967297.1301.14.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-wbot4vsmwhi8vmbf83hsclk6@git.kernel.org
|
|
Make migrate_disable() be a preempt_disable() for !rt kernels. This
allows generic code to use it but still enforces that these code
sections stay relatively small.
A preemptible migrate_disable() accessible for general use would allow
people to grow arbitrary per-cpu crap instead of cleaning these things
up.
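A sketch of that mapping (assuming the usual CONFIG_PREEMPT_RT_FULL switch;
not the verbatim header):

    #ifdef CONFIG_PREEMPT_RT_FULL
    extern void migrate_disable(void);
    extern void migrate_enable(void);
    #else
    # define migrate_disable()      preempt_disable()
    # define migrate_enable()       preempt_enable()
    #endif

On !rt kernels this keeps the sections small by construction: anything that
would be too long under preempt_disable() shows up immediately.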
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-275i87sl8e1jcamtchmehonm@git.kernel.org
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
get_online_cpus() is a heavyweight function which involves a global
mutex. migrate_disable() wants a simpler construct which only prevents
a CPU from going down while a task is in a migrate-disabled section.
Implement a per-CPU lockless mechanism, which serializes only in the
real unplug case on a global mutex. That serialization affects only
tasks on the CPU which should be brought down.
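A simplified sketch of the per-CPU scheme (pin_current_cpu() matches the
symbol seen in the trace earlier; the structure layout and the real
retry/wakeup details are assumptions):

    /* Fast path: a per-CPU refcount increment. A task only serializes on the
     * per-CPU mutex when an unplug of its current CPU is in flight. */
    struct hotplug_pcp {
            struct task_struct      *unplug;        /* task tearing this CPU down */
            int                     refcount;       /* tasks pinned to this CPU */
            struct mutex            mutex;          /* taken only during unplug */
    };
    static DEFINE_PER_CPU(struct hotplug_pcp, hotplug_pcp);

    void pin_current_cpu(void)
    {
            struct hotplug_pcp *hp;

    again:
            hp = this_cpu_ptr(&hotplug_pcp);
            if (!hp->unplug || hp->refcount || hp->unplug == current) {
                    hp->refcount++;                 /* common case */
                    return;
            }
            /* wait for the unplug of this CPU to finish, then retry
             * (we may have been migrated to another CPU meanwhile) */
            preempt_enable();
            mutex_lock(&hp->mutex);
            mutex_unlock(&hp->mutex);
            preempt_disable();
            goto again;
    }

    void unpin_current_cpu(void)
    {
            struct hotplug_pcp *hp = this_cpu_ptr(&hotplug_pcp);

            hp->refcount--;
            if (!hp->refcount && hp->unplug && hp->unplug != current)
                    wake_up_process(hp->unplug);    /* let the unplugger proceed */
    }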
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Needs thread context (pgd_lock) -> ifdeffed. Workqueues won't work with
RT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The posix-cpu-timer code takes non-rt-safe locks in hard irq
context. Move it to a thread.
[ 3.0 fixes from Peter Zijlstra <peterz@infradead.org> ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In preempt-rt we cannot call the callbacks which take sleeping locks
from the timer interrupt context.
Bring back the softirq split for now, until we have fixed the signal
delivery problem for real.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Make cancellation of a running callback in softirq context safe
against preemption.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When softirqs can be preempted we need to make sure that cancelling
the timer from the active thread can not deadlock vs. a running timer
callback. Add a waitqueue to resolve that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The below is a boot-tested hack to shrink the page frame size back to
normal.
It should be a net win since there should be many fewer PTE pages than
page frames.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Drop the recursive calls to migrate_disable/enable for the local_*lock* API,
as reported by Steven Rostedt.

local_lock will call migrate_disable via get_local_var - the call tree is:

 get_locked_var
  `-> local_lock(lvar)
         `-> __local_lock(&get_local_var(lvar));
                `-> #define get_local_var(var) (*({    \
                            migrate_disable();         \
                            &__get_cpu_var(var); }))

thus there should be no need to call migrate_disable/enable recursively in
spin_try/lock/unlock. This patch adds a spin_trylock_local and replaces
the migration disabling calls with the local calls.

This patch is incomplete as it does not yet cover the _irq/_irqsave variants
with local locks. This patch requires the API cleanup in kernel/softirq.c or
it would break softirq_lock/unlock with respect to migration.
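A sketch of the resulting mapping, under the assumption above that
get_local_var() has already done the migrate_disable() (the exact macro set
in the patch may differ):

    /* local-lock internal ops: no recursive migrate_disable()/enable() here */
    # define spin_lock_local(lock)          rt_spin_lock(lock)
    # define spin_trylock_local(lock)       rt_spin_trylock(lock)
    # define spin_unlock_local(lock)        rt_spin_unlock(lock)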
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Joe Korty reported that __irq_set_affinity_locked() schedules a
workqueue while holding a raw lock, which results in a might_sleep()
warning.
This patch moves the invocation into process context so that we only
wakeup() a process while holding the lock.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
As per changes in include/linux/jbd_common.h for avoiding the
bit_spin_locks on RT ("fs: jbd/jbd2: Make state lock and journal
head lock rt safe") we do the same thing here.
We use the non atomic __set_bit and __clear_bit inside the scope of
the lock to preserve the ability of the existing LIST_DEBUG code to
use the zero'th bit in the sanity checks.
As a bit spinlock, we had no lockdep visibility into the usage
of the list head locking. Now, if we were to implement it as a
standard non-raw spinlock, we would see:
BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
in_atomic(): 1, irqs_disabled(): 0, pid: 122, name: udevd
5 locks held by udevd/122:
#0: (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [<ffffffff811967e8>] lock_rename+0xe8/0xf0
#1: (rename_lock){+.+...}, at: [<ffffffff811a277c>] d_move+0x2c/0x60
#2: (&dentry->d_lock){+.+...}, at: [<ffffffff811a0763>] dentry_lock_for_move+0xf3/0x130
#3: (&dentry->d_lock/2){+.+...}, at: [<ffffffff811a0734>] dentry_lock_for_move+0xc4/0x130
#4: (&dentry->d_lock/3){+.+...}, at: [<ffffffff811a0747>] dentry_lock_for_move+0xd7/0x130
Pid: 122, comm: udevd Not tainted 3.4.47-rt62 #7
Call Trace:
[<ffffffff810b9624>] __might_sleep+0x134/0x1f0
[<ffffffff817a24d4>] rt_spin_lock+0x24/0x60
[<ffffffff811a0c4c>] __d_shrink+0x5c/0xa0
[<ffffffff811a1b2d>] __d_drop+0x1d/0x40
[<ffffffff811a24be>] __d_move+0x8e/0x320
[<ffffffff811a278e>] d_move+0x3e/0x60
[<ffffffff81199598>] vfs_rename+0x198/0x4c0
[<ffffffff8119b093>] sys_renameat+0x213/0x240
[<ffffffff817a2de5>] ? _raw_spin_unlock+0x35/0x60
[<ffffffff8107781c>] ? do_page_fault+0x1ec/0x4b0
[<ffffffff817a32ca>] ? retint_swapgs+0xe/0x13
[<ffffffff813eb0e6>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8119b0db>] sys_rename+0x1b/0x20
[<ffffffff817a3b96>] system_call_fastpath+0x1a/0x1f
Since we are only taking the lock during short-lived list operations,
let's assume for now that it being raw won't be a significant latency
concern.
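A sketch of the helpers as described (assuming a raw_spinlock_t member is
added to struct hlist_bl_head on RT; bit 0 is still set/cleared non-atomically
inside the lock so the LIST_DEBUG checks keep working):

    static inline void hlist_bl_lock(struct hlist_bl_head *b)
    {
    #ifndef CONFIG_PREEMPT_RT_BASE
            bit_spin_lock(0, (unsigned long *)b);
    #else
            raw_spin_lock(&b->lock);                /* assumed RT-only member */
            __set_bit(0, (unsigned long *)b);
    #endif
    }

    static inline void hlist_bl_unlock(struct hlist_bl_head *b)
    {
    #ifndef CONFIG_PREEMPT_RT_BASE
            __bit_spin_unlock(0, (unsigned long *)b);
    #else
            __clear_bit(0, (unsigned long *)b);
            raw_spin_unlock(&b->lock);
    #endif
    }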
Cc: stable-rt@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
bit_spin_locks break under RT.
Based on a previous patch from Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
--
include/linux/buffer_head.h | 10 ++++++++++
include/linux/jbd_common.h | 24 ++++++++++++++++++++++++
2 files changed, 34 insertions(+)
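A hedged sketch of the jbd_common.h side of such a conversion (assuming a
spinlock_t is added next to the buffer_head state bits on RT; not the verbatim
patch):

    static inline void jbd_lock_bh_state(struct buffer_head *bh)
    {
    #ifndef CONFIG_PREEMPT_RT_BASE
            bit_spin_lock(BH_State, &bh->b_state);
    #else
            spin_lock(&bh->b_state_lock);           /* assumed RT-only member */
    #endif
    }

    static inline void jbd_unlock_bh_state(struct buffer_head *bh)
    {
    #ifndef CONFIG_PREEMPT_RT_BASE
            bit_spin_unlock(BH_State, &bh->b_state);
    #else
            spin_unlock(&bh->b_state_lock);
    #endif
    }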
|
|
Wrap the bit_spin_lock calls into a separate inline and add the RT
replacements with a real spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Bit spinlocks are not working on RT. Replace them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
RT needs a few preempt_disable/enable points which are not necessary
otherwise. Implement variants to avoid #ifdeffery.
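A sketch of such variants (names assumed, following the _nort/_rt convention
used elsewhere in the series): each pair compiles to real preempt operations
on exactly one configuration and to a barrier on the other, so callers need
no #ifdefs.

    #ifdef CONFIG_PREEMPT_RT_BASE
    # define preempt_disable_rt()           preempt_disable()
    # define preempt_enable_rt()            preempt_enable()
    # define preempt_disable_nort()         barrier()
    # define preempt_enable_nort()          barrier()
    #else
    # define preempt_disable_rt()           barrier()
    # define preempt_enable_rt()            barrier()
    # define preempt_disable_nort()         preempt_disable()
    # define preempt_enable_nort()          preempt_enable()
    #endif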
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add local_irq_*_(no)rt variants, which are mainly used to break
interrupt-disabled sections on PREEMPT_RT or to explicitly disable
interrupts on PREEMPT_RT.
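A sketch under the same naming assumption - the _nort forms are no-ops on
PREEMPT_RT (breaking the interrupt-disabled section there), while the _rt
forms disable interrupts only on PREEMPT_RT:

    #ifdef CONFIG_PREEMPT_RT_BASE
    # define local_irq_disable_nort()       do { } while (0)
    # define local_irq_enable_nort()        do { } while (0)
    # define local_irq_disable_rt()         local_irq_disable()
    # define local_irq_enable_rt()          local_irq_enable()
    #else
    # define local_irq_disable_nort()       local_irq_disable()
    # define local_irq_enable_nort()        local_irq_enable()
    # define local_irq_disable_rt()         do { } while (0)
    # define local_irq_enable_rt()          do { } while (0)
    #endif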
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|