Age | Commit message (Collapse) | Author |
|
Signed-off-by: Scott Wood <scottwood@freescale.com>
Conflicts:
arch/arm/kvm/mmu.c
arch/arm/mm/proc-v7-3level.S
arch/powerpc/kernel/vdso32/getcpu.S
drivers/crypto/caam/error.c
drivers/crypto/caam/sg_sw_sec4.h
drivers/usb/host/ehci-fsl.c
|
|
raise_softirq_irqoff() disables interrupts and wakes the softirq
daemon, but after reenabling interrupts there is no preemption check,
so the execution of the softirq thread might be delayed arbitrarily.
In principle we could add that check to local_irq_enable/restore, but
that's overkill as the rasie_softirq_irqoff() sections are the only
ones which show this behaviour.
Reported-by: Carsten Emde <cbe@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
On RT write_seqcount_begin() disables preemption and device_rename()
allocates memory with GFP_KERNEL and grabs later the sysfs_mutex
mutex. Serialize with a mutex and add use the non preemption disabling
__write_seqcount_begin().
To avoid writer starvation, let the reader grab the mutex and release
it when it detects a writer in progress. This keeps the normal case
(no reader on the fly) fast.
[ tglx: Instead of replacing the seqcount by a mutex, add the mutex ]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This code triggers the new WARN in __raise_softirq_irqsoff() though it
actually looks at the softirq pending bit and calls into the softirq
code, but that fits not well with the context related softirq model of
RT. It's correct on mainline though, but going through
local_bh_disable/enable here is not going to hurt badly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The netfilter code relies only on the implicit semantics of
local_bh_disable() for serializing wt_write_recseq sections. RT breaks
that and needs explicit serialization here.
Reported-by: Peter LaDow <petela@gocougs.wsu.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
in response to the oops in ip_output.c:ip_send_unicast_reply under high
network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen
<Sami.Pietikainen@wapice.com>, this patch adds local serialization in
ip_send_unicast_reply.
from ip_output.c:
/*
* Generic function to send a packet as reply to another packet.
* Used to send some TCP resets/acks so far.
*
* Use a fake percpu inet socket to avoid false sharing and contention.
*/
static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
...
which was added in commit be9f4a44 in linux-stable. The git log, wich
introduced the PER_CPU unicast_sock, states:
<snip>
commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Jul 19 07:34:03 2012 +0000
ipv4: tcp: remove per net tcp_sock
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.
This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.
To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)
<snip>
The per-cpu here thus is assuming exclusivity serializing per cpu - so
the use of get_cpu_ligh introduced in
net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the
preempt_disable in favor of a migrate_disable is probably wrong as this
only handles the referencial consistency but not the serialization. To
evade a preempt_disable here a local lock would be needed.
Therapie:
* add local lock:
* and re-introduce local serialization:
Tested on x86 with high network load using the testcase from Sami Pietikainen
while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done
Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html
Cc: stable-rt@vger.kernel.org
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Replace it by a local lock. Though that's pretty inefficient :(
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
1)enqueue_to_backlog() (called from netif_rx) should be
bind to a particluar CPU. This can be achieved by
disabling migration. No need to disable preemption
2)Fixes crash "BUG: scheduling while atomic: ksoftirqd"
in case of RT.
If preemption is disabled, enqueue_to_backog() is called
in atomic context. And if backlog exceeds its count,
kfree_skb() is called. But in RT, kfree_skb() might
gets scheduled out, so it expects non atomic context.
3)When CONFIG_PREEMPT_RT_FULL is not defined,
migrate_enable(), migrate_disable() maps to
preempt_enable() and preempt_disable(), so no
change in functionality in case of non-RT.
-Replace preempt_enable(), preempt_disable() with
migrate_enable(), migrate_disable() respectively
-Replace get_cpu(), put_cpu() with get_cpu_light(),
put_cpu_light() respectively
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Acked-by: Rajan Srivastava <Rajan.Srivastava@freescale.com>
Cc: <rostedt@goodmis.orgn>
Link: http://lkml.kernel.org/r/1337227511-2271-1-git-send-email-Priyanka.Jain@freescale.com
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
There are (probably rare) situations when a system crashed and the system
console becomes unresponsive but the network icmp layer still is alive.
Wouldn't it be wonderful, if we then could submit a sysreq command via ping?
This patch provides this facility. Please consult the updated documentation
Documentation/sysrq.txt for details.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
|
|
qdisc_lock is taken w/o disabling interrupts or bottom halfs. So code
holding a qdisc_lock() can be interrupted and softirqs can run on the
return of interrupt in !RT.
The spin_trylock() in net_tx_action() makes sure, that the softirq
does not deadlock. When the lock can't be acquired q is requeued and
the NET_TX softirq is raised. That causes the softirq to run over and
over.
That works in mainline as do_softirq() has a retry loop limit and
leaves the softirq processing in the interrupt return path and
schedules ksoftirqd. The task which holds qdisc_lock cannot be
preempted, so the lock is released and either ksoftirqd or the next
softirq in the return from interrupt path can proceed. Though it's a
bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it
decides to bail out even if it's clear in the first iteration :)
On RT all softirq processing is done in a FIFO thread and we don't
have a loop limit, so ksoftirqd preempts the lock holder forever and
unqueues and requeues until the reset button is hit.
Due to the forced threading of ksoftirqd on RT we actually cannot
deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to
replace the spin_trylock() with a spin_lock(). When contended,
ksoftirqd is scheduled out and the lock holder can proceed.
[ tglx: Massaged changelog and code comments ]
Solved-by: Thomas Gleixner <tglx@linuxtronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Carsten Emde <cbe@osadl.org>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Retry loops on RT might loop forever when the modifying side was
preempted. Use cpu_chill() instead of cpu_relax() to let the system
make progress.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
for outstanding qdisc_run calls
On PREEMPT_RT enabled systems the interrupt handler run as threads at prio 50
(by default). If a high priority userspace process tries to shut down a busy
network interface it might spin in a yield loop waiting for the device to
become idle. With the interrupt thread having a lower priority than the
looping process it might never be scheduled and so result in a deadlock on UP
systems.
With Magic SysRq the following backtrace can be produced:
> test_app R running 0 174 168 0x00000000
> [<c02c7070>] (__schedule+0x220/0x3fc) from [<c02c7870>] (preempt_schedule_irq+0x48/0x80)
> [<c02c7870>] (preempt_schedule_irq+0x48/0x80) from [<c0008fa8>] (svc_preempt+0x8/0x20)
> [<c0008fa8>] (svc_preempt+0x8/0x20) from [<c001a984>] (local_bh_enable+0x18/0x88)
> [<c001a984>] (local_bh_enable+0x18/0x88) from [<c025316c>] (dev_deactivate_many+0x220/0x264)
> [<c025316c>] (dev_deactivate_many+0x220/0x264) from [<c023be04>] (__dev_close_many+0x64/0xd4)
> [<c023be04>] (__dev_close_many+0x64/0xd4) from [<c023be9c>] (__dev_close+0x28/0x3c)
> [<c023be9c>] (__dev_close+0x28/0x3c) from [<c023f7f0>] (__dev_change_flags+0x88/0x130)
> [<c023f7f0>] (__dev_change_flags+0x88/0x130) from [<c023f904>] (dev_change_flags+0x10/0x48)
> [<c023f904>] (dev_change_flags+0x10/0x48) from [<c024c140>] (do_setlink+0x370/0x7ec)
> [<c024c140>] (do_setlink+0x370/0x7ec) from [<c024d2f0>] (rtnl_newlink+0x2b4/0x450)
> [<c024d2f0>] (rtnl_newlink+0x2b4/0x450) from [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4)
> [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4) from [<c0256740>] (netlink_rcv_skb+0xac/0xc0)
> [<c0256740>] (netlink_rcv_skb+0xac/0xc0) from [<c024bbd8>] (rtnetlink_rcv+0x18/0x24)
> [<c024bbd8>] (rtnetlink_rcv+0x18/0x24) from [<c02561b8>] (netlink_unicast+0x13c/0x198)
> [<c02561b8>] (netlink_unicast+0x13c/0x198) from [<c025651c>] (netlink_sendmsg+0x264/0x2e0)
> [<c025651c>] (netlink_sendmsg+0x264/0x2e0) from [<c022af98>] (sock_sendmsg+0x78/0x98)
> [<c022af98>] (sock_sendmsg+0x78/0x98) from [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278)
> [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278) from [<c022cf08>] (__sys_sendmsg+0x48/0x78)
> [<c022cf08>] (__sys_sendmsg+0x48/0x78) from [<c0009320>] (ret_fast_syscall+0x0/0x2c)
This patch works around the problem by replacing yield() by msleep(1), giving
the interrupt thread time to finish, similar to other changes contained in the
rt patch set. Using wait_for_completion() instead would probably be a better
solution.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.0-rc3+ #26
-------------------------------------------------------
ip/1104 is trying to acquire lock:
(local_softirq_lock){+.+...}, at: [<ffffffff81056d12>] __local_lock+0x25/0x68
but task is already holding lock:
(sk_lock-AF_INET){+.+...}, at: [<ffffffff81433308>] lock_sock+0x10/0x12
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (sk_lock-AF_INET){+.+...}:
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff813e2781>] lock_sock_nested+0x82/0x92
[<ffffffff81433308>] lock_sock+0x10/0x12
[<ffffffff81433afa>] tcp_close+0x1b/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
-> #0 (local_softirq_lock){+.+...}:
[<ffffffff81082ecc>] __lock_acquire+0xacc/0xdc8
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff814a7e40>] _raw_spin_lock+0x3b/0x4a
[<ffffffff81056d12>] __local_lock+0x25/0x68
[<ffffffff81056d8b>] local_bh_disable+0x36/0x3b
[<ffffffff814a7fc4>] _raw_write_lock_bh+0x16/0x4f
[<ffffffff81433c38>] tcp_close+0x159/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_INET);
lock(local_softirq_lock);
lock(sk_lock-AF_INET);
lock(local_softirq_lock);
*** DEADLOCK ***
1 lock held by ip/1104:
#0: (sk_lock-AF_INET){+.+...}, at: [<ffffffff81433308>] lock_sock+0x10/0x12
stack backtrace:
Pid: 1104, comm: ip Not tainted 3.0.0-rc3+ #26
Call Trace:
[<ffffffff81081649>] print_circular_bug+0x1f8/0x209
[<ffffffff81082ecc>] __lock_acquire+0xacc/0xdc8
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff81046c75>] ? get_parent_ip+0x11/0x41
[<ffffffff814a7e40>] _raw_spin_lock+0x3b/0x4a
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff81046c8c>] ? get_parent_ip+0x28/0x41
[<ffffffff81056d12>] __local_lock+0x25/0x68
[<ffffffff81056d8b>] local_bh_disable+0x36/0x3b
[<ffffffff81433308>] ? lock_sock+0x10/0x12
[<ffffffff814a7fc4>] _raw_write_lock_bh+0x16/0x4f
[<ffffffff81433c38>] tcp_close+0x159/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
This reverts commit c9b7e8bcef1fb9476e4198c66bc066d1a22acb24.
Horia is right - this issue must be fixed in userspace to avoid inter-kernel version compatibility (e.g., this now makes the SDK kernel not able to run IPSec SHA2 with any other kernel). If the tools you are using don't support it, either use a tool that does, or fix the tool itself.
Change-Id: I48db232b1e55bc5bd9957667b88fe27e66114546
Reviewed-on: http://git.am.freescale.net:8181/12825
Reviewed-by: Scott Wood <scottwood@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
Tested-by: Jose Rivera <German.Rivera@freescale.com>
|
|
As per RFC 4868,the ICV length of SHA2 should be 128 bits.
Signed-off-by: Biao Cao <B32719@freescale.com>
Signed-off-by: Ganga Negi <ganga.negi@freescale.com>
Change-Id: I54ee3eef1c6c9b3c5abe678a24126ddc46082397
Reviewed-on: http://git.am.freescale.net:8181/12814
Reviewed-by: Hemant Agrawal <hemant@freescale.com>
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
|
|
In the case when KMs have no listeners, km_query() will fail and
temporary SAs are garbage collected immediately after their allocation.
This causes strain on memory allocation, leading even to OOM since
temporary SA alloc/free cycle is performed for every packet
and garbage collection does not keep up the pace.
The sane thing to do is to make sure we have audience before
temporary SA allocation.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Change-Id: I241fdd8133f4974aaf14a9bc12be089aaa1730ae
Reviewed-on: http://git.am.freescale.net:8181/12383
Reviewed-by: Marian Cristian Rotariu <marian.rotariu@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
Tested-by: Jose Rivera <German.Rivera@freescale.com>
|
|
raise_softirq_irqoff() disables interrupts and wakes the softirq
daemon, but after reenabling interrupts there is no preemption check,
so the execution of the softirq thread might be delayed arbitrarily.
In principle we could add that check to local_irq_enable/restore, but
that's overkill as the rasie_softirq_irqoff() sections are the only
ones which show this behaviour.
Reported-by: Carsten Emde <cbe@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
On RT write_seqcount_begin() disables preemption and device_rename()
allocates memory with GFP_KERNEL and grabs later the sysfs_mutex
mutex. Serialize with a mutex and add use the non preemption disabling
__write_seqcount_begin().
To avoid writer starvation, let the reader grab the mutex and release
it when it detects a writer in progress. This keeps the normal case
(no reader on the fly) fast.
[ tglx: Instead of replacing the seqcount by a mutex, add the mutex ]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This code triggers the new WARN in __raise_softirq_irqsoff() though it
actually looks at the softirq pending bit and calls into the softirq
code, but that fits not well with the context related softirq model of
RT. It's correct on mainline though, but going through
local_bh_disable/enable here is not going to hurt badly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The netfilter code relies only on the implicit semantics of
local_bh_disable() for serializing wt_write_recseq sections. RT breaks
that and needs explicit serialization here.
Reported-by: Peter LaDow <petela@gocougs.wsu.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
in response to the oops in ip_output.c:ip_send_unicast_reply under high
network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen
<Sami.Pietikainen@wapice.com>, this patch adds local serialization in
ip_send_unicast_reply.
from ip_output.c:
/*
* Generic function to send a packet as reply to another packet.
* Used to send some TCP resets/acks so far.
*
* Use a fake percpu inet socket to avoid false sharing and contention.
*/
static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
...
which was added in commit be9f4a44 in linux-stable. The git log, wich
introduced the PER_CPU unicast_sock, states:
<snip>
commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Jul 19 07:34:03 2012 +0000
ipv4: tcp: remove per net tcp_sock
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.
This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.
To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)
<snip>
The per-cpu here thus is assuming exclusivity serializing per cpu - so
the use of get_cpu_ligh introduced in
net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the
preempt_disable in favor of a migrate_disable is probably wrong as this
only handles the referencial consistency but not the serialization. To
evade a preempt_disable here a local lock would be needed.
Therapie:
* add local lock:
* and re-introduce local serialization:
Tested on x86 with high network load using the testcase from Sami Pietikainen
while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done
Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html
Cc: stable-rt@vger.kernel.org
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Replace it by a local lock. Though that's pretty inefficient :(
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
1)enqueue_to_backlog() (called from netif_rx) should be
bind to a particluar CPU. This can be achieved by
disabling migration. No need to disable preemption
2)Fixes crash "BUG: scheduling while atomic: ksoftirqd"
in case of RT.
If preemption is disabled, enqueue_to_backog() is called
in atomic context. And if backlog exceeds its count,
kfree_skb() is called. But in RT, kfree_skb() might
gets scheduled out, so it expects non atomic context.
3)When CONFIG_PREEMPT_RT_FULL is not defined,
migrate_enable(), migrate_disable() maps to
preempt_enable() and preempt_disable(), so no
change in functionality in case of non-RT.
-Replace preempt_enable(), preempt_disable() with
migrate_enable(), migrate_disable() respectively
-Replace get_cpu(), put_cpu() with get_cpu_light(),
put_cpu_light() respectively
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Acked-by: Rajan Srivastava <Rajan.Srivastava@freescale.com>
Cc: <rostedt@goodmis.orgn>
Link: http://lkml.kernel.org/r/1337227511-2271-1-git-send-email-Priyanka.Jain@freescale.com
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
There are (probably rare) situations when a system crashed and the system
console becomes unresponsive but the network icmp layer still is alive.
Wouldn't it be wonderful, if we then could submit a sysreq command via ping?
This patch provides this facility. Please consult the updated documentation
Documentation/sysrq.txt for details.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
|
|
qdisc_lock is taken w/o disabling interrupts or bottom halfs. So code
holding a qdisc_lock() can be interrupted and softirqs can run on the
return of interrupt in !RT.
The spin_trylock() in net_tx_action() makes sure, that the softirq
does not deadlock. When the lock can't be acquired q is requeued and
the NET_TX softirq is raised. That causes the softirq to run over and
over.
That works in mainline as do_softirq() has a retry loop limit and
leaves the softirq processing in the interrupt return path and
schedules ksoftirqd. The task which holds qdisc_lock cannot be
preempted, so the lock is released and either ksoftirqd or the next
softirq in the return from interrupt path can proceed. Though it's a
bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it
decides to bail out even if it's clear in the first iteration :)
On RT all softirq processing is done in a FIFO thread and we don't
have a loop limit, so ksoftirqd preempts the lock holder forever and
unqueues and requeues until the reset button is hit.
Due to the forced threading of ksoftirqd on RT we actually cannot
deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to
replace the spin_trylock() with a spin_lock(). When contended,
ksoftirqd is scheduled out and the lock holder can proceed.
[ tglx: Massaged changelog and code comments ]
Solved-by: Thomas Gleixner <tglx@linuxtronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Carsten Emde <cbe@osadl.org>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Retry loops on RT might loop forever when the modifying side was
preempted. Use cpu_chill() instead of cpu_relax() to let the system
make progress.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
for outstanding qdisc_run calls
On PREEMPT_RT enabled systems the interrupt handler run as threads at prio 50
(by default). If a high priority userspace process tries to shut down a busy
network interface it might spin in a yield loop waiting for the device to
become idle. With the interrupt thread having a lower priority than the
looping process it might never be scheduled and so result in a deadlock on UP
systems.
With Magic SysRq the following backtrace can be produced:
> test_app R running 0 174 168 0x00000000
> [<c02c7070>] (__schedule+0x220/0x3fc) from [<c02c7870>] (preempt_schedule_irq+0x48/0x80)
> [<c02c7870>] (preempt_schedule_irq+0x48/0x80) from [<c0008fa8>] (svc_preempt+0x8/0x20)
> [<c0008fa8>] (svc_preempt+0x8/0x20) from [<c001a984>] (local_bh_enable+0x18/0x88)
> [<c001a984>] (local_bh_enable+0x18/0x88) from [<c025316c>] (dev_deactivate_many+0x220/0x264)
> [<c025316c>] (dev_deactivate_many+0x220/0x264) from [<c023be04>] (__dev_close_many+0x64/0xd4)
> [<c023be04>] (__dev_close_many+0x64/0xd4) from [<c023be9c>] (__dev_close+0x28/0x3c)
> [<c023be9c>] (__dev_close+0x28/0x3c) from [<c023f7f0>] (__dev_change_flags+0x88/0x130)
> [<c023f7f0>] (__dev_change_flags+0x88/0x130) from [<c023f904>] (dev_change_flags+0x10/0x48)
> [<c023f904>] (dev_change_flags+0x10/0x48) from [<c024c140>] (do_setlink+0x370/0x7ec)
> [<c024c140>] (do_setlink+0x370/0x7ec) from [<c024d2f0>] (rtnl_newlink+0x2b4/0x450)
> [<c024d2f0>] (rtnl_newlink+0x2b4/0x450) from [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4)
> [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4) from [<c0256740>] (netlink_rcv_skb+0xac/0xc0)
> [<c0256740>] (netlink_rcv_skb+0xac/0xc0) from [<c024bbd8>] (rtnetlink_rcv+0x18/0x24)
> [<c024bbd8>] (rtnetlink_rcv+0x18/0x24) from [<c02561b8>] (netlink_unicast+0x13c/0x198)
> [<c02561b8>] (netlink_unicast+0x13c/0x198) from [<c025651c>] (netlink_sendmsg+0x264/0x2e0)
> [<c025651c>] (netlink_sendmsg+0x264/0x2e0) from [<c022af98>] (sock_sendmsg+0x78/0x98)
> [<c022af98>] (sock_sendmsg+0x78/0x98) from [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278)
> [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278) from [<c022cf08>] (__sys_sendmsg+0x48/0x78)
> [<c022cf08>] (__sys_sendmsg+0x48/0x78) from [<c0009320>] (ret_fast_syscall+0x0/0x2c)
This patch works around the problem by replacing yield() by msleep(1), giving
the interrupt thread time to finish, similar to other changes contained in the
rt patch set. Using wait_for_completion() instead would probably be a better
solution.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.0-rc3+ #26
-------------------------------------------------------
ip/1104 is trying to acquire lock:
(local_softirq_lock){+.+...}, at: [<ffffffff81056d12>] __local_lock+0x25/0x68
but task is already holding lock:
(sk_lock-AF_INET){+.+...}, at: [<ffffffff81433308>] lock_sock+0x10/0x12
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (sk_lock-AF_INET){+.+...}:
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff813e2781>] lock_sock_nested+0x82/0x92
[<ffffffff81433308>] lock_sock+0x10/0x12
[<ffffffff81433afa>] tcp_close+0x1b/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
-> #0 (local_softirq_lock){+.+...}:
[<ffffffff81082ecc>] __lock_acquire+0xacc/0xdc8
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff814a7e40>] _raw_spin_lock+0x3b/0x4a
[<ffffffff81056d12>] __local_lock+0x25/0x68
[<ffffffff81056d8b>] local_bh_disable+0x36/0x3b
[<ffffffff814a7fc4>] _raw_write_lock_bh+0x16/0x4f
[<ffffffff81433c38>] tcp_close+0x159/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_INET);
lock(local_softirq_lock);
lock(sk_lock-AF_INET);
lock(local_softirq_lock);
*** DEADLOCK ***
1 lock held by ip/1104:
#0: (sk_lock-AF_INET){+.+...}, at: [<ffffffff81433308>] lock_sock+0x10/0x12
stack backtrace:
Pid: 1104, comm: ip Not tainted 3.0.0-rc3+ #26
Call Trace:
[<ffffffff81081649>] print_circular_bug+0x1f8/0x209
[<ffffffff81082ecc>] __lock_acquire+0xacc/0xdc8
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff810836e5>] lock_acquire+0x103/0x12e
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff81046c75>] ? get_parent_ip+0x11/0x41
[<ffffffff814a7e40>] _raw_spin_lock+0x3b/0x4a
[<ffffffff81056d12>] ? __local_lock+0x25/0x68
[<ffffffff81046c8c>] ? get_parent_ip+0x28/0x41
[<ffffffff81056d12>] __local_lock+0x25/0x68
[<ffffffff81056d8b>] local_bh_disable+0x36/0x3b
[<ffffffff81433308>] ? lock_sock+0x10/0x12
[<ffffffff814a7fc4>] _raw_write_lock_bh+0x16/0x4f
[<ffffffff81433c38>] tcp_close+0x159/0x355
[<ffffffff81453c99>] inet_release+0xc3/0xcd
[<ffffffff813dff3f>] sock_release+0x1f/0x74
[<ffffffff813dffbb>] sock_close+0x27/0x2b
[<ffffffff81129c63>] fput+0x11d/0x1e3
[<ffffffff81126577>] filp_close+0x70/0x7b
[<ffffffff8112667a>] sys_close+0xf8/0x13d
[<ffffffff814ae882>] system_call_fastpath+0x16/0x1b
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
Signed-off-by: Scott Wood <scottwood@freescale.com>
Conflicts:
arch/sparc/Kconfig
drivers/tty/tty_buffer.c
|
|
The csum_tcpudp_magic argument requires a __be32 type while
‘union inet_addr’ type was passed in.
Signed-off-by: Mingkai Hu <Mingkai.Hu@freescale.com>
Change-Id: I3406f7b997d31b516df43f873908886fcc91c06b
Reviewed-on: http://git.am.freescale.net:8181/11809
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Tiefei Zang <tie-fei.zang@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
|
|
commit 5981a8821b774ada0be512fd9bad7c241e17657e upstream.
This patch fixes authentication failure on LE link re-connection when
BlueZ acts as slave (peripheral). LTK is removed from the internal list
after its first use causing PIN or Key missing reply when re-connecting
the link. The LE Long Term Key Request event indicates that the master
is attempting to encrypt or re-encrypt the link.
Pre-condition: BlueZ host paired and running as slave.
How to reproduce(master):
1) Establish an ACL LE encrypted link
2) Disconnect the link
3) Try to re-establish the ACL LE encrypted link (fails)
> HCI Event: LE Meta Event (0x3e) plen 19
LE Connection Complete (0x01)
Status: Success (0x00)
Handle: 64
Role: Slave (0x01)
...
@ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000
> HCI Event: LE Meta Event (0x3e) plen 13
LE Long Term Key Request (0x05)
Handle: 64
Random number: 875be18439d9aa37
Encryption diversifier: 0x76ed
< HCI Command: LE Long Term Key Request Reply (0x08|0x001a) plen 18
Handle: 64
Long term key: 2aa531db2fce9f00a0569c7d23d17409
> HCI Event: Command Complete (0x0e) plen 6
LE Long Term Key Request Reply (0x08|0x001a) ncmd 1
Status: Success (0x00)
Handle: 64
> HCI Event: Encryption Change (0x08) plen 4
Status: Success (0x00)
Handle: 64
Encryption: Enabled with AES-CCM (0x01)
...
@ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 3
< HCI Command: LE Set Advertise Enable (0x08|0x000a) plen 1
Advertising: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4
LE Set Advertise Enable (0x08|0x000a) ncmd 1
Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 19
LE Connection Complete (0x01)
Status: Success (0x00)
Handle: 64
Role: Slave (0x01)
...
@ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000
> HCI Event: LE Meta Event (0x3e) plen 13
LE Long Term Key Request (0x05)
Handle: 64
Random number: 875be18439d9aa37
Encryption diversifier: 0x76ed
< HCI Command: LE Long Term Key Request Neg Reply (0x08|0x001b) plen 2
Handle: 64
> HCI Event: Command Complete (0x0e) plen 6
LE Long Term Key Request Neg Reply (0x08|0x001b) ncmd 1
Status: Success (0x00)
Handle: 64
> HCI Event: Disconnect Complete (0x05) plen 4
Status: Success (0x00)
Handle: 64
Reason: Authentication Failure (0x05)
@ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 0
Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit b04c46190219a4f845e46a459e3102137b7f6cac upstream.
Plug a group_info refcount leak in ping_init.
group_info is only needed during initialization and
the code failed to release the reference on exit.
While here move grabbing the reference to a place
where it is actually needed.
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit b07c26511e94ab856f3700c56d582c0da36d5b4d upstream.
The combination of two commits:
commit 8e4e1713e4
("openvswitch: Simplify datapath locking.")
commit 2537b4dd0a
("openvswitch:: link upper device for port devices")
introduced a bug where upper_dev wasn't unlinked upon
netdev_unregister notification
The following steps:
modprobe openvswitch
ovs-dpctl add-dp test
ip tuntap add dev tap1 mode tap
ovs-dpctl add-if test tap1
ip tuntap del dev tap1 mode tap
are causing multiple warnings:
[ 62.747557] gre: GRE over IPv4 demultiplexor driver
[ 62.749579] openvswitch: Open vSwitch switching datapath
[ 62.755087] device test entered promiscuous mode
[ 62.765911] device tap1 entered promiscuous mode
[ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready
[ 62.769017] ------------[ cut here ]------------
[ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240()
[ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60
[ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006
[ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58
[ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8
[ 62.769059] Call Trace:
[ 62.769062] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769065] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769067] [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20
[ 62.769069] [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240
[ 62.769071] [<ffffffff8162a101>] rollback_registered+0x31/0x40
[ 62.769073] [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90
[ 62.769075] [<ffffffff8154f900>] __tun_detach+0x140/0x340
[ 62.769077] [<ffffffff8154fb36>] tun_chr_close+0x36/0x60
[ 62.769080] [<ffffffff811bddaf>] __fput+0xff/0x260
[ 62.769082] [<ffffffff811bdf5e>] ____fput+0xe/0x10
[ 62.769084] [<ffffffff8107b515>] task_work_run+0xb5/0xe0
[ 62.769087] [<ffffffff810029b9>] do_notify_resume+0x59/0x80
[ 62.769089] [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 62.769091] [<ffffffff81770f5a>] int_signal+0x12/0x17
[ 62.769093] ---[ end trace 838756c62e156ffb ]---
[ 62.769481] ------------[ cut here ]------------
[ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[ 62.769486] sysfs: can not remove 'master', no directory
[ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
[ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch]
[ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28
[ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500
[ 62.769525] Call Trace:
[ 62.769528] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769529] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769531] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[ 62.769533] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[ 62.769535] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[ 62.769538] [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150
[ 62.769540] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[ 62.769542] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[ 62.769544] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[ 62.769548] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[ 62.769550] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[ 62.769552] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[ 62.769555] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[ 62.769557] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[ 62.769559] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[ 62.769562] [<ffffffff8107659b>] worker_thread+0x11b/0x370
[ 62.769564] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[ 62.769566] [<ffffffff8107f44a>] kthread+0xea/0xf0
[ 62.769568] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769570] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[ 62.769572] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769573] ---[ end trace 838756c62e156ffc ]---
[ 62.769574] ------------[ cut here ]------------
[ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[ 62.769577] sysfs: can not remove 'upper_test', no directory
[ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
[ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch]
[ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58
[ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500
[ 62.769613] Call Trace:
[ 62.769615] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769617] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769619] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[ 62.769621] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[ 62.769622] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[ 62.769624] [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150
[ 62.769627] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[ 62.769629] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[ 62.769631] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[ 62.769633] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[ 62.769636] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[ 62.769638] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[ 62.769640] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[ 62.769642] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[ 62.769644] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[ 62.769646] [<ffffffff8107659b>] worker_thread+0x11b/0x370
[ 62.769648] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[ 62.769650] [<ffffffff8107f44a>] kthread+0xea/0xf0
[ 62.769652] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769654] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[ 62.769656] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769657] ---[ end trace 838756c62e156ffd ]---
[ 62.769724] device tap1 left promiscuous mode
This patch also affects moving devices between net namespaces.
OVS used to ignore netns move notifications which caused problems.
Like:
ovs-dpctl add-if test tap1
ip link set tap1 netns 3512
and then removing tap1 inside the namespace will cause hang on missing dev_put.
With this patch OVS will detach dev upon receiving netns move event.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit bf39b4247b8799935ea91d90db250ab608a58e50 ]
Binding might result in a NULL device which is later dereferenced
without checking.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit 43a43b6040165f7b40b5b489fe61a4cb7f8c4980 ]
After commit c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify
processing to workqueue") some counters are now updated in process context
and thus need to disable bh before doing so, otherwise deadlocks can
happen on 32-bit archs. Fabio Estevam noticed this while while mounting
a NFS volume on an ARM board.
As a compensation for missing this I looked after the other *_STATS_BH
and found three other calls which need updating:
1) icmp6_send: ip6_fragment -> icmpv6_send -> icmp6_send (error handling)
2) ip6_push_pending_frames: rawv6_sendmsg -> rawv6_push_pending_frames -> ...
(only in case of icmp protocol with raw sockets in error handling)
3) ping6_v6_sendmsg (error handling)
Fixes: c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify processing to workqueue")
Reported-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit fc0d48b8fb449ca007b2057328abf736cb516168 ]
Currently, if the card supports CTAG acceleration we do not
account for the vlan header even if we are configuring an
8021AD vlan. This may not be best since we'll do software
tagging for 8021AD which will cause data copy on skb head expansion
Configure the length based on available hw offload capabilities and
vlan protocol.
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit fbd02dd405d0724a0f25897ed4a6813297c9b96f ]
Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due
to skb->_skb_refdst NULL pointer) removed dst-drop call from
ip-tunnel-recv.
Following commit reintroduce dst-drop and fix the original bug by
checking loopback packet before releasing dst.
Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit a5d0e7c037119484a7006b883618bfa87996cb41 ]
If a topology event subscription fails for any reason, such as out
of memory, max number reached or because we received an invalid
request the correct behavior is to terminate the subscribers
connection to the topology server. This is currently broken and
produces the following oops:
[27.953662] tipc: Subscription rejected, illegal request
[27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
[27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
[27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
[27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
[27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
[27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
[27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
[27.964023] Call Trace:
[27.964292] [<ffffffff815c0207>] dump_stack+0x45/0x56
[27.964874] [<ffffffff815beec5>] spin_dump+0x8c/0x91
[27.965420] [<ffffffff815beeeb>] spin_bug+0x21/0x26
[27.965995] [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
[27.966631] [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
[27.967256] [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
[27.968051] [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
[27.968722] [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
[27.969436] [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
[27.970209] [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
[27.970972] [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
[27.971633] [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
[27.972267] [<ffffffff8105e869>] worker_thread+0x119/0x3a0
[27.972896] [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
[27.973622] [<ffffffff810648af>] kthread+0xdf/0x100
[27.974168] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
[27.974893] [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
[27.975466] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
The recursion occurs when subscr_terminate tries to grab the
subscriber lock, which is already taken by subscr_conn_msg_event.
We fix this by checking if the request to establish a new
subscription was successful, and if not we initiate termination of
the subscriber after we have released the subscriber lock.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|