summaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)Author
2015-02-13Merge branch 'rtmerge'Scott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: arch/arm/kvm/mmu.c arch/arm/mm/proc-v7-3level.S arch/powerpc/kernel/vdso32/getcpu.S drivers/crypto/caam/error.c drivers/crypto/caam/sg_sw_sec4.h drivers/usb/host/ehci-fsl.c
2015-02-13net: ip_send_unicast_reply: add missing local serializationNicholas Mc Guire
in response to the oops in ip_output.c:ip_send_unicast_reply under high network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen <Sami.Pietikainen@wapice.com>, this patch adds local serialization in ip_send_unicast_reply. from ip_output.c: /* * Generic function to send a packet as reply to another packet. * Used to send some TCP resets/acks so far. * * Use a fake percpu inet socket to avoid false sharing and contention. */ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = { ... which was added in commit be9f4a44 in linux-stable. The git log, wich introduced the PER_CPU unicast_sock, states: <snip> commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046 Author: Eric Dumazet <edumazet@google.com> Date: Thu Jul 19 07:34:03 2012 +0000 ipv4: tcp: remove per net tcp_sock tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICS, because many cpus contend for the socket lock and once socket lock is acquired, extra false sharing on various socket fields slow down the operations. To better resist to attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node) <snip> The per-cpu here thus is assuming exclusivity serializing per cpu - so the use of get_cpu_ligh introduced in net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the preempt_disable in favor of a migrate_disable is probably wrong as this only handles the referencial consistency but not the serialization. To evade a preempt_disable here a local lock would be needed. Therapie: * add local lock: * and re-introduce local serialization: Tested on x86 with high network load using the testcase from Sami Pietikainen while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html Cc: stable-rt@vger.kernel.org Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2015-02-13net: Use get_cpu_light() in ip_send_unicast_reply()Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-02-13net: sysrq via icmpCarsten Emde
There are (probably rare) situations when a system crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysreq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details. Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2015-02-13Reset to 3.12.37Scott Wood
2014-05-14net: ip_send_unicast_reply: add missing local serializationNicholas Mc Guire
in response to the oops in ip_output.c:ip_send_unicast_reply under high network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen <Sami.Pietikainen@wapice.com>, this patch adds local serialization in ip_send_unicast_reply. from ip_output.c: /* * Generic function to send a packet as reply to another packet. * Used to send some TCP resets/acks so far. * * Use a fake percpu inet socket to avoid false sharing and contention. */ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = { ... which was added in commit be9f4a44 in linux-stable. The git log, wich introduced the PER_CPU unicast_sock, states: <snip> commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046 Author: Eric Dumazet <edumazet@google.com> Date: Thu Jul 19 07:34:03 2012 +0000 ipv4: tcp: remove per net tcp_sock tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICS, because many cpus contend for the socket lock and once socket lock is acquired, extra false sharing on various socket fields slow down the operations. To better resist to attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node) <snip> The per-cpu here thus is assuming exclusivity serializing per cpu - so the use of get_cpu_ligh introduced in net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the preempt_disable in favor of a migrate_disable is probably wrong as this only handles the referencial consistency but not the serialization. To evade a preempt_disable here a local lock would be needed. Therapie: * add local lock: * and re-introduce local serialization: Tested on x86 with high network load using the testcase from Sami Pietikainen while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html Cc: stable-rt@vger.kernel.org Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2014-05-14net: Use get_cpu_light() in ip_send_unicast_reply()Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-14net: sysrq via icmpCarsten Emde
There are (probably rare) situations when a system crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysreq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details. Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2014-05-14Reset to 3.12.19Scott Wood
2014-05-14Merge remote-tracking branch 'stable/linux-3.12.y' into sdk-v1.6.xScott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: arch/sparc/Kconfig drivers/tty/tty_buffer.c
2014-05-05net: ipv4: current group_info should be put after using.Wang, Xiaoming
commit b04c46190219a4f845e46a459e3102137b7f6cac upstream. Plug a group_info refcount leak in ping_init. group_info is only needed during initialization and the code failed to release the reference on exit. While here move grabbing the reference to a place where it is actually needed. Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com> Signed-off-by: xiaoming wang <xiaoming.wang@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-18ip_tunnel: Fix dst ref-count.Pravin B Shelar
[ Upstream commit fbd02dd405d0724a0f25897ed4a6813297c9b96f ] Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer) removed dst-drop call from ip-tunnel-recv. Following commit reintroduce dst-drop and fix the original bug by checking loopback packet before releasing dst. Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-18ipmr: fix mfc notification flagsNicolas Dichtel
[ Upstream commit 65886f439ab0fdc2dff20d1fa87afb98c6717472 ] Commit 8cd3ac9f9b7b ("ipmr: advertise new mfc entries via rtnl") reuses the function ipmr_fill_mroute() to notify mfc events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-18tcp: tcp_release_cb() should release socket ownershipEric Dumazet
[ Upstream commit c3f9b01849ef3bc69024990092b9f42e20df7797 ] Lars Persson reported following deadlock : -000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock -001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0 -002 |ip_local_deliver_finish(skb = 0x8BD527A0) -003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?) -004 |netif_receive_skb(skb = 0x8BD527A0) -005 |elk_poll(napi = 0x8C770500, budget = 64) -006 |net_rx_action(?) -007 |__do_softirq() -008 |do_softirq() -009 |local_bh_enable() -010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?) -011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0) -012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0) -013 |tcp_release_cb(sk = 0x8BE6B2A0) -014 |release_sock(sk = 0x8BE6B2A0) -015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?) -016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096) -017 |kernel_sendmsg(?, ?, ?, ?, size = 4096) -018 |smb_send_kvec() -019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0) -020 |cifs_call_async() -021 |cifs_async_writev(wdata = 0x87FD6580) -022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88) -023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88) -024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88) -025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88) -026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88) -027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0) -028 |bdi_writeback_workfn(work = 0x87E4A9CC) -029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC) -030 |worker_thread(__worker = 0x8B045880) -031 |kthread(_create = 0x87CADD90) -032 |ret_from_kernel_thread(asm) Bug occurs because __tcp_checksum_complete_user() enables BH, assuming it is running from softirq context. Lars trace involved a NIC without RX checksum support but other points are problematic as well, like the prequeue stuff. Problem is triggered by a timer, that found socket being owned by user. tcp_release_cb() should call tcp_write_timer_handler() or tcp_delack_timer_handler() in the appropriate context : BH disabled and socket lock held, but 'owned' field cleared, as if they were running from timer handlers. Fixes: 6f458dfb4092 ("tcp: improve latencies of timer triggered events") Reported-by: Lars Persson <lars.persson@axis.com> Tested-by: Lars Persson <lars.persson@axis.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-18inet: frag: make sure forced eviction removes all fragsFlorian Westphal
[ Upstream commit e588e2f286ed7da011ed357c24c5b9a554e26595 ] Quoting Alexander Aring: While fragmentation and unloading of 6lowpan module I got this kernel Oops after few seconds: BUG: unable to handle kernel paging request at f88bbc30 [..] Modules linked in: ipv6 [last unloaded: 6lowpan] Call Trace: [<c012af4c>] ? call_timer_fn+0x54/0xb3 [<c012aef8>] ? process_timeout+0xa/0xa [<c012b66b>] run_timer_softirq+0x140/0x15f Problem is that incomplete frags are still around after unload; when their frag expire timer fires, we get crash. When a netns is removed (also done when unloading module), inet_frag calls the evictor with 'force' argument to purge remaining frags. The evictor loop terminates when accounted memory ('work') drops to 0 or the lru-list becomes empty. However, the mem accounting is done via percpu counters and may not be accurate, i.e. loop may terminate prematurely. Alter evictor to only stop once the lru list is empty when force is requested. Reported-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Reported-by: Alexander Aring <alex.aring@gmail.com> Tested-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-18net: fix for a race condition in the inet frag codeNikolay Aleksandrov
[ Upstream commit 24b9bf43e93e0edd89072da51cf1fab95fc69dec ] I stumbled upon this very serious bug while hunting for another one, it's a very subtle race condition between inet_frag_evictor, inet_frag_intern and the IPv4/6 frag_queue and expire functions (basically the users of inet_frag_kill/inet_frag_put). What happens is that after a fragment has been added to the hash chain but before it's been added to the lru_list (inet_frag_lru_add) in inet_frag_intern, it may get deleted (either by an expired timer if the system load is high or the timer sufficiently low, or by the fraq_queue function for different reasons) before it's added to the lru_list, then after it gets added it's a matter of time for the evictor to get to a piece of memory which has been freed leading to a number of different bugs depending on what's left there. I've been able to trigger this on both IPv4 and IPv6 (which is normal as the frag code is the same), but it's been much more difficult to trigger on IPv4 due to the protocol differences about how fragments are treated. The setup I used to reproduce this is: 2 machines with 4 x 10G bonded in a RR bond, so the same flow can be seen on multiple cards at the same time. Then I used multiple instances of ping/ping6 to generate fragmented packets and flood the machines with them while running other processes to load the attacked machine. *It is very important to have the _same flow_ coming in on multiple CPUs concurrently. Usually the attacked machine would die in less than 30 minutes, if configured properly to have many evictor calls and timeouts it could happen in 10 minutes or so. An important point to make is that any caller (frag_queue or timer) of inet_frag_kill will remove both the timer refcount and the original/guarding refcount thus removing everything that's keeping the frag from being freed at the next inet_frag_put. All of this could happen before the frag was ever added to the LRU list, then it gets added and the evictor uses a freed fragment. An example for IPv6 would be if a fragment is being added and is at the stage of being inserted in the hash after the hash lock is released, but before inet_frag_lru_add executes (or is able to obtain the lru lock) another overlapping fragment for the same flow arrives at a different CPU which finds it in the hash, but since it's overlapping it drops it invoking inet_frag_kill and thus removing all guarding refcounts, and afterwards freeing it by invoking inet_frag_put which removes the last refcount added previously by inet_frag_find, then inet_frag_lru_add gets executed by inet_frag_intern and we have a freed fragment in the lru_list. The fix is simple, just move the lru_add under the hash chain locked region so when a removing function is called it'll have to wait for the fragment to be added to the lru_list, and then it'll remove it (it works because the hash chain removal is done before the lru_list one and there's no window between the two list adds when the frag can get dropped). With this fix applied I couldn't kill the same machine in 24 hours with the same setup. Fixes: 3ef0eb0db4bf ("net: frag, move LRU list maintenance outside of rwlock") CC: Florian Westphal <fw@strlen.de> CC: Jesper Dangaard Brouer <brouer@redhat.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-10Merge branch 'rtmerge' into sdk-v1.6.xScott Wood
This merges 3.12.15-rt25. Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: drivers/misc/Makefile drivers/net/ethernet/freescale/gianfar.c drivers/net/ethernet/freescale/gianfar_ethtool.c drivers/net/ethernet/freescale/gianfar_sysfs.c
2014-04-10net: ip_send_unicast_reply: add missing local serializationNicholas Mc Guire
in response to the oops in ip_output.c:ip_send_unicast_reply under high network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen <Sami.Pietikainen@wapice.com>, this patch adds local serialization in ip_send_unicast_reply. from ip_output.c: /* * Generic function to send a packet as reply to another packet. * Used to send some TCP resets/acks so far. * * Use a fake percpu inet socket to avoid false sharing and contention. */ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = { ... which was added in commit be9f4a44 in linux-stable. The git log, wich introduced the PER_CPU unicast_sock, states: <snip> commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046 Author: Eric Dumazet <edumazet@google.com> Date: Thu Jul 19 07:34:03 2012 +0000 ipv4: tcp: remove per net tcp_sock tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICS, because many cpus contend for the socket lock and once socket lock is acquired, extra false sharing on various socket fields slow down the operations. To better resist to attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node) <snip> The per-cpu here thus is assuming exclusivity serializing per cpu - so the use of get_cpu_ligh introduced in net-use-cpu-light-in-ip-send-unicast-reply.patch, which droped the preempt_disable in favor of a migrate_disable is probably wrong as this only handles the referencial consistency but not the serialization. To evade a preempt_disable here a local lock would be needed. Therapie: * add local lock: * and re-introduce local serialization: Tested on x86 with high network load using the testcase from Sami Pietikainen while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html Cc: stable-rt@vger.kernel.org Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2014-04-10net: Use get_cpu_light() in ip_send_unicast_reply()Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-10net: sysrq via icmpCarsten Emde
There are (probably rare) situations when a system crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysreq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details. Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2014-04-08Merge branch 'sdk-kernel-3.12' into sdk-v1.6.xScott Wood
2014-04-08Fix build breaks in dma engine usersScott Wood
The dma engine was exempted from the 3.12 rollback because of excessive conflicts. Some users needed to be adjusted (net/ipv4/tcp.c) or exempted themselves (async_tx). Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-04-08Merge remote-tracking branch 'stable/linux-3.12.y' into sdk-v1.6.xScott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: drivers/mmc/card/block.c
2014-04-08Merge branch 'merge' into sdk-v1.6.xScott Wood
This reverts v3.13-rc3+ (78fd82238d0e5716) to v3.12, except for commits which I noticed which appear relevant to the SDK. Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: arch/powerpc/include/asm/kvm_host.h arch/powerpc/kvm/book3s_hv_rmhandlers.S arch/powerpc/kvm/book3s_interrupts.S arch/powerpc/kvm/e500.c arch/powerpc/kvm/e500mc.c arch/powerpc/sysdev/fsl_soc.h drivers/Kconfig drivers/cpufreq/ppc-corenet-cpufreq.c drivers/dma/fsldma.c drivers/dma/s3c24xx-dma.c drivers/misc/Makefile drivers/mmc/host/sdhci-of-esdhc.c drivers/mtd/devices/m25p80.c drivers/net/ethernet/freescale/gianfar.h drivers/platform/Kconfig drivers/platform/Makefile drivers/spi/spi-fsl-espi.c include/crypto/algapi.h include/linux/netdev_features.h include/linux/skbuff.h include/net/ip.h net/core/ethtool.c
2014-04-07Rewind v3.13-rc3+ (78fd82238d0e5716) to v3.12Scott Wood
2014-03-14tcp: syncookies: reduce mss table to four valuesFlorian Westphal
commit 086293542b991fb88a2e41ae7b4f82ac65a20e1a upstream. Halve mss table size to make blind cookie guessing more difficult. This is sad since the tables were already small, but there is little alternative except perhaps adding more precise mss information in the tcp timestamp. Timestamps are unfortunately not ubiquitous. Guessing all possible cookie values still has 8-in 2**32 chance. Reported-by: Jakob Lell <jakob@jakoblell.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-03-14tcp: syncookies: reduce cookie lifetime to 128 secondsFlorian Westphal
commit 8c27bd75f04fb9cb70c69c3cfe24f4e6d8e15906 upstream. We currently accept cookies that were created less than 4 minutes ago (ie, cookies with counter delta 0-3). Combined with the 8 mss table values, this yields 32 possible values (out of 2**32) that will be valid. Reducing the lifetime to < 2 minutes halves the guessing chance while still providing a large enough period. While at it, get rid of jiffies value -- they overflow too quickly on 32 bit platforms. getnstimeofday is used to create a counter that increments every 64s. perf shows getnstimeofday cost is negible compared to sha_transform; normal tcp initial sequence number generation uses getnstimeofday, too. Reported-by: Jakob Lell <jakob@jakoblell.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-03-13ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointerXin Long
[ Upstream commit 10ddceb22bab11dab10ba645c7df2e4a8e7a5db5 ] when ip_tunnel process multicast packets, it may check if the packet is looped back packet though 'rt_is_output_route(skb_rtable(skb))' in ip_tunnel_rcv(), but before that , skb->_skb_refdst has been dropped in iptunnel_pull_header(), so which leads to a panic. fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-03-13net-tcp: fastopen: fix high order allocationsEric Dumazet
[ Upstream commit f5ddcbbb40aa0ba7fbfe22355d287603dbeeaaac ] This patch fixes two bugs in fastopen : 1) The tcp_sendmsg(..., @size) argument was ignored. Code was relying on user not fooling the kernel with iovec mismatches 2) When MTU is about 64KB, tcp_send_syn_data() attempts order-5 allocations, which are likely to fail when memory gets fragmented. Fixes: 783237e8daf13 ("net-tcp: Fast Open client - sending SYN-data") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Tested-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-02-26net: ip, ipv6: handle gso skbs in forwarding pathFlorian Westphal
commit fe6cc55f3a9a053482a76f5a6b2257cee51b4663 upstream. Marcelo Ricardo Leitner reported problems when the forwarding link path has a lower mtu than the incoming one if the inbound interface supports GRO. Given: Host <mtu1500> R1 <mtu1200> R2 Host sends tcp stream which is routed via R1 and R2. R1 performs GRO. In this case, the kernel will fail to send ICMP fragmentation needed messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu checks in forward path. Instead, Linux tries to send out packets exceeding the mtu. When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does not fragment the packets when forwarding, and again tries to send out packets exceeding R1-R2 link mtu. This alters the forwarding dstmtu checks to take the individual gso segment lengths into account. For ipv6, we send out pkt too big error for gso if the individual segments are too big. For ipv4, we either send icmp fragmentation needed, or, if the DF bit is not set, perform software segmentation and let the output path create fragments when the packet is leaving the machine. It is not 100% correct as the error message will contain the headers of the GRO skb instead of the original/segmented one, but it seems to work fine in my (limited) tests. Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid sofware segmentation. However it turns out that skb_segment() assumes skb nr_frags is related to mss size so we would BUG there. I don't want to mess with it considering Herbert and Eric disagree on what the correct behavior should be. Hannes Frederic Sowa notes that when we would shrink gso_size skb_segment would then also need to deal with the case where SKB_MAX_FRAGS would be exceeded. This uses sofware segmentation in the forward path when we hit ipv4 non-DF packets and the outgoing link mtu is too small. Its not perfect, but given the lack of bug reports wrt. GRO fwd being broken this is a rare case anyway. Also its not like this could not be improved later once the dust settles. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-02-26ipv4: fix counter in_slow_totDuan Jiong
[ Upstream commit a6254864c08109c66a194612585afc0439005286 ] since commit 89aef8921bf("ipv4: Delete routing cache."), the counter in_slow_tot can't work correctly. The counter in_slow_tot increase by one when fib_lookup() return successfully in ip_route_input_slow(), but actually the dst struct maybe not be created and cached, so we can increase in_slow_tot after the dst struct is created. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-02-26tcp: tsq: fix nonagle handlingJohn Ogness
[ Upstream commit bf06200e732de613a1277984bf34d1a21c2de03d ] Commit 46d3ceabd8d9 ("tcp: TCP Small Queues") introduced a possible regression for applications using TCP_NODELAY. If TCP session is throttled because of tsq, we should consult tp->nonagle when TX completion is done and allow us to send additional segment, especially if this segment is not a full MSS. Otherwise this segment is sent after an RTO. [edumazet] : Cooked the changelog, added another fix about testing sk_wmem_alloc twice because TX completion can happen right before setting TSQ_THROTTLED bit. This problem is particularly visible with recent auto corking, but might also be triggered with low tcp_limit_output_bytes values or NIC drivers delaying TX completion by hundred of usec, and very low rtt. Thomas Glanzmann for example reported an iscsi regression, caused by tcp auto corking making this bug quite visible. Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues") Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-02-26ipv4: Fix runtime WARNING in rtmsg_ifa()Geert Uytterhoeven
[ Upstream commit 63b5f152eb4a5bb79b9caf7ec37b4201d12f6e66 ] On m68k/ARAnyM: WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99() Modules linked in: CPU: 0 PID: 407 Comm: ifconfig Not tainted 3.13.0-atari-09263-g0c71d68014d1 #1378 Stack from 10c4fdf0: 10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f 00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080 c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001 00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002 008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90 eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020 Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c [<00024416>] warn_slowpath_null+0x14/0x1a [<002501b4>] rtmsg_ifa+0xdc/0xf0 [<00250a5a>] __inet_insert_ifa+0xd6/0x1c2 [<0024f9b4>] inet_abc_len+0x0/0x42 [<00250b52>] inet_insert_ifa+0xc/0x12 [<00252390>] devinet_ioctl+0x2ae/0x5d6 Adding some debugging code reveals that net_fill_ifaddr() fails in put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp, preferred, valid)) nla_put complains: lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20 Apparently commit 5c766d642bcaffd0c2a5b354db2068515b3846cf ("ipv4: introduce address lifetime") forgot to take into account the addition of struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already done for ipv6. Suggested-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-02-06inet_diag: fix inet_diag_dump_icsk() timewait socket state logicNeal Cardwell
[ Based upon upstream commit 70315d22d3c7383f9a508d0aab21e2eb35b2303a ] Fix inet_diag_dump_icsk() to reflect the fact that both TIME_WAIT and FIN_WAIT2 connections are represented by inet_timewait_sock (not just TIME_WAIT). Thus: (a) We need to iterate through the time_wait buckets if the user wants either TIME_WAIT or FIN_WAIT2. (Before fixing this, "ss -nemoi state fin-wait-2" would not return any sockets, even if there were some in FIN_WAIT2.) (b) We need to check tw_substate to see if the user wants to dump sockets in the particular substate (TIME_WAIT or FIN_WAIT2) that a given connection is in. (Before fixing this, "ss -nemoi state time-wait" would actually return sockets in state FIN_WAIT2.) An analogous fix is in v3.13: 70315d22d3c7383f9a508d0aab21e2eb35b2303a ("inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets") but that patch is quite different because 3.13 code is very different in this area due to the unification of TCP hash tables in 05dbc7b ("tcp/dccp: remove twchain") in v3.13-rc1. I tested that this applies cleanly between v3.3 and v3.12, and tested that it works in both 3.3 and 3.12. It does not apply cleanly to 3.2 and earlier (though it makes semantic sense), and semantically is not the right fix for 3.13 and beyond (as mentioned above). Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06net: gre: use icmp_hdr() to get inner ip headerDuan Jiong
[ Upstream commit c0c0c50ff7c3e331c90bab316d21f724fb9e1994 ] When dealing with icmp messages, the skb->data points the ip header that triggered the sending of the icmp message. In gre_cisco_err(), the parse_gre_header() is called, and the iptunnel_pull_header() is called to pull the skb at the end of the parse_gre_header(), so the skb->data doesn't point the inner ip header. Unfortunately, the ipgre_err still needs those ip addresses in inner ip header to look up tunnel by ip_tunnel_lookup(). So just use icmp_hdr() to get inner ip header instead of skb->data. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06net: Fix memory leak if TPROXY used with TCP early demuxHolger Eitzenberger
[ Upstream commit a452ce345d63ddf92cd101e4196569f8718ad319 ] I see a memory leak when using a transparent HTTP proxy using TPROXY together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable): unreferenced object 0xffff88008cba4a40 (size 1696): comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s) hex dump (first 32 bytes): 0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00 .. j@..7..2..... 02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff810b710a>] kmem_cache_alloc+0xad/0xb9 [<ffffffff81270185>] sk_prot_alloc+0x29/0xc5 [<ffffffff812702cf>] sk_clone_lock+0x14/0x283 [<ffffffff812aaf3a>] inet_csk_clone_lock+0xf/0x7b [<ffffffff8129a893>] netlink_broadcast+0x14/0x16 [<ffffffff812c1573>] tcp_create_openreq_child+0x1b/0x4c3 [<ffffffff812c033e>] tcp_v4_syn_recv_sock+0x38/0x25d [<ffffffff812c13e4>] tcp_check_req+0x25c/0x3d0 [<ffffffff812bf87a>] tcp_v4_do_rcv+0x287/0x40e [<ffffffff812a08a7>] ip_route_input_noref+0x843/0xa55 [<ffffffff812bfeca>] tcp_v4_rcv+0x4c9/0x725 [<ffffffff812a26f4>] ip_local_deliver_finish+0xe9/0x154 [<ffffffff8127a927>] __netif_receive_skb+0x4b2/0x514 [<ffffffff8127aa77>] process_backlog+0xee/0x1c5 [<ffffffff8127c949>] net_rx_action+0xa7/0x200 [<ffffffff81209d86>] add_interrupt_randomness+0x39/0x157 But there are many more, resulting in the machine going OOM after some days. From looking at the TPROXY code, and with help from Florian, I see that the memory leak is introduced in tcp_v4_early_demux(): void tcp_v4_early_demux(struct sk_buff *skb) { /* ... */ iph = ip_hdr(skb); th = tcp_hdr(skb); if (th->doff < sizeof(struct tcphdr) / 4) return; sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo, iph->saddr, th->source, iph->daddr, ntohs(th->dest), skb->skb_iif); if (sk) { skb->sk = sk; where the socket is assigned unconditionally to skb->sk, also bumping the refcnt on it. This is problematic, because in our case the skb has already a socket assigned in the TPROXY target. This then results in the leak I see. The very same issue seems to be with IPv6, but haven't tested. Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06fib_frontend: fix possible NULL pointer dereferenceOliver Hartkopp
[ Upstream commit a0065f266a9b5d51575535a25c15ccbeed9a9966 ] The two commits 0115e8e30d (net: remove delay at device dismantle) and 748e2d9396a (net: reinstate rtnl in call_netdevice_notifiers()) silently removed a NULL pointer check for in_dev since Linux 3.7. This patch re-introduces this check as it causes crashing the kernel when setting small mtu values on non-ip capable netdevices. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is calledDuan Jiong
[ Upstream commit 11c21a307d79ea5f6b6fc0d3dfdeda271e5e65f6 ] commit a622260254ee48("ip_tunnel: fix kernel panic with icmp_dest_unreach") clear IPCB in ip_tunnel_xmit() , or else skb->cb[] may contain garbage from GSO segmentation layer. But commit 0e6fbc5b6c621("ip_tunnels: extend iptunnel_xmit()") refactor codes, and it clear IPCB behind the dst_link_failure(). So clear IPCB in ip_tunnel_xmit() just like commti a622260254ee48("ip_tunnel: fix kernel panic with icmp_dest_unreach"). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06tcp: metrics: Avoid duplicate entries with the same destination-IPChristoph Paasch
[ Upstream commit 77f99ad16a07aa062c2d30fae57b1fee456f6ef6 ] Because the tcp-metrics is an RCU-list, it may be that two soft-interrupts are inside __tcp_get_metrics() for the same destination-IP at the same time. If this destination-IP is not yet part of the tcp-metrics, both soft-interrupts will end up in tcpm_new and create a new entry for this IP. So, we will have two tcp-metrics with the same destination-IP in the list. This patch checks twice __tcp_get_metrics(). First without holding the lock, then while holding the lock. The second one is there to confirm that the entry has not been added by another soft-irq while waiting for the spin-lock. Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.) Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06net: avoid reference counter overflows on fib_rules in multicast forwardingHannes Frederic Sowa
[ Upstream commit 95f4a45de1a0f172b35451fc52283290adb21f6e ] Bob Falken reported that after 4G packets, multicast forwarding stopped working. This was because of a rule reference counter overflow which freed the rule as soon as the overflow happend. This patch solves this by adding the FIB_LOOKUP_NOREF flag to fib_rules_lookup calls. This is safe even from non-rcu locked sections as in this case the flag only implies not taking a reference to the rule, which we don't need at all. Rules only hold references to the namespace, which are guaranteed to be available during the call of the non-rcu protected function reg_vif_xmit because of the interface reference which itself holds a reference to the net namespace. Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables") Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables") Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Thomas Graf <tgraf@suug.ch> Cc: Julian Anastasov <ja@ssi.bg> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-15ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NICWei-Chun Chao
[ Upstream commit 7a7ffbabf99445704be01bff5d7e360da908cf8e ] VM to VM GSO traffic is broken if it goes through VXLAN or GRE tunnel and the physical NIC on the host supports hardware VXLAN/GRE GSO offload (e.g. bnx2x and next-gen mlx4). Two issues - (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header integrity check fails in udp4_ufo_fragment if inner protocol is TCP. Also gso_segs is calculated incorrectly using skb->len that includes tunnel header. Fix: robust check should only be applied to the inner packet. (VXLAN & GRE) Once GSO header integrity check passes, NULL segs is returned and the original skb is sent to hardware. However the tunnel header is already pulled. Fix: tunnel header needs to be restored so that hardware can perform GSO properly on the original packet. Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-15net: inet_diag: zero out uninitialized idiag_{src,dst} fieldsDaniel Borkmann
[ Upstream commit b1aac815c0891fe4a55a6b0b715910142227700f ] Jakub reported while working with nlmon netlink sniffer that parts of the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6. That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3]. In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab] memory through this. At least, in udp_dump_one(), we allocate a skb in ... rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL); ... and then pass that to inet_sk_diag_fill() that puts the whole struct inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0], r->id.idiag_dst[0] and leave the rest untouched: r->id.idiag_src[0] = inet->inet_rcv_saddr; r->id.idiag_dst[0] = inet->inet_daddr; struct inet_diag_msg embeds struct inet_diag_sockid that is correctly / fully filled out in IPv6 case, but for IPv4 not. So just zero them out by using plain memset (for this little amount of bytes it's probably not worth the extra check for idiag_family == AF_INET). Similarly, fix also other places where we fill that out. Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-15ip_gre: fix msg_name parsing for recvfrom/recvmsgTimo Teräs
[ Upstream commit 0e3da5bb8da45890b1dc413404e0f978ab71173e ] ipgre_header_parse() needs to parse the tunnel's ip header and it uses mac_header to locate the iphdr. This got broken when gre tunneling was refactored as mac_header is no longer updated to point to iphdr. Introduce skb_pop_mac_header() helper to do the mac_header assignment and use it in ipgre_rcv() to fix msg_name parsing. Bug introduced in commit c54419321455 (GRE: Refactor GRE tunneling code.) Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-15inet: fix NULL pointer Oops in fib(6)_rule_suppressStefan Tomanek
[ Upstream commit 673498b8ed4c4d4b7221c5309d891c5eac2b7528 ] This changes ensures that the routing entry investigated by the suppress function actually does point to a device struct before following that pointer, fixing a possible kernel oops situation when verifying the interface group associated with a routing table entry. According to Daniel Golle, this Oops can be triggered by a user process trying to establish an outgoing IPv6 connection while having no real IPv6 connectivity set up (only autoassigned link-local addresses). Fixes: 6ef94cfafba15 ("fib_rules: add route suppression based on ifgroup") Reported-by: Daniel Golle <daniel.golle@gmail.com> Tested-by: Daniel Golle <daniel.golle@gmail.com> Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-20Revert "net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST"Greg Kroah-Hartman
It turns out that commit: d3f7d56a7a4671d395e8af87071068a195257bf6 was applied to the tree twice, which didn't hurt anything, but it's good to fix this up. Reported-by: Veaceslav Falico <veaceslav@falico.eu> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Shawn Landden <shawnlandden@gmail.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-14Merge remote-tracking branch 'linus/master' into mergeScott Wood
Conflicts: Documentation/hwmon/ina2xx arch/powerpc/Kconfig arch/powerpc/boot/dts/b4860emu.dts arch/powerpc/boot/dts/b4qds.dtsi arch/powerpc/boot/dts/fsl/b4si-post.dtsi arch/powerpc/boot/dts/fsl/qoriq-sec6.0-0.dtsi arch/powerpc/boot/dts/p1023rdb.dts arch/powerpc/boot/dts/t4240emu.dts arch/powerpc/boot/dts/t4240qds.dts arch/powerpc/configs/85xx/p1023_defconfig arch/powerpc/configs/corenet32_smp_defconfig arch/powerpc/configs/corenet64_smp_defconfig arch/powerpc/configs/mpc85xx_smp_defconfig arch/powerpc/include/asm/cputable.h arch/powerpc/include/asm/device.h arch/powerpc/include/asm/epapr_hcalls.h arch/powerpc/include/asm/kvm_host.h arch/powerpc/include/asm/mpic.h arch/powerpc/include/asm/pci.h arch/powerpc/include/asm/ppc-opcode.h arch/powerpc/include/asm/ppc_asm.h arch/powerpc/include/asm/reg_booke.h arch/powerpc/kernel/epapr_paravirt.c arch/powerpc/kernel/process.c arch/powerpc/kernel/prom.c arch/powerpc/kernel/setup-common.c arch/powerpc/kernel/setup_32.c arch/powerpc/kernel/setup_64.c arch/powerpc/kernel/smp.c arch/powerpc/kernel/swsusp_asm64.S arch/powerpc/kernel/swsusp_booke.S arch/powerpc/kvm/book3s_pr.c arch/powerpc/kvm/booke.c arch/powerpc/kvm/booke.h arch/powerpc/kvm/e500.c arch/powerpc/kvm/e500.h arch/powerpc/kvm/e500_emulate.c arch/powerpc/kvm/e500mc.c arch/powerpc/kvm/powerpc.c arch/powerpc/perf/e6500-pmu.c arch/powerpc/platforms/85xx/Kconfig arch/powerpc/platforms/85xx/Makefile arch/powerpc/platforms/85xx/b4_qds.c arch/powerpc/platforms/85xx/c293pcie.c arch/powerpc/platforms/85xx/corenet_ds.c arch/powerpc/platforms/85xx/corenet_ds.h arch/powerpc/platforms/85xx/p1023_rds.c arch/powerpc/platforms/85xx/p2041_rdb.c arch/powerpc/platforms/85xx/p3041_ds.c arch/powerpc/platforms/85xx/p4080_ds.c arch/powerpc/platforms/85xx/p5020_ds.c arch/powerpc/platforms/85xx/p5040_ds.c arch/powerpc/platforms/85xx/smp.c arch/powerpc/platforms/85xx/t4240_qds.c arch/powerpc/platforms/Kconfig arch/powerpc/sysdev/Makefile arch/powerpc/sysdev/fsl_mpic_timer_wakeup.c arch/powerpc/sysdev/fsl_msi.c arch/powerpc/sysdev/fsl_pci.c arch/powerpc/sysdev/fsl_pci.h arch/powerpc/sysdev/fsl_soc.h arch/powerpc/sysdev/mpic.c arch/powerpc/sysdev/mpic_timer.c drivers/Kconfig drivers/clk/Kconfig drivers/clk/clk-ppc-corenet.c drivers/cpufreq/Kconfig.powerpc drivers/cpufreq/Makefile drivers/cpufreq/ppc-corenet-cpufreq.c drivers/crypto/caam/Kconfig drivers/crypto/caam/Makefile drivers/crypto/caam/ctrl.c drivers/crypto/caam/desc_constr.h drivers/crypto/caam/intern.h drivers/crypto/caam/jr.c drivers/crypto/caam/regs.h drivers/dma/fsldma.c drivers/hwmon/ina2xx.c drivers/iommu/Kconfig drivers/iommu/fsl_pamu.c drivers/iommu/fsl_pamu.h drivers/iommu/fsl_pamu_domain.c drivers/iommu/fsl_pamu_domain.h drivers/misc/Makefile drivers/mmc/card/block.c drivers/mmc/core/core.c drivers/mmc/host/sdhci-esdhc.h drivers/mmc/host/sdhci-pltfm.c drivers/mtd/nand/fsl_ifc_nand.c drivers/net/ethernet/freescale/gianfar.c drivers/net/ethernet/freescale/gianfar.h drivers/net/ethernet/freescale/gianfar_ethtool.c drivers/net/phy/at803x.c drivers/net/phy/phy_device.c drivers/net/phy/vitesse.c drivers/pci/msi.c drivers/staging/Kconfig drivers/staging/Makefile drivers/uio/Kconfig drivers/uio/Makefile drivers/uio/uio.c drivers/usb/host/ehci-fsl.c drivers/vfio/Kconfig drivers/vfio/Makefile include/crypto/algapi.h include/linux/iommu.h include/linux/mmc/sdhci.h include/linux/msi.h include/linux/netdev_features.h include/linux/phy.h include/linux/skbuff.h include/net/ip.h include/uapi/linux/vfio.h net/core/ethtool.c net/ipv4/route.c net/ipv6/route.c
2013-12-12net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLASTShawn Landden
commit d3f7d56a7a4671d395e8af87071068a195257bf6 upstream. Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag MSG_SENDPAGE_NOTLAST, similar to MSG_MORE. algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages() and need to see the new flag as identical to MSG_MORE. This fixes sendfile() on AF_ALG. v3: also fix udp Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Reported-and-tested-by: Shawn Landden <shawnlandden@gmail.com> Original-patch: Richard Weinberger <richard@nod.at> Signed-off-by: Shawn Landden <shawn@churchofgit.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08xfrm: Fix null pointer dereference when decoding sessionsSteffen Klassert
[ Upstream commit 84502b5ef9849a9694673b15c31bd3ac693010ae ] On some codepaths the skb does not have a dst entry when xfrm_decode_session() is called. So check for a valid skb_dst() before dereferencing the device interface index. We use 0 as the device index if there is no valid skb_dst(), or at reverse decoding we use skb_iif as device interface index. Bug was introduced with git commit bafd4bd4dc ("xfrm: Decode sessions with output interface."). Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08inet: fix possible seqlock deadlocksEric Dumazet
[ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ] In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left another places where IP_INC_STATS_BH() were improperly used. udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from process context, not from softirq context. This was detected by lockdep seqlock support. Reported-by: jongman heo <jongman.heo@samsung.com> Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLASTShawn Landden
[ Upstream commit d3f7d56a7a4671d395e8af87071068a195257bf6 ] Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag MSG_SENDPAGE_NOTLAST, similar to MSG_MORE. algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages() and need to see the new flag as identical to MSG_MORE. This fixes sendfile() on AF_ALG. v3: also fix udp Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: <stable@vger.kernel.org> # 3.4.x + 3.2.x Reported-and-tested-by: Shawn Landden <shawnlandden@gmail.com> Original-patch: Richard Weinberger <richard@nod.at> Signed-off-by: Shawn Landden <shawn@churchofgit.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>