summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-11-16be2net: remove unused local rsstable arrayIvan Vecera
Remove rsstable array and its initialization from be_set_rss_hash_opts(). The array became unused after "e255787 be2net: Support for configurable RSS hash key". The initial RSS table is now filled and stored for later usage during Rx queue creation. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16ravb: Fix int mask value overwritten issueMasaru Nagai
When RX/TX interrupt for Network Control queue and Best Effort queue is issued at the same time, the interrupt mask of Network Control queue will be reset when the mask of Best Effort queue is set. This patch fixes this problem. Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: smsc911x: Reset PHY during initializationPavel Fedin
On certain hardware after software reboot the chip may get stuck and fail to reinitialize during reset. This can be fixed by ensuring that PHY is reset too. Old PHY resetting method required operational MDIO interface, therefore the chip should have been already set up. In order to be able to function during probe, it is changed to use PMT_CTRL register. The problem could be observed on SMDK5410 board. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16bpf, arm64: start flushing icache range from headerDaniel Borkmann
While recently going over ARM64's BPF code, I noticed that the icache range we're flushing should start at header already and not at ctx.image. Reason is that after b569c1c622c5 ("net: bpf: arm64: address randomize and write protect JIT code"), we also want to make sure to flush the random-sized trap in front of the start of the actual program (analogous to x86). No operational differences from user side. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16bpf, arm: start flushing icache range from headerDaniel Borkmann
During review I noticed that the icache range we're flushing should start at header already and not at ctx.image. Reason is that after 55309dd3d4cd ("net: bpf: arm: address randomize and write protect JIT code"), we also want to make sure to flush the random-sized trap in front of the start of the actual program (analogous to x86). No operational differences from user side. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Nicolas Schichan <nschichan@freebox.fr> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16bpf: samples: exclude asm/sysreg.h for arm64Yang Shi
commit 338d4f49d6f7114a017d294ccf7374df4f998edc ("arm64: kernel: Add support for Privileged Access Never") includes sysreg.h into futex.h and uaccess.h. But, the inline assembly used by asm/sysreg.h is incompatible with llvm so it will cause BPF samples build failure for ARM64. Since sysreg.h is useless for BPF samples, just exclude it from Makefile via defining __ASM_SYSREG_H. Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16arm64: bpf: fix JIT frame pointer setupYang Shi
BPF fp should point to the top of the BPF prog stack. The original implementation made it point to the bottom incorrectly. Move A64_SP to fp before reserve BPF prog stack space. CC: Zi Shen Lim <zlim.lnx@gmail.com> CC: Xi Wang <xi.wang@gmail.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: phy: vitesse: add support for VSC8601Måns Rullgård
This adds support for the Vitesse VSC8601 PHY. Generic functions are used for everything except interrupt handling. Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: phy: at803x: support interrupt on 8030 and 8035Måns Rullgård
Commit 77a993942 "phy/at8031: enable at8031 to work on interrupt mode" added interrupt support for the 8031 PHY but left out the other two chips supported by this driver. This patch sets the .ack_interrupt and .config_intr functions for the 8030 and 8035 drivers as well. Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16ip_tunnel: disable preemption when updating per-cpu tstatsJason A. Donenfeld
Drivers like vxlan use the recently introduced udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the packet, updates the struct stats using the usual u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats). udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch tstats, so drivers like vxlan, immediately after, call iptunnel_xmit_stats, which does the same thing - calls u64_stats_update_begin/end on this_cpu_ptr(dev->tstats). While vxlan is probably fine (I don't know?), calling a similar function from, say, an unbound workqueue, on a fully preemptable kernel causes real issues: [ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6 [ 188.435579] caller is debug_smp_processor_id+0x17/0x20 [ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2 [ 188.435607] Call Trace: [ 188.435611] [<ffffffff8234e936>] dump_stack+0x4f/0x7b [ 188.435615] [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0 [ 188.435619] [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20 The solution would be to protect the whole this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with disabling preemption and then reenabling it. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16Merge branch 'mv88e6060-fixes'David S. Miller
Neil Armstrong says: ==================== net: dsa: mv88e6060: cleanup and fix setup This patchset introduces some fixes and a registers addressing cleanup for the mv88e6060 DSA driver. The first patch removes the poll_link as mv88e6xxx. The 3 following patches fixes the setup in regards of the datasheet. The 2 last patches introduces a clean header and replaces all magic values. v2: cleanup InitReady patch, add missing Acked-by and fix header copyright notice ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: replace magic values with register definesNeil Armstrong
To align with the mv88e6xxx code, use the register defines to access all the register addresses and bit fields. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: add register defines header fileNeil Armstrong
To align with the mv88e6xxx code, add a similar header file with all the register defines. The file is based on the mv88e6xxx header for coherency. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: use the correct bit shift for mac0Neil Armstrong
According to the mv88e6060 datasheet, the first mac byte must be at position 9 instead of 8 since the bit 8 is used to select if the mac address must differ for each port for Pause frames. Use the correct shift and set the same mac address for all port. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: use the correct MaxFrameSize bitNeil Armstrong
According to the mv88e6060 datasheet, the MaxFrameSize bit position is 10 instead of 11 which is reserved. Use the bit correctly to setup max frame size to 1536. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: use the correct InitReady bitNeil Armstrong
According to the mv88e6060 datasheet, the InitReady bit position is 11 and the polarity is inverted. Use the bit correctly to detect the end of initialization. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16net: dsa: mv88e6060: remove poll_link callbackNeil Armstrong
As of mv88e6xxx remove the poll_link callback since the link state change polling is now handled by the phylib. Tested on a mv88e6060 B0 device with a TI DM816X SoC. Suggested-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15Merge branch 'mellanox-net-fixes'David S. Miller
Or Gerlitz says: ==================== Mellanox NIC driver update, Nov 12, 2015 Few small mlx5 and mlx4 fixes from the team... done over net commit c5a3788 "Merge branch 'akpm' (patches from Andrew)" Eran's patch needs to go to 4.2 and 4.3 stable kernels. Tariq's patch need to go to 4.3 stable too. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx4_core: Avoid returning success in case of an error flowNoa Osherovich
The err variable wasn't set with the correct error value in some cases. Fixes: 47605df95398 ('mlx4: Modify proxy/tunnel QP mechanism [..]') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx4_core: Fix sleeping while holding spinlock at rem_slave_countersEran Ben Elisha
When cleaning slave's counter resources, we hold a spinlock that protects the slave's counters list. As part of the clean, we call __mlx4_clear_if_stat which calls mlx4_alloc_cmd_mailbox which is a sleepable function. In order to fix this issue, hold the spinlock, and copy all counter indices into a temporary array, and release the spinlock. Afterwards, iterate over this array and free every counter. Repeat this scenario until the original list is empty (a new counter might have been added while releasing the counters from the temporary array). Fixes: b72ca7e96acf ("net/mlx4_core: Reset counters data when freed") Reported-by: Moni Shoua <monis@mellanox.com> Tested-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx5e: Use the right DMA free function on TX pathAchiad Shochat
On xmit path we use skb_frag_dma_map() which is using dma_map_page(), while upon completion we dma-unmap the skb fragments using dma_unmap_single() rather than dma_unmap_page(). To fix this, we now save the dma map type on xmit path and use this info to call the right dma unmap method upon TX completion. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx5e: Max mtu comparison fixDoron Tsur
On change mtu the driver compares between hardware queried mtu and software requested mtu. We need to compare between software representation of the queried mtu and the requested mtu. Fixes: facc9699f0fe ('net/mlx5e: Fix HW MTU settings') Signed-off-by: Doron Tsur <doront@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx5e: Added self loopback preventionTariq Toukan
Prevent outgoing multicast frames from looping back to the RX queue. By introducing new HW capability self_lb_en_modifiable, which indicates the support to modify self_lb_en bit in modify_tir command. When this capability is set we can prevent TIRs from sending back loopback multicast traffic to their own RQs, by "refreshing TIRs" with modify_tir command, on every time new channels (SQs/RQs) are created at device open. This is needed since TIRs are static and only allocated once on driver load, and the loopback decision is under their responsibility. Fixes issues of the kind: "IPv6: eth2: IPv6 duplicate address fe80::e61d:2dff:fe5c:f2e9 detected!" The issue is seen since the IPv6 solicitations multicast messages are loopedback and the network stack thinks they are coming from another host. Fixes: 5c50368f3831 ("net/mlx5e: Light-weight netdev open/stop") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net/mlx5e: Fix inline header size calculationSaeed Mahameed
mlx5e_get_inline_hdr_size didn't take into account the vlan insertion into the inline WQE segment. This could lead to max inline violation in cases where skb_headlen(skb) + VLAN_HLEN >= sq->max_inline. Fixes: 3ea4891db8d0 ("net/mlx5e: Fix LSO vlan insertion") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15ipvs: use skb_to_full_sk() helperEric Dumazet
SYNACK packets might be attached to request sockets. Use skb_to_full_sk() helper to avoid illegal accesses to inet_sk(skb->sk) Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15tcp: ensure proper barriers in lockless contextsEric Dumazet
Some functions access TCP sockets without holding a lock and might output non consistent data, depending on compiler and or architecture. tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ... Introduce sk_state_load() and sk_state_store() to fix the issues, and more clearly document where this lack of locking is happening. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15net: thunder: Fix crash upon shutdown after failed probePavel Fedin
If device probe fails, driver remains bound to the PCI device. However, driver data has been reset to NULL. This causes crash upon dereferencing it in nicvf_remove() Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15sctp: translate host order to network order when setting a hmacidlucien
now sctp auth cannot work well when setting a hmacid manually, which is caused by that we didn't use the network order for hmacid, so fix it by adding the transformation in sctp_auth_ep_set_hmacs. even we set hmacid with the network order in userspace, it still can't work, because of this condition in sctp_auth_ep_set_hmacs(): if (id > SCTP_AUTH_HMAC_ID_MAX) return -EOPNOTSUPP; so this wasn't working before and thus it won't break compatibility. Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15Merge branch 'packet-fixes'David S. Miller
Daniel Borkmann says: ==================== packet fixes Fixes a couple of issues in packet sockets, i.e. on TX ring side. See individual patches for details. v2 -> v3: - First two patches unchanged, kept Jason's Ack - Reworked 3rd patch and split into 3: - check for dev type as discussed with Willem - infer skb->protocol - fix max len for dgram v1 -> v2: - Added patch 2 as suggested by Dave - Rest is unchanged from previous submission ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15packet: fix tpacket_snd max frame lenDaniel Borkmann
Since it's introduction in commit 69e3c75f4d54 ("net: TX_RING and packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW side. When used with SOCK_DGRAM only, the size_max > dev->mtu + reserve check should have reserve as 0, but currently, this is unconditionally set (in it's original form as dev->hard_header_len). I think this is not correct since tpacket_fill_skb() would then take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM, the extra VLAN_HLEN could be possible in both cases. Presumably, the reserve code was copied from packet_snd(), but later on missed the check. Make it similar as we have it in packet_snd(). Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15packet: infer protocol from ethernet header if unsetDaniel Borkmann
In case no struct sockaddr_ll has been passed to packet socket's sendmsg() when doing a TX_RING flush run, then skb->protocol is set to po->num instead, which is the protocol passed via socket(2)/bind(2). Applications only xmitting can go the path of allocating the socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the TX_RING with sll_protocol of 0. That way, register_prot_hook() is neither called on creation nor on bind time, which saves cycles when there's no interest in capturing anyway. That leaves us however with po->num 0 instead and therefore the TX_RING flush run sets skb->protocol to 0 as well. Eric reported that this leads to problems when using tools like trafgen over bonding device. I.e. the bonding's hash function could invoke the kernel's flow dissector, which depends on skb->protocol being properly set. In the current situation, all the traffic is then directed to a single slave. Fix it up by inferring skb->protocol from the Ethernet header when not set and we have ARPHRD_ETHER device type. This is only done in case of SOCK_RAW and where we have a dev->hard_header_len length. In case of ARPHRD_ETHER devices, this is guaranteed to cover ETH_HLEN, and therefore being accessed on the skb after the skb_store_bits(). Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15packet: only allow extra vlan len on ethernet devicesDaniel Borkmann
Packet sockets can be used by various net devices and are not really restricted to ARPHRD_ETHER device types. However, when currently checking for the extra 4 bytes that can be transmitted in VLAN case, our assumption is that we generally probe on ARPHRD_ETHER devices. Therefore, before looking into Ethernet header, check the device type first. This also fixes the issue where non-ARPHRD_ETHER devices could have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus the check would test unfilled linear part of the skb (instead of non-linear). Fixes: 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.") Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15packet: always probe for transport headerDaniel Borkmann
We concluded that the skb_probe_transport_header() should better be called unconditionally. Avoiding the call into the flow dissector has also not really much to do with the direct xmit mode. While it seems that only virtio_net code makes use of GSO from non RX/TX ring packet socket paths, we should probe for a transport header nevertheless before they hit devices. Reference: http://thread.gmane.org/gmane.linux.network/386173/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15packet: do skb_probe_transport_header when we actually have dataDaniel Borkmann
In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport header before doing xmit") and later on 40893fd0fd4e ("net: switch to use skb_probe_transport_header()") was probing for a transport header on the skb from a ring buffer slot, but at a time, where the skb has _not even_ been filled with data yet. So that call into the flow dissector is pretty useless. Lets do it after we've set up the skb frags. Fixes: c1aad275b029 ("packet: set transport header before doing xmit") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15tools/net: Use include/uapi with __EXPORTED_HEADERS__Kamal Mostafa
Use the local uapi headers to keep in sync with "recently" added #define's (e.g. SKF_AD_VLAN_TPID). Refactored CFLAGS, and bpf_asm doesn't need -I. Fixes: 3f356385e8a4 ("filter: bpf_asm: add minimal bpf asm tool") Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15Merge branch 'ipv6-route-fixes'David S. Miller
Martin KaFai Lau says: ==================== ipv6: Fixes for pmtu update and DST_NOCACHE route This patchset fixes: 1. An oops during IPv6 pmtu update on a IPv4 GRE running in an IPSec setup 2. Misc fixes on DST_NOCACHE route ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15ipv6: Check rt->dst.from for the DST_NOCACHE routeMartin KaFai Lau
All DST_NOCACHE rt6_info used to have rt->dst.from set to its parent. After commit 8e3d5be73681 ("ipv6: Avoid double dst_free"), DST_NOCACHE is also set to rt6_info which does not have a parent (i.e. rt->dst.from is NULL). This patch catches the rt->dst.from == NULL case. Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15ipv6: Check expire on DST_NOCACHE routeMartin KaFai Lau
Since the expires of the DST_NOCACHE rt can be set during the ip6_rt_update_pmtu(), we also need to consider the expires value when doing ip6_dst_check(). This patches creates __rt6_check_expired() to only check the expire value (if one exists) of the current rt. In rt6_dst_from_check(), it adds __rt6_check_expired() as one of the condition check. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 treeMartin KaFai Lau
The original bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272571 The setup has a IPv4 GRE tunnel running in a IPSec. The bug happens when ndisc starts sending router solicitation at the gre interface. The simplified oops stack is like: __lock_acquire+0x1b2/0x1c30 lock_acquire+0xb9/0x140 _raw_write_lock_bh+0x3f/0x50 __ip6_ins_rt+0x2e/0x60 ip6_ins_rt+0x49/0x50 ~~~~~~~~ __ip6_rt_update_pmtu.part.54+0x145/0x250 ip6_rt_update_pmtu+0x2e/0x40 ~~~~~~~~ ip_tunnel_xmit+0x1f1/0xf40 __gre_xmit+0x7a/0x90 ipgre_xmit+0x15a/0x220 dev_hard_start_xmit+0x2bd/0x480 __dev_queue_xmit+0x696/0x730 dev_queue_xmit+0x10/0x20 neigh_direct_output+0x11/0x20 ip6_finish_output2+0x21f/0x770 ip6_finish_output+0xa7/0x1d0 ip6_output+0x56/0x190 ~~~~~~~~ ndisc_send_skb+0x1d9/0x400 ndisc_send_rs+0x88/0xc0 ~~~~~~~~ The rt passed to ip6_rt_update_pmtu() is created by icmp6_dst_alloc() and it is not managed by the fib6 tree, so its rt6i_table == NULL. When __ip6_rt_update_pmtu() creates a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL and it causes the ip6_ins_rt() oops. During pmtu update, we only want to create a RTF_CACHE clone from a rt which is currently managed (or owned) by the fib6 tree. It means either rt->rt6i_node != NULL or rt is a RTF_PCPU clone. It is worth to note that rt6i_table may not be NULL even it is not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()). Hence, rt6i_node is a better check instead of rt6i_table. Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Reported-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15fjes: fix inconsistent indentingColin Ian King
minor change, indenting is one tab out. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15af-unix: fix use-after-free with concurrent readers while splicingHannes Frederic Sowa
During splicing an af-unix socket to a pipe we have to drop all af-unix socket locks. While doing so we allow another reader to enter unix_stream_read_generic which can read, copy and finally free another skb. If exactly this skb is just in process of being spliced we get a use-after-free report by kasan. First, we must make sure to not have a free while the skb is used during the splice operation. We simply increment its use counter before unlocking the reader lock. Stream sockets have the nice characteristic that we don't care about zero length writes and they never reach the peer socket's queue. That said, we can take the UNIXCB.consumed field as the indicator if the skb was already freed from the socket's receive queue. If the skb was fully consumed after we locked the reader side again we know it has been dropped by a second reader. We indicate a short read to user space and abort the current splice operation. This bug has been found with syzkaller (http://github.com/google/syzkaller) by Dmitry Vyukov. Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-12stmmac: avoid ipq806x constant overflow warningArnd Bergmann
Building dwmac-ipq806x on a 64-bit architecture produces a harmless warning from gcc: stmmac/dwmac-ipq806x.c: In function 'ipq806x_gmac_probe': include/linux/bitops.h:6:19: warning: overflow in implicit constant conversion [-Woverflow] val = QSGMII_PHY_CDR_EN | stmmac/dwmac-ipq806x.c:333:8: note: in expansion of macro 'QSGMII_PHY_CDR_EN' #define QSGMII_PHY_CDR_EN BIT(0) #define BIT(nr) (1UL << (nr)) This is a result of the type conversion rules in C, when we take the logical OR of multiple different types. In particular, we have and unsigned long QSGMII_PHY_CDR_EN == BIT(0) == (1ul << 0) == 0x0000000000000001ul and a signed int 0xC << QSGMII_PHY_TX_DRV_AMP_OFFSET == 0xc0000000 which together gives a signed long value 0xffffffffc0000001l and when this is passed into a function that takes an unsigned int type, gcc warns about the signed overflow and the loss of the upper 32-bits that are all ones. This patch adds 'ul' type modifiers to the literal numbers passed in here, so now the expression remains an 'unsigned long' with the upper bits all zero, and that avoids the signed overflow and the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: b1c17215d718 ("stmmac: add ipq806x glue layer") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree. This large batch that includes fixes for ipset, netfilter ingress, nf_tables dynamic set instantiation and a longstanding Kconfig dependency problem. More specifically, they are: 1) Add missing check for empty hook list at the ingress hook, from Florian Westphal. 2) Input and output interface are swapped at the ingress hook, reported by Patrick McHardy. 3) Resolve ipset extension alignment issues on ARM, patch from Jozsef Kadlecsik. 4) Fix bit check on bitmap in ipset hash type, also from Jozsef. 5) Release buckets when all entries have expired in ipset hash type, again from Jozsef. 6) Oneliner to initialize conntrack tuple object in the PPTP helper, otherwise the conntrack lookup may fail due to random bits in the structure holes, patch from Anthony Lineham. 7) Silence a bogus gcc warning in nfnetlink_log, from Arnd Bergmann. 8) Fix Kconfig dependency problems with TPROXY, socket and dup, also from Arnd. 9) Add __netdev_alloc_pcpu_stats() to allow creating percpu counters from atomic context, this is required by the follow up fix for nf_tables. 10) Fix crash from the dynamic set expression, we have to add new clone operation that should be defined when a simple memcpy is not enough. This resolves a crash when using per-cpu counters with new Patrick McHardy's flow table nft support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-12r8169: fix kasan reported skb use-after-free.françois romieu
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Fixes: d7d2d89d4b0af ("r8169: Add software counter for multicast packages") Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge final patch-bomb from Andrew Morton: "Various leftovers, mainly Christoph's pci_dma_supported() removals" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: pci: remove pci_dma_supported usbnet: remove ifdefed out call to dma_supported kaweth: remove ifdefed out call to dma_supported sfc: don't call dma_supported nouveau: don't call pci_dma_supported netup_unidvb: use pci_set_dma_mask insted of pci_dma_supported cx23885: use pci_set_dma_mask insted of pci_dma_supported cx25821: use pci_set_dma_mask insted of pci_dma_supported cx88: use pci_set_dma_mask insted of pci_dma_supported saa7134: use pci_set_dma_mask insted of pci_dma_supported saa7164: use pci_set_dma_mask insted of pci_dma_supported tw68-core: use pci_set_dma_mask insted of pci_dma_supported pcnet32: use pci_set_dma_mask insted of pci_dma_supported lib/string.c: add ULL suffix to the constant definition hugetlb: trivial comment fix selftests/mlock2: add ULL suffix to 64-bit constants selftests/mlock2: add missing #define _GNU_SOURCE
2015-11-11Merge branch 'misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull misc kbuild updates from Michal Marek: "This is the non-critical part of kbuild: - several coccinelle updates - make deb-pkg creates an armhf package if CONFIG_VFP=y - make tags understands some more powerpc macros" * 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: coccinelle: Improve checking for missing NULL terminators coccinelle: ifnullfree: handle various destroy functions coccinelle: ifnullfree: various cleanups cocinelle: iterators: semantic patch to delete unneeded of_node_put deb-pkg: Add automatic support for armhf architecture scripts/coccinelle: fix typos coccinelle: misc: remove "complex return code" warnings Coccinelle: fix incorrect -include option transformation coccinelle: tests: improve odd_ptr_err.cocci coccinelle: misc: move constants to the right scripts/tags.sh: Teach tags about some powerpc macros
2015-11-11Merge branch 'kconfig' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kconfig updates from Michal Marek: - 'make xconfig' ported to Qt5, dropping support for Qt3 - merge_config.sh supports a single-input-file mode and also respects $KCONFIG_CONFIG - Fix for incorrect display of >= and > in dependency expressions * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (44 commits) Add current selection check. Use pkg-config to find Qt 4 and 5 instead of direct qmake kconfig: Fix copy&paste error kconfig/merge_config.sh: Accept a single file kconfig/merge_config.sh: Support KCONFIG_CONFIG Update the buildsystem for KConfig finding Qt Port xconfig to Qt5 - Update copyright. Port xconfig to Qt5 - Fix goParent issue. Port xconfig to Qt5 - on Back clicked, deselect old item. Port xconfig to Qt5 - Add(back) one click checkbox toggle. Port xconfig to Qt5 - Add(back) lineedit editing. Port xconfig to Qt5 - Remove some commented code. Port xconfig to Qt5 - Source format. Port xconfig to Qt5 - Add horizontal scrollbar, and scroll per pixel. Port xconfig to Qt5 - Change ConfigItem constructor parent type. Port xconfig to Qt5 - Disable ConfigList soring Port xconfig to Qt5 - Remove ConfigList::updateMenuList template. Port xconfig to Qt5 - Add ConfigList::mode to initializer list. Port xconfig to Qt5 - Add ConfigItem::nextItem to initializer list. Port xconfig to Qt5 - Tree widget set column titles. ...
2015-11-11Merge branch 'kbuild' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild update from Michal Marek: "The kbuild branch for v4.4-rc1 only has one commit: A new make kselftest-clean target cleans tools/testing/selftests" * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kselftest: add kselftest-clean rule
2015-11-11Merge tag 'linux-kselftest-4.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: "This 12 patch update for 4.4-rc1 consists of a new pstore test and fixes to existing tests" * tag 'linux-kselftest-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: breakpoint: Actually build it selftests: vm: Try harder to allocate huge pages selftests: Make scripts executable selftests: kprobe: Choose an always-defined function to probe selftests: memfd: Stop unnecessary rebuilds selftests: Add missing #include directives selftests/seccomp: Be more precise with syscall arguments. selftests/seccomp: build and pass on arm64 selftests: memfd_test: Revised STACK_SIZE to make it 16-byte aligned selftests/pstore: add pstore test scripts going with reboot selftests/pstore: add pstore test script for pre-reboot selftests: add .gitignore for efivarfs
2015-11-11Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Two build fixes, one for VC4, one for nouveau where the ARM only code is doing something a bit strange. While people are discussing that, just workaround it and fix the build for now. The code in question will never get used on anything non-ARM anyways. Also one fix for AST that SuSE had been hiding in their kernel, that allows all fbdev apps to work on that driver" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nouveau: fix build failures on all non ARM. drm/ast: Initialized data needed to map fbdev memory drm/vc4: Add dependency on HAVE_DMA_ATTRS, and select DRM_GEM_CMA_HELPER