Age  Commit message  Author
2014-01-19  net: introduce SO_BPF_EXTENSIONS  Michal Sekletar
For user space packet capturing libraries such as libpcap, there's currently only one way to check which BPF extensions are supported by the kernel, that is, commit aa1113d9f85d ("net: filter: return -EINVAL if BPF_S_ANC* operation is not supported"). For querying all extensions at once this might be rather inconvenient. Therefore, this patch introduces a new option which can be used as an argument for getsockopt(), and allows one to obtain information about which BPF extensions are supported by the current kernel. As David Miller suggests, we do not need to define any bits right now and status quo can just return 0 in order to state that this version supports SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Later additions to BPF extensions need to add their bits to the bpf_tell_extensions() function, as documented in the comment. Signed-off-by: Michal Sekletar <msekleta@redhat.com> Cc: David Miller <davem@davemloft.net> Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
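As a rough illustration of how a capture library might use the new option, the sketch below wraps the getsockopt() call. The helper name query_bpf_extensions() is hypothetical, and the fallback value 48 for SO_BPF_EXTENSIONS is an assumption that matches the generic socket header only; some architectures use a different number.

```c
#include <errno.h>
#include <sys/socket.h>

/* Older libc headers may not define the option. 48 is assumed here from the
 * generic asm-generic/socket.h; other architectures may differ. */
#ifndef SO_BPF_EXTENSIONS
#define SO_BPF_EXTENSIONS 48
#endif

/* Hypothetical helper: returns 0 and stores the kernel's answer in *ext,
 * or a negative errno value (for example -ENOPROTOOPT on kernels that
 * predate the option). */
static int query_bpf_extensions(int fd, int *ext)
{
    socklen_t len = sizeof(*ext);

    if (getsockopt(fd, SOL_SOCKET, SO_BPF_EXTENSIONS, ext, &len) < 0)
        return -errno;
    return 0;
}
```

A caller would typically treat a negative return as "unknown, fall back to probing individual extensions" rather than as a fatal error.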
2014-01-18  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller
Conflicts:
  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
  net/ipv4/tcp_metrics.c
Overlapping changes between the "don't create two tcp metrics objects with the same key" race fix in net and the addition of the destination address in the lookup key in net-next. Minor overlapping changes in bnx2x driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds
Pull networking fixes from David Miller:
1) The value chosen for the new SO_MAX_PACING_RATE socket option on parisc was very poorly chosen, let's fix it while we still can. From Eric Dumazet.
2) Our generic reciprocal divide was found to handle some edge cases incorrectly, part of this is encoded into the BPF as deep as the JIT engines themselves. Just use a real divide throughout for now. From Eric Dumazet.
3) Because the initial lookup is lockless, the TCP metrics engine can end up creating two entries for the same lookup key. Fix this by doing a second lookup under the lock before we actually create the new entry. From Christoph Paasch.
4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.
5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting. From Dan Carpenter.
6) Netlink socket dumping uses the wrong socket state for timewait sockets. Fix from Neal Cardwell.
7) Fix netlink memory leak in ieee802154_add_iface(), from Christian Engelmayer.
8) Multicast forwarding in ipv4 can overflow the per-rule reference counts, causing all multicast traffic to cease. Fix from Hannes Frederic Sowa.
9) via-rhine needs to stop all TX queues when it resets the device, from Richard Weinberger.
10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions. From Gerald Schaefer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
  parisc: fix SO_MAX_PACING_RATE typo
  ipv6: simplify detection of first operational link-local address on interface
  tcp: metrics: Avoid duplicate entries with the same destination-IP
  net: rds: fix per-cpu helper usage
  e1000e: Fix compilation warning when !CONFIG_PM_SLEEP
  bpf: do not use reciprocal divide
  be2net: add dma_mapping_error() check for dma_map_page()
  bnx2x: Don't release PCI bars on shutdown
  net,via-rhine: Fix tx_timeout handling
  batman-adv: fix batman-adv header overhead calculation
  qlge: Fix vlan netdev features.
  net: avoid reference counter overflows on fib_rules in multicast forwarding
  dm9601: add USB IDs for new dm96xx variants
  MAINTAINERS: add virtio-dev ML for virtio
  ieee802154: Fix memory leak in ieee802154_add_iface()
  net: usbnet: fix SG initialisation
  inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
  cxgb4: silence shift wrapping static checker warning
2014-01-18  Merge branch 'ixgbevf'  David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates from Emil to ixgbevf. He cleans up the code by removing the adapter structure as a parameter from multiple functions in favor of using the ixgbevf_ring structure and moves hot-path specific statistics into the ring structure for anticipated performance gains. He also removes the Tx/Rx counters for checksum offload and adds counters for tx_restart_queue and tx_timeout_count. Next he makes it so that the first tx_buffer structure acts as a central storage location for most of the skb info we are about to transmit, then takes advantage of the dma buffer always being present in the first descriptor and mapped as single, allowing a call to dma_unmap_single which alleviates the need to check for DMA mapping in ixgbevf_clean_tx_irq(). Finally he merges the ixgbevf_tx_map call and the ixgbevf_tx_queue call into a single function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: merge ixgbevf_tx_map and ixgbevf_tx_queue into a single function  Emil Tantilov
This change merges the ixgbevf_tx_map call and the ixgbevf_tx_queue call into a single function. In order to make room for this, the setting of cmd_type and olinfo flags is done in separate functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: redo dma mapping using the tx buffer info  Emil Tantilov
This patch takes advantage of the dma buffer always being present in the first descriptor and mapped as single. As such we can call dma_unmap_single and don't need to check for DMA mapping in ixgbevf_clean_tx_irq(). In addition this patch makes use of the DMA API. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: make the first tx_buffer a repository for most of the skb info  Emil Tantilov
This change makes it so that the first tx_buffer structure acts as a central storage location for most of the info about the skb we are about to transmit. In addition this patch makes tx_flags part of the ixgbevf_tx_buffer struct. This allows us to use the flags directly from the structure and as a result removes the tx_flags parameter from some functions. Also as a cleanup mapped_as_page is folded into tx_flags and some unused flags were removed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: add tx counters  Emil Tantilov
This patch adds counters for tx_restart_queue and tx_timeout_count. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: remove counters for Tx/Rx checksum offload  Emil Tantilov
This patch removes the Tx/Rx counters for checksum offload. The Tx counter was never updated and the Rx counter is of limited use. This is in an effort to clean up the counters and make them consistent with the counters shown by ixgbe. Also this patch removes some members of the adapter structure that were never used and shuffles others to reduce the number of holes.
before:
  /* size: 1568, cachelines: 25, members: 48 */
  /* sum members: 1519, holes: 10, sum holes: 43 */
  /* padding: 6 */
  /* last cacheline: 32 bytes */
after:
  /* size: 1480, cachelines: 24, members: 43 */
  /* sum members: 1479, holes: 1, sum holes: 1 */
  /* last cacheline: 8 bytes */
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
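The "holes" in the layout summary above are padding bytes the compiler inserts to satisfy alignment. The sketch below shows the effect on two hypothetical structs (they are not the real ixgbevf adapter members): the same four fields occupy less space once they are ordered from largest to smallest. Exact offsets depend on the ABI; the comments assume a typical 64-bit Linux target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with small members scattered between large ones. */
struct stats_with_holes {
    uint8_t  flag_a;   /* followed by padding so 'bytes' is aligned */
    uint64_t bytes;
    uint8_t  flag_b;   /* followed by padding so 'count' is aligned */
    uint32_t count;
};

/* Same members, ordered by decreasing size: only tail padding remains. */
struct stats_packed_by_size {
    uint64_t bytes;
    uint32_t count;
    uint8_t  flag_a;
    uint8_t  flag_b;
};

/* Sum of the member sizes, identical for both layouts (8 + 4 + 1 + 1). */
#define MEMBER_BYTES (sizeof(uint64_t) + sizeof(uint32_t) + 2 * sizeof(uint8_t))

/* Bytes lost to holes and tail padding in each layout. */
static size_t padding_with_holes(void)
{
    return sizeof(struct stats_with_holes) - MEMBER_BYTES;
}

static size_t padding_packed(void)
{
    return sizeof(struct stats_packed_by_size) - MEMBER_BYTES;
}
```

Comparing padding_with_holes() against padding_packed() gives a quick check that a reordering actually helped on the target at hand.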
2014-01-18  ixgbevf: move ring specific stats into ring specific structure  Emil Tantilov
This patch moves hot-path specific statistics into the ring structure. This allows us to drop the adapter structure in some functions and should help with performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: make use of the dev pointer in the ixgbevf_ring struct  Emil Tantilov
This patch cleans up the code by removing the adapter structure as parameter from multiple functions. The adapter structure was previously being used to access the dev pointer, but this can also be done via the ixgbevf_ring structure. This way we can drop the adapter as parameter from these functions. This patch also includes small cleanups in some error code paths. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Merge branch 'i40e'  David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to i40e. Neerav implements DCB and DCBNL support and adds DCB options to Kconfig. DCB is disabled by default. Anjali refactors flow director to fix inconsistencies that were preventing clean unloads of the driver, moves the queues for handling flow director errors into their own hardware VSI and implements a corrected version of the basic ethtool add ntuple rule. Jesse provides fixes for a compiler warning, a firmware workaround, white space fixes and renames some defines. Shannon reworks the device ID #defines to follow the DEV_ID_ convention followed by our other drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Bluetooth: remove direct compilation of 6lowpan_iphc.c  Stephen Warren
It's now built as a separate utility module, and enabling BT selects that module in Kconfig. This fixes: net/ieee802154/built-in.o:(___ksymtab_gpl+lowpan_process_data+0x0): multiple definition of `__ksymtab_lowpan_process_data' net/bluetooth/built-in.o:(___ksymtab_gpl+lowpan_process_data+0x0): first defined here net/ieee802154/built-in.o:(___ksymtab_gpl+lowpan_header_compress+0x0): multiple definition of `__ksymtab_lowpan_header_compress' net/bluetooth/built-in.o:(___ksymtab_gpl+lowpan_header_compress+0x0): first defined here net/ieee802154/built-in.o: In function `lowpan_header_compress': net/ieee802154/6lowpan_iphc.c:606: multiple definition of `lowpan_header_compress' net/bluetooth/built-in.o:/home/swarren/shared/git_wa/kernel/kernel.git/net/bluetooth/../ieee802154/6lowpan_iphc.c:606: first defined here net/ieee802154/built-in.o: In function `lowpan_process_data': net/ieee802154/6lowpan_iphc.c:344: multiple definition of `lowpan_process_data' net/bluetooth/built-in.o:/home/swarren/shared/git_wa/kernel/kernel.git/net/bluetooth/../ieee802154/6lowpan_iphc.c:344: first defined here make[1]: *** [net/built-in.o] Error 1 (this change probably simply wasn't "git add"d to a53d34c3465b) Fixes: a53d34c3465b ("net: move 6lowpan compression code to separate module") Fixes: 18722c247023 ("Bluetooth: Enable 6LoWPAN support for BT LE devices") Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: Fix device ID define names to align to standard  Shannon Nelson
Rework the device ID #defines to follow the _DEV_ID convention already established in the other Intel drivers. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: add DCB option to Kconfig  Neerav Parikh
Allow compiling DCB related files if I40E_DCB option is supported in the kernel configuration. DCB is disabled by default. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Tested-By: Jack Morgan<jack.morgan@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: add DCB and DCBNL support  Neerav Parikh
This patch adds capability to configure DCB on i40e network interfaces using Intel XL710 adapter firmware APIs. By default all VSIs are only enabled for the default traffic class enabled by firmware for any given PF. The driver would query the firmware for the traffic classes that are enabled for the port and reconfigure the LAN VSI to match to the port traffic class settings. All other VSIs are only enabled for the default traffic class settings for now. The driver registers and listens to firmware events that may require change in the DCB settings. It may reconfigure the VSI settings based on these events. This patch exposes IEEE DCBNL interfaces for the i40e driver to allow any application to query the DCB settings on the adapter. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-By: Jack Morgan<jack.morgan@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: implement DCB support infastructure  Neerav Parikh
Intel XL710 series of adapters support QoS as per the IEEE 802.1 DCB (Data Center Bridging) standard. This is supported in conjunction with:
- Enhanced Transmission Selection (ETS) - IEEE 802.1Qaz
- Priority Flow Control (PFC) - IEEE 802.1Qbb
- DCB eXchange Protocol (DCBX) - IEEE 802.1Qaz
On Intel XL710 adapters DCBX is performed by the adapter firmware. The firmware runs DCBX in willing mode and configures the port as per the DCB settings recommended by its link partner. By default, in the absence of any DCBX, the firmware configures the port with a single traffic class and all of the port bandwidth is allocated to that traffic class. This patch adds functions and calls to support querying and configuring DCB using firmware APIs. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-By: Jack Morgan<jack.morgan@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: refactor flow director  Anjali Singhai Jain
The i40e hardware was generating some inconsistent results when using current programming methods. This refactor fixes the inconsistencies that were preventing clean unloads of the driver, and moves the queues for handling flow director errors into their own hardware VSI. This patch also implements a corrected version of the basic ethtool add ntuple rule, which will disable the driver's automatic flow programming. A future patch adds remove/replay/list support for ntuple. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: rename defines  Jesse Brandeburg
The FLAG_FDIR_* defines can be renamed to be more descriptive. This patch is in preparation for the following where the fdir code is refactored. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: whitespace fixes  Jesse Brandeburg
Fix more whitespace issues, including making some locals declared in a nicer order. Also update Copyright string printed when the driver loads. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: Change firmware workaround  Jesse Brandeburg
Remove a workaround that is no longer necessary and implement a better understanding of what firmware is returning in the MSI-X vector count. This makes it so that the driver ends up with the right amount of queues when using all available MSI-X vectors. Change-ID: I34e60cc71dcfb1b5412f37df956fedcc49ade187 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  i40e: fix compile warning on checksum_local  Jesse Brandeburg
Compile testing with higher warning levels found this complaint: i40e_nvm.c: warning: 'checksum_local' may be used uninitialized in this function Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  virtio-net: fix build error when CONFIG_AVERAGE is not enabled  Michael Dalton
Commit ab7db91705e9 ("virtio-net: auto-tune mergeable rx buffer size for improved performance") introduced a virtio-net dependency on EWMA. The inclusion of EWMA is controlled by CONFIG_AVERAGE. Fix build error when CONFIG_AVERAGE is not enabled by adding select AVERAGE to virtio-net's Kconfig entry. Build failure reported using config make ARCH=s390 defconfig. Signed-off-by: Michael Dalton <mwdalton@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Merge branch 'ixgbe'  David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe and ixgbevf. Jacob adds braces around some ixgbe_qv_lock_* calls to better adhere to kernel style guidelines. Don bumps the versions on ixgbe and ixgbevf to match internal driver functionality better. ==================== Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbevf: bump version  Don Skidmore
Bump the version number to better match functionality provided with out of tree driver of the same version. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbe: bump version number  Don Skidmore
Bump the version number to better match functionality provided with out of tree driver of the same version. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbe: add braces around else condition in ixgbe_qv_lock_* calls  Jacob Keller
This patch adds braces around the ixgbe_qv_lock_* calls which previously only had braces around the if portion. Kernel style guidelines require braces around all branches of a conditional if they are required around one. In addition the comment, while not illegal C syntax, makes the code look wrong at a cursory glance. This patch corrects the style and adds braces so that the full if-else block is uniform. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions  Heiko Carstens
The s390 bpf jit compiler emits the signed divide instructions "dr" and "d" for unsigned divisions. This can cause problems: the dividend will be zero extended to a 64 bit value and the divisor is the 32 bit signed value as specified A or X accumulator, even though A and X are supposed to be treated as unsigned values. The divide instructions will generate an exception if the result cannot be expressed with a 32 bit signed value. This is the case if e.g. the dividend is 0xffffffff and the divisor either 1 or also 0xffffffff (signed: -1). To avoid all these issues simply use unsigned divide instructions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
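The failure mode described above can be modelled in portable C. The sketch below is only an arithmetic model with hypothetical function names, not the JIT code: it zero-extends the dividend, reinterprets the divisor as signed, and reports whether the quotient would fit a signed 32-bit result. The conversion of values above INT32_MAX to int32_t is implementation-defined in C; the usual two's complement behaviour is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a signed divide: 64-bit zero-extended dividend, 32-bit signed
 * divisor, 32-bit signed quotient. Returns false where the quotient does
 * not fit (the case in which the hardware would raise an exception) or
 * where the divisor is zero. */
static bool signed_quotient_fits(uint32_t a, uint32_t x)
{
    int64_t dividend = (int64_t)(uint64_t)a;  /* zero extended */
    int64_t divisor = (int32_t)x;             /* bit pattern read as signed */
    int64_t q;

    if (divisor == 0)
        return false;
    q = dividend / divisor;
    return q >= INT32_MIN && q <= INT32_MAX;
}

/* What the filter actually means: a plain unsigned 32-bit division.
 * The caller must ensure x is non-zero. */
static uint32_t unsigned_divide(uint32_t a, uint32_t x)
{
    return a / x;
}
```

With a = 0xffffffff the model rejects both x = 1 and x = 0xffffffff, while the unsigned division yields 0xffffffff and 1 respectively.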
2014-01-18  net: ftgmac100: use kfree_skb() where appropriate  Eric Dumazet
In order to get correct drop monitor notifications for dropped packets, we should call kfree_skb() instead of dev_kfree_skb() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Merge branch 'bonding_slave_sysfs'  David S. Miller
Scott Feldman says: ==================== bonding: add slave netlink and sysfs support v2: - Address review comment from Ding (and Veaceslav): handle kobj cleanup if sysfs_create_file() fails when adding slave attribute nodes. v1: The following series adds bonding slave netlink and sysfs interfaces. Slave interfaces get a new IFLA_SLAVE set of netlink attributes, along with RTM_NEWLINK notification when slave's active status changes. The sysfs interface adds read-only nodes for slave attributes under a /slave dir, similar to how bond interfaces get a /bonding dir for bonding attributes. ==================== Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  bonding: add netlink attributes to slave link dev  sfeldma@cumulusnetworks.com
If link is IFF_SLAVE, extend link dev netlink attributes to include slave attributes with new IFLA_SLAVE nest. Add netlink notification (RTM_NEWLINK) when slave status changes from backup to active, or vice versa. Adds new ndo_get_slave op to net_device_ops to fill skb with IFLA_SLAVE attributes. Currently only used by bonding driver, but could be used by other aggregating devices with slaves. Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  bonding: add sysfs /slave dir for bond slave devices.  sfeldma@cumulusnetworks.com
Add sub-directory under /sys/class/net/<interface>/slave with read-only attributes for slave. Directory only appears when <interface> is a slave.
$ tree /sys/class/net/eth2/slave/
/sys/class/net/eth2/slave/
├── ad_aggregator_id
├── link_failure_count
├── mii_status
├── perm_hwaddr
├── queue_id
└── state
$ cat /sys/class/net/eth2/slave/*
2
0
up
40:02:10:ef:06:01
0
active
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  net: vxlan: do not use vxlan_net before checking event type  Daniel Borkmann
Jesse Brandeburg reported that commit acaf4e70997f caused a panic when adding a network namespace while vxlan module was present in the system:
[<ffffffff814d0865>] vxlan_lowerdev_event+0xf5/0x100
[<ffffffff816e9e5d>] notifier_call_chain+0x4d/0x70
[<ffffffff810912be>] __raw_notifier_call_chain+0xe/0x10
[<ffffffff810912d6>] raw_notifier_call_chain+0x16/0x20
[<ffffffff815d9610>] call_netdevice_notifiers_info+0x40/0x70
[<ffffffff815d9656>] call_netdevice_notifiers+0x16/0x20
[<ffffffff815e1bce>] register_netdevice+0x1be/0x3a0
[<ffffffff815e1dce>] register_netdev+0x1e/0x30
[<ffffffff814cb94a>] loopback_net_init+0x4a/0xb0
[<ffffffffa016ed6e>] ? lockd_init_net+0x6e/0xb0 [lockd]
[<ffffffff815d6bac>] ops_init+0x4c/0x150
[<ffffffff815d6d23>] setup_net+0x73/0x110
[<ffffffff815d725b>] copy_net_ns+0x7b/0x100
[<ffffffff81090e11>] create_new_namespaces+0x101/0x1b0
[<ffffffff81090f45>] copy_namespaces+0x85/0xb0
[<ffffffff810693d5>] copy_process.part.26+0x935/0x1500
[<ffffffff811d5186>] ? mntput+0x26/0x40
[<ffffffff8106a15c>] do_fork+0xbc/0x2e0
[<ffffffff811b7f2e>] ? ____fput+0xe/0x10
[<ffffffff81089c5c>] ? task_work_run+0xac/0xe0
[<ffffffff8106a406>] SyS_clone+0x16/0x20
[<ffffffff816ee689>] stub_clone+0x69/0x90
[<ffffffff816ee329>] ? system_call_fastpath+0x16/0x1b
Apparently loopback device is being registered first and thus we receive an event notification when vxlan_net is not ready. Hence, when we call net_generic() and request vxlan_net_id, we seem to access garbage at that point in time. In setup_net() where we set up a newly allocated network namespace, we traverse the list of pernet ops ...
  list_for_each_entry(ops, &pernet_list, list) {
    error = ops_init(ops, net);
    if (error < 0)
      goto out_undo;
  }
... and loopback_net_init() is invoked first here, so in the middle of setup_net() we get this notification in vxlan. As currently we only care about devices that unregister, move access through net_generic() there.
Fix is based on Cong Wang's proposal, but only changes what is needed here. It sucks a bit as we only work around the actual cure: right now it seems the only way to check if a netns actually finished traversing all init ops would be to check if it's part of net_namespace_list. But that I find quite expensive each time we go through a notifier callback. Anyway, did a couple of tests and it seems good for now. Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well") Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  Merge branch 'ixgbe'  David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe from Ethan Zhao. The first one replaces the magic number "63" with a macro, IXGBE_MAX_VFS_DRV_LIMIT, the second moves the call to set driver_max_VFs to before SRIOV is enabled. The code of these patches matches the v3 (1/2) and v2 (2/2) versions sent to the e1000-devel and netdev mailing lists. The intermediate versions (v4, v5) are from sorting out style issues, mostly tabs to spaces and split lines probably introduced via mailer. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ixgbe: set driver_max_VFs should be done before enabling SRIOV  ethan.zhao
Commit 43dc4e01 ("Limit number of reported VFs to device specific value") doesn't work and always returns -EBUSY because VFs are already enabled.
  ixgbe_enable_sriov()
    pci_enable_sriov()
      sriov_enable()
      {
        ... ..
        iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
        pci_cfg_access_lock(dev);
        ... ...
      }
  pci_sriov_set_totalvfs()
  {
    ... ...
    if (dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE)
      return -EBUSY;
    ...
  }
So driver_max_VFs should be set with pci_sriov_set_totalvfs() before enabling VFs with ixgbe_enable_sriov(). V2: revised for net-next tree. Signed-off-by: Ethan Zhao <ethan.kernel@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
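The ordering problem can be shown with a small user-space model. Everything in the sketch below is a hypothetical stand-in (the model_* names and MODEL_CTRL_VFE are not kernel symbols); it only mirrors the check quoted above, where setting the VF limit is refused once the "VF enable" bit is set.

```c
#include <errno.h>

#define MODEL_CTRL_VFE 0x1u   /* stand-in for PCI_SRIOV_CTRL_VFE */

/* Hypothetical model of the per-device SR-IOV state. */
struct model_sriov {
    unsigned int ctrl;
    int driver_max_vfs;
};

/* Mirrors the quoted check: the limit cannot change once VFs are enabled. */
static int model_set_totalvfs(struct model_sriov *s, int numvfs)
{
    if (s->ctrl & MODEL_CTRL_VFE)
        return -EBUSY;
    s->driver_max_vfs = numvfs;
    return 0;
}

/* Models the effect of enabling SR-IOV on the control word. */
static void model_enable_sriov(struct model_sriov *s)
{
    s->ctrl |= MODEL_CTRL_VFE;
}
```

Calling model_set_totalvfs() before model_enable_sriov() succeeds, while the reverse order returns -EBUSY and leaves the limit untouched, which is the behaviour the commit corrects by swapping the calls.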
2014-01-18  ixgbe: define IXGBE_MAX_VFS_DRV_LIMIT macro and cleanup const 63  ethan.zhao
Because the ixgbe driver limits the maximum number of VFs that can be enabled to 63, define a macro IXGBE_MAX_VFS_DRV_LIMIT and clean up the constant 63 in the code. v3: revised for net-next tree. Signed-off-by: Ethan Zhao <ethan.kernel@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ipv4: fix a dst leak in tunnels  Eric Dumazet
This patch :
1) Remove a dst leak if DST_NOCACHE was set on dst Fix this by holding a reference only if dst really cached.
2) Remove a lockdep warning in __tunnel_dst_set() This was reported by Cong Wang.
3) Remove usage of a spinlock where xchg() is enough
4) Remove some spurious inline keywords. Let compiler decide for us.
Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <cwang@twopensource.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
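Points 1 and 3 can be illustrated together in user space. The sketch below is a model using C11 atomics, not the kernel code: struct model_dst, its nocache flag and model_dst_set() are hypothetical stand-ins for the dst entry, DST_NOCACHE and the cache setter. The cache takes a reference only for entries it really keeps, and a single atomic exchange replaces the lock around the pointer swap.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted route entry. */
struct model_dst {
    atomic_int refcnt;
    bool nocache;      /* models the DST_NOCACHE flag */
};

/* Store dst in the cache slot, or clear the slot if dst must not be cached.
 * A reference is taken only when the entry is actually cached, and the
 * reference held on the previous entry is dropped after the swap. */
static void model_dst_set(_Atomic(struct model_dst *) *slot,
                          struct model_dst *dst)
{
    struct model_dst *old;

    if (dst) {
        if (dst->nocache)
            dst = NULL;
        else
            atomic_fetch_add(&dst->refcnt, 1);
    }
    old = atomic_exchange(slot, dst);   /* single atomic swap, no lock */
    if (old)
        atomic_fetch_sub(&old->refcnt, 1);
}
```

Because the exchange hands back the old pointer atomically, two concurrent setters can never both release the same previous entry, which is the property the spinlock was providing.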
2014-01-18  sh_eth: Add support for r7s72100  Simon Horman
The r7s72100 SoC includes a fast ethernet controller. Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  sh_eth: Use bool as return type of sh_eth_is_gether()  Simon Horman
Return a boolean from sh_eth_is_gether() and refactor it as a one-liner. Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18  ipv6: send Change Status Report after DAD is completed  Flavio Leitner
RFC 3810 defines two types of messages for multicast listeners. The "Current State Report" message, as the name implies, refreshes the *current* state to the querier. Since the querier sends Query messages periodically, there is no need to retransmit the report. On the other hand, any change should be reported immediately using "State Change Report" messages. Since such a report is triggered by a change and can be affected by packet loss, the RFC states it should be retransmitted [RobVar] times to make sure routers receive it in a timely manner. Currently, we are sending "Current State Reports" after DAD is completed. Before that, we send messages using the unspecified address (::), which should be silently discarded by routers. This patch changes to send "State Change Report" messages after DAD is completed, fixing the behavior to be RFC compliant and also to pass the TAHI IPv6 testsuite. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18parisc: fix SO_MAX_PACING_RATE typoEric Dumazet
The SO_MAX_PACING_RATE definition on parisc has a typo. It's not too late to fix it, before 3.13 is official. Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18ipv6: simplify detection of first operational link-local address on interfaceHannes Frederic Sowa
In commit 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses") I built the detection of the first operational link-local address in a much too complex way. Additionally, this code now has a race condition. Replace it with a much simpler variant, which just scans the address list when duplicate address detection completes, to check if this is the first valid link-local address, and then sends RS and MLD reports. Fixes: 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses") Reported-by: Jiri Pirko <jiri@resnulli.us> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18tcp: metrics: Avoid duplicate entries with the same destination-IPChristoph Paasch
Because tcp-metrics is an RCU list, two soft-interrupts may be inside __tcp_get_metrics() for the same destination IP at the same time. If this destination IP is not yet part of the tcp-metrics, both soft-interrupts will end up in tcpm_new and create a new entry for this IP. So, we will have two tcp-metrics entries with the same destination IP in the list. This patch calls __tcp_get_metrics() twice: first without holding the lock, then while holding the lock. The second call is there to confirm that the entry has not been added by another soft-irq while waiting for the spin-lock. Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.) Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
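The "look up without the lock, then look up again under the lock" pattern described above can be sketched in user-space C. This is only an illustration, not the tcp-metrics code: struct entry, lookup() and get_or_create() are hypothetical names, a pthread mutex stands in for the spin-lock, and an insert-only list published with C11 release/acquire atomics stands in for the RCU list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical list entry; "next" never changes once the entry is published. */
struct entry {
    int key;
    struct entry *next;
};

static _Atomic(struct entry *) head;   /* zero-initialized: empty list */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lockless lookup; safe here because entries are only ever added at the head. */
static struct entry *lookup(int key)
{
    struct entry *e = atomic_load_explicit(&head, memory_order_acquire);

    for (; e; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}

static struct entry *get_or_create(int key)
{
    struct entry *e = lookup(key);          /* first check, no lock held */

    if (e)
        return e;

    pthread_mutex_lock(&list_lock);
    e = lookup(key);                        /* second check, lock held */
    if (!e) {
        e = malloc(sizeof(*e));
        if (e) {
            e->key = key;
            e->next = atomic_load_explicit(&head, memory_order_relaxed);
            /* Publish only after the entry is fully initialized. */
            atomic_store_explicit(&head, e, memory_order_release);
            assert(lookup(key) == e);       /* now visible to lookups */
        }
    }
    pthread_mutex_unlock(&list_lock);
    return e;
}
```

Without the second lookup, two callers that both miss in the lockless pass would each insert their own entry for the same key once they obtain the lock in turn.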
2014-01-18qlcnic: remove unused codestephen hemminger
Remove function qlcnic_enable_eswitch which was defined but never used in current code. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18qlcnic: make local functions staticstephen hemminger
Functions only used in one file should be static. Found by running make namespacecheck Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18ipv6: tcp: fix flowlabel value in ACK messages send from TIME_WAITFlorent Fourcot
This patch follows commit b903d324bee262 (ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAIT). For the same reason as for tclass, we have to store the flow label in the inet_timewait_sock to keep the flow label consistent on the last ACK. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18net: rds: fix per-cpu helper usageGerald Schaefer
Commit ae4b46e9d "net: rds: use this_cpu_* per-cpu helper" broke per-cpu handling for rds. chpfirst is the result of __this_cpu_read(), so it is an absolute pointer and not __percpu. Therefore, __this_cpu_write() should not operate on chpfirst, but rather on cache->percpu->first, just like __this_cpu_read() did before. Cc: <stable@vger.kernel.org> # 3.8+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of updates for the 3.14 stream! For the mac80211 bits, Johannes says: "This time I have uAPSD fixes since I was working on that, hwsim improvements to make dynamic radios possible for the test suite, the evidently long-overdue channel_change_time removal and a few other small collected fixes and improvements." For the iwlwifi bits, Emmanuel says: "Besides a few trivial patches, I have an important workaround for a HW issue that has kept me busy for a long time. Along with it, a fix that prevents an error from being printed. Eyal fixes our behavior against SISO APs and Ilan fixes an issue with multiple interface scenarios. Eliad fixes an error path in our init flow. We also have a few 'static analyzer' fixes." For the NFC bits, Samuel says: "It includes: * A new NFC driver for Marvell's 8897, and a few NCI fixes and improvements needed to support this chipset. * An LLCP fix for how we were setting the default MIU on a p2p link. If there is no explicit MIU extension announced at connection time, we must use the default one and not the one announced at LLCP link establishment time. * A pn544 EEPROM config update. Some of the currently EEPROM configured values are overwriting the firmware ones while others should not be set by the driver itself. * Some NFC digital stack fixes and improvements. Asynchronous functions are better documented, RF technologies and CRC functions are set upon PSL_REQ reception, and a few minor bugs are fixed. * Minor and miscellaneous pn533, mei_phy and port100 fixes." For the ath bits, Kalle says: "Janusz added Kconfig option for DFS. The DFS code was there already, but after fixes to mac80211 we can now enable it. Bartosz added a runtime firmware feature flag to disable P2P. Our 10.1 firmware branch doesn't support P2P and ath10k can now disable that. 
He also added a limit for how many clients can connect to ath10k AP. Michal fixed WEP shared authentication, in case someone still uses it. And I added firmware debug log to help the firmware engineers." Along with that is a small batch of ath9k updates and a few other bits here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace fixes from Eric Biederman: "This is a set of 3 regression fixes. This fixes /proc/mounts when using "ip netns add <netns>" to display the actual mount point. This fixes a regression in clone that broke lxc-attach. This fixes a regression in the permission checks for mounting /proc that made proc unmountable if binfmt_misc was in use. Oops. My apologies for sending this pull request so late. Al Viro gave interesting review comments about the d_path fix that I wanted to address in detail before I sent this pull request. Unfortunately a bad round of colds kept me from addressing that in detail until today. The executive summary of the review was: Al: Is patching d_path really sufficient? The prepend_path, d_path, d_absolute_path, and __d_path family of functions is a real mess. Me: Yes, patching d_path is really sufficient. Yes, the code is a mess. No, it is not appropriate to rewrite all of d_path for a regression that has existed for entirely too long already, when a two line change will do" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: vfs: Fix a regression in mounting proc fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) vfs: In d_path don't call d_dname on a mount point