summaryrefslogtreecommitdiff
path: root/drivers/net/bonding
AgeCommit message (Collapse)Author
2015-04-28bonding: Fix missed braceJianhua Xie
Change-Id: I6738d25d36476502c93f28f5023df95c4ba7adb3 Reviewed-on: http://git.am.freescale.net:8181/35494 Reviewed-by: Pan Jiafei <Jiafei.Pan@freescale.com> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: Ie61c478a82ce07b53fee94fcbc78984343168232 Reviewed-on: http://git.am.freescale.net:8181/35653 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Honghua Yin <Hong-Hua.Yin@freescale.com>
2015-03-09bonding: Fix compile warningJianhua Xie
Change-Id: I3845a805706eb1f78ecd4e208c26252cec3bf5a3 Reviewed-on: http://git.am.freescale.net:8181/30479 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Remove unnecessary qman_query_fq()Jianhua Xie
when qman_init_fq() returns 0, which means fq init OK. qman_query_fq is not mandatory required any more. Change-Id: Ied9c151de47b3521949237fb123dd5be3081edfb Reviewed-on: http://git.am.freescale.net:8181/30329 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Add ping monitor support when link failoverJianhua Xie
when one of slaves link is down, ping reportes failed. This patch adds some debugging information to track it. Change-Id: Ied9a2f067e0b19ac4c1e94b52ea67245c04169e4 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28618 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: fix OHFRIENDNAMSIZ issueJianhua Xie
Add 1 byte for store '\0' for future sprintf calling. Change-Id: I7911c22c0ce54144ef4e0acb43302d92e1379dcf Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28617 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: fix release_pcd_mem() calltraceJianhua Xie
Add NULL pointer checking to avoid release_pcd_mem calltrace while release memory. Change-Id: I83a3ed6dcd8fcff22db75dba6670ce03d427c04a Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28616 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Replace a micro SLAVE_IS_OK with IS_UPJianhua Xie
At the hot path, use IS_UP() to instead of SLAVE_IS_OK() to reduce unnecessary condition checking. Change-Id: Ib17db501fb214f74e489940912a3c3be3920f633 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28615 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Remove a pair of duplicate readlock()Jianhua Xie
Near the caller, old codes have below read lock: ...... readlock(bond->lock); ...... readlock(bond->lock); readunlock(bond->lock); ... readunlock(bond->unlock); ...... The read lock in middle of above lines is unnecessary, which should be removed. Change-Id: Icbb1b3a15007d413101c8a36151e85dfaedd6e68 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28614 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Refactor public subroutine op_alloc_pool_channelJianhua Xie
Old codes hide alloc_pool_channel in a subroutine, which method can't create multiple pool channels. This patch is abstracting allocation pool channel relative codes to a new subroutine, which can create multiple pool channel for different bonding instances. Change-Id: I428bf4dae0386aeb9557959f641cfd55ad707988 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28613 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Fix klocwork issues.Jianhua Xie
Change-Id: Iedc5272bd1c367fc6944ff277b5207df44a00890 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28612 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Refactor is_dpa_eth_port()Jianhua Xie
Change-Id: Ie5718f4acc69fe8b9990885820727422ee7ae8b7 Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28611 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-03-09bonding: Refactor CEETM QoS APIsJianhua Xie
Old codes parsed node full name to get information which ceetm required. This patch replaces the old method with the standard sys_call of_property_read_u32. Change-Id: I364b3b66837eab2e14a33977a182add3d48a273f Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/28610 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
2015-02-13Merge branch 'rtmerge'Scott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: arch/arm/kvm/mmu.c arch/arm/mm/proc-v7-3level.S arch/powerpc/kernel/vdso32/getcpu.S drivers/crypto/caam/error.c drivers/crypto/caam/sg_sw_sec4.h drivers/usb/host/ehci-fsl.c
2015-02-13Reset to 3.12.37Scott Wood
2014-09-02bonding: fix missed readunlock() while hack LAGJianhua Xie
LAG is hacking bond_3ad_xmit_xor() with Freescale PCD. Original xmit policy is calculated by CPU and software, after hacking, this part is replaced with FMan Keygen hashing to instead. This patch is fixing an errors on missing readunlock() after called readlock() while hacking. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: Ieffe4b340c83e6f2d6f20fd01124eaf1865c81a5 Reviewed-on: http://git.am.freescale.net:8181/17717 Reviewed-by: Honghua Yin <Hong-Hua.Yin@freescale.com> Reviewed-by: Chenhui Zhao <chenhui.zhao@freescale.com>
2014-08-06bonding: correct FM_PCD_KgSchemeDelete() error in LAGJianhua Xie
1. Correct FM_PCD_KgSchemeDelete() error in LAG: In FMD UM, at the chapter Frame Manager PCD Runtime Unit API, FM_PCD_KgSchemeDelete() is told to be run at run-time, and FM_PCD_KgSchemeDelete() is allowed only following FM_PCD_Init() & FM_PCD_KgSchemeSet(). But in HW LAG test codes, even follows FM_PCD_Init() & FM_PCD_KgSchemeSet(), this API reports errors: cpu3/3: ! MINOR FM-PCD Error [CPU03, drivers/net/ethernet/freescale/fman/Peripherals/FM/Pcd/fm_kg.c:983 InvalidateSchemeSw]: Invalid State; cpu3/3: Trying to delete a scheme that has ports bound tocpu3/3: cpu3/3: ! MINOR FM-PCD Error [CPU03, drivers/net/ethernet/freescale/fman/Peripherals/FM/Pcd/fm_kg.c:3046 FM_PCD_KgSchemeDelete]: Invalid State; cpu3/3: cpu3/3: KgSchemeDelete(h_Schemes[0]) = c00000002e4c0038 Err. In order to simply the HW LAG codes, and get rid of this error, insert FM_PORT_DeletePCD() before FM_PCD_KgSchemeDelete(). 2. Adjust multiple schemes order with FM_PCD_KgSchemeSet: To ensure distribution order, adjust multiple schemes order, please refer to the chapter "The dist_order Element" of FMCTUG. Change-Id: I4aeef84e796b624d8625db48c0c3098bf9f81abd Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/14202 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
2014-08-06bonding: avoid re-allocate PCD memory for HW LAGJianhua Xie
Add a flag to avoid re-allocation PCD memory for HW LAG. Without this patch, old codes can introduce memory leak when call alloc_pcd_mem() in HW LAG. Change-Id: I92065867c8a2e8ea8315f4cd0de1b3ec99c512e7 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/14201 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
2014-08-06bonding: fix wrong PCD Extract fieldJianhua Xie
fix wrong memset parameters for PCD Extract field ethernet.dst. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: I8f240ccce2743f1d4da7825e7cfaf0894e5fa33d Reviewed-on: http://git.am.freescale.net:8181/14200 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
2014-08-06bonding: fix "TCP/UDP over IPv4 can't forwarding" errorJianhua Xie
Old codes were helping to fill parser result only when FMan calculated CSUM. This patch is helping FMan to fill parser result no matter whether FMan calculates CSUM or not. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: I812f78a4b8f40dec3ef4d36e2624f4e140995112 Reviewed-on: http://git.am.freescale.net:8181/13581 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
2014-08-06bonding: fix unfit debug description.Jianhua Xie
fix unfit debug description, add caller source. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: I9266df0244ba7fb3918b1bfbbd89fc54f7d97175 Reviewed-on: http://git.am.freescale.net:8181/13580 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
2014-05-14Reset to 3.12.19Scott Wood
2014-05-14Merge remote-tracking branch 'stable/linux-3.12.y' into sdk-v1.6.xScott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: arch/sparc/Kconfig drivers/tty/tty_buffer.c
2014-04-22bonding: make LAG codes to fit different kernelJianhua Xie
Linux kernel API bond_for_each_slave of bonding.h has different params along with different kernel version as below table: Older version: bond_for_each_slave(bond, pos, int cnt), 3.11.0-rc1(dec1e90e8): bond_for_each_slave(bond, pos), 3.12.0-rc1(9caff1e7b): bond_for_each_slave(bond, slave, list_head *iter) This patch is making LAG codes to fit different kernel version. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: Ifd365cd232aba67a437b156568d88c6e16c44c1a Reviewed-on: http://git.am.freescale.net:8181/11293 Reviewed-by: Dongsheng Wang <dongsheng.wang@freescale.com> Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com>
2014-04-18Merge branch 'master-tmp' into sdk-v1.6.xScott Wood
master-tmp is the master branch as of 8b60f5ea90c49344692a70f62cd4aa349de38b48 with the following commits reverted due to excessive conflicts: commit b35a69559c46e066e6f24bb02d5a6090483786e3 Author: Scott Wood <scottwood@freescale.com> Date: Fri Apr 18 15:27:52 2014 -0500 Revert "net: add sysfs helpers for netdev_adjacent logic" This reverts commit 0be682ca768d671c91cfd1379759efcb3b29102a. commit 1c0dc06e47e11bf758f3e84ea90c2178a31dbf0f Author: Scott Wood <scottwood@freescale.com> Date: Fri Apr 18 15:27:47 2014 -0500 Revert "net: rename sysfs symlinks on device name change" This reverts commit 45ce45c69750b93b8262aa66792185bd49150293. Conflicts: drivers/iommu/fsl_pamu.c drivers/net/bonding/bond_3ad.c drivers/net/bonding/bond_sysfs.c drivers/net/bonding/bonding.h drivers/net/ethernet/freescale/gianfar.c Signed-off-by: Scott Wood <scottwood@freescale.com> Conflicts: drivers/iommu/fsl_pamu.c drivers/net/bonding/bond_3ad.c drivers/net/bonding/bond_sysfs.c drivers/net/bonding/bonding.h drivers/net/ethernet/freescale/gianfar.c
2014-04-18bonding: set correct vlan id for alb xmit pathdingtianhong
[ Upstream commit fb00bc2e6cd2046282ba4b03f4fe682aee70b2f8 ] The commit d3ab3ffd1d728d7ee77340e7e7e2c7cfe6a4013e (bonding: use rlb_client_info->vlan_id instead of ->tag) remove the rlb_client_info->tag, but occur some issues, The vlan_get_tag() will return 0 for success and -EINVAL for error, so the client_info->vlan_id always be set to 0 if the vlan_get_tag return 0 for success, so the client_info would never get a correct vlan id. We should only set the vlan id to 0 when the vlan_get_tag return error. Fixes: d3ab3ffd1d7 (bonding: use rlb_client_info->vlan_id instead of ->tag) CC: Ding Tianhong <dingtianhong@huawei.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-04-15bonding: make pcd basefqid aligned in FSL LAGJianhua Xie
Distribution may result in less than hashDistributionNumOfFqids queues if baseFqid unaligned. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: I3bb9dd597d305759a81a0319957d92c096294c4c Reviewed-on: http://git.am.freescale.net:8181/10949 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
2014-04-11bonding: kernel space PCD for xmit distributionJianhua Xie
The kernel space PCD part provides hash based outgoing traffic distribution. The sources can be L2 MAC/L3 SRC and DST IP addr/ L4 SRC and DST port information. Current version only support L2 information hash which is the default transmit policy in the Linux bonding driver. Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: Ifd85630ab0eebd77713574f6cf51fb92203a1c06 Reviewed-on: http://git.am.freescale.net:8181/10414 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Florinel Iordache <Florin.Iordache@freescale.com> Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
2014-04-11bonding: LAG with outgoing traffic distribution based on h/wJianhua Xie
Linux bonding driver provides a method for aggregating multiple network interface controllers (NICs) into a single logical bonded interface of two or more so called (NIC) slaves. Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option. This selection algorithm in Linux bonding driver is based on software. The QorIQ Data Path Acceleration Architecture (DPAA) is a comprehensive architecture which integrates all aspects of packet processing in the SoC, addressing issues and requirements resulting from the multicore nature of QorIQ SoCs. The DPAA includes Cores, Network and packet I/O, Hardware offload accelerators. Hardware offload accelerators include FMan/BMan/QMan and etc. Offline port is one of FMan elements, which supports (Parse, Classify, Distribute) PCD function on frames extracted frame descriptor (FD). Offline port also can inspect traffic, split it into FQs on ingress, and send traffic from the FQs to the interface on egress by the PCD function. These patches are enhancing Linux kernel LAG (Link Aggregation) with Freescale DPAA value added. The main idea is to utilize offline ports with PCD function to help to distribute outgoing traffics, including outgoing slaves device searching and selection. In another world, patches are using CRC-64 based hashing of Keygen/scheme and the parser result of outgoing frames header information to distribute outgoing frames. Beside of above, after integration this HW based LAG with Freescale CEETMQos, these two features can support hardware based Qos for bundles links rather than individual links. These patches mainly include 2 parts: 'glue logic' and 'kernel space PCD'. The glue logic first probes all available offline ports information via reading dts, including tx fqid/default fqid/errors fqid, pcd fqs, other private data pointer of offline ports for future reusing. The glue logic also creates frames from skb and then sends these frames to offline port directly, this offline port will continue to distribution frames from the PCD FQs to the slave interface on egress by the PCD function, rather than select slave device by CPU, neither make slave device driver create frame from skb, nor make slave devices driver send frames. These patches are supporting the mapping among offline ports and available bundles at run-time. PCD based outgoing traffic distribution can be enabled or disabled at run-time by sysfs interface in patches. To do: 1. PCD policy L23/L34 have not been veryfied. 2. offline port buffer pool/buffer layout will be enhanced. 3. software based L4 csum for now, offline port based L4 csum need be fixed. To test this HW based LAG after booting up Linux: cd /sys/class/net/bond0/bonding/ echo 4 >mode cat offline_ports echo fman0-oh@1 > oh_needed_for_hw_distribution cat oh_needed_for_hw_distribution cat oh_en echo 1 >oh_en cat oh_en echo +fm1-gb0 >slaves echo +fm1-gb1 >slaves ifconfig bond0 192.168.10.2/24 up ping 192.168.10.1 -c 5 Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Change-Id: I3a6664bfcc9ec9ca3f86a5e36381220c5fcb07cf Reviewed-on: http://git.am.freescale.net:8181/10413 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com> Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
2014-02-26bonding: 802.3ad: make aggregator_identifier bond-privateJiri Bohac
[ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ] aggregator_identifier is used to assign unique aggregator identifiers to aggregators of a bond during device enslaving. aggregator_identifier is currently a global variable that is zeroed in bond_3ad_initialize(). This sequence will lead to duplicate aggregator identifiers for eth1 and eth3: create bond0 change bond0 mode to 802.3ad enslave eth0 to bond0 //eth0 gets agg id 1 enslave eth1 to bond0 //eth1 gets agg id 2 create bond1 change bond1 mode to 802.3ad enslave eth2 to bond1 //aggregator_identifier is reset to 0 //eth2 gets agg id 1 enslave eth3 to bond0 //eth3 gets agg id 2 Fix this by making aggregator_identifier private to the bond. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2013-12-08bonding: fix two race conditions in bond_store_updelay/downdelayNikolay Aleksandrov
[ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ] This patch fixes two race conditions between bond_store_updelay/downdelay and bond_store_miimon which could lead to division by zero as miimon can be set to 0 while either updelay/downdelay are being set and thus miss the zero check in the beginning, the zero div happens because updelay/downdelay are stored as new_value / bond->params.miimon. Use rtnl to synchronize with miimon setting. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08bonding: don't permit to use ARP monitoring in 802.3ad modeVeaceslav Falico
[ Upstream commit ec9f1d15db8185f63a2c3143dc1e90ba18541b08 ] Currently the ARP monitoring is not supported with 802.3ad, and it's prohibited to use it via the module params. However we still can set it afterwards via sysfs, cause we only check for *LB modes there. To fix this - add a check for 802.3ad mode in bonding_store_arp_interval. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08bonding: RCUify bond_set_rx_mode()Veaceslav Falico
[ Upstream commit b32418705107265dfca5edfe2b547643e53a732e ] Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not always the case: RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391) ... [<ffffffff81651ca5>] dump_stack+0x54/0x74 [<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding] [<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0 [<ffffffff81557ff8>] __dev_mc_add+0x58/0x70 [<ffffffff81558020>] dev_mc_add+0x10/0x20 [<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0 [<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260 [<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320 [<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260 ... Fix this by using RCU primitives. Reported-by: Joe Lawrence <joe.lawrence@stratus.com> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-28bonding: disable arp and enable mii monitoring when bond change to no uses ↵dingtianhong
arp mode Because the ARP monitoring is not support for 802.3ad, but I still could change the mode to 802.3ad from ab mode while ARP monitoring is running, it is incorrect. So add a check for 802.3ad in bonding_store_mode to fix the problem, and make a new macro BOND_NO_USES_ARP() to simplify the code. v2: according to the Dan Williams's suggestion, bond mode is the most important bond option, it should override any of the other sub-options. So when the mode is changed, the conficting values should be cleared or reset, otherwise the user has to duplicate more operations to modify the logic. I disable the arp and enable mii monitoring when the bond mode is changed to AB, TB and 8023AD if the arp interval is true. v3: according to the Nik's suggestion, the default value of miimon should need a name, there is several place to use it, and the bond_store_arp_interval() could use micro BOND_NO_USES_ARP to make the code more simpify. Suggested-by: Dan Williams <dcbw@redhat.com> Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-15bonding: add ip checks when store ip targetWang Weidong
I met a Bug when I add ip target with the wrong ip address: echo +500.500.500.500 > /sys/class/net/bond0/bonding/arp_ip_target the wrong ip address will transfor to 245.245.245.244 and add to the ip target success, it is uncorrect, so I add checks to avoid adding wrong address. The in4_pton() will set wrong ip address to 0.0.0.0, it will return by the next check and will not add to ip target. v2 According Veaceslav's opinion, simplify the code. v3 According Veaceslav's opinion, add broadcast check and make a micro definition to package it. v4 Solve the problem of the format which David point out. Suggested-by: Veaceslav Falico <vfalico@redhat.com> Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-14bonding: fix two race conditions in bond_store_updelay/downdelayNikolay Aleksandrov
This patch fixes two race conditions between bond_store_updelay/downdelay and bond_store_miimon which could lead to division by zero as miimon can be set to 0 while either updelay/downdelay are being set and thus miss the zero check in the beginning, the zero div happens because updelay/downdelay are stored as new_value / bond->params.miimon. Use rtnl to synchronize with miimon setting. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-14bonding: don't permit to use ARP monitoring in 802.3ad modeVeaceslav Falico
Currently the ARP monitoring is not supported with 802.3ad, and it's prohibited to use it via the module params. However we still can set it afterwards via sysfs, cause we only check for *LB modes there. To fix this - add a check for 802.3ad mode in bonding_store_arp_interval. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) The addition of nftables. No longer will we need protocol aware firewall filtering modules, it can all live in userspace. At the core of nftables is a, for lack of a better term, virtual machine that executes byte codes to inspect packet or metadata (arriving interface index, etc.) and make verdict decisions. Besides support for loading packet contents and comparing them, the interpreter supports lookups in various datastructures as fundamental operations. For example sets are supports, and therefore one could create a set of whitelist IP address entries which have ACCEPT verdicts attached to them, and use the appropriate byte codes to do such lookups. Since the interpreted code is composed in userspace, userspace can do things like optimize things before giving it to the kernel. Another major improvement is the capability of atomically updating portions of the ruleset. In the existing netfilter implementation, one has to update the entire rule set in order to make a change and this is very expensive. Userspace tools exist to create nftables rules using existing netfilter rule sets, but both kernel implementations will need to co-exist for quite some time as we transition from the old to the new stuff. Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have worked so hard on this. 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements to our pseudo-random number generator, mostly used for things like UDP port randomization and netfitler, amongst other things. In particular the taus88 generater is updated to taus113, and test cases are added. 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet and Yang Yingliang. 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin Sujir. 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet, Neal Cardwell, and Yuchung Cheng. 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary control message data, much like other socket option attributes. From Francesco Fusco. 7) Allow applications to specify a cap on the rate computed automatically by the kernel for pacing flows, via a new SO_MAX_PACING_RATE socket option. From Eric Dumazet. 8) Make the initial autotuned send buffer sizing in TCP more closely reflect actual needs, from Eric Dumazet. 9) Currently early socket demux only happens for TCP sockets, but we can do it for connected UDP sockets too. Implementation from Shawn Bohrer. 10) Refactor inet socket demux with the goal of improving hash demux performance for listening sockets. With the main goals being able to use RCU lookups on even request sockets, and eliminating the listening lock contention. From Eric Dumazet. 11) The bonding layer has many demuxes in it's fast path, and an RCU conversion was started back in 3.11, several changes here extend the RCU usage to even more locations. From Ding Tianhong and Wang Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav Falico. 12) Allow stackability of segmentation offloads to, in particular, allow segmentation offloading over tunnels. From Eric Dumazet. 13) Significantly improve the handling of secret keys we input into the various hash functions in the inet hashtables, TCP fast open, as well as syncookies. From Hannes Frederic Sowa. The key fundamental operation is "net_get_random_once()" which uses static keys. Hannes even extended this to ipv4/ipv6 fragmentation handling and our generic flow dissector. 14) The generic driver layer takes care now to set the driver data to NULL on device removal, so it's no longer necessary for drivers to explicitly set it to NULL any more. Many drivers have been cleaned up in this way, from Jingoo Han. 15) Add a BPF based packet scheduler classifier, from Daniel Borkmann. 16) Improve CRC32 interfaces and generic SKB checksum iterators so that SCTP's checksumming can more cleanly be handled. Also from Daniel Borkmann. 17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces using the interface MTU value. This helps avoid PMTU attacks, particularly on DNS servers. From Hannes Frederic Sowa. 18) Use generic XPS for transmit queue steering rather than internal (re-)implementation in virtio-net. From Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits) random32: add test cases for taus113 implementation random32: upgrade taus88 generator to taus113 from errata paper random32: move rnd_state to linux/random.h random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized random32: add periodic reseeding random32: fix off-by-one in seeding requirement PHY: Add RTL8201CP phy_driver to realtek xtsonic: add missing platform_set_drvdata() in xtsonic_probe() macmace: add missing platform_set_drvdata() in mace_probe() ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe() ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh vlan: Implement vlan_dev_get_egress_qos_mask as an inline. ixgbe: add warning when max_vfs is out of range. igb: Update link modes display in ethtool netfilter: push reasm skb through instead of original frag skbs ip6_output: fragment outgoing reassembled skb properly MAINTAINERS: mv643xx_eth: take over maintainership from Lennart net_sched: tbf: support of 64bit rates ixgbe: deleting dfwd stations out of order can cause null ptr deref ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS ...
2013-11-07bonding: extend round-robin mode with packets_per_slaveNikolay Aleksandrov
This patch aims to extend round-robin mode with a new option called packets_per_slave which can have the following values and effects: 0 - choose a random slave 1 (default) - standard round-robin, 1 packet per slave >1 - round-robin when >1 packets have been transmitted per slave The allowed values are between 0 and 65535. This patch also fixes the comment style in bond_xmit_roundrobin(). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02bonding: bond_get_size() returns wrong sizeDan Carpenter
There is an extra semi-colon so bond_get_size() doesn't return the correct value. Fixes: ec76aa49855f ('bonding: add Netlink support active_slave option') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28Revert "Merge branch 'bonding_monitor_locking'"David S. Miller
This reverts commit 4d961a101e032b4bf223b279b4b35bc77576f5a8, reversing changes made to a00f6fcc7d0c62a91768d9c4ccba4c7d64fbbce3. Revert bond locking changes, they cause regressions and Veaceslav Falico doesn't like how the commit messages were done at all. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27bonding: remove bond read lock for bond_3ad_state_machine_handler()dingtianhong
The bond slave list may change when the monitor is running, the slave list is no longer protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it: 1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe to call call_netdevice_notifiers() in write lock. 2.remove unused bond->lock for monitor function, only use the existing rtnl lock(). 3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored. so I remove the bond->lock and move the rtnl lock to protect the whole monitor function. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27bonding: remove bond read lock for bond_activebackup_arp_mon()dingtianhong
The bond slave list may change when the monitor is running, the slave list is no longer protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it: 1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe to call call_netdevice_notifiers() in write lock. 2.remove unused bond->lock for monitor function, only use the existing rtnl lock(). 3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored. so I remove the bond->lock and move the rtnl lock to protect the whole monitor function. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27bonding: remove bond read lock for bond_loadbalance_arp_mon()dingtianhong
The bond slave list may change when the monitor is running, the slave list is no longer protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it: 1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe to call call_netdevice_notifiers() in write lock. 2.remove unused bond->lock for monitor function, only use the existing rtnl lock(). 3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored. so I remove the bond->lock and add the rtnl lock to protect the whole monitor function. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27bonding: remove bond read lock for bond_alb_monitor()dingtianhong
The bond slave list may change when the monitor is running, the slave list is no longer protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it: 1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe to call call_netdevice_notifiers() in write lock. 2.remove unused bond->lock for monitor function, only use the existing rtnl lock(). 3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored. so I remove the bond->lock and move the rtnl lock to protect the whole monitor function. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27bonding: remove bond read lock for bond_mii_monitor()dingtianhong
The bond slave list may change when the monitor is running, the slave list is no longer protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it: 1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe to call call_netdevice_notifiers() in write lock. 2.remove unused bond->lock for monitor function, only use the existing rtnl lock(). 3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored. so I remove the bond->lock and move the rtnl lock to protect the whole monitor function. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25net: fix rtnl notification in atomic contextAlexei Starovoitov
commit 991fb3f74c "dev: always advertise rx_flags changes via netlink" introduced rtnl notification from __dev_set_promiscuity(), which can be called in atomic context. Steps to reproduce: ip tuntap add dev tap1 mode tap ifconfig tap1 up tcpdump -nei tap1 & ip tuntap del dev tap1 mode tap [ 271.627994] device tap1 left promiscuous mode [ 271.639897] BUG: sleeping function called from invalid context at mm/slub.c:940 [ 271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip [ 271.677525] INFO: lockdep is turned off. [ 271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G W 3.12.0-rc3+ #73 [ 271.703996] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 271.731254] ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5 ffff88082fa0f428 [ 271.760261] ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1 ffffffff81110ff8 [ 271.790683] 0000000000000010 00000000000000d0 00000000000000d0 ffff8807f0d57af8 [ 271.822332] Call Trace: [ 271.838234] [<ffffffff817544e5>] dump_stack+0x55/0x76 [ 271.854446] [<ffffffff8108bad1>] __might_sleep+0x181/0x240 [ 271.870836] [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0 [ 271.887076] [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0 [ 271.903368] [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0 [ 271.919716] [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0 [ 271.936088] [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0 [ 271.952504] [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0 [ 271.968902] [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100 [ 271.985302] [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0 [ 272.001642] [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0 [ 272.017917] [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380 [ 272.033961] [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50 [ 272.049855] [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0 [ 272.065494] [<ffffffff81732052>] packet_notifier+0x1b2/0x380 [ 272.080915] [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380 [ 272.096009] [<ffffffff81761c66>] notifier_call_chain+0x66/0x150 [ 272.110803] [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10 [ 272.125468] [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20 [ 272.139984] [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70 [ 272.154523] [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20 [ 272.168552] [<ffffffff816224c5>] rollback_registered_many+0x145/0x240 [ 272.182263] [<ffffffff81622641>] rollback_registered+0x31/0x40 [ 272.195369] [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90 [ 272.208230] [<ffffffff81547ca0>] __tun_detach+0x140/0x340 [ 272.220686] [<ffffffff81547ed6>] tun_chr_close+0x36/0x60 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-22bonding: move bond-specific init after enslave happensVeaceslav Falico
As Jiri noted, currently we first do all bonding-specific initialization (specifically - bond_select_active_slave(bond)) before we actually attach the slave (so that it becomes visible through bond_for_each_slave() and friends). This might result in bond_select_active_slave() not seeing the first/new slave and, thus, not actually selecting an active slave. Fix this by moving all the bond-related init part after we've actually completely initialized and linked (via bond_master_upper_dev_link()) the new slave. Also, remove the bond_(de/a)ttach_slave(), it's useless to have functions to ++/-- one int. After this we have all the initialization of the new slave *before* linking, and all the stuff that needs to be done on bonding *after* it. It has also a bonus effect - we can remove the locking on the new slave init completely, and only use it for bond_select_active_slave(). Reported-by: Jiri Pirko <jiri@resnulli.us> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong@huawei.com Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19bonding: Remove __exit tag from bond_netlink_fini().David S. Miller
It can be called from the module init function, so it cannot be in the exit section. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19bonding: add Netlink support active_slave optionJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19bonding: add Netlink support mode optionJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>