|
Change-Id: I6738d25d36476502c93f28f5023df95c4ba7adb3
Reviewed-on: http://git.am.freescale.net:8181/35494
Reviewed-by: Pan Jiafei <Jiafei.Pan@freescale.com>
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: Ie61c478a82ce07b53fee94fcbc78984343168232
Reviewed-on: http://git.am.freescale.net:8181/35653
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Honghua Yin <Hong-Hua.Yin@freescale.com>
|
|
Change-Id: I3845a805706eb1f78ecd4e208c26252cec3bf5a3
Reviewed-on: http://git.am.freescale.net:8181/30479
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
When qman_init_fq() returns 0, the FQ has been initialized successfully,
so the qman_query_fq() call is no longer required.
Change-Id: Ied9c151de47b3521949237fb123dd5be3081edfb
Reviewed-on: http://git.am.freescale.net:8181/30329
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
When the link of one of the slaves is down, ping reports a failure. This
patch adds debugging information to help track it down.
Change-Id: Ied9a2f067e0b19ac4c1e94b52ea67245c04169e4
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28618
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Add 1 byte to store the terminating '\0' for future sprintf calls.
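A minimal user-space sketch of the sizing rule, not the driver's code. The helper name make_name and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "<prefix><id>" in a freshly allocated buffer.
 * The allocation is the visible length plus 1 byte for the trailing '\0'
 * that the printf family always writes. */
char *make_name(const char *prefix, unsigned int id)
{
    assert(prefix != NULL);

    /* snprintf(NULL, 0, ...) returns the length without the terminator. */
    int len = snprintf(NULL, 0, "%s%u", prefix, id);
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t)len + 1);    /* +1 for '\0' */
    if (buf == NULL)
        return NULL;

    snprintf(buf, (size_t)len + 1, "%s%u", prefix, id);
    return buf;
}
```

The caller owns the returned buffer and releases it with free().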
Change-Id: I7911c22c0ce54144ef4e0acb43302d92e1379dcf
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28617
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Add a NULL pointer check to avoid a release_pcd_mem call trace when
releasing memory.
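The guard can be sketched as follows in user-space C. The struct layout and the names pcd_mem, release_one and release_pcd_mem_sketch are assumptions for illustration; the driver's real bookkeeping is not shown in this message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's PCD memory bookkeeping. */
struct pcd_mem {
    void *scheme;
    void *net_env;
};

/* Frees *slot if it holds memory and clears it. Returns 1 if something
 * was freed, 0 if the slot was already empty. */
static int release_one(void **slot)
{
    assert(slot != NULL);   /* a missing slot is a programming error */

    if (*slot == NULL)
        return 0;
    free(*slot);
    *slot = NULL;           /* guard against a double release */
    return 1;
}

/* Releases every member that was actually allocated and returns the
 * number of pointers released. A NULL argument is a harmless no-op. */
int release_pcd_mem_sketch(struct pcd_mem *mem)
{
    if (mem == NULL)
        return 0;
    return release_one(&mem->scheme) + release_one(&mem->net_env);
}
```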
Change-Id: I83a3ed6dcd8fcff22db75dba6670ce03d427c04a
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28616
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
On the hot path, use IS_UP() instead of SLAVE_IS_OK()
to reduce unnecessary condition checking.
Change-Id: Ib17db501fb214f74e489940912a3c3be3920f633
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28615
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Near the caller, the old code takes the read lock as follows:
......
readlock(bond->lock);
......
readlock(bond->lock);
readunlock(bond->lock);
...
readunlock(bond->lock);
......
The read lock in the middle of the lines above is unnecessary and
should be removed.
Change-Id: Icbb1b3a15007d413101c8a36151e85dfaedd6e68
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28614
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
The old code hid alloc_pool_channel in a subroutine, so it could not
create multiple pool channels. This patch moves the pool channel
allocation code into a new subroutine, which can create multiple pool
channels for different bonding instances.
Change-Id: I428bf4dae0386aeb9557959f641cfd55ad707988
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28613
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Change-Id: Iedc5272bd1c367fc6944ff277b5207df44a00890
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28612
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Change-Id: Ie5718f4acc69fe8b9990885820727422ee7ae8b7
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28611
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
The old code parsed the node's full name to get the information
that CEETM requires. This patch replaces that method with the
standard device tree helper of_property_read_u32().
Change-Id: I364b3b66837eab2e14a33977a182add3d48a273f
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/28610
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
|
|
Signed-off-by: Scott Wood <scottwood@freescale.com>
Conflicts:
arch/arm/kvm/mmu.c
arch/arm/mm/proc-v7-3level.S
arch/powerpc/kernel/vdso32/getcpu.S
drivers/crypto/caam/error.c
drivers/crypto/caam/sg_sw_sec4.h
drivers/usb/host/ehci-fsl.c
|
|
|
|
LAG modifies bond_3ad_xmit_xor() to use the Freescale PCD.
The original xmit policy is calculated by the CPU in software;
after the modification, that part is replaced by FMan Keygen
hashing.
This patch fixes a missing readunlock() after readlock()
in the modified code.
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: Ieffe4b340c83e6f2d6f20fd01124eaf1865c81a5
Reviewed-on: http://git.am.freescale.net:8181/17717
Reviewed-by: Honghua Yin <Hong-Hua.Yin@freescale.com>
Reviewed-by: Chenhui Zhao <chenhui.zhao@freescale.com>
|
|
1. Correct FM_PCD_KgSchemeDelete() error in LAG:
According to the FMD UM chapter "Frame Manager PCD Runtime Unit API",
FM_PCD_KgSchemeDelete() may be called at run-time, and only after
FM_PCD_Init() & FM_PCD_KgSchemeSet(). But in the HW LAG test code,
even when it is called after FM_PCD_Init() & FM_PCD_KgSchemeSet(),
this API reports errors:
cpu3/3: ! MINOR FM-PCD Error [CPU03,
drivers/net/ethernet/freescale/fman/Peripherals/FM/Pcd/fm_kg.c:983
InvalidateSchemeSw]: Invalid State;
cpu3/3: Trying to delete a scheme that has ports bound tocpu3/3:
cpu3/3: ! MINOR FM-PCD Error [CPU03,
drivers/net/ethernet/freescale/fman/Peripherals/FM/Pcd/fm_kg.c:3046
FM_PCD_KgSchemeDelete]: Invalid State;
cpu3/3: cpu3/3:
KgSchemeDelete(h_Schemes[0]) = c00000002e4c0038 Err.
To simplify the HW LAG code and get rid of this error,
call FM_PORT_DeletePCD() before FM_PCD_KgSchemeDelete().
2. Adjust the order of multiple schemes with FM_PCD_KgSchemeSet:
To ensure the distribution order, adjust the order of the schemes;
please refer to the chapter "The dist_order Element" of the FMCTUG.
Change-Id: I4aeef84e796b624d8625db48c0c3098bf9f81abd
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/14202
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
|
|
Add a flag to avoid re-allocating PCD memory for HW LAG.
Without this patch, the old code can leak memory
when alloc_pcd_mem() is called in HW LAG.
Change-Id: I92065867c8a2e8ea8315f4cd0de1b3ec99c512e7
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/14201
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
|
|
Fix wrong memset parameters for the PCD extract field
ethernet.dst.
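The message does not say which memset argument was wrong. Swapping the fill value and the byte count is one common form of this bug, so the sketch below shows the correct order under that assumption. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6   /* length of an Ethernet MAC address in bytes */

/* Hypothetical extract descriptor holding a destination-MAC mask. */
struct extract_field {
    unsigned char dst_mask[ETH_ALEN_SKETCH];
    unsigned char guard;    /* must never be touched by the fill */
};

/* memset(dest, value, count): the fill value is the SECOND argument and
 * the byte count the THIRD. Swapping them still compiles, but fills the
 * wrong number of bytes with the wrong value. */
void select_whole_dst_mac(struct extract_field *f)
{
    assert(f != NULL);
    memset(f->dst_mask, 0xff, sizeof(f->dst_mask));
}
```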
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: I8f240ccce2743f1d4da7825e7cfaf0894e5fa33d
Reviewed-on: http://git.am.freescale.net:8181/14200
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
|
|
The old code helped fill in the parser result only
when FMan calculated the CSUM. This patch helps
FMan fill in the parser result regardless of whether
FMan calculates the CSUM.
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: I812f78a4b8f40dec3ef4d36e2624f4e140995112
Reviewed-on: http://git.am.freescale.net:8181/13581
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
|
|
Fix an inaccurate debug description and add the caller source.
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: I9266df0244ba7fb3918b1bfbbd89fc54f7d97175
Reviewed-on: http://git.am.freescale.net:8181/13580
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Richard Schmitt <richard.schmitt@freescale.com>
|
|
|
|
Signed-off-by: Scott Wood <scottwood@freescale.com>
Conflicts:
arch/sparc/Kconfig
drivers/tty/tty_buffer.c
|
|
The Linux kernel API bond_for_each_slave in bonding.h takes different
parameters in different kernel versions, as listed below:
Older version: bond_for_each_slave(bond, pos, int cnt),
3.11.0-rc1(dec1e90e8): bond_for_each_slave(bond, pos),
3.12.0-rc1(9caff1e7b): bond_for_each_slave(bond, slave, list_head *iter)
This patch makes the LAG code fit the different kernel versions.
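A sketch of the version switch, written as a run-time function so it can be exercised outside the kernel. In a driver the same comparison would normally be done at compile time with #if LINUX_VERSION_CODE >= KERNEL_VERSION(...). The enum and function names are hypothetical.

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION() macro. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical labels for the three iterator signatures listed above. */
enum slave_iter_api {
    ITER_POS_CNT,   /* bond_for_each_slave(bond, pos, cnt)    */
    ITER_POS,       /* bond_for_each_slave(bond, pos)         */
    ITER_LIST_HEAD  /* bond_for_each_slave(bond, slave, iter) */
};

/* Maps a kernel version code to the iterator signature it provides. */
enum slave_iter_api pick_slave_iter_api(unsigned int version_code)
{
    assert(version_code != 0);

    if (version_code >= KVER(3, 12, 0))
        return ITER_LIST_HEAD;
    if (version_code >= KVER(3, 11, 0))
        return ITER_POS;
    return ITER_POS_CNT;
}
```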
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: Ifd365cd232aba67a437b156568d88c6e16c44c1a
Reviewed-on: http://git.am.freescale.net:8181/11293
Reviewed-by: Dongsheng Wang <dongsheng.wang@freescale.com>
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
|
|
master-tmp is the master branch as of
8b60f5ea90c49344692a70f62cd4aa349de38b48
with the following commits reverted due to excessive conflicts:
commit b35a69559c46e066e6f24bb02d5a6090483786e3
Author: Scott Wood <scottwood@freescale.com>
Date: Fri Apr 18 15:27:52 2014 -0500
Revert "net: add sysfs helpers for netdev_adjacent logic"
This reverts commit 0be682ca768d671c91cfd1379759efcb3b29102a.
commit 1c0dc06e47e11bf758f3e84ea90c2178a31dbf0f
Author: Scott Wood <scottwood@freescale.com>
Date: Fri Apr 18 15:27:47 2014 -0500
Revert "net: rename sysfs symlinks on device name change"
This reverts commit 45ce45c69750b93b8262aa66792185bd49150293.
Conflicts:
drivers/iommu/fsl_pamu.c
drivers/net/bonding/bond_3ad.c
drivers/net/bonding/bond_sysfs.c
drivers/net/bonding/bonding.h
drivers/net/ethernet/freescale/gianfar.c
Signed-off-by: Scott Wood <scottwood@freescale.com>
Conflicts:
drivers/iommu/fsl_pamu.c
drivers/net/bonding/bond_3ad.c
drivers/net/bonding/bond_sysfs.c
drivers/net/bonding/bonding.h
drivers/net/ethernet/freescale/gianfar.c
|
|
[ Upstream commit fb00bc2e6cd2046282ba4b03f4fe682aee70b2f8 ]
Commit d3ab3ffd1d728d7ee77340e7e7e2c7cfe6a4013e
(bonding: use rlb_client_info->vlan_id instead of ->tag)
removed rlb_client_info->tag but introduced a problem:
vlan_get_tag() returns 0 on success and -EINVAL on
error, so client_info->vlan_id was always set to 0 when
vlan_get_tag() succeeded, and client_info would
never get a correct VLAN id.
We should set the VLAN id to 0 only when vlan_get_tag() returns an error.
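The corrected logic can be sketched in user-space C as below. The struct pkt and the helper get_tag_sketch are hypothetical stand-ins that only imitate the return convention described above; they are not the kernel's skb or vlan_get_tag().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only the fields the sketch needs. */
struct pkt {
    int has_vlan;
    unsigned short tag;
};

/* Imitates the convention described above: 0 on success with the tag
 * stored through *tag, -EINVAL when the packet carries no tag. */
static int get_tag_sketch(const struct pkt *p, unsigned short *tag)
{
    if (!p->has_vlan)
        return -EINVAL;
    *tag = p->tag;
    return 0;
}

/* Returns the VLAN id to record for a client: the real tag on success,
 * 0 only when the lookup failed. */
unsigned short client_vlan_id(const struct pkt *p)
{
    unsigned short vlan_id;

    assert(p != NULL);
    if (get_tag_sketch(p, &vlan_id) != 0)
        vlan_id = 0;
    return vlan_id;
}
```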
Fixes: d3ab3ffd1d7 (bonding: use rlb_client_info->vlan_id instead of ->tag)
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
Distribution may result in fewer than hashDistributionNumOfFqids
queues if baseFqid is unaligned.
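One way to see the effect, assuming (the message does not state this) that each FQID is formed by OR-ing the hash bits into baseFqid and that the queue count is a power of two. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the distinct FQIDs reachable as base | h for h in [0, n).
 * n must be a power of two no larger than 64 (a limit of this sketch). */
unsigned int reachable_fqids(uint32_t base, uint32_t n)
{
    uint32_t seen[64];
    unsigned int count = 0;

    assert(n != 0 && n <= 64 && (n & (n - 1)) == 0);

    for (uint32_t h = 0; h < n; h++) {
        uint32_t fqid = base | h;
        unsigned int i;

        for (i = 0; i < count; i++)
            if (seen[i] == fqid)
                break;
        if (i == count)
            seen[count++] = fqid;
    }
    return count;
}

/* Rounds base up to the next multiple of n (n a power of two). */
uint32_t align_base_fqid(uint32_t base, uint32_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return (base + n - 1) & ~(n - 1);
}
```

Under this assumption a base of 0x1001 with 4 queues reaches only 0x1001 and 0x1003, while the aligned base 0x1004 reaches all 4.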
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: I3bb9dd597d305759a81a0319957d92c096294c4c
Reviewed-on: http://git.am.freescale.net:8181/10949
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
|
|
The kernel space PCD part provides hash based outgoing traffic
distribution. The hash sources can be L2 MAC addresses, L3 SRC and
DST IP addresses, or L4 SRC and DST ports. The current version
supports only the L2 information hash, which is the default transmit
policy in the Linux bonding driver.
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: Ifd85630ab0eebd77713574f6cf51fb92203a1c06
Reviewed-on: http://git.am.freescale.net:8181/10414
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Florinel Iordache <Florin.Iordache@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
|
|
Linux bonding driver provides a method for aggregating multiple network
interface controllers (NICs) into a single logical bonded interface of
two or more so called (NIC) slaves. Slave selection for outgoing traffic
is done according to the transmit hash policy, which may be changed from
the default simple XOR policy via the xmit_hash_policy option. This
selection algorithm in Linux bonding driver is based on software.
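For reference, the software layer-2 XOR policy can be sketched as follows. This is a simplification that XORs only the last byte of each MAC address and is not the driver's code; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6

/* Layer-2 XOR policy: (source MAC XOR destination MAC) modulo slave count.
 * This sketch XORs only the last byte of each address, a simplification. */
unsigned int xmit_hash_l2(const uint8_t src[MAC_LEN],
                          const uint8_t dst[MAC_LEN],
                          unsigned int slave_count)
{
    assert(src != NULL && dst != NULL);
    assert(slave_count > 0);    /* avoid a modulo by zero */

    return (unsigned int)(src[MAC_LEN - 1] ^ dst[MAC_LEN - 1]) % slave_count;
}
```

The returned value is the index of the slave that transmits the frame, so every frame between the same pair of MAC addresses leaves through the same slave.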
The QorIQ Data Path Acceleration Architecture (DPAA) is a comprehensive
architecture which integrates all aspects of packet processing in the
SoC, addressing issues and requirements resulting from the multicore
nature of QorIQ SoCs. The DPAA includes cores, network and packet I/O,
and hardware offload accelerators. The hardware offload accelerators
include FMan, BMan, QMan and others.
An offline port is one of the FMan elements; it supports the PCD (Parse,
Classify, Distribute) function on frames extracted from frame descriptors
(FDs). An offline port can also inspect traffic, split it into FQs on
ingress, and send traffic from the FQs to the interface on egress by the
PCD function.
These patches enhance Linux kernel LAG (Link Aggregation) with
Freescale DPAA added value. The main idea is to use offline ports
with the PCD function to help distribute outgoing traffic, including
searching for and selecting the outgoing slave device. In other words,
the patches use the CRC-64 based hashing of the Keygen scheme and the
parser result of the outgoing frames' header information to distribute
outgoing frames.
In addition, once this HW based LAG is integrated with Freescale
CEETM QoS, the two features can support hardware based QoS for bundled
links rather than individual links.
These patches mainly consist of 2 parts:
'glue logic' and 'kernel space PCD'.
The glue logic first probes all available offline port information by
reading the dts, including the tx fqid, default fqid and error fqid, the
pcd fqs, and other private data pointers of offline ports for future
reuse. The glue logic also creates frames from the skb and sends them to
the offline port directly. The offline port then distributes the frames
from the PCD FQs to the slave interface on egress by the PCD function, so
the CPU does not select the slave device, and the slave device driver
neither creates frames from the skb nor sends them.
These patches support mapping between offline ports and available
bundles at run-time. PCD based outgoing traffic distribution can be
enabled or disabled at run-time through the sysfs interface in the patches.
To do:
1. PCD policies L23/L34 have not been verified.
2. The offline port buffer pool/buffer layout will be enhanced.
3. L4 csum is software based for now; offline port based L4 csum needs to be fixed.
To test this HW based LAG after booting up Linux:
cd /sys/class/net/bond0/bonding/
echo 4 >mode
cat offline_ports
echo fman0-oh@1 > oh_needed_for_hw_distribution
cat oh_needed_for_hw_distribution
cat oh_en
echo 1 >oh_en
cat oh_en
echo +fm1-gb0 >slaves
echo +fm1-gb1 >slaves
ifconfig bond0 192.168.10.2/24 up
ping 192.168.10.1 -c 5
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Change-Id: I3a6664bfcc9ec9ca3f86a5e36381220c5fcb07cf
Reviewed-on: http://git.am.freescale.net:8181/10413
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Jiafei Pan <Jiafei.Pan@freescale.com>
Reviewed-by: Jose Rivera <German.Rivera@freescale.com>
|
|
[ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ]
aggregator_identifier is used to assign unique aggregator identifiers
to aggregators of a bond during device enslaving.
aggregator_identifier is currently a global variable that is zeroed in
bond_3ad_initialize().
This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:
create bond0
change bond0 mode to 802.3ad
enslave eth0 to bond0 //eth0 gets agg id 1
enslave eth1 to bond0 //eth1 gets agg id 2
create bond1
change bond1 mode to 802.3ad
enslave eth2 to bond1 //aggregator_identifier is reset to 0
//eth2 gets agg id 1
enslave eth3 to bond0 //eth3 gets agg id 2
Fix this by making aggregator_identifier private to the bond.
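A sketch of the per-bond counter in user-space C. The struct and function names are hypothetical and only model the identifier assignment, not the rest of the 802.3ad state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bond: the identifier counter lives in the bond itself. */
struct bond_sketch {
    unsigned int aggregator_identifier;
};

/* Mirrors the reset done at initialization, but only for this bond. */
void bond_init_sketch(struct bond_sketch *bond)
{
    assert(bond != NULL);
    bond->aggregator_identifier = 0;
}

/* Hands out the next identifier; other bonds cannot disturb the count. */
unsigned int enslave_sketch(struct bond_sketch *bond)
{
    assert(bond != NULL);
    return ++bond->aggregator_identifier;
}
```

Replaying the sequence above with this model, eth3 receives a fresh identifier in bond0 instead of repeating the one given to eth1.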
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ]
This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to division by zero as miimon can
be set to 0 while either updelay/downdelay are being set and thus miss the
zero check in the beginning, the zero div happens because updelay/downdelay
are stored as new_value / bond->params.miimon. Use rtnl to synchronize with
miimon setting.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ec9f1d15db8185f63a2c3143dc1e90ba18541b08 ]
Currently the ARP monitoring is not supported with 802.3ad, and it's
prohibited to use it via the module params.
However we still can set it afterwards via sysfs, cause we only check for
*LB modes there.
To fix this - add a check for 802.3ad mode in bonding_store_arp_interval.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b32418705107265dfca5edfe2b547643e53a732e ]
Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not
always the case:
RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
[<ffffffff81651ca5>] dump_stack+0x54/0x74
[<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
[<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
[<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
[<ffffffff81558020>] dev_mc_add+0x10/0x20
[<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
[<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
[<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
[<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...
Fix this by using RCU primitives.
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
arp mode
ARP monitoring is not supported for 802.3ad, but it is still possible
to change the mode to 802.3ad from ab mode while ARP monitoring
is running, which is incorrect.
So add a check for 802.3ad in bonding_store_mode to fix the problem,
and add a new macro BOND_NO_USES_ARP() to simplify the code.
v2: Following Dan Williams's suggestion: the bond mode is the most
important bond option and should override any of the other sub-options.
So when the mode is changed, conflicting values should be cleared
or reset; otherwise the user has to perform extra operations to change
the configuration. ARP monitoring is disabled and MII monitoring enabled
when the bond mode is changed to AB, TB or 8023AD while the arp interval is set.
v3: Following Nik's suggestion: the default value of miimon should have
a name, since it is used in several places, and bond_store_arp_interval()
can use the macro BOND_NO_USES_ARP to simplify the code.
Suggested-by: Dan Williams <dcbw@redhat.com>
Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I hit a bug when adding an ip target with an invalid ip address:
echo +500.500.500.500 > /sys/class/net/bond0/bonding/arp_ip_target
The invalid address is converted to 245.245.245.244 and added
to the ip targets successfully, which is incorrect, so I add checks to
avoid adding an invalid address.
in4_pton() sets an invalid ip address to 0.0.0.0, so the next check
returns and the address is not added to the ip targets.
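A user-space analogue of the validation, using POSIX inet_pton() in place of the kernel's in4_pton(). The function name is hypothetical, and the zero-address and broadcast checks follow the description in this message rather than the actual patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Returns 1 when text is a usable IPv4 target, 0 otherwise. */
int is_valid_ip_target(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);

    /* inet_pton() returns 1 only for a well-formed dotted quad; octets
     * above 255, as in 500.500.500.500, make it return 0. */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;

    /* Reject the all-zero and limited-broadcast addresses as well. */
    if (addr.s_addr == htonl(INADDR_ANY) ||
        addr.s_addr == htonl(INADDR_BROADCAST))
        return 0;

    return 1;
}
```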
v2
Following Veaceslav's suggestion, simplify the code.
v3
Following Veaceslav's suggestion, add a broadcast check and wrap it
in a macro definition.
v4
Fix the formatting problem that David pointed out.
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to division by zero as miimon can
be set to 0 while either updelay/downdelay are being set and thus miss the
zero check in the beginning, the zero div happens because updelay/downdelay
are stored as new_value / bond->params.miimon. Use rtnl to synchronize with
miimon setting.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the ARP monitoring is not supported with 802.3ad, and it's
prohibited to use it via the module params.
However we still can set it afterwards via sysfs, cause we only check for
*LB modes there.
To fix this - add a check for 802.3ad mode in bonding_store_arp_interval.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking updates from David Miller:
1) The addition of nftables. No longer will we need protocol aware
firewall filtering modules, it can all live in userspace.
At the core of nftables is a, for lack of a better term, virtual
machine that executes byte codes to inspect packet or metadata
(arriving interface index, etc.) and make verdict decisions.
Besides support for loading packet contents and comparing them, the
interpreter supports lookups in various datastructures as
fundamental operations. For example sets are supported, and
therefore one could create a set of whitelist IP address entries
which have ACCEPT verdicts attached to them, and use the appropriate
byte codes to do such lookups.
Since the interpreted code is composed in userspace, userspace can
do things like optimize things before giving it to the kernel.
Another major improvement is the capability of atomically updating
portions of the ruleset. In the existing netfilter implementation,
one has to update the entire rule set in order to make a change and
this is very expensive.
Userspace tools exist to create nftables rules using existing
netfilter rule sets, but both kernel implementations will need to
co-exist for quite some time as we transition from the old to the
new stuff.
Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
worked so hard on this.
2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
to our pseudo-random number generator, mostly used for things like
UDP port randomization and netfilter, amongst other things.
In particular the taus88 generator is updated to taus113, and test
cases are added.
3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
and Yang Yingliang.
4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
Sujir.
5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
Neal Cardwell, and Yuchung Cheng.
6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
control message data, much like other socket option attributes.
From Francesco Fusco.
7) Allow applications to specify a cap on the rate computed
automatically by the kernel for pacing flows, via a new
SO_MAX_PACING_RATE socket option. From Eric Dumazet.
8) Make the initial autotuned send buffer sizing in TCP more closely
reflect actual needs, from Eric Dumazet.
9) Currently early socket demux only happens for TCP sockets, but we
can do it for connected UDP sockets too. Implementation from Shawn
Bohrer.
10) Refactor inet socket demux with the goal of improving hash demux
performance for listening sockets. With the main goals being able
to use RCU lookups on even request sockets, and eliminating the
listening lock contention. From Eric Dumazet.
11) The bonding layer has many demuxes in its fast path, and an RCU
conversion was started back in 3.11, several changes here extend the
RCU usage to even more locations. From Ding Tianhong and Wang
Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
Falico.
12) Allow stackability of segmentation offloads to, in particular, allow
segmentation offloading over tunnels. From Eric Dumazet.
13) Significantly improve the handling of secret keys we input into the
various hash functions in the inet hashtables, TCP fast open, as
well as syncookies. From Hannes Frederic Sowa. The key fundamental
operation is "net_get_random_once()" which uses static keys.
Hannes even extended this to ipv4/ipv6 fragmentation handling and
our generic flow dissector.
14) The generic driver layer takes care now to set the driver data to
NULL on device removal, so it's no longer necessary for drivers to
explicitly set it to NULL any more. Many drivers have been cleaned
up in this way, from Jingoo Han.
15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.
16) Improve CRC32 interfaces and generic SKB checksum iterators so that
SCTP's checksumming can more cleanly be handled. Also from Daniel
Borkmann.
17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
using the interface MTU value. This helps avoid PMTU attacks,
particularly on DNS servers. From Hannes Frederic Sowa.
18) Use generic XPS for transmit queue steering rather than internal
(re-)implementation in virtio-net. From Jason Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
random32: add test cases for taus113 implementation
random32: upgrade taus88 generator to taus113 from errata paper
random32: move rnd_state to linux/random.h
random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
random32: add periodic reseeding
random32: fix off-by-one in seeding requirement
PHY: Add RTL8201CP phy_driver to realtek
xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
macmace: add missing platform_set_drvdata() in mace_probe()
ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
ixgbe: add warning when max_vfs is out of range.
igb: Update link modes display in ethtool
netfilter: push reasm skb through instead of original frag skbs
ip6_output: fragment outgoing reassembled skb properly
MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
net_sched: tbf: support of 64bit rates
ixgbe: deleting dfwd stations out of order can cause null ptr deref
ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
...
|
|
This patch aims to extend round-robin mode with a new option called
packets_per_slave which can have the following values and effects:
0 - choose a random slave
1 (default) - standard round-robin, 1 packet per slave
>1 - round-robin when >1 packets have been transmitted per slave
The allowed values are between 0 and 65535.
This patch also fixes the comment style in bond_xmit_roundrobin().
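The three cases can be sketched as a pure function of a running packet counter. Deriving the slave from a counter this way is an assumption made for illustration; the function and parameter names are hypothetical and this is not the code of bond_xmit_roundrobin().

```c
#include <assert.h>
#include <stdlib.h>

/* Picks the slave index for the packet numbered pkt_counter (0-based).
 * packets_per_slave == 0 selects a random slave; a value N >= 1 sends
 * N consecutive packets through a slave before moving to the next. */
unsigned int rr_pick_slave(unsigned long pkt_counter,
                           unsigned int packets_per_slave,
                           unsigned int slave_count)
{
    assert(slave_count > 0);
    assert(packets_per_slave <= 65535);     /* allowed range is 0..65535 */

    if (packets_per_slave == 0)
        return (unsigned int)rand() % slave_count;

    return (unsigned int)((pkt_counter / packets_per_slave) % slave_count);
}
```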
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is an extra semi-colon so bond_get_size() doesn't return the
correct value.
Fixes: ec76aa49855f ('bonding: add Netlink support active_slave option')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 4d961a101e032b4bf223b279b4b35bc77576f5a8, reversing
changes made to a00f6fcc7d0c62a91768d9c4ccba4c7d64fbbce3.
Revert bond locking changes, they cause regressions and Veaceslav Falico
doesn't like how the commit messages were done at all.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bond slave list may change while the monitor is running. The slave list is no longer
protected by bond->lock, only by the rtnl lock, so there are 3 ways to fix this:
1. Add bond_master_upper_dev_link() and bond_upper_dev_unlink() inside bond->lock, but it is unsafe
to call call_netdevice_notifiers() under a write lock.
2. Remove the unused bond->lock from the monitor function and use only the existing rtnl lock.
3. Use rcu_read_lock() to protect it, which converts bond_for_each_slave to
bond_for_each_slave_rcu() and performs better, but the gain does not matter in the slow path.
So I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bond slave list may change while the monitor is running. The slave list is no longer
protected by bond->lock, only by the rtnl lock, so there are 3 ways to fix this:
1. Add bond_master_upper_dev_link() and bond_upper_dev_unlink() inside bond->lock, but it is unsafe
to call call_netdevice_notifiers() under a write lock.
2. Remove the unused bond->lock from the monitor function and use only the existing rtnl lock.
3. Use rcu_read_lock() to protect it, which converts bond_for_each_slave to
bond_for_each_slave_rcu() and performs better, but the gain does not matter in the slow path.
So I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bond slave list may change while the monitor is running. The slave list is no longer
protected by bond->lock, only by the rtnl lock, so there are 3 ways to fix this:
1. Add bond_master_upper_dev_link() and bond_upper_dev_unlink() inside bond->lock, but it is unsafe
to call call_netdevice_notifiers() under a write lock.
2. Remove the unused bond->lock from the monitor function and use only the existing rtnl lock.
3. Use rcu_read_lock() to protect it, which converts bond_for_each_slave to
bond_for_each_slave_rcu() and performs better, but the gain does not matter in the slow path.
So I remove the bond->lock and add the rtnl lock to protect the whole monitor function.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bond slave list may change while the monitor is running. The slave list is no longer
protected by bond->lock, only by the rtnl lock, so there are 3 ways to fix this:
1. Add bond_master_upper_dev_link() and bond_upper_dev_unlink() inside bond->lock, but it is unsafe
to call call_netdevice_notifiers() under a write lock.
2. Remove the unused bond->lock from the monitor function and use only the existing rtnl lock.
3. Use rcu_read_lock() to protect it, which converts bond_for_each_slave to
bond_for_each_slave_rcu() and performs better, but the gain does not matter in the slow path.
So I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bond slave list may change while the monitor is running. The slave list is no longer
protected by bond->lock, only by the rtnl lock, so there are 3 ways to fix this:
1. Add bond_master_upper_dev_link() and bond_upper_dev_unlink() inside bond->lock, but it is unsafe
to call call_netdevice_notifiers() under a write lock.
2. Remove the unused bond->lock from the monitor function and use only the existing rtnl lock.
3. Use rcu_read_lock() to protect it, which converts bond_for_each_slave to
bond_for_each_slave_rcu() and performs better, but the gain does not matter in the slow path.
So I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 991fb3f74c "dev: always advertise rx_flags changes via netlink"
introduced rtnl notification from __dev_set_promiscuity(),
which can be called in atomic context.
Steps to reproduce:
ip tuntap add dev tap1 mode tap
ifconfig tap1 up
tcpdump -nei tap1 &
ip tuntap del dev tap1 mode tap
[ 271.627994] device tap1 left promiscuous mode
[ 271.639897] BUG: sleeping function called from invalid context at mm/slub.c:940
[ 271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip
[ 271.677525] INFO: lockdep is turned off.
[ 271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G W 3.12.0-rc3+ #73
[ 271.703996] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 271.731254] ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5 ffff88082fa0f428
[ 271.760261] ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1 ffffffff81110ff8
[ 271.790683] 0000000000000010 00000000000000d0 00000000000000d0 ffff8807f0d57af8
[ 271.822332] Call Trace:
[ 271.838234] [<ffffffff817544e5>] dump_stack+0x55/0x76
[ 271.854446] [<ffffffff8108bad1>] __might_sleep+0x181/0x240
[ 271.870836] [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0
[ 271.887076] [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0
[ 271.903368] [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0
[ 271.919716] [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0
[ 271.936088] [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0
[ 271.952504] [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0
[ 271.968902] [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100
[ 271.985302] [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0
[ 272.001642] [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0
[ 272.017917] [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
[ 272.033961] [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50
[ 272.049855] [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0
[ 272.065494] [<ffffffff81732052>] packet_notifier+0x1b2/0x380
[ 272.080915] [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
[ 272.096009] [<ffffffff81761c66>] notifier_call_chain+0x66/0x150
[ 272.110803] [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10
[ 272.125468] [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20
[ 272.139984] [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70
[ 272.154523] [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20
[ 272.168552] [<ffffffff816224c5>] rollback_registered_many+0x145/0x240
[ 272.182263] [<ffffffff81622641>] rollback_registered+0x31/0x40
[ 272.195369] [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90
[ 272.208230] [<ffffffff81547ca0>] __tun_detach+0x140/0x340
[ 272.220686] [<ffffffff81547ed6>] tun_chr_close+0x36/0x60
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As Jiri noted, currently we first do all bonding-specific initialization
(specifically - bond_select_active_slave(bond)) before we actually attach
the slave (so that it becomes visible through bond_for_each_slave() and
friends). This might result in bond_select_active_slave() not seeing the
first/new slave and, thus, not actually selecting an active slave.
Fix this by moving all the bond-related init part after we've actually
completely initialized and linked (via bond_master_upper_dev_link()) the
new slave.
Also, remove bond_attach_slave() and bond_detach_slave(); it's useless to have functions
that only ++/-- one int.
After this we have all the initialization of the new slave *before*
linking, and all the stuff that needs to be done on bonding *after* it. This
also has a bonus effect: we can remove the locking on the new slave init
completely, and only use it for bond_select_active_slave().
Reported-by: Jiri Pirko <jiri@resnulli.us>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
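
The ordering problem described in this message can be sketched in userspace C as follows. This is a toy model under stated assumptions, not the bonding driver: toy_bond, toy_link_slave(), toy_select_active() and the two toy_enslave_*() helpers are hypothetical names. It only shows why a selection that iterates the linked slaves cannot pick a slave that has not been linked yet.

```c
#include <stddef.h>

#define TOY_MAX_SLAVES 4

/* Hypothetical stand-ins for illustration; not kernel structures. */
struct toy_slave {
    int link_up;
};

struct toy_bond {
    struct toy_slave *slaves[TOY_MAX_SLAVES];
    int slave_cnt;
    struct toy_slave *active;
};

/* Make the slave visible to anything that iterates the slave list. */
static void toy_link_slave(struct toy_bond *b, struct toy_slave *s)
{
    if (b->slave_cnt < TOY_MAX_SLAVES)
        b->slaves[b->slave_cnt++] = s;
}

/* Pick the first linked slave whose link is up, or NULL if none. */
static void toy_select_active(struct toy_bond *b)
{
    b->active = NULL;
    for (int i = 0; i < b->slave_cnt; i++) {
        if (b->slaves[i]->link_up) {
            b->active = b->slaves[i];
            return;
        }
    }
}

/* Old order: select first, link afterwards. The new slave is invisible
 * during selection, so a first slave is never chosen. */
static struct toy_slave *toy_enslave_select_first(struct toy_bond *b,
                                                  struct toy_slave *s)
{
    toy_select_active(b);
    toy_link_slave(b, s);
    return b->active;
}

/* Fixed order: link first, then do the bond-level selection. */
static struct toy_slave *toy_enslave_link_first(struct toy_bond *b,
                                                struct toy_slave *s)
{
    toy_link_slave(b, s);
    toy_select_active(b);
    return b->active;
}
```

With an empty bond and one slave whose link is up, toy_enslave_select_first() leaves the bond without an active slave, while toy_enslave_link_first() selects the new slave.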
|
|
It can be called from the module init function, so it cannot
be in the exit section.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|