summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-02-16igb: increase timeout for ethtool offline self-testStefan Assmann
On several machines with i350 adapters the ethtool offline self-test sometimes fails. This happens because link auto negotiation may take longer than the timeout of 4 seconds. Increasing the timeout by 1 seconds resolves the issue. Output from a failing i350 offline self-test: while [ 1 ]; do ethtool -t eth2 offline; done The test result is PASS The test extra info: Register test (offline) 0 Eeprom test (offline) 0 Interrupt test (offline) 0 Loopback test (offline) 0 Link test (on/offline) 0 The test result is FAIL The test extra info: Register test (offline) 0 Eeprom test (offline) 0 Interrupt test (offline) 0 Loopback test (offline) 0 Link test (on/offline) 1 The test result is PASS The test extra info: Register test (offline) 0 Eeprom test (offline) 0 Interrupt test (offline) 0 Loopback test (offline) 0 Link test (on/offline) 0 Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-16igbvf: Make next_to_watch a pointer and adjust memory barriers to avoid racesAlexander Duyck
This change is meant to address several race issues that become possible because next_to_watch could possibly be set to a value that shows that the descriptor is done when it is not. In order to correct that we instead make next_to_watch a pointer that is set to NULL during cleanup, and set to the eop_desc after the descriptor rings have been written. To enforce proper ordering the next_to_watch pointer is not set until after a wmb writing the values to the last descriptor in a transmit. In order to guarantee that the descriptor is not read until after the eop_desc we use the read_barrier_depends which is only really necessary on the alpha architecture. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-16e1000e: display a warning message when SmartSpeed worksKoki Sanagi
Current e1000e driver doesn't tell nothing when Link Speed is downgraded due to SmartSpeed. As a result, users suspect that there is something wrong with NIC. If the cause of it is SmartSpeed, there is no means to replace NIC. This patch make e1000e notify users that SmartSpeed worked. Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-16e1000: fix whitespace issues and multi-line commentsJeff Kirsher
Fixes whitespace issues, such as lines exceeding 80 chars, needless blank lines and the use of spaces where tabs are needed. In addition, fix multi-line comments to align with the networking standard. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2013-02-15net: use skb_reset_mac_len() in dev_gro_receive()Eric Dumazet
We no longer need to use mac_len, lets cleanup things. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15ieee802154: at86rf230: Remove empty suspend/resume callbacksLars-Peter Clausen
There is no need to implement empty suspend/resume callbacks if there is nothing to do during suspend/resume. The drivers will behave the same with no callbacks or empty callbacks during suspend/resume. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to igb and ixgbe. Most of the changes are against igb, except for one patch against ixgbe. There are 3 igb fixes from Carolyn which were reported by Dan Carpenter which resolve issues found in the get_i2c_client(). Alex does some cleanup of the igb driver to match similar functionality in ixgbe on transmit. Alex also makes it so that we can enable the use of build_skb for cases where jumbo frames are disabled. The advantage to this is that we do not have to perform a memcpy to populate the header and as a result we see a significant performance improvement. Akeem provides 4 patches to initialize function pointers and do a re-factoring of the function pointers in igb_get_variants() to assist with driver debugging. The ixgbe patch comes from Emil to reshuffle the switch/case structure of the flag assignment to allow for the flags to be set for each MAC type separately. This is needed for new hardware that does not have feature parity with older hardware. v2: updated patches 4 & 5 based on feedback from Ben Hutchings and Eric Dumazet ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15v4 GRE: Add TCP segmentation offload for GREPravin B Shelar
Following patch adds GRE protocol offload handler so that skb_gso_segment() can segment GRE packets. SKB GSO CB is added to keep track of total header length so that skb_segment can push entire header. e.g. in case of GRE, skb_segment need to push inner and outer headers to every segment. New NETIF_F_GRE_GSO feature is added for devices which support HW GRE TSO offload. Currently none of devices support it therefore GRE GSO always fall backs to software GSO. [ Compute pkt_len before ip_local_out() invocation. -DaveM ] Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15net: factor out skb_mac_gso_segment() from skb_gso_segment()Pravin B Shelar
This function will be used in next GRE_GSO patch. This patch does not change any functionality. It only exports skb_mac_gso_segment() function. [ Use skb_reset_mac_len() -DaveM ] Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15net: Add skb_unclone() helper function.Pravin B Shelar
This function will be used in next GRE_GSO patch. This patch does not change any functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com>
2013-02-15tg3: Update version to 3.130Michael Chan
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15tg3: Set initial carrier state to off.Michael Chan
Before the device is opened, the carrier state should be off. It will not race with the link interrupt if we set it before calling register_netdev(). Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15tg3: Fix 5762 NVRAM sizingMichael Chan
Don't set the default size to 128K if it is 5762. Instead, rely on the size we obtain from NVRAM location 0xf0. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15tg3: Expand EEE support for all 5717 B0Michael Chan
This chip supports Energy Efficient Ethernet. The existing code only supports a smaller set of devices with 5718 PCI ID. Expand support for all devices with the same 5717 B0 chip ID. Signed-off-by: Michael Chan <mchan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15tg3: Add 57766 device support.Matt Carlson
The patch also adds a couple of fixes - For the 57766 and non Ax versions of 57765, bootcode needs to setup the PCIE Fast Training Sequence (FTS) value to prevent transmit hangs. Unfortunately, it does not have enough room in the selfboot case (i.e. devices with no NVRAM). The driver needs to implement this. - For performance reasons, the 2k DMA engine mode on the 57766 should be enabled and dma size limited to 2k for standard sized packets. Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-15ixgbe: refactor initialization of feature flagsEmil Tantilov
This patch reshuffles the switch/case structure of the flag assignment to allow for the flags to be set for each MAC type separately. This is needed for new HW that does not have feature parity with older HW. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Jack Morgan <jack.morgan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Refractoring function pointers in igb_get_invariants functionAkeem G. Abodunrin
This patch simplifies igb_get_invariants function by moving all implemented function pointers in this function to individual separate functions, based on their functionalities, this would make debugging much easier. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Intialize MAC function pointersAkeem G. Abodunrin
This patch initializes MAC function pointers for device configuration. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Initialize NVM function pointersAkeem G. Abodunrin
This patch initializes NVM function pointers for device configuration. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Initialize PHY function pointersAkeem G. Abodunrin
This patch initializes PHY function pointers for device configuration. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Update igb to use a path similar to ixgbe to determine when to stop TxAlexander Duyck
After reviewing the igb and ixgbe code I realized there are a few issues in how the code is structured. Specifically we are not checking the size of the buffers being used in transmits and we are not using the same value to determine when to stop or start a Tx queue. As such the code is prone to be buggy. This patch makes it so that we have one value DESC_NEEDED that we will use for starting and stopping the queue. In addition we will check the size of buffers being used when setting up a transmit so as to avoid a possible buffer overrun if we were to receive a frame with a block of data larger than 32K in skb->data. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Refix sparse warning in igb_get_i2c_clientCarolyn Wyborny
This patch correctly resolves the sparse warnings found with this function. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Fix for improper allocation flag in igb_get_i2c_clientCarolyn Wyborny
This patch fixes the allocation function in igb_get_i2c_client to use GFP_ATOMIC instead of GFP_KERNEL because we have a spinlock. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Fix for improper exit in igb_get_i2c_clientCarolyn Wyborny
This patch fixes an issue where we check for irq's disabled then exit after explicitly disabling them with spin_lock_irqsave. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <arron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-15igb: Support using build_skb in the case that jumbo frames are disabledAlexander Duyck
This change makes it so that we can enable the use of build_skb for cases where jumbo frames are disabled. The advantage to this is that we do not have to perform a memcpy to populate the header and as a result we see a significant performance improvement. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-02-14net: Don't write to current task flags on every packet received.David S. Miller
Even for non-pfmalloc SKBs, __netif_receive_skb() will do a tsk_restore_flags() on current unconditionally. Make __netif_receive_skb() a shim around the existing code, renamed to __netif_receive_skb_core(). Let __netif_receive_skb() wrap the __netif_receive_skb_core() call with the task flag modifications, if necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: Fix and cleanup Rx FCB indicationClaudiu Manoil
This fixes a less obvious error on one hand, and prevents futher similar errors by disambiguating and optimizing RxFCB indication, on the other hand. The error consists in NETIF_F_HW_VLAN_TX flag being used as an indication of Rx FCB insertion. This happened as soon gfar_uses_fcb(), which despite its name indicates Rx FCB insertion, started incorporating is_vlan_on(). is_vlan_on(), on the other hand, is also a misleading construct because we need to differentiate b/w hw VLAN extraction/VLEX (marked by VLAN_RX flag) and hw VLAN insertion/VLINS (VLAN_TX flag), which are different mechanisms using different types of FCBs. The hw spec for the RxFCB feature is as follows: In the case of RxBD rings, FCBs (Frame Control Block) are inserted by the eTSEC whenever RCTRL[PRSDEP] is set to a non-zero value. Only one FCB is inserted per frame (in the buffer pointed to by the RxBD with bit F set). TOE acceleration for receive is enabled for all rx frames in this case. This patch introduces priv->uses_rxfcb field to quickly signal RxFCB insertion in accordance with the specification above. The dependency on FSL_GIANFAR_DEV_HAS_TIMER was also eliminated as another source of confusion. The actual dependency is to priv->hwts_rx_en. Upon changing priv->hwts_rx_en via IOCTL, the gfar device is being restarted and on init_mac() the priv->hwts_rx_en flag determines RxFCB insertion, and rctrl is programmed accordingly. The patch takes care of this case too. Though maybe not as self documenting as the inlining version uses_fcb(), priv->uses_rxfcb has the main purpose to quickly signal, on the hot path, that the incoming frame has a *Rx* FCB block inserted which needs to be pulled out before passing the skb to the stack. This is a performance critical operation, it needs to happen fast, that's why uses_rxfcb is placed in the first cacheline of gfar_private. This is also why a cached rctrl cannot be used instead: 1) because we don't have 32 bits available in the first cacheline of gfar_priv (but only 16); 2) bit operations are expensive on the hot path. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: Remove wrong buffer size conditioning to VLAN h/w offloadClaudiu Manoil
The controller's ref manual states clearly that when the hw Rx vlan offload feature is enabled, meaning that the VLEX bit from RCTRL is correctly enabled, then the hw performs automatic VLAN tag extraction and deletion from the ethernet frames. So there's no point in trying to increase the rx buff size when rxvlan is on, as the frame is actually smaller. And the Tx vlan hw accel feature (VLINS) has nothing to do with rx buff size computation. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: gfar_process_frame returns voidClaudiu Manoil
No return code is expected from gfar_process_frame(), hence change it to return void. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: GRO_DROP is unlikelyClaudiu Manoil
The change is significant since it affects the rx hot path. Paul observed and documented the effects at asm level, see below: "It turns out that it does make a difference, since gfar_process_frame gets inlined, and so the increment code gets moved out of line (I have marked the if statment with * and the increment code within "-----"): ------------------------- as is currently ------------------ 4d14: 80 61 00 18 lwz r3,24(r1) 4d18: 7f c4 f3 78 mr r4,r30 4d1c: 48 00 00 01 bl 4d1c <gfar_clean_rx_ring+0x10c> * 4d20: 2f 83 00 04 cmpwi cr7,r3,4 4d24: 40 9e 00 1c bne- cr7,4d40 <gfar_clean_rx_ring+0x130> ---------------------------- 4d28: 81 3c 01 f8 lwz r9,504(r28) 4d2c: 81 5c 01 fc lwz r10,508(r28) 4d30: 31 4a 00 01 addic r10,r10,1 4d34: 7d 29 01 94 addze r9,r9 4d38: 91 3c 01 f8 stw r9,504(r28) 4d3c: 91 5c 01 fc stw r10,508(r28) ---------------------------- 4d40: a0 1f 00 24 lhz r0,36(r31) 4d44: 81 3f 00 00 lwz r9,0(r31) 4d48: 7f a4 eb 78 mr r4,r29 4d4c: 7f e3 fb 78 mr r3,r31 -------------------------- unlikely ------------------------ 4d14: 80 61 00 18 lwz r3,24(r1) 4d18: 7f c4 f3 78 mr r4,r30 4d1c: 48 00 00 01 bl 4d1c <gfar_clean_rx_ring+0x10c> * 4d20: 2f 83 00 04 cmpwi cr7,r3,4 4d24: 41 9e 03 94 beq- cr7,50b8 <gfar_clean_rx_ring+0x4a8> 4d28: a0 1f 00 24 lhz r0,36(r31) 4d2c: 81 3f 00 00 lwz r9,0(r31) 4d30: 7f a4 eb 78 mr r4,r29 4d34: 7f e3 fb 78 mr r3,r31 [...] 50b8: 81 3c 01 f8 lwz r9,504(r28) 50bc: 81 5c 01 fc lwz r10,508(r28) 50c0: 31 4a 00 01 addic r10,r10,1 50c4: 7d 29 01 94 addze r9,r9 50c8: 91 3c 01 f8 stw r9,504(r28) 50cc: 91 5c 01 fc stw r10,508(r28) 50d0: 4b ff fc 58 b 4d28 <gfar_clean_rx_ring+0x118> So, the increment does actually get moved ~1k away." Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: Cleanup and optimize struct gfar_privateClaudiu Manoil
Group run-time critical fields within the 1st cacheline (32B) followed by the tx|rx_queue reference arrays and the interrupt group instances (gfargrp), all cacheline aligned. This has several benefits. Firstly comes the performance benefit by having the members required by the driver's hot path re-grouped in the structure's first cache lines, whereas the unimportant members were pushed towards the end of the struct. Another benefit comes from eliminating a 24 byte memory hole that was rendering gfar_priv's 2nd cacheline useless. The default gcc layout of gfar_private leaves an implicit 24 byte hole after the errata (enum) member. This patch fixes it. The uchar bitfields were pushed towards the end of the struct as these are not run-time performance critical (used for init time operations). Because there is no other 2 byte member around to couple the uchar bitfields memeber with, we will have an addititnal 2 byte hole after the bitfields. This is unsignificant however, and it doesn't influence gfar_priv's size, because the whole structure is padded to be a 32B multiple. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: Add device ref (dev) in gfar_privateClaudiu Manoil
Use device pointer (dev) to simplify the code and to avoid double indirections, especially on the hot path. Basically, instead of accessing priv to get the ofdev reference and then accessing the ofdev structure to dereference the needed dev pointer, we will get the dev pointer directly from priv. The dev pointer is required on the hot path, see gfar_new_rxbdp or gfar_clean_rx_ring (or xmit), and this patch makes it available directly from priv's 1st cacheline. This change is reflected at asm level too, taking (the hot) gfar_new_rxbdp(): initial version - 18c0: 7c 7e 1b 78 mr r30,r3 18d0: 81 69 04 3c lwz r11,1084(r9) 18d8: 34 6b 00 10 addic. r3,r11,16 18dc: 41 82 00 08 beq- 18e4 patched version - 18d0: 80 69 04 38 lwz r3,1080(r9) 18d8: 2f 83 00 00 cmpwi cr7,r3,0 18dc: 41 9e 00 08 beq- cr7,18e4 Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14gianfar: Remove unused device_node ref in gfar_privateClaudiu Manoil
Remove unused device node pointer. Remove duplicated SET_NETDEV_DEV(). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Remove a duplicated call to skb_orphan() in pf_key, from Cong Wang. 2) Prepare xfrm and pf_key for algorithms without pf_key support, from Jussi Kivilinna. 3) Fix an unbalanced lock in xfrm_output_one(), from Li RongQing. 4) Add an IPsec state resolution packet queue to handle packets that are send before the states are resolved. 5) xfrm4_policy_fini() is unused since 2.6.11, time to remove it. From Michal Kubecek. 6) The xfrm gc threshold was configurable just in the initial namespace, make it configurable in all namespaces. From Michal Kubecek. 7) We currently can not insert policies with mark and mask such that some flows would be matched from both policies. Allow this if the priorities of these policies are different, the one with the higher priority is used in this case. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: make ifla_br_policy and br_af_ops staticCong Wang
They are only used within this file. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bgmac: add read of interrupt mask after disabling interruptsNathan Hintz
The specs prescribe an immediate read of the interrupt mask after disabling interrupts. This patch updates the driver to match the specs. Signed-off-by: Nathan Hintz <nlhintz@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: use __u16 in if_bridge.hCong Wang
We should use "__u16" instead of "u16" in the user-space visable header. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14Merge branch 'bridge_vlan'David S. Miller
Vlad Yasevich says: ==================== VLAN filtering/VLAN aware bridge Changes since v10 * Updated implemenation of ndo_fdb_del in emulex and qlogic drivers. Changes since v9: * series re-ordering so make functionality more distinct. Basic vlan filtering is patches 1-4. Support for PVID/untagged vlans is patches 5 and 6. VLAN support for FDB/MDB is patches 7-11. Patch 12 is still additional egress policy. * Slight simplification to code that extracts the VID from skb. Since we now depend on the vlan module, at the time of input skb_tci is guaranteed to be set if the packet had 8021q header. We can simply refere to it. * Changed the opaque 'parent' pointer from prior patches to a union so we can be much more explicit in our assignments. * Lots of additional testing with STP turned on. No issues were observed. Changes since v8: * Unified vlans_to_* calls into a single interface * Fixed the rest of the issues report by Michal Miroslaw * Fixed a bug where fdb entries were not created for all added vlans. Changes since v7: * Rebases on the latest net-next and removed the vlan wrapper patch from the series. * Fixed a crash in br_fdb_add/br_fdb_delete. Changes since v6: * VLANs are now stored in a VLAN bitmap per port. This allows for O(1) lookup at ingress and egress. We simply check to see if the bit associated with the vlan id is set in the map. The drawback to this approach is that it wastes some space when there is only a small number of VLANs. * In addition to the build time configuration option, VLAN filtering also has a configuration paramter in sysfs. By default the filtering is turned off and all traffic is permitted. When the filtring is turned on, we do strict matching to the filter configured. Thus, if there is no configuration, all packets are rejected. This was done to make the behavior more streight forward. Without this (and if egress policy patch is rejected), the decision for how to forward untagged traffic that was not filtered at ingress is almost impossible to make. It would not be right to deliver to every port that has PVID set as, each port may have a different PVID. * Separate egress policy bitmap patch has been isolated and is provided last in the series. This has been a more contentious piece of functionality and I wanted to isolate it so that it could easily be dropped and not block the whole series. Changes since v5: - Pulled VLAN filtering into its own file and made it a configuration options. - Made new vlan filtering option dependent on VLAN_8021Q. - Got rid of HW filter inlines and moved then vlan_core.c. (All of the above suggested by Stephen Hemminger) Changes since v4: - Pull per-port vlan data into its own structures and give it to the bridge device thus making bridge device behave like a regular port for vlan configuration. - Add a per-vlan 'untagged' bitmap that determins egress policy. If a port is part of this bitmap, traffic egresses untagged. - PVID is now used for ingress policy only. Incomming frames without VLAN tag are assigned to the PVID vlan. Egress is determined via bitmap memberships. - Allow for incremental config of a vlan. Now, PVID and untagged memberships may be set on existing vlans. They however can NOT be cleared separately. - VLAN deletion is now done via RTM_DELLINK command for PF_BRIDGE family. This cleans up the netlink interface. Changes since v3: - Re-integrated compiler problems that got left out last time. Appologies. - checkpatches.pl errors fixed Changes since v2: - Added inline functiosn to manimulate vlan hw filters and re-use in 8021q and bridge code. - Use rtnl_dereference (Michael Tsirkin) - Remove synchronize_net() call (Eric Dumazet) - Fix NULL ptr deref bug I introduced in br_ifinfo_notify. Changes since v1: - Fixed some forwarding bugs. - Add vlan to local fdb entries. New local entries are created per vlan to facilite correct forwarding to bridge interface. - Allow configuration of vlans directly on the bridge master device in addition to ports. Changes since rfc v2: - Per-port vlan bitmap is gone and is replaced with a vlan list. - Added bridge vlan list, which is referenced by each port. Entries in the birdge vlan list have port bitmap that shows which port are parts of which vlan. - Netlink API changes. - Dropped sysfs support for now. If people think this is really usefull, can add it back. - Support for native/untagged vlans. Changes since rfc v1: - Comments addressed regarding formatting and RCU usage - iocts have been removed and changed over the netlink interface. - Added support of user added ndb entries. - changed sysfs interface to export a bitmap. Also added a write interface. I am not sure how much I like it, but it made my testing easier/faster. I might change the write interface to take text instead of binary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Separate egress policy bitmapVlad Yasevich
Add an ability to configure a separate "untagged" egress policy to the VLAN information of the bridge. This superseeds PVID policy and makes PVID ingress-only. The policy is configured with a new flag and is represented as a port bitmap per vlan. Egress frames with a VLAN id in "untagged" policy bitmap would egress the port without VLAN header. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add vlan support for local fdb entriesVlad Yasevich
When VLAN is added to the port, a local fdb entry for that port (the entry with the mac address of the port) is added for that VLAN. This way we can correctly determine if the traffic is for the bridge itself. If the address of the port changes, we try to change all the local fdb entries we have for that port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add vlan support to static neighborsVlad Yasevich
When a user adds bridge neighbors, allow him to specify VLAN id. If the VLAN id is not specified, the neighbor will be added for VLANs currently in the ports filter list. If no VLANs are configured on the port, we use vlan 0 and only add 1 entry. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add vlan id to multicast groupsVlad Yasevich
Add vlan_id to multicasts groups so that we know which vlan each group belongs to and can correctly forward to appropriate vlan. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add vlan to unicast fdb entriesVlad Yasevich
This patch adds vlan to unicast fdb entries that are created for learned addresses (not the manually configured ones). It adds vlan id into the hash mix and uses vlan as an addditional parameter for an entry match. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add the ability to configure pvidVlad Yasevich
A user may designate a certain vlan as PVID. This means that any ingress frame that does not contain a vlan tag is assigned to this vlan and any forwarding decisions are made with this vlan in mind. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Implement vlan ingress/egress policy with PVID.Vlad Yasevich
At ingress, any untagged traffic is assigned to the PVID. Any tagged traffic is filtered according to membership bitmap. At egress, if the vlan matches the PVID, the frame is sent untagged. Otherwise the frame is sent tagged. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Dump vlan information from a bridge portVlad Yasevich
Using the RTM_GETLINK dump the vlan filter list of a given bridge port. The information depends on setting the filter flag similar to how nic VF info is dumped. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add netlink interface to configure vlans on bridge portsVlad Yasevich
Add a netlink interface to add and remove vlan configuration on bridge port. The interface uses the RTM_SETLINK message and encodes the vlan configuration inside the IFLA_AF_SPEC. It is possble to include multiple vlans to either add or remove in a single message. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Verify that a vlan is allowed to egress on given portVlad Yasevich
When bridge forwards a frame, make sure that a frame is allowed to egress on that port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Validate that vlan is permitted on ingressVlad Yasevich
When a frame arrives on a port or transmitted by the bridge, if we have VLANs configured, validate that a given VLAN is allowed to enter the bridge. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14bridge: Add vlan filtering infrastructureVlad Yasevich
Adds an optional infrustructure component to bridge that would allow native vlan filtering in the bridge. Each bridge port (as well as the bridge device) now get a VLAN bitmap. Each bit in the bitmap is associated with a vlan id. This way if the bit corresponding to the vid is set in the bitmap that the packet with vid is allowed to enter and exit the port. Write access the bitmap is protected by RTNL and read access protected by RCU. Vlan functionality is disabled by default. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>