summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2011-06-16netfilter: ipset: hash:net,iface type introducedJozsef Kadlecsik
The hash:net,iface type makes possible to store network address and interface name pairs in a set. It's mostly suitable for egress and ingress filtering. Examples: # ipset create test hash:net,iface # ipset add test 192.168.0.0/16,eth0 # ipset add test 192.168.0.0/24,eth1 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: use the stored first cidr value instead of '1'Jozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: fix return code for destroy when sets are in useJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: add xt_action_param to the variant level kadt functions, ↵Jozsef Kadlecsik
ipset API change With the change the sets can use any parameter available for the match and target extensions, like input/output interface. It's required for the hash:net,iface set type. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: use unified from/to address masking and check the usageJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: take into account cidr value for the from address when ↵Jozsef Kadlecsik
creating the set When creating a set from a range expressed as a network like 10.1.1.172/29, the from address was taken as the IP address part and not masked with the netmask from the cidr. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: support range for IPv4 at adding/deleting elements for ↵Jozsef Kadlecsik
hash:*net* types The range internally is converted to the network(s) equal to the range. Example: # ipset new test hash:net # ipset add test 10.2.0.0-10.2.1.12 # ipset list test Name: test Type: hash:net Header: family inet hashsize 1024 maxelem 65536 Size in memory: 16888 References: 0 Members: 10.2.1.12 10.2.1.0/29 10.2.0.0/24 10.2.1.8/30 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: set type support with multiple revisions addedJozsef Kadlecsik
A set type may have multiple revisions, for example when syntax is extended. Support continuous revision ranges in set types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: fix adding ranges to hash typesJozsef Kadlecsik
When ranges are added to hash types, the elements may trigger rehashing the set. However, the last successfully added element was not kept track so the adding started again with the first element after the rehashing. Bug reported by Mr Dash Four. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: support listing setnames and headers tooJozsef Kadlecsik
Current listing makes possible to list sets with full content only. The patch adds support partial listings, i.e. listing just the existing setnames or listing set headers, without set members. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: options and flags support added to the kernel APIJozsef Kadlecsik
The support makes possible to specify the timeout value for the SET target and a flag to reset the timeout for already existing entries. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: whitespace fixes: some space before tab slipped inJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: timeout can be modified for already added elementsJozsef Kadlecsik
When an element to a set with timeout added, one can change the timeout by "readding" the element with the "-exist" flag. That means the timeout value is reset to the specified one (or to the default from the set specification if the "timeout n" option is not used). Example ipset add foo 1.2.3.4 timeout 10 ipset add foo 1.2.3.4 timeout 600 -exist Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16Merge branch 'master' of ↵Patrick McHardy
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next-2.6
2011-06-16Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy
2011-06-14IPVS: remove unused init and cleanup functions.Hans Schillstrom
After restructuring, there is some unused or empty functions left to be removed. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14IPVS: labels at pos 0Hans Schillstrom
Put goto labels at the beginig of row acording to coding style example. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13l2tp: fix l2tp_ip_sendmsg() route handlingEric Dumazet
l2tp_ip_sendmsg() in non connected mode incorrectly calls sk_setup_caps(). Subsequent send() calls send data to wrong destination. We can also avoid changing dst refcount in connected mode, using appropriate rcu locking. Once output route lookups can also be done under rcu, sendto() calls wont change dst refcounts too. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13net: export time stamp utility function for Ethernet MAC driversRichard Cochran
The network stack provides the function, skb_clone_tx_timestamp(). Ethernet MAC drivers can call this via the transmit time stamping hook, skb_tx_timestamp(). This commit exports the clone function so that drivers using it can be compiled as modules. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13IPVS: rename of netns init and cleanup functions.Hans Schillstrom
Make it more clear what the functions does, on request by Julian. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13ipvs: support more FTP PASV responsesJulian Anastasov
Change the parsing of FTP commands and responses to support skip character. It allows to detect variations in the 227 PASV response. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-11snmp: reduce percpu needs by 50%Eric Dumazet
SNMP mibs use two percpu arrays, one used in BH context, another in USER context. With increasing number of cpus in machines, and fact that ipv6 uses per network device ipstats_mib, this is consuming a lot of memory if many network devices are registered. commit be281e554e2a (ipv6: reduce per device ICMP mib sizes) shrinked percpu needs for ipv6, but we can reduce memory use a bit more. With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer need this BH/USER separation since we can update counters in a single x86 instruction, regardless of the BH/USER context. Other arches than x86 might need to disable irq in their irqsafe_cpu_inc() implementation : If this happens to be a problem, we can make SNMP_ARRAY_SZ arch dependent, but a previous poll ( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not raise strong opposition. Only on 32bit arches, we need to disable BH for 64bit counters updates done from USER context (currently used for IP MIB) This also reduces vmlinux size : 1) x86_64 build $ size vmlinux.before vmlinux.after text data bss dec hex filename 7853650 1293772 1896448 11043870 a8841e vmlinux.before 7850578 1293772 1896448 11040798 a8781e vmlinux.after 2) i386 build $ size vmlinux.before vmlinux.afterpatch text data bss dec hex filename 6039335 635076 3670016 10344427 9dd7eb vmlinux.before 6037342 635076 3670016 10342434 9dd022 vmlinux.afterpatch Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andi Kleen <andi@firstfloor.org> CC: Ingo Molnar <mingo@elte.hu> CC: Tejun Heo <tj@kernel.org> CC: Christoph Lameter <cl@linux-foundation.org> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org CC: linux-arch@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALIDJason Wang
There's no need for the guest to validate the checksum if it have been validated by host nics. So this patch introduces a new flag - VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum examing in guest. The backend (tap/macvtap) may set this flag when met skbs with CHECKSUM_UNNECESSARY to save cpu utilization. No feature negotiation is needed as old driver just ignore this flag. Iperf shows 12%-30% performance improvement for UDP traffic. For TCP, when gro is on no difference as it produces skb with partial checksum. But when gro is disabled, 20% or even higher improvement could be measured by netperf. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11sctp: kzalloc() error handling on deleting last addressMichio Honda
Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp> Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-10rtnetlink: Compute and store minimum ifinfo dump sizeGreg Rose
The message size allocated for rtnl ifinfo dumps was limited to a single page. This is not enough for additional interface info available with devices that support SR-IOV and caused a bug in which VF info would not be displayed if more than approximately 40 VFs were created per interface. Implement a new function pointer for the rtnl_register service that will calculate the amount of data required for the ifinfo dump and allocate enough data to satisfy the request. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-mergeDavid S. Miller
2011-06-09batman-adv: use NO_FLAGS define instead of hard-coding 0Marek Lindner
The definition NO_FLAGS was introduced to make the code more readable and shall be used to initialize flag fields. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09batman-adv: Use enums for related constantsSven Eckelmann
CodingStyle "Chapter 12: Macros, Enums and RTL" recommends to use enums for several related constants. Internal states can be used without defining the actual value, but all values which are visible to the outside must be defined as before. Normal values are assigned as usual and flags are defined by shifts of a bit. Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09batman-adv: Rewrite debugfs kobj_to_* helpers as functionsSven Eckelmann
CodingStyle "Chapter 12: Macros, Enums and RTL" highly recommends to use functions instead of macros were possible. This ensures type safety and prevents shadowing of other variables. Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09batman-adv: Fix signedness problem in parse_gw_bandwidthSven Eckelmann
strict_strtoul as used in parse_gw_bandwidth is defined for unsigned long and strict_strtol should be used instead for long. Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09batman-adv: Don't return value in void functionSven Eckelmann
gw_node_delete is defined with "void" as return type, but still tries to return a value. The called function gw_node_delete is also return as void and thus doesn't provide a value for us. Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09batman-adv: accept delayed rebroadcasts to avoid bogus routing under heavy loadDaniele Furlan
When a link is saturated (re)broadcasts of OGMs are delayed. Under heavy load this delay may exceed the orig interval which leads to OGMs being dropped (the code would only accept an OGM rebroadcast if it arrived before the next OGM was broadcasted). With this patch batman-adv will also accept delayed OGMs in order to avoid a bogus influence on the routing metric. Signed-off-by: Daniele Furlan <daniele.furlan@gmail.com> Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09inetpeer: remove unused listEric Dumazet
Andi Kleen and Tim Chen reported huge contention on inetpeer unused_peers.lock, on memcached workload on a 40 core machine, with disabled route cache. It appears we constantly flip peers refcnt between 0 and 1 values, and we must insert/remove peers from unused_peers.list, holding a contended spinlock. Remove this list completely and perform a garbage collection on-the-fly, at lookup time, using the expired nodes we met during the tree traversal. This removes a lot of code, makes locking more standard, and obsoletes two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also removes two pointers in inet_peer structure. There is still a false sharing effect because refcnt is in first cache line of object [were the links and keys used by lookups are located], we might move it at the end of inet_peer structure to let this first cache line mostly read by cpus. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andi Kleen <andi@firstfloor.org> CC: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open sideJerry Chu
This patch lowers the default initRTO from 3secs to 1sec per RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet has been retransmitted, AND the TCP timestamp option is not on. It also adds support to take RTT sample during 3WHS on the passive open side, just like its active open counterpart, and uses it, if valid, to seed the initRTO for the data transmission phase. The patch also resets ssthresh to its initial default at the beginning of the data transmission phase, and reduces cwnd to 1 if there has been MORE THAN ONE retransmission during 3WHS per RFC5681. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09ipv6: generate link local address for GRE tunnelstephen hemminger
Use same logic as SIT tunnel to handle link local address for GRE tunnel. OSPFv3 requires link-local address to function. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08v2 ethtool: remove support for ETHTOOL_GRXNTUPLEAlexander Duyck
This change is meant to remove all support for displaying an ntuple as strings via ETHTOOL_GRXNTUPLE. The reason for this change is due to the fact that multiple issues have been found including: - Multiple buffer overruns for strings being displayed. - Incorrect filters displayed, cleared filters with ring of -2 are displayed - Setting get_rx_ntuple displays no rules if defined. - Endianess wrong on displayed values. - Hard limit of 1024 filters makes display functionality extremely limited The only driver that had supported this interface was ixgbe. Since it no longer uses the interface and due to the issues mentioned above I am submitting this patch to remove it. v2: Updated based on comments from Ben Hutchings - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container - Left ETHTOOL_GRXNTUPLE but commented it as deprecated Also cleaned up set_rx_ntuple since there is no flow spec container to maintain we can drop all the code for the alloc and free of it and just return ops->set_rx_ntuple(). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2011-06-07mac80211: Stop BA session event from deviceShahar Levi
Some devices support BT/WLAN co-existence algorigthms. In order not to harm the system performance and user experience, the device requests not to allow any RX BA session and tear down existing RX BA sessions based on system constraints such as periodic BT activity that needs to limit WLAN activity (eg.SCO or A2DP). In such cases, the intention is to limit the duration of the RX PPDU and therefore prevent the peer device to use A-MPDU aggregation. Adding ieee80211_stop_rx_ba_session() callback that can be used by the driver to stop existing BA sessions. Signed-off-by: Shahar Levi <shahar_levi@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-07Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2011-06-07net: remove interrupt.h inclusion from netdevice.hAlexey Dobriyan
* remove interrupt.g inclusion from netdevice.h -- not needed * fixup fallout, add interrupt.h and hardirq.h back where needed. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06sctp: Guard IPV6 specific code properly.David S. Miller
Outside of net/sctp/ipv6.c, IPV6 specific code needs to be ifdef guarded. This fixes build failures with IPV6 disabled. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06Revert "mac80211: Skip tailroom reservation for full HW-crypto devices"John W. Linville
This reverts commit aac6af5534fade2b18682a0b9efad1a6c04c34c6. Conflicts: net/mac80211/key.c That commit has a race that causes a warning, as documented in the thread here: http://marc.info/?l=linux-wireless&m=130717684914101&w=2 Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-05net: Remove unnecessary semicolonsJoe Perches
Semicolons are not necessary after switch/while/for/if braces so remove them. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05af-packet: Use existing netdev reference for bound sockets.Ben Greear
This saves a network device lookup on each packet transmitted, for sockets that are bound to a network device. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05af-packet: Hold reference to bound network devices.Ben Greear
Old code was probably safe, but with this change we can actually use the netdev object, not just compare the pointer values. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-04Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-06-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits) tg3: Fix tg3_skb_error_unmap() net: tracepoint of net_dev_xmit sees freed skb and causes panic drivers/net/can/flexcan.c: add missing clk_put net: dm9000: Get the chip in a known good state before enabling interrupts drivers/net/davinci_emac.c: add missing clk_put af-packet: Add flag to distinguish VID 0 from no-vlan. caif: Fix race when conditionally taking rtnl lock usbnet/cdc_ncm: add missing .reset_resume hook vlan: fix typo in vlan_dev_hard_start_xmit() net/ipv4: Check for mistakenly passed in non-IPv4 address iwl4965: correctly validate temperature value bluetooth l2cap: fix locking in l2cap_global_chan_by_psm ath9k: fix two more bugs in tx power cfg80211: don't drop p2p probe responses Revert "net: fix section mismatches" drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run() sctp: stop pending timers and purge queues when peer restart asoc drivers/net: ks8842 Fix crash on received packet when in PIO mode. ip_options_compile: properly handle unaligned pointer iwlagn: fix incorrect PCI subsystem id for 6150 devices ...
2011-06-03Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem
2011-06-03mac80211: call dev_alloc_name before copying name to sdataThadeu Lima de Souza Cascardo
This partially reverts 1c5cae815d19ffe02bdfda1260949ef2b1806171, because the netdev name is copied into sdata->name, which is used for debugging messages, for example. Otherwise, we get messages like this: wlan%d: authenticated Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-02net: tracepoint of net_dev_xmit sees freed skb and causes panicKoki Sanagi
Because there is a possibility that skb is kfree_skb()ed and zero cleared after ndo_start_xmit, we should not see the contents of skb like skb->len and skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that and causes panic by NULL pointer dereference. This patch fixes trace_net_dev_xmit not to see the contents of skb directly. If you want to reproduce this panic, 1. Get tracepoint of net_dev_xmit on 2. Create 2 guests on KVM 2. Make 2 guests use virtio_net 4. Execute netperf from one to another for a long time as a network burden 5. host will panic(It takes about 30 minutes) Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>