summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2011-05-17Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/vmxnet3/vmxnet3_ethtool.c net/core/dev.c
2011-05-17ipv4: Don't use enums as bitmasks in ip_fragment.cDavid S. Miller
Noticed by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17net: ethtool: fix IPV6 checksum feature name stringMichał Mirosław
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17net: Change netdev_fix_features messages loglevelMichael S. Tsirkin
Cool, how about we make 'Features changed' debug as well? This way userspace can't fill up the log just by tweaking tun features with an ioctl. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17net: recvmmsg: Strip MSG_WAITFORONE when calling recvmsgAnton Blanchard
recvmmsg fails on a raw socket with EINVAL. The reason for this is packet_recvmsg checks the incoming flags: err = -EINVAL; if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE)) goto out; This patch strips out MSG_WAITFORONE when calling recvmmsg which fixes the issue. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@kernel.org [2.6.34+] Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2011-05-17net: ping: fix build failureVasiliy Kulikov
If CONFIG_PROC_SYSCTL=n the building process fails: ping.c:(.text+0x52af3): undefined reference to `inet_get_ping_group_range_net' Moved inet_get_ping_group_range_net() to ping.c. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17net: use hlist_del_rcu() in dev_change_name()Eric Dumazet
Using plain hlist_del() in dev_change_name() is wrong since a concurrent reader can crash trying to dereference LIST_POISON1. Bug introduced in commit 72c9528bab94 (net: Introduce dev_get_by_name_rcu()) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17bluetooth: Fix warnings in l2cap_core.cDavid S. Miller
net/bluetooth/l2cap_core.c: In function ‘l2cap_recv_frame’: net/bluetooth/l2cap_core.c:3758:15: warning: ‘sk’ may be used uninitialized in this function net/bluetooth/l2cap_core.c:3758:15: note: ‘sk’ was declared here net/bluetooth/l2cap_core.c:3791:15: warning: ‘sk’ may be used uninitialized in this function net/bluetooth/l2cap_core.c:3791:15: note: ‘sk’ was declared here Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-17Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2011-05-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Change netdev_fix_features messages loglevel vmxnet3: Fix inconsistent LRO state after initialization sfc: Fix oops in register dump after mapping change IPVS: fix netns if reading ip_vs_* procfs entries bridge: fix forwarding of IPv6
2011-05-16Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem Conflicts: drivers/net/wireless/iwlwifi/iwl-agn-tx.c net/mac80211/sta_info.h
2011-05-16net: Change netdev_fix_features messages loglevelMichał Mirosław
Those reduced to DEBUG can possibly be triggered by unprivileged processes and are nothing exceptional. Illegal checksum combinations can only be caused by driver bug, so promote those messages to WARN. Since GSO without SG will now only cause DEBUG message from netdev_fix_features(), remove the workaround from register_netdevice(). Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16net: convert to new cpumask APIKOSAKI Motohiro
We plan to remove cpu_xx() old api later. Thus this patch convert it. This patch has no functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16ipv4: more compliant RFC 3168 supportEric Dumazet
Commit 6623e3b24a5e (ipv4: IP defragmentation must be ECN aware) was an attempt to not lose "Congestion Experienced" (CE) indications when performing datagram defragmentation. Stefanos Harhalakis raised the point that RFC 3168 requirements were not completely met by this commit. In particular, we MUST detect invalid combinations and eventually drop illegal frames. Reported-by: Stefanos Harhalakis <v13@v13.gr> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16cfg80211: make stripping of 802.11 header optional from AMSDUYogesh Ashok Powar
Currently the devices that have already stripped IEEE 802.11 header from the AMSDU SKB can not use ieee80211_amsdu_to_8023s routine. This patch enhances ieee80211_amsdu_to_8023s() API by changing mandatory removing of IEEE 802.11 header from AMSDU to optional. Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16nl80211: Move peer link state definition to nl80211Javier Cardona
These definitions need to be exposed now that we can set the peer link states via NL80211_ATTR_STA_PLINK_STATE. They were already being (opaquely) reported by NL80211_STA_INFO_PLINK_STATE. Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16net/rfkill/core.c: Avoid leaving freed data in a listJulia Lawall
The list_for_each_entry loop can fail, in which case the list element is not removed from the list rfkill_fds. Since this list is not accessed by the loop, the addition of &data->list into the list is just moved after the loop. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E1,E2; identifier l; @@ *list_add(&E->l,E1); ... when != E1 when != list_del(&E->l) when != list_del_init(&E->l) when != E = E2 *kfree(E);// </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16mac80211: sparse RCU annotationsJohannes Berg
This adds sparse RCU annotations to most of mac80211, only the mesh code remains to be done. Due the the previous patches, the annotations are pretty simple. The only thing that this actually changes is removing the RCU usage of key->sta in debugfs since this pointer isn't actually an RCU-managed pointer (it only has a single assignment done before the key even goes live). As that is otherwise harmless, I decided to make it part of this patch. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16mac80211: fix TX a-MPDU lockingJohannes Berg
During my quest to make mac80211 not have any RCU warnings from sparse, I came across the a-MPDU code again and it wasn't quite clear why it isn't racy. So instead of assigning the tid_tx array with just the spinlock held in ieee80211_start_tx_ba_session use a separate temporary array protected only by the spinlock and protect all assignments to the "live" array by both the spinlock and the mutex so that other code is easily verified to be correct. Due to pointer assignment atomicity I don't think this is a real issue, but I'm not sure, especially on Alpha the current code might be problematic. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16cfg80211: advertise possible interface combinationsJohannes Berg
Add the ability to advertise interface combinations in nl80211. This allows the driver to indicate what the combinations are that it supports. "Combinations" of just a single interface are implicit, as previously. Note that cfg80211 will enforce that the restrictions are met, but not for all drivers yet (once all drivers are updated, we can remove the flag and enforce for all). When no combinations are actually supported, an empty list will be exported so that userspace can know if the kernel exported this info or not (although it isn't clear to me what tools using the info should do if the kernel didn't export it). Since some interface types are purely virtual/software and don't fit the restrictions, those are exposed in a new list of pure SW types, not subject to restrictions. This mainly exists to handle AP-VLAN and monitor interfaces in mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-16ipv4: Trivial rt->rt_src conversions in net/ipv4/route.cDavid S. Miller
At these points we have a fully filled in value via the IP header the form of ip_hdr(skb)->saddr Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16net: ping: dont call udp_ioctl()Eric Dumazet
udp_ioctl() really handles UDP and UDPLite protocols. 1) It can increment UDP_MIB_INERRORS in case first_packet_length() finds a frame with bad checksum. 2) It has a dependency on sizeof(struct udphdr), not applicable to ICMP/PING If ping sockets need to handle SIOCINQ/SIOCOUTQ ioctl, this should be done differently. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16netfilter: nf_ct_sip: fix SDP parsing in TCP SIP messages for some Cisco phonesPatrick McHardy
Some Cisco phones do not place the Content-Length field at the end of the SIP message. This is valid, due to a misunderstanding of the specification the parser expects the SDP body to start directly after the Content-Length field. Fix the parser to scan for \r\n\r\n to locate the beginning of the SDP body. Reported-by: Teresa Kang <teresa_kang@gemtek.com.tw> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-16netfilter: nf_ct_sip: validate Content-Length in TCP SIP messagesPatrick McHardy
Verify that the message length of a single SIP message, which is calculated based on the Content-Length field contained in the SIP message, does not exceed the packet boundaries. Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-15caif: remove unesesarry exportssjur.brandeland@stericsson.com
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Bugfix debugfs directory name must be unique.sjur.brandeland@stericsson.com
Race condition caused debugfs_create_dir() to fail due to duplicate name. Use atomic counter to create unique directory name. net_ratelimit() is introduced to limit debug printouts. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Handle dev_queue_xmit errors.sjur.brandeland@stericsson.com
Do proper handling of dev_queue_xmit errors in order to avoid double free of skb and leaks in error conditions. In cfctrl pending requests are removed when CAIF Link layer goes down. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: prepare support for namespacessjur.brandeland@stericsson.com
Use struct net to reference CAIF configuration object instead of static variables. Refactor functions caif_connect_client, caif_disconnect_client and squach files cfcnfg.c and caif_config_utils. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Protected in-flight packets using dev or sock refcont.sjur.brandeland@stericsson.com
CAIF Socket Layer and ip-interface registers reference counters in CAIF service layer. The functions sock_hold, sock_put and dev_hold, dev_put are used by CAIF Stack to protect from freeing memory while packets are in-flight. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Move refcount from service layer to sock and dev.sjur.brandeland@stericsson.com
Instead of having reference counts in caif service layers, we hook into existing refcount handling in socket layer and netdevice. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Add ref-count to framing layersjur.brandeland@stericsson.com
Introduce Per-cpu reference for lower part of CAIF Stack. Before freeing payload is disabled, synchronize_rcu() is called, and then ref-count verified to be zero. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Use RCU and lists in cfcnfg.c for managing caif link layerssjur.brandeland@stericsson.com
RCU lists are used for handling the link layers instead of array. When generating CAIF phy-id, ifindex is used as base. Legal range is 1-6. Introduced set_phy_state() for managing CAIF Link layer state. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Use RCU instead of spin-lock in caif_dev.csjur.brandeland@stericsson.com
RCU read_lock and refcount is used to protect in-flight packets. Use RCU and counters to manage freeing lower part of the CAIF stack if CAIF-link layer is removed. Old solution based on delaying removal of device is removed. When CAIF link layer goes down the use of CAIF link layer is disabled (by calling caif_set_phy_state()), but removal and freeing of the lower part of the CAIF stack is done when Link layer is unregistered. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15caif: Use rcu_read_lock in CAIF mux layer.sjur.brandeland@stericsson.com
Replace spin_lock with rcu_read_lock when accessing lists to layers and cache. While packets are in flight rcu_read_lock should not be held, instead ref-counters are used in combination with RCU. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15IPVS: fix netns if reading ip_vs_* procfs entriesHans Schillstrom
Without this patch every access to ip_vs in procfs will increase the netns count i.e. an unbalanced get_net()/put_net(). (ipvsadm commands also use procfs.) The result is you can't exit a netns if reading ip_vs_* procfs entries. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-15bridge: fix forwarding of IPv6Stephen Hemminger
The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e bridge: Reset IPCB when entering IP stack on NF_FORWARD broke forwarding of IPV6 packets in bridge because it would call bp_parse_ip_options with an IPV6 packet. Reported-by: Noah Meyerhans <noahm@debian.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-15net: ping: small changesEric Dumazet
ping_table is not __read_mostly, since it contains one rwlock, and is static to ping.c ping_port_rover & ping_v4_lookup are static Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-mergeDavid S. Miller
2011-05-14batman-adv: reset broadcast flood protection on errorMarek Lindner
The broadcast flood protection should be reset to its original value if the primary interface could not be retrieved. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-05-14batman-adv: Add missing hardif_free_ref in forw_packet_freeSven Eckelmann
add_bcast_packet_to_list increases the refcount for if_incoming but the reference count is never decreased. The reference count must be increased for all kinds of forwarded packets which have the primary interface stored and forw_packet_free must decrease them. Also purge_outstanding_packets has to invoke forw_packet_free when a work item was really cancelled. This regression was introduced in 32ae9b221e788413ce68feaae2ca39e406211a0a. Reported-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-05-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: bridge: fix forwarding of IPv6 bonding,llc: Fix structure sizeof incompatibility for some PDUs ipv6: restore correct ECN handling on TCP xmit ne-h8300: Fix regression caused during net_device_ops conversion hydra: Fix regression caused during net_device_ops conversion zorro8390: Fix regression caused during net_device_ops conversion sfc: Always map MCDI shared memory as uncacheable ehea: Fix memory hotplug oops libertas: fix cmdpendingq locking iwlegacy: fix IBSS mode crashes ath9k: Fix a warning due to a queued work during S3 state mac80211: don't start the dynamic ps timer if not associated
2011-05-13ipv4: Remove rt->rt_dst reference from ip_forward_options().David S. Miller
At this point iph->daddr equals what rt->rt_dst would hold. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13ipv4: Remove route key identity dependencies in ip_rt_get_source().David S. Miller
Pass in the sk_buff so that we can fetch the necessary keys from the packet header when working with input routes. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13ipv4: Always call ip_options_build() after rest of IP header is filled in.David S. Miller
This will allow ip_options_build() to reliably look at the values of iph->{daddr,saddr} Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13ipv4: Kill spurious write to iph->daddr in ip_forward_options().David S. Miller
This code block executes when opt->srr_is_hit is set. It will be set only by ip_options_rcv_srr(). ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR option addresses, and when it matches one 1) looks up the route for that nexthop and 2) on route lookup success it writes that nexthop value into iph->daddr. ip_forward_options() runs later, and again walks the SRR option addresses looking for the option matching the destination of the route stored in skb_rtable(). This route will be the same exact one looked up for the nexthop by ip_options_rcv_srr(). Therefore "rt->rt_dst == iph->daddr" must be true. All it really needs to do is record the route's source address in the matching SRR option adddress. It need not write iph->daddr again, since that has already been done by ip_options_rcv_srr() as detailed above. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13net:set valid name before calling ndo_init()Peter Pan(潘卫平)
In commit 1c5cae815d19 (net: call dev_alloc_name from register_netdevice), a bug of bonding was involved, see example 1 and 2. In register_netdevice(), the name of net_device is not valid until dev_get_valid_name() is called. But dev->netdev_ops->ndo_init(that is bond_init) is called before dev_get_valid_name(), and it uses the invalid name of net_device. I think register_netdevice() should make sure that the name of net_device is valid before calling ndo_init(). example 1: modprobe bonding ls /proc/net/bonding/bond%d ps -eLf root 3398 2 3398 0 1 21:34 ? 00:00:00 [bond%d] example 2: modprobe bonding max_bonds=3 [ 170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011) [ 170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details. [ 170.102469] ------------[ cut here ]------------ [ 170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157() [ 170.104075] Hardware name: VirtualBox [ 170.105065] proc_dir_entry 'bonding/bond%d' already registered [ 170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding] [ 170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14 [ 170.108935] Call Trace: [ 170.109382] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f [ 170.109911] [<c051a42a>] ? proc_register+0x126/0x157 [ 170.110329] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f [ 170.110846] [<c051a42a>] proc_register+0x126/0x157 [ 170.111870] [<c051a4dd>] proc_create_data+0x82/0x98 [ 170.112335] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding] [ 170.112905] [<f94dd806>] bond_init+0x77/0xa5 [bonding] [ 170.113319] [<c0721ac6>] register_netdevice+0x8c/0x1d3 [ 170.113848] [<f94e0e30>] bond_create+0x6c/0x90 [bonding] [ 170.114322] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding] [ 170.114879] [<c0401240>] do_one_initcall+0x76/0x122 [ 170.115317] [<f94f4000>] ? 0xf94f3fff [ 170.115799] [<c0463f1e>] sys_init_module+0x1286/0x140d [ 170.116879] [<c07c6d9f>] sysenter_do_call+0x12/0x28 [ 170.117404] ---[ end trace 64e4fac3ae5fff1a ]--- [ 170.117924] bond%d: Warning: failed to register to debugfs [ 170.128728] ------------[ cut here ]------------ [ 170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157() [ 170.130323] Hardware name: VirtualBox [ 170.130797] proc_dir_entry 'bonding/bond%d' already registered [ 170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding] [ 170.133731] Pid: 3457, comm: modprobe Tainted: G W 2.6.39-rc2+ #14 [ 170.134308] Call Trace: [ 170.134743] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f [ 170.135305] [<c051a42a>] ? proc_register+0x126/0x157 [ 170.135820] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f [ 170.137168] [<c051a42a>] proc_register+0x126/0x157 [ 170.137700] [<c051a4dd>] proc_create_data+0x82/0x98 [ 170.138174] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding] [ 170.138745] [<f94dd806>] bond_init+0x77/0xa5 [bonding] [ 170.139278] [<c0721ac6>] register_netdevice+0x8c/0x1d3 [ 170.139828] [<f94e0e30>] bond_create+0x6c/0x90 [bonding] [ 170.140361] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding] [ 170.140927] [<c0401240>] do_one_initcall+0x76/0x122 [ 170.141494] [<f94f4000>] ? 0xf94f3fff [ 170.141975] [<c0463f1e>] sys_init_module+0x1286/0x140d [ 170.142463] [<c07c6d9f>] sysenter_do_call+0x12/0x28 [ 170.142974] ---[ end trace 64e4fac3ae5fff1b ]--- [ 170.144949] bond%d: Warning: failed to register to debugfs Signed-off-by: Weiping Pan <panweiping3@gmail.com> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13net: ipv4: add IPPROTO_ICMP socket kindVasiliy Kulikov
This patch adds IPPROTO_ICMP socket kind. It makes it possible to send ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages without any special privileges. In other words, the patch makes it possible to implement setuid-less and CAP_NET_RAW-less /bin/ping. In order not to increase the kernel's attack surface, the new functionality is disabled by default, but is enabled at bootup by supporting Linux distributions, optionally with restriction to a group or a group range (see below). Similar functionality is implemented in Mac OS X: http://www.manpagez.com/man/4/icmp/ A new ping socket is created with socket(PF_INET, SOCK_DGRAM, PROT_ICMP) Message identifiers (octets 4-5 of ICMP header) are interpreted as local ports. Addresses are stored in struct sockaddr_in. No port numbers are reserved for privileged processes, port 0 is reserved for API ("let the kernel pick a free number"). There is no notion of remote ports, remote port numbers provided by the user (e.g. in connect()) are ignored. Data sent and received include ICMP headers. This is deliberate to: 1) Avoid the need to transport headers values like sequence numbers by other means. 2) Make it easier to port existing programs using raw sockets. ICMP headers given to send() are checked and sanitized. The type must be ICMP_ECHO and the code must be zero (future extensions might relax this, see below). The id is set to the number (local port) of the socket, the checksum is always recomputed. ICMP reply packets received from the network are demultiplexed according to their id's, and are returned by recv() without any modifications. IP header information and ICMP errors of those packets may be obtained via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source quenches and redirects are reported as fake errors via the error queue (IP_RECVERR); the next hop address for redirects is saved to ee_info (in network order). socket(2) is restricted to the group range specified in "/proc/sys/net/ipv4/ping_group_range". It is "1 0" by default, meaning that nobody (not even root) may create ping sockets. Setting it to "100 100" would grant permissions to the single group (to either make /sbin/ping g+s and owned by this group or to grant permissions to the "netadmins" group), "0 4294967295" would enable it for the world, "100 4294967295" would enable it for the users, but not daemons. The existing code might be (in the unlikely case anyone needs it) extended rather easily to handle other similar pairs of ICMP messages (Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply etc.). Userspace ping util & patch for it: http://openwall.info/wiki/people/segoon/ping For Openwall GNU/*/Linux it was the last step on the road to the setuid-less distro. A revision of this patch (for RHEL5/OpenVZ kernels) is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs: http://mirrors.kernel.org/openwall/Owl/current/iso/ Initially this functionality was written by Pavel Kankovsky for Linux 2.4.32, but unfortunately it was never made public. All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with the patch. PATCH v3: - switched to flowi4. - minor changes to be consistent with raw sockets code. PATCH v2: - changed ping_debug() to pr_debug(). - removed CONFIG_IP_PING. - removed ping_seq_fops.owner field (unused for procfs). - switched to proc_net_fops_create(). - switched to %pK in seq_printf(). PATCH v1: - fixed checksumming bug. - CAP_NET_RAW may not create icmp sockets anymore. RFC v2: - minor cleanups. - introduced sysctl'able group range to restrict socket(2). Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13bridge: fix forwarding of IPv6Stephen Hemminger
The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e bridge: Reset IPCB when entering IP stack on NF_FORWARD broke forwarding of IPV6 packets in bridge because it would call bp_parse_ip_options with an IPV6 packet. Reported-by: Noah Meyerhans <noahm@debian.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13convert old cpumask API into new oneKOSAKI Motohiro
Adapt new API. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>