summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2013-01-17netfilter: add connlabel conntrack extensionFlorian Westphal
similar to connmarks, except labels are bit-based; i.e. all labels may be attached to a flow at the same time. Up to 128 labels are supported. Supporting more labels is possible, but requires increasing the ct offset delta from u8 to u16 type due to increased extension sizes. Mapping of bit-identifier to label name is done in userspace. The extension is enabled at run-time once "-m connlabel" netfilter rules are added. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-01-17ipv6: fix ipv6_prefix_equal64_half mask conversionFabio Baltieri
Fix the 64bit optimized version of ipv6_prefix_equal to convert the bitmask to network byte order only after the bit-shift. The bug was introduced in: 3867517 ipv6: 64bit version of ipv6_prefix_equal(). Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17net: increase fragment memory usage limitsJesper Dangaard Brouer
Increase the amount of memory usage limits for incomplete IP fragments. Arguing for new thresh high/low values: High threshold = 4 MBytes Low threshold = 3 MBytes The fragmentation memory accounting code, tries to account for the real memory usage, by measuring both the size of frag queue struct (inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKB's truesize. We want to be able to handle/hold-on-to enough fragments, to ensure good performance, without causing incomplete fragments to hurt scalability, by causing the number of inet_frag_queue to grow too much (resulting longer searches for frag queues). For IPv4, how much memory does the largest frag consume. Maximum size fragment is 64K, which is approx 44 fragments with MTU(1500) sized packets. Sizeof(struct ipq) is 200. A 1500 byte packet results in a truesize of 2944 (not 2048 as I first assumed) (44*2944)+200 = 129736 bytes The current default high thresh of 262144 bytes, is obviously problematic, as only two 64K fragments can fit in the queue at the same time. How many 64K fragment can we fit into 4 MBytes: 4*2^20/((44*2944)+200) = 32.34 fragment in queues An attacker could send a separate/distinct fake fragment packets per queue, causing us to allocate one inet_frag_queue per packet, and thus attacking the hash table and its lists. How many frag queue do we need to store, and given a current hash size of 64, what is the average list length. Using one MTU sized fragment per inet_frag_queue, each consuming (2944+200) 3144 bytes. 4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length An attack could send small fragments, the smallest packet I could send resulted in a truesize of 896 bytes (I'm a little surprised by this). 4*2^20/(896+200) = 3827 frag queues -> 59 avg list length When increasing these number, we also need to followup with improvements, that is going to help scalability. Simply increasing the hash size, is not enough as the current implementation does not have a per hash bucket locking. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17sk-filter: Add ability to lock a socket filter programVincent Bernat
While a privileged program can open a raw socket, attach some restrictive filter and drop its privileges (or send the socket to an unprivileged program through some Unix socket), the filter can still be removed or modified by the unprivileged program. This commit adds a socket option to lock the filter (SO_LOCK_FILTER) preventing any modification of a socket filter program. This is similar to OpenBSD BIOCLOCK ioctl on bpf sockets, except even root is not allowed change/drop the filter. The state of the lock can be read with getsockopt(). No error is triggered if the state is not changed. -EPERM is returned when a user tries to remove the lock or to change/remove the filter while the lock is active. The check is done directly in sk_attach_filter() and sk_detach_filter() and does not affect only setsockopt() syscall. Signed-off-by: Vincent Bernat <bernat@luffy.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17ipv6: Fix endianess warning in ip6_flow_hdr().YOSHIFUJI Hideaki
Commit 3e4e4c1f ("ipv6: Introduce ip6_flow_hdr() to fill version, tclass and flowlabel.) uses ntohl(), which should be htonl(). Found by Fengguang Wu <fengguang.wu@intel.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-16cfg80211: check radar interface combinationsSimon Wunderlich
To ease further DFS development regarding interface combinations, use the interface combinations structure to test for radar capabilities. Drivers can specify which channel widths they support, and in which modes. Right now only a single AP interface is allowed, but as the DFS code evolves other combinations can be enabled. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16cfg80211: Allow use_mfp to be specified with the connect commandJouni Malinen
The NL80211_ATTR_USE_MFP attribute was originally added for NL80211_CMD_ASSOCIATE, but it is actually as useful (if not even more useful) with NL80211_CMD_CONNECT, so process that attribute with the connect command, too. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16nl80211: allow user-space to set address for P2P_DEVICEArend van Spriel
As per email discussion Jouni Malinen pointed out that: "P2P message exchanges can be executed on the current operating channel of any operation (both P2P and non-P2P station). These can be on 5 GHz and even on 60 GHz (so yes, you _can_ do GO Negotiation on 60 GHz). As an example, it would be possible to receive a GO Negotiation Request frame on a 5 GHz only radio and then to complete GO Negotiation on that band. This can happen both when connected to a P2P group (through client discoverability mechanism) and when connected to a legacy AP (assuming the station receive Probe Request frame from full scan in the beginning of P2P device discovery)." This means that P2P messages can be sent over different radio devices. However, these should use the same P2P device address so it should be able to provision this from user-space. This patch adds a parameter for this to struct vif_params which should only be used during creation of the P2P device interface. Cc: Jouni Malinen <j@w1.fi> Cc: Greg Goldman <ggoldman@broadcom.com> Cc: Jithu Jance <jithu@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> [add error checking] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16{cfg,nl}80211: mesh power mode primitives and userspace accessMarco Porsch
Add the nl80211_mesh_power_mode enumeration which holds possible values for the mesh power mode. These modes are unknown, active, light sleep and deep sleep. Add power_mode entry to the mesh config structure to hold the user-configured default mesh power mode. This value will be used for new peer links. Add the dot11MeshAwakeWindowDuration value to the mesh config. The awake window is a duration in TU describing how long the STA will stay awake after transmitting its beacon in PS mode. Add access routines to: - get/set local link-specific power mode (STA) - get remote STA's link-specific power mode (STA) - get remote STA's non-peer power mode (STA) - get/set default mesh power mode (mesh config) - get/set mesh awake window duration (mesh config) All config changes may be done at mesh runtime and take effect immediately. Signed-off-by: Marco Porsch <marco@cozybit.com> Signed-off-by: Ivan Bezyazychnyy <ivan.bezyazychnyy@gmail.com> Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> [fix commit message line length, error handling in set station] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16{cfg,nl,mac}80211: set beacon interval and DTIM period on mesh joinMarco Porsch
Move the default mesh beacon interval and DTIM period to cfg80211 and make them accessible to nl80211. This enables setting both values when joining an MBSS. Previously the DTIM parameter was not set by mac80211 so the driver's default value was used. Signed-off-by: Marco Porsch <marco@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16mac80211: call restart complete at wowlan resume timeJohannes Berg
When the driver's resume function can't completely restore the configuration in the device, it returns 1 from the callback which will be treated like a HW restart request, but done directly. In this case, also call the driver's restart_complete() function so it can finish the reconfiguration there. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16{cfg,mac}80211.h: fix some kernel-doc warningsYacine Belkadi
When building the 80211 DocBook, scripts/kernel-doc reports the following type of warnings: Warning(include/net/cfg80211.h:334): No description found for return value of 'cfg80211_get_chandef_type' These warnings are only reported when scripts/kernel-doc runs in verbose mode. To fix these use "Return:" to describe function return values. Signed-off-by: Yacine Belkadi <yacine.belkadi.1@gmail.com> [adjust for freq_reg_info() change] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: Documentation/networking/ip-sysctl.txt drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c Both conflicts were simply overlapping context. A build fix for qlcnic is in here too, simply removing the added devinit annotations which no longer exist. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14pkt_sched: namespace aware act_mirredBenjamin LaHaise
Eric Dumazet pointed out that act_mirred needs to find the current net_ns, and struct net pointer is not provided in the call chain. His original patch made use of current->nsproxy->net_ns to find the network namespace, but this fails to work correctly for userspace code that makes use of netlink sockets in different network namespaces. Instead, pass the "struct net *" down along the call chain to where it is needed. This version removes the ifb changes as Eric has submitted that patch separately, but is otherwise identical to the previous version. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6 netevent: Remove old_neigh from netevent_redirect.YOSHIFUJI Hideaki / 吉藤英明
The only user is cxgb3 driver. old_neigh is used to check device change, but it must not happen on redirect. In this sense, we can remove old_neigh argument. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: 64bit version of ipv6_prefix_equal().YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: Remove __ipv6_prefix_equal().YOSHIFUJI Hideaki / 吉藤英明
ipv6_prefix_equal() just casts its arguments and it is the only user of __ipv6_prefix_equal(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: 64bit version of ipv6_addr_set().YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: 64bit version of ipv6_addr_v4mapped().YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: 64bit version of ipv6_addr_loopback().YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: 64bit version of ipv6_addr_diff().YOSHIFUJI Hideaki / 吉藤英明
Introduce __ipv6_addr_diff64() to to find the first different bit between two addresses on 64bit architectures. 32bit version is still available as __ipv6_addr_diff32(), and __ipv6_addr_diff() automatically selects appropriate version. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14wireless: make the reg_notifier() voidLuis R. Rodriguez
The reg_notifier()'s return value need not be checked as it is only supposed to do post regulatory work and that should never fail. Any behaviour to regulatory that needs to be considered before cfg80211 does work to a driver should be specified by using the already existing flags, the reg_notifier() just does post processing should it find it needs to. Also make lbs_reg_notifier static. Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> [move lbs_reg_notifier to not break compile] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-14ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().YOSHIFUJI Hideaki / 吉藤英明
Move generalized version of ipv6_is_mld() to header, and use it from ip6_mc_input(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: Introduce ip6_flowinfo() to extract flowinfo (tclass + flowlabel).YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14ipv6: Introduce ip6_flow_hdr() to fill version, tclass and flowlabel.YOSHIFUJI Hideaki / 吉藤英明
This is not only for readability but also for optimization. What we do here is to build the 32bit word at the beginning of the ipv6 header (the "ip6_flow" virtual member of struct ip6_hdr in RFC3542) and we do not need to read the tclass portion of the target buffer. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-12netfilter: nf_conntrack: fix BUG_ON while removing nf_conntrack with netnsPablo Neira Ayuso
canqun zhang reported that we're hitting BUG_ON in the nf_conntrack_destroy path when calling kfree_skb while rmmod'ing the nf_conntrack module. Currently, the nf_ct_destroy hook is being set to NULL in the destroy path of conntrack.init_net. However, this is a problem since init_net may be destroyed before any other existing netns (we cannot assume any specific ordering while releasing existing netns according to what I read in recent emails). Thanks to Gao feng for initial patch to address this issue. Reported-by: canqun zhang <canqunzhang@gmail.com> Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-01-10ipv6: Optimize ipv6_change_dsfield().YOSHIFUJI Hideaki / 吉藤英明
Do not convert endian back and forth. If the caller uses contant "mask" argument (and most callers do), we can omit runtime endian conversion here. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-09NFC: Initial Secure Element APISamuel Ortiz
Each NFC adapter can have several links to different secure elements and that property needs to be exported by the drivers. A secure element link can be enabled and disabled, and card emulation will be handled by the currently active one. Otherwise card emulation will be host implemented. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-09NFC: Add HCI quirks to support driver (non)standard implementationsEric Lapuyade
Some chips diverge from the HCI spec in their implementation of standard features. This adds a new quirks parameter to nfc_hci_allocate_device() to let the driver indicate its divergence. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-09NFC: Added error handling in event_received hci opsEric Lapuyade
There is no use to return an error if the caller doesn't get it. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-09NFC: Fixed nfc core and hci unregistration and cleanupEric Lapuyade
When an adapter is removed, it will unregister itself from hci and/or nfc core. In order to do that safely, work tasks must first be canceled and prevented to be scheduled again, before the hci or nfc device can be destroyed. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-09ipv6: move csum_ipv6_magic() and udp6_csum_init() into static libraryCong Wang
As suggested by David, udp6_csum_init() is too big to be inlined, move it to ipv6 static library, net/ipv6/ip6_checksum.c. And the generic csum_ipv6_magic() too. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-07tcp: make sysctl_tcp_ecn namespace awareHannes Frederic Sowa
As per suggestion from Eric Dumazet this patch makes tcp_ecn sysctl namespace aware. The reason behind this patch is to ease the testing of ecn problems on the internet and allows applications to tune their own use of ecn. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-07net: use ETHTOOL_FWVERS_LEN instead of ETHTOOL_BUSINFO_LEN for fw_ver stringsJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03wireless: use __alignedJohannes Berg
Use __aligned(...) instead of __attribute__((aligned(...))) in mac80211 and cfg80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03mac80211: split TX aggregation stop actionJohannes Berg
When TX aggregation is stopped, there are a few different cases: - connection with the peer was dropped - session stop was requested locally - session stop was requested by the peer - connection was dropped while a session is stopping The behaviour in these cases should be different, if the connection is dropped then the driver should drop all frames, otherwise the frames may continue to be transmitted, aggregated in the case of a locally requested session stop or unaggregated in the case of the peer requesting session stop. Split these different cases so that the driver can act accordingly; however, treat local and remote stop the same way and ask the driver to not send frames as aggregated packets any more. In the case of connection drop, the stop callback the driver is otherwise supposed to call is no longer required. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03mac80211: fix channel context iterationJohannes Berg
During suspend/resume channel contexts might be iterated even if they haven't been re-added to the driver, keep track of this and skip them in iteration. Also use the new status for sanity checks. Also clarify the fact that during HW restart all contexts are iterated over (thanks Eliad.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03regulatory: use IS_ERR macro family for freq_reg_infoJohannes Berg
Instead of returning an error and filling a pointer return the pointer and an ERR_PTR value in error cases. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03regulatory: use RCU to protect last_requestJohannes Berg
This will allow making freq_reg_info() lock-free. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03regulatory: use RCU to protect global and wiphy regdomainsJohannes Berg
To simplify the locking and not require cfg80211_mutex (which nl80211 uses to access the global regdomain) and also to make it possible for drivers to access their wiphy->regd safely, use RCU to protect these pointers. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03regulatory: remove handling of channel bandwidthJohannes Berg
The channel bandwidth handling isn't really quite right, it assumes that a 40 MHz channel is really two 20 MHz channels, which isn't strictly true. This is the way the regulatory database handling is defined right now though so remove the logic to handle other channel widths. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-12-28Merge branch 'master' of git://1984.lsi.us.es/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== The following batch contains Netfilter fixes for 3.8-rc1. They are a mixture of old bugs that have passed unnoticed (I'll pass these to stable) and more fresh ones from the previous merge window, they are: * Fix for MAC address in 6in4 tunnels via NFLOG that results in ulogd showing up wrong address, from Bob Hockney. * Fix a comment in nf_conntrack_ipv6, from Florent Fourcot. * Fix a leak an error path in ctnetlink while creating an expectation, from Jesper Juhl. * Fix missing ICMP time exceeded in the IPv6 defragmentation code, from Haibo Xi. * Fix inconsistent handling of routing changes in MASQUERADE for the new connections case, from Andrew Collins. * Fix a missing skb_reset_transport in ip[6]t_REJECT that leads to crashes in the ixgbe driver (since it seems to access the transport header with TSO enabled), from Mukund Jampala. * Recover obsoleted NOTRACK target by including it into the CT and spot a warning via printk about being obsoleted. Many people don't check the scheduled to be removal file under Documentation, so we follow some less agressive approach to kill this in a year or so. Spotted by Florian Westphal, patch from myself. * Fix race condition in xt_hashlimit that allows to create two or more entries, from myself. * Fix crash if the CT is used due to the recently added facilities to consult the dying and unconfirmed conntrack lists, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabledLi Zefan
sock->sk_cgrp_prioidx won't be used at all if CONFIG_NETPRIO_CGROUP=n. Signed-off-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-24netfilter: xt_CT: recover NOTRACK target supportPablo Neira Ayuso
Florian Westphal reported that the removal of the NOTRACK target (9655050 netfilter: remove xt_NOTRACK) is breaking some existing setups. That removal was scheduled for removal since long time ago as described in Documentation/feature-removal-schedule.txt What: xt_NOTRACK Files: net/netfilter/xt_NOTRACK.c When: April 2011 Why: Superseded by xt_CT Still, people may have not notice / may have decided to stick to an old iptables version. I agree with him in that some more conservative approach by spotting some printk to warn users for some time is less agressive. Current iptables 1.4.16.3 already contains the aliasing support that makes it point to the CT target, so upgrading would fix it. Still, the policy so far has been to avoid pushing our users to upgrade. As a solution, this patch recovers the NOTRACK target inside the CT target and it now spots a warning. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Really fix tuntap SKB use after free bug, from Eric Dumazet. 2) Adjust SKB data pointer to point past the transport header before calling icmpv6_notify() so that the headers are in the state which that function expects. From Duan Jiong. 3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason Wang. 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov. 5) Don't destroy mutex after freeing up device private in mac802154, fix also from Konstantin Khlebnikov. 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch. 7) SCTP HMAC kconfig rework, from Neil Horman. 8) Fix SCTP jprobes function signature, otherwise things explode, from Daniel Borkmann. 9) Fix typo in ipv6-offload Makefile variable reference, from Simon Arlott. 10) Don't fail USBNET open just because remote wakeup isn't supported, from Oliver Neukum. 11) be2net driver bug fixes from Sathya Perla. 12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David Woodhouse. 13) Fix MTU changing regression in 8139cp driver, from John Greene. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) solos-pci: ensure all TX packets are aligned to 4 bytes solos-pci: add firmware upgrade support for new models solos-pci: remove superfluous debug output solos-pci: add GPIO support for newer versions on Geos board 8139cp: Prevent dev_close/cp_interrupt race on MTU change net: qmi_wwan: add ZTE MF880 drivers/net: Use of_match_ptr() macro in smsc911x.c drivers/net: Use of_match_ptr() macro in smc91x.c ipv6: addrconf.c: remove unnecessary "if" bridge: Correctly encode addresses when dumping mdb entries bridge: Do not unregister all PF_BRIDGE rtnl operations use generic usbnet_manage_power() usbnet: generic manage_power() usbnet: handle PM failure gracefully ksz884x: fix receive polling race condition qlcnic: update driver version qlcnic: fix unused variable warnings net: fec: forbid FEC_PTP on SoCs that do not support be2net: fix wrong frag_idx reported by RX CQ be2net: fix be_close() to ensure all events are ack'ed ...
2012-12-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...
2012-12-16netfilter: xt_CT: fix crash while destroy ct templatesPablo Neira Ayuso
In (d871bef netfilter: ctnetlink: dump entries from the dying and unconfirmed lists), we assume that all conntrack objects are inserted in any of the existing lists. However, template conntrack objects were not. This results in hitting BUG_ON in the destroy_conntrack path while removing a rule that uses the CT target. This patch fixes the situation by adding the template lists, which is where template conntrack objects reside now. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-14inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sockChristoph Paasch
If in either of the above functions inet_csk_route_child_sock() or __inet_inherit_port() fails, the newsk will not be freed: unreferenced object 0xffff88022e8a92c0 (size 1592): comm "softirq", pid 0, jiffies 4294946244 (age 726.160s) hex dump (first 32 bytes): 0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00 ................ 02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e [<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5 [<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd [<ffffffff8149b784>] sk_clone_lock+0x16/0x21e [<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b [<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481 [<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b [<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416 [<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc [<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701 [<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4 [<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f [<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233 [<ffffffff814cee68>] ip_rcv+0x217/0x267 [<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553 [<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82 This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus a single sock_put() is not enough to free the memory. Additionally, things like xfrm, memcg, cookie_values,... may have been initialized. We have to free them properly. This is fixed by forcing a call to tcp_done(), ending up in inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary, because it ends up doing all the cleanup on xfrm, memcg, cookie_values, xfrm,... Before calling tcp_done, we have to set the socket to SOCK_DEAD, to force it entering inet_csk_destroy_sock. To avoid the warning in inet_csk_destroy_sock, inet_num has to be set to 0. As inet_csk_destroy_sock does a dec on orphan_count, we first have to increase it. Calling tcp_done() allows us to remove the calls to tcp_clear_xmit_timer() and tcp_cleanup_congestion_control(). A similar approach is taken for dccp by calling dccp_done(). This is in the kernel since 093d282321 (tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()), thus since version >= 2.6.37. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-14ipv6: Change skb->data before using icmpv6_notify() to propagate redirectDuan Jiong
In function ndisc_redirect_rcv(), the skb->data points to the transport header, but function icmpv6_notify() need the skb->data points to the inner IP packet. So before using icmpv6_notify() to propagate redirect, change skb->data to point the inner IP packet that triggered the sending of the Redirect, and introduce struct rd_msg to make it easy. Signed-off-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...