summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2012-02-23Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2012-02-22netfilter: ip6_route_output() never returns NULL.RongQing.Li
ip6_route_output() never returns NULL, so it is wrong to check if the return value is NULL. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-22ipv6: ip6_route_output() never returns NULL.RongQing.Li
ip6_route_output() never returns NULL, so it is wrong to check if the return value is NULL. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-22atm: clip: remove clip_tblEric Dumazet
Commit 32092ecf0644 (atm: clip: Use device neigh support on top of "arp_tbl".) introduced a bug since clip_tbl is zeroed : Crash occurs in __neigh_for_each_release() idle_timer_check() must use instead arp_tbl and neigh_check_cb() should ignore non clip neighbours. Idea from David Miller. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-21ipv4: ping: Fix recvmsg MSG_OOB error handling.David S. Miller
Don't return an uninitialized variable as the error, return -EOPNOTSUPP instead. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-21rtnetlink: Fix problem with buffer allocationGreg Rose
Implement a new netlink attribute type IFLA_EXT_MASK. The mask is a 32 bit value that can be used to indicate to the kernel that certain extended ifinfo values are requested by the user application. At this time the only mask value defined is RTEXT_FILTER_VF to indicate that the user wants the ifinfo dump to send information about the VFs belonging to the interface. This patch fixes a bug in which certain applications do not have large enough buffers to accommodate the extra information returned by the kernel with large numbers of SR-IOV virtual functions. Those applications will not send the new netlink attribute with the interface info dump request netlink messages so they will not get unexpectedly large request buffers returned by the kernel. Modifies the rtnl_calcit function to traverse the list of net devices and compute the minimum buffer size that can hold the info dumps of all matching devices based upon the filter passed in via the new netlink attribute filter mask. If no filter mask is sent then the buffer allocation defaults to NLMSG_GOODSIZE. With this change it is possible to add yet to be defined netlink attributes to the dump request which should make it fairly extensible in the future. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-21neighbour: Fixed race condition at tbl->nhtMichel Machado
When the fixed race condition happens: 1. While function neigh_periodic_work scans the neighbor hash table pointed by field tbl->nht, it unlocks and locks tbl->lock between buckets in order to call cond_resched. 2. Assume that function neigh_periodic_work calls cond_resched, that is, the lock tbl->lock is available, and function neigh_hash_grow runs. 3. Once function neigh_hash_grow finishes, and RCU calls neigh_hash_free_rcu, the original struct neigh_hash_table that function neigh_periodic_work was using doesn't exist anymore. 4. Once back at neigh_periodic_work, whenever the old struct neigh_hash_table is accessed, things can go badly. Signed-off-by: Michel Machado <michel@digirati.com.br> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-21netfilter: ctnetlink: fix soft lockup when netlink adds new entriesJozsef Kadlecsik
Marcell Zambo and Janos Farago noticed and reported that when new conntrack entries are added via netlink and the conntrack table gets full, soft lockup happens. This is because the nf_conntrack_lock is held while nf_conntrack_alloc is called, which is in turn wants to lock nf_conntrack_lock while evicting entries from the full table. The patch fixes the soft lockup with limiting the holding of the nf_conntrack_lock to the minimum, where it's absolutely required. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-02-20Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-02-19netem: fix dequeueEric Dumazet
commit 50612537e9 (netem: fix classful handling) added two errors in netem_dequeue() 1) After checking skb at the head of tfifo queue for time constraints, it dequeues tail skb, thus adding unwanted reordering. 2) qdisc stats are updated twice per packet (one when packet dequeued from tfifo, once when delivered) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-15mac80211: do not call rate control .tx_status before .rate_initFelix Fietkau
Most rate control implementations assume .get_rate and .tx_status are only called once the per-station data has been fully initialized. minstrel_ht crashes if this assumption is violated. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Tested-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-15mac80211: call rate control only after initJohannes Berg
There are situations where we don't have the necessary rate control information yet for station entries, e.g. when associating. This currently doesn't really happen due to the dummy station handling; explicitly disabling rate control when it's not initialised will allow us to remove dummy stations. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-15Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-02-15Bluetooth: Fix possible use after free in delete pathUlisses Furquim
We need to use the _sync() version for cancelling the info and security timer in the L2CAP connection delete path. Otherwise the delayed work handler might run after the connection object is freed. Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: Remove usage of __cancel_delayed_work()Ulisses Furquim
__cancel_delayed_work() is being used in some paths where we cannot sleep waiting for the delayed work to finish. However, that function might return while the timer is running and the work will be queued again. Replace the calls with safer cancel_delayed_work() version which spins until the timer handler finishes on other CPUs and cancels the delayed work. Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: Add missing QUIRK_NO_RESET test to hci_dev_do_closeJohan Hedberg
We should only perform a reset in hci_dev_do_close if the HCI_QUIRK_NO_RESET flag is set (since in such a case a reset will not be performed when initializing the device). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org>
2012-02-15Bluetooth: Fix RFCOMM session reference counting issueOctavian Purdila
There is an imbalance in the rfcomm_session_hold / rfcomm_session_put operations which causes the following crash: [ 685.010159] BUG: unable to handle kernel paging request at 6b6b6b6b [ 685.010169] IP: [<c149d76d>] rfcomm_process_dlcs+0x1b/0x15e [ 685.010181] *pdpt = 000000002d665001 *pde = 0000000000000000 [ 685.010191] Oops: 0000 [#1] PREEMPT SMP [ 685.010247] [ 685.010255] Pid: 947, comm: krfcommd Tainted: G C 3.0.16-mid8-dirty #44 [ 685.010266] EIP: 0060:[<c149d76d>] EFLAGS: 00010246 CPU: 1 [ 685.010274] EIP is at rfcomm_process_dlcs+0x1b/0x15e [ 685.010281] EAX: e79f551c EBX: 6b6b6b6b ECX: 00000007 EDX: e79f40b4 [ 685.010288] ESI: e79f4060 EDI: ed4e1f70 EBP: ed4e1f68 ESP: ed4e1f50 [ 685.010295] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 685.010303] Process krfcommd (pid: 947, ti=ed4e0000 task=ed43e5e0 task.ti=ed4e0000) [ 685.010308] Stack: [ 685.010312] ed4e1f68 c149eb53 e5925150 e79f4060 ed500000 ed4e1f70 ed4e1f80 c149ec10 [ 685.010331] 00000000 ed43e5e0 00000000 ed4e1f90 ed4e1f9c c149ec87 0000bf54 00000000 [ 685.010348] 00000000 ee03bf54 c149ec37 ed4e1fe4 c104fe01 00000000 00000000 00000000 [ 685.010367] Call Trace: [ 685.010376] [<c149eb53>] ? rfcomm_process_rx+0x6e/0x74 [ 685.010387] [<c149ec10>] rfcomm_process_sessions+0xb7/0xde [ 685.010398] [<c149ec87>] rfcomm_run+0x50/0x6d [ 685.010409] [<c149ec37>] ? rfcomm_process_sessions+0xde/0xde [ 685.010419] [<c104fe01>] kthread+0x63/0x68 [ 685.010431] [<c104fd9e>] ? __init_kthread_worker+0x42/0x42 [ 685.010442] [<c14dae82>] kernel_thread_helper+0x6/0xd This issue has been brought up earlier here: https://lkml.org/lkml/2011/5/21/127 The issue appears to be the rfcomm_session_put in rfcomm_recv_ua. This operation doesn't seem be to required as for the non-initiator case we have the rfcomm_process_rx doing an explicit put and in the initiator case the last dlc_unlink will drive the reference counter to 0. There have been several attempts to fix these issue: 6c2718d Bluetooth: Do not call rfcomm_session_put() for RFCOMM UA on closed socket 683d949 Bluetooth: Never deallocate a session when some DLC points to it but AFAICS they do not fix the issue just make it harder to reproduce. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: Gopala Krishna Murala <gopala.krishna.murala@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: silence lockdep warningOctavian Purdila
Since bluetooth uses multiple protocols types, to avoid lockdep warnings, we need to use different lockdep classes (one for each protocol type). This is already done in bt_sock_create but it misses a couple of cases when new connections are created. This patch corrects that to fix the following warning: <4>[ 1864.732366] ======================================================= <4>[ 1864.733030] [ INFO: possible circular locking dependency detected ] <4>[ 1864.733544] 3.0.16-mid3-00007-gc9a0f62 #3 <4>[ 1864.733883] ------------------------------------------------------- <4>[ 1864.734408] t.android.btclc/4204 is trying to acquire lock: <4>[ 1864.734869] (rfcomm_mutex){+.+.+.}, at: [<c14970ea>] rfcomm_dlc_close+0x15/0x30 <4>[ 1864.735541] <4>[ 1864.735549] but task is already holding lock: <4>[ 1864.736045] (sk_lock-AF_BLUETOOTH){+.+.+.}, at: [<c1498bf7>] lock_sock+0xa/0xc <4>[ 1864.736732] <4>[ 1864.736740] which lock already depends on the new lock. <4>[ 1864.736750] <4>[ 1864.737428] <4>[ 1864.737437] the existing dependency chain (in reverse order) is: <4>[ 1864.738016] <4>[ 1864.738023] -> #1 (sk_lock-AF_BLUETOOTH){+.+.+.}: <4>[ 1864.738549] [<c1062273>] lock_acquire+0x104/0x140 <4>[ 1864.738977] [<c13d35c1>] lock_sock_nested+0x58/0x68 <4>[ 1864.739411] [<c1493c33>] l2cap_sock_sendmsg+0x3e/0x76 <4>[ 1864.739858] [<c13d06c3>] __sock_sendmsg+0x50/0x59 <4>[ 1864.740279] [<c13d0ea2>] sock_sendmsg+0x94/0xa8 <4>[ 1864.740687] [<c13d0ede>] kernel_sendmsg+0x28/0x37 <4>[ 1864.741106] [<c14969ca>] rfcomm_send_frame+0x30/0x38 <4>[ 1864.741542] [<c1496a2a>] rfcomm_send_ua+0x58/0x5a <4>[ 1864.741959] [<c1498447>] rfcomm_run+0x441/0xb52 <4>[ 1864.742365] [<c104f095>] kthread+0x63/0x68 <4>[ 1864.742742] [<c14d5182>] kernel_thread_helper+0x6/0xd <4>[ 1864.743187] <4>[ 1864.743193] -> #0 (rfcomm_mutex){+.+.+.}: <4>[ 1864.743667] [<c1061ada>] __lock_acquire+0x988/0xc00 <4>[ 1864.744100] [<c1062273>] lock_acquire+0x104/0x140 <4>[ 1864.744519] [<c14d2c70>] __mutex_lock_common+0x3b/0x33f <4>[ 1864.744975] [<c14d303e>] mutex_lock_nested+0x2d/0x36 <4>[ 1864.745412] [<c14970ea>] rfcomm_dlc_close+0x15/0x30 <4>[ 1864.745842] [<c14990d9>] __rfcomm_sock_close+0x5f/0x6b <4>[ 1864.746288] [<c1499114>] rfcomm_sock_shutdown+0x2f/0x62 <4>[ 1864.746737] [<c13d275d>] sys_socketcall+0x1db/0x422 <4>[ 1864.747165] [<c14d42f0>] syscall_call+0x7/0xb Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: l2cap_set_timer needs jiffies as timeout valueAndrzej Kaczmarek
After moving L2CAP timers to workqueues l2cap_set_timer expects timeout value to be specified in jiffies but constants defined in miliseconds are used. This makes timeouts unreliable when CONFIG_HZ is not set to 1000. __set_chan_timer macro still uses jiffies as input to avoid multiple conversions from/to jiffies for sk_sndtimeo value which is already specified in jiffies. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Ackec-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: Fix sk_sndtimeo initialization for L2CAP socketAndrzej Kaczmarek
sk_sndtime value should be specified in jiffies thus initial value needs to be converted from miliseconds. Otherwise this timeout is unreliable when CONFIG_HZ is not set to 1000. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-15Bluetooth: Remove bogus inline declaration from l2cap_chan_connectJohan Hedberg
As reported by Dan Carpenter this function causes a Sparse warning and shouldn't be declared inline: include/net/bluetooth/l2cap.h:837:30 error: marked inline, but without a definition" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org>
2012-02-15Bluetooth: Fix l2cap conn failures for ssp devicesPeter Hurley
Commit 330605423c fixed l2cap conn establishment for non-ssp remote devices by not setting HCI_CONN_ENCRYPT_PEND every time conn security is tested (which was always returning failure on any subsequent security checks). However, this broke l2cap conn establishment for ssp remote devices when an ACL link was already established at SDP-level security. This fix ensures that encryption must be pending whenever authentication is also pending. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2012-02-14netpoll: netpoll_poll_dev() should access dev->flagsEric Dumazet
commit 5a698af53f (bond: service netpoll arp queue on master device) tested IFF_SLAVE flag against dev->priv_flags instead of dev->flags Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: WANG Cong <amwang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-14RxRPC: Fix kcalloc parameters swappedAxel Lin
The first parameter should be "number of elements" and the second parameter should be "element size". Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-14tcp: fix tcp_shifted_skb() adjustment of lost_cnt_hint for FACKNeal Cardwell
This commit ensures that lost_cnt_hint is correctly updated in tcp_shifted_skb() for FACK TCP senders. The lost_cnt_hint adjustment in tcp_sacktag_one() only applies to non-FACK senders, so FACK senders need their own adjustment. This applies the spirit of 1e5289e121372a3494402b1b131b41bfe1cf9b7f - except now that the sequence range passed into tcp_sacktag_one() is correct we need only have a special case adjustment for FACK. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-13tcp: fix range tcp_shifted_skb() passes to tcp_sacktag_one()Neal Cardwell
Fix the newly-SACKed range to be the range of newly-shifted bytes. Previously - since 832d11c5cd076abc0aa1eaf7be96c81d1a59ce41 - tcp_shifted_skb() incorrectly called tcp_sacktag_one() with the start and end sequence numbers of the skb it passes in set to the range just beyond the range that is newly-SACKed. This commit also removes a special-case adjustment to lost_cnt_hint in tcp_shifted_skb() since the pre-existing adjustment of lost_cnt_hint in tcp_sacktag_one() now properly handles this things now that the correct start sequence number is passed in. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-13tcp: allow tcp_sacktag_one() to tag ranges not aligned with skbsNeal Cardwell
This commit allows callers of tcp_sacktag_one() to pass in sequence ranges that do not align with skb boundaries, as tcp_shifted_skb() needs to do in an upcoming fix in this patch series. In fact, now tcp_sacktag_one() does not need to depend on an input skb at all, which makes its semantics and dependencies more clear. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Quoth David: 1) GRO MAC header comparisons were ethernet specific, breaking other link types. This required a multi-faceted fix to cure the originally noted case (Infiniband), because IPoIB was lying about it's actual hard header length. Thanks to Eric Dumazet, Roland Dreier, and others. 2) Fix build failure when INET_UDP_DIAG is built in and ipv6 is modular. From Anisse Astier. 3) Off by ones and other bug fixes in netprio_cgroup from Neil Horman. 4) ipv4 TCP reset generation needs to respect any network interface binding from the socket, otherwise route lookups might give a different result than all the other segments received. From Shawn Lu. 5) Fix unintended regression in ipv4 proxy ARP responses, from Thomas Graf. 6) Fix SKB under-allocation bug in sh_eth, from Yoshihiro Shimoda. 7) Revert skge PCI mapping changes that are causing crashes for some folks, from Stephen Hemminger. 8) IPV4 route lookups fill in the wildcarded fields of the given flow lookup key passed in, which is fine most of the time as this is exactly what the caller's want. However there are a few cases that want to retain the original flow key values afterwards, so handle those cases properly. Fix from Julian Anastasov. 9) IGB/IXGBE VF lookup bug fixes from Greg Rose. 10) Properly null terminate filename passed to ethtool flash device method, from Ben Hutchings. 11) S3 resume fix in via-velocity from David Lv. 12) Fix double SKB free during xmit failure in CAIF, from Dmitry Tarnyagin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits) net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabled ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr. netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m netprio_cgroup: don't allocate prio table when a device is registered netprio_cgroup: fix an off-by-one bug bna: fix error handling of bnad_get_flash_partition_by_offset() isdn: type bug in isdn_net_header() net: Make qdisc_skb_cb upper size bound explicit. ixgbe: ethtool: stats user buffer overrun ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB state ixgbe: do not update real num queues when netdev is going away ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page size ixgbe: Fix case of Tx Hang in PF with 32 VFs ixgbe: fix vf lookup igb: fix vf lookup e1000: add dropped DMA receive enable back in for WoL gro: more generic L2 header check IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses zd1211rw: firmware needs duration_id set to zero for non-pspoll frames net: enable TC35815 for MIPS again ...
2012-02-10net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabledThomas Graf
Commit 653241 (net: RFC3069, private VLAN proxy arp support) changed the behavior of arp proxy to send arp replies back out on the interface the request came in even if the private VLAN feature is disabled. Previously we checked rt->dst.dev != skb->dev for in scenarios, when proxy arp is enabled on for the netdevice and also when individual proxy neighbour entries have been added. This patch adds the check back for the pneigh_lookup() scenario. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr.Li Wei
This patch fix a bug which introduced by commit ac8a4810 (ipv4: Save nexthop address of LSRR/SSRR option to IPCB.).In that patch, we saved the nexthop of SRR in ip_option->nexthop and update iph->daddr until we get to ip_forward_options(), but we need to update it before ip_rt_get_source(), otherwise we may get a wrong src. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=mNeil Horman
When the netprio_cgroup module is not loaded, net_prio_subsys_id is -1, and so sock_update_prioidx() accesses cgroup_subsys array with negative index subsys[-1]. Make the code resembles cls_cgroup code, which is bug free. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10netprio_cgroup: don't allocate prio table when a device is registeredNeil Horman
So we delay the allocation till the priority is set through cgroup, and this makes skb_update_priority() faster when it's not set. This also eliminates an off-by-one bug similar with the one fixed in the previous patch. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10netprio_cgroup: fix an off-by-one bugNeil Horman
# mount -t cgroup xxx /mnt # mkdir /mnt/tmp # cat /mnt/tmp/net_prio.ifpriomap lo 0 eth0 0 virbr0 0 # echo 'lo 999' > /mnt/tmp/net_prio.ifpriomap # cat /mnt/tmp/net_prio.ifpriomap lo 999 eth0 0 virbr0 4101267344 We got weired output, because we exceeded the boundary of the array. We may even crash the kernel.. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-09mac80211: Fix a rwlock bad magic bugMohammed Shafi Shajakhan
read_lock(&tpt_trig->trig.leddev_list_lock) is accessed via the path ieee80211_open (->) ieee80211_do_open (->) ieee80211_mod_tpt_led_trig (->) ieee80211_start_tpt_led_trig (->) tpt_trig_timer before initializing it. the intilization of this read/write lock happens via the path ieee80211_led_init (->) led_trigger_register, but we are doing 'ieee80211_led_init' after 'ieeee80211_if_add' where we register netdev_ops. so we access leddev_list_lock before initializing it and causes the following bug in chrome laptops with AR928X cards with the following script while true do sudo modprobe -v ath9k sleep 3 sudo modprobe -r ath9k sleep 3 done BUG: rwlock bad magic on CPU#1, wpa_supplicant/358, f5b9eccc Pid: 358, comm: wpa_supplicant Not tainted 3.0.13 #1 Call Trace: [<8137b9df>] rwlock_bug+0x3d/0x47 [<81179830>] do_raw_read_lock+0x19/0x29 [<8137f063>] _raw_read_lock+0xd/0xf [<f9081957>] tpt_trig_timer+0xc3/0x145 [mac80211] [<f9081f3a>] ieee80211_mod_tpt_led_trig+0x152/0x174 [mac80211] [<f9076a3f>] ieee80211_do_open+0x11e/0x42e [mac80211] [<f9075390>] ? ieee80211_check_concurrent_iface+0x26/0x13c [mac80211] [<f9076d97>] ieee80211_open+0x48/0x4c [mac80211] [<812dbed8>] __dev_open+0x82/0xab [<812dc0c9>] __dev_change_flags+0x9c/0x113 [<812dc1ae>] dev_change_flags+0x18/0x44 [<8132144f>] devinet_ioctl+0x243/0x51a [<81321ba9>] inet_ioctl+0x93/0xac [<812cc951>] sock_ioctl+0x1c6/0x1ea [<812cc78b>] ? might_fault+0x20/0x20 [<810b1ebb>] do_vfs_ioctl+0x46e/0x4a2 [<810a6ebb>] ? fget_light+0x2f/0x70 [<812ce549>] ? sys_recvmsg+0x3e/0x48 [<810b1f35>] sys_ioctl+0x46/0x69 [<8137fa77>] sysenter_do_call+0x12/0x2 Cc: <stable@vger.kernel.org> Cc: Gary Morain <gmorain@google.com> Cc: Paul Stewart <pstew@google.com> Cc: Abhijit Pradhan <abhijit@qca.qualcomm.com> Cc: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com> Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> Acked-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-09netfilter: nf_queue: fix queueing of bridged gro skbsFlorian Westphal
When trying to nf_queue GRO/GSO skbs, nf_queue uses skb_gso_segment to split the skb. However, if nf_queue is called via bridge netfilter, the mac header won't be preserved -- packets will thus contain a bogus mac header. Fix this by setting skb->data to the mac header when skb->nf_bridge is set and restoring skb->data afterwards for all segments. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-02-09net: Make qdisc_skb_cb upper size bound explicit.David S. Miller
Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08gro: more generic L2 header checkEric Dumazet
Shlomo Pongratz reported GRO L2 header check was suited for Ethernet only, and failed on IB/ipoib traffic. He provided a patch faking a zeroed header to let GRO aggregates frames. Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header check to be more generic, ie not assuming L2 header is 14 bytes, but taking into account hard_header_len. __napi_gro_receive() has special handling for the common case (Ethernet) to avoid a memcmp() call and use an inline optimized function instead. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Shlomo Pongratz <shlomop@mellanox.com> Cc: Roland Dreier <roland@kernel.org> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07net: Fix build regression when INET_UDP_DIAG=y and IPV6=mAnisse Astier
Tested-by: Anisse Astier <anisse@astier.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04tcp_v4_send_reset: binding oif to iif in no sock caseShawn Lu
Binding RST packet outgoing interface to incoming interface for tcp v4 when there is no socket associate with it. when sk is not NULL, using sk->sk_bound_dev_if instead. (suggested by Eric Dumazet). This has few benefits: 1. tcp_v6_send_reset already did that. 2. This helps tcp connect with SO_BINDTODEVICE set. When connection is lost, we still able to sending out RST using same interface. 3. we are sending reply, it is most likely to be succeed if iif is used Signed-off-by: Shawn Lu <shawn.lu@ericsson.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04netprio_cgroup: Fix obo in get_prioidxNeil Horman
It was recently pointed out to me that the get_prioidx function sets a bit in the prioidx map prior to checking to see if the index being set is out of bounds. This patch corrects that, avoiding the possiblity of us writing beyond the end of the array Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Stanislaw Gruszka <sgruszka@redhat.com> CC: Stanislaw Gruszka <sgruszka@redhat.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04ipvs: fix matching of fwmark templates during schedulingSimon Horman
Commit f11017ec2d1859c661f4e2b12c4a8d250e1f47cf (2.6.37) moved the fwmark variable in subcontext that is invalidated before reaching the ip_vs_ct_in_get call. As vaddr is provided as pointer in the param structure make sure the fwmark variable is in same context. As the fwmark templates can not be matched, more and more template connections are created and the controlled connections can not go to single real server. Signed-off-by: Julian Anastasov <ja@ssi.bg> Cc: stable@vger.kernel.org Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-02-03Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-02-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix safety of rbd_put_client() rbd: fix a memory leak in rbd_get_client() ceph: create a new session lock to avoid lock inversion ceph: fix length validation in parse_reply_info() ceph: initialize client debugfs outside of monc->mutex ceph: change "ceph.layout" xattr to be "ceph.file.layout"
2012-02-02ceph: initialize client debugfs outside of monc->mutexSage Weil
Initializing debufs under monc->mutex introduces a lock dependency for sb->s_type->i_mutex_key, which (combined with several other dependencies) leads to an annoying lockdep warning. There's no particular reason to do the debugfs setup under this lock, so move it out. It used to be the case that our first monmap could come from the OSD; that is no longer the case with recent servers, so we will reliably set up the client entry during the initial authentication. We don't have to worry about racing with debugfs teardown by ceph_debugfs_client_cleanup() because ceph_destroy_client() calls ceph_msgr_flush() first, which will wait for the message dispatch work to complete (and the debugfs init to complete). Fixes: #1940 Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02caif: Bugfix double kfree_skb upon xmit failureDmitry Tarnyagin
SKB is freed twice upon send error. The Network stack consumes SKB even when it returns error code. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-02caif: Bugfix list_del_rcu race in cfmuxl_ctrlcmd.sjur.brandeland@stericsson.com
Always use cfmuxl_remove_uplayer when removing a up-layer. cfmuxl_ctrlcmd() can be called independently and in parallel with cfmuxl_remove_uplayer(). The race between them could cause list_del_rcu to be called on a node which has been already taken out from the list. That lead to a (rare) crash on accessing poisoned node->prev inside list_del_rcu. This fix ensures that deletion are done holding the same lock. Reported-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com> Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-02tcp: properly initialize tcp memory limitsJason Wang
Commit 4acb4190 tries to fix the using uninitialized value introduced by commit 3dc43e3, but it would make the per-socket memory limits too small. This patch fixes this and also remove the redundant codes introduced in 4acb4190. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Glauber Costa <glommer@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01mac80211: timeout a single frame in the rx reorder bufferEliad Peller
The current code checks for stored_mpdu_num > 1, causing the reorder_timer to be triggered indefinitely, but the frame is never timed-out (until the next packet is received) Signed-off-by: Eliad Peller <eliad@wizery.com> Cc: <stable@vger.kernel.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-01ethtool: Null-terminate filename passed to ethtool_ops::flash_deviceBen Hutchings
The parameters for ETHTOOL_FLASHDEV include a filename, which ought to be null-terminated. Currently the only driver that implements ethtool_ops::flash_device attempts to add a null terminator if necessary, but does it wrongly. Do it in the ethtool core instead. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01net: Disambiguate kernel messageArun Sharma
Some of our machines were reporting: TCP: too many of orphaned sockets even when the number of orphaned sockets was well below the limit. We print a different message depending on whether we're out of TCP memory or there are too many orphaned sockets. Also move the check out of line and cleanup the messages that were printed. Signed-off-by: Arun Sharma <asharma@fb.com> Suggested-by: Mohan Srinivasan <mohan@fb.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: David Miller <davem@davemloft.net> Cc: Glauber Costa <glommer@parallels.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>