summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2012-12-26netfilter: xt_hashlimit: fix race that results in duplicated entriesPablo Neira Ayuso
Two packets may race to create the same entry in the hashtable, double check if this packet lost race. This double checking only happens in the path of the packet that creates the hashtable for first time. Note that, with this patch, no packet drops occur if the race happens. Reported-by: Feng Gao <gfree.wind@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-25arp: fix a regression in arp_solicit()Cong Wang
Sedat reported the following commit caused a regression: commit 9650388b5c56578fdccc79c57a8c82fb92b8e7f1 Author: Eric Dumazet <edumazet@google.com> Date: Fri Dec 21 07:32:10 2012 +0000 ipv4: arp: fix a lockdep splat in arp_solicit This is due to the 6th parameter of arp_send() needs to be NULL for the broadcast case, the above commit changed it to an all-zero array by mistake. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-24netfilter: xt_CT: recover NOTRACK target supportPablo Neira Ayuso
Florian Westphal reported that the removal of the NOTRACK target (9655050 netfilter: remove xt_NOTRACK) is breaking some existing setups. That removal was scheduled for removal since long time ago as described in Documentation/feature-removal-schedule.txt What: xt_NOTRACK Files: net/netfilter/xt_NOTRACK.c When: April 2011 Why: Superseded by xt_CT Still, people may have not notice / may have decided to stick to an old iptables version. I agree with him in that some more conservative approach by spotting some printk to warn users for some time is less agressive. Current iptables 1.4.16.3 already contains the aliasing support that makes it point to the CT target, so upgrading would fix it. Still, the policy so far has been to avoid pushing our users to upgrade. As a solution, this patch recovers the NOTRACK target inside the CT target and it now spots a warning. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-22net: sched: integer overflow fixStefan Hasko
Fixed integer overflow in function htb_dequeue Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-22CONFIG_HOTPLUG removal from networking coreGreg KH
CONFIG_HOTPLUG is always enabled now, so remove the unused code that was trying to be compiled out when this option was disabled, in the networking core. Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21bridge: call br_netpoll_disable in br_add_ifGao feng
When netdev_set_master faild in br_add_if, we should call br_netpoll_disable to do some cleanup jobs,such as free the memory of struct netpoll which allocated in br_netpoll_enable. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21ipv4: arp: fix a lockdep splat in arp_solicit()Eric Dumazet
Yan Burman reported following lockdep warning : ============================================= [ INFO: possible recursive locking detected ] 3.7.0+ #24 Not tainted --------------------------------------------- swapper/1/0 is trying to acquire lock: (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send +0x2e/0x2f0 but task is already holding lock: (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&n->lock); lock(&n->lock); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by swapper/1/0: #0: (((&n->timer))){+.-...}, at: [<ffffffff8104b350>] call_timer_fn+0x0/0x1c0 #1: (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit +0x1d4/0x280 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>] dev_queue_xmit+0x0/0x5d0 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>] ip_finish_output+0x13e/0x640 stack backtrace: Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24 Call Trace: <IRQ> [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0 [<ffffffff8108d570>] __lock_acquire+0x440/0xc30 [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8108ddf5>] lock_acquire+0x95/0x140 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0 [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270 [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640 [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640 [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan] [<ffffffff813cb9a0>] ip_output+0x80/0xf0 [<ffffffff813ca368>] ip_local_out+0x28/0x80 [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan] [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan] [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0 [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80 [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270 [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0 [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0 [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0 [<ffffffff813f5788>] arp_xmit+0x58/0x60 [<ffffffff813f59db>] arp_send+0x3b/0x40 [<ffffffff813f6424>] arp_solicit+0x204/0x280 [<ffffffff813a1a70>] ? neigh_add+0x310/0x310 [<ffffffff8139f515>] neigh_probe+0x45/0x70 [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0 [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0 [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120 [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0 [<ffffffff813a1a70>] ? neigh_add+0x310/0x310 [<ffffffff81043e51>] __do_softirq+0x101/0x280 [<ffffffff814518cc>] call_softirq+0x1c/0x30 [<ffffffff81003b65>] do_softirq+0x85/0xc0 [<ffffffff81043a7e>] irq_exit+0x9e/0xc0 [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0 [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80 <EOI> [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0 [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0 [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0 [<ffffffff81441127>] start_secondary+0x1b2/0x1b6 Bug is from arp_solicit(), releasing the neigh lock after arp_send() In case of vxlan, we eventually need to write lock a neigh lock later. Its a false positive, but we can get rid of it without lockdep annotations. We can instead use neigh_ha_snapshot() helper. Reported-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21net: devnet_rename_seq should be a seqcountEric Dumazet
Using a seqlock for devnet_rename_seq is not a good idea, as device_rename() can sleep. As we hold RTNL, we dont need a protection for writers, and only need a seqcount so that readers can catch a change done by a writer. Bug added in commit c91f6df2db4972d3 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name) Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21ip_gre: fix possible use after freeEric Dumazet
Once skb_realloc_headroom() is called, tiph might point to freed memory. Cache tiph->ttl value before the reallocation, to avoid unexpected behavior. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionallyIsaku Yamahata
ipgre_tunnel_xmit() parses network header as IP unconditionally. But transmitting packets are not always IP packet. For example such packet can be sent by packet socket with sockaddr_ll.sll_protocol set. So make the function check if skb->protocol is IP. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-20Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd update from Bruce Fields: "Included this time: - more nfsd containerization work from Stanislav Kinsbursky: we're not quite there yet, but should be by 3.9. - NFSv4.1 progress: implementation of basic backchannel security negotiation and the mandatory BACKCHANNEL_CTL operation. See http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues for remaining TODO's - Fixes for some bugs that could be triggered by unusual compounds. Our xdr code wasn't designed with v4 compounds in mind, and it shows. A more thorough rewrite is still a todo. - If you've ever seen "RPC: multiple fragments per record not supported" logged while using some sort of odd userland NFS client, that should now be fixed. - Further work from Jeff Layton on our mechanism for storing information about NFSv4 clients across reboots. - Further work from Bryan Schumaker on his fault-injection mechanism (which allows us to discard selective NFSv4 state, to excercise rarely-taken recovery code paths in the client.) - The usual mix of miscellaneous bugs and cleanup. Thanks to everyone who tested or contributed this cycle." * 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits) nfsd4: don't leave freed stateid hashed nfsd4: free_stateid can use the current stateid nfsd4: cleanup: replace rq_resused count by rq_next_page pointer nfsd: warn on odd reply state in nfsd_vfs_read nfsd4: fix oops on unusual readlike compound nfsd4: disable zero-copy on non-final read ops svcrpc: fix some printks NFSD: Correct the size calculation in fault_inject_write NFSD: Pass correct buffer size to rpc_ntop nfsd: pass proper net to nfsd_destroy() from NFSd kthreads nfsd: simplify service shutdown nfsd: replace boolean nfsd_up flag by users counter nfsd: simplify NFSv4 state init and shutdown nfsd: introduce helpers for generic resources init and shutdown nfsd: make NFSd service structure allocated per net nfsd: make NFSd service boot time per-net nfsd: per-net NFSd up flag introduced nfsd: move per-net startup code to separated function nfsd: pass net to __write_ports() and down nfsd: pass net to nfsd_set_nrthreads() ...
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few different groups of commits here. The largest is Alex's ongoing work to enable the coming RBD features (cloning, striping). There is some cleanup in libceph that goes along with it. Cyril and David have fixed some problems with NFS reexport (leaking dentries and page locks), and there is a batch of patches from Yan fixing problems with the fs client when running against a clustered MDS. There are a few bug fixes mixed in for good measure, many of which will be going to the stable trees once they're upstream. My apologies for the late pull. There is still a gremlin in the rbd map/unmap code and I was hoping to include the fix for that as well, but we haven't been able to confirm the fix is correct yet; I'll send that in a separate pull once it's nailed down." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits) rbd: get rid of rbd_{get,put}_dev() libceph: register request before unregister linger libceph: don't use rb_init_node() in ceph_osdc_alloc_request() libceph: init event->node in ceph_osdc_create_event() libceph: init osd->o_node in create_osd() libceph: report connection fault with warning libceph: socket can close in any connection state rbd: don't use ENOTSUPP rbd: remove linger unconditionally rbd: get rid of RBD_MAX_SEG_NAME_LEN libceph: avoid using freed osd in __kick_osd_requests() ceph: don't reference req after put rbd: do not allow remove of mounted-on image libceph: Unlock unprocessed pages in start_read() error path ceph: call handle_cap_grant() for cap import message ceph: Fix __ceph_do_pending_vmtruncate ceph: Don't add dirty inode to dirty list if caps is in migration ceph: Fix infinite loop in __wake_requests ceph: Don't update i_max_size when handling non-auth cap bdi_register: add __printf verification, fix arg mismatch ...
2012-12-20libceph: register request before unregister lingerAlex Elder
In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20libceph: don't use rb_init_node() in ceph_osdc_alloc_request()Alex Elder
The red-black node in the ceph osd request structure is initialized in ceph_osdc_alloc_request() using rbd_init_node(). We do need to initialize this, because in __unregister_request() we call RB_EMPTY_NODE(), which expects the node it's checking to have been initialized. But rb_init_node() is apparently overkill, and may in fact be on its way out. So use RB_CLEAR_NODE() instead. For a little more background, see this commit: 4c199a93 rbtree: empty nodes have no color" Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20libceph: init event->node in ceph_osdc_create_event()Alex Elder
The red-black node node in the ceph osd event structure is not initialized in create_osdc_create_event(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20libceph: init osd->o_node in create_osd()Alex Elder
The red-black node node in the ceph osd structure is not initialized in create_osd(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Add a call to RB_CLEAR_NODE() initialize it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20libceph: report connection fault with warningAlex Elder
When a connection's socket disconnects, or if there's a protocol error of some kind on the connection, a fault is signaled and the connection is reset (closed and reopened, basically). We currently get an error message on the log whenever this occurs. A ceph connection will attempt to reestablish a socket connection repeatedly if a fault occurs. This means that these error messages will get repeatedly added to the log, which is undesirable. Change the error message to be a warning, so they don't get logged by default. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio update from Rusty Russell: "Some nice cleanups, and even a patch my wife did as a "live" demo for Latinoware 2012. There's a slightly non-trivial merge in virtio-net, as we cleaned up the virtio add_buf interface while DaveM accepted the mq virtio-net patches." * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits) virtio_console: Add support for remoteproc serial virtio_console: Merge struct buffer_token into struct port_buffer virtio: add drv_to_virtio to make code clearly virtio: use dev_to_virtio wrapper in virtio virtio-mmio: Fix irq parsing in command line parameter virtio_console: Free buffers from out-queue upon close virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>( virtio_console: Use kmalloc instead of kzalloc virtio_console: Free buffer if splice fails virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: make virtqueue_add_buf() returning 0 on success, not capacity. virtio: console: don't rely on virtqueue_add_buf() returning capacity. virtio_net: don't rely on virtqueue_add_buf() returning capacity. virtio-net: remove unused skb_vnet_hdr->num_sg field virtio-net: correct capacity math on ring full virtio: move queue_index and num_free fields into core struct virtqueue. ...
2012-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Really fix tuntap SKB use after free bug, from Eric Dumazet. 2) Adjust SKB data pointer to point past the transport header before calling icmpv6_notify() so that the headers are in the state which that function expects. From Duan Jiong. 3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason Wang. 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov. 5) Don't destroy mutex after freeing up device private in mac802154, fix also from Konstantin Khlebnikov. 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch. 7) SCTP HMAC kconfig rework, from Neil Horman. 8) Fix SCTP jprobes function signature, otherwise things explode, from Daniel Borkmann. 9) Fix typo in ipv6-offload Makefile variable reference, from Simon Arlott. 10) Don't fail USBNET open just because remote wakeup isn't supported, from Oliver Neukum. 11) be2net driver bug fixes from Sathya Perla. 12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David Woodhouse. 13) Fix MTU changing regression in 8139cp driver, from John Greene. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) solos-pci: ensure all TX packets are aligned to 4 bytes solos-pci: add firmware upgrade support for new models solos-pci: remove superfluous debug output solos-pci: add GPIO support for newer versions on Geos board 8139cp: Prevent dev_close/cp_interrupt race on MTU change net: qmi_wwan: add ZTE MF880 drivers/net: Use of_match_ptr() macro in smsc911x.c drivers/net: Use of_match_ptr() macro in smc91x.c ipv6: addrconf.c: remove unnecessary "if" bridge: Correctly encode addresses when dumping mdb entries bridge: Do not unregister all PF_BRIDGE rtnl operations use generic usbnet_manage_power() usbnet: generic manage_power() usbnet: handle PM failure gracefully ksz884x: fix receive polling race condition qlcnic: update driver version qlcnic: fix unused variable warnings net: fec: forbid FEC_PTP on SoCs that do not support be2net: fix wrong frag_idx reported by RX CQ be2net: fix be_close() to ensure all events are ack'ed ...
2012-12-19ipv6: addrconf.c: remove unnecessary "if"Cong Ding
the value of err is always negative if it goes to errout, so we don't need to check the value of err. Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-19bridge: Correctly encode addresses when dumping mdb entriesVlad Yasevich
When dumping mdb table, set the addresses the kernel returns based on the address protocol type. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-19bridge: Do not unregister all PF_BRIDGE rtnl operationsVlad Yasevich
Bridge fdb and link rtnl operations are registered in core/rtnetlink. Bridge mdb operations are registred in bridge/mdb. When removing bridge module, do not unregister ALL PF_BRIDGE ops since that would remove the ops from rtnetlink as well. Do remove mdb ops when bridge is destroyed. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull (again) user namespace infrastructure changes from Eric Biederman: "Those bugs, those darn embarrasing bugs just want don't want to get fixed. Linus I just updated my mirror of your kernel.org tree and it appears you successfully pulled everything except the last 4 commits that fix those embarrasing bugs. When you get a chance can you please repull my branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Fix typo in description of the limitation of userns_install userns: Add a more complete capability subset test to commit_creds userns: Require CAP_SYS_ADMIN for most uses of setns. Fix cap_capable to only allow owners in the parent user namespace to have caps.
2012-12-18Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Features include: - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code. Remove altogether where possible, and replace with WARN_ON_ONCE and appropriate error returns where not. - NFSv4.1 client adds session dynamic slot table management. There is matching server side code that has been submitted to Bruce for consideration. Together, this code allows the server to dynamically manage the amount of memory it allocates to the duplicate request cache for each client. It will constantly resize those caches to reserve more memory for clients that are hot while shrinking caches for those that are quiescent. In addition, there are assorted bugfixes for the generic NFS write code, fixes to deal with the drop_nlink() warnings, and yet another fix for NFSv4 getacl." * tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits) SUNRPC: continue run over clients list on PipeFS event instead of break NFS: Don't use SetPageError in the NFS writeback code SUNRPC: variable 'svsk' is unused in function bc_send_request SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket NFSv4.1: Deal effectively with interrupted RPC calls. NFSv4.1: Move the RPC timestamp out of the slot. NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED. NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0 NFS: Fix calls to drop_nlink() NFS: Ensure that we always drop inodes that have been marked as stale nfs: Remove unused list nfs4_clientid_list nfs: Remove duplicate function declaration in internal.h NFS: avoid NULL dereference in nfs_destroy_server SUNRPC handle EKEYEXPIRED in call_refreshresult SUNRPC set gss gc_expiry to full lifetime nfs: fix page dirtying in NFS DIO read codepath nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ NFSv4.1: Be conservative about the client highest slotid NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly nfs: don't extend writes to cover entire page if pagecache is invalid ...
2012-12-18atm: use scnprintf() instead of sprintf()chas williams - CONTRACTOR
As reported by Chen Gang <gang.chen@asianux.com>, we should ensure there is enough space when formatting the sysfs buffers. Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-18netlink: validate addr_len on bindHannes Frederic Sowa
Otherwise an out of bounds read could happen. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-18netlink: change presentation of portid in procfs to unsignedHannes Frederic Sowa
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-18nfsd4: cleanup: replace rq_resused count by rq_next_page pointerJ. Bruce Fields
It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...
2012-12-17svcrpc: fix some printksJ. Bruce Fields
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17libceph: socket can close in any connection stateAlex Elder
A connection's socket can close for any reason, independent of the state of the connection (and without irrespective of the connection mutex). As a result, the connectino can be in pretty much any state at the time its socket is closed. Handle those other cases at the top of con_work(). Pull this whole block of code into a separate function to reduce the clutter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-17rbd: remove linger unconditionallyAlex Elder
In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-17SUNRPC: continue run over clients list on PipeFS event instead of breakStanislav Kinsbursky
There are SUNRPC clients, which program doesn't have pipe_dir_name. These clients can be skipped on PipeFS events, because nothing have to be created or destroyed. But instead of breaking in case of such a client was found, search for suitable client over clients list have to be continued. Otherwise some clients could not be covered by PipeFS event handler. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org [>= v3.4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-17libceph: avoid using freed osd in __kick_osd_requests()Alex Elder
If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been free by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change__reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns *any* error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-17ceph: don't reference req after putAlex Elder
In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com>
2012-12-17netfilter: nfnetlink_log: fix possible compilation issue due to missing includePablo Neira Ayuso
In (0c36b48 netfilter: nfnetlink_log: fix mac address for 6in4 tunnels) the include file that defines ARPD_SIT was missing. This passed unnoticed during my tests (I did not hit this problem here). net/netfilter/nfnetlink_log.c: In function '__build_packet_message': net/netfilter/nfnetlink_log.c:494:25: error: 'ARPHRD_SIT' undeclared (first use in this function) net/netfilter/nfnetlink_log.c:494:25: note: each undeclared identifier is reported only once for +each function it appears in Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A quiet cycle for the security subsystem with just a few maintenance updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: create a sysfs mount point for smackfs Smack: use select not depends in Kconfig Yama: remove locking from delete path Yama: add RCU to drop read locking drivers/char/tpm: remove tasklet and cleanup KEYS: Use keyring_alloc() to create special keyrings KEYS: Reduce initial permissions on keys KEYS: Make the session and process keyrings per-thread seccomp: Make syscall skipping and nr changes more consistent key: Fix resource leak keys: Fix unreachable code KEYS: Add payload preparsing opportunity prior to key instantiate or update
2012-12-16netfilter: xt_CT: fix crash while destroy ct templatesPablo Neira Ayuso
In (d871bef netfilter: ctnetlink: dump entries from the dying and unconfirmed lists), we assume that all conntrack objects are inserted in any of the existing lists. However, template conntrack objects were not. This results in hitting BUG_ON in the destroy_conntrack path while removing a rule that uses the CT target. This patch fixes the situation by adding the template lists, which is where template conntrack objects reside now. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16netfilter: nfnetlink_log: fix mac address for 6in4 tunnelsBob Hockney
For tunnelled ipv6in4 packets, the LOG target (xt_LOG.c) adjusts the start of the mac field to start at the ethernet header instead of the ipv4 header for the tunnel. This patch conforms what is passed by the NFLOG target through nfnetlink to what the LOG target does. Code borrowed from xt_LOG.c. Signed-off-by: Bob Hockney <bhockney@ix.netcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16netfilter: nf_ct_reasm: fix conntrack reassembly expire codeHaibo Xi
Commit b836c99fd6c9 (ipv6: unify conntrack reassembly expire code with standard one) use the standard IPv6 reassembly code(ip6_expire_frag_queue) to handle conntrack reassembly expire. In ip6_expire_frag_queue, it invoke dev_get_by_index_rcu to get which device received this expired packet.so we must save ifindex when NF_conntrack get this packet. With this patch applied, I can see ICMP Time Exceeded sent from the receiver when the sender sent out 1/2 fragmented IPv6 packet. Signed-off-by: Haibo Xi <haibbo@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16netfilter: nf_conntrack_ipv6: fix comment for packets without dataFlorent Fourcot
Remove ambiguity of double negation. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADEAndrew Collins
Since (a0ecb85 netfilter: nf_nat: Handle routing changes in MASQUERADE target), the MASQUERADE target handles routing changes which affect the output interface of a connection, but only for ESTABLISHED connections. It is also possible for NEW connections which already have a conntrack entry to be affected by routing changes. This adds a check to drop entries in the NEW+conntrack state when the oif has changed. Signed-off-by: Andrew Collins <bsderandrew@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16netfilter: ip[6]t_REJECT: fix wrong transport header pointer in TCP resetMukund Jampala
The problem occurs when iptables constructs the tcp reset packet. It doesn't initialize the pointer to the tcp header within the skb. When the skb is passed to the ixgbe driver for transmit, the ixgbe driver attempts to access the tcp header and crashes. Currently, other drivers (such as our 1G e1000e or igb drivers) don't access the tcp header on transmit unless the TSO option is turned on. <1>BUG: unable to handle kernel NULL pointer dereference at 0000000d <1>IP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>*pdpt = 0000000085e5d001 *pde = 0000000000000000 <0>Oops: 0000 [#1] SMP [...] <4>Pid: 0, comm: swapper Tainted: P 2.6.35.12 #1 Greencity/Thurley <4>EIP: 0060:[<d081621c>] EFLAGS: 00010246 CPU: 16 <4>EIP is at ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>EAX: c7628820 EBX: 00000007 ECX: 00000000 EDX: 00000000 <4>ESI: 00000008 EDI: c6882180 EBP: dfc6b000 ESP: ced95c48 <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 <0>Process swapper (pid: 0, ti=ced94000 task=ced73bd0 task.ti=ced94000) <0>Stack: <4> cbec7418 c779e0d8 c77cc888 c77cc8a8 0903010a 00000000 c77c0008 00000002 <4><0> cd4997c0 00000010 dfc6b000 00000000 d0d176c9 c77cc8d8 c6882180 cbec7318 <4><0> 00000004 00000004 cbec7230 cbec7110 00000000 cbec70c0 c779e000 00000002 <0>Call Trace: <4> [<d0d176c9>] ? 0xd0d176c9 <4> [<d0d18a4d>] ? 0xd0d18a4d <4> [<411e243e>] ? dev_hard_start_xmit+0x218/0x2d7 <4> [<411f03d7>] ? sch_direct_xmit+0x4b/0x114 <4> [<411f056a>] ? __qdisc_run+0xca/0xe0 <4> [<411e28b0>] ? dev_queue_xmit+0x2d1/0x3d0 <4> [<411e8120>] ? neigh_resolve_output+0x1c5/0x20f <4> [<411e94a1>] ? neigh_update+0x29c/0x330 <4> [<4121cf29>] ? arp_process+0x49c/0x4cd <4> [<411f80c9>] ? nf_hook_slow+0x3f/0xac <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121c6d5>] ? T.901+0x38/0x3b <4> [<4121c918>] ? arp_rcv+0xa3/0xb4 <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<411e1173>] ? __netif_receive_skb+0x32b/0x346 <4> [<411e19e1>] ? netif_receive_skb+0x5a/0x5f <4> [<411e1ea9>] ? napi_skb_finish+0x1b/0x30 <4> [<d0816eb4>] ? ixgbe_xmit_frame_ring+0x1564/0x2260 [ixgbe] <4> [<41013468>] ? lapic_next_event+0x13/0x16 <4> [<410429b2>] ? clockevents_program_event+0xd2/0xe4 <4> [<411e1b03>] ? net_rx_action+0x55/0x127 <4> [<4102da1a>] ? __do_softirq+0x77/0xeb <4> [<4102dab1>] ? do_softirq+0x23/0x27 <4> [<41003a67>] ? do_IRQ+0x7d/0x8e <4> [<41002a69>] ? common_interrupt+0x29/0x30 <4> [<41007bcf>] ? mwait_idle+0x48/0x4d <4> [<4100193b>] ? cpu_idle+0x37/0x4c <0>Code: df 09 d7 0f 94 c2 0f b6 d2 e9 e7 fb ff ff 31 db 31 c0 e9 38 ff ff ff 80 78 06 06 0f 85 3e fb ff ff 8b 7c 24 38 8b 8f b8 00 00 00 <0f> b6 51 0d f6 c2 01 0f 85 27 fb ff ff 80 e2 02 75 0d 8b 6c 24 <0>EIP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] SS:ESP Signed-off-by: Mukund Jampala <jbmukund@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-16ipv6: Fix Makefile offload objectsSimon Arlott
The following commit breaks IPv6 TCP transmission for me: Commit 75fe83c32248d99e6d5fe64155e519b78bb90481 Author: Vlad Yasevich <vyasevic@redhat.com> Date: Fri Nov 16 09:41:21 2012 +0000 ipv6: Preserve ipv6 functionality needed by NET This patch fixes the typo "ipv6_offload" which should be "ipv6-offload". I don't know why not including the offload modules should break TCP. Disabling all offload options on the NIC didn't help. Outgoing pulseaudio traffic kept stalling. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-16sctp: jsctp_sf_eat_sack: fix jprobes function signature mismatchDaniel Borkmann
Commit 24cb81a6a (sctp: Push struct net down into all of the state machine functions) introduced the net structure into all state machine functions, but jsctp_sf_eat_sack was not updated, hence when SCTP association probing is enabled in the kernel, any simple SCTP client/server program from userspace will panic the kernel. Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-16bridge: add flags to distinguish permanent mdb entiresAmerigo Wang
This patch adds a flag to each mdb entry, so that we can distinguish permanent entries with temporary entries. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-16sctp: Change defaults on cookie hmac selectionNeil Horman
Recently I posted commit 3c68198e75 which made selection of the cookie hmac algorithm selectable. This is all well and good, but Linus noted that it changes the default config: http://marc.info/?l=linux-netdev&m=135536629004808&w=2 I've modified the sctp Kconfig file to reflect the recommended way of making this choice, using the thermal driver example specified, and brought the defaults back into line with the way they were prior to my origional patch Also, on Linus' suggestion, re-adding ability to select default 'none' hmac algorithm, so we don't needlessly bloat the kernel by forcing a non-none default. This also led me to note that we won't honor the default none condition properly because of how sctp_net_init is encoded. Fix that up as well. Tested by myself (allbeit fairly quickly). All configuration combinations seems to work soundly. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: David Miller <davem@davemloft.net> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Vlad Yasevich <vyasevich@gmail.com> CC: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-15SUNRPC: variable 'svsk' is unused in function bc_send_requestTrond Myklebust
Silence a compile time warning. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15SUNRPC: Handle ECONNREFUSED in xs_local_setup_socketTrond Myklebust
Silence the unnecessary warning "unhandled error (111) connecting to..." and convert it to a dprintk for debugging purposes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15userns: Require CAP_SYS_ADMIN for most uses of setns.Eric W. Biederman
Andy Lutomirski <luto@amacapital.net> found a nasty little bug in the permissions of setns. With unprivileged user namespaces it became possible to create new namespaces without privilege. However the setns calls were relaxed to only require CAP_SYS_ADMIN in the user nameapce of the targed namespace. Which made the following nasty sequence possible. pid = clone(CLONE_NEWUSER | CLONE_NEWNS); if (pid == 0) { /* child */ system("mount --bind /home/me/passwd /etc/passwd"); } else if (pid != 0) { /* parent */ char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/%u/ns/mnt"); fd = open(path, O_RDONLY); setns(fd, 0); system("su -"); } Prevent this possibility by requiring CAP_SYS_ADMIN in the current user namespace when joing all but the user namespace. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>