summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2011-03-16xprt: remove redundant null checkj223yang@asset.uwaterloo.ca
'req' is dereferenced before checked for NULL. The patch simply removes the check. Signed-off-by: Jinqiu Yang<crindy646@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits) tidy the trailing symlinks traversal up Turn resolution of trailing symlinks iterative everywhere simplify link_path_walk() tail Make trailing symlink resolution in path_lookupat() iterative update nd->inode in __do_follow_link() instead of after do_follow_link() pull handling of one pathname component into a helper fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH Allow passing O_PATH descriptors via SCM_RIGHTS datagrams readlinkat(), fchownat() and fstatat() with empty relative pathnames Allow O_PATH for symlinks New kind of open files - "location only". ext4: Copy fs UUID to superblock ext3: Copy fs UUID to superblock. vfs: Export file system uuid via /proc/<pid>/mountinfo unistd.h: Add new syscalls numbers to asm-generic x86: Add new syscalls for x86_64 x86: Add new syscalls for x86_32 fs: Remove i_nlink check from file system link callback fs: Don't allow to create hardlink for deleted file vfs: Add open by file handle support ...
2011-03-15Merge branch 'next' into for-linusJames Morris
2011-03-15xfrm: fix __xfrm_route_forward()Eric Dumazet
This function should return 0 in case of error, 1 if OK commit 452edd598f60522 (xfrm: Return dst directly from xfrm_lookup()) got it wrong. Reported-and-bisected-by: Michael Smith <msmith@cbnco.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-03-15Phonet: fix aligned-mode pipe socket buffer header reserveRémi Denis-Courmont
When the pipe uses aligned-mode data packets, we must reserve 4 bytes instead of 3 for the pipe protocol header. Otherwise the Phonet header would not be aligned, resulting in potentially corrupted headers with later unaligned memory writes. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2011-03-15Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 Conflicts: Documentation/feature-removal-schedule.txt
2011-03-15netfilter: xt_addrtype: ipv6 supportFlorian Westphal
The kernel will refuse certain types that do not work in ipv6 mode. We can then add these features incrementally without risk of userspace breakage. Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: ipt_addrtype: rename to xt_addrtypeFlorian Westphal
Followup patch will add ipv6 support. ipt_addrtype.h is retained for compatibility reasons, but no longer used by the kernel. Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2011-03-15libceph: Fix base64-decoding when input ends in newline.Tommi Virtanen
It used to return -EINVAL because it thought the end was not aligned to 4 bytes. Clean up superfluous src < end test in if, the while itself guarantees that. Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-15net/9p: Implement syncfs 9P operationAneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Small non-IO PDUs for zero-copy supporting transports.Venkateswararao Jujjuri (JV)
If a transport prefers payload to be sent separate from the PDU (P9_TRANS_PREF_PAYLOAD_SEP), there is no need to allocate msize PDU buffers(struct p9_fcall). This patch allocates only upto 4k buffers for this kind of transports and there won't be any change to the legacy transports. Hence, this patch on top of zero copy changes allows user to specify higher msizes through the mount option without hogging the kernel heap. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Handle Zero Copy TREAD/RERROR case in !dotl case.Venkateswararao Jujjuri (JV)
This takes care of copying out error buffers from user buffer payloads when we are using zero copy. This happens because the only payload buffer the server has to respond to the request is the user buffer given for the zero copy read. Because we only use zerocopy when the amount of data to transfer is greater than a certain size (currently 4K) and error strings are limited to ERRMAX (currently 128) we don't need to worry about there being sufficient space for the error to fit in the payload. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] readdir zerocopy changes for 9P2000.L protocol.Venkateswararao Jujjuri (JV)
Modify p9_client_readdir() to check the transport preference and act according If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload separately instead of putting it directly on PDU. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Write side zerocopy changes for 9P2000.L protocol.Venkateswararao Jujjuri (JV)
Modify p9_client_write() to check the transport preference and act accordingly. If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload separately instead of putting it directly on PDU. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Read side zerocopy changes for 9P2000.L protocol.Venkateswararao Jujjuri (JV)
Modify p9_client_read() to check the transport preference and act accordingly. If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload separately instead of putting it directly on PDU. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Add preferences to transport layer.Venkateswararao Jujjuri (JV)
This patch adds preferences field to the p9_trans_module. Through this, now transport layer can express its preference about the payload. i.e if payload neds to be part of the PDU or it prefers it to be sent sepearetly so that the transport layer can handle it in a better way. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Add gup/zero_copy support to VirtIO transport layer.Venkateswararao Jujjuri (JV)
Modify p9_virtio_request() and req_done() functions to support additional payload sent down to the transport layer through tc->pubuf and tc->pkbuf. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Assign type of transaction to tc->pdu->id which is otherwise unsed.Venkateswararao Jujjuri (JV)
This will be used by the transport layer to determine the out going request type. Transport layer uses this information to correctly place the mapped pages in the PDU. Patches following this will make use of this to achieve zero copy. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Preparation and helper functions for zero copyVenkateswararao Jujjuri (JV)
This patch prepares p9_fcall structure for zero copy. Added fields send the payload buffer information to the transport layer. In addition it adds a 'private' field for the transport layer to store mapped/pinned page information so that it can be freed/unpinned during req_done. This patch also creates trans_common.[ch] to house helper functions. It adds the following helper functions. p9_release_req_pages - Release pages after the transaction. p9_nr_pages - Return number of pages needed to accomodate the payload. payload_gup - Translates user buffer into kernel pages. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15ipv6: netfilter: ip6_tables: fix infoleak to userspaceVasiliy Kulikov
Structures ip6t_replace, compat_ip6t_replace, and xt_get_revision are copied from userspace. Fields of these structs that are zero-terminated strings are not checked. When they are used as argument to a format string containing "%s" in request_module(), some sensitive information is leaked to userspace via argument of spawned modprobe process. The first bug was introduced before the git epoch; the second was introduced in 3bc3fe5e (v2.6.25-rc1); the third is introduced by 6b7d31fc (v2.6.15-rc1). To trigger the bug one should have CAP_NET_ADMIN. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: ip_tables: fix infoleak to userspaceVasiliy Kulikov
Structures ipt_replace, compat_ipt_replace, and xt_get_revision are copied from userspace. Fields of these structs that are zero-terminated strings are not checked. When they are used as argument to a format string containing "%s" in request_module(), some sensitive information is leaked to userspace via argument of spawned modprobe process. The first and the third bugs were introduced before the git epoch; the second was introduced in 2722971c (v2.6.17-rc1). To trigger the bug one should have CAP_NET_ADMIN. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: arp_tables: fix infoleak to userspaceVasiliy Kulikov
Structures ipt_replace, compat_ipt_replace, and xt_get_revision are copied from userspace. Fields of these structs that are zero-terminated strings are not checked. When they are used as argument to a format string containing "%s" in request_module(), some sensitive information is leaked to userspace via argument of spawned modprobe process. The first bug was introduced before the git epoch; the second is introduced by 6b7d31fc (v2.6.15-rc1); the third is introduced by 6b7d31fc (v2.6.15-rc1). To trigger the bug one should have CAP_NET_ADMIN. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: xt_connlimit: remove connlimit_rnd_initedChangli Gao
A potential race condition when generating connlimit_rnd is also fixed. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: xt_connlimit: use hlist insteadChangli Gao
The header of hlist is smaller than list. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: xt_connlimit: use kmalloc() instead of kzalloc()Changli Gao
All the members are initialized after kzalloc(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15netfilter: xt_connlimit: fix daddr connlimit in SNAT scenarioChangli Gao
We use the reply tuples when limiting the connections by the destination addresses, however, in SNAT scenario, the final reply tuples won't be ready until SNAT is done in POSTROUING or INPUT chain, and the following nf_conntrack_find_get() in count_tem() will get nothing, so connlimit can't work as expected. In this patch, the original tuples are always used, and an additional member addr is appended to save the address in either end. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15Allow passing O_PATH descriptors via SCM_RIGHTS datagramsAl Viro
Just need to make sure that AF_UNIX garbage collector won't confuse O_PATHed socket on filesystem for real AF_UNIX opened socket. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15IPVS: Add __ip_vs_control_{init,cleanup}_sysctl()Simon Horman
Break out the portions of __ip_vs_control_init() and __ip_vs_control_cleanup() where aren't necessary when CONFIG_SYSCTL is undefined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Conditionally define and use ip_vs_lblc{r}_tableSimon Horman
ip_vs_lblc_table and ip_vs_lblcr_table, and code that uses them are unnecessary when CONFIG_SYSCTL is undefined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Minimise ip_vs_leave when CONFIG_SYSCTL is undefinedSimon Horman
Much of ip_vs_leave() is unnecessary if CONFIG_SYSCTL is undefined. I tried an approach of breaking the now #ifdef'ed portions out into a separate function. However this appeared to grow the compiled code on x86_64 by about 200 bytes in the case where CONFIG_SYSCTL is defined. So I have gone with the simpler though less elegant #ifdef'ed solution for now. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Conditinally use sysctl_lblc{r}_expirationSimon Horman
In preparation for not including sysctl_lblc{r}_expiration in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add expire_quiescent_template()Simon Horman
In preparation for not including sysctl_expire_quiescent_template in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add sysctl_expire_nodest_conn()Simon Horman
In preparation for not including sysctl_expire_nodest_conn in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add sysctl_sync_ver()Simon Horman
In preparation for not including sysctl_sync_ver in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add {sysctl_sync_threshold,period}()Simon Horman
In preparation for not including sysctl_sync_threshold in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add sysctl_nat_icmp_send()Simon Horman
In preparation for not including sysctl_nat_icmp_send in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add sysctl_snat_reroute()Simon Horman
In preparation for not including sysctl_snat_reroute in struct netns_ipvs when CONFIG_SYCTL is not defined. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15IPVS: Add ip_vs_route_me_harder()Simon Horman
Add ip_vs_route_me_harder() to avoid repeating the same code twice. Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: rename estimator functionsJulian Anastasov
Rename ip_vs_new_estimator to ip_vs_start_estimator and ip_vs_kill_estimator to ip_vs_stop_estimator to better match their logic. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: optimize rates readingJulian Anastasov
Move the estimator reading from estimation_timer to user context. ip_vs_read_estimator() will be used to decode the rate values. As the decoded rates are not set by estimation timer there is no need to reset them in ip_vs_zero_stats. There is no need ip_vs_new_estimator() to encode stats to rates, if the destination is in trash both the stats and the rates are inactive. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: properly zero stats and ratesJulian Anastasov
Currently, the new percpu counters are not zeroed and the zero commands do not work as expected, we still show the old sum of percpu values. OTOH, we can not reset the percpu counters from user context without causing the incrementing to use old and bogus values. So, as Eric Dumazet suggested fix that by moving all overhead to stats reading in user context. Do not introduce overhead in timer context (estimator) and incrementing (packet handling in softirqs). The new ustats0 field holds the zero point for all counter values, the rates always use 0 as base value as before. When showing the values to user space just give the difference between counters and the base values. The only drawback is that percpu stats are not zeroed, they are accessible only from /proc and are new interface, so it should not be a compatibility problem as long as the sum stats are correct after zeroing. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: reorganize tot_statsJulian Anastasov
The global tot_stats contains cpustats field just like the stats for dest and svc, so better use it to simplify the usage in estimation_timer. As tot_stats is registered as estimator we can remove the special ip_vs_read_cpu_stats call for tot_stats. Fix ip_vs_read_cpu_stats to be called under stats lock because it is still used as synchronization between estimation timer and user context (the stats readers). Also, make sure ip_vs_stats_percpu_show reads properly the u64 stats from user context. Signed-off-by: Julian Anastasov <ja@ssi.bg> Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15netfilter:ipvs: use kmemdupShan Wei
The semantic patch that makes this output is available in scripts/coccinelle/api/memdup.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/ Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: remove _bh from percpu stats readingJulian Anastasov
ip_vs_read_cpu_stats is called only from timer, so no need for _bh locks. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15ipvs: avoid lookup for fwmark 0Julian Anastasov
Restore the previous behaviour to lookup for fwmark service only when fwmark is non-null. This saves only CPU. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15net: dcbnl: Update copyright datesMark Rustad
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14tcp_cubic: fix low utilization of CUBIC with HyStartSangtae Ha
HyStart sets the initial exit point of slow start. Suppose that HyStart exits at 0.5BDP in a BDP network and no history exists. If the BDP of a network is large, CUBIC's initial cwnd growth may be too conservative to utilize the link. CUBIC increases the cwnd 20% per RTT in this case. Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>