Age | Commit message | Author |
|
Speed up vlan dismantling in CONFIG_VLAN_8021Q_GVRP=y cases
by using call_rcu() to free the memory instead of waiting in an
expensive synchronize_rcu() [ while RTNL is held ].
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
veth devices don't use batched device unregisters yet.
Since veth devices come in pairs, it makes sense to use a batch of two
unregisters; this roughly divides dismantle time by two.
Fix this by changing dellink() callers to always provide a non-NULL
head. (Idea from Michał Mirosław)
This patch also handles the macvlan case: we now dismantle all macvlans on
top of a lower dev at once.
Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Michał Mirosław <mirqus@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I messed things up when I converted over to the transport
flow: I passed the IPv4 address value instead of its address.
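The class of mistake described here (handing over an address value where a pointer to it is expected) can be sketched in plain C. This is only an illustration under assumptions: `struct flow4`, `copy_addr()` and `flow4_daddr()` are hypothetical names, not the code this commit touched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a flow key that stores an IPv4 destination. */
struct flow4 {
    uint32_t daddr;
};

/* Copies an IPv4 address from the memory location that 'src' points at. */
static void copy_addr(uint32_t *dst, const void *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Reads the destination back out of the flow key. */
static uint32_t flow4_daddr(const struct flow4 *fl)
{
    uint32_t out;

    assert(fl != NULL);
    /*
     * Wrong: copy_addr(&out, (const void *)(uintptr_t)fl->daddr);
     * That would treat the address value itself as a memory location.
     */
    copy_addr(&out, &fl->daddr);    /* right: pass the field's address */
    return out;
}
```

Calling `flow4_daddr()` on a key whose `daddr` is 0xC0A80101 returns that same value; the commented-out form would read from an unrelated memory location instead.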
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This way rt->rt_dst accesses are unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This way ip_output.c no longer needs rt->rt_{src,dst}.
We already have these keys sitting, ready and waiting, on the stack or
in a socket structure.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We have two cases.
Either the socket is in TCP_ESTABLISHED state and connect() filled
in the inet socket cork flow, or we looked up the route here and
used an on-stack flow.
Track which one it was, and use it to obtain src/dst addrs.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch enables ethtool to set the loopback mode on a given interface.
By configuring the interface in loopback mode in conjunction with a policy
route / rule, a userland application can stress the egress / ingress path,
exposing the flaws of the change in progress and potentially helping
developer(s) understand the impact of those changes without even sending a
packet out on the network.
The following set of commands illustrates one such example:
a) ip -4 addr add 192.168.1.1/24 dev eth1
b) ip -4 rule add from all iif eth1 lookup 250
c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250
d) arp -Ds 192.168.1.100 eth1
e) arp -Ds 192.168.1.200 eth1
f) sysctl -w net.ipv4.ip_nonlocal_bind=1
g) sysctl -w net.ipv4.conf.all.accept_local=1
# Assuming that the machine has 8 cores
h) taskset 000f netserver -L 192.168.1.200
i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I don't know why %pI6 doesn't compress, but the format specifier is
kernel-standard, so use it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now we can pick it out of the transport's flow key.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now we can pick it out of the provided flow key.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.
It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Operation order is now transposed: we first create the child
socket, then we try to hook up the route.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is just like inet_csk_route_req() except that it operates after
we've created the new child socket.
In this way we can use the new socket's cork flow for proper route
key storage.
This will be used by DCCP and TCP child socket creation handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Several future simplifications are possible now because of this.
For example, the sctp_addr unions can simply refer directly to
the flowi information.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All invokers of ip_queue_xmit() must make certain that the
socket is locked. All of SCTP, TCP, DCCP, and L2TP now make
sure this is the case.
Therefore we can use the cork flow during output route lookup in
ip_queue_xmit() when the socket route check fails.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These two functions must be invoked only when the socket is locked
(because socket identity modifications are made non-atomically).
Therefore we can use the cork flow for output route lookups.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is to make sure that an l2tp socket's inet cork flow is
fully filled in, when it's encapsulated in UDP.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the socket is consistently locked in these two routines,
this transformation is legal.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
l2tp_xmit_skb() must take the socket lock. It makes use of ip_queue_xmit()
which expects to execute in a socket atomic context.
Since we execute this function in software interrupts, we cannot use the
usual lock_sock()/release_sock() sequence. Instead we have to use
bh_lock_sock() and see if a user has the socket locked, and if so drop
the packet.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both l2tp_ip_connect() and l2tp_ip_sendmsg() must take the socket
lock. They both modify socket state non-atomically, and in particular
l2tp_ip_sendmsg() increments socket private counters without using
atomic operations.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since this is invoked from inet_stream_connect() the socket is locked
and therefore this usage is safe.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since this is invoked from inet_stream_connect() the socket is locked
and therefore this usage is safe.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After that, all the upstream kernel drivers now use set_phys_id,
and the old ethtool_ops interface (phys_id) can be removed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the function is_bidirectional_neigh(), the code that finds the one-hop
neighbor is duplicated.
Signed-off-by: Daniele Furlan <daniele.furlan@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
It is slightly irritating that comments after a long line span over
multiple lines without any code. It is easier to put them before the
actual code and reduce the number of lines which the eye has to read.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
To be coherent, all the functions/variables/constants have been renamed
to the TranslationTable style
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
The hard_if_event is called by the notifier with rtnl_lock and tries to
remove sysfs entries when a NETDEV_UNREGISTER event is received. This
will automatically take the s_active lock.
The s_active lock is also used when a new interface is added to a meshif
through sysfs. In that situation we cannot wait for the rtnl_lock before
creating the actual batman-adv interface, because that could deadlock. It is
still possible to try to get the rtnl_lock and immediately abort the
current operation when the trylock call fails.
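The try-and-abort idea can be illustrated in userspace C. This is a sketch by analogy only: `config_lock` plays the role of the rtnl_lock, `add_interface()` is a hypothetical name, and the `-EBUSY` return value is an arbitrary choice for the example. The kernel code uses rtnl_trylock(), not C11 atomics.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Plays the role of the global configuration lock (rtnl_lock in the text). */
static atomic_flag config_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool config_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&config_lock,
                                              memory_order_acquire);
}

static void config_unlock(void)
{
    atomic_flag_clear_explicit(&config_lock, memory_order_release);
}

/*
 * Imagine the caller already holds a second lock (s_active in the text).
 * Blocking on config_lock here could deadlock against a path that holds
 * config_lock and waits for that second lock, so try once and give up.
 */
static int add_interface(int *iface_count)
{
    assert(iface_count != NULL);
    if (!config_trylock())
        return -EBUSY;      /* abort; the caller may retry later */
    (*iface_count)++;
    config_unlock();
    return 0;
}
```

While another path holds `config_lock`, `add_interface()` fails fast instead of sleeping, so the two locks are never waited on in opposite orders.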
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
hardif_list_lock is unnecessary because rtnl_lock already ensures that
multiple admin operations cannot take place at the same time.
hardif_list_lock only adds additional overhead and complexity.
Critical functions now check whether they are called with rtnl_lock
using ASSERT_RTNL.
It indirectly fixes the problem that orig_hash_del_if() expects that
only one interface is deleted from hardif_list at a time, but
hardif_remove_interfaces() removes all at once and then calls
orig_hash_del_if().
Reported-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
The bridge loop detection for batman-adv allows the bat0 interface
to be bridged into an ethernet segment which other batman-adv nodes
are connected to. In order to also allow multiple VLANs on top of
the bat0 interface to be bridged into the ethernet segment this
patch extends the aforementioned bridge loop detection.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
|
|
Noticed by Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ip_setup_cork() explicitly initializes every member of
inet_cork except flags, addr, and opt. So we can simply
set those three members to zero instead of using a
memset() via an empty struct assignment.
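The difference between the two reset styles can be sketched in standalone C. The structure below is a hypothetical miniature, not the kernel's inet_cork, and `cork_setup()` is only a stand-in for a routine that writes every remaining member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a cork structure. */
struct mini_cork {
    unsigned int flags;
    unsigned int addr;
    void *opt;
    unsigned int fragsize;
    int length;
};

/* Whole-struct reset: every byte of every member is cleared. */
static void cork_reset_all(struct mini_cork *c)
{
    assert(c != NULL);
    *c = (struct mini_cork){ 0 };
}

/* Targeted reset: clear only the members the setup step does not write. */
static void cork_reset_unset(struct mini_cork *c)
{
    assert(c != NULL);
    c->flags = 0;
    c->addr = 0;
    c->opt = NULL;
}

/* Stand-in for a setup routine that initializes all remaining members. */
static void cork_setup(struct mini_cork *c, unsigned int fragsize)
{
    assert(c != NULL);
    c->fragsize = fragsize;
    c->length = 0;
}
```

Either reset followed by `cork_setup()` leaves the structure in the same state; the targeted version simply avoids storing to members that are about to be overwritten anyway.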
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
|
|
When we fast path datagram sends to avoid locking by putting
the inet_cork on the stack we use up lots of space that isn't
necessary.
This is because inet_cork contains a "struct flowi" which isn't
used in these code paths.
Split inet_cork to two parts, "inet_cork" and "inet_cork_full".
Only the latter of which has the "struct flowi" and is what is
stored in inet_sock.
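The shape of such a split can be sketched as follows. All names and sizes are made up for the illustration (`cork_small`, `cork_full`, a 64-byte `mini_flow`); the real structures differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a large flow key. */
struct mini_flow {
    unsigned char key[64];
};

/* Small part: what the lockless datagram fast path keeps on the stack. */
struct cork_small {
    unsigned int flags;
    unsigned int fragsize;
    int length;
};

/* Full part: the small part plus the flow key, stored in the socket. */
struct cork_full {
    struct cork_small base;
    struct mini_flow fl;
};

/* The full variant must carry at least the flow key on top of the base. */
static_assert(sizeof(struct cork_full) >=
              sizeof(struct cork_small) + sizeof(struct mini_flow),
              "cork_full should embed both parts");

/* Bytes of stack saved per call by using only the small part. */
static size_t cork_stack_saving(void)
{
    return sizeof(struct cork_full) - sizeof(struct cork_small);
}
```

Code that only needs the small part can take a `struct cork_small *`, and holders of the full variant pass `&full.base`, so both paths share the same helpers.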
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
|
|
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/tg3.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
|
|
This patch adds a multiple message send syscall and is the send
version of the existing recvmmsg syscall. This is heavily
based on the patch by Arnaldo that added recvmmsg.
I wrote a microbenchmark to test the performance gains of using
this new syscall:
http://ozlabs.org/~anton/junkcode/sendmmsg_test.c
The test was run on a ppc64 box with a 10 Gbit network card. The
benchmark can send both UDP and RAW ethernet packets.
64B UDP
  batch  pkts/sec
      1    804570
      2    872800  (+ 8 %)
      4    916556  (+14 %)
      8    939712  (+17 %)
     16    952688  (+18 %)
     32    956448  (+19 %)
     64    964800  (+20 %)
64B raw socket
  batch  pkts/sec
      1   1201449
      2   1350028  (+12 %)
      4   1461416  (+22 %)
      8   1513080  (+26 %)
     16   1541216  (+28 %)
     32   1553440  (+29 %)
     64   1557888  (+30 %)
We see a 20% improvement in throughput on UDP send and 30%
on raw socket send.
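For readers who have not used the call, a minimal userspace sketch of batching with sendmmsg() might look like this. It is not the benchmark linked above: `send_batch()` and the batch size of 4 are assumptions made for the example, and it needs a Linux libc that exposes sendmmsg() under `_GNU_SOURCE`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 4

/*
 * Sends BATCH copies of one payload to 'dst' with a single sendmmsg() call.
 * Returns the number of messages sent, or -1 on error (errno is set).
 */
static int send_batch(int fd, const struct sockaddr_in *dst,
                      const char *payload, size_t len)
{
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    int i;

    assert(dst != NULL && payload != NULL);
    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        /* Each message gets its own header and a one-element iovec. */
        iov[i].iov_base = (void *)payload;
        iov[i].iov_len = len;
        msgs[i].msg_hdr.msg_name = (void *)dst;
        msgs[i].msg_hdr.msg_namelen = sizeof(*dst);
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }
    /* One system call submits the whole batch. */
    return sendmmsg(fd, msgs, BATCH, 0);
}
```

The return value can be smaller than the batch size, in which case a caller would resubmit the remaining entries; on success the kernel also fills in each entry's `msg_len` with the bytes sent for that message.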
[ Add sparc syscall entries. -DaveM ]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Force dev_alloc_name() to be called from register_netdevice() by
dev_get_valid_name(). That allows removing multiple explicit
dev_alloc_name() calls.
The possibility to call dev_alloc_name() in advance remains.
This also fixes the veth creation regression caused by
84c49d8c3e4abefb0a41a77b25aa37ebe8d6b743
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
drivers/net/wireless/libertas/if_cs.c
drivers/net/wireless/rtlwifi/pci.c
net/bluetooth/l2cap_sock.c
|
|
Devices that require IV generation in software need tailroom
reserved for the ICVs used in TKIP or WEP encryption.
Until now, the decision to skip the tailroom reservation in the tx
path was based only on whether the driver wants the MMIC to be generated
in software. This patch adds an IV generation check to that
decision and fixes the following warning.
WARNING: at net/mac80211/wep.c:101 ieee80211_wep_add_iv+0x56/0xf3()
Hardware name: 64756D6
Modules linked in: ath9k ath9k_common ath9k_hw
Pid: 0, comm: swapper Tainted: G W 2.6.39-rc5-wl
Call Trace:
[<c102fd29>] warn_slowpath_common+0x65/0x7a
[<c1465c4e>] ? ieee80211_wep_add_iv+0x56/0xf3
[<c102fd4d>] warn_slowpath_null+0xf/0x13
[<c1465c4e>] ieee80211_wep_add_iv+0x56/0xf3
[<c1466007>] ieee80211_crypto_wep_encrypt+0x63/0x88
[<c1478bf3>] ieee80211_tx_h_encrypt+0x2f/0x63
[<c1478cba>] invoke_tx_handlers+0x93/0xe1
[<c1478eda>] ieee80211_tx+0x4b/0x6d
[<c147907c>] ieee80211_xmit+0x180/0x188
[<c147779d>] ? ieee80211_skb_resize+0x95/0xd9
[<c1479edf>] ieee80211_subif_start_xmit+0x64f/0x668
[<c13956fc>] dev_hard_start_xmit+0x368/0x48c
[<c13a8bd6>] sch_direct_xmit+0x4d/0x101
[<c1395ae1>] dev_queue_xmit+0x2c1/0x43f
[<c13a74a2>] ? eth_header+0x1e/0x90
[<c13a7400>] ? eth_type_trans+0x91/0xc2
[<c13a7484>] ? eth_rebuild_header+0x53/0x53
[<c139f079>] neigh_resolve_output+0x223/0x27e
[<c13c6b23>] ip_finish_output2+0x1d4/0x1fe
[<c13c6bc6>] ip_finish_output+0x79/0x7d
[<c13c6cbe>] T.1075+0x43/0x48
[<c13c6e6e>] ip_output+0x75/0x7b
[<c13c4970>] dst_output+0xc/0xe
[<c13c62c9>] ip_local_out+0x17/0x1a
[<c13c67bb>] ip_queue_xmit+0x2aa/0x2f8
[<c138b742>] ? sk_setup_caps+0x21/0x92
[<c13d95ea>] ? __tcp_v4_send_check+0x7e/0xb7
[<c13d5d2e>] tcp_transmit_skb+0x6a1/0x6d7
[<c13d533b>] ? tcp_established_options+0x20/0x8b
[<c13d6f28>] tcp_retransmit_skb+0x43a/0x527
[<c13d8d6d>] tcp_retransmit_timer+0x32e/0x45d
[<c13d8f23>] tcp_write_timer+0x87/0x16c
[<c103a030>] run_timer_softirq+0x156/0x1f9
[<c13d8e9c>] ? tcp_retransmit_timer+0x45d/0x45d
[<c1034d65>] __do_softirq+0x97/0x14a
[<c1034cce>] ? irq_enter+0x4d/0x4d
Cc: Yogesh Powar <yogeshp@marvell.com>
Reported-by: Fabio Rossi <rossi.f@inwind.it>
Tested-by: Fabio Rossi <rossi.f@inwind.it>
Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
can: rename can_try_module_get to can_get_proto
can_try_module_get does return a struct can_proto.
The name explains what is done in so much detail that a caller
may not notice that a struct can_proto is locked/unlocked.
Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 53914b67993c724cec585863755c9ebc8446e83b had the
same message. That commit did put everything in place but
did not make can_proto const itself.
Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 4a94445c9a5c (net: Use ip_route_input_noref() in input path)
introduced a bug in IP defragmentation handling that shows up when the
fragment timeout fires.
When a frame is defragmented, we use the last skb's dst field when building
the final skb. Its dst is valid, since we are in an RCU read section.
But if a timeout occurs, we take the first queued fragment to build an ICMP
TIME EXCEEDED message. The problem is that all queued skbs have weak dst
pointers, since we left the RCU critical section after queueing them.
icmp_send() might dereference a now freed (and possibly reused) part of memory.
Calling skb_dst_drop() and ip_route_input_noref() to revalidate the route is
the only possible choice.
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of rt->rt_{dst,src}
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
First, make callers pass on-stack flowi4 to ip_route_output_gre()
so they can get at the fully resolved flow key.
Next, use that in ipgre_tunnel_xmit() to avoid the need to use
rt->rt_{dst,src}.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This eliminates the need to use rt->rt_{src,dst}.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of rt->rt_{dst,src}
The only tricky part is source route option handling.
If the source route option is enabled we can't just use plain 'daddr',
we have to use opt->opt.faddr.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of rt->rt_{dst,src}
Signed-off-by: David S. Miller <davem@davemloft.net>
|