diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2013-01-28 19:41:37 (GMT) |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-01-28 19:41:37 (GMT) |
commit | 22f837981514e157f8f9737b25ac6d7d90a14006 (patch) | |
tree | 5537a70dcd9225023335b1bd1cd0e9a9c0e95cb9 /net/ipv4 | |
parent | 949db153b6466c6f7cad5a427ecea94985927311 (diff) | |
parent | 6642f91c92da07369cf1e582503ea3ccb4a7f1a9 (diff) | |
download | linux-fsl-qoriq-22f837981514e157f8f9737b25ac6d7d90a14006.tar.xz |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:
"Much more accumulated than I would have liked due to an unexpected
bout with a nasty flu:
1) AH and ESP input don't set ECN field correctly because the
transport head of the SKB isn't set correctly, fix from Li
RongQing.
2) If netfilter conntrack zones are disabled, we can return an
uninitialized variable instead of the proper error code. Fix from
Borislav Petkov.
3) Fix double SKB free in ath9k driver beacon handling, from Felix
Feitkau.
4) Remove bogus assumption about netns cleanup ordering in
nf_conntrack, from Pablo Neira Ayuso.
5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric
Dumazet. It uses spin_is_locked() in it's test and is therefore
unsuitable for UP.
6) Fix SELINUX labelling regressions added by the tuntap multiqueue
changes, from Paul Moore.
7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin
Nayak Sujir.
8) CXGB4 driver sets interrupt coalescing parameters only on first
queue, rather than all of them. Fix from Thadeu Lima de Souza
Cascardo.
9) Fix regression in the dispatch of read/write registers in dm9601
driver, from Tushar Behera.
10) ipv6_append_data miscalculates header length, from Romain KUNTZ.
11) Fix PMTU handling regressions on ipv4 routes, from Steffen
Klassert, Timo Teräs, and Julian Anastasov.
12) In 3c574_cs driver, add necessary parenthesis to "x << y & z"
expression. From Nickolai Zeldovich.
13) macvlan_get_size() causes underallocation netlink message space,
fix from Eric Dumazet.
14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai
Zeldovich. Amusingly the zero check was already there, we were
just performing it after the modulus :-)
15) Some more splice bug fixes from Eric Dumazet, which fix things
mostly eminating from how we now more aggressively use high-order
pages in SKBs.
16) Fix size calculation bug when freeing hash tables in the IPSEC
xfrm code, from Michal Kubecek.
17) Fix PMTU event propagation into socket cached routes, from Steffen
Klassert.
18) Fix off by one in TX buffer release in netxen driver, from Eric
Dumazet.
19) Fix rediculous memory allocation requirements introduced by the
tuntap multiqueue changes, from Jason Wang.
20) Remove bogus AMD platform workaround in r8169 driver that causes
major problems in normal operation, from Timo Teräs.
21) virtio-net set affinity and select queue don't handle
discontiguous cpu numbers properly, fix from Wanlong Gao.
22) Fix a route refcounting issue in loopback driver, from Eric
Dumazet. There's a similar fix coming that we might add to the
macvlan driver as well.
23) Fix SKB leaks in batman-adv's distributed arp table code, from
Matthias Schiffer.
24) r8169 driver gives descriptor ownership back the hardware before
we're done reading the VLAN tag out of it, fix from Francois
Romieu.
25) Checksums not calculated properly in GRE tunnel driver fix from
Pravin B Shelar.
26) Fix SCTP memory leak on namespace exit."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits)
dm9601: support dm9620 variant
SCTP: Free the per-net sysctl table on net exit. v2
net: phy: icplus: fix broken INTR pin settings
net: phy: icplus: Use the RGMII interface mode to configure clock delays
IP_GRE: Fix kernel panic in IP_GRE with GRE csum.
sctp: set association state to established in dupcook_a handler
ip6mr: limit IPv6 MRT_TABLE identifiers
r8169: fix vlan tag read ordering.
net: cdc_ncm: use IAD provided by the USB core
batman-adv: filter ARP packets with invalid MAC addresses in DAT
batman-adv: check for more types of invalid IP addresses in DAT
batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply()
net: loopback: fix a dst refcounting issue
virtio-net: reset virtqueue affinity when doing cpu hotplug
virtio-net: split out clean affinity function
virtio-net: fix the set affinity bug when CPU IDs are not consecutive
can: pch_can: fix invalid error codes
can: ti_hecc: fix invalid error codes
can: c_can: fix invalid error codes
r8169: remove the obsolete and incorrect AMD workaround
...
Diffstat (limited to 'net/ipv4')
-rw-r--r-- | net/ipv4/ah4.c | 18 | ||||
-rw-r--r-- | net/ipv4/datagram.c | 25 | ||||
-rw-r--r-- | net/ipv4/esp4.c | 12 | ||||
-rw-r--r-- | net/ipv4/ip_gre.c | 6 | ||||
-rw-r--r-- | net/ipv4/ipcomp.c | 7 | ||||
-rw-r--r-- | net/ipv4/ping.c | 1 | ||||
-rw-r--r-- | net/ipv4/raw.c | 1 | ||||
-rw-r--r-- | net/ipv4/route.c | 54 | ||||
-rw-r--r-- | net/ipv4/tcp_ipv4.c | 9 | ||||
-rw-r--r-- | net/ipv4/udp.c | 1 |
10 files changed, 117 insertions, 17 deletions
diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index a0d8392..a69b4e4 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -269,7 +269,11 @@ static void ah_input_done(struct crypto_async_request *base, int err) skb->network_header += ah_hlen; memcpy(skb_network_header(skb), work_iph, ihl); __skb_pull(skb, ah_hlen + ihl); - skb_set_transport_header(skb, -ihl); + + if (x->props.mode == XFRM_MODE_TUNNEL) + skb_reset_transport_header(skb); + else + skb_set_transport_header(skb, -ihl); out: kfree(AH_SKB_CB(skb)->tmp); xfrm_input_resume(skb, err); @@ -381,7 +385,10 @@ static int ah_input(struct xfrm_state *x, struct sk_buff *skb) skb->network_header += ah_hlen; memcpy(skb_network_header(skb), work_iph, ihl); __skb_pull(skb, ah_hlen + ihl); - skb_set_transport_header(skb, -ihl); + if (x->props.mode == XFRM_MODE_TUNNEL) + skb_reset_transport_header(skb); + else + skb_set_transport_header(skb, -ihl); err = nexthdr; @@ -413,9 +420,12 @@ static void ah4_err(struct sk_buff *skb, u32 info) if (!x) return; - if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) + if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) { + atomic_inc(&flow_cache_genid); + rt_genid_bump(net); + ipv4_update_pmtu(skb, net, info, 0, 0, IPPROTO_AH, 0); - else + } else ipv4_redirect(skb, net, 0, 0, IPPROTO_AH, 0); xfrm_state_put(x); } diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c index 424fafb..b28e863 100644 --- a/net/ipv4/datagram.c +++ b/net/ipv4/datagram.c @@ -85,3 +85,28 @@ out: return err; } EXPORT_SYMBOL(ip4_datagram_connect); + +void ip4_datagram_release_cb(struct sock *sk) +{ + const struct inet_sock *inet = inet_sk(sk); + const struct ip_options_rcu *inet_opt; + __be32 daddr = inet->inet_daddr; + struct flowi4 fl4; + struct rtable *rt; + + if (! __sk_dst_get(sk) || __sk_dst_check(sk, 0)) + return; + + rcu_read_lock(); + inet_opt = rcu_dereference(inet->inet_opt); + if (inet_opt && inet_opt->opt.srr) + daddr = inet_opt->opt.faddr; + rt = ip_route_output_ports(sock_net(sk), &fl4, sk, daddr, + inet->inet_saddr, inet->inet_dport, + inet->inet_sport, sk->sk_protocol, + RT_CONN_FLAGS(sk), sk->sk_bound_dev_if); + if (!IS_ERR(rt)) + __sk_dst_set(sk, &rt->dst); + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(ip4_datagram_release_cb); diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index b61e9de..3b4f0cd 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -346,7 +346,10 @@ static int esp_input_done2(struct sk_buff *skb, int err) pskb_trim(skb, skb->len - alen - padlen - 2); __skb_pull(skb, hlen); - skb_set_transport_header(skb, -ihl); + if (x->props.mode == XFRM_MODE_TUNNEL) + skb_reset_transport_header(skb); + else + skb_set_transport_header(skb, -ihl); err = nexthdr[1]; @@ -499,9 +502,12 @@ static void esp4_err(struct sk_buff *skb, u32 info) if (!x) return; - if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) + if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) { + atomic_inc(&flow_cache_genid); + rt_genid_bump(net); + ipv4_update_pmtu(skb, net, info, 0, 0, IPPROTO_ESP, 0); - else + } else ipv4_redirect(skb, net, 0, 0, IPPROTO_ESP, 0); xfrm_state_put(x); } diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 303012a..e81b1ca 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -963,8 +963,12 @@ static netdev_tx_t ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev ptr--; } if (tunnel->parms.o_flags&GRE_CSUM) { + int offset = skb_transport_offset(skb); + *ptr = 0; - *(__sum16 *)ptr = ip_compute_csum((void *)(iph+1), skb->len - sizeof(struct iphdr)); + *(__sum16 *)ptr = csum_fold(skb_checksum(skb, offset, + skb->len - offset, + 0)); } } diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c index d3ab47e..9a46dae 100644 --- a/net/ipv4/ipcomp.c +++ b/net/ipv4/ipcomp.c @@ -47,9 +47,12 @@ static void ipcomp4_err(struct sk_buff *skb, u32 info) if (!x) return; - if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) + if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) { + atomic_inc(&flow_cache_genid); + rt_genid_bump(net); + ipv4_update_pmtu(skb, net, info, 0, 0, IPPROTO_COMP, 0); - else + } else ipv4_redirect(skb, net, 0, 0, IPPROTO_COMP, 0); xfrm_state_put(x); } diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c index 8f3d054..6f9c072 100644 --- a/net/ipv4/ping.c +++ b/net/ipv4/ping.c @@ -738,6 +738,7 @@ struct proto ping_prot = { .recvmsg = ping_recvmsg, .bind = ping_bind, .backlog_rcv = ping_queue_rcv_skb, + .release_cb = ip4_datagram_release_cb, .hash = ping_v4_hash, .unhash = ping_v4_unhash, .get_port = ping_v4_get_port, diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 73d1e4d..6f08991 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -894,6 +894,7 @@ struct proto raw_prot = { .recvmsg = raw_recvmsg, .bind = raw_bind, .backlog_rcv = raw_rcv_skb, + .release_cb = ip4_datagram_release_cb, .hash = raw_hash_sk, .unhash = raw_unhash_sk, .obj_size = sizeof(struct raw_sock), diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 844a9ef..a0fcc47 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -912,6 +912,9 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu) struct dst_entry *dst = &rt->dst; struct fib_result res; + if (dst_metric_locked(dst, RTAX_MTU)) + return; + if (dst->dev->mtu < mtu) return; @@ -962,7 +965,7 @@ void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu, } EXPORT_SYMBOL_GPL(ipv4_update_pmtu); -void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) +static void __ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) { const struct iphdr *iph = (const struct iphdr *) skb->data; struct flowi4 fl4; @@ -975,6 +978,53 @@ void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) ip_rt_put(rt); } } + +void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) +{ + const struct iphdr *iph = (const struct iphdr *) skb->data; + struct flowi4 fl4; + struct rtable *rt; + struct dst_entry *dst; + bool new = false; + + bh_lock_sock(sk); + rt = (struct rtable *) __sk_dst_get(sk); + + if (sock_owned_by_user(sk) || !rt) { + __ipv4_sk_update_pmtu(skb, sk, mtu); + goto out; + } + + __build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0); + + if (!__sk_dst_check(sk, 0)) { + rt = ip_route_output_flow(sock_net(sk), &fl4, sk); + if (IS_ERR(rt)) + goto out; + + new = true; + } + + __ip_rt_update_pmtu((struct rtable *) rt->dst.path, &fl4, mtu); + + dst = dst_check(&rt->dst, 0); + if (!dst) { + if (new) + dst_release(&rt->dst); + + rt = ip_route_output_flow(sock_net(sk), &fl4, sk); + if (IS_ERR(rt)) + goto out; + + new = true; + } + + if (new) + __sk_dst_set(sk, &rt->dst); + +out: + bh_unlock_sock(sk); +} EXPORT_SYMBOL_GPL(ipv4_sk_update_pmtu); void ipv4_redirect(struct sk_buff *skb, struct net *net, @@ -1120,7 +1170,7 @@ static unsigned int ipv4_mtu(const struct dst_entry *dst) if (!mtu || time_after_eq(jiffies, rt->dst.expires)) mtu = dst_metric_raw(dst, RTAX_MTU); - if (mtu && rt_is_output_route(rt)) + if (mtu) return mtu; mtu = dst->dev->mtu; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 54139fa..70b09ef 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -369,11 +369,10 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info) * We do take care of PMTU discovery (RFC1191) special case : * we can receive locally generated ICMP messages while socket is held. */ - if (sock_owned_by_user(sk) && - type != ICMP_DEST_UNREACH && - code != ICMP_FRAG_NEEDED) - NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS); - + if (sock_owned_by_user(sk)) { + if (!(type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED)) + NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS); + } if (sk->sk_state == TCP_CLOSE) goto out; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 79c8dbe..1f4d405 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1952,6 +1952,7 @@ struct proto udp_prot = { .recvmsg = udp_recvmsg, .sendpage = udp_sendpage, .backlog_rcv = __udp_queue_rcv_skb, + .release_cb = ip4_datagram_release_cb, .hash = udp_lib_hash, .unhash = udp_lib_unhash, .rehash = udp_v4_rehash, |