summaryrefslogtreecommitdiff
path: root/net/bridge/br_netfilter_hooks.c
AgeCommit message (Collapse)Author
2017-03-22bridge: drop netfilter fake rtable unconditionallyFlorian Westphal
[ Upstream commit a13b2082ece95247779b9995c4e91b4246bed023 ] Andreas reports kernel oops during rmmod of the br_netfilter module. Hannes debugged the oops down to a NULL rt6info->rt6i_indev. Problem is that br_netfilter has the nasty concept of adding a fake rtable to skb->dst; this happens in a br_netfilter prerouting hook. A second hook (in bridge LOCAL_IN) is supposed to remove these again before the skb is handed up the stack. However, on module unload hooks get unregistered which means an skb could traverse the prerouting hook that attaches the fake_rtable, while the 'fake rtable remove' hook gets removed from the hooklist immediately after. Fixes: 34666d467cbf1e2e3c7 ("netfilter: bridge: move br_netfilter out of the core") Reported-by: Andreas Karis <akaris@redhat.com> Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19bridge: netfilter: Fix dropping packets that moving through bridge interfaceArtur Molchanov
commit 14221cc45caad2fcab3a8543234bb7eda9b540d5 upstream. Problem: br_nf_pre_routing_finish() calls itself instead of br_nf_pre_routing_finish_bridge(). Due to this bug reverse path filter drops packets that go through bridge interface. User impact: Local docker containers with bridge network can not communicate with each other. Fixes: c5136b15ea36 ("netfilter: bridge: add and use br_nf_hook_thresh") Signed-off-by: Artur Molchanov <artur.molchanov@synesis.ru> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-25netfilter: replace list_head with single linked listAaron Conole
The netfilter hook list never uses the prev pointer, and so can be trimmed to be a simple singly-linked list. In addition to having a more light weight structure for hook traversal, struct net becomes 5568 bytes (down from 6400) and struct net_device becomes 2176 bytes (down from 2240). Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-24netfilter: bridge: add and use br_nf_hook_threshFlorian Westphal
This replaces the last uses of NF_HOOK_THRESH(). Followup patch will remove it and rename nf_hook_thresh. The reason is that inet (non-bridge) netfilter no longer invokes the hooks from hooks, so we do no longer need the thresh value to skip hooks with a lower priority. The bridge netfilter however may need to do this. br_nf_hook_thresh is a wrapper that is supposed to do this, i.e. only call hooks with a priority that exceeds NF_BR_PRI_BRNF. It's used only in the recursion cases of br_netfilter. It invokes nf_hook_slow while holding an rcu read-side critical section to make a future cleanup simpler. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-30ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_outputShmulik Ladkani
ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it calls ip_sk_use_pmtu which casts sk as an inet_sk). However, in the case of UDP tunneling, the skb->sk is not necessarily an inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from tun/tap). OTOH, the sk passed as an argument throughout IP stack's output path is the one which is of PMTU interest: - In case of local sockets, sk is same as skb->sk; - In case of a udp tunnel, sk is the tunneling socket. Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu. This augments 7026b1ddb6 'netfilter: Pass socket pointer down through okfn().' Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28net: rename IP_INC_STATS_BH()Eric Dumazet
Rename IP_INC_STATS_BH() to __IP_INC_STATS(), to better express this is used in non preemptible context. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02netfilter: bridge: register hooks only when bridge interface is addedFlorian Westphal
This moves bridge hooks to a register-when-needed scheme. We use a device notifier to register the 'call-iptables' netfilter hooks only once a bridge gets added. This means that if the initial namespace uses a bridge, newly created network namespaces no longer get the PRE_ROUTING ipt_sabotage hook. It will registered in that network namespace once a bridge is created within that namespace. A few modules still use global hooks: - conntrack - bridge PF_BRIDGE hooks - IPVS - CLUSTER match (deprecated) - SYNPROXY As long as these modules are not loaded/used, a new network namespace has empty hook list and NF_HOOK() will boil down to single list_empty test even if initial namespace does stateless packet filtering. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-17Merge branch 'master' of ↵Pablo Neira Ayuso
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next This merge resolves conflicts with 75aec9df3a78 ("bridge: Remove br_nf_push_frag_xmit_sk") as part of Eric Biederman's effort to improve netns support in the network stack that reached upstream via David's net-next tree. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Conflicts: net/bridge/br_netfilter_hooks.c
2015-10-16netfilter: remove hook owner refcountingFlorian Westphal
since commit 8405a8fff3f8 ("netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook") all pending queued entries are discarded. So we can simply remove all of the owner handling -- when module is removed it also needs to unregister all its hooks. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12netfilter: bridge: avoid unused label warningArnd Bergmann
With the ARM mini2440_defconfig, the bridge netfilter code gets built with both CONFIG_NF_DEFRAG_IPV4 and CONFIG_NF_DEFRAG_IPV6 disabled, which leads to a harmless gcc warning: net/bridge/br_netfilter_hooks.c: In function 'br_nf_dev_queue_xmit': net/bridge/br_netfilter_hooks.c:792:2: warning: label 'drop' defined but not used [-Wunused-label] This gets rid of the warning by cleaning up the code to avoid the respective #ifdefs causing this problem, and replacing them with if(IS_ENABLED()) checks. I have verified that the resulting object code is unchanged, and an additional advantage is that we now get compile coverage of the unused functions in more configurations. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: dd302b59bde0 ("netfilter: bridge: don't leak skb in error paths") Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-30bridge: Remove br_nf_push_frag_xmit_skEric W. Biederman
Now that this compatability function no longer has any callers remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-09-30ipv6: Pass struct net through ip6_fragmentEric W. Biederman
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2015-09-30ipv4: Pass struct net through ip_fragmentEric W. Biederman
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-09-29bridge: Pass net into br_validate_ipv4 and br_validate_ipv6Eric W. Biederman
The network namespace is easiliy available in state->net so use it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: Pass priv instead of nf_hook_ops to netfilter hooksEric W. Biederman
Only pass the void *priv parameter out of the nf_hook_ops. That is all any of the functions are interested now, and by limiting what is passed it becomes simpler to change implementation details. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: Pass net into okfnEric W. Biederman
This is immediately motivated by the bridge code that chains functions that call into netfilter. Without passing net into the okfns the bridge code would need to guess about the best expression for the network namespace to process packets in. As net is frequently one of the first things computed in continuation functions after netfilter has done it's job passing in the desired network namespace is in many cases a code simplification. To support this change the function dst_output_okfn is introduced to simplify passing dst_output as an okfn. For the moment dst_output_okfn just silently drops the struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18netfilter: Pass struct net into the netfilter hooksEric W. Biederman
Pass a network namespace parameter into the netfilter hooks. At the call site of the netfilter hooks the path a packet is taking through the network stack is well known which allows the network namespace to be easily and reliabily. This allows the replacement of magic code like "dev_net(state->in?:state->out)" that appears at the start of most netfilter hooks with "state->net". In almost all cases the network namespace passed in is derived from the first network device passed in, guaranteeing those paths will not see any changes in practice. The exceptions are: xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm) ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp) ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp) ipv4/raw.c:raw_send_hdrinc() sock_net(sk) ipv6/ip6_output.c:ip6_xmit() sock_net(sk) ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev) ipv6/raw.c:raw6_send_hdrinc() sock_net(sk) br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev In all cases these exceptions seem to be a better expression for the network namespace the packet is being processed in then the historic "dev_net(in?in:out)". I am documenting them in case something odd pops up and someone starts trying to track down what happened. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18bridge: Cache net in br_nf_pre_routing_finishEric W. Biederman
This is prep work for passing net to the netfilter hooks. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18bridge: Pass net into br_nf_push_frag_xmitEric W. Biederman
When struct net starts being passed through the ipv4 and ipv6 fragment routines br_nf_push_frag_xmit will need to take a net parameter. Prepare br_nf_push_frag_xmit before that is needed and introduce br_nf_push_frag_xmit_sk for the call sites that still need the old calling conventions. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18bridge: Pass net into br_nf_ip_fragmentEric W. Biederman
This is a prep work for passing struct net through ip_do_fragment and later the netfilter okfn. Doing this independently makes the later code changes clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30netfilter: bridge: do not initialize statics to 0 or NULLBernhard Thaler
Fix checkpatch.pl "ERROR: do not initialise statics to 0 or NULL" for all statics explicitly initialized to 0. Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-30netfilter: bridge: reduce nf_bridge_info to 32 bytes againFlorian Westphal
We can use union for most of the temporary cruft (original ipv4/ipv6 address, source mac, physoutdev) since they're used during different stages of br netfilter traversal. Also get rid of the last two ->mask users. Shrinks struct from 48 to 32 on 64bit arch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: bridge: don't leak skb in error pathsFlorian Westphal
br_nf_dev_queue_xmit must free skb in its error path. NF_DROP is misleading -- its an okfn, not a netfilter hook. Fixes: 462fb2af9788a ("bridge : Sanitize skb before it enters the IP stack") Fixes: efb6de9b4ba00 ("netfilter: bridge: forward IPv6 fragmented packets") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: bridge: fix CONFIG_NF_DEFRAG_IPV4/6 related warnings/errorsBernhard Thaler
br_nf_ip_fragment() is not needed when neither CONFIG_NF_DEFRAG_IPV4 nor CONFIG_NF_DEFRAG_IPV6 is set. struct brnf_frag_data must be available if either CONFIG_NF_DEFRAG_IPV4 or CONFIG_NF_DEFRAG_IPV6 is set. Fixes: efb6de9b4ba0 ("netfilter: bridge: forward IPv6 fragmented packets") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18netfilter: bridge: split ipv6 code into separated filePablo Neira Ayuso
Resolve compilation breakage when CONFIG_IPV6 is not set by moving the IPv6 code into a separated br_netfilter_ipv6.c file. Fixes: efb6de9b4ba0 ("netfilter: bridge: forward IPv6 fragmented packets") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.cPablo Neira Ayuso
To prepare separation of the IPv6 code into different file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>