summaryrefslogtreecommitdiff
path: root/net/openvswitch
AgeCommit message (Collapse)Author
2013-03-15Merge branch 'fixes' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A few different bug fixes, including several for issues with userspace communication that have gone unnoticed up until now. These are intended for net/3.9. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-28hlist: drop the node parameter from iteratorsSasha Levin
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23openvswitch: remove some useless commentsCong Wang
These comments are useless in upstream kernel. Cc: David S. Miller <davem@davemloft.net> Cc: Jesse Gross <jesse@nicira.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-23openvswitch: fix the calculation of checksum for vlan headerCong Wang
In vlan_insert_tag(), we insert a 4-byte VLAN header _after_ mac header: memmove(skb->data, skb->data + VLAN_HLEN, 2 * ETH_ALEN); ... veth->h_vlan_proto = htons(ETH_P_8021Q); ... veth->h_vlan_TCI = htons(vlan_tci); so after it, we should recompute the checksum to include these 4 bytes. skb->data still points to the mac header, therefore VLAN header is at (2 * ETH_ALEN = 12) bytes after it, not (ETH_HLEN = 14) bytes. This can also be observed via tcpdump: 0x0000: ffff ffff ffff 5254 005d 6f6e 8100 000a 0x0010: 0806 0001 0800 0604 0001 5254 005d 6f6e 0x0020: c0a8 026e 0000 0000 0000 c0a8 0282 Similar for __pop_vlan_tci(), the vlan header we remove is the one overwritten in: memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN); Therefore the VLAN_HLEN = 4 bytes after 2 * ETH_ALEN is the part we want to sub from checksum. Cc: David S. Miller <davem@davemloft.net> Cc: Jesse Gross <jesse@nicira.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-23openvswitch: Fix parsing invalid LLC/SNAP ethertypesRich Lane
Before this patch, if an LLC/SNAP packet with OUI 00:00:00 had an ethertype less than 1536 the flow key given to userspace in the upcall would contain the invalid ethertype (for example, 3). If userspace attempted to insert a kernel flow for this key it would be rejected by ovs_flow_from_nlattrs. This patch allows OVS to pass the OFTest pktact.DirectBadLlcPackets. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-23openvswitch: Call genlmsg_end in queue_userspace_packetRich Lane
Without genlmsg_end the upcall message ends (according to nlmsg_len) after the struct ovs_header. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-23openvswitch: Fix ovs_vport_cmd_new return value on successRich Lane
If the pointer does not represent an error then the PTR_ERR macro may still return a nonzero value. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-23openvswitch: Fix ovs_vport_cmd_del return value on successRich Lane
If the pointer does not represent an error then the PTR_ERR macro may still return a nonzero value. The fix is the same as in ovs_vport_cmd_set. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06net: adjust skb_gso_segment() for calling in rx pathCong Wang
skb_gso_segment() is almost always called in tx path, except for openvswitch. It calls this function when it receives the packet and tries to queue it to user-space. In this special case, the ->ip_summed check inside skb_gso_segment() is no longer true, as ->ip_summed value has different meanings on rx path. This patch adjusts skb_gso_segment() so that we can at least avoid such warnings on checksum. Cc: Jesse Gross <jesse@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-22openvswitch: Move LRO check from transmit to receive.Jesse Gross
The check for LRO packets was incorrectly put in the transmit path instead of on receive. Since this check is supposed to protect OVS (and other parts of the system) from packets that it cannot handle it is obviously not useful on egress. Therefore, this commit moves it back to the receive side. The primary problem that this caused is upcalls to userspace tried to segment the packet even though no segmentation information is available. This would later cause NULL pointer dereferences when skb_gso_segment() did nothing. Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-01-10openvswitch: Use FIELD_SIZEOF() in dp_init().YOSHIFUJI Hideaki / 吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-07ethtool: fix drvinfo strings set in driversJiri Pirko
Use strlcpy where possible to ensure the string is \0 terminated. Use always sizeof(string) instead of 32, ETHTOOL_BUSINFO_LEN and custom defines. Use snprintf instead of sprint. Remove unnecessary inits of ->fw_version Remove unnecessary inits of drvinfo struct. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04net: remove unnecessary NET_ADDR_RANDOM "bitclean"Jiri Pirko
NET_ADDR_SET is set in dev_set_mac_address() no need to alter dev->addr_assign_type value in drivers. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28openvswitch: Use RCU callback when detaching netdevices.Jesse Gross
Currently, each time a device is detached from an OVS datapath we call synchronize RCU before freeing associated data structures. However, if a bridge is deleted (which detaches all ports) when many devices are connected then there can be a long delay. This switches to use call_rcu() to group the cost together. Reported-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-26openvswitch: add skb mark matching and set actionAnsis Atteka
This patch adds support for skb mark matching and set action. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-16net: openvswitch: use this_cpu_ptr per-cpu helperShan Wei
just use more faster this_cpu_ptr instead of per_cpu_ptr(p, smp_processor_id()); Signed-off-by: Shan Wei <davidshan@tencent.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-13openvswitch: add ipv6 'set' actionAnsis Atteka
This patch adds ipv6 set action functionality. It allows to change traffic class, flow label, hop-limit, ipv6 source and destination address fields. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-02openvswitch: Process RARP packets with ethertype 0x8035 similar to ARP packets.Mehak Mahajan
With this commit, OVS will match the data in the RARP packets having ethertype 0x8035, in the same way as the data in the ARP packets. Signed-off-by: Mehak Mahajan <mmahajan@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-10-31openvswitch: Store flow key len if ARP opcode is not request or reply.Mehak Mahajan
We currently only extract the ARP payload if the opcode indicates that it is a request or reply. However, we also only set the key length in these situations even though it should still be possible to match on the opcode. There's no real reason to restrict the ARP opcode since all have the same format so this simply removes the check. Signed-off-by: Mehak Mahajan <mmahajan@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-10-30openvswitch: Print device when warning about over MTU packets.Jesse Gross
If an attempt is made to transmit a packet that is over the device's MTU then we log it using the datapath's name. However, it is much more helpful to use the device name instead. Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-18net/openvswitch/vport.c: Remove unecessary semicolonPeter Senna Tschudin
Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10netlink: Rename pid to portid to avoid confusionEric W. Biederman
It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-04Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
2012-09-04openvswitch: Increase maximum number of datapath ports.Pravin B Shelar
Use hash table to store ports of datapath. Allow 64K ports per switch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-04openvswitch: Fix FLOW_BUFSIZE definition.Jesse Gross
The vlan encapsulation fields in the maximum flow defintion were never updated when the representation changed before upstreaming. In theory this could cause a kernel panic when a maximum length flow is used. In practice this has never happened (to my knowledge) because skb allocations are padded out to a cache line so you would need the right combination of flow and packet being sent to userspace. Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-02openvswitch: Fix typoJoe Stringer
Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-08-31openvswitch: using kfree_rcu() to simplify the codeWei Yongjun
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22openvswitch: Add support for network namespaces.Pravin B Shelar
Following patch adds support for network namespace to openvswitch. Since it must release devices when namespaces are destroyed, a side effect of this patch is that the module no longer keeps a refcount but instead cleans up any state when it is unloaded. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-08-06openvswitch: Relax set header validation.Jesse Gross
When installing a flow with an action to set a particular field we need to validate that the packets that are part of the flow actually contain that header. With IP we use zeroed addresses and with TCP/UDP the check is for zeroed ports. This check is overly broad and can catch packets like DHCP requests that have a zero source address in a legitimate header. This changes the check to look for a zeroed protocol number for IP or for both ports be zero for TCP/UDP before considering the header to not exist. Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-07-27Revert "openvswitch: potential NULL deref in sample()"Jesse Gross
This reverts commit 5b3e7e6cb5771bedda51cdb6f715d1da8cd9e644. The problem that the original commit was attempting to fix can never happen in practice because validation is done one a per-flow basis rather than a per-packet basis. Adding additional checks at runtime is unnecessary and inconsistent with the rest of the code. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23openvswitch: potential NULL deref in sample()Dan Carpenter
If there is no OVS_SAMPLE_ATTR_ACTIONS set then "acts_list" is NULL and it leads to a NULL dereference when we call nla_len(acts_list). This is a static checker fix, not something I have seen in testing. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A few bug fixes and small enhancements for net-next/3.6. ... Ansis Atteka (1): openvswitch: Do not send notification if ovs_vport_set_options() failed Ben Pfaff (1): openvswitch: Check gso_type for correct sk_buff in queue_gso_packets(). Jesse Gross (2): openvswitch: Enable retrieval of TCP flags from IPv6 traffic. openvswitch: Reset upper layer protocol info on internal devices. Leo Alterman (1): openvswitch: Fix typo in documentation. Pravin B Shelar (1): openvswitch: Check currect return value from skb_gso_segment() Raju Subramanian (1): openvswitch: Replace Nicira Networks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().Ben Pfaff
At the point where it was used, skb_shinfo(skb)->gso_type referred to a post-GSO sk_buff. Thus, it would always be 0. We want to know the pre-GSO gso_type, so we need to obtain it before segmenting. Before this change, the kernel would pass inconsistent data to userspace: packets for UDP fragments with nonzero offset would be passed along with flow keys that indicate a zero offset (that is, the flow key for "later" fragments claimed to be "first" fragments). This inconsistency tended to confuse Open vSwitch userspace, causing it to log messages about "failed to flow_del" the flows with "later" fragments. Signed-off-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-07-20openvswitch: Check currect return value from skb_gso_segment()Pravin B Shelar
Fix return check typo. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-25openvswitch: Reset upper layer protocol info on internal devices.Jesse Gross
It's possible that packets that are sent on internal devices (from the OVS perspective) have already traversed the local IP stack. After they go through the internal device, they will again travel through the IP stack which may get confused by the presence of existing information in the skb. The problem can be observed when switching between namespaces. This clears out that information to avoid problems but deliberately leaves other metadata alone. This is to provide maximum flexibility in chaining together OVS and other Linux components. Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-05-15net: Convert net_ratelimit uses to net_<level>_ratelimitedJoe Perches
Standardize the net core ratelimited logging functions. Coalesce formats, align arguments. Change a printk then vprintk sequence to use printf extension %pV. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-13openvswitch: checking wrong variable in queue_userspace_packet()Dan Carpenter
"skb" is non-NULL here, for example we dereference it in skb_clone(). The intent was to test "nskb" which was just set. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-08openvswitch: Validation of IPv6 set port action uses IPv4 headerPravin B Shelar
When the kernel validates set TCP/UDP port actions, it looks at the ports in the existing flow to make sure that the L4 header exists. However, these actions always use the IPv4 version of the struct. Following patch fixes this by checking for flow ip protocol first. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-04openvswitch: Replace Nicira Networks.Raju Subramanian
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc. Signed-off-by: Raju Subramanian <rsubramanian@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-04openvswitch: Release rtnl_lock if ovs_vport_cmd_build_info() failed.Ansis Atteka
This patch fixes a possible lock-up bug where rtnl_lock might not get released. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-15net: cleanup unsigned to unsigned intEric Dumazet
Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-04-09openvswitch: Do not send notification if ovs_vport_set_options() failedAnsis Atteka
There is no need to send a notification if ovs_vport_set_options() failed and ovs_vport_cmd_set() did not change anything. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-02openvswitch: Enable retrieval of TCP flags from IPv6 traffic.Jesse Gross
We currently check that a packet is IPv4 and TCP before fetching the TCP flags. This enables fetching from IPv6 packets as well. Reported-by: Michael Mao <mmao@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-02openvswitch: Add length check when retrieving TCP flags.Jesse Gross
When collecting TCP flags we check that the IP header indicates that a TCP header is present but not that the packet is actually long enough to contain the header. This adds a check to prevent reading off the end of the packet. In practice, this is only likely to result in reading of bad data and not a crash due to the presence of struct skb_shared_info at the end of the packet. Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-01openvswitch: Stop using NLA_PUT*().David S. Miller
These macros contain a hidden goto, and are thus extremely error prone and make code hard to audit. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>