Age | Commit message | Author
2012-09-22 | tcp: TCP Fast Open Server - call tcp_validate_incoming() for all packets | Neal Cardwell
A TCP Fast Open (TFO) passive connection must call both tcp_check_req() and tcp_validate_incoming() for all incoming ACKs that are attempting to complete the 3WHS. This is needed to parallel all the action that happens for a non-TFO connection, where for an ACK that is attempting to complete the 3WHS we call both tcp_check_req() and tcp_validate_incoming(). For example, upon receiving the ACK that completes the 3WHS, we need to call tcp_fast_parse_options() and update ts_recent based on the incoming timestamp value in the ACK. One symptom of the problem with the previous code was that for passive TFO connections using TCP timestamps, the outgoing TS ecr values ignored the incoming TS val value on the ACK that completed the 3WHS. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | tcp: TCP Fast Open Server - note timestamps and retransmits for SYNACK RTT | Neal Cardwell
Previously, when using TCP Fast Open a server would return from tcp_check_req() before updating snt_synack based on TCP timestamp echo replies and whether or not we've retransmitted the SYNACK. The result was that (a) for TFO connections using timestamps we used an incorrect baseline SYNACK send time (tcp_time_stamp of SYNACK send instead of rcv_tsecr), and (b) for TFO connections that do not have TCP timestamps but retransmit the SYNACK we took a SYNACK RTT sample when we should not take a sample. This fix merely moves the snt_synack update logic a bit earlier in the function, so that connections using TCP Fast Open will properly do these updates when the ACK for the SYNACK arrives. Moving this snt_synack update logic means that with TCP_DEFER_ACCEPT enabled we do a few instructions of wasted work on each bare ACK, but that seems OK. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
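The baseline selection described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct synack_state`, `struct ack_options` and `update_snt_synack()` are hypothetical stand-ins, not the kernel's own structures or functions. The logic follows the two cases in the message: prefer the echoed timestamp when one is present, and suppress the RTT sample when the SYNACK was retransmitted without timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for kernel state; not the real structs. */
struct synack_state {
    uint32_t snt_synack;   /* baseline send time; 0 means "take no RTT sample" */
    unsigned int retrans;  /* number of SYNACK retransmissions so far */
};

struct ack_options {
    bool saw_tstamp;       /* the ACK carried a TCP timestamp option */
    uint32_t rcv_tsecr;    /* echoed timestamp value from that option */
};

/* Choose the SYNACK RTT baseline when the ACK for the SYNACK arrives. */
void update_snt_synack(struct synack_state *req, const struct ack_options *opt)
{
    if (opt->saw_tstamp && opt->rcv_tsecr)
        req->snt_synack = opt->rcv_tsecr;  /* case (a): the echo reply is the baseline */
    else if (req->retrans)
        req->snt_synack = 0;               /* case (b): ambiguous after a retransmit */
    /* otherwise keep the send time recorded when the SYNACK went out */
}
```

Running this before any early return is what lets a TFO connection reach the update, which is the point of moving the logic earlier in the function.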
2012-09-22 | tcp: TCP Fast Open Server - take SYNACK RTT after completing 3WHS | Neal Cardwell
When taking SYNACK RTT samples for servers using TCP Fast Open, fix the code to ensure that we only call tcp_valid_rtt_meas() after we receive the ACK that completes the 3-way handshake. Previously we were always taking an RTT sample in tcp_v4_syn_recv_sock(). However, for TCP Fast Open connections tcp_v4_conn_req_fastopen() calls tcp_v4_syn_recv_sock() at the time we receive the SYN. So for TFO we must wait until tcp_rcv_state_process() to take the RTT sample. To fix this, we wait until after TFO calls tcp_v4_syn_recv_sock() before we set the snt_synack timestamp, since tcp_synack_rtt_meas() already ensures that we only take a SYNACK RTT sample if snt_synack is non-zero. To be careful, we only take a snt_synack timestamp when a SYNACK transmit or retransmit succeeds. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | tcp: extract code to compute SYNACK RTT | Neal Cardwell
In preparation for adding another spot where we compute the SYNACK RTT, extract this code so that it can be shared. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | ptp: clarify the clock_name sysfs attribute | Richard Cochran
There has been some confusion among PHC driver authors about the intended purpose of the clock_name attribute. This patch expands the documentation in order to clarify how the clock_name field should be understood. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | ptp: link the phc device to its parent device | Richard Cochran
PTP Hardware Clock devices appear as class devices in sysfs. This patch changes the registration API to use the parent device, clarifying the clock's relationship to the underlying device. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | ptp: provide the clock's adjusted frequency | Richard Cochran
If the timex.mode field indicates a query, then we provide the value of the current frequency adjustment. [ Get rid of extraneous empty lines -DaveM ] Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | ptp: remember the adjusted frequency | Richard Cochran
This patch adds a field to the representation of a PTP hardware clock in order to remember the frequency adjustment value dialed by the user. Adding this field will let us answer queries in the manner of adjtimex in a follow on patch. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next | David S. Miller
Jeff Kirsher says: ==================== This series contains updates to igb only. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 | can: sja1000: Add support for listen-only mode and one-shot mode | Andreas Larsson
One-shot mode uses the TCS bit of the status register to discern whether a transmission was successful or not. On a failed transmission, the frame is not echoed back. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-22 | igb: Use dma_unmap_addr and dma_unmap_len defines | Alexander Duyck
This change is meant to improve performance on systems that do not require the DMA unmap calls. On those systems we do not need to make use of the unmap address for Tx or the unmap length so we can drop both thereby reducing the size of the Tx buffer info structure. In addition I have changed the logic to check for unmap length instead of unmap address when checking to see if a buffer needs to be unmapped from DMA use. The reason for this change is that on some platforms it is possible to receive a valid DMA address of 0 from an IOMMU. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
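Why the length is the safer thing to test can be shown with a small userspace sketch. The structure and function names below are hypothetical and are not the driver's; the only assumption is that a mapped buffer always has a non-zero length, while its bus address may legitimately be 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Tx buffer bookkeeping; field names are illustrative only. */
struct tx_buffer {
    uint64_t dma;  /* bus address; 0 can be a valid mapping behind an IOMMU */
    size_t len;    /* mapped length; 0 means nothing is mapped */
};

/* Fragile test: treats a valid mapping at address 0 as "not mapped". */
bool needs_unmap_by_addr(const struct tx_buffer *b)
{
    return b->dma != 0;
}

/* Robust test: a mapped buffer always has a non-zero length. */
bool needs_unmap_by_len(const struct tx_buffer *b)
{
    return b->len != 0;
}
```

For a 1500-byte buffer mapped at address 0, the address test would skip the unmap and leak the mapping, while the length test still requests it.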
2012-09-22 | igb: Simplify how we populate the RSS key | Alexander Duyck
Instead of storing the RSS key as a character array we can simplify the configuration by making it a u32 array. This allows us to just write one value per register without any unnecessary operations to construct the value. This change will produce the same exact key, the only difference is that I translated the u8 array to a u32 array which will be correctly ordered on writes to hardware by the cpu_to_le32 operations that are built into the writel calls. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
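The translation from a u8 array to a u32 array can be sketched as follows. This is a userspace illustration, not the driver code; `rss_key_to_words()` is a hypothetical helper. It packs each group of four key bytes least significant byte first, which is the word value that yields the same byte sequence once the word is written out in little-endian order.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack a byte-array key into 32-bit words, least significant byte first. */
void rss_key_to_words(const uint8_t *key, size_t nbytes, uint32_t *words)
{
    for (size_t i = 0; i + 3 < nbytes; i += 4)
        words[i / 4] = (uint32_t)key[i] |
                       ((uint32_t)key[i + 1] << 8) |
                       ((uint32_t)key[i + 2] << 16) |
                       ((uint32_t)key[i + 3] << 24);
}
```

With the words precomputed like this, each register write needs one array element and no per-write byte shuffling.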
2012-09-22 | igb: Change how we populate the RSS indirection table | Alexander Duyck
This patch cleans up our RSS indirection table configuration so that we generate the same table regardless of CPU endianness. In addition it changes the table setup so that instead of doing a modulo based setup it is instead a divisor based setup. The advantage to this is that we should be able to take the Rx hash and compute the Rx queue with very little CPU overhead if needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
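The difference between a modulo based and a divisor based table can be sketched in userspace C. The table size of 128 entries and all function names here are assumptions made for the illustration; the driver's actual register layout is not reproduced.

```c
#include <stdint.h>

#define RETA_ENTRIES 128u  /* assumed table size for this sketch */

/* Modulo based fill: entry i maps to queue i % n (queues interleave). */
void fill_reta_modulo(uint8_t *reta, unsigned int n)
{
    for (unsigned int i = 0; i < RETA_ENTRIES; i++)
        reta[i] = (uint8_t)(i % n);
}

/* Divisor based fill: entry i maps to queue (i * n) / 128 (contiguous blocks). */
void fill_reta_divisor(uint8_t *reta, unsigned int n)
{
    for (unsigned int i = 0; i < RETA_ENTRIES; i++)
        reta[i] = (uint8_t)((i * n) / RETA_ENTRIES);
}

/* With the divisor layout, the queue follows from the hash by multiply and shift. */
unsigned int queue_from_hash(uint32_t hash, unsigned int n)
{
    return ((hash & (RETA_ENTRIES - 1)) * n) >> 7;
}
```

With four queues, the divisor layout assigns entries 0 to 31 to queue 0, 32 to 63 to queue 1, and so on, so software can recover the queue from the low hash bits without a table lookup or a division.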
2012-09-22 | igb: Change Tx cleanup loop to do/while instead of for | Alexander Duyck
This change makes it so that Tx cleanup is done in a do/while loop instead of a for loop. The main motivation behind this is the fact that we should never be invoked with a budget less than 1 so we can skip checking the budget before processing the first descriptor. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-22 | igb: Remove logic that was doing NUMA pseudo-aware allocations | Alexander Duyck
This change removes the code that was doing the NUMA allocations for the q_vectors, rings, and ring resources. The problem is the logic used assumed that the NUMA nodes were always interleaved and that is not always the case. I hope to add this functionality back in a more controlled manner in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-22 | igb: Fix stats output on i210/i211 parts. | Carolyn Wyborny
Due to a hardware issue, on i210 and i211 parts, the TNCRS statistic provides an invalid value. This patch changes the update stats function to increment the stat only for non-i210/i211 parts. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-22 | igb: Change how we check for pre-existing and assigned VFs | Stefan Assmann
Adapt the pre-existing and assigned VFs code to the ixgbe way introduced in commit 9297127b9cdd8d30c829ef5fd28b7cc0323a7bcd. Instead of searching the enabled VFs we use pci_num_vf to determine enabled VFs. By checking which PF owns an assigned VF, it's possible to decide whether to leave it enabled or not. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Robert Garrett <robertx.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-21 | can: mscan-mpc5xxx: fix return value check in mpc512x_can_get_clock() | Wei Yongjun
In case of error, the function clk_get() returns ERR_PTR() and never returns a NULL pointer. The NULL test in the error handling should be replaced with IS_ERR(). The dpatch engine was used to auto-generate this patch. (https://github.com/weiyj/dpatch) Cc: stable <stable@vger.kernel.org> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
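Why a NULL test misses such errors can be seen from a userspace imitation of the error-pointer convention. The helpers below (`err_ptr()`, `is_err()`, `ptr_err()`) are lowercase stand-ins written for this sketch; they mimic the idea behind the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() but are not the kernel macros themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Imitation of ERR_PTR(): encode a negative errno value in a pointer. */
void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Imitation of IS_ERR(): true for pointers in the top MAX_ERRNO addresses. */
bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Imitation of PTR_ERR(): recover the errno value from the pointer. */
long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}
```

An encoded error is a non-NULL pointer, so `if (!clk)` never fires on failure; only the range check detects it.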
2012-09-21 | can: usb: peak: rename peak_usb dump_mem function | Randy Dunlap
Rename generic-sounding function dump_mem() to pcan_dump_mem() so that it does not conflict with the dump_mem() function in arch/sh/include/asm/kdebug.h. drivers/net/can/usb/peak_usb/pcan_usb_core.c: error: conflicting types for 'dump_mem': => 56:6 drivers/net/can/usb/peak_usb/pcan_usb_core.h: error: conflicting types for 'dump_mem': => 134:6 Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Stephane Grosjean <s.grosjean@peak-system.com> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: Marc Kleine-Budde <mkl@pengutronix.de> [mkl: convert all users of dump_mem(), too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | can: c_can: Adopt pinctrl support | AnilKumar Ch
Adopt pinctrl support in the c_can driver based on the c_can device pointer; the pinctrl driver configures the SoC pins to d_can mode according to the definitions provided in the .dts file. In the device-specific device tree file, 'pinctrl-names = "default";' and 'pinctrl-0 = <&d_can1_pins>;' need to be added to configure pins from the c_can driver. The d_can1_pins node contains the pinmux/config details of the d_can L/H pins. Signed-off-by: AnilKumar Ch <anilkumar@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | can: c_can: Add d_can suspend resume support | AnilKumar Ch
Adds suspend resume support to DCAN driver which enables DCAN power down mode bit (PDR). Then DCAN will ack the local power-down mode by setting PDA bit in STATUS register. Signed-off-by: AnilKumar Ch <anilkumar@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller | AnilKumar Ch
Add Runtime PM support to C_CAN/D_CAN controller. The runtime PM APIs control clocks for C_CAN/D_CAN IP and prevent access to the register of C_CAN/D_CAN IP when clock is turned off. Signed-off-by: AnilKumar Ch <anilkumar@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | can: c_can: Add device tree support to Bosch C_CAN/D_CAN controller | AnilKumar Ch
Add device tree support to C_CAN/D_CAN controller and usage details are added to device tree documentation. Driver was tested on AM335x EVM. Signed-off-by: AnilKumar Ch <anilkumar@ti.com> For the of binding doc: Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | can: c_can: Modify c_can device names | AnilKumar Ch
Modify c_can device names from *_CAN_DEVTYPE to BOSCH_*_CAN to make use of the same names for array indexes in c_can_id_table[] as well as device names. This patch also adds indexes to the c_can_id_table array. Signed-off-by: AnilKumar Ch <anilkumar@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-09-21 | netlink: use <linux/export.h> instead of <linux/module.h> | Pablo Neira Ayuso
Since (9f00d97 netlink: hide struct module parameter in netlink_kernel_create), linux/netlink.h includes linux/module.h because of the use of THIS_MODULE. Use linux/export.h instead, as suggested by Stephen Rothwell, which is significantly smaller and defines THIS_MODULE. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | sunbmac: Remove unused local variable. | David S. Miller
Commit eb716c54b1c71ad28ab20461bff831bd481066c4 ("sunbmac: remove unnecessary setting of skb->dev") caused the local variable 'dev' in bigmac_init_rings to become unused. And now the compiler warns about it. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | team: send port changed when added | Jiri Pirko
On some hardware, the link is not up while the interface is being added to the team. As a result, no event is sent to userspace, which may cause confusion. Fix this bug by sending a port changed event once the port is added to the team. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | ipconfig: add nameserver IPs to kernel-parameter ip= | Christoph Fritz
On small systems (e.g. embedded ones) IP addresses are often configured by bootloaders and get assigned to kernel via parameter "ip=". If set to "ip=dhcp", even nameserver entries from DHCP daemons are handled. These entries exported in /proc/net/pnp are commonly linked by /etc/resolv.conf. To configure nameservers for networks without DHCP, this patch adds option <dns0-ip> and <dns1-ip> to kernel-parameter 'ip='. Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Tested-by: Jan Weitzel <j.weitzel@phytec.de> Signed-off-by: David S. Miller <davem@davemloft.net>
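How the two new fields sit in the parameter value can be sketched with a small userspace splitter. The field order used here (client, server, gateway, netmask, hostname, device, autoconf, dns0, dns1) is an assumption based on the kernel's nfsroot documentation, the addresses are documentation-range placeholders, and `split_ip_param()` is a hypothetical helper rather than the kernel's parser.

```c
#include <stddef.h>
#include <string.h>

/* Assumed order: client, server, gw, netmask, hostname, device, autoconf, dns0, dns1 */
#define IP_FIELDS 9

/* Split a colon-separated value in place, keeping empty fields; returns the field count. */
size_t split_ip_param(char *value, char *fields[IP_FIELDS])
{
    size_t n = 0;
    char *p = value;

    while (n < IP_FIELDS) {
        fields[n++] = p;
        p = strchr(p, ':');
        if (!p)
            break;          /* last field reached */
        *p++ = '\0';        /* terminate this field, move to the next */
    }
    return n;
}
```

For a value such as `192.0.2.10::192.0.2.1:255.255.255.0:board:eth0:off:192.0.2.53:192.0.2.54`, fields 7 and 8 (counting from zero) hold the two nameserver addresses, and the empty server field is preserved rather than skipped.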
2012-09-21 | net: qmi_wwan: adding Huawei E367, ZTE MF683 and Pantech P4200 | Bjørn Mork
One of the modes of Huawei E367 has this QMI/wwan interface: I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=07 Driver=(none) E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms Huawei use subclass and protocol to identify vendor specific functions, so adding a new vendor rule for this combination. The Pantech devices UML290 (106c:3718) and P4200 (106c:3721) use the same subclass to identify the QMI/wwan function. Replace the existing device specific UML290 entries with generic vendor matching, adding support for the Pantech P4200. The ZTE MF683 has 6 vendor specific interfaces, all using ff/ff/ff for cls/sub/prot. Adding a match on interface #5 which is a QMI/wwan interface. Cc: Fangxiaozhi (Franko) <fangxiaozhi@huawei.com> Cc: Thomas Schäfer <tschaefer@t-online.de> Cc: Dan Williams <dcbw@redhat.com> Cc: Shawn J. Goff <shawn7400@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | l2tp: fix compile error when CONFIG_IPV6=m and CONFIG_L2TP=y | Amerigo Wang
When CONFIG_IPV6=m and CONFIG_L2TP=y, I got the following compile error: LD init/built-in.o net/built-in.o: In function `l2tp_xmit_core': l2tp_core.c:(.text+0x147781): undefined reference to `inet6_csk_xmit' net/built-in.o: In function `l2tp_tunnel_create': (.text+0x149067): undefined reference to `udpv6_encap_enable' net/built-in.o: In function `l2tp_ip6_recvmsg': l2tp_ip6.c:(.text+0x14e991): undefined reference to `ipv6_recv_error' net/built-in.o: In function `l2tp_ip6_sendmsg': l2tp_ip6.c:(.text+0x14ec64): undefined reference to `fl6_sock_lookup' l2tp_ip6.c:(.text+0x14ed6b): undefined reference to `datagram_send_ctl' l2tp_ip6.c:(.text+0x14eda0): undefined reference to `fl6_sock_lookup' l2tp_ip6.c:(.text+0x14ede5): undefined reference to `fl6_merge_options' l2tp_ip6.c:(.text+0x14edf4): undefined reference to `ipv6_fixup_options' l2tp_ip6.c:(.text+0x14ee5d): undefined reference to `fl6_update_dst' l2tp_ip6.c:(.text+0x14eea3): undefined reference to `ip6_dst_lookup_flow' l2tp_ip6.c:(.text+0x14eee7): undefined reference to `ip6_dst_hoplimit' l2tp_ip6.c:(.text+0x14ef8b): undefined reference to `ip6_append_data' l2tp_ip6.c:(.text+0x14ef9d): undefined reference to `ip6_flush_pending_frames' l2tp_ip6.c:(.text+0x14efe2): undefined reference to `ip6_push_pending_frames' net/built-in.o: In function `l2tp_ip6_destroy_sock': l2tp_ip6.c:(.text+0x14f090): undefined reference to `ip6_flush_pending_frames' l2tp_ip6.c:(.text+0x14f0a0): undefined reference to `inet6_destroy_sock' net/built-in.o: In function `l2tp_ip6_connect': l2tp_ip6.c:(.text+0x14f14d): undefined reference to `ip6_datagram_connect' net/built-in.o: In function `l2tp_ip6_bind': l2tp_ip6.c:(.text+0x14f4fe): undefined reference to `ipv6_chk_addr' net/built-in.o: In function `l2tp_ip6_init': l2tp_ip6.c:(.init.text+0x73fa): undefined reference to `inet6_add_protocol' l2tp_ip6.c:(.init.text+0x740c): undefined reference to `inet6_register_protosw' net/built-in.o: In function `l2tp_ip6_exit': 
l2tp_ip6.c:(.exit.text+0x1954): undefined reference to `inet6_unregister_protosw' l2tp_ip6.c:(.exit.text+0x1965): undefined reference to `inet6_del_protocol' net/built-in.o:(.rodata+0xf2d0): undefined reference to `inet6_release' net/built-in.o:(.rodata+0xf2d8): undefined reference to `inet6_bind' net/built-in.o:(.rodata+0xf308): undefined reference to `inet6_ioctl' net/built-in.o:(.data+0x1af40): undefined reference to `ipv6_setsockopt' net/built-in.o:(.data+0x1af48): undefined reference to `ipv6_getsockopt' net/built-in.o:(.data+0x1af50): undefined reference to `compat_ipv6_setsockopt' net/built-in.o:(.data+0x1af58): undefined reference to `compat_ipv6_getsockopt' make: *** [vmlinux] Error 1 This is due to l2tp uses symbols from IPV6, so when IPV6 is a module, l2tp is not allowed to be builtin. Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next | David S. Miller
Jeff Kirsher says: ==================== This series contains updates to igb and ixgbevf. v2: updated patch description in 04 patch (ixgbevf: scheduling while atomic in reset hw path) ... Akeem G. Abodunrin (1): igb: Support to enable EEE on all eee_supported devices Alexander Duyck (2): igb: Remove artificial restriction on RQDPC stat reading ixgbevf: Add support for VF API negotiation John Fastabend (1): ixgbevf: scheduling while atomic in reset hw path ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 | net1080: Neaten netdev_dbg use | Joe Perches
Remove unnecessary temporary variable and #ifdef DEBUG block. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | USB: remove dbg() usage in USB networking drivers | Greg Kroah-Hartman
The dbg() USB macro is so old, it predates me. The USB networking drivers are the last hold-out using this macro, and we want to get rid of it, so replace the usage of it with the proper netdev_dbg() or dev_dbg() (depending on the context) calls. Some places we end up using a local variable for the debug call, so also convert the other existing dev_* calls to use it as well, to save tiny amounts of code space. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | tcp: Document use of undefined variable. | Alan Cox
Both tcp_timewait_state_process and tcp_check_req use the same basic construct of struct tcp_options_received tmp_opt; tmp_opt.saw_tstamp = 0; then call tcp_parse_options. However if they are fed a frame containing a TCP_SACK then the code behaviour is undefined because opt_rx->sack_ok is undefined data. This ought to be documented if it is intentional. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | ipv4: Don't add TCP-code in inet_sock_destruct | Christoph Paasch
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: H.K. Jerry Chu <hkchu@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | IB/ipoib: Add rtnl_link_ops support | Or Gerlitz
Add rtnl_link_ops to IPoIB, with the first usage being child device create/delete through them. Child devices are now either legacy ones, created/deleted through the ipoib sysfs entries, or RTNL ones. Adding support for RTNL children involved refactoring ipoib_vlan_add, which is now used by both the sysfs and the link_ops code. Also, added an ndo_uninit entry to support calling unregister_netdevice_queue from the rtnl dellink entry. This required removal of calls to ipoib_dev_cleanup from the driver in flows which use unregister_netdevice, since the networking core will invoke ipoib_uninit which does exactly that. Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next | David S. Miller
Ben Hutchings says: ==================== 1. Extension to PPS/PTP to allow for PHC devices where pulses are subject to a variable but measurable delay. 2. PPS/PTP/PHC support for Solarflare boards with a timestamping peripheral. 3. MTD support for updating the timestamping peripheral on those boards. 4. Fix for potential over-length requests to firmware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 | ixgbevf: scheduling while atomic in reset hw path | John Fastabend
In ixgbevf_reset_hw_vf() msleep is called while holding mbx_lock resulting in a schedule while atomic bug with trace below. This patch uses mdelay instead. BUG: scheduling while atomic: ip/6539/0x00000002 2 locks held by ip/6539: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81419cc3>] rtnl_lock+0x17/0x19 #1: (&(&adapter->mbx_lock)->rlock){+.+...}, at: [<ffffffffa0030855>] ixgbevf_reset+0x30/0xc1 [ixgbevf] Modules linked in: ixgbevf ixgbe mdio libfc scsi_transport_fc 8021q scsi_tgt garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 uinput igb coretemp hwmon crc32c_intel ioatdma i2c_i801 shpchp microcode lpc_ich mfd_core i2c_core joydev dca pcspkr serio_raw pata_acpi ata_generic usb_storage pata_jmicron Pid: 6539, comm: ip Not tainted 3.6.0-rc3jk-net-next+ #104 Call Trace: [<ffffffff81072202>] __schedule_bug+0x6a/0x79 [<ffffffff814bc7e0>] __schedule+0xa2/0x684 [<ffffffff8108f85f>] ? trace_hardirqs_off+0xd/0xf [<ffffffff814bd0c0>] schedule+0x64/0x66 [<ffffffff814bb5e2>] schedule_timeout+0xa6/0xca [<ffffffff810536b9>] ? lock_timer_base+0x52/0x52 [<ffffffff812629e0>] ? __udelay+0x15/0x17 [<ffffffff814bb624>] schedule_timeout_uninterruptible+0x1e/0x20 [<ffffffff810541c0>] msleep+0x1b/0x22 [<ffffffffa002e723>] ixgbevf_reset_hw_vf+0x90/0xe5 [ixgbevf] [<ffffffffa0030860>] ixgbevf_reset+0x3b/0xc1 [ixgbevf] [<ffffffffa0032fba>] ixgbevf_open+0x43/0x43e [ixgbevf] [<ffffffff81409610>] ? dev_set_rx_mode+0x2e/0x33 [<ffffffff8140b0f1>] __dev_open+0xa0/0xe5 [<ffffffff814097ed>] __dev_change_flags+0xbe/0x142 [<ffffffff8140b01c>] dev_change_flags+0x21/0x56 [<ffffffff8141a843>] do_setlink+0x2e2/0x7f4 [<ffffffff81016e36>] ? native_sched_clock+0x37/0x39 [<ffffffff8141b0ac>] rtnl_newlink+0x277/0x4bb [<ffffffff8141aee9>] ? rtnl_newlink+0xb4/0x4bb [<ffffffff812217d1>] ? selinux_capable+0x32/0x3a [<ffffffff8104fb17>] ? ns_capable+0x4f/0x67 [<ffffffff81419cc3>] ? rtnl_lock+0x17/0x19 [<ffffffff81419f28>] rtnetlink_rcv_msg+0x236/0x253 [<ffffffff81419cf2>] ? 
rtnetlink_rcv+0x2d/0x2d [<ffffffff8142fd42>] netlink_rcv_skb+0x43/0x94 [<ffffffff81419ceb>] rtnetlink_rcv+0x26/0x2d [<ffffffff8142faf1>] netlink_unicast+0xee/0x174 [<ffffffff81430327>] netlink_sendmsg+0x26a/0x288 [<ffffffff813fb04f>] ? rcu_read_unlock+0x56/0x67 [<ffffffff813f5e6d>] __sock_sendmsg_nosec+0x58/0x61 [<ffffffff813f81b7>] __sock_sendmsg+0x3d/0x48 [<ffffffff813f8339>] sock_sendmsg+0x6e/0x87 [<ffffffff81107c9f>] ? might_fault+0xa5/0xac [<ffffffff81402a72>] ? copy_from_user+0x2a/0x2c [<ffffffff81402e62>] ? verify_iovec+0x54/0xaa [<ffffffff813f9834>] __sys_sendmsg+0x206/0x288 [<ffffffff810694fa>] ? up_read+0x23/0x3d [<ffffffff811307e5>] ? fcheck_files+0xac/0xea [<ffffffff8113095e>] ? fget_light+0x3a/0xb9 [<ffffffff813f9a2e>] sys_sendmsg+0x42/0x60 [<ffffffff814c5ba9>] system_call_fastpath+0x16/0x1b CC: Eric Dumazet <edumazet@google.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-By: Robert Garrett <robertx.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-20 | ixgbevf: Add support for VF API negotiation | Alexander Duyck
This change makes it so that the VF can support the PF/VF API negotiation protocol. Specifically in this case we are adding support for API 1.0 which will mean that the VF is capable of cleaning up buffers that span multiple descriptors without triggering an error. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-20 | igb: Support to enable EEE on all eee_supported devices | Akeem G. Abodunrin
Current implementation enables EEE on only i350 device. This patch enables EEE on all eee_supported devices. Also, configured LPI clock to keep running before EEE is enabled on i210 and i211 devices. Signed-off-by: Akeem G. Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-09-20 | igb: Remove artificial restriction on RQDPC stat reading | Alexander Duyck
For some reason the reading of the RQDPC register was being artificially limited to 4K. Instead of limiting the value we should read the value and add the full amount. Otherwise this can lead to a misleading number of dropped packets when the actual value is in fact much higher. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
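The effect of the restriction can be sketched with two accumulator functions. Reading "limited to 4K" as a 12-bit mask (`0x0FFF`) is an assumption made for this illustration, and the function names are hypothetical; the driver's actual register access is not shown.

```c
#include <stdint.h>

/* Old behaviour as assumed here: only the low 12 bits of each reading count. */
uint64_t add_drops_masked(uint64_t total, uint32_t rqdpc)
{
    return total + (rqdpc & 0x0FFF);
}

/* New behaviour: the full register value is added to the running total. */
uint64_t add_drops_full(uint64_t total, uint32_t rqdpc)
{
    return total + rqdpc;
}
```

With a reading of 5000 dropped packets, the masked version would add only 904 (5000 is 0x1388, and 0x1388 & 0x0FFF is 0x388), which is the kind of misleadingly low count the message describes.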
2012-09-19 | r8169: use unlimited DMA burst for TX | Michal Schmidt
The r8169 driver currently limits the DMA burst for TX to 1024 bytes. I have a box where this prevents the interface from using the gigabit line to its full potential. This patch solves the problem by setting TX_DMA_BURST to unlimited. The box has an ASRock B75M motherboard with on-board RTL8168evl/8111evl (XID 0c900880). TSO is enabled. I used netperf (TCP_STREAM test) to measure the dependency of TX throughput on MTU. I did it for three different values of TX_DMA_BURST ('5'=512, '6'=1024, '7'=unlimited). This chart shows the results: http://michich.fedorapeople.org/r8169/r8169-effects-of-TX_DMA_BURST.png Interesting points: - With the current DMA burst limit (1024): - at the default MTU=1500 I get only 842 Mbit/s. - when going from small MTU, the performance rises monotonically with increasing MTU only up to a peak at MTU=1076 (908 MBit/s). Then there's a sudden drop to 762 MBit/s from which the throughput rises monotonically again with further MTU increases. - With a smaller DMA burst limit (512): - there's a similar peak at MTU=1076 and another one at MTU=564. - With unlimited DMA burst: - at the default MTU=1500 I get nice 940 Mbit/s. - the throughput rises monotonically with increasing MTU with no strange peaks. Notice that the peaks occur at MTU sizes that are multiples of the DMA burst limit plus 52. Why 52? Because: 20 (IP header) + 20 (TCP header) + 12 (TCP options) = 52 The Realtek-provided r8168 driver (v8.032.00) uses unlimited TX DMA burst too, except for CFG_METHOD_1 where the TX DMA burst is set to 512 bytes. CFG_METHOD_1 appears to be the oldest MAC version of "RTL8168B/8111B", i.e. RTL_GIGA_MAC_VER_11 in r8169. Not sure if this MAC version really needs the smaller burst limit, or if any other versions have similar requirements. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
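The arithmetic behind the observed peaks can be written out as a short C helper. The header sizes are the ones given in the message; `peak_mtu()` is a hypothetical name, and reading the peaks as "TCP payload equals a whole number of DMA bursts" is an interpretation of the reported numbers, not something the message states outright.

```c
/* Header overhead from the commit message: IP + TCP + TCP options = 52 bytes. */
enum { IP_HDR = 20, TCP_HDR = 20, TCP_OPTS = 12 };

/* MTU at which the TCP payload is exactly k DMA bursts long. */
unsigned int peak_mtu(unsigned int burst, unsigned int k)
{
    return k * burst + IP_HDR + TCP_HDR + TCP_OPTS;
}
```

This reproduces the reported values: a 1024-byte burst gives a peak at 1076, and a 512-byte burst gives peaks at 564 and 1076.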
2012-09-19 | ipv6: unify fragment thresh handling code | Amerigo Wang
Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | ipv6: make ip6_frag_nqueues() and ip6_frag_mem() static inline | Amerigo Wang
Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | ipv6: unify conntrack reassembly expire code with standard one | Amerigo Wang
Two years ago, Shan Wei tried to fix this: http://patchwork.ozlabs.org/patch/43905/ The problem is that RFC2460 requires an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment, if the defragmentation times out. " If insufficient fragments are received to complete reassembly of a packet within 60 seconds of the reception of the first-arriving fragment of that packet, reassembly of that packet must be abandoned and all the fragments that have been received for that packet must be discarded. If the first fragment (i.e., the one with a Fragment Offset of zero) has been received, an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment. " As Herbert suggested, we could actually use the standard IPv6 reassembly code which follows RFC2460. With this patch applied, I can see ICMP Time Exceeded sent from the receiver when the sender sent out 3/4 fragmented IPv6 UDP packet. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
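The rule quoted from RFC2460 can be sketched as a small decision function. Everything here is hypothetical userspace code written for illustration (`struct frag_queue`, `on_reassembly_timer()` and the enum are not kernel names); it only encodes the quoted text: after 60 seconds an incomplete packet is discarded, and an ICMP Time Exceeded message is sent only if the fragment with offset zero was received.

```c
#include <stdbool.h>

/* Hypothetical per-packet reassembly state. */
struct frag_queue {
    unsigned int age_seconds;  /* time since the first-arriving fragment */
    bool complete;             /* all fragments have been received */
    bool have_first_fragment;  /* the fragment with offset zero was received */
};

enum expire_action {
    KEEP_WAITING,
    DISCARD_SILENTLY,
    DISCARD_AND_SEND_TIME_EXCEEDED
};

#define REASM_TIMEOUT_SECONDS 60u

/* Decide what to do with a reassembly queue when its timer is checked. */
enum expire_action on_reassembly_timer(const struct frag_queue *q)
{
    if (q->complete || q->age_seconds < REASM_TIMEOUT_SECONDS)
        return KEEP_WAITING;
    return q->have_first_fragment ? DISCARD_AND_SEND_TIME_EXCEEDED
                                  : DISCARD_SILENTLY;
}
```

The 3-of-4 fragments test described in the message corresponds to the third outcome, provided the first fragment was among those sent.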
2012-09-19 | ipv6: add a new namespace for nf_conntrack_reasm | Amerigo Wang
As pointed out by Michal, it is necessary to add a new namespace for the nf_conntrack_reasm code; this prepares for the second patch. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | netpoll: call ->ndo_select_queue() in tx path | Amerigo Wang
In netpoll tx path, we miss the chance of calling ->ndo_select_queue(), thus could cause problems when bonding is involved. This patch makes dev_pick_tx() extern (and rename it to netdev_pick_tx()) to let netpoll call it in netpoll_send_skb_on_dev(). Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | netdev: make address const in device address management | stephen hemminger
The internal functions for add/deleting addresses don't change their argument. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | i825xx: znet: fix compiler warnings when building a 64-bit kernel | Mika Westerberg
When building 64-bit kernel with this driver we get following warnings from the compiler: drivers/net/ethernet/i825xx/znet.c: In function ‘hardware_init’: drivers/net/ethernet/i825xx/znet.c:863:29: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] drivers/net/ethernet/i825xx/znet.c:870:29: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] Fix these by calling isa_virt_to_bus() before passing the pointers to set_dma_addr(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 | gre: add GSO support | Eric Dumazet
Add GSO support to GRE tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>