summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2015-12-24IB/mlx5: Advertise atomic capabilities in query deviceEran Ben Elisha
In order to ensure IB spec atomic correctness in atomic operations, if HW is configured to host endianness, advertise IB_ATOMIC_HCA. if not, advertise IB_ATOMIC_NONE. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24net/mlx5_core: Add setting ATOMIC endian modeEran Ben Elisha
HW is capable of 2 requestor endianness modes for standard 8 Bytes atomic: BE (0x0) and host endianness (0x1). Read the supported modes from hca atomic capabilities and configure HW to host endianness mode if supported. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/mlx5: Add driver cross-channel supportLeon Romanovsky
Add support of cross-channel functionality to mlx5 driver. This includes ability to ignore overrun for CQ which intended for cross-channel, export device capability and configure the QP to be sync master/slave queues. The cross-channel enabled QP supports combination of three possible properties: * WQE processing on the receive queue of this QP * WQE processing on the send queue of this QP * WQE are supported on the send queue Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/core: Add cross-channel supportLeon Romanovsky
The cross-channel feature allows to execute WQEs that involve synchronization of I/O operations’ on different QPs. This capability enables to program complex flows with a single function call, hereby significantly reducing overhead associated with I/O processing. Cross-channel operations support is indicated by HCA capability information. The queue pairs can be configured to work as a “sync master queue” or “sync slave queues”. The added flags are: 1. Device capability flag IB_DEVICE_CROSS_CHANNEL for the devices that can perform cross-channel operations. 2. CQ property flag IB_CQ_FLAGS_IGNORE_OVERRUN to disable CQ overrun check. This check is useless in cross-channel scenario. 3. QP property flags to indicate if queues are slave or master: * IB_QP_CREATE_MANAGED_SEND indicates that posted send work requests will not be executed immediately and requires enabling. * IB_QP_CREATE_MANAGED_RECV indicates that posted receive work requests will not be executed immediately and requires enabling. * IB_QP_CREATE_CROSS_CHANNEL declares the QP to work in cross-channel mode. If IB_QP_CREATE_MANAGED_SEND and IB_QP_CREATE_MANAGED_RECV are not provided, this QP will be sync master queue, else it will be sync slave. Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/core: Align coding style of ib_device_cap_flags structureLeon Romanovsky
Modify enum ib_device_cap_flags such that other patches which add new enum values pass strict checkpatch.pl checks. Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/mlx5: Add hca_core_clock_offset to udata in init_ucontextMatan Barak
Pass hca_core_clock_offset to user-space is mandatory in order to let the user-space read the free-running clock register from the right offset in the memory mapped page. Passing this value is done by changing the vendor's command and response of init_ucontext to be in extensible form. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/mlx5: Add support for hca_core_clock and timestamp_maskMatan Barak
Reporting the hca_core_clock (in kHZ) and the timestamp_mask in query_device extended verb. timestamp_mask is used by users in order to know what is the valid range of the raw timestamps, while hca_core_clock reports the clock frequency that is used for timestamps. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24IB/core: Add ib_is_udata_clearedMatan Barak
Extending core and vendor verb commands require us to check that the unknown part of the user's given command is all zeros. Adding ib_is_udata_cleared in order to do so. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Display extended counter set if availableChristoph Lameter
Check if the extended counters are available and if so create the proper extended and additional counters. Signed-off-by: Christoph Lameter <cl@linux.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove the write-only usecnt field from struct ib_mrChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove the struct ib_phys_buf definitionChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove in-kernel support for memory windowsChristoph Hellwig
Remove the unused ib_allow_mw and ib_bind_mw functions, remove the unused IB_WR_BIND_MW and IB_WC_BIND_MW opcodes and move ib_dealloc_mw into the uverbs module. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove support for phys MRsChristoph Hellwig
We have stopped using phys MRs in the kernel a while ago, so let's remove all the cruft used to implement them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-By: Devesh Sharma<devesh.sharma@avagotech.com> [ocrdma] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove ib_query_mrChristoph Hellwig
This functionality has no users and was only supported by the staged out EHCA driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: start documenting device capabilitiesChristoph Hellwig
Just IB_DEVICE_LOCAL_DMA_LKEY and IB_DEVICE_MEM_MGT_EXTENSIONS for now as I'm most familar with those. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/mlx5: Add RoCE fields to Address VectorAchiad Shochat
Set the address handle and QP address path fields according to the link layer type (IB/Eth). Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/mlx5: Support IB device's callbacks for adding/deleting GIDsAchiad Shochat
These callbacks write into the mlx5 RoCE address table. Upon del_gid we write a zero'd GID. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/mlx5: Set network_hdr_type upon RoCE responder completionAchiad Shochat
When handling a responder completion, if the link layer is Ethernet, set the work completion network_hdr_type field according to CQE's info and the IB_WC_WITH_NETWORK_HDR_TYPE flag. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/mlx5: Extend query_device/port to support RoCEAchiad Shochat
Using the vport access functions to retrieve the Ethernet specific information and return this information in ib_query_device and ib_query_port. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23net/mlx5_core: Introduce access functions to query vport RoCE fieldsAchiad Shochat
Introduce access functions to query NIC vport system_image_guid, node_guid and qkey_viol_cntr. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23net/mlx5_core: Introduce access functions to enable/disable RoCEAchiad Shochat
A mlx5 Ethernet port must be explicitly enabled for RoCE. When RoCE is not enabled on the port, the NIC will refuse to create QPs attached to it and incoming RoCE packets will be considered by the NIC as plain Ethernet packets. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/cma: Join and leave multicast groups with IGMPMoni Shoua
Since RoCEv2 is a protocol over IP header it is required to send IGMP join and leave requests to the network when joining and leaving multicast groups. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Initialize UD header structure with IP and UDP headersMoni Shoua
ib_ud_header_init() is used to format InfiniBand headers in a buffer up to (but not with) BTH. For RoCE UDP ENCAP it is required that this function would be able to build also IP and UDP headers. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Validate route when we init ahMatan Barak
In order to make sure API users don't try to use SGIDs which don't conform to the routing table, validate the route before searching the RoCE GID table. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add rdma_network_type to wcSomnath Kotur
Providers should tell IB core the wc's network type. This is used in order to search for the proper GID in the GID table. When using HCAs that can't provide this info, IB core tries to deep examine the packet and extract the GID type by itself. We choose sgid_index and type from all the matching entries in RDMA-CM based on hint from the IP stack and we set hop_limit for the IP packet based on above hint from IP stack. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Somnath Kotur <Somnath.Kotur@Avagotech.Com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add ROCE_UDP_ENCAP (RoCE V2) typeMatan Barak
Adding RoCE v2 GID type and port type. Vendors which support this type will get their GID table populated with RoCE v2 GIDs automatically. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add gid_type to gid attributeMatan Barak
In order to support multiple GID types, we need to store the gid_type with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT GID table entries shall have a "GID type" attribute that denotes the L3 Address type". The currently supported GID is IB_GID_TYPE_IB which is also RoCE v1 GID type. This implies that gid_type should be added to roce_gid_table meta-data. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22Merge branches '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5Doug Ledford
Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/iser/iser_verbs.c
2015-12-22IB/core: Remove ib_query_deviceOr Gerlitz
The copy of the attributes present on the device is now used by all consumers except for uverbs in case of serving user-space query, where dev->query_device is called. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/core: Save the device attributes on the device structureIra Weiny
This way both the IB core and upper level drivers can access these cached device attributes rather than querying or caching them on their own. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-19Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes this time, two in SES picked up by KASAN for various types of buffer overrun. The first is a USB array which returns page 8 whatever is asked for and causes us to overrun with incorrect data format assumptions and the second is an invalid iteration of page 10 (the additional information page). The final fix is a reversion of a NULL deref fix which caused suspend/resume not to be called in pairs leading to incorrect device operation (Jens has queued a more proper fix for the problem in block)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: ses: fix additional element traversal bug Revert "SCSI: Fix NULL pointer dereference in runtime PM" ses: Fix problems with simple enclosures
2015-12-18Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Three patches" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/linux/mmdebug.h: should include linux/bug.h mm/zswap: change incorrect strncmp use to strcmp proc: fix -ESRCH error when writing to /proc/$pid/coredump_filter
2015-12-18include/linux/mmdebug.h: should include linux/bug.hJames Morse
mmdebug.h uses BUILD_BUG_ON_INVALID(), assuming someone else included linux/bug.h. Include it ourselves. This saves build-failures such as: arch/arm64/include/asm/pgtable.h: In function 'set_pte_at': arch/arm64/include/asm/pgtable.h:281:3: error: implicit declaration of function 'BUILD_BUG_ON_INVALID' [-Werror=implicit-function-declaration] VM_WARN_ONCE(!pte_young(pte), Fixes: 02602a18c32d7 ("bug: completely remove code generated by disabled VM_BUG_ON()") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-18Merge tag 'for-linus-4.4-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - XSA-155 security fixes to backend drivers. - XSA-157 security fixes to pciback. * tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-pciback: fix up cleanup path when alloc fails xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set. xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled. xen/pciback: Do not install an IRQ handler for MSI interrupts. xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled xen/pciback: Save xen_pci_op commands before processing it xen-scsiback: safely copy requests xen-blkback: read from indirect descriptors only once xen-blkback: only read request operation from shared ring once xen-netback: use RING_COPY_REQUEST() throughout xen-netback: don't use last request to determine minimum Tx credit xen: Add RING_COPY_REQUEST() xen/x86/pvh: Use HVM's flush_tlb_others op xen: Resume PMU from non-atomic context xen/events/fifo: Consume unprocessed events when a CPU dies
2015-12-18xen: Add RING_COPY_REQUEST()David Vrabel
Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly (i.e., by not considering that the other end may alter the data in the shared ring while it is being inspected). Safe usage of a request generally requires taking a local copy. Provide a RING_COPY_REQUEST() macro to use instead of RING_GET_REQUEST() and an open-coded memcpy(). This takes care of ensuring that the copy is done correctly regardless of any possible compiler optimizations. Use a volatile source to prevent the compiler from reordering or omitting the copy. This is part of XSA155. CC: stable@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of people reported this... From Arnd Bergmann. 2) Don't init mutex twice in i40e driver, from Jesse Brandeburg. 3) Fix spurious EBUSY in rhashtable, from Herbert Xu. 4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas. 5) Fix race with work structure access in pppoe driver causing corruptions, from Guillaume Nault. 6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb() actually succeeded or not, from Sergei Shtylyov. 7) Don't lose flags when settifn IFA_F_OPTIMISTIC in ipv6 code, from Bjørn Mork. 8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc. 9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo Leitner. 10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven. 11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling properly as well, from Jiri Benc. 12) Handle request sockets properly in xfrm layer, from Eric Dumazet. 13) Double stats update in ipv6 geneve transmit path, fix from Pravin B Shelar. 14) sk->sk_policy[] needs RCU protection, and as a result xfrm_policy_destroy() needs to free policies using an RCU grace period, from Eric Dumazet. 15) SCTP needs to clone ipv6 tx options in order to avoid use after free, from Eric Dumazet. 16) Missing kbuild export if ila.h, from Stephen Hemminger. 17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from Tobias Klauser. 18) Validate protocol value range in ->create() methods, from Hannes Frederic Sowa. 19) Fix early socket demux races that result in illegal dst reuse, from Eric Dumazet. 20) Validate socket address length in pptp code, from WANG Cong. 21) skb_reorder_vlan_header() uses incorrect offset and can corrupt packets, from Vlad Yasevich. 22) Fix memory leaks in nl80211 registry code, from Ola Olsson. 23) Timeout loop count handing fixes in mISDN, xgbe, qlge, sfc, and qlcnic. From Dan Carpenter. 24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for example, AF_ALG will interpret it as an async call. From Tadeusz Struk. 25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from Eric Dumazet. 26) rhashtable enforces the minimum table size not early enough, breaking how we calculate the per-cpu lock allocations. From Herbert Xu. 27) Fix FCC port lockup in 82xx driver, from Martin Roth. 28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa. 29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and sock_setsockopt() wrt. timestamp handling. From WANG Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits) net: check both type and procotol for tcp sockets drivers: net: xgene: fix Tx flow control tcp: restore fastopen with no data in SYN packet af_unix: Revert 'lock_interruptible' in stream receive code fou: clean up socket with kfree_rcu 82xx: FCC: Fixing a bug causing to FCC port lock-up gianfar: Don't enable RX Filer if not supported net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration rhashtable: Fix walker list corruption rhashtable: Enforce minimum size on initial hash table inet: tcp: fix inetpeer_set_addr_v4() ipv6: automatically enable stable privacy mode if stable_secret set net: fix uninitialized variable issue bluetooth: Validate socket address length in sco_sock_bind(). net_sched: make qdisc_tree_decrease_qlen() work for non mq ser_gigaset: remove unnecessary kfree() calls from release method ser_gigaset: fix deallocation of platform device structure ser_gigaset: turn nonsense checks into WARN_ON ser_gigaset: fix up NULL checks qlcnic: fix a timeout loop ...
2015-12-16net: fix warnings in 'make htmldocs' by moving macro definition out of field ↵Hannes Frederic Sowa
declaration Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16inet: tcp: fix inetpeer_set_addr_v4()Eric Dumazet
David Ahern added a vif field in the a4 part of inetpeer_addr struct. This broke IPv4 TCP fast open client side and more generally tcp metrics cache, because inetpeer_addr_cmp() is now comparing two u32 instead of one. inetpeer_set_addr_v4() needs to properly init vif field, otherwise the comparison result depends on uninitialized data. Fixes: 192132b9a034 ("net: Add support for VRFs to inetpeer cache") Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch 'rdma-cq.2' of git://git.infradead.org/users/hch/rdma into ↵Doug Ledford
4.5/rdma-cq Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/srp/ib_srp.c - Conflicts with changes in ib_srp.c introduced during 4.4-rc updates
2015-12-15Merge tag 'dmaengine-fix-4.4-rc6' of ↵Linus Torvalds
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "This has fixes spread thru driver, notably among them: - edma fixes for recent edma DT changes which went into 4.4 - odd fixes for at_hdmac - minor fixes on bc dma and mic dma" * tag 'dmaengine-fix-4.4-rc6' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: at_xdmac: fix at_xdmac_prep_dma_memcpy() dmaengine: edma: DT: Change reserved slot array from 16bit to 32bit type dmaengine: edma: DT: Change memcpy channel array from 16bit to 32bit type dmaengine: mic_x100: add missing spin_unlock dmaengine: bcm2835-dma: Convert to use DMA pool dmaengine: at_xdmac: fix bad behavior in interleaved mode dmaengine: at_xdmac: fix false condition for memset_sg transfers dmaengine: at_xdmac: fix macro typo
2015-12-15net: fix IP early demux racesEric Dumazet
David Wilder reported crashes caused by dst reuse. <quote David> I am seeing a crash on a distro V4.2.3 kernel caused by a double release of a dst_entry. In ipv4_dst_destroy() the call to list_empty() finds a poisoned next pointer, indicating the dst_entry has already been removed from the list and freed. The crash occurs 18 to 24 hours into a run of a network stress exerciser. </quote> Thanks to his detailed report and analysis, we were able to understand the core issue. IP early demux can associate a dst to skb, after a lookup in TCP/UDP sockets. When socket cache is not properly set, we want to store into sk->sk_dst_cache the dst for future IP early demux lookups, by acquiring a stable refcount on the dst. Problem is this acquisition is simply using an atomic_inc(), which works well, unless the dst was queued for destruction from dst_release() noticing dst refcount went to zero, if DST_NOCACHE was set on dst. We need to make sure current refcount is not zero before incrementing it, or risk double free as David reported. This patch, being a stable candidate, adds two new helpers, and use them only from IP early demux problematic paths. It might be possible to merge in net-next skb_dst_force() and skb_dst_force_safe(), but I prefer having the smallest patch for stable kernels : Maybe some skb_dst_force() callers do not expect skb->dst can suddenly be cleared. Can probably be backported back to linux-3.6 kernels Reported-by: David J. Wilder <dwilder@us.ibm.com> Tested-by: David J. Wilder <dwilder@us.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14net: add validation for the socket syscall protocol argumentHannes Frederic Sowa
郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14openvswitch: fix trivial comment typoPaolo Abeni
The commit 33db4125ec74 ("openvswitch: Rename LABEL->LABELS") left over an old OVS_CT_ATTR_LABEL instance, fix it. Fixes: 33db4125ec74 ("openvswitch: Rename LABEL->LABELS") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains Netfilter fixes for you net tree, specifically for nf_tables and nfnetlink_queue, they are: 1) Avoid a compilation warning in nfnetlink_queue that was introduced in the previous merge window with the simplification of the conntrack integration, from Arnd Bergmann. 2) nfnetlink_queue is leaking the pernet subsystem registration from a failure path, patch from Nikolay Borisov. 3) Pass down netns pointer to batch callback in nfnetlink, this is the largest patch and it is not a bugfix but it is a dependency to resolve a splat in the correct way. 4) Fix a splat due to incorrect socket memory accounting with nfnetlink skbuff clones. 5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6. 6) Traverse the nftables commit list in reverse order from the commit path, otherwise we crash when the user applies an incremental update via 'nft -f' that deletes an object that was just introduced in this batch, from Xin Long. Regarding the compilation warning fix, many people have sent us (and keep sending us) patches to address this, that's why I'm including this batch even if this is not critical. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-13sched/wait: Fix the signal handling fixPeter Zijlstra
Jan Stancek reported that I wrecked things for him by fixing things for Vladimir :/ His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which should not be possible, however my previous patch made this possible by unconditionally checking signal_pending(). We cannot use current->state as was done previously, because the instruction after the store to that variable it can be changed. We must instead pass the initial state along and use that. Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Chris Mason <clm@fb.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Paul Turner <pjt@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: tglx@linutronix.de Cc: Oleg Nesterov <oleg@redhat.com> Cc: hpa@zytor.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-13Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixlets from Thomas Gleixner: "Two trivial fixes which add missing header fileas and forward declarations so the code will compile even when the magic include chains are different" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Add missing include for barrier.h irqchip/gic-v3: Add missing struct device_node declaration
2015-12-13Merge tag 'usb-4.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are a number of small USB fixes for 4.4-rc5. All of them have been in linux-next. The majority are gadget and phy issues, with a few new quirks and device ids added as well" * tag 'usb-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (32 commits) USB: add quirk for devices with broken LPM xhci: fix usb2 resume timing and races. usb: musb: fail with error when no DMA controller set usb: gadget: uvc: fix permissions of configfs attributes usb: musb: core: Fix pm runtime for deferred probe usb: phy: msm: fix a possible NULL dereference USB: host: ohci-at91: fix a crash in ohci_hcd_at91_overcurrent_irq usb: Quiet down false peer failure messages usb: xhci: fix config fail of FS hub behind a HS hub with MTT xhci: Fix memory leak in xhci_pme_acpi_rtd3_enable() usb: Use the USB_SS_MULT() macro to decode burst multiplier for log message USB: whci-hcd: add check for dma mapping error usb: core : hub: Fix BOS 'NULL pointer' kernel panic USB: quirks: Apply ALWAYS_POLL to all ELAN devices usb-storage: Fix scsi-sd failure "Invalid field in cdb" for USB adapter JMicron USB: quirks: Fix another ELAN touchscreen usb: dwc3: gadget: don't prestart interrupt endpoints USB: serial: Another Infineon flash loader USB ID USB: cdc_acm: Ignore Infineon Flash Loader utility USB: cp210x: Remove CP2110 ID from compatibility list ...
2015-12-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: MIPS: fix DMA contiguous allocation sh64: fix __NR_fgetxattr ocfs2: fix SGID not inherited issue mm/oom_kill.c: avoid attempting to kill init sharing same memory drivers/base/memory.c: prohibit offlining of memory blocks with missing sections tmpfs: fix shmem_evict_inode() warnings on i_blocks mm/hugetlb.c: fix resv map memory leak for placeholder entries mm: hugetlb: call huge_pte_alloc() only if ptep is null kernel: remove stop_machine() Kconfig dependency mm: kmemleak: mark kmemleak_init prototype as __init mm: fix kerneldoc on mem_cgroup_replace_page osd fs: __r4w_get_page rely on PageUptodate for uptodate MAINTAINERS: make Vladimir co-maintainer of the memory controller mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress mm: fix swapped Movable and Reclaimable in /proc/pagetypeinfo memcg: fix memory.high target mm: hugetlb: fix hugepage memory leak caused by wrong reserve count
2015-12-12Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer fixes from Jens Axboe: "A set of fixes for the current series. This contains: - A bunch of fixes for lightnvm, should be the last round for this series. From Matias and Wenwei. - A writeback detach inode fix from Ilya, also marked for stable. - A block (though it says SCSI) fix for an OOPS in SCSI runtime power management. - Module init error path fixes for null_blk from Minfei" * 'for-linus' of git://git.kernel.dk/linux-block: null_blk: Fix error path in module initialization lightnvm: do not compile in debugging by default lightnvm: prevent gennvm module unload on use lightnvm: fix media mgr registration lightnvm: replace req queue with nvmdev for lld lightnvm: comments on constants lightnvm: check mm before use lightnvm: refactor spin_unlock in gennvm_get_blk lightnvm: put blks when luns configure failed lightnvm: use flags in rrpc_get_blk block: detach bdev inode from its wb in __blkdev_put() SCSI: Fix NULL pointer dereference in runtime PM
2015-12-12kernel: remove stop_machine() Kconfig dependencyChris Wilson
Currently the full stop_machine() routine is only enabled on SMP if module unloading is enabled, or if the CPUs are hotpluggable. This leads to configurations where stop_machine() is broken as it will then only run the callback on the local CPU with irqs disabled, and not stop the other CPUs or run the callback on them. For example, this breaks MTRR setup on x86 in certain configs since ea8596bb2d8d379 ("kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch() functions") as the MTRR is only established on the boot CPU. This patch removes the Kconfig option for STOP_MACHINE and uses the SMP and HOTPLUG_CPU config options to compile the correct stop_machine() for the architecture, removing the false dependency on MODULE_UNLOAD in the process. Link: https://lkml.org/lkml/2014/10/8/124 References: https://bugs.freedesktop.org/show_bug.cgi?id=84794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>