summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core
AgeCommit message (Collapse)Author
2013-09-05Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull main batch of InfiniBand/RDMA changes from Roland Dreier: - Large ocrdma HW driver update: add "fast register" work requests, fixes, cleanups - Add receive flow steering support for raw QPs - Fix IPoIB neighbour race that leads to crash - iSER updates including support for using "fast register" memory registration - IPv6 support for iWARP - XRC transport fixes * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (54 commits) RDMA/ocrdma: Fix compiler warning about int/pointer size mismatch IB/iser: Fix redundant pointer check in dealloc flow IB/iser: Fix possible memory leak in iser_create_frwr_pool() IB/qib: Move COUNTER_MASK definition within qib_mad.h header guards RDMA/ocrdma: Fix passing wrong opcode to modify_srq RDMA/ocrdma: Fill PVID in UMC case RDMA/ocrdma: Add ABI versioning support RDMA/ocrdma: Consider multiple SGES in case of DPP RDMA/ocrdma: Fix for displaying proper link speed RDMA/ocrdma: Increase STAG array size RDMA/ocrdma: Dont use PD 0 for userpace CQ DB RDMA/ocrdma: FRMA code cleanup RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24 RDMA/ocrdma: Fix to work with even a single MSI-X vector RDMA/ocrdma: Remove the MTU check based on Ethernet MTU RDMA/ocrdma: Add support for fast register work requests (FRWR) RDMA/ocrdma: Create IRD queue fix IB/core: Better checking of userspace values for receive flow steering IB/mlx4: Add receive flow steering support IB/core: Export ib_create/destroy_flow through uverbs ...
2013-09-05Merge tag 'PTR_RET-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull PTR_RET() removal patches from Rusty Russell: "PTR_RET() is a weird name, and led to some confusing usage. We ended up with PTR_ERR_OR_ZERO(), and replacing or fixing all the usages. This has been sitting in linux-next for a whole cycle" [ There are still some PTR_RET users scattered about, with some of them possibly being new, but most of them existing in Rusty's tree too. We have that #define PTR_RET(p) PTR_ERR_OR_ZERO(p) thing in <linux/err.h>, so they continue to work for now - Linus ] * tag 'PTR_RET-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: GFS2: Replace PTR_RET with PTR_ERR_OR_ZERO Btrfs: volume: Replace PTR_RET with PTR_ERR_OR_ZERO drm/cma: Replace PTR_RET with PTR_ERR_OR_ZERO sh_veu: Replace PTR_RET with PTR_ERR_OR_ZERO dma-buf: Replace PTR_RET with PTR_ERR_OR_ZERO drivers/rtc: Replace PTR_RET with PTR_ERR_OR_ZERO mm/oom_kill: remove weird use of ERR_PTR()/PTR_ERR(). staging/zcache: don't use PTR_RET(). remoteproc: don't use PTR_RET(). pinctrl: don't use PTR_RET(). acpi: Replace weird use of PTR_RET. s390: Replace weird use of PTR_RET. PTR_RET is now PTR_ERR_OR_ZERO(): Replace most. PTR_RET is now PTR_ERR_OR_ZERO
2013-09-03Merge branches 'cxgb4', 'flowsteer', 'ipoib', 'iser', 'mlx4', 'ocrdma' and ↵Roland Dreier
'qib' into for-next
2013-09-02IB/core: Better checking of userspace values for receive flow steeringMatan Barak
- Don't allow unsupported comp_mask values, user should check ibv_query_device to know which features are supported. - Add a check in ib_uverbs_create_flow() to verify the size passed from the user space. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Export ib_create/destroy_flow through uverbsHadar Hen Zion
Implement ib_uverbs_create_flow() and ib_uverbs_destroy_flow() to support flow steering for user space applications. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Infrastructure for extensible uverbs commandsIgor Ivanov
Add infrastructure to support extended uverbs capabilities in a forward/backward manner. Uverbs command opcodes which are based on the verbs extensions approach should be greater or equal to IB_USER_VERBS_CMD_THRESHOLD. They have new header format and processed a bit differently. Whenever a specific IB_USER_VERBS_CMD_XXX is extended, which practically means it needs to have additional arguments, we will be able to add them without creating a completely new IB_USER_VERBS_CMD_YYY command or bumping the uverbs ABI version. This patch for itself doesn't provide the whole scheme which is also dependent on adding a comp_mask field to each extended uverbs command struct. The new header framework allows for future extension of the CMD arguments (ib_uverbs_cmd_hdr.in_words, ib_uverbs_cmd_hdr.out_words) for an existing new command (that is a command that supports the new uverbs command header format suggested in this patch) w/o bumping ABI version and with maintaining backward and formward compatibility to new and old libibverbs versions. In the uverbs command we are passing both uverbs arguments and the provider arguments. We split the ib_uverbs_cmd_hdr.in_words to ib_uverbs_cmd_hdr.in_words which will now carry only uverbs input argument struct size and ib_uverbs_cmd_hdr.provider_in_words that will carry the provider input argument size. Same goes for the response (the uverbs CMD output argument). For example take the create_cq call and the mlx4_ib provider: The uverbs layer gets libibverb's struct ibv_create_cq (named struct ib_uverbs_create_cq in the kernel), mlx4_ib gets libmlx4's struct mlx4_create_cq (which includes struct ibv_create_cq and is named struct mlx4_ib_create_cq in the kernel) and in_words = sizeof(mlx4_create_cq)/4 . Thus ib_uverbs_cmd_hdr.in_words carry both uverbs plus mlx4_ib input argument sizes, where uverbs assumes it knows the size of its input argument - struct ibv_create_cq. Now, if we wish to add a variable to struct ibv_create_cq, we can add a comp_mask field to the struct which is basically bit field indicating which fields exists in the struct (as done for the libibverbs API extension), but we need a way to tell what is the total size of the struct and not assume the struct size is predefined (since we may get different struct sizes from different user libibverbs versions). So we know at which point the provider input argument (struct mlx4_create_cq) begins. Same goes for extending the provider struct mlx4_create_cq. Thus we split the ib_uverbs_cmd_hdr.in_words to ib_uverbs_cmd_hdr.in_words which will now carry only uverbs input argument struct size and ib_uverbs_cmd_hdr.provider_in_words that will carry the provider (mlx4_ib) input argument size. Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Add receive flow steering supportHadar Hen Zion
The RDMA stack allows for applications to create IB_QPT_RAW_PACKET QPs, which receive plain Ethernet packets, specifically packets that don't carry any QPN to be matched by the receiving side. Applications using these QPs must be provided with a method to program some steering rule with the HW so packets arriving at the local port can be routed to them. This patch adds ib_create_flow(), which allow providing a flow specification for a QP. When there's a match between the specification and a received packet, the packet is forwarded to that QP, in a the same way one uses ib_attach_multicast() for IB UD multicast handling. Flow specifications are provided as instances of struct ib_flow_spec_yyy, which describe L2, L3 and L4 headers. Currently specs for Ethernet, IPv4, TCP and UDP are defined. Flow specs are made of values and masks. The input to ib_create_flow() is a struct ib_flow_attr, which contains a few mandatory control elements and optional flow specs. struct ib_flow_attr { enum ib_flow_attr_type type; u16 size; u16 priority; u32 flags; u8 num_of_specs; u8 port; /* Following are the optional layers according to user request * struct ib_flow_spec_yyy * struct ib_flow_spec_zzz */ }; As these specs are eventually coming from user space, they are defined and used in a way which allows adding new spec types without kernel/user ABI change, just with a little API enhancement which defines the newly added spec. The flow spec structures are defined with TLV (Type-Length-Value) entries, which allows calling ib_create_flow() with a list of variable length of optional specs. For the actual processing of ib_flow_attr the driver uses the number of specs and the size mandatory fields along with the TLV nature of the specs. Steering rules processing order is according to the domain over which the rule is set and the rule priority. All rules set by user space applicatations fall into the IB_FLOW_DOMAIN_USER domain, other domains could be used by future IPoIB RFS and Ethetool flow-steering interface implementation. Lower numerical value for the priority field means higher priority. The returned value from ib_create_flow() is a struct ib_flow, which contains a database pointer (handle) provided by the HW driver to be used when calling ib_destroy_flow(). Applications that offload TCP/IP traffic can also be written over IB UD QPs. The ib_create_flow() / ib_destroy_flow() API is designed to support UD QPs too. A HW driver can set IB_DEVICE_MANAGED_FLOW_STEERING to denote support for flow steering. The ib_flow_attr enum type supports usage of flow steering for promiscuous and sniffer purposes: IB_FLOW_ATTR_NORMAL - "regular" rule, steering according to rule specification IB_FLOW_ATTR_ALL_DEFAULT - default unicast and multicast rule, receive all Ethernet traffic which isn't steered to any QP IB_FLOW_ATTR_MC_DEFAULT - same as IB_FLOW_ATTR_ALL_DEFAULT but only for multicast IB_FLOW_ATTR_SNIFFER - sniffer rule, receive all port traffic ALL_DEFAULT and MC_DEFAULT rules options are valid only for Ethernet link type. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/core: Fixes to XRC reference counting in uverbsYishai Hadas
Added reference counting mechanism for XRC target QPs between ib_uqp_object and its ib_uxrcd_object. This prevents closing an XRC domain that is still attached to a QP. In addition, add missing code in ib_uverbs_destroy_srq() to handle ib_uxrcd_object reference counting correctly when destroying an xsrq. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/core: Add locking around event dispatching on XRC target QPsYishai Hadas
Fix a potential race when event occurrs on a target XRC QP and in the middle of reporting that on its shared qps, one of them is destroyed by user space application. Also add note for kernel consumers in ib_verbs.h that they must not destroy the QP from within the handler. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/cma: Add IPv6 support for iWARPSteve Wise
Modify the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. Change the references of local_addr and remote_addr in cxgb4, cxgb3, nes and amso drivers to match this. However to be able to actully run traffic over IPv6, low-level drivers have to add code to support this. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> [ Fix unused variable warnings when INFINIBAND_NES_DEBUG not set. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31Merge branches 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'misc', 'mlx4', 'mlx5', ↵Roland Dreier
'nes', 'ocrdma' and 'qib' into for-next
2013-07-31IB/core: Create QP1 using the pkey index which contains the default pkeyJack Morgenstein
Currently, QP1 is created using pkey index 0. This patch simply looks for the index containing the default pkey, rather than hard-coding pkey index 0. This change will have no effect in native mode, since QP0 and QP1 are created before the SM configures the port, so pkey table will still be the default table defined by the IB Spec, in C10-123: "If non-volatile storage is not used to hold P_Key Table contents, then if a PM (Partition Manager) is not present, and prior to PM initialization of the P_Key Table, the P_Key Table must act as if it contains a single valid entry, at P_Key_ix = 0, containing the default partition key. All other entries in the P_Key Table must be invalid." Thus, in the native mode case, the driver will find the default pkey at index 0 (so it will be no different than the hard-coding). However, in SR-IOV mode, for VFs, the pkey table may be paravirtualized, so that the VF's pkey index zero may not necessarily be mapped to the real pkey index 0. For VFs, therefore, it is important to find the virtual index which maps to the real default pkey. This commit does the following for QP1 creation: 1. Find the pkey index containing the default pkey, and use that index if found. ib_find_pkey() returns the index of the limited-membership default pkey (0x7FFF) if the full-member default pkey is not in the table. 2. If neither form of the default pkey is found, use pkey index 0 (previous behavior). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31RDMA/cma: Only call cma_save_ib_info() for CM REQsSean Hefty
Calling cma_save_ib_info() for CM SIDR REQs results in a crash accessing an invalid path record pointer. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31RDMA/cma: Fix accessing invalid private data for UDSean Hefty
If a application is using AF_IB with a UD QP, but does not provide any private data, we will end up accessing invalid memory. Check for this case and handle it appropriately. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30RDMA/cma: Fix gcc warningPaul Bolle
Building cma.o triggers this gcc warning: drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’: drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized] drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here This is a false positive, as "port" will always be initialized if we're at "found". But if we assign to "id_priv->id.port_num" directly, we can drop "port". That will, obviously, silence gcc. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-15PTR_RET is now PTR_ERR_OR_ZERO(): Replace most.Rusty Russell
Sweep of the simple cases. Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-13Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - AF_IB (native IB addressing) for CMA from Sean Hefty - new mlx5 driver for Mellanox Connect-IB adapters (including post merge request fixes) - SRP fixes from Bart Van Assche (including fix to first merge request) - qib HW driver updates - resurrection of ocrdma HW driver development - uverbs conversion to create fds with O_CLOEXEC set - other small changes and fixes * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits) mlx5: Return -EFAULT instead of -EPERM IB/qib: Log all SDMA errors unconditionally IB/qib: Fix module-level leak mlx5_core: Adjust hca_cap.uar_page_sz to conform to Connect-IB spec IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd() mlx5_core: Fixes for sparse warnings IB/mlx5: Make profile[] static in main.c mlx5: Fix parameter type of health_handler_t mlx5: Add driver for Mellanox Connect-IB adapters IB/core: Add reserved values to enums for low-level driver use IB/srp: Bump driver version and release date IB/srp: Make HCA completion vector configurable IB/srp: Maintain a single connection per I_T nexus IB/srp: Fail I/O fast if target offline IB/srp: Skip host settle delay IB/srp: Avoid skipping srp_reset_host() after a transport error IB/srp: Fix remove_one crash due to resource exhaustion IB/qib: New transmitter tunning settings for Dell 1.1 backplane IB/core: Fix error return code in add_port() ...
2013-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit 0a4db187a999 ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...
2013-07-08Merge branches 'af_ib', 'cxgb4', 'misc', 'mlx5', 'ocrdma', 'qib' and 'srp' ↵Roland Dreier
into for-next
2013-07-08IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()Roland Dreier
The macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be used by default to not leak file descriptor across exec(). Replace calls to get_unused_fd() in uverbs with calls to get_unused_fd_flags(O_CLOEXEC). Inheriting uverbs fds across exec() cannot be used to do anything useful. Based on a patch/suggestion from Yann Droneaud <ydroneaud@opteya.com>. Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-03drivers: avoid format string in dev_set_nameKees Cook
Calling dev_set_name with a single paramter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents, including wrappers like device_create*() and bdi_register(). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-24IB/core: Fix error return code in add_port()Wei Yongjun
Fix to return -ENOMEM in the add_port() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Export AF_IB statisticsSean Hefty
Report AF_IB source and destination addresses through netlink interface. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Allow user space to specify AF_IB when joining multicastSean Hefty
Allow user space applications to join multicast groups using MGIDs directly. MGIDs may be passed using AF_IB addresses. Since the current multicast join command only supports addresses as large as sockaddr_in6, define a new structure for joining addresses specified using sockaddr_ib. Since AF_IB allows the user to specify the qkey when resolving a remote UD QP address, when joining the multicast group use the qkey value, if one has been assigned. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Allow user space to pass AF_IB into resolveSean Hefty
Allow user space applications to call resolve_addr using AF_IB. To support sockaddr_ib, we need to define a new structure capable of handling the larger address size. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Allow user space to bind to AF_IBSean Hefty
Support user space binding to addresses using AF_IB. Since sockaddr_ib is larger than sockaddr_in6, we need to define a larger structure when binding using AF_IB. This time we use sockaddr_storage to cover future cases. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Name changes to indicate only IP addresses supportedSean Hefty
Several commands into the RDMA CM from user space are restricted to supporting addresses which fit into a sockaddr_in6 structure: bind address, resolve address, and join multicast. With the addition of AF_IB, we need to support addresses which are larger than sockaddr_in6. This will be done by adding new commands that exchange address information using sockaddr_storage. However, to support existing applications, we maintain the current commands and structures, but rename them to indicate that they only support IPv4 and v6 addresses. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Add ability to query GID addressesSean Hefty
Part of address resolution is mapping IP addresses to IB GIDs. With the changes to support querying larger addresses and more path records, also provide a way to query IB GIDs after resolution completes. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Export cma_get_service_id()Sean Hefty
Allow the rdma_ucm to query the IB service ID formed or allocated by the rdma_cm by exporting the cma_get_service_id() functionality. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Support querying when IB paths are not reversibleSean Hefty
The current query_route call can return up to two path records. The assumption being that one is the primary path, with optional support for an alternate path. In both cases, the paths are assumed to be reversible and are used to send CM MADs. With the ability to manually set IB path data, the rdma cm can eventually be capable of using up to 6 paths per connection: forward primary, reverse primary, forward alternate, reverse alternate, reversible primary path for CM MADs reversible alternate path for CM MADs. (It is unclear at this time if IB routing will complicate this) In order to handle more flexible routing topologies, add a new command to report any number of paths. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21IB/sa: Export function to pack a path record into wire formatSean Hefty
Allow converting from struct ib_sa_path_rec to the IB defined SA path record wire format. This will be used to report path data from the rdma cm into user space. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/ucma: Support querying for AF_IB addressesSean Hefty
The sockaddr structure for AF_IB is larger than sockaddr_in6. The rdma cm user space ABI uses the latter to exchange address information between user space and the kernel. To support querying for larger addresses, define a new query command that exchanges data using sockaddr_storage, rather than sockaddr_in6. Unlike the existing query_route command, the new command only returns address information. Route (i.e. path record) data is separated. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Only listen on IB devices when using AF_IBSean Hefty
If an rdma_cm_id is bound to AF_IB, with a wild card address, only listen on IB devices. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Set qkey for AF_IBSean Hefty
Allow the user to specify the qkey when using AF_IB. The qkey is added to struct rdma_ucm_conn_param in place of a reserved field, but for backwards compatability, is only accessed if the associated rdma_cm_id is using AF_IB. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Expose private data when using AF_IBSean Hefty
If the source or destination address is AF_IB, then do not reserve a portion of the private data in the IB CM REQ or SIDR REQ messages for the cma header. Instead, all private data should be exported to the user. When AF_IB is used, the rdma cm does not have sufficient information to fill in the cma header. Additionally, this will be necessary to support any IB connection through the rdma cm interface, Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21RDMA/cma: Merge cma_get/save_net_infoSean Hefty
With the removal of SDP related code, we can merge cma_get_net_info() with cma_save_net_info(), since we're only ever dealing with a single header format. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Remove unused SDP related codeSean Hefty
The SDP protocol was never merged upstream. Remove unused SDP related code from the RDMA CM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Add support for AF_IB to cma_get_service_id()Sean Hefty
cma_get_service_id() forms the service ID based on the port space and port number of the rdma_cm_id. Extend the call to support AF_IB, which contains the service ID directly. This will be needed to support any arbitrary SID. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Add support for AF_IB to rdma_resolve_route()Sean Hefty
Allow rdma_resolve_route() to handle the case where the user specified the source and destination addresses using AF_IB. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Add support for AF_IB to rdma_resolve_addr()Sean Hefty
Allow the user to specify the remote address using AF_IB format. When AF_IB is used, the remote address simply needs to be recorded, and no resolution using ARP is done. The local address may still need to be matched with a local IB device. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Verify that source and dest sa_family are the sameSean Hefty
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Restrict AF_IB loopback to binding to IB devices onlySean Hefty
If a user specifies AF_IB as the source address for a loopback connection, limit the resolution to IB devices only. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Add helper functions to return id address informationSean Hefty
Provide inline helpers to extract source and destination address data from the rdma_cm_id. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Do not modify sa_family when setting loopback addressSean Hefty
cma_resolve_loopback is called after an rdma_cm_id has been bound to a specific sa_family and port. Once the source sa_family for the id has been set, do not modify it. Only the actual IP address portion of the source address needs to be set. As part of this fix, we can simplify setting the source address by moving the loopback address assignment from cma_resolve_loopback to cma_bind_loopback. cma_bind_loopback is only invoked when the source address is the loopback address. Finally, add loopback support for AF_IB as part of the change. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Allow user to specify AF_IB when bindingSean Hefty
Modify rdma_bind_addr to allow the user to specify AF_IB when binding to a device. AF_IB indicates that the user is not mapping an IP address to the native IB addressing. (The mapping may have already been done, or is not needed) Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Update port reservation to support AF_IBSean Hefty
The AF_IB uses a 64-bit service id (SID), which the user can control through the use of a mask. The rdma_cm will assign values to the unmasked portions of the SID based on the selected port space and port number. Because the IB spec divides the SID range into several regions, a SID/mask combination may fall into one of the existing port space ranges as defined by the RDMA CM IP Annex. Map the AF_IB SID to the correct RDMA port space. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20IB/addr: Add AF_IB support to ip_addr_sizeSean Hefty
Add support for AF_IB to ip_addr_size, and rename the function to account for the change. Give the compiler more control over whether the call should be inline or not by moving the definition into the .c file, removing the static inline, and exporting it. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Include AF_IB in loopback and any address checksSean Hefty
Enhance checks for loopback and any address to support AF_IB in addition to AF_INET and AF_INT6. This will allow future patches to use AF_IB when binding and resolving addresses. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20RDMA/cma: Allow enabling reuseaddr in any stateSean Hefty
The rdma_cm only allows setting reuseaddr if the corresponding rdma_cm_id is in the idle state. Allow setting this value in other states. This brings the behavior more inline with sockets. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-05-28net: pass info struct via netdevice notifierJiri Pirko
So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: David S. Miller <davem@davemloft.net>