summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core/verbs.c
AgeCommit message (Collapse)Author
2011-10-13RDMA/core: Export ib_open_qp() to share XRC TGT QPsSean Hefty
XRC TGT QPs are shared resources among multiple processes. Since the creating process may exit, allow other processes which share the same XRC domain to open an existing QP. This allows us to transfer ownership of an XRC TGT QP to another process. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/uverbs: Export XRC domains to user spaceSean Hefty
Allow user space to create XRC domains. Because XRCDs are expected to be shared among multiple processes, we use inodes to identify an XRCD. Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/verbs: Cleanup XRC TGT QPs when destroying XRCDSean Hefty
XRC TGT QPs are intended to be shared among multiple users and processes. Allow the destruction of an XRC TGT QP to be done explicitly through ib_destroy_qp() or when the XRCD is destroyed. To support destroying an XRC TGT QP, we need to track TGT QPs with the XRCD. When the XRCD is destroyed, all tracked XRC TGT QPs are also cleaned up. To avoid stale reference issues, if a user is holding a reference on a TGT QP, we increment a reference count on the QP. The user releases the reference by calling ib_release_qp. This releases any access to the QP from a user above verbs, but allows the QP to continue to exist until destroyed by the XRCD. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/core: Add XRC QPsSean Hefty
XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). XRC communication is between an initiator (INI) QP and a target (TGT) QP. Target QPs are associated with SRQs through an XRCD. An XRC TGT QP behaves like a receive-only RD QP. XRC INI QPs behave similarly to RC QPs, except that work requests posted to an XRC INI QP must specify the remote SRQ that is the target of the work request. We define two new QP types for XRC, to distinguish between INI and TGT QPs, and update the core layer to support XRC QPs. This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/core: Add XRC SRQ typeSean Hefty
XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). XRC defines SRQs that are specifically used by XRC connections. Expand the SRQ code to support XRC SRQs. An XRC SRQ is currently restricted to only XRC use according to the IB XRC Annex. Portions of this patch were derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/core: Add SRQ type fieldSean Hefty
Currently, there is only a single ("basic") type of SRQ, but with XRC support we will add a second. Prepare for this by defining an SRQ type and setting all current users to IB_SRQT_BASIC. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-12RDMA/core: Add XRC domain supportSean Hefty
XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). A few new concepts are introduced to support this. This patch adds: - A new device capability flag, IB_DEVICE_XRC, which low-level drivers set to indicate that a device supports XRC. - A new object type, XRC domains (struct ib_xrcd), and new verbs ib_alloc_xrcd()/ib_dealloc_xrcd(). XRCDs are used to limit which XRC SRQs an incoming message can target. This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2010-09-28IB/core: Add link layer property to portsEli Cohen
This patch allows ports to have different link layers: IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET. This is required for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support. For devices that do not provide an implementation for querying the link layer property of a port, we return a default value based on the transport: RMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04IB: Rename RAW_ETY to RAW_ETHERTYPEAleksey Senin
Change abbreviated IB_QPT_RAW_ETY to IB_QPT_RAW_ETHERTYPE to make the special QP type easier to understand. cf http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg04530.html Signed-off-by: Aleksey Senin <alekseys@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-15IB/core: Reset to error QP state transition is not allowedRalph Campbell
I was reviewing the QP state transition diagram in the IB 1.2.1 spec and the code for qp_state_table[], and noticed that the code allows a QP to be modified from IB_QPS_RESET to IB_QPS_ERR whereas the notes for figure 124 (pg 457) specifically says that this transition isn't allowed. This is a clarification from earlier versions of the IB spec, which were ambiguous in this area and suggested that the RESET to ERR transition was allowed. Fix up the qp_state_table[] to make RESET->ERR not allowed. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-15RDMA/core: Add memory management extensions supportSteve Wise
This patch adds support for the IB "base memory management extension" (BMME) and the equivalent iWARP operations (which the iWARP verbs mandates all devices must implement). The new operations are: - Allocate an ib_mr for use in fast register work requests. - Allocate/free a physical buffer lists for use in fast register work requests. This allows device drivers to allocate this memory as needed for use in posting send requests (eg via dma_alloc_coherent). - New send queue work requests: * send with remote invalidate * fast register memory region * local invalidate memory region * RDMA read with invalidate local memory region (iWARP only) Consumer interface details: - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added to indicate device support for these features. - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV, IB_WR_RDMA_READ_WITH_INV are added. - A new consumer API function, ib_alloc_mr() is added to allocate fast register memory regions. - New consumer API functions, ib_alloc_fast_reg_page_list() and ib_free_fast_reg_page_list() are added to allocate and free device-specific memory for fast registration page lists. - A new consumer API function, ib_update_fast_reg_key(), is added to allow the key portion of the R_Key and L_Key of a fast registration MR to be updated. Consumers call this if desired before posting a IB_WR_FAST_REG_MR work request. Consumers can use this as follows: - MR is allocated with ib_alloc_mr(). - Page list memory is allocated with ib_alloc_fast_reg_page_list(). - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key(). - MR made VALID and bound to a specific page list via ib_post_send(IB_WR_FAST_REG_MR) - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV), ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with invalidate operation. - MR is deallocated with ib_dereg_mr() - page lists dealloced via ib_free_fast_reg_page_list(). Applications can allocate a fast register MR once, and then can repeatedly bind the MR to different physical block lists (PBLs) via posting work requests to a send queue (SQ). For each outstanding MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be allocated (the fast_reg_page_list is owned by the low-level driver from the consumer posting a work request until the request completes). Thus pipelining can be achieved while still allowing device-specific page_list processing. The 32-bit fast register memory key/STag is composed of a 24-bit index and an 8-bit key. The application can change the key each time it fast registers thus allowing more control over the peer's use of the key/STag (ie it can effectively be changed each time the rkey is rebound to a page list). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-15RDMA: Remove subversion $Id tagsRoland Dreier
They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-17IB/core: Add support for modify CQEli Cohen
Add support for modifying CQ parameters for controlling event generation moderation. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-17IB/core: Check optional verbs before using themDotan Barak
Make sure that a device implements the modify_srq and reg_phys_mr optional methods before calling them. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-07IB: Add CQ comp_vector supportMichael S. Tsirkin
Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-23IB/core: Set hop limit in ib_init_ah_from_wc correctlySean Hefty
The hop_limit value in the ah_attr should be 0xFF, not the value read from the received GRH (which should be 0). See 13.5.4.4 in the 1.2 IB spec. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22RDMA: iWARP Core Changes.Tom Tucker
Modifications to the existing rdma header files, core files, drivers, and ulp files to support iWARP, including: - Hook iWARP CM into the build system and use it in rdma_cm. - Convert enum ib_node_type to enum rdma_node_type, which includes the possibility of RDMA_NODE_RNIC, and update everything for this. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Pass userspace data to modify_srq and modify_qp methodsRalph Campbell
Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-18IB: Add ib_init_ah_from_wc()Sean Hefty
Add a function to initialize address handle attributes from a work completion. This functionality is duplicated by both verbs and the CM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10IB: simplify static rate encodingJack Morgenstein
Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Fix modify QP checking of "current QP state" attributeDotan Barak
According to the IB spec version 1.2, section 11.2.4.2, the current table has a couple of mistakes where it allows the current QP state (IB_QP_CUR_STATE) attribute. For the transitions: RTS -> RTS: IB_QP_CUR_STATE should be allowed for all transports SQD -> SQD: IB_QP_CUR_STATE should never be allowed Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Whitespace cleanupsRoland Dreier
Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add ib_modify_qp_is_ok() library functionRoland Dreier
The in-kernel mthca driver contains a table of which attributes are valid for each queue pair state transition. It turns out that both other IB drivers -- ipath and ehca -- which are being prepared for merging have copied this table, errors and all. To forestall this code duplication, move this table and the code to check parameters against it into a midlayer library function, ib_modify_qp_is_ok(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add userspace support for resizing CQsRoland Dreier
Add support to uverbs to handle resizing userspace CQs (completion queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has ib_resize_cq(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-07IB: Set GIDs correctly in ib_create_ah_from_wc()Ralph Campbell
ib_create_ah_from_wc() doesn't create the correct return address (AH) when there is a GRH present (source & dest GIDs need to be swapped). Signed-off-by: Ralph Campbell <ralphc@pathscale.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
2005-11-10[IB] Have cq_resize() method take an int, not int*Roland Dreier
Change the struct ib_device.resize_cq() method to take a plain integer that holds the new CQ size, rather than a pointer to an integer that it uses to return the new size. This makes the interface match the exported ib_resize_cq() signature, and allows the low-level driver to update the CQ size with proper locking if necessary. No in-tree drivers are exporting this method yet. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-07[PATCH] fix remaining missing includesTim Schmielau
Fix more include file problems that surfaced since I submitted the previous fix-missing-includes.patch. This should now allow not to include sched.h from module.h, which is done by a followup patch. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[IB] Add checks to multicast attach and detachJack Morgenstein
Add checks so that we only allow multicast attach/detach with a valid multicast GID and the correct QP type. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-27[PATCH] IB: move include files to include/rdmaRoland Dreier
Move the InfiniBand headers from drivers/infiniband/include to include/rdma. This allows InfiniBand-using code to live elsewhere, and lets us remove the ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-27[PATCH] IB: Add SRQ support to midlayerRoland Dreier
Make the required core API additions and changes for shared receive queues (SRQs). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-27[PATCH] IB: Add copyright noticesRoland Dreier
Make some lawyers happy and add copyright notices for people who forgot to include them when they actually touched the code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-07-27[PATCH] IB: A couple of IB core bug fixesHal Rosenstock
Replace be32_to_cpup with be32_to_cpu and fix bug referencing pointer rather than value in ib_create_ah_from_wc(). Signed-off-by: Tom Duffy <tduffy@sun.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27[PATCH] IB: Add ib_create_ah_from_wc to IB verbsHal Rosenstock
Added new call: ib_create_ah_from_wc. Call will allocate an address handle given work completion information, including any received GRH. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-08[PATCH] IB uverbs: update kernel midlayer for new APIRoland Dreier
Update kernel InfiniBand midlayer to compile against the updated API for low-level drivers. This just amounts to passing NULL for all userspace-related parameters, and setting userspace-related structure members to NULL. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!