summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2013-09-03RDMA/ocrdma: Fix compiler warning about int/pointer size mismatchRoland Dreier
Fix: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function 'ocrdma_build_fr': >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:1832:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] mr = (struct ocrdma_mr *)qp->dev->stag_arr[(hdr->lkey >> 8) & ^ drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function 'ocrdma_alloc_frmr': >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:2661:64: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] dev->stag_arr[(mr->hwmr.lkey >> 8) & (OCRDMA_MAX_STAG - 1)] = (u64) mr; Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03IB/iser: Fix redundant pointer check in dealloc flowSagi Grimberg
This bug was discovered by Smatch static checker run by Dan Carpenter. If in free_rx_descriptors(), rx_descs are not NULL then the iser device is definately not NULL, so no need to check it before dereferencing it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03IB/iser: Fix possible memory leak in iser_create_frwr_pool()Roi Dayan
Fix leak where desc is not being freed in error flows. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03IB/qib: Move COUNTER_MASK definition within qib_mad.h header guardsIra Weiny
Commit 36a8f01cd24b ("IB/qib: Add congestion control agent implementation") caused statements to leak pass the header guard. Fix this. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Fix passing wrong opcode to modify_srqNaresh Gottumukkala
Fix passing wrong opcode to ocrdma_modify_srq and query SRQ. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Fill PVID in UMC caseNaresh Gottumukkala
In UMC case, driver needs to fill PVID in the address vector template for UD traffic. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Add ABI versioning supportNaresh Gottumukkala
Add ABI versioning support between driver and userspace library. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Consider multiple SGES in case of DPPNaresh Gottumukkala
While posting inline DPP data, we are not considering multiple sges. Fix this. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Fix for displaying proper link speedNaresh Gottumukkala
Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Increase STAG array sizeNaresh Gottumukkala
1) Increase STAG Array size. 2) Max inline data size should be set to the same value used during QP creation 3) Set max_sge_rd to zero since we dont support RD transport in our adapters. 4) Max cqes reported in ibv_devinfo should be from QUERY_CONFIG. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Dont use PD 0 for userpace CQ DBNaresh Gottumukkala
Create_CQ verb doesn't provide a PD pointer. So, until now we are creating all (both userspace and kernel) CQ DB regions from PD0. This will result in mmapping PD0 to applications. A rogue userspace application can mess things up. Also more serious issues is even the be2net NIC uses PD0. This patch addresses this problem by: 1) Create a PD page for every userspace application when the alloc_ucontext is called. This will be destroyed in dealloc_ucontext. 2) All CQs for that context will use the PD allocated in ucontext. 3) The first create_PD call from application will result in returning the PD address from its ucontext (no new PD will be created). 4) For subsecquent create_pd calls from application, we create new PDs for the application. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: FRMA code cleanupNaresh Gottumukkala
1) Fixed setting FR_MR bit for FRWR stag allocation 2) Access rights are passsed during FRWR stage and not during STAT allocation stage 3) FRWR WQE structure cleanup 4) Add QP level signaled bit. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24Naresh Gottumukkala
1) All RQ doorbells are handled by ERX2 and doorbell->num_posted offset is constant to bit offset 24 for ERX2 irrspective of Q id. 2) Fixed RESET to INIT state change (from ERR->RST->INIT->RTR case). Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Fix to work with even a single MSI-X vectorNaresh Gottumukkala
There are cases like SRIOV where can get only one MSI-X vector allocated for RoCE. In that case we need to use the vector for both data plane and control plane. We need to use EQ create version V2. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Remove the MTU check based on Ethernet MTUNaresh Gottumukkala
Also increase MAX AH to 512. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Add support for fast register work requests (FRWR)Naresh Gottumukkala
Also get the max_srq value from query_config mailbox response. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-03RDMA/ocrdma: Create IRD queue fixNaresh Gottumukkala
1) Fix ocrdma_get_num_posted_shift for upto 128 QPs. 2) Create for min of dev->max_wqe and requested wqe in create_qp. 3) As part of creating ird queue, populate with basic header templates. 4) Make sure all the DB memory allocated to userspace are page aligned. 5) Fix issue in checking the mmap local cache. 6) Some code cleanup. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02IB/core: Better checking of userspace values for receive flow steeringMatan Barak
- Don't allow unsupported comp_mask values, user should check ibv_query_device to know which features are supported. - Add a check in ib_uverbs_create_flow() to verify the size passed from the user space. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/mlx4: Add receive flow steering supportHadar Hen Zion
Implement ib_create_flow() and ib_destroy_flow(). Translate the verbs structures provided by the user to HW structures and call the MLX4_QP_FLOW_STEERING_ATTACH/DETACH firmware commands. On the ATTACH command completion, the firmware provides a 64-bit registration ID, which is placed into struct mlx4_ib_flow that wraps the instance of struct ib_flow which is retuned to caller. Later, this reg ID is used for detaching that flow from the firmware. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Export ib_create/destroy_flow through uverbsHadar Hen Zion
Implement ib_uverbs_create_flow() and ib_uverbs_destroy_flow() to support flow steering for user space applications. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Infrastructure for extensible uverbs commandsIgor Ivanov
Add infrastructure to support extended uverbs capabilities in a forward/backward manner. Uverbs command opcodes which are based on the verbs extensions approach should be greater or equal to IB_USER_VERBS_CMD_THRESHOLD. They have new header format and processed a bit differently. Whenever a specific IB_USER_VERBS_CMD_XXX is extended, which practically means it needs to have additional arguments, we will be able to add them without creating a completely new IB_USER_VERBS_CMD_YYY command or bumping the uverbs ABI version. This patch for itself doesn't provide the whole scheme which is also dependent on adding a comp_mask field to each extended uverbs command struct. The new header framework allows for future extension of the CMD arguments (ib_uverbs_cmd_hdr.in_words, ib_uverbs_cmd_hdr.out_words) for an existing new command (that is a command that supports the new uverbs command header format suggested in this patch) w/o bumping ABI version and with maintaining backward and formward compatibility to new and old libibverbs versions. In the uverbs command we are passing both uverbs arguments and the provider arguments. We split the ib_uverbs_cmd_hdr.in_words to ib_uverbs_cmd_hdr.in_words which will now carry only uverbs input argument struct size and ib_uverbs_cmd_hdr.provider_in_words that will carry the provider input argument size. Same goes for the response (the uverbs CMD output argument). For example take the create_cq call and the mlx4_ib provider: The uverbs layer gets libibverb's struct ibv_create_cq (named struct ib_uverbs_create_cq in the kernel), mlx4_ib gets libmlx4's struct mlx4_create_cq (which includes struct ibv_create_cq and is named struct mlx4_ib_create_cq in the kernel) and in_words = sizeof(mlx4_create_cq)/4 . Thus ib_uverbs_cmd_hdr.in_words carry both uverbs plus mlx4_ib input argument sizes, where uverbs assumes it knows the size of its input argument - struct ibv_create_cq. Now, if we wish to add a variable to struct ibv_create_cq, we can add a comp_mask field to the struct which is basically bit field indicating which fields exists in the struct (as done for the libibverbs API extension), but we need a way to tell what is the total size of the struct and not assume the struct size is predefined (since we may get different struct sizes from different user libibverbs versions). So we know at which point the provider input argument (struct mlx4_create_cq) begins. Same goes for extending the provider struct mlx4_create_cq. Thus we split the ib_uverbs_cmd_hdr.in_words to ib_uverbs_cmd_hdr.in_words which will now carry only uverbs input argument struct size and ib_uverbs_cmd_hdr.provider_in_words that will carry the provider (mlx4_ib) input argument size. Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/core: Add receive flow steering supportHadar Hen Zion
The RDMA stack allows for applications to create IB_QPT_RAW_PACKET QPs, which receive plain Ethernet packets, specifically packets that don't carry any QPN to be matched by the receiving side. Applications using these QPs must be provided with a method to program some steering rule with the HW so packets arriving at the local port can be routed to them. This patch adds ib_create_flow(), which allow providing a flow specification for a QP. When there's a match between the specification and a received packet, the packet is forwarded to that QP, in a the same way one uses ib_attach_multicast() for IB UD multicast handling. Flow specifications are provided as instances of struct ib_flow_spec_yyy, which describe L2, L3 and L4 headers. Currently specs for Ethernet, IPv4, TCP and UDP are defined. Flow specs are made of values and masks. The input to ib_create_flow() is a struct ib_flow_attr, which contains a few mandatory control elements and optional flow specs. struct ib_flow_attr { enum ib_flow_attr_type type; u16 size; u16 priority; u32 flags; u8 num_of_specs; u8 port; /* Following are the optional layers according to user request * struct ib_flow_spec_yyy * struct ib_flow_spec_zzz */ }; As these specs are eventually coming from user space, they are defined and used in a way which allows adding new spec types without kernel/user ABI change, just with a little API enhancement which defines the newly added spec. The flow spec structures are defined with TLV (Type-Length-Value) entries, which allows calling ib_create_flow() with a list of variable length of optional specs. For the actual processing of ib_flow_attr the driver uses the number of specs and the size mandatory fields along with the TLV nature of the specs. Steering rules processing order is according to the domain over which the rule is set and the rule priority. All rules set by user space applicatations fall into the IB_FLOW_DOMAIN_USER domain, other domains could be used by future IPoIB RFS and Ethetool flow-steering interface implementation. Lower numerical value for the priority field means higher priority. The returned value from ib_create_flow() is a struct ib_flow, which contains a database pointer (handle) provided by the HW driver to be used when calling ib_destroy_flow(). Applications that offload TCP/IP traffic can also be written over IB UD QPs. The ib_create_flow() / ib_destroy_flow() API is designed to support UD QPs too. A HW driver can set IB_DEVICE_MANAGED_FLOW_STEERING to denote support for flow steering. The ib_flow_attr enum type supports usage of flow steering for promiscuous and sniffer purposes: IB_FLOW_ATTR_NORMAL - "regular" rule, steering according to rule specification IB_FLOW_ATTR_ALL_DEFAULT - default unicast and multicast rule, receive all Ethernet traffic which isn't steered to any QP IB_FLOW_ATTR_MC_DEFAULT - same as IB_FLOW_ATTR_ALL_DEFAULT but only for multicast IB_FLOW_ATTR_SNIFFER - sniffer rule, receive all port traffic ALL_DEFAULT and MC_DEFAULT rules options are valid only for Ethernet link type. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-26[SCSI] IB/iser: Add Discovery supportOr Gerlitz
To run discovery over iSER we need to advertize the CAP_TEXT_NEGO capability towards user space. Also need to make sure the login RX buffer is posted when SendTargets TEXT PDUs are sent. For that end, we use a setting of the ISCSI_PARAM_DISCOVERY_SESS iscsi param as an indication that this is discovery session. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-20treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacksJoe Perches
Don't emit OOM warnings when k.alloc calls fail when there there is a v.alloc immediately afterwards. Converted a kmalloc/vmalloc with memset to kzalloc/vzalloc. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-08-13IPoIB: Fix race in deleting ipoib_neigh entriesJim Foraker
In several places, this snippet is used when removing neigh entries: list_del(&neigh->list); ipoib_neigh_free(neigh); The list_del() removes neigh from the associated struct ipoib_path, while ipoib_neigh_free() removes neigh from the device's neigh entry lookup table. Both of these operations are protected by the priv->lock spinlock. The table however is also protected via RCU, and so naturally the lock is not held when doing reads. This leads to a race condition, in which a thread may successfully look up a neigh entry that has already been deleted from neigh->list. Since the previous deletion will have marked the entry with poison, a second list_del() on the object will cause a panic: #5 [ffff8802338c3c70] general_protection at ffffffff815108c5 [exception RIP: list_del+16] RIP: ffffffff81289020 RSP: ffff8802338c3d20 RFLAGS: 00010082 RAX: dead000000200200 RBX: ffff880433e60c88 RCX: 0000000000009e6c RDX: 0000000000000246 RSI: ffff8806012ca298 RDI: ffff880433e60c88 RBP: ffff8802338c3d30 R8: ffff8806012ca2e8 R9: 00000000ffffffff R10: 0000000000000001 R11: 0000000000000000 R12: ffff8804346b2020 R13: ffff88032a3e7540 R14: ffff8804346b26e0 R15: 0000000000000246 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 #6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib] #7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm] #8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm] #9 [ffff8802338c3e38] worker_thread at ffffffff81090e10 #10 [ffff8802338c3ee8] kthread at ffffffff81096c66 #11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca We move the list_del() into ipoib_neigh_free(), so that deletion happens only once, after the entry has been successfully removed from the lookup table. This same behavior is already used in ipoib_del_neighs_by_gid() and __ipoib_reap_neigh(). Signed-off-by: Jim Foraker <foraker1@llnl.gov> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Issue RI.FINI before closing when entering TERMSteve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Advertise ~0ULL as max MR sizeSteve Wise
Lustre uses a advertised max MR size of ~0ULL to indicate it should use a dma_mr. Hence advertise max MR size as ~0ULL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Always do GTS write if cidx_inc == CIDXINC_MASKSteve Wise
When polling, we do a GTS update if the accumulated cidx_inc == the CQ depth / 16. However, if the CQ is large enough, Cq depth / 16 exceeds the size of the field in the GTS word. So we also need to update if cidx_inc hits CIDXINC_MASK to avoid overflowing the field. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Set arp error handler for PASS_ACCEPT_RPL messagesSteve Wise
accept_cr() failed to set the arp error handler on a reused skb. This results in a kernel crash if the arp does indeed time out. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Fix accounting for unsignaled SQ WRs to deal with wrapSteve Wise
When determining how many WRs are completed with a signaled CQE, correctly deal with queue wraps. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Fix QP flush logicSteve Wise
This patch makes following fixes in QP flush logic: - correctly flushes unsignaled WRs followed by a signaled WR - supports for flushing a CQ bound to multiple QPs - resets cidx_flush if a active queue starts getting HW CQEs again - marks WQ in error when we leave RTS. This was only being done for user queues, but we need it for kernel queues too so that post_send/post_recv will start returning the appropriate error synchronously - eats unsignaled read resp CQEs. HW always inserts CQEs so we must silently discard them if the read work request was unsignaled. - handles QP flushes with pending SW CQEs. The flush and out of order completion logic has a bug where if out of order completions are flushed but not yet polled by the consumer and the qp is then flushed then we end up inserting duplicate completions. - c4iw_flush_sq() should only flush wrs that have not already been flushed. Since we already track where in the SQ we've flushed via sq.cidx_flush, just start at that point and flush any remaining. This bug only caused a problem in the presence of unsignaled work requests. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fixed sparse warning due to htonl/ntohl confusion. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Handle newer firmware changesSteve Wise
Move QP to TERMINATE instead to allow the peer to get the TERM message. This bug wasn't detectable until newer FW that moves connections out of RDMA mode as soon as an error is detected. QP can exit RTS before the last AE arrives. This was introduced by changes in the FW to kick connections out of RDMA mode as soon as an error is detected. A side effect of this is that the driver can move the QP out of RTS before the AE causing the connection to get kicked out of RDMA mode is processed. Fix for this is to always post async errors even if the QP is out of RTS. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Use correct bit shift macros for vlan filter tuplesSteve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Add support for active and passive open connection with IPv6 addressVipul Pandya
Add new cpl messages, cpl_act_open_req6 and cpl_t5_act_open_req6, for initiating active open connections. Use LLD api cxgb4_create_server and cxgb4_create_server6 for initiating passive open connections. Similarly use cxgb4_remove_server to remove the passive open connections in place of listen_stop. Add support for iWARP over VLAN device and enable IPv6 support on VLAN device. Make use of import_ep in c4iw_reconnect. Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fix build when IPv6 is disabled and make sure iw_cxgb4 is not built-in when ipv6 is a module. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/core: Fixes to XRC reference counting in uverbsYishai Hadas
Added reference counting mechanism for XRC target QPs between ib_uqp_object and its ib_uxrcd_object. This prevents closing an XRC domain that is still attached to a QP. In addition, add missing code in ib_uverbs_destroy_srq() to handle ib_uxrcd_object reference counting correctly when destroying an xsrq. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/core: Add locking around event dispatching on XRC target QPsYishai Hadas
Fix a potential race when event occurrs on a target XRC QP and in the middle of reporting that on its shared qps, one of them is destroyed by user space application. Also add note for kernel consumers in ib_verbs.h that they must not destroy the QP from within the handler. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/qib: Clean up unnecessary MSI/MSI-X capability findYijing Wang
PCI core will initialize device MSI/MSI-X capability in pci_msi_init_pci_dev(). So device drivers should use pci_dev->msi_cap/msix_cap to determine whether a device supports MSI/MSI-X instead of using pci_find_capability(pci_dev, PCI_CAP_ID_MSI/MSIX). Access to PCIe device config space again will consume more time. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/qib: Make qib_driver staticPaul Bolle
struct pci_driver qib_driver is only used in qib_init.c. Remove it from qib.h and make it static in qib_init.c. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13IB/qib: Improve SDMA performanceCQ Tang
1. The code accepts chunks of messages, and splits the chunk into packets when converting packets into sdma queue entries. Adjacent packets will use user buffer pages smartly to avoid pinning the same page multiple times. 2. Instead of discarding all the work when SDMA queue is full, the work is saved in a pending queue. Whenever there are enough SDMA queue free entries, pending queue is directly put onto SDMA queue. 3. An interrupt handler is used to progress this pending queue. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: CQ Tang <cq.tang@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> [ Fixed up sparse warnings. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/cma: Add IPv6 support for iWARPSteve Wise
Modify the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. Change the references of local_addr and remote_addr in cxgb4, cxgb3, nes and amso drivers to match this. However to be able to actully run traffic over IPv6, low-level drivers have to add code to support this. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> [ Fix unused variable warnings when INFINIBAND_NES_DEBUG not set. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Cache recv DB until QP moved to RTRNaresh Gottumukkala
1) In post recv, don't ring the DB doorbell if the QP is in RTR state. Cache the DB calls, until the QP is moved to RTS state. 2) Add max_rd_sge support to dev->attr. 3) Code cleanup in alloc_pd path. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Remove __packedNaresh Gottumukkala
1) Remove __packed for structures. 2) Align and pad all ABI structure to 64 bit boundaries instead of using __packed. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Remove driver QP state machineNaresh Gottumukkala
Remove QP state machine in ocrdma low-level driver and use on the core IB stack's instead. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Don't allow zero/invalid sgid usageNaresh Gottumukkala
Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Remove redundant dev referenceNaresh Gottumukkala
Remove redundant dev reference from structures: 1) ocrdma_cq. 2) ocrdma_ah. 3) ocrdma_hw_mr. 4) ocrdma_mw. 5) ocrdma_srq. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12RDMA/ocrdma: Style and redundant code cleanupNaresh Gottumukkala
Code cleanup and remove redundant code: 1) redundant initialization removed 2) braces changed as per CodingStyle. 3) redundant checks removed 4) extra braces in return statements removed. 5) removed unused pd pointer from mr. 6) reorganized get_dma_mr() 7) fixed set_av() to return error on invalid sgid index. 8) reference to ocrdma_dev removed from struct ocrdma_pd. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-10IB/iser: Introduce fast memory registration model (FRWR)Sagi Grimberg
Newer HCAs and Virtual functions may not support FMRs but rather a fast registration model, which we call FRWR - "Fast Registration Work Requests". This model was introduced in 00f7ec36c ("RDMA/core: Add memory management extensions support") and works when the IB device supports the IB_DEVICE_MEM_MGT_EXTENSIONS capability. Upon creating the iser device iser will test whether the HCA supports FMRs. If no support for FMRs, check if IB_DEVICE_MEM_MGT_EXTENSIONS is supported and assign function pointers that handle fast registration and allocation of appropriate resources (fast_reg descriptors). Registration is done using posting IB_WR_FAST_REG_MR to the QP and invalidations using posting IB_WR_LOCAL_INV. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-10IB/iser: Place the fmr pool into a union in iser's IB conn structSagi Grimberg
This is preparation step for other memory registration methods to be added. In addition, change reg/unreg routines signature to indicate they use FMRs. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-10IB/iser: Handle unaligned SG in separate functionSagi Grimberg
This routine will be shared with other rdma management schemes. The bounce buffer solution for unaligned SG-lists and the sg_to_page_vec routine are likely to be used for other registration schemes and not just FMR. Move them out of the FMR specific code, and call them from there. Later they will be called also from other reg methods code. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-10IB/iser: Generalize rdma memory registrationSagi Grimberg
Currently the driver uses FMRs as the only means to register the memory pointed by SG provided by the SCSI mid-layer with the RDMA device. As preparation step for adding more methods for fast path memory registration, make the alloc/free and reg/unreg calls function pointers, which are for now just set to the existing FMR ones. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>