summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core/cma.c
AgeCommit message (Collapse)Author
2015-10-30IB/core, cma: Make __attribute_const__ declarations sparse-friendlyBart Van Assche
Move the __attribute_const__ declarations such that sparse understands that these apply to the function itself and not to the return type. This avoids that sparse reports error messages like the following: drivers/infiniband/core/verbs.c:73:12: error: symbol 'ib_event_msg' redeclared with different type (originally declared at include/rdma/ib_verbs.h:470) - different modifiers Fixes: 2b1b5b601230 ("IB/core, cma: Nice log-friendly string helpers") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/cma: Add support for network namespacesGuy Shapiro
Add support for network namespaces in the ib_cma module. This is accomplished by: 1. Adding network namespace parameter for rdma_create_id. This parameter is used to populate the network namespace field in rdma_id_private. rdma_create_id keeps a reference on the network namespace. 2. Using the network namespace from the rdma_id instead of init_net inside of ib_cma, when listening on an ID and when looking for an ID for an incoming request. 3. Decrementing the reference count for the appropriate network namespace when calling rdma_destroy_id. In order to preserve the current behavior init_net is passed when calling from other modules. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/cma: Separate port allocation to network namespacesHaggai Eran
Keep a struct for each network namespace containing the IDRs for the RDMA CM port spaces. The struct is created dynamically using the generic_net mechanism. This patch is internal infrastructure work for the following patches. In this patch, init_net is statically used as the network namespace for the new port-space API. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/addr: Pass network namespace as a parameterGuy Shapiro
Add network namespace support to the ib_addr module. For that, all the address resolution and matching should be done using the appropriate namespace instead of init_net. This is achieved by: 1. Adding an explicit network namespace argument to exported function that require a namespace. 2. Saving the namespace in the rdma_addr_client structure. 3. Using it when calling networking functions. In order to preserve the behavior of calling modules, &init_net is passed as the parameter in calls from other modules. This is modified as namespace support is added on more levels. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22IB/core: Remove smac and vlan id from path recordMatan Barak
The GID cache accompanies every GID with attributes. The GID attributes link the GID with its netdevice, which could be resolved to smac and vlan id easily. Since we've added the netdevice (ifindex and net) to the path record, storing the L2 attributes is duplicated data and hence these attributes are removed. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22IB/cm: Remove the usage of smac and vid of qp_attr and cm_avMatan Barak
The cm and cma don't need to explicitly handle vlan and smac, as they are resolved from the GID index now. Removing this portion of code. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22IB/cma: cma_validate_port should verify the port and netdeviceMatan Barak
Previously, cma_validate_port searched for GIDs in IB cache and then tried to verify the found port. This could fail when there are identical GIDs on both ports. In addition, netdevice should be taken into account when searching the GID table. Fixing cma_validate_port to search only the relevant port's cache and netdevice. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22IB/core: Add netdev and gid attributes paramteres to cacheMatan Barak
Adding an ability to query the IB cache by a netdev and get the attributes of a GID. These parameters are necessary in order to successfully resolve the required GID (when the netdevice is known) and get the Ethernet L2 attributes from a GID. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/cma: Use inner P_Key to determine netdevHaggai Eran
When discussing the patches to demux ids in rdma_cm instead of ib_cm, it was decided that it is best to use the P_Key value in the packet headers. However, the mlx5 and ipath drivers are currently unable to send correct P_Key values in GMP headers. They always send using a single P_Key that is set during the GSI QP initialization. Change the rdma_cm code to look at the P_Key value that is part of the packet payload as a workaround. Once the drivers are fixed this patch can be reverted. Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/cma: Potential NULL dereference in cma_id_from_eventHaggai Eran
If the lookup of a listening ID failed for an AF_IB request, the code would try to call dev_put() on a NULL net_dev. Fixes: be688195bd08 ("IB/cma: Fix net_dev reference leak with failed requests") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-06IB/cma: Accept connection without a valid netdev on RoCEHaggai Eran
The netdev checks recently added to RDMA CM expect a valid netdev to be found for both InfiniBand and RoCE, but the code that find a netdev is only implemented for InfiniBand. Currently RoCE doesn't provide an API to find the netdev matching a given set of parameters, so this patch just disables the netdev enforcement for each incoming connections when the link layer is RoCE. Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Reported-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Fix net_dev reference leak with failed requestsHaggai Eran
When no matching listening ID is found for a given request, the net_dev that was used to find the request isn't released. Fixes: 0b3ca768fcb0 ("IB/cma: Use found net_dev for passive connections") Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Share ib_cm_ids between rdma_cm_idsHaggai Eran
Use ib_cm_insert_listen to create listening IB CM IDs or share existing ones if needed. When given a request on a specific CM ID, the code now matches the request to the RDMA CM ID based on the request parameters, so it no longer needs to rely on the ib_cm's private data matching capabilities. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Use found net_dev for passive connectionsHaggai Eran
When receiving a new connection in cma_req_handler, we actually already know the net_dev that is used for the connection's creation. Instead of calling cma_translate_addr to resolve the new connection id's source address, just use the net_dev that was found. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Validate routing of incoming requestsHaggai Eran
Pass incoming request parameters through the relevant IPv4/IPv6 routing tables and make sure the network stack is configured to handle such requests. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Add net_dev and private data checks to RDMA CMHaggai Eran
Instead of relying on a the ib_cm module to check an incoming CM request's private data header, add these checks to the RDMA CM module. This allows a following patch to to clean up the ib_cm interface and remove the code that looks into the private headers. It will also allow supporting namespaces in RDMA CM by making these checks namespace aware later on. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Helper functions to access port space IDRsHaggai Eran
Add helper functions to access the IDRs by port-space and port number. Pass around the port-space enum in cma.c instead of using pointers to port-space IDRs. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Refactor RDMA IP CM private-data parsing codeHaggai Eran
When receiving a connection request, rdma_cm needs to associate the request with a network device, in order to disambiguate requests. To do this, it needs to know the request's destination IP. For this the module needs to allow getting this information from the private data in the request packet, instead of relying on the information already being in the listening RDMA CM ID. When creating a new incoming connection ID, the code in cma_save_ip{4,6}_info can no longer rely on the listener's private data to find the port number, so it reads it from the requested service ID. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: lock client data with lists_rwsemHaggai Eran
An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will remove the device from the device list with lists_rwsem locked for write, but perform the rest of the cleanup, including the call to remove() without that lock. Mark client data that is undergoing de-registration with a new going_down flag in the client data context. Lock the client data list with lists_rwsem for write in addition to using the spinlock, so that functions calling the callback would be able to lock only lists_rwsem for read and let callbacks sleep. Since ib_unregister_client() now marks the client data context, no need for remove() to search the context again, so pass the client data directly to remove() callbacks. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-29RDMA/cma: fix IPv6 address resolutionSpencer Baugh
Resolving a link-local IPv6 address with an unspecified source address was broken by commit 5462eddd7a, which prevented the IPv6 stack from learning the scope id of the link-local IPv6 address, causing random failures as the IP stack chose a random link to resolve the address on. This commit 5462eddd7a made us bail out of cma_check_linklocal early if the address passed in was not an IPv6 link-local address. On the address resolution path, the address passed in is the source address; if the source address is the unspecified address, which is not link-local, we will bail out early. This is mostly correct, but if the destination address is a link-local address, then we will be following a link-local route, and we'll need to tell the IPv6 stack what the scope id of the destination address is. This used to be done by last line of cma_check_linklocal, which is skipped when bailing out early: dev_addr->bound_dev_if = sin6->sin6_scope_id; (In cma_bind_addr, the sin6_scope_id of the source address is set to the sin6_scope_id of the destination address, so this is correct) This line is required in turn for the following line, L279 of addr6_resolve, to actually inform the IPv6 stack of the scope id: fl6.flowi6_oif = addr->bound_dev_if; Since we can only know we are in this failure case when we have access to both the source IPv6 address and destination IPv6 address, we have to deal with this further up the stack. So detect this failure case in cma_bind_addr, and set bound_dev_if to the destination address scope id to correct it. Signed-off-by: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: - a large cleanup of how device capabilities are checked for various features - additional cleanups in the MAD processing - update to the srp driver - creation and use of centralized log message helpers - add const to a number of args to calls and clean up call chain - add support for extended cq create verb - add support for timestamps on cq completion - add support for processing OPA MAD packets * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits) IB/mad: Add final OPA MAD processing IB/mad: Add partial Intel OPA MAD support IB/mad: Add partial Intel OPA MAD support IB/core: Add OPA MAD core capability flag IB/mad: Add support for additional MAD info to/from drivers IB/mad: Convert allocations from kmem_cache to kzalloc IB/core: Add ability for drivers to report an alternate MAD size. IB/mad: Support alternate Base Versions when creating MADs IB/mad: Create a generic helper for DR forwarding checks IB/mad: Create a generic helper for DR SMP Recv processing IB/mad: Create a generic helper for DR SMP Send processing IB/mad: Split IB SMI handling from MAD Recv handler IB/mad cleanup: Generalize processing of MAD data IB/mad cleanup: Clean up function params -- find_mad_agent IB/mlx4: Add support for CQ time-stamping IB/mlx4: Add mmap call to map the hardware clock IB/core: Pass hardware specific data in query_device IB/core: Add timestamp_mask and hca_core_clock to query_device IB/core: Extend ib_uverbs_create_cq IB/core: Add CQ creation time-stamping flag ...
2015-06-02Merge branch 'for-4.2-misc' into k.o/for-4.2Doug Ledford
2015-06-02RDMA/iw_cm: Export tos field to iwarp providersSteve Wise
rdma-cma/iw_cm: Export tos field to iwarp providers Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-20IB/cma: Fix broken AF_IB UD supportMatthew Finlay
Support for using UD and AF_IB is currently broken. The IB_CM_SIDR_REQ_RECEIVED message is not handled properly in cma_save_net_info() and we end up falling into code that will try and process the request as ipv4/ipv6, which will end up failing. The resolution is to add a check for the SIDR_REQ and call cma_save_ib_info() with a NULL path record. Change cma_save_ib_info() to copy the src sib info from the listen_id when the path record is NULL. Reported-by: Hari Shankar <Hari.Shankar@netapp.com> Signed-off-by: Matt Finlay <matt@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-20Merge branches 'bart-srp', 'generic-errors', 'ira-cleanups' and 'mwang-v8' ↵Doug Ledford
into k.o/for-4.2
2015-05-20IB/core: Change rdma_protocol_iboe to roceIra Weiny
After discussion upstream, it was agreed to transition the usage of iboe in the kernel to roce. This keeps our terminology consistent with what was finalized in the IBTA Annex 16 and IBTA Annex 17 publications. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/core, cma: Nice log-friendly string helpersSagi Grimberg
Some of us keep revisiting the code to decode enumerations that appear in out logs. Let's borrow the nice logging helpers that exists in xprtrdma and rds for CMA events, IB events and WC statuses. Reviewd-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_eth_ah()Michael Wang
Introduce helper rdma_cap_eth_ah() to help us check if the port of an IB device support Ethernet Address Handler. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_af_ib()Michael Wang
Introduce helper rdma_cap_af_ib() to help us check if the port of an IB device support Native Infiniband Address. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_ib_mcast()Michael Wang
Introduce helper rdma_cap_ib_mcast() to help us check if the port of an IB device support Infiniband Multicast. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_ib_sa()Michael Wang
Introduce helper rdma_cap_ib_sa() to help us check if the port of an IB device support Infiniband Subnet Administration. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_iw_cm()Michael Wang
Introduce helper rdma_cap_iw_cm() to help us check if the port of an IB device support IWARP Communication Manager. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Use management helper rdma_cap_ib_cm()Michael Wang
Introduce helper rdma_cap_ib_cm() to help us check if the port of an IB device support Infiniband Communication Manager. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Reform rest part in IB-core cmaMichael Wang
Use raw management helpers to reform rest part in IB-core cma. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Reform cma_acquire_dev()Michael Wang
Reform cma_acquire_dev() with management helpers, introduce cma_validate_port() to make the code more clean. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Reform mcast related part in IB-core cmaMichael Wang
Use raw management helpers to reform mcast related part in IB-core cma. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Reform route related part in IB-core cmaMichael Wang
Use raw management helpers to reform route related part in IB-core cma. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/Verbs: Reform cm related part in IB-core cma/ucmMichael Wang
Use raw management helpers to reform cm related part in IB-core cma/ucm. Few checks focus on the device cm type rather than the port capability, directly pass port 1 works currently, but can't support mixing cm type device in future. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05RDMA/CMA: Canonize IPv4 on IPV6 sockets properlyJason Gunthorpe
When accepting a new IPv4 connect to an IPv6 socket, the CMA tries to canonize the address family to IPv4, but does not properly process the listening sockaddr to get the listening port, and does not properly set the address family of the canonized sockaddr. Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager") Cc: <stable@vger.kernel.org> Reported-By: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2014-06-10RDMA/core: Add support for iWARP Port Mapper user space serviceTatyana Nikolova
This patch adds iWARP Port Mapper (IWPM) Version 2 support. The iWARP Port Mapper implementation is based on the port mapper specification section in the Sockets Direct Protocol paper - http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf Existing iWARP RDMA providers use the same IP address as the native TCP/IP stack when creating RDMA connections. They need a mechanism to claim the TCP ports used for RDMA connections to prevent TCP port collisions when other host applications use TCP ports. The iWARP Port Mapper provides a standard mechanism to accomplish this. Without this service it is possible for RDMA application to bind/listen on the same port which is already being used by native TCP host application. If that happens the incoming TCP connection data can be passed to the RDMA stack with error. The iWARP Port Mapper solution doesn't contain any changes to the existing network stack in the kernel space. All the changes are contained with the infiniband tree and also in user space. The iWARP Port Mapper service is implemented as a user space daemon process. Source for the IWPM service is located at http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary The iWARP driver (port mapper client) sends to the IWPM service the local IP address and TCP port it has received from the RDMA application, when starting a connection. The IWPM service performs a socket bind from user space to get an available TCP port, called a mapped port, and communicates it back to the client. In that sense, the IWPM service is used to map the TCP port, which the RDMA application uses to any port available from the host TCP port space. The mapped ports are used in iWARP RDMA connections to avoid collisions with native TCP stack which is aware that these ports are taken. When an RDMA connection using a mapped port is terminated, the client notifies the IWPM service, which then releases the TCP port. The message exchange between the IWPM service and the iWARP drivers (between user space and kernel space) is implemented using netlink sockets. 1) Netlink interface functions are added: ibnl_unicast() and ibnl_mulitcast() for sending netlink messages to user space 2) The signature of the existing ibnl_put_msg() is changed to be more generic 3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW corresponding to the two iWarp drivers - nes and cxgb4 which use the IWPM service 4) Enums are added to enumerate the attributes in the netlink messages, which are exchanged between the user space IWPM service and the iWARP drivers Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com> [ Fold in range checking fixes and nlh_next removal as suggested by Dan Carpenter and Steve Wise. Fix sparse endianness in hash. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-01IB/core: Don't resolve passive side RoCE L2 address in CMA REQ handlerMoni Shoua
The code that resolves the passive side source MAC within the rdma_cm connection request handler was both redundant and buggy, so remove it. It was redundant since later, when an RC QP is modified to RTR state, the resolution will take place in the ib_core module. It was buggy because this callback also deals with UD SIDR exchange, for which we incorrectly looked at the REQ member of the CM event and dereferenced a random value. Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-23Merge branch 'ip-roce' into for-nextRoland Dreier
Conflicts: drivers/infiniband/hw/mlx4/main.c
2014-01-23RDMA/cma: Handle global/non-linklocal IPv6 addresses in cma_check_linklocal()Somnath Kotur
If addr is not a linklocal address, the code incorrectly fails to return and ends up assigning the scope ID to the scope id of the address, which is wrong. Fix by checking if it's a link local address first, and immediately return 0 if not. Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-18IB/cma: IBoE (RoCE) IP-based GID addressingMoni Shoua
Currently, the IB core and specifically the RDMA-CM assumes that IBoE (RoCE) gids encode related Ethernet netdevice interface MAC address and possibly VLAN id. Change GIDs to be treated as they encode interface IP address. Since Ethernet layer 2 address parameters are not longer encoded within gids, we have to extend the Infiniband address structures (e.g. ib_ah_attr) with layer 2 address parameters, namely mac and vlan. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-14net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane
This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14IB/core: Ethernet L2 attributes in verbs/cm structuresMatan Barak
This patch add the support for Ethernet L2 attributes in the verbs/cm/cma structures. When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority in a similar manner that the IB L2 (and the L4 PKEY) attributes are used. Thus, those attributes were added to the following structures: * ib_ah_attr - added dmac * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority) * ib_wc - added smac, vlan_id * ib_sa_path_rec - added smac, dmac, vlan_id * cm_av - added smac and vlan_id For the path record structure, extra care was taken to avoid the new fields when packing it into wire format, so we don't break the IB CM and SA wire protocol. On the active side, the CM fills. its internal structures from the path provided by the ULP. We add there taking the ETH L2 attributes and placing them into the CM Address Handle (struct cm_av). On the passive side, the CM fills its internal structures from the WC associated with the REQ message. We add there taking the ETH L2 attributes from the WC. When the HW driver provides the required ETH L2 attributes in the WC, they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core code checks for the presence of these flags, and in their absence does address resolution from the ib_init_ah_from_wc() helper function. ib_modify_qp_is_ok is also updated to consider the link layer. Some parameters are mandatory for Ethernet link layer, while they are irrelevant for IB. Vendor drivers are modified to support the new function signature. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-18Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: - Re-enable flow steering verbs with new improved userspace ABI - Fixes for slow connection due to GID lookup scalability - IPoIB fixes - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib - Further improvements to SRP error handling - Add new transport type for Cisco usNIC * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits) IB/core: Re-enable create_flow/destroy_flow uverbs IB/core: extended command: an improved infrastructure for uverbs commands IB/core: Remove ib_uverbs_flow_spec structure from userspace IB/core: Use a common header for uverbs flow_specs IB/core: Make uverbs flow structure use names like verbs ones IB/core: Rename 'flow' structs to match other uverbs structs IB/core: clarify overflow/underflow checks on ib_create/destroy_flow IB/ucma: Convert use of typedef ctl_table to struct ctl_table IB/cm: Convert to using idr_alloc_cyclic() IB/mlx5: Fix page shift in create CQ for userspace IB/mlx4: Fix device max capabilities check IB/mlx5: Fix list_del of empty list IB/mlx5: Remove dead code IB/core: Encorce MR access rights rules on kernel consumers IB/mlx4: Fix endless loop in resize CQ RDMA/cma: Remove unused argument and minor dead code RDMA/ucma: Discard events for IDs not yet claimed by user space IB/core: Add Cisco usNIC rdma node and transport types RDMA/nes: Remove self-assignment from nes_query_qp() IB/srp: Report receive errors correctly ...
2013-11-11RDMA/cma: Remove unused argument and minor dead codeMichal Nazarewicz
The dev variable is never assigned after being initialised. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/cma: Check for GID on listening device firstDoug Ledford
As a simple optimization that should speed up the vast majority of connect attemps on IB devices, when we are searching for the GID of an incoming connection in the cached GID lists of devices, search the device that received the incoming connection request first. If we don't find it there, then move on to other devices. This reduces the time to perform 10,000 connections considerably. Prior to this patch, a bad run of cmtime would look like this: connect : 12399.26 12351.10 8609.00 1239.93 With this patch, it looks more like this: connect : 5864.86 5799.80 8876.00 586.49 Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>