Age | Commit message (Collapse) | Author |
|
Fixes in_atomic stack-dump, when Mellanox module is loaded into the RT
Kernel.
Michael S. Tsirkin <mst@dev.mellanox.co.il> sayeth:
"Basically, if you just make spin_lock_irqsave (and spin_lock_irq) not disable
interrupts for non-raw spinlocks, I think all of infiniband will be fine without
changes."
Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
Fixes in_atomic stack-dump, when Mellanox module is loaded into the RT
Kernel.
Michael S. Tsirkin <mst@dev.mellanox.co.il> sayeth:
"Basically, if you just make spin_lock_irqsave (and spin_lock_irq) not disable
interrupts for non-raw spinlocks, I think all of infiniband will be fine without
changes."
Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
Fixes in_atomic stack-dump, when Mellanox module is loaded into the RT
Kernel.
Michael S. Tsirkin <mst@dev.mellanox.co.il> sayeth:
"Basically, if you just make spin_lock_irqsave (and spin_lock_irq) not disable
interrupts for non-raw spinlocks, I think all of infiniband will be fine without
changes."
Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
commit b6b87a1df604678ed1be40158080db012a99ccca upstream.
This patch fixes the incorrect setting of ->post_send_buf_count
related to RDMA WRITEs + READs where isert_rdma_rw->send_wr_num
was not being taken into account.
This includes incrementing ->post_send_buf_count within
isert_put_datain() + isert_get_dataout(), decrementing within
__isert_send_completion() + isert_response_completion(), and
clearing wr->send_wr_num within isert_completion_rdma_read()
This is necessary because even though IB_SEND_SIGNALED is
not set for RDMA WRITEs + READs, during a QP failure event
the work requests will be returned with exception status
from the TX completion queue.
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 9bb4ca68fc48eea618b1af1c1cde2a251ae32d1b upstream.
This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related
work requests to include a ISER_FRWR_LI_WRID value in order to
signal isert_cq_tx_work() that these requests should be ignored.
This is necessary because even though IB_SEND_SIGNALED is not
set for either work request, during a QP failure event the work
requests will be returned with exception status from the TX
completion queue.
v2 changes:
- Rename ISER_FRWR_LI_WRID -> ISER_FASTREG_LI_WRID (Sagi)
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit defd884845297fd5690594bfe89656b01f16d87e upstream.
This patch addresses a couple of different hug shutdown issues
related to wait_event() + isert_conn->state. First, it changes
isert_conn->conn_wait + isert_conn->conn_wait_comp_err from
waitqueues to completions, and sets ISER_CONN_TERMINATING from
within isert_disconnect_work().
Second, it splits isert_free_conn() into isert_wait_conn() that
is called earlier in iscsit_close_connection() to ensure that
all outstanding commands have completed before continuing.
Finally, it breaks isert_cq_comp_err() into seperate TX / RX
related code, and adds logic in isert_cq_rx_comp_err() to wait
for outstanding commands to complete before setting ISER_CONN_DOWN
and calling complete(&isert_conn->conn_wait_comp_err).
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 5159d763f60af693a3fcec45dce2021f66e528a4 upstream.
There are a handful of uses of list_empty() for cmd->i_conn_node
within iser-target code that expect to return false once a cmd
has been removed from the per connect list.
This patch changes all uses of list_del -> list_del_init in order
to ensure that list_empty() returns false as expected.
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 2853c2b6671509591be09213954d7249ca6ff224 upstream.
This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().
This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 94a7111043d99819cd0a72d9b3174c7054adb2a0 upstream.
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 04d9cd1224e5bc9d6146bab2866cdc81deb9b509 upstream.
This patch avoids a duplicate iscsit_increment_maxcmdsn() call for
ISER_IB_RDMA_WRITE within isert_map_rdma() + isert_reg_rdma_frwr(),
which will already be occuring once during isert_put_datain() ->
iscsit_build_rsp_pdu() operation.
It also removes the local conn->stat_sn assignment + increment,
and changes the third parameter to iscsit_build_rsp_pdu() to
signal this should be done by iscsi_target_mode code.
Tested-by: Moussa Ba <moussaba@micron.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f01b9f73392b48c6cda7c2c66594c73137c776da upstream.
This patch changes isert_reg_rdma_frwr() to not use FRMR for single
dma entry requests from small I/Os, in order to avoid the associated
memory registration overhead.
Using DMA MR is sufficient here for the single dma entry requests,
and addresses a >= v3.12 performance regression.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cd4e38542a5c2cab94e5410fb17c1cc004a60792 upstream.
The IB spec does not guarantee that the opcode is available in error
completions. Hence do not rely on it. See also commit 948d1e889e5b
("IB/srp: Introduce srp_handle_qp_err()").
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 99b6697a50c2acbe3ca2772d359fc9a28835dc84 upstream.
If SCSI commands are submitted with a SCSI request timeout that is
lower than the the IB RC timeout, it can happen that the SCSI error
handler has already started device recovery before transport layer
error handling starts. So it can happen that the SCSI error handler
tries to abort a SCSI command after it has been reset by
srp_rport_reconnect().
Tell the SCSI error handler that such commands have finished and that
it is not necessary to continue its recovery strategy for commands
that have been reset by srp_rport_reconnect().
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 65d7dd2f3479ef5aec1d9ddd1481cb7851c11af6 upstream.
Remove an SRP target from the SRP target list before invoking the last
scsi_host_put() call. This change is necessary because that last put
frees the memory that holds the srp_target_port structure.
This patch prevents the following kernel oops:
RIP: 0010:[<ffffffff810b00d0>] __lock_acquire+0x500/0x1570
Call Trace:
[<ffffffff810b11e4>] lock_acquire+0xa4/0x120
[<ffffffff81531206>] _spin_lock+0x36/0x70
[<ffffffffa01b6d8f>] srp_remove_work+0xef/0x180 [ib_srp]
[<ffffffff8109125c>] worker_thread+0x21c/0x3d0
[<ffffffff81096e86>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ bvanassche - Modified path description and CC'ed stable. ]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This patch changes isert_connect_release() to correctly check for
the existence struct isert_device *device before checking for
isert_device->use_frwr.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
The SRP specification requires:
"Response data shall be provided in any SRP_RSP response that is sent in
response to an SRP_TSK_MGMT request (see 6.7). The information in the
RSP_CODE field (see table 24) shall indicate the completion status of
the task management function."
So fix this to avoid the SRP initiator interprets task management functions
that succeeded as failed.
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Cc: stable@vger.kernel.org # 3.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch fixes a bug where ib_destroy_cm_id() was incorrectly being called
after srpt_destroy_ch_ib() had destroyed the active QP.
This would result in the following failed SRP_LOGIN_REQ messages:
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff1762bd, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c903009f8f41)
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff1758f9, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 2 (guid=0xfe80000000000000:0x2c903009f8f42)
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff175941, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 2 (guid=0xfe80000000000000:0x2c90300a3cfb2)
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff176299, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c90300a3cfb1)
mlx4_core 0000:84:00.0: command 0x19 failed: fw status = 0x9
rejected SRP_LOGIN_REQ because creating a new RDMA channel failed.
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff176299, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c90300a3cfb1)
mlx4_core 0000:84:00.0: command 0x19 failed: fw status = 0x9
rejected SRP_LOGIN_REQ because creating a new RDMA channel failed.
Received SRP_LOGIN_REQ with i_port_id 0x0:0x2590ffff176299, t_port_id 0x2c903009f8f40:0x2c903009f8f40 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c90300a3cfb1)
Reported-by: Navin Ahuja <navin.ahuja@saratoga-speed.com>
Cc: stable@vger.kernel.org # 3.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"Lots of activity again this round for I/O performance optimizations
(per-cpu IDA pre-allocation for vhost + iscsi/target), and the
addition of new fabric independent features to target-core
(COMPARE_AND_WRITE + EXTENDED_COPY).
The main highlights include:
- Support for iscsi-target login multiplexing across individual
network portals
- Generic Per-cpu IDA logic (kent + akpm + clameter)
- Conversion of vhost to use per-cpu IDA pre-allocation for
descriptors, SGLs and userspace page pointer list
- Conversion of iscsi-target + iser-target to use per-cpu IDA
pre-allocation for descriptors
- Add support for generic COMPARE_AND_WRITE (AtomicTestandSet)
emulation for virtual backend drivers
- Add support for generic EXTENDED_COPY (CopyOffload) emulation for
virtual backend drivers.
- Add support for fast memory registration mode to iser-target (Vu)
The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of
particular significance, which make us the first and only open source
target to support the full set of VAAI primitives.
Currently Linux clients are lacking upstream support to actually
utilize these primitives. However, with server side support now in
place for folks like MKP + ZAB working on the client, this logic once
reserved for the highest end of storage arrays, can now be run in VMs
on their laptops"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits)
target/iscsi: Bump versions to v4.1.0
target: Update copyright ownership/year information to 2013
iscsi-target: Bump default TCP listen backlog to 256
target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out
iscsi-target; Bump default CmdSN Depth to 64
iscsi-target: Remove unnecessary wait_for_completion in iscsi_get_thread_set
iscsi-target: Add thread_set->ts_activate_sem + use common deallocate
iscsi-target: Fix race with thread_pre_handler flush_signals + ISCSI_THREAD_SET_DIE
target: remove unused including <linux/version.h>
iser-target: introduce fast memory registration mode (FRWR)
iser-target: generalize rdma memory registration and cleanup
iser-target: move rdma wr processing to a shared function
target: Enable global EXTENDED_COPY setup/release
target: Add Third Party Copy (3PC) bit in INQUIRY response
target: Enable EXTENDED_COPY setup in spc_parse_cdb
target: Add support for EXTENDED_COPY copy offload emulation
target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check
target: Add global device list for EXTENDED_COPY
target: Make helpers non static for EXTENDED_COPY command setup
target: Make spc_parse_naa_6h_vendor_specific non static
...
|
|
Update copyright ownership/year information for target-core,
loopback, iscsi-target, tcm_qla2xx, vhost and iser-target.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This model was introduced in 00f7ec36c "RDMA/core: Add memory
management extensions support" and works when the IB device
supports the IB_DEVICE_MEM_MGT_EXTENSIONS capability.
Upon creating the isert device, ib_isert will test whether the HCA
supports FRWR. If supported then set the flag and assign
function pointers that handle fast registration and deregistration
of appropriate resources (fast_reg descriptors).
When new connection coming in, ib_isert will check frwr flag and
create frwr resouces, if fail to do it will switch back to
old model of using global dma key and turn off the frwr support.
Registration is done using posting IB_WR_FAST_REG_MR to the QP and
invalidations using posting IB_WR_LOCAL_INV.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Current driver uses global dma key to register the memory pointed
by sg list provided by the target core.
This is the preparation step for adding more methods like fast
path memory registration, make the reg/unreg calls be function
pointers.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
isert_put_datain() and isert_get_dataout() share a lot of code
in rdma wr processing, move this common code to a shared function.
Use isert_unmap_cmd to cleanup for RDMA_READ completion.
Remove duplicate field in isert_cmd and isert_rdma_wr structs
Change misc debug messages to track isert_cmd
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This command converts iscsi/isert-target to use allocations based on
iscsit_transport->priv_size within iscsit_allocate_cmd(), instead of
using an embedded isert_cmd->iscsi_cmd.
This includes removing iscsit_transport->alloc_cmd() usage, along
with updating isert-target code to use iscsit_priv_cmd().
Also, remove left-over iscsit_transport->release_cmd() usage for
direct calls to iscsit_release_cmd(), and drop the now unused
lio_cmd_cache and isert_cmd_cache.
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
|
|
This patch updates iser-target code to support login negotiation
multi-plexing. This includes only using isert_conn->conn_login_comp
for the first login request PDU, pushing the subsequent processing
to iscsi_conn->login_work -> iscsi_target_do_login_rx(), and turning
isert_get_login_rx() into a NOP.
v3 changes:
- Drop unnecessary LOGIN_FLAGS_READ_ACTIVE bit set in
isert_rx_login_req()
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull main batch of InfiniBand/RDMA changes from Roland Dreier:
- Large ocrdma HW driver update: add "fast register" work requests,
fixes, cleanups
- Add receive flow steering support for raw QPs
- Fix IPoIB neighbour race that leads to crash
- iSER updates including support for using "fast register" memory
registration
- IPv6 support for iWARP
- XRC transport fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (54 commits)
RDMA/ocrdma: Fix compiler warning about int/pointer size mismatch
IB/iser: Fix redundant pointer check in dealloc flow
IB/iser: Fix possible memory leak in iser_create_frwr_pool()
IB/qib: Move COUNTER_MASK definition within qib_mad.h header guards
RDMA/ocrdma: Fix passing wrong opcode to modify_srq
RDMA/ocrdma: Fill PVID in UMC case
RDMA/ocrdma: Add ABI versioning support
RDMA/ocrdma: Consider multiple SGES in case of DPP
RDMA/ocrdma: Fix for displaying proper link speed
RDMA/ocrdma: Increase STAG array size
RDMA/ocrdma: Dont use PD 0 for userpace CQ DB
RDMA/ocrdma: FRMA code cleanup
RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24
RDMA/ocrdma: Fix to work with even a single MSI-X vector
RDMA/ocrdma: Remove the MTU check based on Ethernet MTU
RDMA/ocrdma: Add support for fast register work requests (FRWR)
RDMA/ocrdma: Create IRD queue fix
IB/core: Better checking of userspace values for receive flow steering
IB/mlx4: Add receive flow steering support
IB/core: Export ib_create/destroy_flow through uverbs
...
|
|
'qib' into for-next
|
|
This bug was discovered by Smatch static checker run by Dan Carpenter.
If in free_rx_descriptors(), rx_descs are not NULL then the iser
device is definately not NULL, so no need to check it before
dereferencing it.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Fix leak where desc is not being freed in error flows.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
To run discovery over iSER we need to advertize the CAP_TEXT_NEGO capability
towards user space. Also need to make sure the login RX buffer is posted when
SendTargets TEXT PDUs are sent. For that end, we use a setting of the
ISCSI_PARAM_DISCOVERY_SESS iscsi param as an indication that this is
discovery session.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
In several places, this snippet is used when removing neigh entries:
list_del(&neigh->list);
ipoib_neigh_free(neigh);
The list_del() removes neigh from the associated struct ipoib_path, while
ipoib_neigh_free() removes neigh from the device's neigh entry lookup
table. Both of these operations are protected by the priv->lock
spinlock. The table however is also protected via RCU, and so naturally
the lock is not held when doing reads.
This leads to a race condition, in which a thread may successfully look
up a neigh entry that has already been deleted from neigh->list. Since
the previous deletion will have marked the entry with poison, a second
list_del() on the object will cause a panic:
#5 [ffff8802338c3c70] general_protection at ffffffff815108c5
[exception RIP: list_del+16]
RIP: ffffffff81289020 RSP: ffff8802338c3d20 RFLAGS: 00010082
RAX: dead000000200200 RBX: ffff880433e60c88 RCX: 0000000000009e6c
RDX: 0000000000000246 RSI: ffff8806012ca298 RDI: ffff880433e60c88
RBP: ffff8802338c3d30 R8: ffff8806012ca2e8 R9: 00000000ffffffff
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8804346b2020
R13: ffff88032a3e7540 R14: ffff8804346b26e0 R15: 0000000000000246
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib]
#7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm]
#8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm]
#9 [ffff8802338c3e38] worker_thread at ffffffff81090e10
#10 [ffff8802338c3ee8] kthread at ffffffff81096c66
#11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca
We move the list_del() into ipoib_neigh_free(), so that deletion happens
only once, after the entry has been successfully removed from the lookup
table. This same behavior is already used in ipoib_del_neighs_by_gid()
and __ipoib_reap_neigh().
Signed-off-by: Jim Foraker <foraker1@llnl.gov>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Newer HCAs and Virtual functions may not support FMRs but rather a fast
registration model, which we call FRWR - "Fast Registration Work Requests".
This model was introduced in 00f7ec36c ("RDMA/core: Add memory management
extensions support") and works when the IB device supports the
IB_DEVICE_MEM_MGT_EXTENSIONS capability.
Upon creating the iser device iser will test whether the HCA supports
FMRs. If no support for FMRs, check if IB_DEVICE_MEM_MGT_EXTENSIONS
is supported and assign function pointers that handle fast
registration and allocation of appropriate resources (fast_reg
descriptors).
Registration is done using posting IB_WR_FAST_REG_MR to the QP and
invalidations using posting IB_WR_LOCAL_INV.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This is preparation step for other memory registration methods to be
added. In addition, change reg/unreg routines signature to indicate
they use FMRs.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This routine will be shared with other rdma management schemes. The
bounce buffer solution for unaligned SG-lists and the sg_to_page_vec
routine are likely to be used for other registration schemes and not
just FMR.
Move them out of the FMR specific code, and call them from there.
Later they will be called also from other reg methods code.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Currently the driver uses FMRs as the only means to register the
memory pointed by SG provided by the SCSI mid-layer with the RDMA
device.
As preparation step for adding more methods for fast path memory
registration, make the alloc/free and reg/unreg calls function
pointers, which are for now just set to the existing FMR ones.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Use cmds_max passed from user space to be the number of PDUs to be
supported for the session instead of hard-coded ISCSI_DEF_XMIT_CMDS_MAX.
This allow controlling the max number of SCSI commands for the session.
Also don't ignore the qdepth passed from user space.
Derive from session->cmds_max the actual number of RX buffers and FMR
pool size to allocate during the connection bind phase.
Since the iser transport connection is established before the iscsi
session/connection are created and bound, we still use one hard-coded
quantity ISER_DEF_XMIT_CMDS_MAX to compute the maximum number of
work-requests to be supported by the RC QP used for the connection.
The above quantity is made to be a power of two between ISCSI_TOTAL_CMDS_MIN
(16) and ISER_DEF_XMIT_CMDS_MAX (512) inclusive.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This is a preparation step to a patch that accepts the number of max
SCSI commands to be supported a session from user space iSCSI tools.
Move the allocation of the login buffer, FMR pool and its associated
page vector from iser_create_ib_conn_res() (which is called prior when
we actually know how many commands should be supported) to
iser_alloc_rx_descriptors() (which is called during the iscsi
connection bind step where this quantity is known).
Also do small refactoring around the deallocation to make that path
similar to the allocation one.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Commit 4f363882612 ("IB/iser: Move informational messages from error
to info level") set info prints to be emitted at a lower debug level
than warning prints, which is a bit odd. Fix that.
Also move the prints on unaligned SG from warning to debug level.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
IPoIB's required behaviour w.r.t to the pkey used by the device is the following:
- For "parent" interfaces (e.g ib0, ib1, etc) who are created
automatically as a result of hot-plug events from the IB core, the
driver needs to take whatever pkey vlaue it finds in index 0, and
stick to that index.
- For child interfaces (e.g ib0.8001, etc) created by admin directive,
the driver needs to use and stick to the value provided during its
creation.
In SR-IOV environment its possible for the VF probe to take place
before the cloud management software provisions the suitable pkey for
the VF in the paravirtualed PKEY table index 0. When this is the case,
the VF IB stack will find in index 0 an invalide pkey, which is all
zeros.
Moreover, the cloud managment can assign the pkey value at index 0 at
any time of the guest life cycle.
The correct behavior for IPoIB to address these requirements for
parent interfaces is to use PKEY_CHANGE event as trigger to optionally
re-init the device pkey value and re-create all the relevant resources
accordingly, if the value of the pkey in index 0 has changed (from
invalid to valid or from valid value X to invalid value Y).
This patch enhances the heavy flushing code which is triggered by pkey
change event, to behave correctly for parent devices. For child
devices, the code remains the same, namely chases pkey value and not
index.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Make sure that the IB invalid pkey (0x0000 or 0x8000) isn't used for
child devices.
Also, make sure to always set the full membership bit for the pkey of
devices created by rtnl link ops.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA changes from Roland Dreier:
- AF_IB (native IB addressing) for CMA from Sean Hefty
- new mlx5 driver for Mellanox Connect-IB adapters (including post
merge request fixes)
- SRP fixes from Bart Van Assche (including fix to first merge request)
- qib HW driver updates
- resurrection of ocrdma HW driver development
- uverbs conversion to create fds with O_CLOEXEC set
- other small changes and fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
mlx5: Return -EFAULT instead of -EPERM
IB/qib: Log all SDMA errors unconditionally
IB/qib: Fix module-level leak
mlx5_core: Adjust hca_cap.uar_page_sz to conform to Connect-IB spec
IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline
IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
mlx5_core: Fixes for sparse warnings
IB/mlx5: Make profile[] static in main.c
mlx5: Fix parameter type of health_handler_t
mlx5: Add driver for Mellanox Connect-IB adapters
IB/core: Add reserved values to enums for low-level driver use
IB/srp: Bump driver version and release date
IB/srp: Make HCA completion vector configurable
IB/srp: Maintain a single connection per I_T nexus
IB/srp: Fail I/O fast if target offline
IB/srp: Skip host settle delay
IB/srp: Avoid skipping srp_reset_host() after a transport error
IB/srp: Fix remove_one crash due to resource exhaustion
IB/qib: New transmitter tunning settings for Dell 1.1 backplane
IB/core: Fix error return code in add_port()
...
|
|
If the transport layer is offline it is more appropriate to let
srp_abort() return FAST_IO_FAIL instead of SUCCESS.
Reported-by: Sebastian Riemer <sebastian.riemer@profitbricks.com>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"Lots of activity this round on performance improvements in target-core
while benchmarking the prototype scsi-mq initiator code with
vhost-scsi fabric ports, along with a number of iscsi/iser-target
improvements and hardening fixes for exception path cases post v3.10
merge.
The highlights include:
- Make persistent reservations APTPL buffer allocated on-demand, and
drop per t10_reservation buffer. (grover)
- Make virtual LUN=0 a NULLIO device, and skip allocation of NULLIO
device pages (grover)
- Add transport_cmd_check_stop write_pending bit to avoid extra
access of ->t_state_lock is WRITE I/O submission fast-path. (nab)
- Drop unnecessary CMD_T_DEV_ACTIVE check from
transport_lun_remove_cmd to avoid extra access of ->t_state_lock in
release fast-path. (nab)
- Avoid extra t_state_lock access in __target_execute_cmd fast-path
(nab)
- Drop unnecessary vhost-scsi wait_for_tasks=true usage +
->t_state_lock access in release fast-path. (nab)
- Convert vhost-scsi to use modern se_cmd->cmd_kref
TARGET_SCF_ACK_KREF usage (nab)
- Add tracepoints for SCSI commands being processed (roland)
- Refactoring of iscsi-target handling of ISCSI_OP_NOOP +
ISCSI_OP_TEXT to be transport independent (nab)
- Add iscsi-target SendTargets=$IQN support for in-band discovery
(nab)
- Add iser-target support for in-band discovery (nab + Or)
- Add iscsi-target demo-mode TPG authentication context support (nab)
- Fix isert_put_reject payload buffer post (nab)
- Fix iscsit_add_reject* usage for iser (nab)
- Fix iscsit_sequence_cmd reject handling for iser (nab)
- Fix ISCSI_OP_SCSI_TMFUNC handling for iser (nab)
- Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED (nab)
The last five iscsi/iser-target items are CC'ed to stable, as they do
address issues present in v3.10 code. They are certainly larger than
I'd like for stable patch set, but are important to ensure proper
REJECT exception handling in iser-target for 3.10.y"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
iser-target: Ignore non TEXT + LOGOUT opcodes for discovery
target: make queue_tm_rsp() return void
target: remove unused codes from enum tcm_tmrsp_table
iscsi-target: kstrtou* configfs attribute parameter cleanups
iscsi-target: Fix tfc_tpg_auth_cit configfs length overflow
iscsi-target: Fix tfc_tpg_nacl_auth_cit configfs length overflow
iser-target: Add support for ISCSI_OP_TEXT opcode + payload handling
iser-target: Rename sense_buf_[dma,len] to pdu_[dma,len]
iser-target: Add vendor_err debug output
target: Add (obsolete) checking for PMI/LBA fields in READ CAPACITY(10)
target: Return correct sense data for IO past the end of a device
target: Add tracepoints for SCSI commands being processed
iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED
iscsi-target: Fix ISCSI_OP_SCSI_TMFUNC handling for iser
iscsi-target: Fix iscsit_sequence_cmd reject handling for iser
iscsi-target: Fix iscsit_add_reject* usage for iser
iser-target: Fix isert_put_reject payload buffer post
iscsi-target: missing kfree() on error path
iscsi-target: Drop left-over iscsi_conn->bad_hdr
target: Make core_scsi3_update_and_write_aptpl return sense_reason_t
...
|
|
This patch adds a check in isert_rx_opcode() to ignore non TEXT + LOGOUT
opcodes when SessionType=Discovery has been negotiated.
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
The return value wasn't checked by any of the callers. Assuming this is
correct behaviour, we can simplify some code by not bothering to
generate it.
nab: Add srpt_queue_data_in() + srpt_queue_tm_rsp() nops around
srpt_queue_response() void return
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch adds isert_handle_text_cmd() to handle incoming
ISCSI_OP_TEXT PDU processing, along with isert_put_text_rsp()
for posting ISCSI_OP_TEXT_RSP ib_send_wr response.
It copies ISCSI_OP_TEXT payload using unsolicited payload at
&iser_rx_desc->data[0] into iscsi_cmd->text_in_ptr for usage
with outgoing isert_put_text_rsp() -> iscsit_build_text_rsp()
v2 changes:
- Let iscsit_build_text_rsp() determine any extra padding
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Now that these two variables are used for REJECT payloads as well
as SCSI response sense payloads, rename them to something that
makes more sense.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Add output for ib_wc.vendor_err in isert_cq_[t,r]x_work(), which
is useful for debugging future issues.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch addresses a bug where RDMA_CM_EVENT_DISCONNECTED may occur
before the connection shutdown has been completed by rx/tx threads,
that causes isert_free_conn() to wait indefinately on ->conn_wait.
This patch allows isert_disconnect_work code to invoke rdma_disconnect
when isert_disconnect_work() process context is started by client
session reset before isert_free_conn() code has been reached.
It also adds isert_conn->conn_mutex protection for ->state within
isert_disconnect_work(), isert_cq_comp_err() and isert_free_conn()
code, along with isert_check_state() for wait_event usage.
(v2: Add explicit iscsit_cause_connection_reinstatement call
during isert_disconnect_work() to force conn reset)
Cc: stable@vger.kernel.org # 3.10+
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|