diff options
author | Jeff Layton <jlayton@poochiereds.net> | 2016-05-17 16:28:47 (GMT) |
---|---|---|
committer | Anna Schumaker <Anna.Schumaker@Netapp.com> | 2016-05-17 19:48:12 (GMT) |
commit | 183d9e7b112aaed0d19c16ffcf0f8c3a86dc71e0 (patch) | |
tree | e78c3b12f7a309b7a364dff4ebcab69fab66cfae /Documentation/md-cluster.txt | |
parent | 83026d80a16ea6a4e4f06a994fc7831b1d8d6375 (diff) | |
download | linux-183d9e7b112aaed0d19c16ffcf0f8c3a86dc71e0.tar.xz |
pnfs: rework LAYOUTGET retry handling
There are several problems in the way a stateid is selected for a
LAYOUTGET operation:
We pick a stateid to use in the RPC prepare op, but that makes
it difficult to serialize LAYOUTGETs that use the open stateid. That
serialization is done in pnfs_update_layout, which occurs well before
the rpc_prepare operation.
Between those two events, the i_lock is dropped and reacquired.
pnfs_update_layout can find that the list has lsegs in it and not do any
serialization, but then later pnfs_choose_layoutget_stateid ends up
choosing the open stateid.
This patch changes the client to select the stateid to use in the
LAYOUTGET earlier, when we're searching for a usable layout segment.
This way we can do it all while holding the i_lock the first time, and
ensure that we serialize any LAYOUTGET call that uses a non-layout
stateid.
This also means a rework of how LAYOUTGET replies are handled, as we
must now get the latest stateid if we want to retransmit in response
to a retryable error.
Most of those errors boil down to the fact that the layout state has
changed in some fashion. Thus, what we really want to do is to re-search
for a layout when it fails with a retryable error, so that we can avoid
reissuing the RPC at all if possible.
While the LAYOUTGET RPC is async, the initiating thread always waits for
it to complete, so it's effectively synchronous anyway. Currently, when
we need to retry a LAYOUTGET because of an error, we drive that retry
via the rpc state machine.
This means that once the call has been submitted, it runs until it
completes. So, we must move the error handling for this RPC out of the
rpc_call_done operation and into the caller.
In order to handle errors like NFS4ERR_DELAY properly, we must also
pass a pointer to the sliding timeout, which is now moved to the stack
in pnfs_update_layout.
The complicating errors are -NFS4ERR_RECALLCONFLICT and
-NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
up and return NULL back to the caller. So, there is some special
handling for those errors to ensure that the layers driving the retries
can handle that appropriately.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Diffstat (limited to 'Documentation/md-cluster.txt')
0 files changed, 0 insertions, 0 deletions