Age | Commit message (Collapse) | Author |
|
|
|
|
|
commit 93dc41bdc5c853916610576c6b48a1704959c70d upstream.
We have one report of a crash in xs_tcp_setup_socket.
The call path to the crash is:
xs_tcp_setup_socket -> inet_stream_connect -> lock_sock_nested.
The 'sock' passed to that last function is NULL.
The only way I can see this happening is a concurrent call to
xs_close:
xs_close -> xs_reset_transport -> sock_release -> inet_release
inet_release sets:
sock->sk = NULL;
inet_stream_connect calls
lock_sock(sock->sk);
which gets NULL.
All calls to xs_close are protected by XPRT_LOCKED as are most
activations of the workqueue which runs xs_tcp_setup_socket.
The exception is xs_tcp_schedule_linger_timeout.
So presumably the timeout queued by the later fires exactly when some
other code runs xs_close().
To protect against this we can move the cancel_delayed_work_sync()
call from xs_destory() to xs_close().
As xs_close is never called from the worker scheduled on
->connect_worker, this can never deadlock.
Signed-off-by: NeilBrown <neilb@suse.de>
[Trond: Make it safe to call cancel_delayed_work_sync() on AF_LOCAL sockets]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 9eb2ddb48ce3a7bd745c14a933112994647fa3cd upstream.
Fix a race in which the RPC client is shutting down while the
gss daemon is processing a downcall. If the RPC client manages to
shut down before the gss daemon is done, then the struct gss_auth
used in gss_release_msg() may have already been freed.
Link: http://lkml.kernel.org/r/1392494917.71728.YahooMailNeo@web140002.mail.bf1.yahoo.com
Reported-by: John <da_audiophile@yahoo.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 06ea0bfe6e6043cb56a78935a19f6f8ebc636226 upstream.
When a send failure occurs due to the socket being out of buffer space,
we call xs_nospace() in order to have the RPC task wait until the
socket has drained enough to make it worth while trying again.
The current patch fixes a race in which the socket is drained before
we get round to setting up the machinery in xs_nospace(), and which
is reported to cause hangs.
Link: http://lkml.kernel.org/r/20140210170315.33dfc621@notabene.brown
Fixes: a9a6b52ee1ba (SUNRPC: Don't start the retransmission timer...)
Reported-by: Neil Brown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 1654a04cd702fd19c297c36300a6ab834cf8c072 upstream.
It doesn't make much sense to make reads from this procfile hang. As
far as I can tell, only gssproxy itself will open this file and it
never reads from it. Change it to just give the present setting of
sn->use_gss_proxy without waiting for anything.
Note that we do not want to call use_gss_proxy() in this codepath
since an inopportune read of this file could cause it to be disabled
prematurely.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ff33b7dd0228b7d7ed44791bbbc98b03fd15d9d upstream.
When a task enters call_refreshresult with status 0 from call_refresh and
!rpcauth_uptodatecred(task) it enters call_refresh again with no rate-limiting
or max number of retries.
Instead of trying forever, make use of the retry path that other errors use.
This only seems to be possible when the crrefresh callback is gss_refresh_null,
which only happens when destroying the context.
To reproduce:
1) mount with sec=krb5 (or sec=sys with krb5 negotiated for non FSID specific
operations).
2) reboot - the client will be stuck and will need to be hard rebooted
BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:2:46]
Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache ppdev crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd serio_raw i2c_piix4 i2c_core e1000 parport_pc parport shpchp nfsd auth_rpcgss oid_registry exportfs nfs_acl lockd sunrpc autofs4 mptspi scsi_transport_spi mptscsih mptbase ata_generic floppy
irq event stamp: 195724
hardirqs last enabled at (195723): [<ffffffff814a925c>] restore_args+0x0/0x30
hardirqs last disabled at (195724): [<ffffffff814b0a6a>] apic_timer_interrupt+0x6a/0x80
softirqs last enabled at (195722): [<ffffffff8103f583>] __do_softirq+0x1df/0x276
softirqs last disabled at (195717): [<ffffffff8103f852>] irq_exit+0x53/0x9a
CPU: 0 PID: 46 Comm: kworker/0:2 Not tainted 3.13.0-rc3-branch-dros_testing+ #4
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
Workqueue: rpciod rpc_async_schedule [sunrpc]
task: ffff8800799c4260 ti: ffff880079002000 task.ti: ffff880079002000
RIP: 0010:[<ffffffffa0064fd4>] [<ffffffffa0064fd4>] __rpc_execute+0x8a/0x362 [sunrpc]
RSP: 0018:ffff880079003d18 EFLAGS: 00000246
RAX: 0000000000000005 RBX: 0000000000000007 RCX: 0000000000000007
RDX: 0000000000000007 RSI: ffff88007aecbae8 RDI: ffff8800783d8900
RBP: ffff880079003d78 R08: ffff88006e30e9f8 R09: ffffffffa005a3d7
R10: ffff88006e30e7b0 R11: ffff8800783d8900 R12: ffffffffa006675e
R13: ffff880079003ce8 R14: ffff88006e30e7b0 R15: ffff8800783d8900
FS: 0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3072333000 CR3: 0000000001a0b000 CR4: 00000000001407f0
Stack:
ffff880079003d98 0000000000000246 0000000000000000 ffff88007a9a4830
ffff880000000000 ffffffff81073f47 ffff88007f212b00 ffff8800799c4260
ffff8800783d8988 ffff88007f212b00 ffffe8ffff604800 0000000000000000
Call Trace:
[<ffffffff81073f47>] ? trace_hardirqs_on_caller+0x145/0x1a1
[<ffffffffa00652d3>] rpc_async_schedule+0x27/0x32 [sunrpc]
[<ffffffff81052974>] process_one_work+0x211/0x3a5
[<ffffffff810528d5>] ? process_one_work+0x172/0x3a5
[<ffffffff81052eeb>] worker_thread+0x134/0x202
[<ffffffff81052db7>] ? rescuer_thread+0x280/0x280
[<ffffffff81052db7>] ? rescuer_thread+0x280/0x280
[<ffffffff810584a0>] kthread+0xc9/0xd1
[<ffffffff810583d7>] ? __kthread_parkme+0x61/0x61
[<ffffffff814afd6c>] ret_from_fork+0x7c/0xb0
[<ffffffff810583d7>] ? __kthread_parkme+0x61/0x61
Code: e8 87 63 fd e0 c6 05 10 dd 01 00 01 48 8b 43 70 4c 8d 6b 70 45 31 e4 a8 02 0f 85 d5 02 00 00 4c 8b 7b 48 48 c7 43 48 00 00 00 00 <4c> 8b 4b 50 4d 85 ff 75 0c 4d 85 c9 4d 89 cf 0f 84 32 01 00 00
And the output of "rpcdebug -m rpc -s all":
RPC: 61 call_refresh (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refreshresult (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refreshresult (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refresh (status 0)
RPC: 61 call_refreshresult (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refreshresult (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refresh (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC: 61 call_refreshresult (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 call_refresh (status 0)
RPC: 61 call_refreshresult (status 0)
RPC: 61 refreshing RPCSEC_GSS cred ffff88007a413cf0
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d07ba8422f1e58be94cc98a1f475946dc1b89f1b upstream.
In cases where an rpc client has a parent hierarchy, then
rpc_free_client may end up calling rpc_release_client() on the
parent, thus recursing back into rpc_free_client. If the hierarchy
is deep enough, then we can get into situations where the stack
simply overflows.
The fix is to have rpc_release_client() loop so that it can take
care of the parent rpc client hierarchy without needing to
recurse.
Reported-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Weston Andros Adamson <dros@netapp.com>
Reported-by: Bruce Fields <bfields@fieldses.org>
Link: http://lkml.kernel.org/r/2C73011F-0939-434C-9E4D-13A1EB1403D7@netapp.com
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a6b31d18b02ff9d7915c5898c9b5ca41a798cd73 upstream.
The following scenario can cause silent data corruption when doing
NFS writes. It has mainly been observed when doing database writes
using O_DIRECT.
1) The RPC client uses sendpage() to do zero-copy of the page data.
2) Due to networking issues, the reply from the server is delayed,
and so the RPC client times out.
3) The client issues a second sendpage of the page data as part of
an RPC call retransmission.
4) The reply to the first transmission arrives from the server
_before_ the client hardware has emptied the TCP socket send
buffer.
5) After processing the reply, the RPC state machine rules that
the call to be done, and triggers the completion callbacks.
6) The application notices the RPC call is done, and reuses the
pages to store something else (e.g. a new write).
7) The client NIC drains the TCP socket send buffer. Since the
page data has now changed, it reads a corrupted version of the
initial RPC call, and puts it on the wire.
This patch fixes the problem in the following manner:
The ordering guarantees of TCP ensure that when the server sends a
reply, then we know that the _first_ transmission has completed. Using
zero-copy in that situation is therefore safe.
If a time out occurs, we then send the retransmission using sendmsg()
(i.e. no zero-copy), We then know that the socket contains a full copy of
the data, and so it will retransmit a faithful reproduction even if the
RPC call completes, and the application reuses the O_DIRECT buffer in
the meantime.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5fccc5b52ee07d07a74ce53c6f174bff81e26a16 upstream.
Add the missing 'break' to ensure that we don't corrupt a legacy 'v0' type
message by appending the 'v1'.
Cc: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This fixes a regression since eb6dc19d8e72ce3a957af5511d20c0db0a8bd007
"RPCSEC_GSS: Share all credential caches on a per-transport basis" which
could cause an occasional oops in the nfsd code (see below).
The problem was that an auth was left referencing a client that had been
freed. To avoid this we need to ensure that auths are shared only
between descendants of a common client; the fact that a clone of an
rpc_client takes a reference on its parent then ensures that the parent
client will last as long as the auth.
Also add a comment explaining what I think was the intention of this
code.
general protection fault: 0000 [#1] PREEMPT SMP
Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd sunrpc
CPU: 3 PID: 4071 Comm: kworker/u8:2 Not tainted 3.11.0-rc2-00182-g025145f #1665
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: nfsd4_callbacks nfsd4_do_callback_rpc [nfsd]
task: ffff88003e206080 ti: ffff88003c384000 task.ti: ffff88003c384000
RIP: 0010:[<ffffffffa00001f3>] [<ffffffffa00001f3>] rpc_net_ns+0x53/0x70 [sunrpc]
RSP: 0000:ffff88003c385ab8 EFLAGS: 00010246
RAX: 6b6b6b6b6b6b6b6b RBX: ffff88003af9a800 RCX: 0000000000000002
RDX: ffffffffa00001a5 RSI: 0000000000000001 RDI: ffffffff81e284e0
RBP: ffff88003c385ad8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000015 R12: ffff88003c990840
R13: ffff88003c990878 R14: ffff88003c385ba8 R15: ffff88003e206080
FS: 0000000000000000(0000) GS:ffff88003fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fcdf737e000 CR3: 000000003ad2b000 CR4: 00000000000006e0
Stack:
ffffffffa00001a5 0000000000000006 0000000000000006 ffff88003af9a800
ffff88003c385b08 ffffffffa00d52a4 ffff88003c385ba8 ffff88003c751bd8
ffff88003c751bc0 ffff88003e113600 ffff88003c385b18 ffffffffa00d530c
Call Trace:
[<ffffffffa00001a5>] ? rpc_net_ns+0x5/0x70 [sunrpc]
[<ffffffffa00d52a4>] __gss_pipe_release+0x54/0x90 [auth_rpcgss]
[<ffffffffa00d530c>] gss_pipe_free+0x2c/0x30 [auth_rpcgss]
[<ffffffffa00d678b>] gss_destroy+0x9b/0xf0 [auth_rpcgss]
[<ffffffffa000de63>] rpcauth_release+0x23/0x30 [sunrpc]
[<ffffffffa0001e81>] rpc_release_client+0x51/0xb0 [sunrpc]
[<ffffffffa00020d5>] rpc_shutdown_client+0xe5/0x170 [sunrpc]
[<ffffffff81098a14>] ? cpuacct_charge+0xa4/0xb0
[<ffffffff81098975>] ? cpuacct_charge+0x5/0xb0
[<ffffffffa019556f>] nfsd4_process_cb_update.isra.17+0x2f/0x210 [nfsd]
[<ffffffff819a4ac0>] ? _raw_spin_unlock_irq+0x30/0x60
[<ffffffff819a4acb>] ? _raw_spin_unlock_irq+0x3b/0x60
[<ffffffff810703ab>] ? process_one_work+0x15b/0x510
[<ffffffffa01957dd>] nfsd4_do_callback_rpc+0x8d/0xa0 [nfsd]
[<ffffffff8107041e>] process_one_work+0x1ce/0x510
[<ffffffff810703ab>] ? process_one_work+0x15b/0x510
[<ffffffff810712ab>] worker_thread+0x11b/0x370
[<ffffffff81071190>] ? manage_workers.isra.24+0x2b0/0x2b0
[<ffffffff8107854b>] kthread+0xdb/0xe0
[<ffffffff819a4ac0>] ? _raw_spin_unlock_irq+0x30/0x60
[<ffffffff81078470>] ? __init_kthread_worker+0x70/0x70
[<ffffffff819ac7dc>] ret_from_fork+0x7c/0xb0
[<ffffffff81078470>] ? __init_kthread_worker+0x70/0x70
Code: a5 01 00 a0 31 d2 31 f6 48 c7 c7 e0 84 e2 81 e8 f4 91 0a e1 48 8b 43 60 48 c7 c2 a5 01 00 a0 be 01 00 00 00 48 c7 c7 e0 84 e2 81 <48> 8b 98 10 07 00 00 e8 91 8f 0a e1 e8
+3c 4e 07 e1 48 83 c4 18
RIP [<ffffffffa00001f3>] rpc_net_ns+0x53/0x70 [sunrpc]
RSP <ffff88003c385ab8>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 4 from Al Viro:
"list_lru pile, mostly"
This came out of Andrew's pile, Al ended up doing the merge work so that
Andrew didn't have to.
Additionally, a few fixes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (42 commits)
super: fix for destroy lrus
list_lru: dynamically adjust node arrays
shrinker: Kill old ->shrink API.
shrinker: convert remaining shrinkers to count/scan API
staging/lustre/libcfs: cleanup linux-mem.h
staging/lustre/ptlrpc: convert to new shrinker API
staging/lustre/obdclass: convert lu_object shrinker to count/scan API
staging/lustre/ldlm: convert to shrinkers to count/scan API
hugepage: convert huge zero page shrinker to new shrinker API
i915: bail out earlier when shrinker cannot acquire mutex
drivers: convert shrinkers to new count/scan API
fs: convert fs shrinkers to new scan/count API
xfs: fix dquot isolation hang
xfs-convert-dquot-cache-lru-to-list_lru-fix
xfs: convert dquot cache lru to list_lru
xfs: rework buffer dispose list tracking
xfs-convert-buftarg-lru-to-generic-code-fix
xfs: convert buftarg LRU to generic code
fs: convert inode and dentry shrinking to be node aware
vmscan: per-node deferred work
...
|
|
Pull NFS client bugfixes (part 2) from Trond Myklebust:
"Bugfixes:
- Fix a few credential reference leaks resulting from the
SP4_MACH_CRED NFSv4.1 state protection code.
- Fix the SUNRPC bloatometer footprint: convert a 256K hashtable into
the intended 64 byte structure.
- Fix a long standing XDR issue with FREE_STATEID
- Fix a potential WARN_ON spamming issue
- Fix a missing dprintk() kuid conversion
New features:
- Enable the NFSv4.1 state protection support for the WRITE and
COMMIT operations"
* tag 'nfs-for-3.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: No, I did not intend to create a 256KiB hashtable
sunrpc: Add missing kuids conversion for printing
NFSv4.1: sp4_mach_cred: WARN_ON -> WARN_ON_ONCE
NFSv4.1: sp4_mach_cred: no need to ref count creds
NFSv4.1: fix SECINFO* use of put_rpccred
NFSv4.1: sp4_mach_cred: ask for WRITE and COMMIT
NFSv4.1 fix decode_free_stateid
|
|
Fix the declaration of the gss_auth_hash_table so that it creates
a 16 bucket hashtable, as I had intended.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
m68k/allmodconfig:
net/sunrpc/auth_generic.c: In function ‘generic_key_timeout’:
net/sunrpc/auth_generic.c:241: warning: format ‘%d’ expects type ‘int’, but
argument 2 has type ‘kuid_t’
commit cdba321e291f0fbf5abda4d88340292b858e3d4d ("sunrpc: Convert kuids and
kgids to uids and gids for printing") forgot to convert one instance.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Pull nfsd updates from Bruce Fields:
"This was a very quiet cycle! Just a few bugfixes and some cleanup"
* 'nfsd-next' of git://linux-nfs.org/~bfields/linux:
rpc: let xdr layer allocate gssproxy receieve pages
rpc: fix huge kmalloc's in gss-proxy
rpc: comment on linux_cred encoding, treat all as unsigned
rpc: clean up decoding of gssproxy linux creds
svcrpc: remove unused rq_resused
nfsd4: nfsd4_create_clid_dir prints uninitialized data
nfsd4: fix leak of inode reference on delegation failure
Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR"
sunrpc: prepare NFS for 2038
nfsd4: fix setlease error return
nfsd: nfs4_file_get_access: need to be more careful with O_RDWR
|
|
Convert the remaining couple of random shrinkers in the tree to the new
API.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- Fix NFSv4 recovery so that it doesn't recover lost locks in cases
such as lease loss due to a network partition, where doing so may
result in data corruption. Add a kernel parameter to control
choice of legacy behaviour or not.
- Performance improvements when 2 processes are writing to the same
file.
- Flush data to disk when an RPCSEC_GSS session timeout is imminent.
- Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
NFS clients from being able to manipulate our lease and file
locking state.
- Allow sharing of RPCSEC_GSS caches between different rpc clients.
- Fix the broken NFSv4 security auto-negotiation between client and
server.
- Fix rmdir() to wait for outstanding sillyrename unlinks to complete
- Add a tracepoint framework for debugging NFSv4 state recovery
issues.
- Add tracing to the generic NFS layer.
- Add tracing for the SUNRPC socket connection state.
- Clean up the rpc_pipefs mount/umount event management.
- Merge more patches from Chuck in preparation for NFSv4 migration
support"
* tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (107 commits)
NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set
NFSv4: Allow security autonegotiation for submounts
NFSv4: Disallow security negotiation for lookups when 'sec=' is specified
NFSv4: Fix security auto-negotiation
NFS: Clean up nfs_parse_security_flavors()
NFS: Clean up the auth flavour array mess
NFSv4.1 Use MDS auth flavor for data server connection
NFS: Don't check lock owner compatability unless file is locked (part 2)
NFS: Don't check lock owner compatibility in writes unless file is locked
nfs4: Map NFS4ERR_WRONG_CRED to EPERM
nfs4.1: Add SP4_MACH_CRED write and commit support
nfs4.1: Add SP4_MACH_CRED stateid support
nfs4.1: Add SP4_MACH_CRED secinfo support
nfs4.1: Add SP4_MACH_CRED cleanup support
nfs4.1: Add state protection handler
nfs4.1: Minimal SP4_MACH_CRED implementation
SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
SUNRPC: Add an identifier for struct rpc_clnt
SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints
...
|
|
In theory the linux cred in a gssproxy reply can include up to
NGROUPS_MAX data, 256K of data. In the common case we expect it to be
shorter. So do as the nfsv3 ACL code does and let the xdr code allocate
the pages as they come in, instead of allocating a lot of pages that
won't typically be used.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The reply to a gssproxy can include up to NGROUPS_MAX gid's, which will
take up more than a page. We therefore need to allocate an array of
pages to hold the reply instead of trying to allocate a single huge
buffer.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The encoding of linux creds is a bit confusing.
Also: I think in practice it doesn't really matter whether we treat any
of these things as signed or unsigned, but unsigned seems more
straightforward: uid_t/gid_t are unsigned and it simplifies the ngroups
overflow check.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
We can use the normal coding infrastructure here.
Two minor behavior changes:
- we're assuming no wasted space at the end of the linux cred.
That seems to match gss-proxy's behavior, and I can't see why
it would need to do differently in the future.
- NGROUPS_MAX check added: note groups_alloc doesn't do this,
this is the caller's responsibility.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Conflicts:
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
net/bridge/br_multicast.c
net/ipv6/sit.c
The conflicts were minor:
1) sit.c changes overlap with change to ip_tunnel_xmit() signature.
2) br_multicast.c had an overlap between computing max_delay using
msecs_to_jiffies and turning MLDV2_MRC() into an inline function
with a name using lowercase instead of uppercase letters.
3) stmmac had two overlapping changes, one which conditionally allocated
and hooked up a dma_cfg based upon the presence of the pbl OF property,
and another one handling store-and-forward DMA made. The latter of
which should not go into the new of_find_property() basic block.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add an identifier in order to aid debugging.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Add client side debugging to help trace socket connection/disconnection
and unexpected state change issues.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Most of the time an error from the credops crvalidate function means the
server has sent us a garbage verifier. The gss_validate function is the
exception where there is an -EACCES case if the user GSS_context on the client
has expired.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
This patch provides the RPC layer helper functions to allow NFS to manage
data in the face of expired credentials - such as avoiding buffered WRITEs
and COMMITs when the gss context will expire before the WRITEs are flushed
and COMMITs are sent.
These helper functions enable checking the expiration of an underlying
credential key for a generic rpc credential, e.g. the gss_cred gss context
gc_expiry which for Kerberos is set to the remaining TGT lifetime.
A new rpc_authops key_timeout is only defined for the generic auth.
A new rpc_credops crkey_to_expire is only defined for the generic cred.
A new rpc_credops crkey_timeout is only defined for the gss cred.
Set a credential key expiry watermark, RPC_KEY_EXPIRE_TIMEO set to 240 seconds
as a default and can be set via a module parameter as we need to ensure there
is time for any dirty data to be flushed.
If key_timeout is called on a credential with an underlying credential key that
will expire within watermark seconds, we set the RPC_CRED_KEY_EXPIRE_SOON
flag in the generic_cred acred so that the NFS layer can clean up prior to
key expiration.
Checking a generic credential's underlying credential involves a cred lookup.
To avoid this lookup in the normal case when the underlying credential has
a key that is valid (before the watermark), a notify flag is set in
the generic credential the first time the key_timeout is called. The
generic credential then stops checking the underlying credential key expiry, and
the underlying credential (gss_cred) match routine then checks the key
expiration upon each normal use and sets a flag in the associated generic
credential only when the key expiration is within the watermark.
This in turn signals the generic credential key_timeout to perform the extra
credential lookup thereafter.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The NFS layer needs to know when a key has expired.
This change also returns -EKEYEXPIRED to the application, and the informative
"Key has expired" error message is displayed. The user then knows that
credential renewal is required.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Ensure that we set rpc_clnt->cl_parent before calling rpc_client_register
so that rpcauth_create can find any existing RPCSEC_GSS caches for this
transport.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Ensure that all struct rpc_clnt for any given socket/rdma channel
share the same RPCSEC_GSS/krb5,krb5i,krb5p caches.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Ensure that if an rpc_clnt owns more than one RPCSEC_GSS-based authentication
mechanism, then those caches will share the same 'gssd' upcall pipe.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Add support for looking up existing objects and creating new ones if there
is no match.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
It is now redundant.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The current system requires everyone to set up notifiers, manage directory
locking, etc.
What we really want to do is have the rpc_client create its directory,
and then create all the entries.
This patch will allow the RPCSEC_GSS and NFS code to register all the
objects that they want to have appear in the directory, and then have
the sunrpc code call them back to actually create/destroy their pipefs
dentries when the rpc_client creates/destroys the parent.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If an error condition occurs on rpc_pipefs creation, or the user mounts
rpc_pipefs and then unmounts it, then the dentries in struct gss_auth
need to be reset to NULL so that a second call to gss_pipes_dentries_destroy
doesn't try to free them again.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Don't pass the rpc_client as a parameter, when what we really want is
the net namespace.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The clnt->cl_principal is being used exclusively to store the service
target name for RPCSEC_GSS/krb5 callbacks. Replace it with something that
is stored only in the RPCSEC_GSS-specific code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Optimise away gss_encode_msg: we don't need to look up the pipe
version a second time.
Save the gss target name in struct gss_auth. It is a property of the
auth cache itself, and doesn't really belong in the rpc_client.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The directory name is _always_ clnt->cl_program->pipe_dir_name.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
It just duplicates the cl_program->name, and is not used in any fast
paths where the extra dereference will cause a hit.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Some architectures, such as ARM-32 do not return the same base address
when you call kmap_atomic() twice on the same page.
This causes problems for the memmove() call in the XDR helper routine
"_shift_data_right_pages()", since it defeats the detection of
overlapping memory ranges, and has been seen to corrupt memory.
The fix is to distinguish between the case where we're doing an
inter-page copy or not. In the former case of we know that the memory
ranges cannot possibly overlap, so we can additionally micro-optimise
by replacing memmove() with memcpy().
Reported-by: Mark Young <MYoung@nvidia.com>
Reported-by: Matt Craighead <mcraighead@nvidia.com>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Matt Craighead <mcraighead@nvidia.com>
|
|
|
|
If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about reconnecting
since the mount namespace may have changed.
By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should
be safe.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.9.x
|
|
There is no need for the kernel to time out the AF_LOCAL connection to
the rpcbind socket, and doing so is problematic because when it is
time to reconnect, our process may no longer be using the same mount
namespace.
Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.9.x
|
|
Merge net into net-next to setup some infrastructure Eric
Dumazet needs for usbnet changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The change made to rsc_parse() in
0dc1531aca7fd1440918bd55844a054e9c29acad "svcrpc: store gss mech in
svc_cred" should also have been propagated to the gss-proxy codepath.
This fixes a crash in the gss-proxy case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|