summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2012-05-30ubifs: use generic_fillattr()Al Viro
don't open-code it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30xfs: switch to proper __bitwise type for KM_... flagsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch utimes() to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch statfs to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch flock to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch signalfd4() to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch fcntl to fget_raw_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch xattr syscalls to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch readdir/getdents to fget_light/fput_lightAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-30switch do_fsync() to fget_light()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-29Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull CIFS updates from Steve French. * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: (29 commits) cifs: fix oops while traversing open file list (try #4) cifs: Fix comment as d_alloc_root() is replaced by d_make_root() CIFS: Introduce SMB2 mounts as vers=2.1 CIFS: Introduce SMB2 Kconfig option CIFS: Move add/set_credits and get_credits_field to ops structure CIFS: Move protocol specific demultiplex thread calls to ops struct CIFS: Move protocol specific part from cifs_readv_receive to ops struct CIFS: Move header_size/max_header_size to ops structure CIFS: Move protocol specific part from SendReceive2 to ops struct cifs: Include backup intent search flags during searches {try #2) CIFS: Separate protocol specific part from setlk CIFS: Separate protocol specific part from getlk CIFS: Separate protocol specific lock type handling CIFS: Convert lock type to 32 bit variable CIFS: Move locks to cifsFileInfo structure cifs: convert send_nt_cancel into a version specific op cifs: add a smb_version_operations/values structures and a smb_version enum cifs: remove the vers= and version= synonyms for ver= cifs: add warning about change in default cache semantics in 3.7 cifs: display cache= option in /proc/mounts ...
2012-05-29Merge tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "New features include: - Rewrite the O_DIRECT code so that it can share the same coalescing and pNFS functionality as the page cache code. - Allow the server to provide hints as to when we should use pNFS, and when it is more efficient to read and write through the metadata server. - NFS cache consistency updates: * Use the ctime to emulate a change attribute for NFSv2/v3 so that all NFS versions can share the same cache management code. * New cache management code will only look at the change attribute and size attribute when deciding whether or not our cached data is still valid or not. * Don't request NFSv4 post-op attributes on writes in cases such as O_DIRECT, where we don't care about data cache consistency, or when we have a write delegation, and know that our cache is still consistent. * Don't request NFSv4 post-op attributes on operations such as COMMIT, where there are no expected metadata updates. * Don't request NFSv4 directory post-op attributes in cases where the operations themselves already return change attribute updates: i.e. operations such as OPEN, CREATE, REMOVE, LINK and RENAME. - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS if we detect no attempts to lookup filenames. - Improve the code sharing between NFSv2/v3 and v4 mounts - NFSv4.1 state management efficiency improvements - More patches in preparation for NFSv4/v4.1 migration functionality." Fix trivial conflict in fs/nfs/nfs4proc.c that was due to the dcache qstr name initialization changes (that made the length/hash a 64-bit union) * tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (146 commits) NFSv4: Add debugging printks to state manager NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager NFSv4.1: Handle errors in nfs4_bind_conn_to_session NFSv4.1: nfs4_bind_conn_to_session should drain the session NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid NFSv4.1: Add DESTROY_CLIENTID NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session NFSv4.1: Ensure we use the correct credentials for session create/destroy NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM NFSv4: Clean up the error handling for nfs4_reclaim_lease NFSv4.1: Exchange ID must use GFP_NOFS allocation mode nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN* nfs4.1: add BIND_CONN_TO_SESSION operation NFSv4.1 test the mdsthreshold hint parameters ...
2012-05-28NFSv4: Add debugging printks to state managerTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-28NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIOTrond Myklebust
If a file OPEN is denied due to a share lock, the resulting NFS4ERR_SHARE_DENIED is currently mapped to the default EIO. This patch adds a more appropriate mapping, and brings Linux into line with what Solaris 10 does. See https://bugzilla.kernel.org/show_bug.cgi?id=43286 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-05-28Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds
Pull exofs updates from Boaz Harrosh: "Just a couple of patches. The first is a BUG fix destined for stable which missed the 3.4-rc7 Kernel. The second is just a fixture addition so exofs is able to be better exported as a cluster file system via pNFS." * 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: Add SYSFS info for autologin/pNFS export exofs: Fix CRASH on very early IO errors.
2012-05-28Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds
Pull writeback tree from Wu Fengguang: "Mainly from Jan Kara to avoid iput() in the flusher threads." * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Avoid iput() from flusher thread vfs: Rename end_writeback() to clear_inode() vfs: Move waiting for inode writeback from end_writeback() to evict_inode() writeback: Refactor writeback_single_inode() writeback: Remove wb->list_lock from writeback_single_inode() writeback: Separate inode requeueing after writeback writeback: Move I_DIRTY_PAGES handling writeback: Move requeueing when I_SYNC set to writeback_sb_inodes() writeback: Move clearing of I_SYNC into inode_sync_complete() writeback: initialize global_dirty_limit fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds mm: page-writeback.c: local functions should not be exposed globally
2012-05-28NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHETrond Myklebust
We're already invalidating the data cache, and setting the new change attribute. Since directories don't care about the i_size field, there is no need to be forcing any extra revalidation of the page cache. We do keep the NFS_INO_INVALID_ATTR flag, in order to force an attribute cache revalidation on stat() calls since we do not update the mtime and ctime fields. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_errorTrond Myklebust
The results from a call to nfs4_proc_create_session() should always be fed into nfs4_handle_reclaim_lease_error, so that we can handle errors such as NFS4ERR_SEQ_MISORDERED correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSIONTrond Myklebust
Let nfs4_schedule_session_recovery() handle the details of choosing between resetting the session, and other session related recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state managerTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27NFSv4.1: Handle errors in nfs4_bind_conn_to_sessionTrond Myklebust
Ensure that we handle NFS4ERR_DELAY errors separately, and then let nfs4_recovery_handle_error() handle all other cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27NFSv4.1: nfs4_bind_conn_to_session should drain the sessionTrond Myklebust
In order to avoid races with other RPC calls that end up setting the NFS4CLNT_BIND_CONN_TO_SESSION flag. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-26word-at-a-time: make the interfaces truly genericLinus Torvalds
This changes the interfaces in <asm/word-at-a-time.h> to be a bit more complicated, but a lot more generic. In particular, it allows us to really do the operations efficiently on both little-endian and big-endian machines, pretty much regardless of machine details. For example, if you can rely on a fast population count instruction on your architecture, this will allow you to make your optimized <asm/word-at-a-time.h> file with that. NOTE! The "generic" version in include/asm-generic/word-at-a-time.h is not truly generic, it actually only works on big-endian. Why? Because on little-endian the generic algorithms are wasteful, since you can inevitably do better. The x86 implementation is an example of that. (The only truly non-generic part of the asm-generic implementation is the "find_zero()" function, and you could make a little-endian version of it. And if the Kbuild infrastructure allowed us to pick a particular header file, that would be lovely) The <asm/word-at-a-time.h> functions are as follows: - WORD_AT_A_TIME_CONSTANTS: specific constants that the algorithm uses. - has_zero(): take a word, and determine if it has a zero byte in it. It gets the word, the pointer to the constant pool, and a pointer to an intermediate "data" field it can set. This is the "quick-and-dirty" zero tester: it's what is run inside the hot loops. - "prep_zero_mask()": take the word, the data that has_zero() produced, and the constant pool, and generate an *exact* mask of which byte had the first zero. This is run directly *outside* the loop, and allows the "has_zero()" function to answer the "is there a zero byte" question without necessarily getting exactly *which* byte is the first one to contain a zero. If you do multiple byte lookups concurrently (eg "hash_name()", which looks for both NUL and '/' bytes), after you've done the prep_zero_mask() phase, the result of those can be or'ed together to get the "either or" case. - The result from "prep_zero_mask()" can then be fed into "find_zero()" (to find the byte offset of the first byte that was zero) or into "zero_bytemask()" (to find the bytemask of the bytes preceding the zero byte). The existence of zero_bytemask() is optional, and is not necessary for the normal string routines. But dentry name hashing needs it, so if you enable DENTRY_WORD_AT_A_TIME you need to expose it. This changes the generic strncpy_from_user() function and the dentry hashing functions to use these modified word-at-a-time interfaces. This gets us back to the optimized state of the x86 strncpy that we lost in the previous commit when moving over to the generic version. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientidTrond Myklebust
If the EXCHGID4_FLAG_CONFIRMED_R flag is set, the client is in theory supposed to already know the correct value of the seqid, in which case RFC5661 states that it should ignore the value returned. Also ensure that if the sanity check in nfs4_check_cl_exchange_flags fails, then we must not change the nfs_client fields. Finally, clean up the code: we don't need to retest the value of 'status' unless it can change. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-26NFSv4.1: Add DESTROY_CLIENTIDTrond Myklebust
Ensure that we destroy our lease on last unmount Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25NFSv4.1: Ensure we use the correct credentials for bind_conn_to_sessionTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Weston Andros Adamson <dros@netapp.com>
2012-05-25NFSv4.1: Ensure we use the correct credentials for session create/destroyTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operationsTrond Myklebust
For backward compatibility with nfs-utils. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Weston Andros Adamson <dros@netapp.com>
2012-05-25NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the leaseTrond Myklebust
Apparently the patch "NFS: Always use the same SETCLIENTID boot verifier" is tickling a Linux nfs server bug, and causing a regression: the server can get into a situation where it keeps replying NFS4ERR_SEQ_MISORDERED to our CREATE_SESSION request even when we are sending the correct sequence ID. Fix this by purging the lease and then retrying. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRMTrond Myklebust
Otherwise we can end up not sending a new exchange-id/setclientid Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25NFSv4: Clean up the error handling for nfs4_reclaim_leaseTrond Myklebust
Try to consolidate the error handling for nfs4_reclaim_lease into a single function instead of doing a bit here, and a bit there... Also ensure that NFS4CLNT_PURGE_STATE handles errors correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, ext3 and quota fixes from Jan Kara: "Interesting bits are: - removal of a special i_mutex locking subclass (I_MUTEX_QUOTA) since quota code does not need i_mutex anymore in any unusual way. - backport (from ext4) of a fix of a checkpointing bug (missing cache flush) that could lead to fs corruption on power failure The rest are just random small fixes & cleanups." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2: trivial fix to comment for ext2_free_blocks ext2: remove the redundant comment for ext2_export_ops ext3: return 32/64-bit dir name hash according to usage type quota: Get rid of nested I_MUTEX_QUOTA locking subclass quota: Use precomputed value of sb_dqopt in dquot_quota_sync ext2: Remove i_mutex use from ext2_quota_write() reiserfs: Remove i_mutex use from reiserfs_quota_write() ext4: Remove i_mutex use from ext4_quota_write() ext3: Remove i_mutex use from ext3_quota_write() quota: Fix double lock in add_dquot_ref() with CONFIG_QUOTA_DEBUG jbd: Write journal superblock with WRITE_FUA after checkpointing jbd: protect all log tail updates with j_checkpoint_mutex jbd: Split updating of journal superblock and marking journal empty ext2: do not register write_super within VFS ext2: Remove s_dirt handling ext2: write superblock only once on unmount ext3: update documentation with barrier=1 default ext3: remove max_debt in find_group_orlov() jbd: Refine commit writeout logic
2012-05-24Merge tag 'stable/for-linus-3.5-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen updates from Konrad Rzeszutek Wilk: "Features: * Extend the APIC ops implementation and add IRQ_WORKER vector support so that 'perf' can work properly. * Fix self-ballooning code, and balloon logic when booting as initial domain. * Move array printing code to generic debugfs * Support XenBus domains. * Lazily free grants when a domain is dead/non-existent. * In M2P code use batching calls Bug-fixes: * Fix NULL dereference in allocation failure path (hvc_xen) * Fix unbinding of IRQ_WORKER vector during vCPU hot-unplug * Fix HVM guest resume - we would leak an PIRQ value instead of reusing the existing one." Fix up add-add onflicts in arch/x86/xen/enlighten.c due to addition of apic ipi interface next to the new apic_id functions. * tag 'stable/for-linus-3.5-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: do not map the same GSI twice in PVHVM guests. hvc_xen: NULL dereference on allocation failure xen: Add selfballoning memory reservation tunable. xenbus: Add support for xenbus backend in stub domain xen/smp: unbind irqworkX when unplugging vCPUs. xen: enter/exit lazy_mmu_mode around m2p_override calls xen/acpi/sleep: Enable ACPI sleep via the __acpi_os_prepare_sleep xen: implement IRQ_WORK_VECTOR handler xen: implement apic ipi interface xen/setup: update VA mapping when releasing memory during setup xen/setup: Combine the two hypercall functions - since they are quite similar. xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0 xen/gnttab: add deferred freeing logic debugfs: Add support to print u32 array in debugfs xen/p2m: An early bootup variant of set_phys_to_machine xen/p2m: Collapse early_alloc_p2m_middle redundant checks. xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on argument xen/p2m: Move code around to allow for better re-usage.
2012-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc changes from David S. Miller: "This has the generic strncpy_from_user() implementation architectures can now use, which we've been developing on linux-arch over the past few days. For good measure I ran both a 32-bit and a 64-bit glibc testsuite run, and the latter of which pointed out an adjustment I needed to make to sparc's user_addr_max() definition. Linus, you were right, STACK_TOP was not the right thing to use, even on sparc itself :-) From Sam Ravnborg, we have a conversion of sparc32 over to the common alloc_thread_info_node(), since the aspect which originally blocked our doing so (sun4c) has been removed." Fix up trivial arch/sparc/Kconfig and lib/Makefile conflicts. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Fix user_addr_max() definition. lib: Sparc's strncpy_from_user is generic enough, move under lib/ kernel: Move REPEAT_BYTE definition into linux/kernel.h sparc: Increase portability of strncpy_from_user() implementation. sparc: Optimize strncpy_from_user() zero byte search. sparc: Add full proper error handling to strncpy_from_user(). sparc32: use the common implementation of alloc_thread_info_node()
2012-05-24Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull XFS update from Ben Myers: - Removal of xfsbufd - Background CIL flushes have been moved to a workqueue. - Fix to xfs_check_page_type applicable to filesystems where blocksize < page size - Fix for stale data exposure when extsize hints are used. - A series of xfs_buf cache cleanups. - Fix for XFS_IOC_ALLOCSP - Cleanups for includes and removal of xfs_lrw.[ch]. - Moved all busy extent handling to it's own file so that it is easier to merge with userspace. - Fix for log mount failure. - Fix to enable inode reclaim during quotacheck at mount time. - Fix for delalloc quota accounting. - Fix for memory reclaim deadlock on agi buffer. - Fixes for failed writes and to clean up stale delalloc blocks. - Fix to use GFP_NOFS in blkdev_issue_flush - SEEK_DATA/SEEK_HOLE support * 'for-linus' of git://oss.sgi.com/xfs/xfs: (57 commits) xfs: add trace points for log forces xfs: fix memory reclaim deadlock on agi buffer xfs: fix delalloc quota accounting on failure xfs: protect xfs_sync_worker with s_umount semaphore xfs: introduce SEEK_DATA/SEEK_HOLE support xfs: make xfs_extent_busy_trim not static xfs: make XBF_MAPPED the default behaviour xfs: flush outstanding buffers on log mount failure xfs: Properly exclude IO type flags from buffer flags xfs: clean up xfs_bit.h includes xfs: move xfs_do_force_shutdown() and kill xfs_rw.c xfs: move xfs_get_extsz_hint() and kill xfs_rw.h xfs: move xfs_fsb_to_db to xfs_bmap.h xfs: clean up busy extent naming xfs: move busy extent handling to it's own file xfs: move xfsagino_t to xfs_types.h xfs: use iolock on XFS_IOC_ALLOCSP calls xfs: kill XBF_DONTBLOCK xfs: kill xfs_read_buf() xfs: kill XBF_LOCK ...
2012-05-24NFSv4.1: Exchange ID must use GFP_NOFS allocation modeTrond Myklebust
Exchange ID can be called in a lease reclaim situation, so it will deadlock if it then tries to write out dirty NFS pages. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN*Weston Andros Adamson
The state manager can handle SEQ4_STATUS_CB_PATH_DOWN* flags with a BIND_CONN_TO_SESSION instead of destroying the session and creating a new one. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24nfs4.1: add BIND_CONN_TO_SESSION operationWeston Andros Adamson
This patch adds the BIND_CONN_TO_SESSION operation which is needed for upcoming SP4_MACH_CRED work and useful for recovering from broken connections without destroying the session. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24NFSv4.1 test the mdsthreshold hint parametersAndy Adamson
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24NFSv4.1 add nfs_inode book keeping for mdsthresholdAndy Adamson
Keep track of the number of bytes read or written via buffered, direct, and mem-mapped i/o for use by mdsthreshold size_io hints. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24NFSv4.1 cache mdsthreshold values on OPENAndy Adamson
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24NFSv4.1 mdsthreshold attribute xdrAndy Adamson
We only support one layout type per file system, so one threshold_item4 per mdsthreshold4. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24kernel: Move REPEAT_BYTE definition into linux/kernel.hDavid S. Miller
And make sure that everything using it explicitly includes that header file. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull more networking updates from David Miller: "Ok, everything from here on out will be bug fixes." 1) One final sync of wireless and bluetooth stuff from John Linville. These changes have all been in his tree for more than a week, and therefore have had the necessary -next exposure. John was just away on a trip and didn't have a change to send the pull request until a day or two ago. 2) Put back some defines in user exposed header file areas that were removed during the tokenring purge. From Stephen Hemminger and Paul Gortmaker. 3) A bug fix for UDP hash table allocation got lost in the pile due to one of those "you got it.. no I've got it.." situations. :-) From Tim Bird. 4) SKB coalescing in TCP needs to have stricter checks, otherwise we'll try to coalesce overlapping frags and crash. Fix from Eric Dumazet. 5) RCU routing table lookups can race with free_fib_info(), causing crashes when we deref the device pointers in the route. Fix by releasing the net device in the RCU callback. From Yanmin Zhang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (293 commits) tcp: take care of overlaps in tcp_try_coalesce() ipv4: fix the rcu race between free_fib_info and ip_route_output_slow mm: add a low limit to alloc_large_system_hash ipx: restore token ring define to include/linux/ipx.h if: restore token ring ARP type to header xen: do not disable netfront in dom0 phy/micrel: Fix ID of KSZ9021 mISDN: Add X-Tensions USB ISDN TA XC-525 gianfar:don't add FCB length to hard_header_len Bluetooth: Report proper error number in disconnection Bluetooth: Create flags for bt_sk() Bluetooth: report the right security level in getsockopt Bluetooth: Lock the L2CAP channel when sending Bluetooth: Restore locking semantics when looking up L2CAP channels Bluetooth: Fix a redundant and problematic incoming MTU check Bluetooth: Add support for Foxconn/Hon Hai AR5BBU22 0489:E03C Bluetooth: Fix EIR data generation for mgmt_device_found Bluetooth: Fix Inquiry with RSSI event mask Bluetooth: improve readability of l2cap_seq_list code Bluetooth: Fix skb length calculation ...
2012-05-24mm: add a low limit to alloc_large_system_hashTim Bird
UDP stack needs a minimum hash size value for proper operation and also uses alloc_large_system_hash() for proper NUMA distribution of its hash tables and automatic sizing depending on available system memory. On some low memory situations, udp_table_init() must ignore the alloc_large_system_hash() result and reallocs a bigger memory area. As we cannot easily free old hash table, we leak it and kmemleak can issue a warning. This patch adds a low limit parameter to alloc_large_system_hash() to solve this problem. We then specify UDP_HTABLE_SIZE_MIN for UDP/UDPLite hash table allocation. Reported-by: Mark Asselstine <mark.asselstine@windriver.com> Reported-by: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...
2012-05-23Merge tag 'pm-for-3.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: - Implementation of opportunistic suspend (autosleep) and user space interface for manipulating wakeup sources. - Hibernate updates from Bojan Smojver and Minho Ban. - Updates of the runtime PM core and generic PM domains framework related to PM QoS. - Assorted fixes. * tag 'pm-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (25 commits) epoll: Fix user space breakage related to EPOLLWAKEUP PM / Domains: Make it possible to add devices to inactive domains PM / Hibernate: Use get_gendisk to verify partition if resume_file is integer format PM / Domains: Fix computation of maximum domain off time PM / Domains: Fix link checking when add subdomain PM / Sleep: User space wakeup sources garbage collector Kconfig option PM / Sleep: Make the limit of user space wakeup sources configurable PM / Documentation: suspend-and-cpuhotplug.txt: Fix typo PM / Domains: Cache device stop and domain power off governor results, v3 PM / Domains: Make device removal more straightforward PM / Sleep: Fix a mistake in a conditional in autosleep_store() epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready PM / QoS: Create device constraints objects on notifier registration PM / Runtime: Remove device fields related to suspend time, v2 PM / Domains: Rework default domain power off governor function, v2 PM / Domains: Rework default device stop governor function, v2 PM / Sleep: Add user space interface for manipulating wakeup sources, v3 PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources PM / Sleep: Implement opportunistic sleep, v2 PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints ...
2012-05-23NFS: Add memory barriers to the nfs_client->cl_cons_state initialisationTrond Myklebust
Ensure that a process that uses the nfs_client->cl_cons_state test for whether the initialisation process is finished does not read stale data. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-23NFSv4: Fix a race in the net namespace mount notificationTrond Myklebust
Since the struct nfs_client gets added to the global nfs_client_list before it is initialised, it is possible that rpc_pipefs_event can end up trying to create idmapper entries on such a thing. The solution is to have the mount notification wait for the initialisation of each nfs_client to complete, and then to skip any entries for which the it failed. Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
2012-05-23NFSv4.1: Fix session initialisation racesTrond Myklebust
Session initialisation is not complete until the lease manager has run. We need to ensure that both nfs4_init_session and nfs4_init_ds_session do so, and that they check for any resulting errors in clp->cl_cons_state. Only after this is done, can nfs4_ds_connect check the contents of clp->cl_exchange_flags. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com>