path: root/fs
Age  Commit message  Author
2013-07-01  ext4: implement error handling of ext4_mb_new_preallocation()  Alexey Khoroshilov
If memory allocation in ext4_mb_new_group_pa() fails, it returns an error code and ext4_mb_new_preallocation() propagates it, but ext4_mb_new_blocks() ignores it. An observed result was:
- the allocation failure means ext4_mb_new_group_pa() does not update ext4_allocation_context;
- ext4_mb_new_blocks() sets ext4_allocation_request->len (ar->len = ac->ac_b_ex.fe_len;) to the number of blocks preallocated (512) instead of the number of blocks requested (1);
- that activates the update cycle in ext4_splice_branch():
    for (i = 1; i < blks; i++)     <-- blks is 512 instead of 1 here
        *(where->p + i) = cpu_to_le32(current_block++);
- it iterates 511 times and corrupts a chunk of memory including the inode structure;
- a page fault happens at EXT4_SB(inode->i_sb) in ext4_mark_inode_dirty();
- the system hangs with a 'scheduling while atomic' BUG.
The patch implements a check for the ext4_mb_new_preallocation() error code and handles its failure as if ext4_mb_regular_allocator() fails. Found by Linux File System Verification project (linuxtesting.org). [ Patch restructured by tytso to make the flow of control easier to follow. ]
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
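To illustrate the failure mode described in this commit, here is a minimal user-space sketch; it is not the kernel code. The struct alloc_ctx type and the functions new_preallocation(), new_blocks_unchecked() and new_blocks_checked() are hypothetical stand-ins. The assumption, taken from the commit message, is that a successful preallocation trims the found extent (512 blocks) down to the request (1 block), while a failed one leaves the context untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an allocation context. */
struct alloc_ctx {
    unsigned int requested_len; /* blocks the caller asked for */
    unsigned int best_len;      /* blocks the allocator found */
};

/* On success, trim the result to the request; on failure, leave ctx alone. */
int new_preallocation(struct alloc_ctx *ctx, bool simulate_enomem)
{
    assert(ctx != NULL);
    if (simulate_enomem)
        return -ENOMEM;
    ctx->best_len = ctx->requested_len;
    return 0;
}

/* Buggy shape: the error code is dropped, so a stale length leaks out. */
int new_blocks_unchecked(struct alloc_ctx *ctx, bool simulate_enomem,
                         unsigned int *len)
{
    assert(len != NULL);
    (void)new_preallocation(ctx, simulate_enomem);
    *len = ctx->best_len;
    return 0;
}

/* Fixed shape: a preallocation failure is reported and no length is handed back. */
int new_blocks_checked(struct alloc_ctx *ctx, bool simulate_enomem,
                       unsigned int *len)
{
    int err;

    assert(len != NULL);
    err = new_preallocation(ctx, simulate_enomem);
    if (err) {
        *len = 0;
        return err;
    }
    *len = ctx->best_len;
    return 0;
}
```

With requested_len = 1 and best_len = 512, the unchecked variant reports 512 blocks after a simulated -ENOMEM, whereas the checked variant returns the error and a length of 0.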
2013-07-01  ext4: fix corruption when online resizing a fs with 1K block size  Maarten ter Huurne
Subtracting the number of the first data block places the superblock backups one block too early, corrupting the file system. When the block size is larger than 1K, the first data block is 0, so the subtraction has no effect and no corruption occurs. Signed-off-by: Maarten ter Huurne <maarten@treewalker.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> CC: stable@vger.kernel.org
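The off-by-one can be shown with a little arithmetic. The sketch below is an illustration under assumptions, not the patched resize code: group_first_block(), backup_block_buggy() and backup_block_fixed() are hypothetical names, and the layout rule used is that block group g starts at first_data_block + g * blocks_per_group.

```c
#include <assert.h>
#include <stdint.h>

/* First block of a block group, counted from the start of the device. */
uint64_t group_first_block(uint64_t first_data_block,
                           uint64_t blocks_per_group, uint32_t group)
{
    assert(blocks_per_group > 0);
    return first_data_block + (uint64_t)group * blocks_per_group;
}

/* Buggy shape: subtracting first_data_block lands one block early
 * whenever first_data_block is 1 (1K block size). */
uint64_t backup_block_buggy(uint64_t first_data_block,
                            uint64_t blocks_per_group, uint32_t group)
{
    return group_first_block(first_data_block, blocks_per_group, group)
           - first_data_block;
}

/* Fixed shape: the backup sits at the first block of the group. */
uint64_t backup_block_fixed(uint64_t first_data_block,
                            uint64_t blocks_per_group, uint32_t group)
{
    return group_first_block(first_data_block, blocks_per_group, group);
}
```

For a 1K-block file system (first data block 1, 8192 blocks per group), group 1 starts at block 8193 but the buggy variant yields 8192. With a first data block of 0, both variants agree, which matches the observation that larger block sizes are unaffected.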
2013-07-01  pstore: Pass header size in the pstore write callback  Aruna Balakrishnaiah
Header size is needed to distinguish between the header and the dump data. Add the new argument (hsize) to the pstore write callback. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01  Merge tag 'v3.10' into next  Benjamin Herrenschmidt
Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.
2013-06-29  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  Linus Torvalds
Pull ubifs fixes from Al Viro: "A couple of ubifs readdir/lseek race fixes. Stable fodder, really nasty..."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  UBIFS: fix a horrid bug
  UBIFS: prepare to fix a horrid bug
2013-06-29  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds
Pull perf fix from Ingo Molnar: "One more fix for a recently discovered bug"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Disable monitoring on setuid processes for regular users
2013-06-29  lseek_execute() doesn't need an inode passed to it  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  block_dev: switch to fixed_size_llseek()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: give the blocked_hash its own spinlock  Jeff Layton
There's no reason we have to protect the blocked_hash and file_lock_list with the same spinlock. With the tests I have, breaking it in two gives a barely measurable performance benefit, but it seems reasonable to make this locking as granular as possible. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: add a new "lm_owner_key" lock operation  Jeff Layton
Currently, the hashing that the locking code uses to add these values to the blocked_hash is simply calculated using fl_owner field. That's valid in most cases except for server-side lockd, which validates the owner of a lock based on fl_owner and fl_pid. In the case where you have a small number of NFS clients doing a lot of locking between different processes, you could end up with all the blocked requests sitting in a very small number of hash buckets. Add a new lm_owner_key operation to the lock_manager_operations that will generate an unsigned long to use as the key in the hashtable. That function is only implemented for server-side lockd, and simply XORs the fl_owner and fl_pid. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
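As a rough illustration of why mixing the pid into the key helps, here is a user-space sketch. The struct lock_request type and both key functions are hypothetical stand-ins, not the kernel's struct file_lock or the lockd implementation; the only detail taken from the commit message is that the lockd key XORs the owner with the pid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a blocked lock request. */
struct lock_request {
    const void *owner;  /* plays the role of fl_owner */
    unsigned int pid;   /* plays the role of fl_pid */
};

/* Default key: the owner alone. All processes behind one owner collide. */
unsigned long key_owner_only(const struct lock_request *req)
{
    assert(req != NULL);
    return (unsigned long)(uintptr_t)req->owner;
}

/* lockd-style key: XOR the owner with the pid to spread requests out. */
unsigned long key_owner_xor_pid(const struct lock_request *req)
{
    assert(req != NULL);
    return (unsigned long)(uintptr_t)req->owner ^ (unsigned long)req->pid;
}
```

Two requests that share an owner but come from different pids get the same key from key_owner_only() and different keys from key_owner_xor_pid(), so a hashtable indexed by the latter can place them in different buckets.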
2013-06-29  locks: turn the blocked_list into a hashtable  Jeff Layton
Break up the blocked_list into a hashtable, using the fl_owner as a key. This speeds up searching the hash chains, which is especially significant for deadlock detection. Note that the initial implementation assumes that hashing on fl_owner is sufficient. In most cases it should be, with the notable exception being server-side lockd, which compares ownership using a tuple of the nlm_host and the pid sent in the lock request. So, this may degrade to a single hash bucket when you only have a single NFS client. That will be addressed in a later patch. The careful observer may note that this patch leaves the file_lock_list alone. There's much less of a case for turning the file_lock_list into a hashtable. The only user of that list is the code that generates /proc/locks, and it always walks the entire list. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: convert fl_link to a hlist_node  Jeff Layton
Testing has shown that iterating over the blocked_list for deadlock detection turns out to be a bottleneck. In order to alleviate that, begin the process of turning it into a hashtable. We start by turning the fl_link into a hlist_node and the global lists into hlists. A later patch will do the conversion of the blocked_list to a hashtable. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: avoid taking global lock if possible when waking up blocked waiters  Jeff Layton
Since we always hold the i_lock when inserting a new waiter onto the fl_block list, we can avoid taking the global lock at all if we find that it's empty when we go to wake up blocked waiters. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: protect most of the file_lock handling with i_lock  Jeff Layton
Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: encapsulate the fl_link list handling  Jeff Layton
Move the fl_link list handling routines into a separate set of helpers. Also ensure that locks and requests are always put on global lists last (after fully initializing them) and are taken off before uninitializing them. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: make "added" in __posix_lock_file a bool  Jeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: comment cleanups and clarifications  Jeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: make generic_add_lease and generic_delete_lease static  Jeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  cifs: use posix_unblock_lock instead of locks_delete_block  Jeff Layton
commit 66189be74 (CIFS: Fix VFS lock usage for oplocked files) exported the locks_delete_block symbol. There's already an exported helper function that provides this capability however, so make cifs use that instead and turn locks_delete_block back into a static function. Note that if fl->fl_next == NULL then this lock has already been through locks_delete_block(), so we should be OK to ignore an ENOENT error here and simply not retry the lock. Cc: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  locks: drop the unused filp argument to posix_unblock_lock  Jeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  Don't pass inode to ->d_hash() and ->d_compare()  Linus Torvalds
Instances either don't look at it at all (the majority of cases) or only want it to find the superblock (which can be had as dentry->d_sb). A few cases that want more are actually safe with dentry->d_inode - the only precaution needed is the check that it hadn't been replaced with NULL by rmdir() or by overwriting rename(), in which case it should simply be treated as a cache miss. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  minix: bug widening a binary "not" operation  Dan Carpenter
"chunk_size" is an unsigned int and "pos" is an unsigned long. The "& ~(chunk_size-1)" operation clears the high 32 bits unintentionally. The ALIGN() macro does the correct thing. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
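The widening problem is easy to reproduce in user space. The sketch below is an illustration, not the minix code: uint32_t stands in for "unsigned int", uint64_t stands in for a 64-bit "unsigned long", and both function names are hypothetical. The wide-mask variant follows the same idea as an alignment helper that first converts the alignment to the type of the value being aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: ~(chunk_size - 1) is evaluated in 32 bits. When it is
 * converted to 64 bits for the AND, its upper 32 bits are zero, so the
 * upper half of pos is cleared along with the low bits. */
uint64_t align_up_narrow_mask(uint64_t pos, uint32_t chunk_size)
{
    /* chunk_size must be a non-zero power of two */
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
    return (pos + chunk_size - 1) & ~(chunk_size - 1);
}

/* Fixed pattern: widen the mask before inverting it, so the upper
 * 32 bits of the inverted mask are all ones. */
uint64_t align_up_wide_mask(uint64_t pos, uint32_t chunk_size)
{
    uint64_t mask;

    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
    mask = (uint64_t)chunk_size - 1;
    return (pos + mask) & ~mask;
}
```

For pos = 0x100000001 and chunk_size = 16, the narrow-mask variant returns 0x10, while the wide-mask variant returns 0x100000010. For positions below 4 GiB the two agree, which is why the bug is easy to miss.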
2013-06-29  splice: lift checks from do_splice_from() into callers  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  constify rw_verify_area()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  new helper: fixed_size_llseek()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  ecryptfs: switch ecryptfs_decode_and_decrypt_filename() from dentry to sb  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  fuse: another open-coded file_inode()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  btrfs: more open-coded file_inode()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  fanotify: quit wanking with FASYNC in ->release()  Al Viro
... especially since there's no way to get that sucker on the list fsnotify_fasync() works with - the only thing adding to it is fsnotify_fasync() itself and it's never called for fanotify files while they are opened. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  kill find_inode_number()  Al Viro
the only remaining caller (in ncpfs) is guaranteed to return 0 - we only hit it if we'd just checked that there's no dentry with such name. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  coda: don't bother with find_inode_number()  Al Viro
the fallback it's using for dcache misses is actually the same value we would've used for inumber anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  proc_fill_cache(): clean up, get rid of pointless find_inode_number() use  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  proc_fill_cache(): just make instantiate_t return int  Al Viro
all instances always return ERR_PTR(-E...) or NULL, anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  proc_pid_readdir(): stop wanking with proc_fill_cache() for /proc/self  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  proc_fill_cache(): kill pointless check  Al Viro
we'd just checked that child->d_inode is non-NULL, for fuck sake! Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  ncpfs: don't bother with EBUSY on removal of busy directories  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  don't call file_pos_write() if vfs_{read,write}{,v}() fails  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  Replace a bunch of file->dentry->d_inode refs with file_inode()  David Howells
Replace a bunch of file->dentry->d_inode refs with file_inode(). In __fput(), use file->f_inode instead so as not to be affected by any tricks that file_inode() might grow. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
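For readers unfamiliar with the helper, the following mock shows the shape of the substitution. The struct definitions are drastically reduced, hypothetical stand-ins for the kernel types, and open_coded_inode() is an illustrative name for the pattern being replaced; the sketch assumes, as the commit message implies, that the file carries a cached f_inode pointer that the helper returns.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal mock types; the real kernel structures have many more fields. */
struct inode  { unsigned long i_ino; };
struct dentry { struct inode *d_inode; };
struct path   { struct dentry *dentry; };
struct file {
    struct path   f_path;
    struct inode *f_inode;  /* cached inode pointer */
};

/* Open-coded form: walk through the dentry on every use. */
static inline struct inode *open_coded_inode(const struct file *f)
{
    assert(f != NULL && f->f_path.dentry != NULL);
    return f->f_path.dentry->d_inode;
}

/* Helper form: one accessor, so its internals can change in one place. */
static inline struct inode *file_inode(const struct file *f)
{
    assert(f != NULL);
    return f->f_inode;
}
```

Funnelling callers through one accessor means any later change to how a file's inode is found touches a single function. That is also why the commit has __fput() read file->f_inode directly: it should not be affected by such changes.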
2013-06-29  udf: provide ->tmpfile()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  ext3 ->tmpfile() support  Al Viro
In this case we do need a bit more than usual, due to orphan list handling. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  allow the temp files created by open() to be linked to  Al Viro
O_TMPFILE | O_CREAT => linkat() with AT_SYMLINK_FOLLOW and /proc/self/fd/<n> as oldpath (i.e. flink()) will create a link
O_TMPFILE | O_CREAT | O_EXCL => ENOENT on attempt to link those guys
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  [O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now...  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  allow build_open_flags() to return an error  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  lift file_*_write out of do_splice_direct()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  lift file_*_write out of do_splice_from()  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  do_last(): fix missing checks for LAST_BIND case  Al Viro
/proc/self/cwd with O_CREAT should fail with EISDIR. /proc/self/exe, OTOH, should fail with ENOTDIR when opened with O_DIRECTORY. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  [readdir] constify ->actor  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  [readdir] ->readdir() is gone  Al Viro
everything's converted to ->iterate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  [readdir] convert ecryptfs  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29  [readdir] convert coda  Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>