summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2012-06-12Merge tag 'writeback-lock-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull writeback locking fix from Wu Fengguang: "fix unbalanced wb->list_lock in 3.5-rc1" * tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Fix lock imbalance in writeback_sb_inodes()
2012-06-12exofs: fix sparse non-ANSI function warningRandy Dunlap
Fix sparse non-ANSI function warning: fs/exofs/sys.c:112:28: warning: non-ANSI function declaration of function 'exofs_sysfs_dbg_print' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08writeback: Fix lock imbalance in writeback_sb_inodes()Jan Kara
Fix bug introduced by 169ebd90. We have to have wb_list_lock locked when restarting writeback loop after having waited for inode writeback. Bug description by Ted Tso: I can reproduce this fairly easily by using ext4 w/o a journal, running under KVM with 1024megs memory, with fsstress (xfstests #13): [ 45.153294] ===================================== [ 45.154784] [ BUG: bad unlock balance detected! ] [ 45.155591] 3.5.0-rc1-00002-gb22b1f1 #124 Not tainted [ 45.155591] ------------------------------------- [ 45.155591] flush-254:16/2499 is trying to release lock (&(&wb->list_lock)->rlock) at: [ 45.155591] [<c022c3da>] writeback_sb_inodes+0x160/0x327 [ 45.155591] but there are no more locks to release! Reported-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-06-08Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Theodore Ts'o: "This update contains two bug fixes, both destined for the stable tree. Perhaps the most important is one which fixes ext4 when used with file systems originally formatted for use with ext3, but then later converted to take advantage of ext4." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't set i_flags in EXT4_IOC_SETFLAGS ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg
2012-06-08Merge tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull UBI/UBIFS fixes from Artem Bityutskiy: "Fix UBI and UBIFS - they refuse to work without debugfs. This was broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging Kconfig switches. Also, correct locking in 'ubi_wl_flush()' - it was extended to support flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal." * tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs: UBI: correct ubi_wl_flush locking UBIFS: fix debugfs-less systems support UBI: fix debugfs-less systems support
2012-06-08Revert "vfs: stop d_splice_alias creating directory aliases"Linus Torvalds
This reverts commit 7732a557b1342c6e6966efb5f07effcf99f56167 (and commit 3f50fff4dace23d3cfeb195d5cd4ee813cee68b7, which was a follow-up cleanup). We're chasing an elusive bug that Dave Jones can apparently reproduce using his system call fuzzer tool, and that looks like some kind of locking ordering problem on the directory i_mutex chain. Our i_mutex locking is rather complex, and depends on the topological ordering of the directories, which is why we have been very wary of splicing directory entries around. Of course, we really don't want to ever see aliased unconnected directories anyway, so none of this should ever happen, but this revert aims to basically get us back to a known older state. Bruce points to some of the previous discussion at http://marc.info/?i=<20110310105821.GE22723@ZenIV.linux.org.uk> and in particular a long post from Neil: http://marc.info/?i=<20110311150749.2fa2be66@notabene.brown> It should be noted that it's possible that Dave's problems come from other changes altohgether, including possibly just the fact that Dave constantly is teachning his fuzzer new tricks. So what appears to be a new bug could in fact be an old one that just gets newly triggered, but reverting these patches as "still under heavy discussion" is the right thing regardless. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08Revert "mm: correctly synchronize rss-counters at exit/exec"Linus Torvalds
This reverts commit 40af1bbdca47e5c8a2044039bb78ca8fd8b20f94. It's horribly and utterly broken for at least the following reasons: - calling sync_mm_rss() from mmput() is fundamentally wrong, because there's absolutely no reason to believe that the task that does the mmput() always does it on its own VM. Example: fork, ptrace, /proc - you name it. - calling it *after* having done mmdrop() on it is doubly insane, since the mm struct may well be gone now. - testing mm against NULL before you call it is insane too, since a NULL mm there would have caused oopses long before. .. and those are just the three bugs I found before I decided to give up looking for me and revert it asap. I should have caught it before I even took it, but I trusted Andrew too much. Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07ext4: don't set i_flags in EXT4_IOC_SETFLAGSTao Ma
Commit 7990696 uses the ext4_{set,clear}_inode_flags() functions to change the i_flags automatically but fails to remove the error setting of i_flags. So we still have the problem of trashing state flags. Fix this by removing the assignment. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2012-06-07ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bgTheodore Ts'o
Ext3 filesystems that are converted to use as many ext4 file system features as possible will enable uninit_bg to speed up e2fsck times. These file systems will have a native ext3 layout of inode tables and block allocation bitmaps (as opposed to ext4's flex_bg layout). Unfortunately, in these cases, when first allocating a block in an uninitialized block group, ext4 would incorrectly calculate the number of free blocks in that block group, and then errorneously report that the file system was corrupt: EXT4-fs error (device vdd): ext4_mb_generate_buddy:741: group 30, 32254 clusters in bitmap, 32258 in gd This problem can be reproduced via: mke2fs -q -t ext4 -O ^flex_bg /dev/vdd 5g mount -t ext4 /dev/vdd /mnt fallocate -l 4600m /mnt/test The problem was caused by a bone headed mistake in the check to see if a particular metadata block was part of the block group. Many thanks to Kees Cook for finding and bisecting the buggy commit which introduced this bug (commit fd034a84e1, present since v3.2). Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Kees Cook <keescook@chromium.org> Cc: stable@kernel.org
2012-06-07mm: correctly synchronize rss-counters at exit/execKonstantin Khlebnikov
mm->rss_stat counters have per-task delta: task->rss_stat. Before changing task->mm pointer the kernel must flush this delta with sync_mm_rss(). do_exit() already calls sync_mm_rss() to flush the rss-counters before committing the rss statistics into task->signal->maxrss, taskstats, audit and other stuff. Unfortunately the kernel does this before calling mm_release(), which can call put_user() for processing task->clear_child_tid. So at this point we can trigger page-faults and task->rss_stat becomes non-zero again. As a result mm->rss_stat becomes inconsistent and check_mm() will print something like this: | BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1 | BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1 This patch moves sync_mm_rss() into mm_release(), and moves mm_release() out of do_exit() and calls it earlier. After mm_release() there should be no pagefaults. [akpm@linux-foundation.org: tweak comment] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> [3.4.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07UBIFS: fix debugfs-less systems supportArtem Bityutskiy
Commit "f70b7e5 UBIFS: remove Kconfig debugging option" broke UBIFS and it refuses to initialize if debugfs (CONFIG_DEBUG_FS) is disabled. I incorrectly assumed that debugfs files creation function will return success if debugfs is disabled, but they actually return -ENODEV. This patch fixes the issue. Reported-by: Paul Parsons <lost.distance@yahoo.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Paul Parsons <lost.distance@yahoo.com>
2012-06-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix blksize calculation fuse: fix stat call on 32 bit platforms fuse: optimize fallocate on permanent failure fuse: add FALLOCATE operation fuse: Convert to kstrtoul_from_user
2012-06-04Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French. * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Move get_next_mid to ops struct CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe CIFS: Improve identation in cifs_unlock_range CIFS: Fix possible wrong memory allocation
2012-06-04vfs: Fix /proc/<tid>/fdinfo/<fd> file handlingLinus Torvalds
Cyrill Gorcunov reports that I broke the fdinfo files with commit 30a08bf2d31d ("proc: move fd symlink i_mode calculations into tid_fd_revalidate()"), and he's quite right. The tid_fd_revalidate() function is not just used for the <tid>/fd symlinks, it's also used for the <tid>/fdinfo/<fd> files, and the permission model for those are different. So do the dynamic symlink permission handling just for symlinks, making the fdinfo files once more appear as the proper regular files they are. Of course, Al Viro argued (probably correctly) that we shouldn't do the symlink permission games at all, and make the symlinks always just be the normal 'lrwxrwxrwx'. That would have avoided this issue too, but since somebody noticed that the permissions had changed (which was the reason for that original commit 30a08bf2d31d in the first place), people do apparently use this feature. [ Basically, you can use the symlink permission data as a cheap "fdinfo" replacement, since you see whether the file is open for reading and/or writing by just looking at st_mode of the symlink. So the feature does make sense, even if the pain it has caused means we probably shouldn't have done it to begin with. ] Reported-and-tested-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-02Merge branch 'akpm' (Fixups for Andrew's patchbomb)Linus Torvalds
Merge fixups for the mac NLS tables from Andrew. * emailed from Andrew Morton, and one cleanup by me: nls: fix (and rename) mac NLS table files and config options fs/nls/Makefile: remove bogus CONFIG_ assignments
2012-06-02nls: fix (and rename) mac NLS table files and config optionsLinus Torvalds
The config options in the Kconfig file (with _CODEPAGE_ in the name) didn't match the config option name in the Makefile (no _CODEPAGE_). And both of them were of the hard-to-read MACXYZZY variety, which made them hard to parse for normal humans: MACROMAN easily reads as "macro man", not as "Mac Roman". So rename the options to be consistent, and be NLS_MAC_xyzzy. Rename the files to be mac-xyzzy.c too, and drop the "nls" part entirely (it's already in the directory name). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-02fs/nls/Makefile: remove bogus CONFIG_ assignmentsAndrew Morton
These were debug things which snuck through. Reported-by: Yinghai Lu <yinghai@kernel.org> Cc: Vladimir Serbinenko <phcoder@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-01Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull mtd update from David Woodhouse: - More robust parsing especially of xattr data in JFFS2 - Updates to mxc_nand and gpmi drivers to support new boards and device tree - Improve consistency of information about ECC strength in NAND devices - Clean up partition handling of plat_nand - Support NAND drivers without dedicated access to OOB area - BCH hardware ECC support for OMAP - Other fixes and cleanups, and a few new device IDs Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to added include files next to each other. * tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits) mtd: mxc_nand: move ecc strengh setup before nand_scan_tail mtd: block2mtd: fix recursive call of mtd_writev mtd: gpmi-nand: define ecc.strength mtd: of_parts: fix breakage in Kconfig mtd: nand: fix scan_read_raw_oob mtd: docg3 fix in-middle of blocks reads mtd: cfi_cmdset_0002: Slight cleanup of fixup messages mtd: add fixup for S29NS512P NOR flash. jffs2: allow to complete xattr integrity check on first GC scan jffs2: allow to discriminate between recoverable and non-recoverable errors mtd: nand: omap: add support for hardware BCH ecc ARM: OMAP3: gpmc: add BCH ecc api and modes mtd: nand: check the return code of 'read_oob/read_oob_raw' mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw' mtd: m25p80: Add support for Winbond W25Q80BW jffs2: get rid of jffs2_sync_super jffs2: remove unnecessary GC pass on sync jffs2: remove unnecessary GC pass on umount jffs2: remove lock_super mtd: gpmi: add gpmi support for mx6q ...
2012-06-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of signal handling patches from Al Viro: "This time it's mostly helpers and conversions to them; there's a lot of stuff remaining in the tree, but that'll either go in -rc2 (isolated bug fixes, ideally via arch maintainers' trees) or will sit there until the next cycle." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: x86: get rid of calling do_notify_resume() when returning to kernel mode blackfin: check __get_user() return value whack-a-mole with TIF_FREEZE FRV: Optimise the system call exit path in entry.S [ver #2] FRV: Shrink TIF_WORK_MASK [ver #2] FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions new helper: signal_delivered() powerpc: get rid of restore_sigmask() most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set set_restore_sigmask() is never called without SIGPENDING (and never should be) TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set don't call try_to_freeze() from do_signal() pull clearing RESTORE_SIGMASK into block_sigmask() sh64: failure to build sigframe != signal without handler openrisc: tracehook_signal_handler() is supposed to be called on success new helper: sigmask_to_save() new helper: restore_saved_sigmask() new helpers: {clear,test,test_and_clear}_restore_sigmask() HAVE_RESTORE_SIGMASK is defined on all architectures now
2012-06-01CIFS: Move get_next_mid to ops structPavel Shilovsky
Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safePavel Shilovsky
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01CIFS: Improve identation in cifs_unlock_rangePavel Shilovsky
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01CIFS: Fix possible wrong memory allocationPavel Shilovsky
when cifs_reconnect sets maxBuf to 0 and we try to calculate a size of memory we need to store locks. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs changes from Al Viro. "A lot of misc stuff. The obvious groups: * Miklos' atomic_open series; kills the damn abuse of ->d_revalidate() by NFS, which was the major stumbling block for all work in that area. * ripping security_file_mmap() and dealing with deadlocks in the area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in general. * ->encode_fh() switched to saner API; insane fake dentry in mm/cleancache.c gone. * assorted annotations in fs (endianness, __user) * parts of Artem's ->s_dirty work (jff2 and reiserfs parts) * ->update_time() work from Josef. * other bits and pieces all over the place. Normally it would've been in two or three pull requests, but signal.git stuff had eaten a lot of time during this cycle ;-/" Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the 'truncate_range' inode method was removed by the VM changes, the VFS update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due to sparse fix added twice, with other changes nearby). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits) nfs: don't open in ->d_revalidate vfs: retry last component if opening stale dentry vfs: nameidata_to_filp(): don't throw away file on error vfs: nameidata_to_filp(): inline __dentry_open() vfs: do_dentry_open(): don't put filp vfs: split __dentry_open() vfs: do_last() common post lookup vfs: do_last(): add audit_inode before open vfs: do_last(): only return EISDIR for O_CREAT vfs: do_last(): check LOOKUP_DIRECTORY vfs: do_last(): make ENOENT exit RCU safe vfs: make follow_link check RCU safe vfs: do_last(): use inode variable vfs: do_last(): inline walk_component() vfs: do_last(): make exit RCU safe vfs: split do_lookup() Btrfs: move over to use ->update_time fs: introduce inode operation ->update_time reiserfs: get rid of resierfs_sync_super reiserfs: mark the superblock as dirty a bit later ...
2012-06-01Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull Ext4 updates from Theodore Ts'o: "The major new feature added in this update is Darrick J Wong's metadata checksum feature, which adds crc32 checksums to ext4's metadata fields. There is also the usual set of cleanups and bug fixes." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits) ext4: hole-punch use truncate_pagecache_range jbd2: use kmem_cache_zalloc wrapper instead of flag ext4: remove mb_groups before tearing down the buddy_cache ext4: add ext4_mb_unload_buddy in the error path ext4: don't trash state flags in EXT4_IOC_SETFLAGS ext4: let getattr report the right blocks in delalloc+bigalloc ext4: add missing save_error_info() to ext4_error() ext4: add debugging trigger for ext4_error() ext4: protect group inode free counting with group lock ext4: use consistent ssize_t type in ext4_file_write() ext4: fix format flag in ext4_ext_binsearch_idx() ext4: cleanup in ext4_discard_allocated_blocks() ext4: return ENOMEM when mounts fail due to lack of memory ext4: remove redundundant "(char *) bh->b_data" casts ext4: disallow hard-linked directory in ext4_lookup ext4: fix potential integer overflow in alloc_flex_gd() ext4: remove needs_recovery in ext4_mb_init() ext4: force ro mount if ext4_setup_super() fails ext4: fix potential NULL dereference in ext4_free_inodes_counts() ext4/jbd2: add metadata checksumming to the list of supported features ...
2012-06-01HAVE_RESTORE_SIGMASK is defined on all architectures nowAl Viro
Everyone either defines it in arch thread_info.h or has TIF_RESTORE_SIGMASK and picks default set_restore_sigmask() in linux/thread_info.h. Kill the ifdefs, slap #error in linux/thread_info.h to catch breakage when new ones get merged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01nfs: don't open in ->d_revalidateMiklos Szeredi
NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a mount needs to be followed or not. It does check d_mountpoint() on the dentry, which can result in a weird error if the VFS found that the mount does not in fact need to be followed, e.g.: # mount --bind /mnt/nfs /mnt/nfs-clone # echo something > /mnt/nfs/tmp/bar # echo x > /tmp/file # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar # cat /mnt/nfs/tmp/bar cat: /mnt/nfs/tmp/bar: Not a directory Which should, by any sane filesystem, result in "something" being printed. So instead do the open in f_op->open() and in the unlikely case that the cached dentry turned out to be invalid, drop the dentry and return EOPENSTALE to let the VFS retry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: retry last component if opening stale dentryMiklos Szeredi
NFS optimizes away d_revalidates for last component of open. This means that open itself can find the dentry stale. This patch allows the filesystem to return EOPENSTALE and the VFS will retry the lookup on just the last component if possible. If the lookup was done using RCU mode, including the last component, then this is not possible since the parent dentry is lost. In this case fall back to non-RCU lookup. Currently this is not used since NFS will always leave RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: nameidata_to_filp(): don't throw away file on errorMiklos Szeredi
If open fails, don't put the file. This allows it to be reused if open needs to be retried. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: nameidata_to_filp(): inline __dentry_open()Miklos Szeredi
Copy __dentry_open() into nameidata_to_filp(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_dentry_open(): don't put filpMiklos Szeredi
Move put_filp() out to __dentry_open(), the only caller now. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: split __dentry_open()Miklos Szeredi
Split __dentry_open() into two functions: do_dentry_open() - does most of the actual work, doesn't put file on failure open_check_o_direct() - after a successful open, checks direct_IO method This will allow i_op->atomic_open to do just the file initialization and leave the direct_IO checking to the VFS. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last() common post lookupMiklos Szeredi
Now the post lookup code can be shared between O_CREAT and plain opens since they are essentially the same. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): add audit_inode before openMiklos Szeredi
This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): only return EISDIR for O_CREATMiklos Szeredi
This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): check LOOKUP_DIRECTORYMiklos Szeredi
Check for ENOTDIR before finishing open. This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): make ENOENT exit RCU safeMiklos Szeredi
This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: make follow_link check RCU safeMiklos Szeredi
This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): use inode variableMiklos Szeredi
Use helper variable instead of path->dentry->d_inode before complete_walk(). This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): inline walk_component()Miklos Szeredi
Copy walk_component() into do_lookup(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: do_last(): make exit RCU safeMiklos Szeredi
Allow returning from do_last() with LOOKUP_RCU still set on the "out:" and "exit:" labels. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01vfs: split do_lookup()Miklos Szeredi
Split do_lookup() into two functions: lookup_fast() - does cached lookup without i_mutex lookup_slow() - does lookup with i_mutex Both follow managed dentries. The new functions are needed by atomic_open. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01Btrfs: move over to use ->update_timeJosef Bacik
Btrfs had been doing it's own file_update_time so we could catch ENOSPC properly, so just update our btrfs_update_time to work with the new stuff and then we'll be fancy later. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-01fs: introduce inode operation ->update_timeJosef Bacik
Btrfs has to make sure we have space to allocate new blocks in order to modify the inode, so updating time can fail. We've gotten around this by having our own file_update_time but this is kind of a pain, and Christoph has indicated he would like to make xfs do something different with atime updates. So introduce ->update_time, where we will deal with i_version an a/m/c time updates and indicate which changes need to be made. The normal version just does what it has always done, updates the time and marks the inode dirty, and then filesystems can choose to do something different. I've gone through all of the users of file_update_time and made them check for errors with the exception of the fault code since it's complicated and I wasn't quite sure what to do there, also Jan is going to be pushing the file time updates into page_mkwrite for those who have it so that should satisfy btrfs and make it not a big deal to check the file_update_time() return code in the generic fault path. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This includes a fairly large change from Josef around data writeback completion. Before, the writeback wasn't completed until the metadata insertions for the extent were done, and this made for fairly large latency spikes on the last page of each ordered extent. We already had a separate mechanism for tracking pending metadata insertions, so Josef just needed to tweak things a little to end writeback earlier on the page. Overall it makes us much friendly to memory reclaim and lowers latencies quite a lot for synchronous IO. Jan Schmidt has finished some background work required to track btree blocks as they go through changes in ownership. It's the missing piece he needed for both btrfs send/receive and subvolume quotas. Neither of those are ready yet, but the new tracking code is included here. Most of the time, the new code is off. It is only used by scrub and other backref walkers. Stefan Behrens has added io failure tracking. This includes counters for which drives are causing the most trouble so the admin (or an automated tool) can choose to kick them out. We're tracking IO errors, crc errors, and generation checks we do on each metadata block. RAID5/6 did miss the cut this time because I'm having trouble with corruptions. I'll nail it down next week and post as a beta testing before 3.6" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (58 commits) Btrfs: fix tree mod log rewinded level and rewinding of moved keys Btrfs: fix tree mod log del_ptr Btrfs: add tree_mod_dont_log helper Btrfs: add missing spin_lock for insertion into tree mod log Btrfs: add inodes before dropping the extent lock in find_all_leafs Btrfs: use delayed ref sequence numbers for all fs-tree updates Btrfs: fix false positive in check-integrity on unmount Btrfs: fix runtime warning in check-integrity check data mode Btrfs: set ioprio of scrub readahead to idle Btrfs: fix return code in drop_objectid_items Btrfs: check to see if the inode is in the log before fsyncing Btrfs: return value of btrfs_read_buffer is checked correctly Btrfs: read device stats on mount, write modified ones during commit Btrfs: add ioctl to get and reset the device stats Btrfs: add device counters for detected IO and checksum errors btrfs: Drop unused function btrfs_abort_devices() Btrfs: fix the same inode id problem when doing auto defragment Btrfs: fall back to non-inline if we don't have enough space Btrfs: fix how we deal with the orphan block rsv Btrfs: convert the inode bit field to use the actual bit operations ...
2012-06-01Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull the rest of the nfsd commits from Bruce Fields: "... and then I cherry-picked the remainder of the patches from the head of my previous branch" This is the rest of the original nfsd branch, rebased without the delegation stuff that I thought really needed to be redone. I don't like rebasing things like this in general, but in this situation this was the lesser of two evils. * 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits) nfsd4: fix, consolidate client_has_state nfsd4: don't remove rebooted client record until confirmation nfsd4: remove some dprintk's and a comment nfsd4: return "real" sequence id in confirmed case nfsd4: fix exchange_id to return confirm flag nfsd4: clarify that renewing expired client is a bug nfsd4: simpler ordering of setclientid_confirm checks nfsd4: setclientid: remove pointless assignment nfsd4: fix error return in non-matching-creds case nfsd4: fix setclientid_confirm same_cred check nfsd4: merge 3 setclientid cases to 2 nfsd4: pull out common code from setclientid cases nfsd4: merge last two setclientid cases nfsd4: setclientid/confirm comment cleanup nfsd4: setclientid remove unnecessary terms from a logical expression nfsd4: move rq_flavor into svc_cred nfsd4: stricter cred comparison for setclientid/exchange_id nfsd4: move principal name into svc_cred nfsd4: allow removing clients not holding state nfsd4: rearrange exchange_id logic to simplify ...
2012-06-01reiserfs: get rid of resierfs_sync_superArtem Bityutskiy
This patch stops reiserfs using the VFS 'write_super()' method along with the s_dirt flag, because they are on their way out. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds, even if there are no diry superblocks, or there are no client file-systems which would need this (e.g., btrfs does not use '->write_super()'). So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super()' VFS service, and then remove it together with the kernel thread. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01reiserfs: mark the superblock as dirty a bit laterArtem Bityutskiy
The 'journal_mark_dirty()' function currently first marks the superblock as dirty by setting 's_dirt' to 1, then does various sanity checks and returns, then actuall does all the magic with the journal. This is not an ideal order, though. It makes more sense to first do all the checks, then do all the internal stuff, and at the end notify the VFS that the superblock is now dirty. This patch moves the 's_dirt = 1' assignment from the very beginning of this function to the very end. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01reiserfs: remove useless superblock dirtyingArtem Bityutskiy
The 'reiserfs_resize()' function marks the superblock as dirty by assigning 1 to 's_dirt' and then calls 'journal_mark_dirty()' which does the same. Thus, we can remove the assignment from 'reiserfs_resize()'. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01reiserfs: clean-up function return typeArtem Bityutskiy
Turn 'reiserfs_flush_old_commits()' into a void function because the callers do not cares about what it returns anyway. We are going to remove the 'sb->s_dirt' field completely and this patch is a small step towards this direction. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>