summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2013-02-27Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 regression fix from Theodore Ts'o: "This fixes a real brown paper bag bug which causes ext4 to choke on file systems larger than 512GB." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix extent status tree regression for file systems > 512GB
2013-02-27ext4: fix extent status tree regression for file systems > 512GBTheodore Ts'o
This fixes a regression introduced by commit f7fec032aa782. The problem was that the extents status flags caused us to mask out block numbers smaller than 2**28 blocks. Since we didn't test with file systems smaller than 512GB, we didn't notice this during the development cycle. A typical failure looks like this: EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block 152052301: comm ls: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0 ... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the inode has block number 688923213. When viewed in hex, block number 152052301 (from the syslog) is 0x910224D, while block number 688923213 is 0x2910224D. Note the missing "0x20000000" in the block number. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Verified-by: Markus Trippelsdorf <markus@trippelsdorf.de> Reported-by: Dave Jones <davej@redhat.com> Verified-by: Dave Jones <davej@redhat.com> Cc: Zheng Liu <gnehzuil.liu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-02-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-26Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Theodore Ts'o: "The one new feature added in this patch series is the ability to use the "punch hole" functionality for inodes that are not using extent maps. In the bug fix category, we fixed some races in the AIO and fstrim code, and some potential NULL pointer dereferences and memory leaks in error handling code paths. In the optimization category, we fixed a performance regression in the jbd2 layer introduced by commit d9b01934d56a ("jbd: fix fsync() tid wraparound bug", introduced in v3.0) which shows up in the AIM7 benchmark. We also further optimized jbd2 by minimize the amount of time that transaction handles are held active. This patch series also features some additional enhancement of the extent status tree, which is now used to cache extent information in a more efficient/compact form than what we use on-disk." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (65 commits) ext4: fix free clusters calculation in bigalloc filesystem ext4: no need to remove extent if len is 0 in ext4_es_remove_extent() ext4: fix xattr block allocation/release with bigalloc ext4: reclaim extents from extent status tree ext4: adjust some functions for reclaiming extents from extent status tree ext4: remove single extent cache ext4: lookup block mapping in extent status tree ext4: track all extent status in extent status tree ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag ext4: rename and improbe ext4_es_find_extent() ext4: add physical block and status member into extent status tree ext4: refine extent status tree ext4: use ERR_PTR() abstraction for ext4_append() ext4: refactor code to read directory blocks into ext4_read_dirblock() ext4: add debugging context for warning in ext4_da_update_reserve_space() ext4: use KERN_WARNING for warning messages jbd2: use module parameters instead of debugfs for jbd_debug ext4: use module parameters instead of debugfs for mballoc_debug ext4: start handle at the last possible moment when creating inodes ext4: fix the number of credits needed for acl ops with inline data ...
2013-02-26Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, ext3, udf updates from Jan Kara: "Several UDF fixes, a support for UDF extent cache, and couple of ext2 and ext3 cleanups and minor fixes" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: Ext2: remove the static function release_blocks to optimize the kernel Ext2: mark inode dirty after the function dquot_free_block_nodirty is called Ext2: remove the overhead check about sb in the function ext2_new_blocks udf: Remove unused s_extLength from udf_bitmap udf: Make s_block_bitmap standard array udf: Fix bitmap overflow on large filesystems with small block size udf: add extent cache support in case of file reading udf: Write LVID to disk after opening / closing Ext3: return ENOMEM rather than EIO if sb_getblk fails Ext2: return ENOMEM rather than EIO if sb_getblk fails Ext3: use unlikely to improve the efficiency of the kernel Ext2: use unlikely to improve the efficiency of the kernel Ext3: add necessary check in case IO error happens Ext2: free memory allocated and forget buffer head when io error happens ext3: Fix memory leak when quota options are specified multiple times ext3, ext4, ocfs2: remove unused macro NAMEI_RA_INDEX
2013-02-26Merge tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull ubifs updates from Artem Bityutskiy: "It's been quite silent and we have only a couple of bug-fixes for the orphans handling code plus one cosmetic change." * tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifs: UBIFS: fix double free of ubifs_orphan objects UBIFS: fix use of freed ubifs_orphan objects UBIFS: rename random32() to prandom_u32()
2013-02-26Merge tag 'f2fs-for-3.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs update from Jaegeuk Kim: "[Major bug fixes] o Store device file information correctly o Fix -EIO handling with respect to power-off-recovery o Allocate blocks with global locks o Fix wrong calculation of the SSR cost [Cleanups] o Get rid of fake on-stack dentries [Enhancement] o Support (un)freeze_fs o Enhance the f2fs_gc flow o Support 32-bit binary execution on 64-bit kernel" * tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits) f2fs: avoid build warning f2fs: add compat_ioctl to provide backward compatability f2fs: fix calculation of max. gc cost in the SSR case f2fs: clarify and enhance the f2fs_gc flow f2fs: optimize the return condition for has_not_enough_free_secs f2fs: make an accessor to get sections for particular block type f2fs: mark gc_thread as NULL when thread creation is failed f2fs: name gc task as per the block device f2fs: remove unnecessary gc option check and balance_fs f2fs: remove repeated F2FS_SET_SB_DIRT call f2fs: when check superblock failed, try to check another superblock f2fs: use F2FS_BLKSIZE to judge bloksize and page_cache_size f2fs: add device name in debugfs f2fs: stop repeated checking if cp is needed f2fs: avoid balanc_fs during evict_inode f2fs: remove the use of page_cache_release f2fs: fix typo mistake for data_version description f2fs: reorganize code for ra_node_page f2fs: avoid redundant call to has_not_enough_free_secs in f2fs_gc f2fs: add un/freeze_fs into super_operations ...
2013-02-26saner proc_get_inode() calling conventionsAl Viro
Make it drop the pde in *all* cases when no new reference to it is put into an inode - both when an inode had already been set up (as we were already doing) and when inode allocation has failed. Makes for simpler logics in callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26proc: avoid extra pde_put() in proc_fill_super()Maxim Patlasov
If proc_get_inode() succeeded, but d_make_root() failed, pde_put() for proc_root will be called twice: the first time due to iput() called from d_make_root() and the second time directly in the end of proc_fill_super(). Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs: change return values from -EACCES to -EPERMZhao Hongjiang
According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs/exec.c: make bprm_mm_init() staticYuanhan Liu
There is only one user of bprm_mm_init, and it's inside the same file. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2/dlm: use GFP_ATOMIC inside a spin_lockDan Carpenter
My static checker complains that this is called with a spin_lock held in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason we have not received any bug reports about this is that recovery is not a common operation. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2: fix possible use-after-free with AIOJan Kara
Running AIO is pinning inode in memory using file reference. Once AIO is completed using aio_complete(), file reference is put and inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode. Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code pathSunil Mushran
Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init. Reported-and-Tested-by: Vincent Etienne <vetienne@aprogsys.com> Signed-off-by: Sunil Mushran <sunil.mushran@gmail.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zeroAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26export kernel_write(), convert open-coded instancesAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs: encode_fh: return FILEID_INVALID if invalid fid_typeNamjae Jeon
This patch is a follow up on below patch: [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Vivek Trivedi <t.vivek@samsung.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Sage Weil <sage@inktank.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26kill f_vfsmntAl Viro
very few users left... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry opJeff Layton
The following set of operations on a NFS client and server will cause server# mkdir a client# cd a server# mv a a.bak client# sleep 30 # (or whatever the dir attrcache timeout is) client# stat . stat: cannot stat `.': Stale NFS file handle Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set. nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE. The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid. Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This is also gives us an opportunity to kill off the FS_REVAL_DOT special casing. [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)] Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26nfsd: handle vfs_getattr errors in acl protocolJ. Bruce Fields
We're currently ignoring errors from vfs_getattr. The correct thing to do is to do the stat in the main service procedure not in the response encoding. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26switch vfs_getattr() to struct pathAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ceph: prepopulate inodes only when request is abortedSage Weil
If r_aborted is true, we do not hold the dir i_mutex, and cannot touch the dcache. However, we still need to update the inodes with the state returned by the MDS. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26d_hash_and_lookup(): export, switch open-coded instancesAl Viro
* calling conventions change - ERR_PTR() is returned on ->d_hash() errors; NULL is just for dcache miss now. * exported, open-coded instances in ncpfs and cifs converted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: split dropping the acls from v9fs_set_create_acl()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_acl_chmod() from dentry to inode+fidAl Viro
caller has both, might as well pass them explicitly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_set_acl() from dentry to fidAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: lift the call of set_cached_acl() into the callers of v9fs_set_acl()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: add fid-based variant of v9fs_xattr_set()Al Viro
... making v9fs_xattr_set() a wrapper for it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26hugetlb_file_setup(): use d_alloc_pseudo()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fix from Al Viro: "Fix for 3.8 breakage introduced by "vfs: Allow unprivileged manipulation of the mount namespace" - accessing mnt->mnt_ns is done there without needed locking *and* without any real need. Definite -stable fodder, fortunately not going too far back. This is *not* all - there will be much bigger vfs pull request tomorrow." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: get rid of unprotected dereferencing of mnt->mnt_ns
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...
2013-02-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "This is the first pile; another one will come a bit later and will contain SYSCALL_DEFINE-related patches. - a bunch of signal-related syscalls (both native and compat) unified. - a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE (fixing several potential problems with missing argument validation, while we are at it) - a lot of now-pointless wrappers killed - a couple of architectures (cris and hexagon) forgot to save altstack settings into sigframe, even though they used the (uninitialized) values in sigreturn; fixed. - microblaze fixes for delivery of multiple signals arriving at once - saner set of helpers for signal delivery introduced, several architectures switched to using those." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits) x86: convert to ksignal sparc: convert to ksignal arm: switch to struct ksignal * passing alpha: pass k_sigaction and siginfo_t using ksignal pointer burying unused conditionals make do_sigaltstack() static arm64: switch to generic old sigaction() (compat-only) arm64: switch to generic compat rt_sigaction() arm64: switch compat to generic old sigsuspend arm64: switch to generic compat rt_sigqueueinfo() arm64: switch to generic compat rt_sigpending() arm64: switch to generic compat rt_sigprocmask() arm64: switch to generic sigaltstack sparc: switch to generic old sigsuspend sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE sparc: kill sign-extending wrappers for native syscalls kill sparc32_open() sparc: switch to use of generic old sigaction sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE mips: switch to generic sys_fork() and sys_clone() ...
2013-02-24Merge branch 'akpm' (more incoming from Andrew)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - A little DM fix - the MM queue * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (154 commits) ksm: allocate roots when needed mm: cleanup "swapcache" in do_swap_page mm,ksm: swapoff might need to copy mm,ksm: FOLL_MIGRATION do migration_entry_wait ksm: shrink 32-bit rmap_item back to 32 bytes ksm: treat unstable nid like in stable tree ksm: add some comments tmpfs: fix mempolicy object leaks tmpfs: fix use-after-free of mempolicy object mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages mm: export mmu notifier invalidates mm: accelerate mm_populate() treatment of THP pages mm: use long type for page counts in mm_populate() and get_user_pages() mm: accurately document nr_free_*_pages functions with code comments HWPOISON: change order of error_states[]'s elements HWPOISON: fix misjudgement of page_action() for errors on mlocked pages memcg: stop warning on memcg_propagate_kmem net: change type of virtio_chan->p9_max_pages vmscan: change type of vm_total_pages to unsigned long fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used ...
2013-02-24fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_usedZhang Yanfei
The three variables are calculated from nr_free_buffer_pages so change their types to unsigned long in case of overflow. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24fs/buffer.c: change type of max_buffer_heads to unsigned longZhang Yanfei
max_buffer_heads is calculated from nr_free_buffer_pages(), so change its type to unsigned long in case of overflow. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24swap: make each swap partition have one address_spaceShaohua Li
When I use several fast SSD to do swap, swapper_space.tree_lock is heavily contended. This makes each swap partition have one address_space to reduce the lock contention. There is an array of address_space for swap. The swap entry type is the index to the array. In my test with 3 SSD, this increases the swapout throughput 20%. [akpm@linux-foundation.org: revert unneeded change to __add_to_swap_cache] Signed-off-by: Shaohua Li <shli@fusionio.com> Cc: Hugh Dickins <hughd@google.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24memory-failure: use num_poisoned_pages instead of mce_bad_pagesXishi Qiu
Since MCE is an x86 concept, and this code is in mm/, it would be better to use the name num_poisoned_pages instead of mce_bad_pages. [akpm@linux-foundation.org: fix mm/sparse.c] Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Suggested-by: Borislav Petkov <bp@alien8.de> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24mm: make do_mmap_pgoff return populate as a size in bytes, not as a boolMichel Lespinasse
do_mmap_pgoff() rounds up the desired size to the next PAGE_SIZE multiple, however there was no equivalent code in mm_populate(), which caused issues. This could be fixed by introduced the same rounding in mm_populate(), however I think it's preferable to make do_mmap_pgoff() return populate as a size rather than as a boolean, so we don't have to duplicate the size rounding logic in mm_populate(). Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Andy Lutomirski <luto@amacapital.net> Cc: Greg Ungerer <gregungerer@westnet.com.au> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24mm: introduce mm_populate() for populating new vmasMichel Lespinasse
When creating new mappings using the MAP_POPULATE / MAP_LOCKED flags (or with MCL_FUTURE in effect), we want to populate the pages within the newly created vmas. This may take a while as we may have to read pages from disk, so ideally we want to do this outside of the write-locked mmap_sem region. This change introduces mm_populate(), which is used to defer populating such mappings until after the mmap_sem write lock has been released. This is implemented as a generalization of the former do_mlock_pages(), which accomplished the same task but was using during mlock() / mlockall(). Signed-off-by: Michel Lespinasse <walken@google.com> Reported-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Andy Lutomirski <luto@amacapital.net> Cc: Greg Ungerer <gregungerer@westnet.com.au> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23hostfs: directory methods have no business in non-directory inode_operationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23__d_materialise_unique() is too genericAl Viro
Its first argument is always non-root, while the second one is always root. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23fs: Fix possible use-after-free with AIOJan Kara
Running AIO is pinning inode in memory using file reference. Once AIO is completed using aio_complete(), file reference is put and inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode. CC: Christoph Hellwig <hch@infradead.org> CC: Jens Axboe <axboe@kernel.dk> CC: Jeff Moyer <jmoyer@redhat.com> CC: stable@vger.kernel.org Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23constify d_lookup() argumentsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23constify __d_lookup() argumentsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23lookup_slow: get rid of name argumentAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23lookup_fast: get rid of name argumentAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23get rid of name and type arguments of walk_component()Al Viro
... always can be found in nameidata now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23link_path_walk(): move assignments to nd->last/nd->last_type upAl Viro
... and clean the main loop a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23vfs: remove d_path_with_unreachableJeff Layton
The last caller was removed >2 years ago in commit 7b2a69ba7. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>