summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2013-07-11xfs: Add pquota fields where gquota is used.Chandra Seetharaman
Add project quota changes to all the places where group quota field is used: * add separate project quota members into various structures * split project quota and group quotas so that instead of overriding the group quota members incore, the new project quota members are used instead * get rid of usage of the OQUOTA flag incore, in favor of separate group and project quota flags. * add a project dquot argument to various functions. Not using the pquotino field from superblock yet. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-11ext4: fix warning in ext4_evict_inode()Jan Kara
The following race can lead to ext4_evict_inode() seeing i_ioend_count > 0 and thus triggering a sanity check warning: CPU1 CPU2 ext4_end_bio() ext4_evict_inode() ext4_finish_bio() end_page_writeback(); truncate_inode_pages() evict page WARN_ON(i_ioend_count > 0); ext4_put_io_end_defer() ext4_release_io_end() dec i_ioend_count This is possible use-after-free bug since we decrement i_ioend_count in possibly released inode. Since i_ioend_count is used only for sanity checks one possible solution would be to just remove it but for now I'd like to keep those sanity checks to help debugging the new ext4 writeback code. This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before ext4_finish_bio() in the shortcut case when unwritten extent conversion isn't needed. In that case we don't need the io_end so we are safe to drop it early. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-11mm: remove free_area_cacheMichel Lespinasse
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-11net: rename include/net/ll_poll.h to include/net/busy_poll.hEliezer Tamir
Rename the file and correct all the places where it is included. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-10[CIFS] Fix minor endian error in durable handle patch seriesSteve French
Fix endian warning: CHECK fs/cifs/smb2pdu.c fs/cifs/smb2pdu.c:1068:40: warning: incorrect type in assignment (different base types) fs/cifs/smb2pdu.c:1068:40: expected restricted __le32 [usertype] Next fs/cifs/smb2pdu.c:1068:40: got unsigned long Signed-off-by: Steve French <smfrench@gmail.com>
2013-07-10CIFS: Reconnect durable handles for SMB2Pavel Shilovsky
On reconnects, we need to reopen file and then obtain all byte-range locks held by the client. SMB2 protocol provides feature to make this process atomic by reconnecting to the same file handle with all it's byte-range locks. This patch adds this capability for SMB2 shares. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Make SMB2_open use cifs_open_parms structPavel Shilovsky
to prepare it for further durable handle reconnect processing. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Introduce cifs_open_parms structPavel Shilovsky
and pass it to the open() call. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Request durable open for SMB2 opensPavel Shilovsky
by passing durable context together with a handle caching lease or batch oplock. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Simplify SMB2 create context handlingPavel Shilovsky
to make it easier to add other create context further. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Simplify SMB2_open code pathPavel Shilovsky
by passing a filename to a separate iovec regardless of its length. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Respect create_options in smb2_open_filePavel Shilovsky
and eliminated unused file_attribute parms of SMB2_open. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10CIFS: Fix lease context buffer parsingPavel Shilovsky
to prevent missing RqLs context if it's not the first one. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10xfs: fix sgid inheritance for subdirectories inheriting default acls [V3]Carlos Maiolino
XFS removes sgid bits of subdirectories under a directory containing a default acl. When a default acl is set, it implies xfs to call xfs_setattr_nonsize() in its code path. Such function is shared among mkdir and chmod system calls, and does some checks unneeded by mkdir (calling inode_change_ok()). Such checks remove sgid bit from the inode after it has been granted. With this patch, we extend the meaning of XFS_ATTR_NOACL flag to avoid these checks when acls are being inherited (thanks hch). Also, xfs_setattr_mode, doesn't need to re-check for group id and capabilities permissions, this only implies in another try to remove sgid bit from the directories. Such check is already done either on inode_change_ok() or xfs_setattr_nonsize(). Changelog: V2: Extends the meaning of XFS_ATTR_NOACL instead of wrap the tests into another function V3: Remove S_ISDIR check in xfs_setattr_nonsize() from the patch Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-10Use ecryptfs_dentry_to_lower_path in a couple of placesMatthew Wilcox
There are two places in ecryptfs that benefit from using ecryptfs_dentry_to_lower_path() instead of separate calls to ecryptfs_dentry_to_lower() and ecryptfs_dentry_to_lower_mnt(). Both sites use fewer instructions and less stack (determined by examining objdump output). Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2013-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit 0a4db187a999 ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...
2013-07-09NFS: Allow nfs_updatepage to extend a write under additional circumstancesScott Mayhew
Currently nfs_updatepage allows a write to be extended to cover a full page only if we don't have a byte range lock lock on the file... but if we have a write delegation on the file or if we have the whole file locked for writing then we should be allowed to extend the write as well. Signed-off-by: Scott Mayhew <smayhew@redhat.com> [Trond: fix up call to nfs_have_delegation()] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09xfs: dquot log reservations are too smallDave Chinner
During review of the separate project quota inode patches, it became obvious that the dquot log reservation calculation underestimated the number dquots that can be modified in a transaction. This has it's roots way back in the Irix quota implementation. That is, when quotas were first implemented in XFS, it only supported user and project quotas as Irix did not have group quotas. Hence the worst case operation involving dquot modification was calculated to involve 2 user dquots and 1 project dquot or 1 user dequot and 2 project dquots. i.e. 3 dquots. This was determined back in 1996, and has remained unchanged ever since. However, back in 2001, the Linux XFS port dropped all support for project quota and implmented group quotas over the top. This was effectively done with a search-and-replace of project with group, and as such the log reservation was not changed. However, with the advent of group quotas, chmod and rename now could modify more than 3 dquots in a single transaction - both could modify 4 dquots. Hence this log reservation has been wrong for a long time. In 2005, project quota support was reintroduced into Linux, but it was implemented to be mutually exclusive to group quotas and so this didn't add any new changes to the dquot log reservation. Hence when project quotas were in use (rather than group quotas) the log reservation was again valid, just like in the Irix days. Now, with the addition of the separate project quota inode, group and project quotas are no longer mutually exclusive, and hence operations can now modify three dquots per inode where previously it was only two. The worst case here is the rename transaction, which can allocate/free space on two different directory inodes, and if they have different uid/gid/prid configurations and are world writeable, then rename can actually modify 6 different dquots now. Further, the dquot log reservation doesn't take into account the space used by the dquot log format structure that precedes the dquot that is logged, and hence further underestimates the worst case log space required by dquots during a transaction. This has been missing since the first commit in 1996. Hence the worst case log reservation needs to be increased from 3 to 6, and it needs to take into account a log format header for each of those dquots. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09xfs: remove local fork format handling from xfs_bmapi_write()Dave Chinner
The conversion from local format to extent format requires interpretation of the data in the fork being converted, so it cannot be done in a generic way. It is up to the caller to convert the fork format to extent format before calling into xfs_bmapi_write() so format conversion can be done correctly. The code in xfs_bmapi_write() to convert the format is used implicitly by the attribute and directory code, but they specifically zero the fork size so that the conversion does not do any allocation or manipulation. Move this conversion into the shortform to leaf functions for the dir/attr code so the conversions are explicitly controlled by all callers. Now we can remove the conversion code in xfs_bmapi_write. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09NFS: Make nfs_readdir revalidate less oftenScott Mayhew
Make nfs_readdir revalidate only when we're at the beginning of the directory or if the cached attributes have expired. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09NFS: Make nfs_attribute_cache_expired() non-staticScott Mayhew
NFS: Make nfs_attribute_cache_expired() non-static so we can call it from nfs_readdir(). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09nfs: set verifier on existing dentries in nfs_prime_dcacheJeff Layton
nfs_prime_dcache currently only sets the verifier when it doesn't initially a matching dentry in the dcache. Set the verifier in the case where we do find a dentry in the dcache. This ensures that we don't have to look up the dentry again if we want to use it after a readdir. Cc: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09xfs: use get_unused_fd_flags(0) instead of get_unused_fd()Yann Droneaud
Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09xfs: clean up unused codes at xfs_bulkstat()Jie Liu
There are some unused codes at xfs_bulkstat(): - Variable bp is defined to point to the on-disk inode cluster buffer, but it proved to be of no practical help. - We process the chunks of good inodes which were fetched by iterating btree records from an AG. When processing inodes from each chunk, the code recomputing agbno if run into the first inode of a cluster, however, the agbno is not being used thereafter. This patch tries to clean up those things. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09Merge branch 'akpm' (updates from Andrew Morton)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - misc fixes - audit stuff - fanotify/inotify/dnotify things - most of the rest of MM. The new cache shrinker code from Glauber and Dave Chinner probably isn't quite stabilized yet. - ptrace - ipc - partitions - reboot cleanups - add LZ4 decompressor, use it for kernel compression * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) lib/scatterlist: error handling in __sg_alloc_table() scsi_debug: fix do_device_access() with wrap around range crypto: talitos: use sg_pcopy_to_buffer() lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() crypto: add lz4 Cryptographic API lib: add lz4 compressor module arm: add support for LZ4-compressed kernel lib: add support for LZ4-compressed kernel decompressor: add LZ4 decompressor module lib: add weak clz/ctz functions reboot: move arch/x86 reboot= handling to generic kernel reboot: arm: change reboot_mode to use enum reboot_mode reboot: arm: prepare reboot_mode for moving to generic kernel code reboot: arm: remove unused restart_mode fields from some arm subarchs reboot: unicore32: prepare reboot_mode for moving to generic kernel code reboot: x86: prepare reboot_mode for moving to generic kernel code reboot: checkpatch.pl the new kernel/reboot.c file reboot: move shutdown/reboot related functions to kernel/reboot.c reboot: remove -stable friendly PF_THREAD_BOUND define ...
2013-07-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits) libceph: fix invalid unsigned->signed conversion for timespec encoding libceph: call r_unsafe_callback when unsafe reply is received ceph: fix race between cap issue and revoke ceph: fix cap revoke race ceph: fix pending vmtruncate race ceph: avoid accessing invalid memory libceph: Fix NULL pointer dereference in auth client code ceph: Reconstruct the func ceph_reserve_caps. ceph: Free mdsc if alloc mdsc->mdsmap failed. ceph: remove sb_start/end_write in ceph_aio_write. ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. ceph: fix sleeping function called from invalid context. ceph: move inode to proper flushing list when auth MDS changes rbd: fix a couple warnings ceph: clear migrate seq when MDS restarts ceph: check migrate seq before changing auth cap ceph: fix race between page writeback and truncate ceph: reset iov_len when discarding cap release messages ceph: fix cap release race libceph: fix truncate size calculation ...
2013-07-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs update from Chris Mason: "These are the usual mixture of bugs, cleanups and performance fixes. Miao has some really nice tuning of our crc code as well as our transaction commits. Josef is peeling off more and more problems related to early enospc, and has a number of important bug fixes in here too" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits) Btrfs: wait ordered range before doing direct io Btrfs: only do the tree_mod_log_free_eb if this is our last ref Btrfs: hold the tree mod lock in __tree_mod_log_rewind Btrfs: make backref walking code handle skinny metadata Btrfs: fix crash regarding to ulist_add_merge Btrfs: fix several potential problems in copy_nocow_pages_for_inode Btrfs: cleanup the code of copy_nocow_pages_for_inode() Btrfs: fix oops when recovering the file data by scrub function Btrfs: make the chunk allocator completely tree lockless Btrfs: cleanup orphaned root orphan item Btrfs: fix wrong mirror number tuning Btrfs: cleanup redundant code in btrfs_submit_direct() Btrfs: remove btrfs_sector_sum structure Btrfs: check if we can nocow if we don't have data space Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc Btrfs: use a percpu to keep track of possibly pinned bytes Btrfs: check for actual acls rather than just xattrs when caching no acl Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate Btrfs: optimize reada_for_balance Btrfs: optimize read_block_for_search ...
2013-07-09net/fs: change busy poll time accountingEliezer Tamir
Suggested by Linus: Changed time accounting for busy-poll: - Make it microsecond based. - Use unsigned longs. - Revert back to use time_after instead of time_in_range. Reorder poll/select busy loop conditions: - Clear busy_flag after one time we can't busy-poll. - Only init busy_end if we actually are going to busy-poll. Added one more missing need_resched() test. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-09Merge tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull xfs update from Ben Myers: "This includes several bugfixes, part of the work for project quotas and group quotas to be used together, performance improvements for inode creation/deletion, buffer readahead, and bulkstat, implementation of the inode change count, an inode create transaction, and the removal of a bunch of dead code. There are also some duplicate commits that you already have from the 3.10-rc series. - part of the work to allow project quotas and group quotas to be used together - inode change count - inode create transaction - block queue plugging in buffer readahead and bulkstat - ordered log vector support - removal of dead code in and around xfs_sync_inode_grab, xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES, XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and xfs_growfs_data_private - don't keep silent if sunit/swidth can not be changed via mount - fix a leak of remote symlink blocks into the filesystem when xattrs are used on symlinks - fix for fiemap to return FIEMAP_EXTENT_UNKOWN flag on delay extents - part of a fix for xfs_fsr - disable speculative preallocation with small files - performance improvements for inode creates and deletes" * tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits) xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD xfs: Change xfs_dquot_acct to be a 2-dimensional array xfs: Code cleanup and removal of some typedef usage xfs: Replace macro XFS_DQ_TO_QIP with a function xfs: Replace macro XFS_DQUOT_TREE with a function xfs: Define a new function xfs_is_quota_inode() xfs: implement inode change count xfs: Use inode create transaction xfs: Inode create item recovery xfs: Inode create transaction reservations xfs: Inode create log items xfs: Introduce an ordered buffer item xfs: Introduce ordered log vector support xfs: xfs_ifree doesn't need to modify the inode buffer xfs: don't do IO when creating an new inode xfs: don't use speculative prealloc for small files xfs: plug directory buffer readahead xfs: add pluging for bulkstat readahead xfs: Remove dead function prototype xfs_sync_inode_grab() xfs: Remove the left function variable from xfs_ialloc_get_rec() ...
2013-07-09Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation Please note that Labeled NFS does require some additional support from the security subsystem. The relevant changesets have all been reviewed and acked by James Morris." * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits) NFS: Set NFS_CS_MIGRATION for NFSv4 mounts NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs nfs: have NFSv3 try server-specified auth flavors in turn nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it nfs: move server_authlist into nfs_try_mount_request nfs: refactor "need_mount" code out of nfs_try_mount SUNRPC: PipeFS MOUNT notification optimization for dying clients SUNRPC: split client creation routine into setup and registration SUNRPC: fix races on PipeFS UMOUNT notifications SUNRPC: fix races on PipeFS MOUNT notifications NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize NFS: Improve legacy idmapping fallback NFSv4.1 end back channel session draining NFS: Apply v4.1 capabilities to v4.2 NFSv4.1: Clean up layout segment comparison helper names NFSv4.1: layout segment comparison helpers should take 'const' parameters NFSv4: Move the DNS resolver into the NFSv4 module rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ...
2013-07-09Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 fix and quota cleanup from Jan Kara: "A fix of ext3 error reporting from fsync and a quota cleanup" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Convert use of typedef ctl_table to struct ctl_table ext3: Fix fsync error handling after filesystem abort.
2013-07-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull third set of VFS updates from Al Viro: "Misc stuff all over the place. There will be one more pile in a couple of days" This is an "evil merge" that also uses the new d_count helper in fs/configfs/dir.c, missed by commit 84d08fa888e7 ("helper for reading ->d_count") * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ncpfs: fix error return code in ncp_parse_options() locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock seq_file: add seq_list_*_percpu helpers f2fs: fix readdir incorrectness mode_t whack-a-mole... lustre: kill the pointless wrapper helper for reading ->d_count
2013-07-09xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJEric Sandeen
XFS_BROOT_SIZE_ADJ is an undocumented macro which accounts for the difference in size between the on-disk and in-core btree root. It's much clearer to just use the newly-added XFS_BMAP_BMDR_SPACE macro which gives us the on-disk size directly. In one case, we must test that the if_broot exists before applying the macro, however. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09fatfs: add FAT_IOCTL_GET_VOLUME_IDMike Lockwood
This patch, originally from Android kernel, adds vfat ioctl command FAT_IOCTL_GET_VOLUME_ID, with this command we can get the vfat volume ID using following code: ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &volume_ID) This patch is a modified version of the patch by Mike Lockwood, with changes from Dmitry Pervushin, who noticed the original patch makes some volume IDs abiguous with error returns: for example, if volume id is 0xFFFFFDAD, that matches -ENOIOCTLCMD, we get "FFFFFFFF" from the user space. So add a parameter to ioctl to get the correct volume ID. Android uses vfat volume ID to identify different sd card, when a new sd card is inserted to device, android can scan the media on it and pop up new contents. Signed-off-by: Bintian Wang <bintian.wang@linaro.org> Cc: dmitry pervushin <dpervushin@gmail.com> Cc: Mike Lockwood <lockwood@android.com> Cc: Colin Cross <ccross@android.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: John Stultz <john.stultz@linaro.org> Cc: Sean McNeil <sean@mcneil.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09ncpfs: fix error return code in ncp_parse_options()Wei Yongjun
Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09mm/writeback: don't check force_wait to handle bdi->work_listWanpeng Li
After commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue"), bdi_writeback_workfn runs off bdi_writeback->dwork, on each execution, it processes bdi->work_list and reschedules if there are more things to do instead of flush any work that race with us existing. It is unecessary to check force_wait in wb_do_writeback since it is always 0 after the mentioned commit. This patch remove the force_wait in wb_do_writeback. Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09fs/fs-writeback.c: : make wb_do_writeback() as staticHaicheng Li
It's not used globally and could be static. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09fsnotify: update comments concerning locking schemeLino Sanfilippo
There have been changes in the locking scheme of fsnotify but the comments in the source code have not been updated yet. This patch corrects this. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09inotify: fix race when adding a new watchLino Sanfilippo
In inotify_new_watch() the number of watches for a group is compared against the max number of allowed watches and increased afterwards. The check and incrementation is not done atomically, so it is possible for multiple concurrent threads to pass the check and increment the number of marks above the allowed max. This patch uses an inotify groups mark_lock to ensure that both check and incrementation are done atomic. Furthermore we dont have to worry about the race that allows a concurrent thread to add a watch just after inotify_update_existing_watch() returned with -ENOENT anymore, since this is also synchronized by the groups mark mutex now. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09dnotify: replace dnotify_mark_mutex with mark mutex of dnotify_groupLino Sanfilippo
There is no need to use a special mutex to protect against the fcntl/close race (see dnotify.c for a description of this race). Instead the dnotify_groups mark mutex can be used. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09fanotify: put duplicate code for adding vfsmount/inode marks into an own ↵Lino Sanfilippo
function The code under the groups mark_mutex in fanotify_add_inode_mark() and fanotify_add_vfsmount_mark() is almost identical. So put it into a seperate function. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09fanotify: fix races when adding/removing marksLino Sanfilippo
For both adding an event to an existing mark and destroying a mark we first have to find it via fsnotify_find_[inode|vfsmount]_mark(). But getting the mark and adding an event (or destroying it) is not done atomically. This opens a race where a thread is about to destroy a mark while another thread still finds the same mark and adds an event to its mask although it will be destroyed. Another race exists concerning the excess of a groups number of marks limit: When a mark is added the number of group marks is checked against the max number of marks per group and increased afterwards. Since check and increment is also not done atomically, this may result in 2 or more processes passing the check at the same time and increasing the number of group marks above the allowed limit. With this patch both races are avoided by doing the concerning operations with the groups mark mutex locked. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09fanotify: info leak in copy_event_to_user()Dan Carpenter
The ->reserved field isn't cleared so we leak one byte of stack information to userspace. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09net: rename low latency sockets functions to busy pollEliezer Tamir
Rename functions in include/net/ll_poll.h to busy wait. Clarify documentation about expected power use increase. Rename POLL_LL to POLL_BUSY_LOOP. Add need_resched() testing to poll/select busy loops. Note, that in select and poll can_busy_poll is dynamic and is updated continuously to reflect the existence of supported sockets with valid queue information. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-08nfsd4: support minorversion 1 by defaultJ. Bruce Fields
We now have minimal minorversion 1 support; turn it on by default. This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions". Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08nfsd4: allow destroy_session over destroyed sessionJ. Bruce Fields
RFC 5661 allows a client to destroy a session using a compound associated with the destroyed session, as long as the DESTROY_SESSION op is the last op of the compound. We attempt to allow this, but testing against a Solaris client (which does destroy sessions in this way) showed that we were failing the DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference count on the session (held by us) represented another rpc in progress over this session. Fix this by noting that in this case the expected reference count is 1, not 0. Also, note as long as the session holds a reference to the compound we're destroying, we can't free it here--instead, delay the free till the final put in nfs4svc_encode_compoundres. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08ncpfs: fix error return code in ncp_parse_options()Wei Yongjun
Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08locks: move file_lock_list to a set of percpu hlist_heads and convert ↵Jeff Layton
file_lock_lock to an lglock The file_lock_list is only used for /proc/locks. The vastly common case is for locks to be put onto the list and come off again, without ever being traversed. Help optimize for this use-case by moving to percpu hlist_head-s. At the same time, we can make the locking less contentious by moving to an lglock. When iterating over the lists for /proc/locks, we must take the global lock and then iterate over each CPU's list in turn. This change necessitates a new fl_link_cpu field to keep track of which CPU the entry is on. On x86_64 at least, this field is placed within an existing hole in the struct to avoid growing the size. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08seq_file: add seq_list_*_percpu helpersJeff Layton
When we convert the file_lock_list to a set of percpu lists, we'll need a way to iterate over them in order to output /proc/locks info. Add some seq_list_*_percpu helpers to handle that. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08f2fs: fix readdir incorrectnessJaegeuk Kim
In the previous Al Viro's readdir patch set, there occurs a bug when running xfstest: 006 as follows. [Error output] alpha size = 4, name length = 6, total files = 4096, nproc=1 1023 files created rm: cannot remove `/mnt/f2fs/permname.15150/a': Directory not empty [Correct output] alpha size = 4, name length = 6, total files = 4096, nproc=1 4097 files created This bug is due to the misupdate of directory position in ctx. So, this patch fixes this. [AV: fixed a braino] CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>