summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2016-03-21btrfs: make sure we stay inside the bvec during __btrfs_lookup_bio_sumsChris Mason
Commit c40a3d38aff4e1c (Btrfs: Compute and look up csums based on sectorsized blocks) changes around how we walk the bios while looking up crcs. There's an inner loop that is jumping to the next bvec based on sectors and before it derefs the next bvec, it needs to make sure we're still in the bio. In this case, the outer loop would have decided to stop moving forward too, and the bvec deref is never actually used for anything. But CONFIG_DEBUG_PAGEALLOC catches it because we're outside our bio. Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: David Sterba <dsterba@suse.com>
2016-03-21Merge branch 'mm-pkeys-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 protection key support from Ingo Molnar: "This tree adds support for a new memory protection hardware feature that is available in upcoming Intel CPUs: 'protection keys' (pkeys). There's a background article at LWN.net: https://lwn.net/Articles/643797/ The gist is that protection keys allow the encoding of user-controllable permission masks in the pte. So instead of having a fixed protection mask in the pte (which needs a system call to change and works on a per page basis), the user can map a (handful of) protection mask variants and can change the masks runtime relatively cheaply, without having to change every single page in the affected virtual memory range. This allows the dynamic switching of the protection bits of large amounts of virtual memory, via user-space instructions. It also allows more precise control of MMU permission bits: for example the executable bit is separate from the read bit (see more about that below). This tree adds the MM infrastructure and low level x86 glue needed for that, plus it adds a high level API to make use of protection keys - if a user-space application calls: mmap(..., PROT_EXEC); or mprotect(ptr, sz, PROT_EXEC); (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice this special case, and will set a special protection key on this memory range. It also sets the appropriate bits in the Protection Keys User Rights (PKRU) register so that the memory becomes unreadable and unwritable. So using protection keys the kernel is able to implement 'true' PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies PROT_READ as well. Unreadable executable mappings have security advantages: they cannot be read via information leaks to figure out ASLR details, nor can they be scanned for ROP gadgets - and they cannot be used by exploits for data purposes either. We know about no user-space code that relies on pure PROT_EXEC mappings today, but binary loaders could start making use of this new feature to map binaries and libraries in a more secure fashion. There is other pending pkeys work that offers more high level system call APIs to manage protection keys - but those are not part of this pull request. Right now there's a Kconfig that controls this feature (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled (like most x86 CPU feature enablement code that has no runtime overhead), but it's not user-configurable at the moment. If there's any serious problem with this then we can make it configurable and/or flip the default" * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) x86/mm/pkeys: Fix mismerge of protection keys CPUID bits mm/pkeys: Fix siginfo ABI breakage caused by new u64 field x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA mm/core, x86/mm/pkeys: Add execute-only protection keys support x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags x86/mm/pkeys: Allow kernel to modify user pkey rights register x86/fpu: Allow setting of XSAVE state x86/mm: Factor out LDT init from context init mm/core, x86/mm/pkeys: Add arch_validate_pkey() mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits() x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU x86/mm/pkeys: Add Kconfig prompt to existing config option x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps x86/mm/pkeys: Dump PKRU with other kernel registers mm/core, x86/mm/pkeys: Differentiate instruction fetches x86/mm/pkeys: Optimize fault handling in access_error() mm/core: Do not enforce PKEY permissions on remote mm access um, pkeys: Add UML arch_*_access_permitted() methods mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys x86/mm/gup: Simplify get_user_pages() PTE bit handling ...
2016-03-20ubifs: Remove unused headerAndreas Gruenbacher
UBIFS does not support POSIX ACLs, so there is no need for including any POSIX ACL hesders. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20ubifs: Add logging functions for ubifs_msg, ubifs_err and ubifs_warnJoe Perches
The existing logging macros are fairly large and converting the macros to functions make the object code smaller. Use %pV and __builtin_return_address(0) as appropriate. $ size fs/ubifs/built-in.o* text data bss dec hex filename 575831 309688 161312 1046831 ff92f fs/ubifs/built-in.o.allyesconfig.new 622457 312872 161120 1096449 10bb01 fs/ubifs/built-in.o.allyesconfig.old 223785 640 644 225069 36f2d fs/ubifs/built-in.o.defconfig.new 251873 640 644 253157 3dce5 fs/ubifs/built-in.o.defconfig.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20writeback, cgroup: fix use of the wrong bdi_writeback which mismatches the inodeTejun Heo
When cgroup writeback is in use, there can be multiple wb's (bdi_writeback's) per bdi and an inode may switch among them dynamically. In a couple places, the wrong wb was used leading to performing operations on the wrong list under the wrong lock corrupting the io lists. * writeback_single_inode() was taking @wb parameter and used it to remove the inode from io lists if it becomes clean after writeback. The callers of this function were always passing in the root wb regardless of the actual wb that the inode was associated with, which could also change while writeback is in progress. Fix it by dropping the @wb parameter and using inode_to_wb_and_lock_list() to determine and lock the associated wb. * After writeback_sb_inodes() writes out an inode, it re-locks @wb and inode to remove it from or move it to the right io list. It assumes that the inode is still associated with @wb; however, the inode may have switched to another wb while writeback was in progress. Fix it by using inode_to_wb_and_lock_list() to determine and lock the associated wb after writeback is complete. As the function requires the original @wb->list_lock locked for the next iteration, in the unlikely case where the inode has changed association, switch the locks. Kudos to Tahsin for pinpointing these subtle breakages. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: d10c80955265 ("writeback: implement foreign cgroup inode bdi_writeback switching") Link: http://lkml.kernel.org/g/CAAeU0aMYeM_39Y2+PaRvyB1nqAPYZSNngJ1eBRmrxn7gKAt2Mg@mail.gmail.com Reported-and-diagnosed-by: Tahsin Erdogan <tahsin@google.com> Tested-by: Tahsin Erdogan <tahsin@google.com> Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-20writeback, cgroup: fix premature wb_put() in locked_inode_to_wb_and_lock_list()Tejun Heo
locked_inode_to_wb_and_lock_list() wb_get()'s the wb associated with the target inode, unlocks inode, locks the wb's list_lock and verifies that the inode is still associated with the wb. To prevent the wb going away between dropping inode lock and acquiring list_lock, the wb is pinned while inode lock is held. The wb reference is put right after acquiring list_lock citing that the wb won't be dereferenced anymore. This isn't true. If the inode is still associated with the wb, the inode has reference and it's safe to return the wb; however, if inode has been switched, the wb still needs to be unlocked which is a dereference and can lead to use-after-free if it it races with wb destruction. Fix it by putting the reference after releasing list_lock. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 87e1d789bf55 ("writeback: implement [locked_]inode_to_wb_and_lock_list()") Cc: stable@vger.kernel.org # v4.2+ Tested-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: - Preparations of parallel lookups (the remaining main obstacle is the need to move security_d_instantiate(); once that becomes safe, the rest will be a matter of rather short series local to fs/*.c - preadv2/pwritev2 series from Christoph - assorted fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits) splice: handle zero nr_pages in splice_to_pipe() vfs: show_vfsstat: do not ignore errors from show_devname method dcache.c: new helper: __d_add() don't bother with __d_instantiate(dentry, NULL) untangle fsnotify_d_instantiate() a bit uninline d_add() replace d_add_unique() with saner primitive quota: use lookup_one_len_unlocked() cifs_get_root(): use lookup_one_len_unlocked() nfs_lookup: don't bother with d_instantiate(dentry, NULL) kill dentry_unhash() ceph_fill_trace(): don't bother with d_instantiate(dn, NULL) autofs4: don't bother with d_instantiate(dentry, NULL) in ->lookup() configfs: move d_rehash() into configfs_create() for regular files ceph: don't bother with d_rehash() in splice_dentry() namei: teach lookup_slow() to skip revalidate namei: massage lookup_slow() to be usable by lookup_one_len_unlocked() lookup_one_len_unlocked(): use lookup_dcache() namei: simplify invalidation logics in lookup_dcache() namei: change calling conventions for lookup_{fast,slow} and follow_managed() ...
2016-03-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - a couple of hotfixes - the rest of MM - a new timer slack control in procfs - a couple of procfs fixes - a few misc things - some printk tweaks - lib/ updates, notably to radix-tree. - add my and Nick Piggin's old userspace radix-tree test harness to tools/testing/radix-tree/. Matthew said it was a godsend during the radix-tree work he did. - a few code-size improvements, switching to __always_inline where gcc screwed up. - partially implement character sets in sscanf * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) sscanf: implement basic character sets lib/bug.c: use common WARN helper param: convert some "on"/"off" users to strtobool lib: add "on"/"off" support to kstrtobool lib: update single-char callers of strtobool() lib: move strtobool() to kstrtobool() include/linux/unaligned: force inlining of byteswap operations include/uapi/linux/byteorder, swab: force inlining of some byteswap operations include/asm-generic/atomic-long.h: force inlining of some atomic_long operations usb: common: convert to use match_string() helper ide: hpt366: convert to use match_string() helper ata: hpt366: convert to use match_string() helper power: ab8500: convert to use match_string() helper power: charger_manager: convert to use match_string() helper drm/edid: convert to use match_string() helper pinctrl: convert to use match_string() helper device property: convert to use match_string() helper lib/string: introduce match_string() helper radix-tree tests: add test for radix_tree_iter_next radix-tree tests: add regression3 test ...
2016-03-18Merge branch 'for-4.6/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "Here are the core block changes for this merge window. Not a lot of exciting stuff going on in this round, most of the changes have been on the driver side of things. That pull request is coming next. This pull request contains: - A set of fixes for chained bio handling from Christoph. - A tag bounds check for blk-mq from Hannes, ensuring that we don't do something stupid if a device reports an invalid tag value. - A set of fixes/updates for the CFQ IO scheduler from Jan Kara. - A set of blk-mq fixes from Keith, adding support for dynamic hardware queues, and fixing init of max_dev_sectors for stacking devices. - A fix for the dynamic hw context from Ming. - Enabling of cgroup writeback support on a block device, from Shaohua" * 'for-4.6/core' of git://git.kernel.dk/linux-block: blk-mq: add bounds check on tag-to-rq conversion block: bio_remaining_done() isn't unlikely block: cleanup bio_endio block: factor out chained bio completion block: don't unecessarily clobber bi_error for chained bios block-dev: enable writeback cgroup support blk-mq: Fix NULL pointer updating nr_requests blk-mq: mark request queue as mq asap block: Initialize max_dev_sectors to 0 blk-mq: dynamic h/w context count cfq-iosched: Allow parent cgroup to preempt its child cfq-iosched: Allow sync noidle workloads to preempt each other cfq-iosched: Reorder checks in cfq_should_preempt() cfq-iosched: Don't group_idle if cfqq has big thinktime
2016-03-18Merge branches 'work.lookups', 'work.misc' and 'work.preadv2' into for-nextAl Viro
2016-03-18splice: handle zero nr_pages in splice_to_pipe()Rabin Vincent
Running the following command: busybox cat /sys/kernel/debug/tracing/trace_pipe > /dev/null with any tracing enabled pretty very quickly leads to various NULL pointer dereferences and VM BUG_ON()s, such as these: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffff8119df6c>] generic_pipe_buf_release+0xc/0x40 Call Trace: [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0 [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10 [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0 [<ffffffff81196869>] do_sendfile+0x199/0x380 [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0 [<ffffffff8192cbee>] entry_SYSCALL_64_fastpath+0x12/0x6d page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0) kernel BUG at include/linux/mm.h:367! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC RIP: [<ffffffff8119df9c>] generic_pipe_buf_release+0x3c/0x40 Call Trace: [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0 [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10 [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0 [<ffffffff81196869>] do_sendfile+0x199/0x380 [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0 [<ffffffff8192cd1e>] tracesys_phase2+0x84/0x89 (busybox's cat uses sendfile(2), unlike the coreutils version) This is because tracing_splice_read_pipe() can call splice_to_pipe() with spd->nr_pages == 0. spd_pages underflows in splice_to_pipe() and we fill the page pointers and the other fields of the pipe_buffers with garbage. All other callers of splice_to_pipe() avoid calling it when nr_pages == 0, and we could make tracing_splice_read_pipe() do that too, but it seems reasonable to have splice_to_page() handle this condition gracefully. Cc: stable@vger.kernel.org Signed-off-by: Rabin Vincent <rabin@rab.in> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-18nfsd: block and scsi layout drivers need to depend on CONFIG_BLOCKChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18nfsd: add SCSI layout supportChristoph Hellwig
This is a simple extension to the block layout driver to use SCSI persistent reservations for access control and fencing, as well as SCSI VPD pages for device identification. For this we need to pass the nfs4_client to the proc_getdeviceinfo method to generate the reservation key, and add a new fence_client method to allow for fence actions in the layout driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18nfsd: move some blocklayout codeChristoph Hellwig
Trivial reorganization, no change in behavior. Move some code around, pull some code out of block layoutcommit that will be useful for the scsi layout. [bfields@redhat.com: split off from "nfsd: add SCSI layout support"] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18nfsd: add a new config option for the block layout driverChristoph Hellwig
Split the config symbols into a generic pNFS one, which is invisible and gets selected by the layout drivers, and one for the block layout driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18nfs/blocklayout: add SCSI layout supportChristoph Hellwig
This is a trivial extension to the block layout driver to support the new SCSI layouts draft. There are three changes: - device identifcation through the SCSI VPD page. This allows us to directly use the udev generated persistent device names instead of requiring an expensive lookup by crawling every block device node in /dev and reading a signature for it. - use of SCSI persistent reservations to protect device access and allow for robust fencing. On the client sides this just means registering and unregistering a server supplied key. - an optimized LAYOUTCOMMIT payload that doesn't send unessecary fields to the server. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18f2fs: submit node page write bios when really requiredJaegeuk Kim
If many threads calls fsync with data writes, we don't need to flush every bios having node page writes. The f2fs_wait_on_page_writeback will flush its bios when the page is really needed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: add missing argument to f2fs_setxattr stubArnd Bergmann
The f2fs_setxattr() prototype for CONFIG_F2FS_FS_XATTR=n has been wrong for a long time, since 8ae8f1627f39 ("f2fs: support xattr security labels"), but there have never been any callers, so it did not matter. Now, the function gets called from f2fs_ioc_keyctl(), which causes a build failure: fs/f2fs/file.c: In function 'f2fs_ioc_keyctl': include/linux/stddef.h:7:14: error: passing argument 6 of 'f2fs_setxattr' makes integer from pointer without a cast [-Werror=int-conversion] #define NULL ((void *)0) ^ fs/f2fs/file.c:1599:27: note: in expansion of macro 'NULL' value, F2FS_KEY_SIZE, NULL, type); ^ In file included from ../fs/f2fs/file.c:29:0: fs/f2fs/xattr.h:129:19: note: expected 'int' but argument is of type 'void *' static inline int f2fs_setxattr(struct inode *inode, int index, ^ fs/f2fs/file.c:1597:9: error: too many arguments to function 'f2fs_setxattr' return f2fs_setxattr(inode, F2FS_XATTR_INDEX_KEY, ^ In file included from ../fs/f2fs/file.c:29:0: fs/f2fs/xattr.h:129:19: note: declared here static inline int f2fs_setxattr(struct inode *inode, int index, Thsi changes the prototype of the empty stub function to match that of the actual implementation. This will not make the key management work when F2FS_FS_XATTR is disabled, but it gets it to build at least. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: fix to avoid unneeded unlock_new_inodeChao Yu
During ->lookup, I_NEW state of inode was been cleared in f2fs_iget, so in error path, we don't need to clear it again. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: clean up opened code with f2fs_update_dentryChao Yu
Just clean up opened code with existing function, no logic change. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: declare static functionsJaegeuk Kim
Just to avoid sparse warnings. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: use cryptoapi crc32 functionsKeith Mok
The crc function is done bit by bit. Optimize this by use cryptoapi crc32 function which is backed by h/w acceleration. Signed-off-by: Keith Mok <ek9852@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs: modify the readahead method in ra_node_page()Fan Li
ra_node_page() is used to read ahead one node page. Comparing to regular read, it's faster because it doesn't wait for IO completion. But if it is called twice for reading the same block, and the IO request from the first call hasn't been completed before the second call, the second call will have to wait until the read is over. Here use the code in __do_page_cache_readahead() to solve this problem. It does nothing when someone else already puts the page in mapping. The status of page should be assured by whoever puts it there. This implement also prevents alteration of page reference count. Signed-off-by: Fan li <fanofcode.li@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18f2fs crypto: sync ext4_lookup and ext4_file_openJaegeuk Kim
This patch tries to catch up with lookup and open policies in ext4. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-18fs crypto: move per-file encryption from f2fs tree to fs/cryptoJaegeuk Kim
This patch adds the renamed functions moved from the f2fs crypto files. 1. definitions for per-file encryption used by ext4 and f2fs. 2. crypto.c for encrypt/decrypt functions a. IO preparation: - fscrypt_get_ctx / fscrypt_release_ctx b. before IOs: - fscrypt_encrypt_page - fscrypt_decrypt_page - fscrypt_zeroout_range c. after IOs: - fscrypt_decrypt_bio_pages - fscrypt_pullback_bio_page - fscrypt_restore_control_page 3. policy.c supporting context management. a. For ioctls: - fscrypt_process_policy - fscrypt_get_policy b. For context permission - fscrypt_has_permitted_context - fscrypt_inherit_context 4. keyinfo.c to handle permissions - fscrypt_get_encryption_info - fscrypt_free_encryption_info 5. fname.c to support filename encryption a. general wrapper functions - fscrypt_fname_disk_to_usr - fscrypt_fname_usr_to_disk - fscrypt_setup_filename - fscrypt_free_filename b. specific filename handling functions - fscrypt_fname_alloc_buffer - fscrypt_fname_free_buffer 6. Makefile and Kconfig Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Ildar Muslukhov <ildarm@google.com> Signed-off-by: Uday Savagaonkar <savagaon@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-17Merge tag 'please-pull-pstore' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore update from Tony Luck: "Allow ram backend to be configured with addresses above 4GB" * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore: Add support for 64 Bit address space
2016-03-17Merge tag 'gfs2-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "We only have six patches ready for this merge window: - Arnd Bergmann contributed a patch that fixes an uninitialized variable warning. - The second patch avoids a kernel panic due to referencing an iopen glock that may not be held, in an error path. - The third patch fixes a rounding error that caused xfs_tests direct IO write "fsx" tests to fail on GFS2. - The fourth patch tidies up the code path when glocks are being reused to recreate a dinode that was recently deleted. - The fifth reverts an ages-old patch that should no longer be needed, and which interfered with the transition of dinodes from unlinked to free. - And lastly, a patch to eliminate a function parameter that's not needed" * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: Eliminate parameter non_block on gfs2_inode_lookup GFS2: Don't filter out I_FREEING inodes anymore GFS2: Prevent delete work from occurring on glocks used for create GFS2: Fix direct IO write rounding error gfs2: avoid uninitialized variable warning GFS2: Check if iopen is held when deleting inode
2016-03-17Merge tag 'dlm-4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "Previous changes introduced the use of socket error reporting for dlm sockets. This set includes two fixes in how the socket error callbacks are used" * tag 'dlm-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: DLM: Save and restore socket callbacks properly DLM: Replace nodeid_to_addr with kernel_getpeername
2016-03-17Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Performance improvements in SEEK_DATA and xattr scalability improvements, plus a lot of clean ups and bug fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (38 commits) ext4: clean up error handling in the MMP support jbd2: do not fail journal because of frozen_buffer allocation failure ext4: use __GFP_NOFAIL in ext4_free_blocks() ext4: fix compile error while opening the macro DOUBLE_CHECK ext4: print ext4 mount option data_err=abort correctly ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() ext4: drop unneeded BUFFER_TRACE in ext4_delete_inline_entry() ext4: fix misspellings in comments. jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path ext4: more efficient SEEK_DATA implementation ext4: cleanup handling of bh->b_state in DAX mmap ext4: return hole from ext4_map_blocks() ext4: factor out determining of hole size ext4: fix setting of referenced bit in ext4_es_lookup_extent() ext4: remove i_ioend_count ext4: simplify io_end handling for AIO DIO ext4: move trans handling and completion deferal out of _ext4_get_block ext4: rename and split get blocks functions ext4: use i_mutex to serialize unaligned AIO DIO ext4: pack ioend structure better ...
2016-03-17Merge tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfsLinus Torvalds
Pull configfs updates from Christoph Hellwig: - A large patch from me to simplify setting up the list of default groups by actually implementing it as a list instead of an array. - a small Y2083 prep patch from Deepa Dinamani. Probably doesn't matter on it's own, but it seems like he is trying to get rid of all CURRENT_TIME uses in file systems, which is a worthwhile goal. * tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfs: configfs: switch ->default groups to a linked list configfs: Replace CURRENT_TIME by current_fs_time()
2016-03-17lib: update single-char callers of strtobool()Kees Cook
Some callers of strtobool() were passing a pointer to unterminated strings. In preparation of adding multi-character processing to kstrtobool(), update the callers to not pass single-character pointers, and switch to using the new kstrtobool_from_user() helper where possible. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Amitkumar Karwar <akarwar@marvell.com> Cc: Nishant Sarmukadam <nishants@marvell.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Steve French <sfrench@samba.org> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17btrfs: use radix_tree_iter_retry()Matthew Wilcox
Even though this is a 'can't happen' situation, use the new radix_tree_iter_retry() pattern to eliminate a goto. [akpm@linux-foundation.org: fix btrfs build] Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: David Sterba <dsterba@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17proc-vmcore: wrong data type casting fixDave Young
On i686 PAE enabled machine the contiguous physical area could be large and it can cause trimming down variables in below calculation in read_vmcore() and mmap_vmcore(): tsz = min_t(size_t, m->offset + m->size - *fpos, buflen); That is, the types being used is like below on i686: m->offset: unsigned long long int m->size: unsigned long long int *fpos: loff_t (long long int) buflen: size_t (unsigned int) So casting (m->offset + m->size - *fpos) by size_t means truncating a given value by 4GB. Suppose (m->offset + m->size - *fpos) being truncated to 0, buflen >0 then we will get tsz = 0. It is of course not an expected result. Similarly we could also get other truncated values less than buflen. Then the real size passed down is not correct any more. If (m->offset + m->size - *fpos) is above 4GB, read_vmcore or mmap_vmcore use the min_t result with truncated values being compared to buflen. Then, fpos proceeds with the wrong value so that we reach below bugs: 1) read_vmcore will refuse to continue so makedumpfile fails. 2) mmap_vmcore will trigger BUG_ON() in remap_pfn_range(). Use unsigned long long in min_t instead so that the variables in are not truncated. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jianyu Zhan <nasa4836@gmail.com> Cc: Minfei Huang <mhuang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17proc/base: make prompt shell start from new line after executing "cat ↵Minfei Huang
/proc/$pid/wchan" It is not elegant that prompt shell does not start from new line after executing "cat /proc/$pid/wchan". Make prompt shell start from new line. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17procfs: add conditional compilation checkEric Engestrom
`proc_timers_operations` is only used when CONFIG_CHECKPOINT_RESTORE is enabled. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17proc: add /proc/<pid>/timerslack_ns interfaceJohn Stultz
This patch provides a proc/PID/timerslack_ns interface which exposes a task's timerslack value in nanoseconds and allows it to be changed. This allows power/performance management software to set timer slack for other threads according to its policy for the thread (such as when the thread is designated foreground vs. background activity) If the value written is non-zero, slack is set to that value. Otherwise sets it to the default for the thread. This interface checks that the calling task has permissions to to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we can ensure arbitrary apps do not change the timer slack for other apps. Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oren Laadan <orenl@cellrox.com> Cc: Ruchi Kandoi <kandoiruchi@google.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Android Kernel Team <kernel-team@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17timer: convert timer_slack_ns from unsigned long to u64John Stultz
This patchset introduces a /proc/<pid>/timerslack_ns interface which would allow controlling processes to be able to set the timerslack value on other processes in order to save power by avoiding wakeups (Something Android currently does via out-of-tree patches). The first patch tries to fix the internal timer_slack_ns usage which was defined as a long, which limits the slack range to ~4 seconds on 32bit systems. It converts it to a u64, which provides the same basically unlimited slack (500 years) on both 32bit and 64bit machines. The second patch introduces the /proc/<pid>/timerslack_ns interface which allows the full 64bit slack range for a task to be read or set on both 32bit and 64bit machines. With these two patches, on a 32bit machine, after setting the slack on bash to 10 seconds: $ time sleep 1 real 0m10.747s user 0m0.001s sys 0m0.005s The first patch is a little ugly, since I had to chase the slack delta arguments through a number of functions converting them to u64s. Let me know if it makes sense to break that up more or not. Other than that things are fairly straightforward. This patch (of 2): The timer_slack_ns value in the task struct is currently a unsigned long. This means that on 32bit applications, the maximum slack is just over 4 seconds. However, on 64bit machines, its much much larger (~500 years). This disparity could make application development a little (as well as the default_slack) to a u64. This means both 32bit and 64bit systems have the same effective internal slack range. Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specify the interface as a unsigned long, so we preserve that limitation on 32bit systems, where SET_TIMERSLACK can only set the slack to a unsigned long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is actually larger then what can be stored by an unsigned long. This patch also modifies hrtimer functions which specified the slack delta as a unsigned long. Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oren Laadan <orenl@cellrox.com> Cc: Ruchi Kandoi <kandoiruchi@google.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Android Kernel Team <kernel-team@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17mm: introduce page reference manipulation functionsJoonsoo Kim
The success of CMA allocation largely depends on the success of migration and key factor of it is page reference count. Until now, page reference is manipulated by direct calling atomic functions so we cannot follow up who and where manipulate it. Then, it is hard to find actual reason of CMA allocation failure. CMA allocation should be guaranteed to succeed so finding offending place is really important. In this patch, call sites where page reference is manipulated are converted to introduced wrapper function. This is preparation step to add tracepoint to each page reference manipulation function. With this facility, we can easily find reason of CMA allocation failure. There is no functional change in this patch. In addition, this patch also converts reference read sites. It will help a second step that renames page._count to something else and prevents later attempt to direct access to it (Suggested by Andrew). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17mm/page_alloc.c: calculate 'available' memory in a separate functionIgor Redko
Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory statistics protocol, corresponding to 'Available' in /proc/meminfo. It indicates to the hypervisor how big the balloon can be inflated without pushing the guest system to swap. This metric would be very useful in VM orchestration software to improve memory management of different VMs under overcommit. This patch (of 2): Factor out calculation of the available memory counter into a separate exportable function, in order to be able to use it in other parts of the kernel. In particular, it appears a relevant metric to report to the hypervisor via virtio-balloon statistics interface (in a followup patch). Signed-off-by: Igor Redko <redkoi@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17/proc/kpageflags: return KPF_SLAB for slab tail pagesNaoya Horiguchi
Currently /proc/kpageflags returns just KPF_COMPOUND_TAIL for slab tail pages, which is inconvenient when grasping how slab pages are distributed (userspace always needs to check which kind of tail pages by itself). This patch sets KPF_SLAB for such pages. With this patch: $ grep Slab /proc/meminfo ; tools/vm/page-types -b slab Slab: 64880 kB flags page-count MB symbolic-flags long-symbolic-flags 0x0000000000000080 16220 63 _______S__________________________________ slab total 16220 63 16220 pages equals to 64880 kB, so returned result is consistent with the global counter. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17/proc/kpageflags: return KPF_BUDDY for "tail" buddy pagesNaoya Horiguchi
Currently /proc/kpageflags returns nothing for "tail" buddy pages, which is inconvenient when grasping how free pages are distributed. This patch sets KPF_BUDDY for such pages. With this patch: $ grep MemFree /proc/meminfo ; tools/vm/page-types -b buddy MemFree: 3134992 kB flags page-count MB symbolic-flags long-symbolic-flags 0x0000000000000400 779272 3044 __________B_______________________________ buddy 0x0000000000000c00 4385 17 __________BM______________________________ buddy,mmap total 783657 3061 783657 pages is 3134628 kB (roughly consistent with the global counter,) so it's OK. [akpm@linux-foundation.org: update comment, per Naoya] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17Merge tag 'char-misc-4.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here is the big char/misc driver update for 4.6-rc1. The majority of the patches here is hwtracing and some new mic drivers, but there's a lot of other driver updates as well. Full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits) goldfish: Fix build error of missing ioremap on UM nvmem: mediatek: Fix later provider initialization nvmem: imx-ocotp: Fix return value of imx_ocotp_read nvmem: Fix dependencies for !HAS_IOMEM archs char: genrtc: replace blacklist with whitelist drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular drivers: char: mem: fix IS_ERROR_VALUE usage char: xillybus: Fix internal data structure initialization pch_phub: return -ENODATA if ROM can't be mapped Drivers: hv: vmbus: Support kexec on ws2012 r2 and above Drivers: hv: vmbus: Support handling messages on multiple CPUs Drivers: hv: utils: Remove util transport handler from list if registration fails Drivers: hv: util: Pass the channel information during the init call Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload() Drivers: hv: vmbus: remove code duplication in message handling Drivers: hv: vmbus: avoid wait_for_completion() on crash Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages misc: at24: replace memory_accessor with nvmem_device_read eeprom: 93xx46: extend driver to plug into the NVMEM framework eeprom: at25: extend driver to plug into the NVMEM framework ...
2016-03-17Merge tag 'driver-core-4.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Just a few patches this time around for the 4.6-rc1 merge window. Largest is a new firmware driver, but there are some other updates to the driver core in here as well, the shortlog has the details. All have been in linux-next for a while with no reported issues" * tag 'driver-core-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "driver-core: platform: probe of-devices only using list of compatibles" firmware: qemu config needs I/O ports firmware: qemu_fw_cfg.c: fix typo FW_CFG_DATA_OFF driver-core: platform: probe of-devices only using list of compatibles driver-core: platform: fix typo in documentation for multi-driver helper component: remove impossible condition drivers: dma-coherent: simplify dma_init_coherent_memory return value devicetree: update documentation for fw_cfg ARM bindings firmware: create directory hierarchy for sysfs fw_cfg entries firmware: introduce sysfs driver for QEMU's fw_cfg device kobject: export kset_find_obj() for module use driver core: bus: use to_subsys_private and to_device_private_bus driver core: bus: use list_for_each_entry* debugfs: Add stub function for debugfs_create_automount(). kernfs: make kernfs_walk_ns() use kernfs_pr_cont_buf[]
2016-03-17nfsd: recover: fix memory leakSudip Mukherjee
nfsd4_cltrack_grace_start() will allocate the memory for grace_start but when we returned due to error we missed freeing it. Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-17orangefs: put register_chrdev immediately before register_filesystemMartin Brandenburg
Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-17orangefs: remove paranoia in orangefs_set_inodeMartin Brandenburg
Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-17orangefs: sanitize listxattr and return EIO on impossible valuesMartin Brandenburg
Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-17orangefs: remove unused reference to xattr key lengthMartin Brandenburg
Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-17Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: "There are a bunch of fixes to the TPM, IMA, and Keys code, with minor fixes scattered across the subsystem. IMA now requires signed policy, and that policy is also now measured and appraised" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (67 commits) X.509: Make algo identifiers text instead of enum akcipher: Move the RSA DER encoding check to the crypto layer crypto: Add hash param to pkcs1pad sign-file: fix build with CMS support disabled MAINTAINERS: update tpmdd urls MODSIGN: linux/string.h should be #included to get memcpy() certs: Fix misaligned data in extra certificate list X.509: Handle midnight alternative notation in GeneralizedTime X.509: Support leap seconds Handle ISO 8601 leap seconds and encodings of midnight in mktime64() X.509: Fix leap year handling again PKCS#7: fix unitialized boolean 'want' firmware: change kernel read fail to dev_dbg() KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert KEYS: Reserve an extra certificate symbol for inserting without recompiling modsign: hide openssl output in silent builds tpm_tis: fix build warning with tpm_tis_resume ima: require signed IMA policy ima: measure and appraise the IMA policy itself ima: load policy using path ...
2016-03-17Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.6: API: - Convert remaining crypto_hash users to shash or ahash, also convert blkcipher/ablkcipher users to skcipher. - Remove crypto_hash interface. - Remove crypto_pcomp interface. - Add crypto engine for async cipher drivers. - Add akcipher documentation. - Add skcipher documentation. Algorithms: - Rename crypto/crc32 to avoid name clash with lib/crc32. - Fix bug in keywrap where we zero the wrong pointer. Drivers: - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver. - Add PIC32 hwrng driver. - Support BCM6368 in bcm63xx hwrng driver. - Pack structs for 32-bit compat users in qat. - Use crypto engine in omap-aes. - Add support for sama5d2x SoCs in atmel-sha. - Make atmel-sha available again. - Make sahara hashing available again. - Make ccp hashing available again. - Make sha1-mb available again. - Add support for multiple devices in ccp. - Improve DMA performance in caam. - Add hashing support to rockchip" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: qat - remove redundant arbiter configuration crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() crypto: atmel - fix checks of error code returned by devm_ioremap_resource() crypto: qat - Change the definition of icp_qat_uof_regtype hwrng: exynos - use __maybe_unused to hide pm functions crypto: ccp - Add abstraction for device-specific calls crypto: ccp - CCP versioning support crypto: ccp - Support for multiple CCPs crypto: ccp - Remove check for x86 family and model crypto: ccp - memset request context to zero during import lib/mpi: use "static inline" instead of "extern inline" lib/mpi: avoid assembler warning hwrng: bcm63xx - fix non device tree compatibility crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode. crypto: qat - The AE id should be less than the maximal AE number lib/mpi: Endianness fix crypto: rockchip - add hash support for crypto engine in rk3288 crypto: xts - fix compile errors crypto: doc - add skcipher API documentation crypto: doc - update AEAD AD handling ...