summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-02-28kexec: prevent double free on image allocation failureSasha Levin
If kimage_normal_alloc() fails to initialize an allocated kimage, it will free the image but would still set 'rimage', as a result kexec_load will try to free it again. This would explode as part of the freeing process is accessing internal members which point to uninitialized memory. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: export PG_hwpoison flag into vmcoreinfoMitsuhiro Tanino
This patch exports a PG_hwpoison into vmcoreinfo when CONFIG_MEMORY_FAILURE is defined. "makedumpfile" needs to read information of memory, such as 'mem_section', 'zone', 'pageflags' from vmcore. We introduce a function into "makedumpfile" to exclude hwpoison page from vmcore dump. In order to introduce this function, PG_hwpoison flag have to export into vmcoreinfo. Signed-off-by: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: get rid of duplicate check for hole_endZhang Yanfei
hole_end has been checked to make sure it is <= crash_res.end in the while condition check, so the if condition check is duplicate. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: add the values related to buddy system for filtering free pages.Atsushi Kumagai
tAdd adds the values related to buddy system to vmcoreinfo data so that makedumpfile (dump filtering command) can filter out all free pages with the new logic. It's faster than the current logic because it can distinguish free page by analyzing page structure at the same time as filtering for other unnecessary pages (e.g. anonymous page). OTOH, the current logic has to trace free_list to distinguish free pages while analyzing page structure to filter out other unnecessary pages. The new logic uses the fact that buddy page is marked by _mapcount == PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab is set or not before looking up _mapcount value. And we can get the order of buddy system from private field. To sum it up, the values below are required for this logic. Required values: - OFFSET(page._mapcount) - OFFSET(page.private) - NUMBER(PG_slab) - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) Changelog from v1 to v2: 1. remove SIZE(pageflags) The new logic was changed after I sent v1 patch. Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile. What's makedumpfile: makedumpfile creates a small dumpfile by excluding unnecessary pages for the analysis. To distinguish unnecessary pages, makedumpfile gets the vmcoreinfo data which has the minimum debugging information only for dump filtering. Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fork: unshare: remove dead codeAlan Cox
If new_nsproxy is set we will always call switch_task_namespaces and then set new_nsproxy back to NULL so the reassignment and fall through check are redundant Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fs/seq_file.c:seq_lseek(): fix switch statement indentingAndrew Morton
[akpm@linux-foundation.org: checkpatch fixes] Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28seq-file: use SEEK_ macros instead of hardcoded numbersCyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28coredump: use a freezable_schedule for the coredump_finish waitMandeep Singh Baines
Prevents hung_task detector from panicing the machine. This is also needed to prevent this wait from blocking suspend. (It doesnt' currently block suspend but it would once the next patch in this series is applied.) [yongjun_wei@trendmicro.com.cn: kernel/exit.c: remove duplicated include] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28lockdep: check that no locks held at freeze timeMandeep Singh Baines
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a deadlock if the lock is later acquired in the suspend or hibernate path (e.g. by dpm). Holding a lock can also cause a deadlock in the case of cgroup_freezer if a lock is held inside a frozen cgroup that is later acquired by a process outside that group. [akpm@linux-foundation.org: export debug_check_no_locks_held] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fs/proc/vmcore.c: put if tests in the top of the while loop to reduce ↵Zhang Yanfei
duplication In read_vmcore() two `if' tests are duplicated. Change the position of them could reduce the duplication. This change does not affect the behaviour of the function. [akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()] [akpm@linux-foundation.org: s/max_t/min_t/] Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fs/proc: clean up printksAndrew Morton
- use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28coredump: remove redundant defines for dumpable statesKees Cook
The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_* defines introduced in 54b501992dd2 ("coredump: warn about unsafe suid_dumpable / core_pattern combo"). Remove the new ones, and use the prior values instead. Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Chen Gang <gang.chen@asianux.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kernel/signal.c: fix suboptimal printk usageValdis Kletnieks
Several printk's were missing KERN_INFO and KERN_CONT flags. In addition, a printk that was outside a #if/#endif should have been inside, which would result in stray blank line on non-x86 boxes. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28signal: allow to send any siginfo to itselfAndrey Vagin
The idea is simple. We need to get the siginfo for each signal on checkpointing dump, and then return it back on restore. The first problem is that the kernel doesn't report complete siginfos to userspace. In a signal handler the kernel strips SI_CODE from siginfo. When a siginfo is received from signalfd, it has a different format with fixed sizes of fields. The interface of signalfd was extended. If a signalfd is created with the flag SFD_RAW, it returns siginfo in a raw format. rt_sigqueueinfo looks suitable for restoring signals, but it can't send siginfo with a positive si_code, because these codes are reserved for the kernel. In the real world each person has right to do anything with himself, so I think a process should able to send any siginfo to itself. This patch: The kernel prevents sending of siginfo with positive si_code, because these codes are reserved for kernel. I think we can allow a task to send such a siginfo to itself. This operation should not be dangerous. This functionality is required for restoring signals in checkpoint/restart. Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28Documentation/cgroups/blkio-controller.txt: fix typoWarren Turkal
Signed-off-by: Warren Turkal <wt@ooyala.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28Documentation/DMA-API-HOWTO.txt: minor grammar correctionsShuah Khan
Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fat: mark fs as dirty on mount and clean on umountOleksij Rempel
There is no documented methods to mark FAT as dirty. Unofficially MS started to use reserved Byte in boot sector for this purpose, at least since Win 2000. With Win 7 user is warned if fs is dirty and asked to clean it. Different versions of Win, handle it in different ways, but always have same meaning: - Win 2000 and XP, set it on write operations and remove it after operation was finnished - Win 7, set dirty flag on first write and remove it on umount. We will do it as follows: - set dirty flag on mount. If fs was initially dirty, warn user, remember it and do not do any changes to boot sector. - clean it on umount. If fs was initially dirty, leave it dirty. - do not do any thing if fs mounted read-only. - TODO: leave fs dirty if we found some error after mount. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28fat: add extended fileds to struct fat_boot_sectorOleksij Rempel
Later we will need "state" field to check if volume was cleanly unmounted. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: fix issue with unzeroed unused b-tree nodesVyacheslav Dubeyko
The fsck_hfs (under MacOS X) complains about unzeroed unused b-tree nodes after deletion of folders' tree under Linux. SYMPTOMS: Running Disk Utiltiy's "Verify Disk" on "test" gives the following: Verifying volume “Test” Checking file systemChecking Journaled HFS Plus volume. Checking extents overflow file. Checking catalog file. Unused node is not erased (node = 3111) Checking multi-linked files. Checking catalog hierarchy. Checking extended attributes file. Checking volume bitmap. Checking volume information. The volume Test was found corrupt and needs to be repaired. Error: This disk needs to be repaired. Click Repair Disk. REPRODUCING PATH: 1. Prepare HFS+ (non-case sensitive) partition (for example, 5GB) under MacOS X. 2. Copy linux kernel source tree (for example, 3.7-rc6 version) on this partition under MacOS X. 3. Then switch to Linux and mount this prepared partition. 4. Execute `sudo rm -r` under prepared directory with linux kernel source tree. 5. Unmount and boot back into OS X. 6. Open up Disk Utility and verify partition. REPRODUCIBILITY: 100% FIX: It is added code of node clearing in hfs_bnode_put() method for the case when node has flag HFS_BNODE_DELETED. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Kyle Laracey <kalaracey@gmail.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: add support of manipulation by attributes fileVyacheslav Dubeyko
Add support of manipulation by attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: rework functionality of getting, setting and deleting of extended ↵Vyacheslav Dubeyko
attributes Rework functionality of getting, setting and deleting of extended attributes. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: add functionality of manipulating by records in attributes treeVyacheslav Dubeyko
Add functionality of manipulating by records in attributes tree. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: add on-disk layout declarations related to attributes treeVyacheslav Dubeyko
Add all necessary on-disk layout declarations related to attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28hfsplus: add osx.* prefix for handling namespace of Mac OS X extended attributesVyacheslav Dubeyko
hfsplus: reworked support of extended attributes. Current mainline implementation of hfsplus file system driver treats as extended attributes only two fields (fdType and fdCreator) of user_info field in file description record (struct hfsplus_cat_file). It is possible to get or set only these two fields as extended attributes. But HFS+ treats as com.apple.FinderInfo extended attribute an union of user_info and finder_info fields as for file (struct hfsplus_cat_file) as for folder (struct hfsplus_cat_folder). Moreover, current mainline implementation of hfsplus file system driver doesn't support special metadata file - attributes tree. Mac OS X 10.4 and later support extended attributes by making use of the HFS+ filesystem Attributes file B*-tree feature which allows for named forks. Mac OS X supports only inline extended attributes, limiting their size to 3802 bytes. Any regular file may have a list of extended attributes. HFS+ supports an arbitrary number of named forks. Each attribute is denoted by a name and the associated data. The name is a null-terminated Unicode string. It is possible to list, to get, to set, and to remove extended attributes from files or directories. It exists some peculiarity during getting of extended attributes list by means of getfattr utility. The getfattr utility expects prefix "user." before any extended attribute's name. So, it ignores any names that don't contained such prefix. Such behavior of getfattr utility results in unexpected empty output of extended attributes list even in the case when file (or folder) contains extended attributes. It needs to use empty string as regular expression pattern for names matching (getfattr --match=""). For support of extended attributes in HFS+: 1. It was added necessary on-disk layout declarations related to Attributes tree into hfsplus_raw.h file. 2. It was added attributes.c file with implementation of functionality of manipulation by records in Attributes tree. 3. It was reworked hfsplus_listxattr, hfsplus_getxattr, hfsplus_setxattr functions in ioctl.c. Moreover, it was added hfsplus_removexattr method. This patch: Add osx.* prefix for handling namespace of Mac OS X extended attributes. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28lib/scatterlist: use page iterator in the mapping iteratorImre Deak
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: James Hogan <james.hogan@imgtec.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28lib/scatterlist: add simple page iteratorImre Deak
Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28MAINTAINERS: update Tegra section to capture all Tegra filesStephen Warren
The intent is to ensure that all Tegra-related patches are sent to the linux-tegra@ mailing list, so people can keep up-to-date on all misc driver changes. Doing this with a keyword is far simpler and more compact than listing all Tegra-related drivers, even if wildcards were used. Words such as integrate or integrator are common. Ensure the character right before "tegra" isn't a-z (case-insensitive), to make sure the keyword doesn't match those. The only files that the keyword doesn't match are the NVEC driver. Add the linux-tegra mailing list to the NVEC entry to solve this. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Joe Perches <joe@perches.com> Cc: Julian Andres Klode <jak@jak-linux.org> Cc: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28get_maintainer: allow keywords to match filenamesStephen Warren
Allow K: entries in MAINTAINERS to match directly against filenames; either those extracted from patch +++ or --- lines, or those specified on the command-line using the -f option. This potentially allows fewer lines in a MAINTAINERS entry, if all the relevant files are scattered throughout the whole kernel tree, yet contain some common keyword. An example would be using an ARM SoC name as the keyword to catch all related drivers. I don't think setting exact_pattern_match_hash would be appropriate here; at least for intended Tegra use case, this feature is to ensure that all Tegra-related driver changes get Cc'd to the Tegra mailing list. Setting exact_pattern_match_hash would prevent git history parsing for e.g. S-o-b tags, which still seems like it would be useful. Hence, this flag isn't set. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28usermodehelper: cleanup/fix __orderly_poweroff() && argv_free()Oleg Nesterov
__orderly_poweroff() does argv_free() if call_usermodehelper_fns() returns -ENOMEM. As Lucas pointed out, this can be wrong if -ENOMEM was not triggered by the failing call_usermodehelper_setup(), in this case both __orderly_poweroff() and argv_cleanup() can do kfree(). Kill argv_cleanup() and change __orderly_poweroff() to call argv_free() unconditionally like do_coredump() does. This info->cleanup() is not needed (and wrong) since 6c0c0d4d "fix bug in orderly_poweroff() which did the UMH_NO_WAIT => UMH_WAIT_EXEC change, we can rely on the fact that CLONE_VFORK can't return until do_execve() succeeds/fails. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: hongfeng <hongfeng@marvell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28mm: use vm_unmapped_area() on frv architectureMichel Lespinasse
Update the frv arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28ocfs2: ac->ac_allow_chain_relink=0 won't disable group relinkXiaowei.Hu
ocfs2_block_group_alloc_discontig() disables chain relink by setting ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple cluster groups. It doesn't keep the credits for all chain relink,but ocfs2_claim_suballoc_bits overrides this in this call trace: ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()-> __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits() ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call ocfs2_search_chain() one time and disable it again, and then we run out of credits. Fix is to allow relink by default and disable it in ocfs2_block_group_alloc_discontig. Without this patch, End-users will run into a crash due to run out of credits, backtrace like this: RIP: 0010:[<ffffffffa0808b14>] [<ffffffffa0808b14>] jbd2_journal_dirty_metadata+0x164/0x170 [jbd2] RSP: 0018:ffff8801b919b5b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0 RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8 RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0 R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800 FS: 00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0) Call Trace: ocfs2_journal_dirty+0x2f/0x70 [ocfs2] ocfs2_relink_block_group+0x111/0x480 [ocfs2] ocfs2_search_chain+0x455/0x9a0 [ocfs2] ... Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctlyJeff Liu
We need to re-initialize the security for a new reflinked inode with its parent dirs if it isn't specified to be preserved for ocfs2_reflink(). However, the code logic is broken at ocfs2_init_security_and_acl() although ocfs2_init_security_get() succeed. As a result, ocfs2_acl_init() does not involked and therefore the default ACL of parent dir was missing on the new inode. Note this was introduced by 9d8f13ba3 ("security: new security_inode_init_security API adds function callback") To reproduce: set default ACL for the parent dir(ocfs2 in this case): $ setfacl -m default:user:jeff:rwx ../ocfs2/ $ getfacl ../ocfs2/ # file: ../ocfs2/ # owner: jeff # group: jeff user::rwx group::r-x other::r-x default:user::rwx default:user:jeff:rwx default:group::r-x default:mask::rwx default:other::r-x $ touch a $ getfacl a # file: a # owner: jeff # group: jeff user::rw- group::rw- other::r-- Before patching, create reflink file b from a, the user default ACL entry(user:jeff:rwx)was missing: $ ./ocfs2_reflink a b $ getfacl b # file: b # owner: jeff # group: jeff user::rw- group::rw- other::r-- In this case, the end user can also observed an error message at syslog: (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0 After applying this patch, create reflink file c from a: $ ./ocfs2_reflink a c $ getfacl c # file: c # owner: jeff # group: jeff user::rw- user:jeff:rwx #effective:rw- group::r-x #effective:r-- mask::rw- other::r-- Test program: /* Usage: reflink <source> <dest> */ #include <stdio.h> #include <stdint.h> #include <stdbool.h> #include <string.h> #include <errno.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/ioctl.h> static int reflink_file(char const *src_name, char const *dst_name, bool preserve_attrs) { int fd; #ifndef REFLINK_ATTR_NONE # define REFLINK_ATTR_NONE 0 #endif #ifndef REFLINK_ATTR_PRESERVE # define REFLINK_ATTR_PRESERVE 1 #endif #ifndef OCFS2_IOC_REFLINK struct reflink_arguments { uint64_t old_path; uint64_t new_path; uint64_t preserve; }; # define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments) #endif struct reflink_arguments args = { .old_path = (unsigned long) src_name, .new_path = (unsigned long) dst_name, .preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE : REFLINK_ATTR_NONE, }; fd = open(src_name, O_RDONLY); if (fd < 0) { fprintf(stderr, "Failed to open %s: %s\n", src_name, strerror(errno)); return -1; } if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) { fprintf(stderr, "Failed to reflink %s to %s: %s\n", src_name, dst_name, strerror(errno)); return -1; } } int main(int argc, char *argv[]) { if (argc != 3) { fprintf(stdout, "Usage: %s source dest\n", argv[0]); return 1; } return reflink_file(argv[1], argv[2], 0); } Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Tao Ma <boyu.mt@taobao.com> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28scripts/kernel-doc: handle struct member __aligned without numbersNishanth Menon
Commit ef5da59f1260 ("scripts/kernel-doc: handle struct member __aligned") permits "char something [123] __aligned(8);". However, by using \d we constraint ourselves with integers. This is not always the case. In fact, it might be better to do char something[123] __aligned(sizeof(u16)); For example, With wireless_dev defining: u8 address[ETH_ALEN] __aligned(sizeof(u16)); With \d, scripts/kernel-doc erroneously says: Warning(include/net/cfg80211.h:2618): Excess struct/union/enum/typedef member 'address' description in 'wireless_dev' This is because the regex __aligned\s*\(\d+\) fails match at \d as sizeof is used. So replace \d with . to indicate "something" in kernel-doc to ignore __aligned(SOMETHING) in structs. With this change, we can use integers OR sizeof() or macros as we please. Signed-off-by: Nishanth Menon <nm@ti.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28mm: accelerate munlock() treatment of THP pagesMichel Lespinasse
munlock_vma_pages_range() was always incrementing addresses by PAGE_SIZE at a time. When munlocking THP pages (or the huge zero page), this resulted in taking the mm->page_table_lock 512 times in a row. We can do better by making use of the page_mask returned by follow_page_mask (for the huge zero page case), or the size of the page munlock_vma_page() operated on (for the true THP page case). Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28backlight: add new lp8788 backlight driverKim, Milo
TI LP8788 PMU supports regulators, battery charger, RTC, ADC, backlight dri= ver and current sinks. This patch enables LP8788 backlight module. (Brightness mode) The brightness is controlled by PWM input or I2C register. All modes are supported in the driver. (Platform data) Configurable data can be defined in the platform side. name : backlight driver name. (default: "lcd-backlight") initial_brightness : initial value of backlight brightness bl_mode : brightness control by PWM or lp8788 register dim_mode : dimming mode selection full_scale : full scale current setting rise_time : brightness ramp up step time fall_time : brightness ramp down step time pwm_pol : PWM polarity setting when bl_mode is PWM based period_ns : platform specific PWM period value. unit is nano. The default values are set in case no platform data is defined. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: "devendra.aaru" <devendra.aaru@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28lib/devres.c: fix misplaced #endifJingoo Han
A misplaced #endif causes link errors related to pcim_*() functions. This is because pcim_*() functions are related to CONFIG_PCI option, however these are not related to CONFIG_HAS_IOPORT option. Therefore, when CONFIG_PCI is enabled and CONFIG_HAS_IOPORT is not enabled, it makes link errors related to pcim_*() functions as below: drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions' drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table' drivers/built-in.o: In function `ata_pci_sff_init_host': drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions' drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Greg KH <greg@kroah.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28mm: use vm_unmapped_area() on parisc architectureMichel Lespinasse
Update the parisc arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. [akpm@linux-foundation.org: remove now-unused DCACHE_ALIGN(), per James] Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28drivers/video/backlight/ams369fg06.c: make power_on() call optionalJingoo Han
This patch makes power_on() call optional. The voltage source can be provided to some boards using ams369fg06 panel, thus in this case, power on/off sequence is not necessary. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28checkpatch: improve CamelCase test for PageJoe Perches
Add the ClearPage/SetPage/TestClearPage/TestSetPage variants to the not reported Page CamelCase variables. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27sysrq: don't depend on weak undefined arrays to have an address that ↵Linus Torvalds
compares as NULL When taking an address of an extern array, gcc quite naturally should be able to say "an address of an object can never be NULL" and just optimize away the test entirely. However, the new alternate sysrq reset code (commit 154b7a489a5b: "Input: sysrq - allow specifying alternate reset sequence") did exactly that, and declared platform_sysrq_reset_seq[] as a weak array, and expecting that testing the address of the array would show whether it actually got linked against something or not. And that doesn't work with all gcc versions. Clearly it works with *some* versions of gcc, and maybe it's even supposed to work, but it really is a very fragile concept. So instead of testing the address of the weak variable, just create a weak instance of that array that is empty. If some platform then has a real platform_sysrq_reset_seq[] that overrides our weak one, the linker will switch to that one, and it all works without any run-time conditionals at all. Reported-by: Dave Airlie <airlied@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27mm: do not grow the stack vma just because of an overrun on preceding vmaLinus Torvalds
The stack vma is designed to grow automatically (marked with VM_GROWSUP or VM_GROWSDOWN depending on architecture) when an access is made beyond the existing boundary. However, particularly if you have not limited your stack at all ("ulimit -s unlimited"), this can cause the stack to grow even if the access was really just one past *another* segment. And that's wrong, especially since we first grow the segment, but then immediately later enforce the stack guard page on the last page of the segment. So _despite_ first growing the stack segment as a result of the access, the kernel will then make the access cause a SIGSEGV anyway! So do the same logic as the guard page check does, and consider an access to within one page of the next segment to be a bad access, rather than growing the stack to abut the next segment. Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-27Merge tag 'xtensa-next-20130225' of git://github.com/czankel/xtensa-linuxLinus Torvalds
Pull xtensa update from Chris Zankel: "Added features: - add support for thread local storage (TLS) - add accept4 and finit_module syscalls - support medium-priority interrupts - add support for dc232c processor variant - support file-base simulated disk for ISS simulator Bug fixes: - fix return values returned by the str[n]cmp functions - avoid mmap cache aliasing - fix handling of 'windowed registers' in ptrace" * tag 'xtensa-next-20130225' of git://github.com/czankel/xtensa-linux: xtensa: add accept4 syscall xtensa: add support for TLS xtensa: add missing include asm/uaccess.h to checksum.h xtensa: do not enable GENERIC_GPIO by default xtensa: complete ptrace handling of register windows xtensa: add support for oprofile xtensa: move spill_registers to traps.h xtensa: ISS: add host file-based simulated disk xtensa: fix str[n]cmp return value xtensa: avoid mmap cache aliasing xtensa: add finit_module syscall xtensa: pull signal definitions from signal-defs.h xtensa: fix ipc_parse_version selection xtensa: dispatch medium-priority interrupts xtensa: Add config files for Diamond 233L - Rev C processor variant xtensa: use new common dtc rule xtensa: rename prom_update_property to of_update_property
2013-02-27Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds
Pull microblaze update from Michal Simek: "Microblaze changes. After my discussion with Arnd I have also added there asm-generic io patch which is Acked by him and Geert." * 'next' of git://git.monstr.eu/linux-2.6-microblaze: asm-generic: io: Fix ioread16/32be and iowrite16/32be microblaze: Do not use module.h in files which are not modules microblaze: Fix coding style issues microblaze: Add missing return from debugfs_tlb microblaze: Makefile clean microblaze: Add .gitignore entries for auto-generated files microblaze: Fix strncpy_from_user macro
2013-02-27Merge branch 'for-upstream' of git://openrisc.net/jonas/linuxLinus Torvalds
Pull OpenRISC updates from Jonas Bonn: "An equal number of bug fixes and trivial cleanups; no new features. - Two patches to fix errors thrown by the updated toolchain. - Three other bug fixes. - Four trivial cleanups." * 'for-upstream' of git://openrisc.net/jonas/linux: openrisc: add missing header inclusion openrisc: really pass correct arg to schedule_tail Add bitops include needed for ext2 filesystem openrisc: update DTLB-miss handler last openrisc: fix up vmalloc page table loading openrisc idle: delete pm_idle openrisc: remove CONFIG_SYMBOL_PREFIX openrisc: avoid using function parameter regs in reset vector openrisc: remove unused current_regs
2013-02-27Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge Revert "x86, mm: Make spurious_fault check explicitly check explicitly check the PRESENT bit" x86/mm/numa: Don't check if node is NUMA_NO_NODE x86, efi: Make "noefi" really disable EFI runtime serivces x86/apic: Fix parsing of the 'lapic' cmdline option
2013-02-27Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar. * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource : Nomadik-mtu : fix missing irq initialization posix-timer: Don't call idr_find() with out-of-range ID
2013-02-27Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cputime: Use local_clock() for full dynticks cputime accounting cputime: Constify timeval_to_cputime(timeval) argument sched: Move RR_TIMESLICE from sysctl.h to rt.h sched: Fix /proc/sched_debug failure on very very large systems sched: Fix /proc/sched_stat failure on very very large systems sched/core: Remove the obsolete and unused nr_uninterruptible() function
2013-02-27Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Add Intel IvyBridge event scheduling constraints ftrace: Call ftrace cleanup module notifier after all other notifiers tracing/syscalls: Allow archs to ignore tracing compat syscalls
2013-02-26Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Theodore Ts'o: "The one new feature added in this patch series is the ability to use the "punch hole" functionality for inodes that are not using extent maps. In the bug fix category, we fixed some races in the AIO and fstrim code, and some potential NULL pointer dereferences and memory leaks in error handling code paths. In the optimization category, we fixed a performance regression in the jbd2 layer introduced by commit d9b01934d56a ("jbd: fix fsync() tid wraparound bug", introduced in v3.0) which shows up in the AIM7 benchmark. We also further optimized jbd2 by minimize the amount of time that transaction handles are held active. This patch series also features some additional enhancement of the extent status tree, which is now used to cache extent information in a more efficient/compact form than what we use on-disk." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (65 commits) ext4: fix free clusters calculation in bigalloc filesystem ext4: no need to remove extent if len is 0 in ext4_es_remove_extent() ext4: fix xattr block allocation/release with bigalloc ext4: reclaim extents from extent status tree ext4: adjust some functions for reclaiming extents from extent status tree ext4: remove single extent cache ext4: lookup block mapping in extent status tree ext4: track all extent status in extent status tree ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag ext4: rename and improbe ext4_es_find_extent() ext4: add physical block and status member into extent status tree ext4: refine extent status tree ext4: use ERR_PTR() abstraction for ext4_append() ext4: refactor code to read directory blocks into ext4_read_dirblock() ext4: add debugging context for warning in ext4_da_update_reserve_space() ext4: use KERN_WARNING for warning messages jbd2: use module parameters instead of debugfs for jbd_debug ext4: use module parameters instead of debugfs for mballoc_debug ext4: start handle at the last possible moment when creating inodes ext4: fix the number of credits needed for acl ops with inline data ...