summaryrefslogtreecommitdiff
path: root/kernel/kexec.c
AgeCommit message (Collapse)Author
2014-01-09kexec: migrate to reboot cpuVivek Goyal
commit c97102ba96324da330078ad8619ba4dfe840dbe3 upstream. Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic kernel") moved reboot= handling to generic code. In the process it also removed the code in native_machine_shutdown() which are moving reboot process to reboot_cpu/cpu0. I guess that thought must have been that all reboot paths are calling migrate_to_reboot_cpu(), so we don't need this special handling. But kexec reboot path (kernel_kexec()) is not calling migrate_to_reboot_cpu() so above change broke kexec. Now reboot can happen on non-boot cpu and when INIT is sent in second kerneo to bring up BP, it brings down the machine. So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid this problem. Bisected by WANG Chao. Reported-by: Matthew Whitehead <mwhitehe@redhat.com> Reported-by: Dave Young <dyoung@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Tested-by: Baoquan He <bhe@redhat.com> Tested-by: WANG Chao <chaowang@redhat.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-20PCI: Disable Bus Master only on kexec rebootKhalid Aziz
commit 4fc9bbf98fd66f879e628d8537ba7c240be2b58e upstream. Add a flag to tell the PCI subsystem that kernel is shutting down in preparation to kexec a kernel. Add code in PCI subsystem to use this flag to clear Bus Master bit on PCI devices only in case of kexec reboot. This fixes a power-off problem on Acer Aspire V5-573G and likely other machines and avoids any other issues caused by clearing Bus Master bit on PCI devices in normal shutdown path. The problem was introduced by b566a22c2332 ("PCI: disable Bus Master on PCI device shutdown"). This patch is based on discussion at http://marc.info/?l=linux-pci&m=138425645204355&w=2 Link: https://bugzilla.kernel.org/show_bug.cgi?id=63861 Reported-by: Chang Liu <cl91tp@gmail.com> Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-11kexec: remove unnecessary returnXishi Qiu
Code can not run here forever, so remove the unnecessary return. Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Suggested-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Reviewed-by: Simon Horman <horms@verge.net.au> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01kexec: Use min() and min_t() to simplify logicZhang Yanfei
Simplify the logic of variable assignments. [akpm@linux-foundation.org: replace min_t with min, remove unneeded casts] Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: Simon Horman <horms@verge.net.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01kexec: fix wrong types of some local variablesZhang Yanfei
The types of the following local variables: - ubytes/mbytes in kimage_load_crash_segment()/kimage_load_normal_segment() - r in vmcoreinfo_append_str() are wrong, so fix them. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Simon Horman <horms@verge.net.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29kexec, vmalloc: export additional vmalloc layer informationAtsushi Kumagai
Now, vmap_area_list is exported as VMCOREINFO for makedumpfile to get the start address of vmalloc region (vmalloc_start). The address which contains vmalloc_start value is represented as below: vmap_area_list.next - OFFSET(vmap_area.list) + OFFSET(vmap_area.va_start) However, both OFFSET(vmap_area.va_start) and OFFSET(vmap_area.list) aren't exported as VMCOREINFO. So this patch exports them externally with small cleanup. [akpm@linux-foundation.org: vmalloc.h should include list.h for list_head] Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, vmalloc: export vmap_area_list, instead of vmlistJoonsoo Kim
Although our intention is to unexport internal structure entirely, but there is one exception for kexec. kexec dumps address of vmlist and makedumpfile uses this information. We are about to remove vmlist, then another way to retrieve information of vmalloc layer is needed for makedumpfile. For this purpose, we export vmap_area_list, instead of vmlist. Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm,kexec: use common help functions to free reserved pagesJiang Liu
Use common help functions to free reserved pages. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Eric Biederman <ebiederm@xmission.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-17kexec: use Crash kernel for Crash kernel lowYinghai Lu
We can extend kexec-tools to support multiple "Crash kernel" in /proc/iomem instead. So we can use "Crash kernel" instead of "Crash kernel low" in /proc/iomem. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-3-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86, kdump: Change crashkernel_high/low= to crashkernel=,high/lowYinghai Lu
Per hpa, use crashkernel=X,high crashkernel=Y,low instead of crashkernel_hign=X crashkernel_low=Y. As that could be extensible. -v2: according to Vivek, change delimiter to ; -v3: let hign and low only handle simple form and it conforms to description in kernel-parameters.txt still keep crashkernel=X override any crashkernel=X,high crashkernel=Y,low -v4: update get_last_crashkernel returning and add more strict checking in parse_crashkernel_simple() found by HATAYAMA. -v5: Change delimiter back to , according to HPA. also separate parse_suffix from parse_simper according to vivek. so we can avoid @pos in that path. -v6: Tight the checking about crashkernel=X,highblahblah,high found by HTYAYAMA. Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-5-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86, kdump: Retore crashkernel= to allocate under 896MYinghai Lu
Vivek found old kexec-tools does not work new kernel anymore. So change back crashkernel= back to old behavoir, and add crashkernel_high= to let user decide if buffer could be above 4G, and also new kexec-tools will be needed. -v2: let crashkernel=X override crashkernel_high= update description about _high will be ignored by crashkernel=X -v3: update description about kernel-parameters.txt according to Vivek. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-28kexec: avoid freeing NULL pointer in image_crash_alloc()Zhang Yanfei
Though there is no error if we free a NULL pointer, I think we could avoid this behaviour. Change the code a little in kimage_crash_alloc() could avoid this kind of unnecessary free. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: fix memory leak in function kimage_normal_allocZhang Yanfei
If kimage_normal_alloc() fails to alloc pages for image->swap_page, it should call kimage_free_page_list() to free allocated pages in image->control_pages list before it frees image. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: prevent double free on image allocation failureSasha Levin
If kimage_normal_alloc() fails to initialize an allocated kimage, it will free the image but would still set 'rimage', as a result kexec_load will try to free it again. This would explode as part of the freeing process is accessing internal members which point to uninitialized memory. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: export PG_hwpoison flag into vmcoreinfoMitsuhiro Tanino
This patch exports a PG_hwpoison into vmcoreinfo when CONFIG_MEMORY_FAILURE is defined. "makedumpfile" needs to read information of memory, such as 'mem_section', 'zone', 'pageflags' from vmcore. We introduce a function into "makedumpfile" to exclude hwpoison page from vmcore dump. In order to introduce this function, PG_hwpoison flag have to export into vmcoreinfo. Signed-off-by: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: get rid of duplicate check for hole_endZhang Yanfei
hole_end has been checked to make sure it is <= crash_res.end in the while condition check, so the if condition check is duplicate. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-28kexec: add the values related to buddy system for filtering free pages.Atsushi Kumagai
tAdd adds the values related to buddy system to vmcoreinfo data so that makedumpfile (dump filtering command) can filter out all free pages with the new logic. It's faster than the current logic because it can distinguish free page by analyzing page structure at the same time as filtering for other unnecessary pages (e.g. anonymous page). OTOH, the current logic has to trace free_list to distinguish free pages while analyzing page structure to filter out other unnecessary pages. The new logic uses the fact that buddy page is marked by _mapcount == PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab is set or not before looking up _mapcount value. And we can get the order of buddy system from private field. To sum it up, the values below are required for this logic. Required values: - OFFSET(page._mapcount) - OFFSET(page.private) - NUMBER(PG_slab) - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) Changelog from v1 to v2: 1. remove SIZE(pageflags) The new logic was changed after I sent v1 patch. Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile. What's makedumpfile: makedumpfile creates a small dumpfile by excluding unnecessary pages for the analysis. To distinguish unnecessary pages, makedumpfile gets the vmcoreinfo data which has the minimum debugging information only for dump filtering. Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-30x86: Add Crash kernel low reservationYinghai Lu
During kdump kernel's booting stage, it need to find low ram for swiotlb buffer when system does not support intel iommu/dmar remapping. kexed-tools is appending memmap=exactmap and range from /proc/iomem with "Crash kernel", and that range is above 4G for 64bit after boot protocol 2.12. We need to add another range in /proc/iomem like "Crash kernel low", so kexec-tools could find that info and append to kdump kernel command line. Try to reserve some under 4G if the normal "Crash kernel" is above 4G. User could specify the size with crashkernel_low=XX[KMG]. -v2: fix warning that is found by Fengguang's test robot. -v3: move out get_mem_size change to another patch, to solve compiling warning that is found by Borislav Petkov <bp@alien8.de> -v4: user must specify crashkernel_low if system does not support intel or amd iommu. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-31-git-send-email-yinghai@kernel.org Cc: Eric Biederman <ebiederm@xmission.com> Cc: Rob Landley <rob@landley.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-10-05kdump: remove unneeded includeWei Yongjun
The inclusion of <generated/utsrelease.h> is unnecessary. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31kdump: append newline to the last lien of vmcoreinfo noteVivek Goyal
The last line of vmcoreinfo note does not end with \n. Parsing all the lines in note becomes easier if all lines end with \n instead of trying to special case the last line. I know at least one tool, vmcore-dmesg in kexec-tools tree which made the assumption that all lines end with \n. I think it is a good idea to fix it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-29Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge third batch of patches from Andrew Morton: - Some MM stragglers - core SMP library cleanups (on_each_cpu_mask) - Some IPI optimisations - kexec - kdump - IPMI - the radix-tree iterator work - various other misc bits. "That'll do for -rc1. I still have ~10 patches for 3.4, will send those along when they've baked a little more." * emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits) backlight: fix typo in tosa_lcd.c crc32: add help text for the algorithm select option mm: move hugepage test examples to tools/testing/selftests/vm mm: move slabinfo.c to tools/vm mm: move page-types.c from Documentation to tools/vm selftests/Makefile: make `run_tests' depend on `all' selftests: launch individual selftests from the main Makefile radix-tree: use iterators in find_get_pages* functions radix-tree: rewrite gang lookup using iterator radix-tree: introduce bit-optimized iterator fs/proc/namespaces.c: prevent crash when ns_entries[] is empty nbd: rename the nbd_device variable from lo to nbd pidns: add reboot_pid_ns() to handle the reboot syscall sysctl: use bitmap library functions ipmi: use locks on watchdog timeout set on reboot ipmi: simplify locking ipmi: fix message handling during panics ipmi: use a tasklet for handling received messages ipmi: increase KCS timeouts ipmi: decrease the IPMI message transaction time in interrupt mode ...
2012-03-29kexec: add further check to crashkernelZhenzhong Duan
When using crashkernel=2M-256M, the kernel doesn't give any warning. This is misleading sometimes. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-29kexec: crash: don't save swapper_pg_dir for !CONFIG_MMU configurationsWill Deacon
nommu platforms don't have very interesting swapper_pg_dir pointers and usually just #define them to NULL, meaning that we can't include them in the vmcoreinfo on the kexec crash path. This patch only saves the swapper_pg_dir if we have an MMU. Signed-off-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Simon Horman <horms@verge.net.au> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-01-29PM / Sleep: Introduce "late suspend" and "early resume" of devicesRafael J. Wysocki
The current device suspend/resume phases during system-wide power transitions appear to be insufficient for some platforms that want to use the same callback routines for saving device states and related operations during runtime suspend/resume as well as during system suspend/resume. In principle, they could point their .suspend_noirq() and .resume_noirq() to the same callback routines as their .runtime_suspend() and .runtime_resume(), respectively, but at least some of them require device interrupts to be enabled while the code in those routines is running. It also makes sense to have device suspend-resume callbacks that will be executed with runtime PM disabled and with device interrupts enabled in case someone needs to run some special code in that context during system-wide power transitions. Apart from this, .suspend_noirq() and .resume_noirq() were introduced as a workaround for drivers using shared interrupts and failing to prevent their interrupt handlers from accessing suspended hardware. It appears to be better not to use them for other porposes, or we may have to deal with some serious confusion (which seems to be happening already). For the above reasons, introduce new device suspend/resume phases, "late suspend" and "early resume" (and analogously for hibernation) whose callback will be executed with runtime PM disabled and with device interrupts enabled and whose callback pointers generally may point to runtime suspend/resume routines. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Reviewed-by: Kevin Hilman <khilman@ti.com>
2012-01-13kdump: crashk_res init check for /sys/kernel/kexec_crash_sizeMichael Holzheu
Currently it is possible to set the crash_size via the sysfs /sys/kernel/kexec_crash_size even if no crash kernel memory has been defined with the "crashkernel" parameter. In this case "crashk_res" is not initialized and crashk_res.start = crashk_res.end = 0. Unfortunately resource_size(&crashk_res) returns 1 in this case. This breaks the s390 implementation of crash_(un)map_reserved_pages(). To fix the problem the correct "old_size" is now calculated in crash_shrink_memory(). "old_size is set to "0" if crashk_res is not initialized. With this change crash_shrink_memory() will do nothing, when "crashk_res" is not initialized. It will return "0" for "echo 0 > /sys/kernel/kexec_crash_size" and -EINVAL for "echo [not zero] > /sys/kernel/kexec_crash_size". In addition to that this patch also simplifies the "ret = -EINVAL" vs. "ret = 0" logic as suggested by Simon Horman. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Dave Young <dyoung@redhat.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13kdump: add missing RAM resource in crash_shrink_memory()Michael Holzheu
When shrinking crashkernel memory using /sys/kernel/kexec_crash_size for the newly added memory no RAM resource is created at the moment. Example: $ cat /proc/iomem 00000000-bfffffff : System RAM 00000000-005b7ac3 : Kernel code 005b7ac4-009743bf : Kernel data 009bb000-00a85c33 : Kernel bss c0000000-cfffffff : Crash kernel d0000000-ffffffff : System RAM $ echo 0 > /sys/kernel/kexec_crash_size $ cat /proc/iomem 00000000-bfffffff : System RAM 00000000-005b7ac3 : Kernel code 005b7ac4-009743bf : Kernel data 009bb000-00a85c33 : Kernel bss <<-- here is System RAM missing d0000000-ffffffff : System RAM One result of this bug is that the memory chunk can never be set offline using memory hotplug. With this patch I insert a new "System RAM" resource for the released memory. Then the upper example looks like the following: $ echo 0 > /sys/kernel/kexec_crash_size $ cat /proc/iomem 00000000-bfffffff : System RAM 00000000-005b7ac3 : Kernel code 005b7ac4-009743bf : Kernel data 009bb000-00a85c33 : Kernel bss c0000000-cfffffff : System RAM <<-- new rescoure d0000000-ffffffff : System RAM And now I can set chunk c0000000-cfffffff offline. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13kexec: remove KMSG_DUMP_KEXECWANG Cong
KMSG_DUMP_KEXEC is useless because we already save kernel messages inside /proc/vmcore, and it is unsafe to allow modules to do other stuffs in a crash dump scenario. [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-08PM / Sleep: Replace mutex_[un]lock(&pm_mutex) with [un]lock_system_sleep()Srivatsa S. Bhat
Using [un]lock_system_sleep() is safer than directly using mutex_[un]lock() on 'pm_mutex', since the latter could lead to freezing failures. Hence convert all the present users of mutex_[un]lock(&pm_mutex) to use these safe APIs instead. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-30[S390] kdump: Add infrastructure for unmapping crashkernel memoryMichael Holzheu
This patch introduces a mechanism that allows architecture backends to remove page tables for the crashkernel memory. This can protect the loaded kdump kernel from being overwritten by broken kernel code. Two new functions crash_map_reserved_pages() and crash_unmap_reserved_pages() are added that can be implemented by architecture code. The crash_map_reserved_pages() function is called before and crash_unmap_reserved_pages() after the crashkernel segments are loaded. The functions are also called in crash_shrink_memory() to create/remove page tables when the crashkernel memory size is reduced. To support architectures that have large pages this patch also introduces a new define KEXEC_CRASH_MEM_ALIGN. The crashkernel start and size must always be aligned with KEXEC_CRASH_MEM_ALIGN. Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30[S390] kdump: Initialize vmcoreinfo note at startupMichael Holzheu
Currently the vmcoreinfo note is only initialized in case of kdump. On s390 it is possible to create kernel dumps with other dump mechanisms than kdump (e.g. via hypervisor dump or stand-alone dump tools). For those dumps it would also be desirable to include the vmcoreinfo data. To accomplish this, with this patch the vmcoreinfo ELF note is always initialized, not only in case of a (kdump) crash. On s390 we will add an ABI defined pointer at a well known address to vmcoreinfo so that dump analysis tools are able to find this information. In particular on s390 we have a tool named zgetdump. With this tool it is possible to convert dump formats on the fly using fuse. E.g. you can mount a s390 stand-alone dump as ELF dump. When this is done, the tool finds the vmcoreinfo in the stand-alone dump via the well known ABI defined address and it creates the respective VMCOREINFO ELF note in the output ELF dump. This then can be used e.g. by makedumpfile for dump filtering. No more need for a vmlinux file with debug information. So this will look like the following: $ zgetdump --mount standalone.dump -f elf /mnt $ ls /mnt dump.elf $ readelf -n /mnt/dump.elf $ ... VMCOREINFO 0x00000474 Unknown note type: (0x00000000) $ makedumpfile -c -d 31 /mnt/dump.elf dump.kdump Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30[S390] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMITMichael Holzheu
On s390 there is a different KEXEC_CONTROL_MEMORY_LIMIT for the normal and the kdump kexec case. Therefore this patch introduces a new macro KEXEC_CRASH_CONTROL_MEMORY_LIMIT. This is set to KEXEC_CONTROL_MEMORY_LIMIT for all architectures that do not define KEXEC_CRASH_CONTROL_MEMORY_LIMIT. Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-06-10treewide: Convert uses of struct resource to resource_size(ptr)Joe Perches
Several fixes as well where the +1 was missing. Done via coccinelle scripts like: @@ struct resource *ptr; @@ - ptr->end - ptr->start + 1 + resource_size(ptr) and some grep and typing. Mostly uncompiled, no cross-compilers. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-11PM: Remove sysdev suspend, resume and shutdown operationsRafael J. Wysocki
Since suspend, resume and shutdown operations in struct sysdev_class and struct sysdev_driver are not used any more, remove them. Also drop sysdev_suspend(), sysdev_resume() and sysdev_shutdown() used for executing those operations and modify all of their users accordingly. This reduces kernel code size quite a bit and reduces its complexity. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-19PM: Add missing syscore_suspend() and syscore_resume() callsRafael J. Wysocki
Device suspend/resume infrastructure is used not only by the suspend and hibernate code in kernel/power, but also by APM, Xen and the kexec jump feature. However, commit 40dc166cb5dddbd36aa4ad11c03915ea (PM / Core: Introduce struct syscore_ops for core subsystems PM) failed to add syscore_suspend() and syscore_resume() calls to that code, which generally leads to breakage when the features in question are used. To fix this problem, add the missing syscore_suspend() and syscore_resume() calls to arch/x86/kernel/apm_32.c, kernel/kexec.c and drivers/xen/manage.c. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Ian Campbell <ian.campbell@citrix.com>
2011-04-07Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
2011-04-01kdump: Allow shrinking of kdump region to be overriddenAnton Blanchard
On ppc64 the crashkernel region almost always overlaps an area of firmware. This works fine except when using the sysfs interface to reduce the kdump region. If we free the firmware area we are guaranteed to crash. Rename free_reserved_phys_range to crash_free_reserved_phys_range and make it a weak function so we can override it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-11-01tree-wide: fix comment/printk typosUwe Kleine-König
"gadget", "through", "command", "maintain", "maintain", "controller", "address", "between", "initiali[zs]e", "instead", "function", "select", "already", "equal", "access", "management", "hierarchy", "registration", "interest", "relative", "memory", "offset", "already", Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-26use clear_page()/copy_page() in favor of memset()/memcpy() on whole pagesJan Beulich
After all that's what they are intended for. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11kexec: return -EFAULT on copy_to_user() failuresDan Carpenter
copy_to/from_user() returns the number of bytes remaining to be copied. It never returns a negative value. The correct return code is -EFAULT and not -EIO. All the callers check for non-zero returns so that's Ok, but the return code is passed to the user so we should fix this. Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Simon Kagstrom <simon.kagstrom@netinsight.net> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-29kexec: fix Oops in crash_shrink_memory()Pavan Naregundi
When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size" OOPSes the kernel in crash_shrink_memory. This happens when crash_shrink_memory tries to release the 'crashk_res' resource which are not reserved. Also value of "/sys/kernel/kexec_crash_size" shows as 1, which should be 0. This patch fixes the OOPS in crash_shrink_memory and shows "/sys/kernel/kexec_crash_size" as 0 when crash kernel memory is not reserved. Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-12kexec: fix OOPS in crash_kernel_shrinkVitaly Mayatskikh
Two "echo 0 > /sys/kernel/kexec_crash_size" OOPSes kernel. Also content of this file is invalid after first shrink to zero: it shows 1 instead of 0. This scenario is unlikely to happen often (root privs, valid crashkernel= in cmdline, dump-capture kernel not loaded), I hit it only by chance. This patch fixes it. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Cc: Cong Wang <amwang@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-17percpu: add __percpu sparse annotations to core kernel subsystemsTejun Heo
Add __percpu sparse annotations to core subsystems. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-mm@kvack.org Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Biederman <ebiederm@xmission.com>
2010-01-24Merge git://git.infradead.org/~dwmw2/mtd-2.6.33Linus Torvalds
* git://git.infradead.org/~dwmw2/mtd-2.6.33: mtd: tests: fix read, speed and stress tests on NOR flash mtd: Really add ARM pismo support kmsg_dump: Dump on crash_kexec as well
2009-12-31kmsg_dump: Dump on crash_kexec as wellKOSAKI Motohiro
crash_kexec gets called before kmsg_dump(KMSG_DUMP_OOPS) if panic_on_oops is set, so the kernel log buffer is not stored for this case. This patch adds a KMSG_DUMP_KEXEC dump type which gets called when crash_kexec() is invoked. To avoid getting double dumps, the old KMSG_DUMP_PANIC is moved below crash_kexec(). The mtdoops driver is modified to handle KMSG_DUMP_KEXEC in the same way as a panic. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Simon Kagstrom <simon.kagstrom@netinsight.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-17Merge branch 'for-33' of git://repo.or.cz/linux-kbuildLinus Torvalds
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits) net: fix for utsrelease.h moving to generated gen_init_cpio: fixed fwrite warning kbuild: fix make clean after mismerge kbuild: generate modules.builtin genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}() score: add asm/asm-offsets.h wrapper unifdef: update to upstream revision 1.190 kbuild: specify absolute paths for cscope kbuild: create include/generated in silentoldconfig scripts/package: deb-pkg: use fakeroot if available scripts/package: add KBUILD_PKG_ROOTCMD variable scripts/package: tar-pkg: use tar --owner=root Kbuild: clean up marker net: add net_tstamp.h to headers_install kbuild: move utsrelease.h to include/generated kbuild: move autoconf.h to include/generated drop explicit include of autoconf.h kbuild: move compile.h to include/generated kbuild: drop include/asm kbuild: do not check for include/asm-$ARCH ... Fixed non-conflicting clean merge of modpost.c as per comments from Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h that needed to be changed to generated/autoconf.h)
2009-12-16kexec: premit reduction of the reserved memory sizeAmerigo Wang
Implement shrinking the reserved memory for crash kernel, if it is more than enough. For example, if you have already reserved 128M, now you just want 100M, you can do: # echo $((100*1024*1024)) > /sys/kernel/kexec_crash_size Note, you can only do this before loading the crash kernel. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@redhat.com> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-12kbuild: move utsrelease.h to include/generatedSam Ravnborg
Fix up all users of utsrelease.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-07-30kexec: fix omitting offset in extended crashkernel syntaxHidetoshi Seto
Setting "crashkernel=512M-2G:64M,2G-:128M" does not work but it turns to work if it has a trailing-whitespace, like "crashkernel=512M-2G:64M,2G-:128M ". It was because of a bug in the parser, running over the cmdline. This patch adds a check of the termination. Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Tested-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>