summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/setup.c
AgeCommit message (Collapse)Author
2010-10-22Merge branch 'core-memblock-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits) x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S xen: Cope with unmapped pages when initializing kernel pagetable memblock, bootmem: Round pfn properly for memory and reserved regions memblock: Annotate memblock functions with __init_memblock memblock: Allow memblock_init to be called early memblock/arm: Fix memblock_region_is_memory() typo x86, memblock: Remove __memblock_x86_find_in_range_size() memblock: Fix wraparound in find_region() x86-32, memblock: Make add_highpages honor early reserved ranges x86, memblock: Fix crashkernel allocation arm, memblock: Fix the sparsemem build memblock: Fix section mismatch warnings powerpc, memblock: Fix memblock API change fallout memblock, microblaze: Fix memblock API change fallout x86: Remove old bootmem code x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve x86: Remove not used early_res code x86, memblock: Replace e820_/_early string with memblock_ x86: Use memblock to replace early_res x86, memblock: Use memblock_debug to control debug message print out ... Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
2010-10-21Merge branch 'x86-vmware-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Remove alloc_pmd_clone hook, only used by VMI x86, vmware: Remove deprecated VMI kernel support Fix up trivial #include conflict in arch/x86/kernel/smpboot.c
2010-10-21Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove stale pmtimer_64.c x86, cleanups: Use clear_page/copy_page rather than memset/memcpy x86: Remove unnecessary #ifdef ACPI/X86_IO_ACPI x86, cleanup: Remove obsolete boot_cpu_id variable
2010-10-21Merge branch 'x86-bios-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-bios-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, bios: Make the x86 early memory reservation a kernel option x86, bios: By default, reserve the low 64K for all BIOSes
2010-10-21Merge branch 'x86-amd-nb-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd_nb: Enable GART support for AMD family 0x15 CPUs x86, amd: Use compute unit information to determine thread siblings x86, amd: Extract compute unit information for AMD CPUs x86, amd: Add support for CPUID topology extension of AMD CPUs x86, nmi: Support NMI watchdog on newer AMD CPU families x86, mtrr: Assume SYS_CFG[Tom2ForceMemTypeWB] exists on all future AMD CPUs x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB x86, k8-gart: Decouple handling of garts and northbridges x86, cacheinfo: Fix dependency of AMD L3 CID x86, kvm: add new AMD SVM feature bits x86, cpu: Fix allowed CPUID bits for KVM guests x86, cpu: Update AMD CPUID feature bits x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bit x86, AMD: Remove needless CPU family check (for L3 cache info) x86, tsc: Remove CPU frequency calibration on AMD
2010-10-20x86-32, mm: Add an initial page table for core bootstrappingBorislav Petkov
This patch adds an initial page table with low mappings used exclusively for booting APs/resuming after ACPI suspend/machine restart. After this, there's no need to add low mappings to swapper_pg_dir and zap them later or create own swsusp PGD page solely for ACPI sleep needs - we have initial_page_table for that. Signed-off-by: Borislav Petkov <bp@alien8.de> LKML-Reference: <20101020070526.GA9588@liondog.tnic> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-20Merge branch 'x86/cleanups' into x86/trampolineH. Peter Anvin
2010-10-14x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.SJeremy Fitzhardinge
head_64.S maps up to 512 MiB, but that is not necessarity true for other entry paths, such as Xen. Thus, co-locate the setting of max_pfn_mapped with the code to actually set up the page tables in head_64.S. The 32-bit code is already so co-located. (The Xen code already sets max_pfn_mapped correctly for its own use case.) -v2: Yinghai fixed the following bug in this patch: | | max_pfn_mapped is in .bss section, so we need to set that | after bss get cleared. Without that we crash on bootup. | | That is safe because Xen does not call x86_64_start_kernel(). | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Fixed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <4CB6AB24.9020504@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-06x86, memblock: Fix crashkernel allocationYinghai Lu
Cai Qian found crashkernel is broken with the x86 memblock changes. 1. crashkernel=128M@32M always reported that range is used, even if the first kernel is small and does not usethat range 2. we always got following report when using "kexec -p" Could not find a free area of memory of a000 bytes... locate_hole failed The root cause is that generic memblock_find_in_range() will try to allocate from the top of the range, whereas the kexec code was written assuming that allocation was always near the bottom and that it could blindly extend memory upward. Unfortunately the kexec code doesn't have a system for requesting the range that it really needs, so this is subject to probabilistic failures. This patch hacks around the problem by limiting the target range heuristically to below the traditional bzImage max range. This number is arbitrary and not always correct, and a much better result would be obtained by having kexec communicate this number based on the kernel header information and any appropriate command line options. Reported-and-Bisected-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CABAF2A.5090501@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-09-20jump label: Make dynamic no-op selection available outside of ftraceJason Baron
Move Steve's code for finding the best 5-byte no-op from ftrace.c to alternative.c. The idea is that other consumers (in this case jump label) want to make use of that code. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <96259ae74172dcac99c0020c249743c523a92e18.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-20x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NBAndreas Herrmann
The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-31Merge commit 'v2.6.36-rc3' into x86/memblockIngo Molnar
Conflicts: arch/x86/kernel/trampoline.c mm/memblock.c Merge reason: Resolve the conflicts, update to latest upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-27x86: Remove old bootmem codeYinghai Lu
Requested by Ingo, Thomas and HPA. The old bootmem code is no longer necessary, and the transition is complete. Remove it. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get ↵Yinghai Lu
correct dma_reserve memblock_memory_size() will return memory size in memblock.memory.region. memblock_free_memory_size() will return free memory size in memblock.memory.region. So We can get exact reseved size in specified range. Set the size right after initmem_init(), because later bootmem API will get area above 16M. (except some fallback). Later after we remove the bootmem, We could call that just before paging_init(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27x86, memblock: Replace e820_/_early string with memblock_Yinghai Lu
1.include linux/memblock.h directly. so later could reduce e820.h reference. 2 this patch is done by sed scripts mainly -v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27x86: Use memblock to replace early_resYinghai Lu
1. replace find_e820_area with memblock_find_in_range 2. replace reserve_early with memblock_x86_reserve_range 3. replace free_early with memblock_x86_free_range. 4. NO_BOOTMEM will switch to use memblock too. 5. use _e820, _early wrap in the patch, in following patch, will replace them all 6. because memblock_x86_free_range support partial free, we can remove some special care 7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill() so adjust some calling later in setup.c::setup_arch() -- corruption_check and mptable_update -v2: Move reserve_brk() early Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range() that could happen We have more then 128 RAM entry in E820 tables, and memblock_x86_fill() could use memblock_find_in_range() to find a new place for memblock.memory.region array. and We don't need to use extend_brk() after fill_memblock_area() So move reserve_brk() early before fill_memblock_area(). -v3: Move find_smp_config early To make sure memblock_find_in_range not find wrong place, if BIOS doesn't put mptable in right place. -v4: Treat RESERVED_KERN as RAM in memblock.memory. and they are already in memblock.reserved already.. use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later. -v5: Generic version __memblock_find_in_range() is going from high to low, and for 32bit active_region for 32bit does include high pages need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped() -v6: Use current_limit instead -v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L -v8: Set memblock_can_resize early to handle EFI with more RAM entries -v9: update after kmemleak changes in mainline Suggested-by: David S. Miller <davem@davemloft.net> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-26x86, bios: Make the x86 early memory reservation a kernel optionH. Peter Anvin
Add a kernel command-line option so the x86 early memory reservation size can be adjusted at runtime instead of only at compile time. Suggested-by: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <tip-d0cd7425fab774a480cce17c2f649984312d0b55@git.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-25x86, bios: By default, reserve the low 64K for all BIOSesH. Peter Anvin
The laundry list of BIOSes that need the low 64K reserved is getting very long, so make it the default across all BIOSes. This also allows the code to be simplified and unified with the reservation code for the first 4K. This resolves kernel bugzilla 16661 and who knows what else... Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <tip-*@git.kernel.org>
2010-08-23x86, vmware: Remove deprecated VMI kernel supportAlok Kataria
With the recent innovations in CPU hardware acceleration technologies from Intel and AMD, VMware ran a few experiments to compare these techniques to guest paravirtualization technique on VMware's platform. These hardware assisted virtualization techniques have outperformed the performance benefits provided by VMI in most of the workloads. VMware expects that these hardware features will be ubiquitous in a couple of years, as a result, VMware has started a phased retirement of this feature from the hypervisor. Please note that VMI has always been an optimization and non-VMI kernels still work fine on VMware's platform. Latest versions of VMware's product which support VMI are, Workstation 7.0 and VSphere 4.0 on ESX side, future maintainence releases for these products will continue supporting VMI. For more details about VMI retirement take a look at this, http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html This feature removal was scheduled for 2.6.37 back in September 2009. Signed-off-by: Alok N Kataria <akataria@vmware.com> LKML-Reference: <1282600151.19396.22.camel@ank32.eng.vmware.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-18x86-32: Separate 1:1 pagetables from swapper_pg_dirJoerg Roedel
This patch fixes machine crashes which occur when heavily exercising the CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by AMD Erratum 383 and result in a fatal machine check exception. Here's the scenario: 1. On 32-bit, the swapper_pg_dir page table is used as the initial page table for booting a secondary CPU. 2. To make this work, swapper_pg_dir needs a direct mapping of physical memory in it (the low mappings). By adding those low, large page (2M) mappings (PAE kernel), we create the necessary conditions for Erratum 383 to occur. 3. Other CPUs which do not participate in the off- and onlining game may use swapper_pg_dir while the low mappings are present (when leave_mm is called). For all steps below, the CPU referred to is a CPU that is using swapper_pg_dir, and not the CPU which is being onlined. 4. The presence of the low mappings in swapper_pg_dir can result in TLB entries for addresses below __PAGE_OFFSET to be established speculatively. These TLB entries are marked global and large. 5. When the CPU with such TLB entry switches to another page table, this TLB entry remains because it is global. 6. The process then generates an access to an address covered by the above TLB entry but there is a permission mismatch - the TLB entry covers a large global page not accessible to userspace. 7. Due to this permission mismatch a new 4kb, user TLB entry gets established. Further, Erratum 383 provides for a small window of time where both TLB entries are present. This results in an uncorrectable machine check exception signalling a TLB multimatch which panics the machine. There are two ways to fix this issue: 1. Always do a global TLB flush when a new cr3 is loaded and the old page table was swapper_pg_dir. I consider this a hack hard to understand and with performance implications 2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit does. This patch implements solution 2. It introduces a trampoline_pg_dir which has the same layout as swapper_pg_dir with low_mappings. This page table is used as the initial page table of the booting CPU. Later in the bringup process, it switches to swapper_pg_dir and does a global TLB flush. This fixes the crashes in our test cases. -v2: switch to swapper_pg_dir right after entering start_secondary() so that we are able to access percpu data which might not be mapped in the trampoline page table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <20100816123833.GB28147@aftab> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-12x86, cleanup: Remove obsolete boot_cpu_id variableRobert Richter
boot_cpu_id is there for historical reasons and was renamed to boot_cpu_physical_apicid in patch: c70dcb7 x86: change boot_cpu_id to boot_cpu_physical_apicid However, there are some remaining occurrences of boot_cpu_id that are never touched in the kernel and thus its value is always 0. This patch removes boot_cpu_id completely. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-8-git-send-email-robert.richter@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-06-18x86, olpc: Add support for calling into OpenFirmwareAndres Salomon
Add support for saving OFW's cif, and later calling into it to run OFW commands. OFW remains resident in memory, living within virtual range 0xff800000 - 0xffc00000. A single page directory entry points to the pgdir that OFW actually uses, so rather than saving the entire page table, we grab and install that one entry permanently in the kernel's page table. This is currently only used by the OLPC XO. Note that this particular calling convention breaks PAE and PAT, and so cannot be used on newer x86 hardware. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <20100618174653.7755a39a@dev.queued.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012Gabor Gombas
The low-memory corruption checker triggers during suspend/resume, so we need to reserve the low 64k. Don't be fooled that the BIOS identifies itself as "Dell Inc.", it's still Phoenix BIOS. [ hpa: I think we blacklist almost every BIOS in existence. We should either change this to a whitelist or just make it unconditional. ] Signed-off-by: Gabor Gombas <gombasg@digikabel.hu> LKML-Reference: <201005241913.o4OJDIMM010877@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>
2010-05-21x86, kgdb: early trap init for early debugJan Kiszka
Allow the x86 arch to have early exception processing for the purpose of debugging via the kgdb. Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-04-07Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards x86: Increase CONFIG_NODES_SHIFT max to 10 ibft, x86: Change reserve_ibft_region() to find_ibft_region() x86, hpet: Fix bug in RTC emulation x86, hpet: Erratum workaround for read after write of HPET comparator bootmem, x86: Fix 32bit numa system without RAM on node 0 nobootmem, x86: Fix 32bit numa system without RAM on node 0 x86: Handle overlapping mptables x86: Make e820_remove_range to handle all covered case x86-32, resume: do a global tlb flush in S4 resume
2010-04-05Merge branch 'master' into export-slabhTejun Heo
2010-04-01ibft, x86: Change reserve_ibft_region() to find_ibft_region()Yinghai Lu
This allows arch code could decide the way to reserve the ibft. And we should reserve ibft as early as possible, instead of BOOTMEM stage, in case the table is in RAM range and is not reserved by BIOS (this will often be the case.) Move to just after find_smp_config(). Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore. -v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <konrad@darnok.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4BB510FB.80601@kernel.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Jones <pjones@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> CC: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-29x86: Make sure free_init_pages() frees pages on page boundaryYinghai Lu
When CONFIG_NO_BOOTMEM=y, it could use memory more effiently, or in a more compact fashion. Example: Allocated new RAMDISK: 00ec2000 - 0248ce57 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56 The new RAMDISK's end is not page aligned. Last page could be shared with other users. When free_init_pages are called for initrd or .init, the page could be freed and we could corrupt other data. code segment in free_init_pages(): | for (; addr < end; addr += PAGE_SIZE) { | ClearPageReserved(virt_to_page(addr)); | init_page_count(virt_to_page(addr)); | memset((void *)(addr & ~(PAGE_SIZE-1)), | POISON_FREE_INITMEM, PAGE_SIZE); | free_page(addr); | totalram_pages++; | } last half page could be used as one whole free page. So page align the boundaries. -v2: make the original initramdisk to be aligned, according to Johannes, otherwise we have the chance to lose one page. we still need to keep initrd_end not aligned, otherwise it could confuse decompressor. -v3: change to WARN_ON instead, suggested by Johannes. -v4: use PAGE_ALIGN, suggested by Johannes. We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN Add comments about assuming ramdisk start is aligned in relocate_initrd(), change to re get ramdisk_image instead of save it to make diff smaller. Add warning for wrong range, suggested by Johannes. -v6: remove one WARN() We need to align beginning in free_init_pages() do not copy more than ramdisk_size, noticed by Johannes Reported-by: Stanislaw Gruszka <sgruszka@redhat.com> Tested-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-3-git-send-email-yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-03Merge branch 'x86-bootmem-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-bootmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) early_res: Need to save the allocation name in drop_range_partial() sparsemem: Fix compilation on PowerPC early_res: Add free_early_partial() x86: Fix non-bootmem compilation on PowerPC core: Move early_res from arch/x86 to kernel/ x86: Add find_fw_memmap_area Move round_up/down to kernel.h x86: Make 32bit support NO_BOOTMEM early_res: Enhance check_and_double_early_res x86: Move back find_e820_area to e820.c x86: Add find_early_area_size x86: Separate early_res related code from e820.c x86: Move bios page reserve early to head32/64.c sparsemem: Put mem map for one node together. sparsemem: Put usemap for one node together x86: Make 64 bit use early_res instead of bootmem before slab x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMA x86: Make early_node_mem get mem > 4 GB if possible x86: Dynamically increase early_res array size x86: Introduce max_early_res and early_res_count ...
2010-02-25x86: Do not reserve brk for DMI if it's not going to be usedThadeu Lima de Souza Cascardo
This will save 64K bytes from memory when loading linux if DMI is disabled, which is good for embedded systems. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> LKML-Reference: <1265758732-19320-1-git-send-email-cascardo@holoscopio.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-17Merge branch 'linus' into x86/mmThomas Gleixner
x86/mm is on 32-rc4 and missing the spinlock namespace changes which are needed for further commits into this topic. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-12x86: Make 64 bit use early_res instead of bootmem before slabYinghai Lu
Finally we can use early_res to replace bootmem for x86_64 now. Still can use CONFIG_NO_BOOTMEM to enable it or not. -v2: fix 32bit compiling about MAX_DMA32_PFN -v3: folded bug fix from LKML message below Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B747239.4070907@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-11x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMAYinghai Lu
64bit NUMA already make enough space under 4G with new early_node_mem. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-16-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-11x86: Call early_res_to_bootmem one timeYinghai Lu
Simplify setup_node_mem: don't use bootmem from other node, instead just find_e820_area in early_node_mem. This keeps the boundary between early_res and boot mem more clear, and lets us only call early_res_to_bootmem() one time instead of for all nodes. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-12-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-11Merge remote branch 'linus/master' into x86/bootmemH. Peter Anvin
2010-02-02x86: Remove BIOS data range from e820Yinghai Lu
In preparation for moving to the generic page_is_ram(), make explicit what we expect to be reserved and not reserved. Tested-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20100122033004.335813103@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-29x86: Add quirk for Intel DG45FC board to avoid low memory corruptionDavid Härdeman
Commit 6aa542a694dc9ea4344a8a590d2628c33d1b9431 added a quirk for the Intel DG45ID board due to low memory corruption. The Intel DG45FC shares the same BIOS (and the same bug) as noted in: http://bugzilla.kernel.org/show_bug.cgi?id=13736 Signed-off-by: David Härdeman <david@hardeman.nu> LKML-Reference: <20100128200254.GA9134@hardeman.nu> Cc: <stable@kernel.org> Cc: Alexey Fisher <bug-track@fisher-privat.net> Cc: ykzhao <yakui.zhao@intel.com> Cc: Tony Bones <aabonesml@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-11x86: Use find_e820() instead of hard coded trampoline addressYinghai Lu
Jens found the following crash/regression: [ 0.000000] found SMP MP-table at [ffff8800000fdd80] fdd80 [ 0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 0-fff BIOS data page and [ 0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 6000-7fff TRAMPOLINE and bisected it to b24c2a9 ("x86: Move find_smp_config() earlier and avoid bootmem usage"). It turns out the BIOS is using the first 64k for mptable, without reserving it. So try to find good range for the real-mode trampoline instead of hard coding it, in case some bios tries to use that range for sth. Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Tested-by: Jens Axboe <jens.axboe@oracle.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <4B21630A.6000308@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-08Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) x86, mm: Correct the implementation of is_untracked_pat_range() x86/pat: Trivial: don't create debugfs for memtype if pat is disabled x86, mtrr: Fix sorting of mtrr after subtracting x86: Move find_smp_config() earlier and avoid bootmem usage x86, platform: Change is_untracked_pat_range() to bool; cleanup init x86: Change is_ISA_range() into an inline function x86, mm: is_untracked_pat_range() takes a normal semiclosed range x86, mm: Call is_untracked_pat_range() rather than is_ISA_range() x86: UV SGI: Don't track GRU space in PAT x86: SGI UV: Fix BAU initialization x86, numa: Use near(er) online node instead of roundrobin for NUMA x86, numa, bootmem: Only free bootmem on NUMA failure path x86: Change crash kernel to reserve via reserve_early() x86: Eliminate redundant/contradicting cache line size config options x86: When cleaning MTRRs, do not fold WP into UC x86: remove "extern" from function prototypes in <asm/proto.h> x86, mm: Report state of NX protections during boot x86, mm: Clean up and simplify NX enablement x86, pageattr: Make set_memory_(x|nx) aware of NX support x86, sleep: Always save the value of EFER ... Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range) to 'struct x86_platform_ops') in arch/x86/include/asm/x86_init.h arch/x86/kernel/x86_init.c
2009-12-05Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix a section mismatch in arch/x86/kernel/setup.c x86: Fixup last users of irq_chip->typename x86: Remove BKL from apm_32 x86: Remove BKL from microcode x86: use kernel_stack_pointer() in kprobes.c x86: use kernel_stack_pointer() in kgdb.c x86: use kernel_stack_pointer() in dumpstack.c x86: use kernel_stack_pointer() in process_32.c
2009-12-03Merge branch 'perf/mce' into perf/coreIngo Molnar
Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-30x86: Fix a section mismatch in arch/x86/kernel/setup.cHelight.Xu
copy_edd() should be __init. warning msg: WARNING: vmlinux.o(.text+0x7759): Section mismatch in reference from the function copy_edd() to the variable .init.data:boot_params The function copy_edd() references the variable __initdata boot_params. This is often because copy_edd lacks a __initdata annotation or the annotation of boot_params is wrong. Signed-off-by: ZhenwenXu <helight.xu@gmail.com> LKML-Reference: <4B139F8F.4000907@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-11-24x86: Move find_smp_config() earlier and avoid bootmem usageYinghai Lu
Move the find_smp_config() call to before bootmem is initialized. Use reserve_early() instead of reserve_bootmem() in it. This simplifies the code, we only need to call find_smp_config() once and can remove the now unneeded reserve parameter from x86_init_mpparse::find_smp_config. We thus also reduce x86's dependency on bootmem allocations. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B0BB9F2.70907@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23x86: Change crash kernel to reserve via reserve_early()Yinghai Lu
use find_e820_area()/reserve_early() instead. -v2: address Eric's request, to restore original semantics. will fail, if the provided address can not be used. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <4B09E2F9.7040403@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16x86, mm: Report state of NX protections during bootKees Cook
It is possible for x86_64 systems to lack the NX bit either due to the hardware lacking support or the BIOS having turned off the CPU capability, so NX status should be reported. Additionally, anyone booting NX-capable CPUs in 32bit mode without PAE will lack NX functionality, so this change provides feedback for that case as well. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1258154897-6770-6-git-send-email-hpa@zytor.com>
2009-11-16x86, mm: Clean up and simplify NX enablementH. Peter Anvin
The 32- and 64-bit code used very different mechanisms for enabling NX, but even the 32-bit code was enabling NX in head_32.S if it is available. Furthermore, we had a bewildering collection of tests for the available of NX. This patch: a) merges the 32-bit set_nx() and the 64-bit check_efer() function into a single x86_configure_nx() function. EFER control is left to the head code. b) eliminates the nx_enabled variable entirely. Things that need to test for NX enablement can verify __supported_pte_mask directly, and cpu_has_nx gives the supported status of NX. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Tejun Heo <tj@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Chris Wright <chrisw@sous-sol.org> LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com> Acked-by: Kees Cook <kees.cook@canonical.com>
2009-11-11x86: Make sure wakeup trampoline code is below 1MBYinghai Lu
Instead of using bootmem, try find_e820_area()/reserve_early(), and call acpi_reserve_memory() early, to allocate the wakeup trampoline code area below 1M. This is more reliable, and it also removes a dependency on bootmem. -v2: change function name to acpi_reserve_wakeup_memory(), as suggested by Rafael. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: pm list <linux-pm@lists.linux-foundation.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AFA210B.3020207@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP valueYong Wang
On platforms where the BIOS handles the thermal monitor interrupt, APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS must not touch it. Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all the LVT entries except the mask bit. Essentially this results in all LVT entries including the thermal monitoring interrupt set to masked (clearing the bios programmed value for APIC_LVTTHMR). And this leads to kernel take over the thermal monitoring interrupt on AP's but not on BSP (leaving the bios programmed value only on BSP). As a result of this, we have seen system hangs when the thermal monitoring interrupt is generated. Fix this by reading the initial value of thermal LVT entry on BSP and if bios has taken over the control, then program the same value on all AP's and leave the thermal monitoring interrupt control on all the logical cpu's to the bios. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org
2009-11-06x86: Add Phoenix/MSC BIOSes to lowmem corruption listSimon Kagstrom
We have a board with a Phoenix/MSC BIOS which also corrupts the low 64KB of RAM, so add an entry to the table. Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net> LKML-Reference: <20091106154404.002648d9@marrow.netinsight.se> Signed-off-by: H. Peter Anvin <hpa@zytor.com>