Age | Commit message (Collapse) | Author |
|
Impact: fix early crash on LinuxBIOS systems
Kevin O'Connor reported that Coreboot aka LinuxBIOS tries to put
mptable somewhere very high, well above max_low_pfn (below which
BIOSes generally put the mptable), causing a panic.
The BIOS will probably be changed to be compatible with older
Linus versions, but nevertheless the MP-spec does not forbid
an MP-table in arbitrary system RAM, so make sure it all
works even if the table is in an unexpected place.
Check physptr with max_low_pfn * PAGE_SIZE.
Reported-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Stefan Reinauer <stepan@coresystems.de>
Cc: coreboot@coreboot.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: cleanup
Make x86_quirks support more transparent. The highlevel
methods are now named:
extern void x86_quirk_pre_intr_init(void);
extern void x86_quirk_intr_init(void);
extern void x86_quirk_trap_init(void);
extern void x86_quirk_pre_time_init(void);
extern void x86_quirk_time_init(void);
This makes it clear that if some platform extension has to
do something here that it is considered ... weird, and is
discouraged.
Also remove arch_hooks.h and move it into setup.h (and other
header files where appropriate).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: remove dead code
Remove:
- pre_setup_arch_hook()
- mca_nmi_hook()
If needed they can be added back via an x86_quirk handler.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: remove unused/broken code
The Voyager subarch last built successfully on the v2.6.26 kernel
and has been stale since then and does not build on the v2.6.27,
v2.6.28 and v2.6.29-rc5 kernels.
No actual users beyond the maintainer reported this breakage.
Patches were sent and most of the fixes were accepted but the
discussion around how to do a few remaining issues cleanly
fizzled out with no resolution and the code remained broken.
In the v2.6.30 x86 tree development cycle 32-bit subarch support
has been reworked and removed - and the Voyager code, beyond the
build problems already known, needs serious and significant
changes and probably a rewrite to support it.
CONFIG_X86_VOYAGER has been marked BROKEN then. The maintainer has
been notified but no patches have been sent so far to fix it.
While all other subarchs have been converted to the new scheme,
voyager is still broken. We'd prefer to receive patches which
clean up the current situation in a constructive way, but even in
case of removal there is no obstacle to add that support back
after the issues have been sorted out in a mutually acceptable
fashion.
So remove this inactive code for now.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Change the CONFIG_X86_EXTENDED_PLATFORM help text to display the
32bit/64bit extended platform list. This is as suggested by Ingo.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: shai@scalex86.org
Cc: "Benzi Galili (Benzi@ScaleMP.com)" <benzi@scalemp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Conflicts:
arch/x86/mach-default/setup.c
Semantic conflict resolution:
arch/x86/kernel/setup.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.
This is based on an earlier patch from Linus.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Right now nobody cares, but the suspend/resume code will eventually want
to suspend device interrupts without suspending the timer, and will
depend on this flag to know.
The modern x86 timer infrastructure uses the local APIC timers and never
shows up as a device interrupt at all, so it isn't affected and doesn't
need any of this.
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: remove CONFIG_ACPI_SYSTEM
fujitsu-laptop: Use RFKILL support bitmask from firmware
x86_64: Fix S3 fail path
x86_64: acpi/wakeup_64 cleanup
battery: don't assume we are fully charged when not charging or discharging
ACPI: EC: Add delay for slow MSI controller
|
|
http://kisskb.ellerman.id.au/kisskb/buildresult/72115/:
| net/mac80211/ieee80211_i.h:327: error: syntax error before 'volatile'
| net/mac80211/ieee80211_i.h:350: error: syntax error before '}' token
| net/mac80211/ieee80211_i.h:455: error: field 'sta' has incomplete type
| distcc[19430] ERROR: compile net/mac80211/main.c on sprygo/32 failed
This is caused by
| # define mfp ((*(volatile struct MFP*)MFP_BAS))
in arch/m68k/include/asm/atarihw.h, which conflicts with the new "mfp" enum in
net/mac80211/ieee80211_i.h.
Rename "mfp" to "st_mfp", as it's a way too generic name for a global #define.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If BIOS hands over the control to OS in legacy xapic mode, select
legacy xapic related ops in the early apic probe and shift to x2apic
ops later in the boot sequence, only after enabling x2apic mode.
If BIOS hands over the control in x2apic mode, select x2apic related
ops in the early apic probe.
This fixes the early boot panic, where we were selecting x2apic ops,
while the cpu is still in legacy xapic mode.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
As acpi_enter_sleep_state can fail, take this into account in
do_suspend_lowlevel and don't return to the do_suspend_lowlevel's
caller. This would break (currently) fpu status and preempt count.
Technically, this means use `call' instead of `jmp' and `jmp' to
the `resume_point' after the `call' (i.e. if
acpi_enter_sleep_state returns=fails). `resume_point' will handle
the restore of fpu and preempt count gracefully.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
- remove %ds re-set, it's already set in wakeup_long64
- remove double labels and alignment (ENTRY already adds both)
- use meaningful resume point labelname
- skip alignment while jumping from wakeup_long64 to the resume point
- remove .size, .type and unused labels
[v2]
- added ENDPROCs
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mce: remove incorrect __cpuinit for mce_cpu_features()
MAINTAINERS: paravirt-ops maintainers update
|
|
Impact: Bug fix on UP
Checkin 6ec68bff3c81e776a455f6aca95c8c5f1d630198:
x86, mce: reinitialize per cpu features on resume
introduced a call to mce_cpu_features() in the resume path, in order
for the MCE machinery to get properly reinitialized after a resume.
However, this function (and its successors) was flagged __cpuinit,
which becomes __init on UP configurations (on SMP suspend/resume
requires CPU hotplug and so this would not be seen.)
Remove the offending __cpuinit annotations for mce_cpu_features() and
its successor functions.
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: use the right protections for split-up pagetables
x86, vmi: TSC going backwards check in vmi clocksource
|
|
Fix the typo && -> ||.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
oprofile for MN10300 seems to have been broken by the advent of the new
tracing framework.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* Replace kmalloc() with uml_kmalloc() (fix build failure)
* Remove unnecessary UM_KERN_INFO in printk() (don't display '<6>' while
printing info)
Signed-off-by: Luca Bigliardi <shammash@artha.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Steven Rostedt found a bug in where in his modified kernel
ftrace was unable to modify the kernel text, due to the PMD
itself having been marked read-only as well in
split_large_page().
The fix, suggested by Linus, is to not try to 'clone' the
reference protection of a huge-page, but to use the standard
(and permissive) page protection bits of KERNPG_TABLE.
The 'cloning' makes sense for the ptes but it's a confused and
incorrect concept at the page table level - because the
pagetable entry is a set of all ptes and hence cannot
'clone' any single protection attribute - the ptes can be any
mixture of protections.
With the permissive KERNPG_TABLE, even if the pte protections
get changed after this point (due to ftrace doing code-patching
or other similar activities like kprobes), the resulting combined
protections will still be correct and the pte's restrictive
(or permissive) protections will control it.
Also update the comment.
This bug was there for a long time but has not caused visible
problems before as it needs a rather large read-only area to
trigger. Steve possibly hacked his kernel with some really
large arrays or so. Anyway, the bug is definitely worth fixing.
[ Huang Ying also experienced problems in this area when writing
the EFI code, but the real bug in split_large_page() was not
realized back then. ]
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Huang Ying <ying.huang@intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: fix time warps under vmware
Similar to the check for TSC going backwards in the TSC clocksource,
we also need this check for VMI clocksource.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] xen_domu build fix
[IA64] fixes configs and add default config for ia64 xen domU
[IA64] Remove redundant cpu_clear() in __cpu_disable path
[IA64] Revert "prevent ia64 from invoking irq handlers on offline CPUs"
[IA64] bte_copy of BTE_MAX_XFER trips BUG_ON.
[IA64] Build fix for __early_pfn_to_nid() undefined link error
|
|
arch/ia64/xen/xen_pv_ops.c:156: error: xen_init_ops causes a section type conflict
arch/ia64/xen/xen_pv_ops.c:340: error: xen_iosapic_ops causes a section type conflict
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
|
This patch fixes xen related Kconfigs and add default config
file for ia64 xen domU.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
|
|
The second call to cpu_clear() is redundant, as we've already removed
the CPU from cpu_online_map before calling migrate_platform_irqs().
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
|
|
This reverts commit e7b140365b86aaf94374214c6f4e6decbee2eb0a.
Commit e7b14036 removes the targetted disabled CPU from the
cpu_online_map after calls to migrate_platform_irqs and fixup_irqs.
Paul McKenney states that the reasoning behind the patch was to
prevent irq handlers from running on CPUs marked offline because:
RCU happily ignores CPUs that don't have their bits set in
cpu_online_map, so if there are RCU read-side critical sections
in the irq handlers being run, RCU will ignore them. If the
other CPUs were running, they might sequence through the RCU
state machine, which could result in data structures being
yanked out from under those irq handlers, which in turn could
result in oopses or worse.
Unfortunately, both ia64 functions above look at cpu_online_map to find
a new CPU to migrate interrupts onto. This means we can potentially
migrate an interrupt off ourself back to... ourself. Uh oh.
This causes an oops when we finally try to process pending interrupts on
the CPU we want to disable. The oops results from calling __do_IRQ with
a NULL pt_regs:
Unable to handle kernel NULL pointer dereference (address 0000000000000040)
Call Trace:
[<a000000100016930>] show_stack+0x50/0xa0
sp=e0000009c922fa00 bsp=e0000009c92214d0
[<a0000001000171a0>] show_regs+0x820/0x860
sp=e0000009c922fbd0 bsp=e0000009c9221478
[<a00000010003c700>] die+0x1a0/0x2e0
sp=e0000009c922fbd0 bsp=e0000009c9221438
[<a0000001006e92f0>] ia64_do_page_fault+0x950/0xa80
sp=e0000009c922fbd0 bsp=e0000009c92213d8
[<a00000010000c7a0>] ia64_native_leave_kernel+0x0/0x270
sp=e0000009c922fc60 bsp=e0000009c92213d8
[<a0000001000ecdb0>] profile_tick+0xd0/0x1c0
sp=e0000009c922fe30 bsp=e0000009c9221398
[<a00000010003bb90>] timer_interrupt+0x170/0x3e0
sp=e0000009c922fe30 bsp=e0000009c9221330
[<a00000010013a800>] handle_IRQ_event+0x80/0x120
sp=e0000009c922fe30 bsp=e0000009c92212f8
[<a00000010013aa00>] __do_IRQ+0x160/0x4a0
sp=e0000009c922fe30 bsp=e0000009c9221290
[<a000000100012290>] ia64_process_pending_intr+0x2b0/0x360
sp=e0000009c922fe30 bsp=e0000009c9221208
[<a0000001000112d0>] fixup_irqs+0xf0/0x2a0
sp=e0000009c922fe30 bsp=e0000009c92211a8
[<a00000010005bd80>] __cpu_disable+0x140/0x240
sp=e0000009c922fe30 bsp=e0000009c9221168
[<a0000001006c5870>] take_cpu_down+0x50/0xa0
sp=e0000009c922fe30 bsp=e0000009c9221148
[<a000000100122610>] stop_cpu+0xd0/0x200
sp=e0000009c922fe30 bsp=e0000009c92210f0
[<a0000001000e0440>] kthread+0xc0/0x140
sp=e0000009c922fe30 bsp=e0000009c92210c8
[<a000000100014ab0>] kernel_thread_helper+0xd0/0x100
sp=e0000009c922fe30 bsp=e0000009c92210a0
[<a00000010000a4c0>] start_kernel_thread+0x20/0x40
sp=e0000009c922fe30 bsp=e0000009c92210a0
I don't like this revert because it is fragile. ia64 is getting lucky
because we seem to only ever process timer interrupts in this path, but
if we ever race with an IPI here, we definitely use RCU and have the
potential of hitting an oops that Paul describes above.
Patching ia64's timer_interrupt() to check for NULL pt_regs is
insufficient though, as we still hit the above oops.
As a short term solution, I do think that this revert is the right
answer. The revert hold up under repeated testing (24+ hour test runs)
with this setup:
- 8-way rx6600
- randomly toggling CPU online/offline state every 2 seconds
- running CPU exercisers, memory hog, disk exercisers, and
network stressors
- average system load around ~160
In the long term, we really need to figure out why we set pt_regs = NULL
in ia64_process_pending_intr(). If it turns out that it is unnecessary
to do so, then we could safely re-introduce e7b14036 (along with some
other logic to be smarter about migrating interrupts).
One final note: x86 also removes the disabled CPU from cpu_online_map
and then re-enables interrupts for 1ms, presumably to handle any pending
interrupts:
arch/x86/kernel/irq_32.c (and irq_64.c):
cpu_disable_common:
[remove cpu from cpu_online_map]
fixup_irqs():
for_each_irq:
[break CPU affinities]
local_irq_enable();
mdelay(1);
local_irq_disable();
So they are doing implicitly what ia64 is doing explicitly.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
|
|
BTE_MAX_XFER is wrong. It is one greater than the number of cache
lines the BTE is actually able to transfer. If you request a transfer
of exactly BTE_MAX_XFER size, you trip a very cryptic BUG_ON() which
should certainly be made more clear.
This patch fixes that constant and also cleans up the BUG_ON()s in
arch/ia64/sn/kernel/bte.c to test one condition per line.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
|
|
ia64 only defines __early_pfn_to_nid() for SPARSEMEM && NUMA configurations,
so the recent:
commit: f2dbcfa738368c8a40d4a5f0b65dc9879577cb21
mm: clean up for early_pfn_to_nid()
ends up with some link problems for certain configuration files.
Fix arch/ia64/Kconfig to only define HAVE_ARCH_EARLY_PFN_TO_NID in the
cases where we do provide this function.
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
|
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5405/1: ep93xx: remove unused gesbc9312.h header
[ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC
[ARM] omap: fix clock reparenting in omap2_clk_set_parent()
[ARM] 5403/1: pxa25x_ep_fifo_flush() *ep->reg_udccs always set to 0
[ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()
[ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
[ARM] 5400/1: Add support for inverted rdy_busy pin for Atmel nand device controller
[ARM] 5391/1: AT91: Enable GPIO clocks earlier
[ARM] 5390/1: AT91: Watchdog fixes
[ARM] 5398/1: Add Wan ZongShun to MAINTAINERS for W90P910
[ARM] omap: fix _omap2_clksel_get_src_field()
[ARM] omap: fix omap2_divisor_to_clksel() error return value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
x86, mce: use force_sig_info to kill process in machine check
x86, mce: reinitialize per cpu features on resume
x86, rcu: fix strange load average and ksoftirqd behavior
|
|
Remove the gesbc9312.h header since it is unused.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
READ_IMPLIES_EXEC
READ_IMPLIES_EXEC must be set when:
o binary _is_ an executable stack (i.e. not EXSTACK_DISABLE_X)
o processor architecture is _under_ ARMv6 (XN bit is supported from ARMv6)
Signed-off-by: Makito SHIOKAWA <lkhmkt@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Standby memory detected with the sclp interface gets always registered
with add_memory calls without considering the limitationt that the
"mem=" kernel paramater implies.
So fix this and only register standby memory that is below the specified
limit.
This fixes zfcpdump since it uses "mem=32M". In case there is appr.
2GB standby memory present all of usable memory would be used for the
struct pages needed for standby memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
commit aa5e97ce4bbc9d5daeec16b1d15bb3f6b7b4f4d4
[PATCH] improve precision of process accounting.
Introduced a timing regression:
-bash-3.2# time ls
real 0m0.006s
user 0m1.754s
sys 0m1.094s
The problem was introduced by an error in cputime_to_timeval.
Cputime is now 1/4096 microsecond, therefore, we have to divide
the remainder with 4096 to get the microseconds.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
When changing the parent of a clock, it is necessary to keep the
clock use counts balanced otherwise things the parent state will
get corrupted. Since we already disable and re-enable the clock,
we might as well use the recursive versions instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
In the non highmem case, if two memory banks of 1GB each are provided,
the second bank would evade suppression since its virtual base would
be 0. Fix this by disallowing any memory bank which virtual base
address is found to be lower than PAGE_OFFSET.
Reported-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.
To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:
BUG_ON(page_zone(start_page) != page_zone(end_page));
Once I knew this is what was happening, I added some annotations:
if (unlikely(page_zone(start_page) != page_zone(end_page))) {
printk(KERN_ERR "move_freepages: Bogus zones: "
"start_page[%p] end_page[%p] zone[%p]\n",
start_page, end_page, zone);
printk(KERN_ERR "move_freepages: "
"start_zone[%p] end_zone[%p]\n",
page_zone(start_page), page_zone(end_page));
printk(KERN_ERR "move_freepages: "
"start_pfn[0x%lx] end_pfn[0x%lx]\n",
page_to_pfn(start_page), page_to_pfn(end_page));
printk(KERN_ERR "move_freepages: "
"start_nid[%d] end_nid[%d]\n",
page_to_nid(start_page), page_to_nid(end_page));
...
And here's what I got:
move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
move_freepages: start_nid[1] end_nid[0]
My memory layout on this box is:
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0x00000000 -> 0x0081ff5d
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[8] active PFN ranges
[ 0.000000] 0: 0x00000000 -> 0x00020000
[ 0.000000] 1: 0x00800000 -> 0x0081f7ff
[ 0.000000] 1: 0x0081f800 -> 0x0081fe50
[ 0.000000] 1: 0x0081fed1 -> 0x0081fed8
[ 0.000000] 1: 0x0081feda -> 0x0081fedb
[ 0.000000] 1: 0x0081fedd -> 0x0081fee5
[ 0.000000] 1: 0x0081fee7 -> 0x0081ff51
[ 0.000000] 1: 0x0081ff59 -> 0x0081ff5d
So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.
This patch:
Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
I think it makes fix-for-memmap-init not easy.
This patch moves all declaration to include/linux/mm.h
After this,
if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
-> Use static definition in include/linux/mm.h
else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
-> Use generic definition in mm/page_alloc.c
else
-> per-arch back end function will be called.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Impact: Bugfix
The ifdef for the apic clear on shutdown for the 64bit intel thermal
vector was incorrect and never triggered. Fix that.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
Impact: bug fix (with tolerant == 3)
do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.
Based on a earlier patch by Ying Huang who debugged the problem.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
Impact: Bug fix
This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.
Call the respective initialization functions on resume
v2: Remove ancient init because they don't have a resume device anyways.
Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
The GPIO interrupts can be configured as either level triggered or edge
triggered, with a default of level triggered. When an edge triggered
interrupt is requested, the gpio_irq_set_type method is called which
currently switches the given IRQ descriptor between two struct irq_chip
instances: orion_gpio_irq_level_chip and orion_gpio_irq_edge_chip. This
happens via __setup_irq() which also calls irq_chip_set_defaults() to
assign default methods to uninitialized ones. The problem is that
irq_chip_set_defaults() is called before the irq_chip reference is
switched, leaving the new irq_chip (orion_gpio_irq_edge_chip in this
case) with uninitialized methods such as chip->startup() causing a kernel
oops.
Many solutions are possible, such as making irq_chip_set_defaults() global
and calling it from gpio_irq_set_type(), or calling __irq_set_trigger()
before irq_chip_set_defaults() in __setup_irq(). But those require
modifications to the generic IRQ code which might have adverse effect on
other architectures, and that would still be a fragile arrangement.
Manually copying the missing methods from within gpio_irq_set_type()
would be really ugly and it would break again the day new methods with
automatic defaults are added.
A better solution is to have a single irq_chip instance which can deal
with both edge and level triggered interrupts. It is also a good idea
to switch the IRQ handler instead, as the edge IRQ handler allows for
one edge IRQ event to be queued as the IRQ is actually masked only when
that second IRQ is received, at which point the hardware can queue an
additional IRQ event, making edge triggered interrupts a bit more
reliable.
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
doc: mmiotrace.txt, buffer size control change
trace: mmiotrace to the tracer menu in Kconfig
mmiotrace: count events lost due to not recording
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, vm86: fix preemption bug
x86, olpc: fix model detection without OFW
x86, hpet: fix for LS21 + HPET = boot hang
x86: CPA avoid repeated lazy mmu flush
x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
x86/cpa: make sure cpa is safe to call in lazy mmu mode
x86, ptrace, mm: fix double-free on race
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/vsx: Fix VSX alignment handler for regs 32-63
powerpc/ps3: Move ps3_mm_add_memory to device_initcall
powerpc/mm: Fix numa reserve bootmem page selection
powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL
|
|
Impact: build fix, cleanup
A couple of arch setup callbacks were mistakenly in apic_32.c, breaking
the build.
Also simplify the code a bit.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: Flush volatile msrs before emulating rdmsr
KVM: Fix assigned devices circular locking dependency
KVM: x86: fix LAPIC pending count calculation
KVM: Fix INTx for device assignment
KVM: MMU: Map device MMIO as UC in EPT
KVM: x86: disable kvmclock on non constant TSC hosts
KVM: PIT: fix i8254 pending count read
KVM: Fix racy in kvm_free_assigned_irq
KVM: Add kvm_arch_sync_events to sync with asynchronize events
KVM: mmu_notifiers release method
KVM: Avoid using CONFIG_ in userspace visible headers
KVM: ia64: fix fp fault/trap handler
|
|
Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.
The function-graph trace Damien provided:
> 799.521187 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.521371 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.521555 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.521738 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.521934 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.522068 | 1) ksoftir-2324 | | rcu_check_callbacks() {
> 799.522208 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.522392 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.522575 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.522759 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.522956 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.523074 | 1) ksoftir-2324 | | rcu_check_callbacks() {
> 799.523214 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.523397 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.523579 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.523762 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.523960 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.524079 | 1) ksoftir-2324 | | rcu_check_callbacks() {
> 799.524220 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.524403 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.524587 | 1) <idle>-0 | | rcu_check_callbacks() {
> 799.524770 | 1) <idle>-0 | | rcu_check_callbacks() {
> [ . . . ]
Shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.
Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?
The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.
Reported-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Move the 32-bit extended-arch APIC drivers to arch/x86/kernel/apic/
too, and rename apic_64.c to probe_64.c.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
arch/x86/kernel/ is getting a bit crowded, and the APIC
drivers are scattered into various different files.
Move them to arch/x86/kernel/apic/*, and also remove
the 'gen' prefix from those which had it.
Also move APIC related functionality: the IO-APIC driver,
the NMI and the IPI code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|