summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2008-04-11asmlinkage_protect replaces prevent_tail_callRoland McGrath
The prevent_tail_call() macro works around the problem of the compiler clobbering argument words on the stack, which for asmlinkage functions is the caller's (user's) struct pt_regs. The tail/sibling-call optimization is not the only way that the compiler can decide to use stack argument words as scratch space, which we have to prevent. Other optimizations can do it too. Until we have new compiler support to make "asmlinkage" binding on the compiler's own use of the stack argument frame, we have work around all the manifestations of this issue that crop up. More cases seem to be prevented by also keeping the incoming argument variables live at the end of the function. This makes their original stack slots attractive places to leave those variables, so the compiler tends not clobber them for something else. It's still no guarantee, but it handles some observed cases that prevent_tail_call() did not. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10x86: Simplify cpu_idle_waitVenki Pallipadi
This patch also resolves hangs on boot: http://lkml.org/lkml/2008/2/23/263 http://bugzilla.kernel.org/show_bug.cgi?id=10093 The bug was causing once-in-few-reboots 10-15 sec wait during boot on certain laptops. Earlier commit 40d6a146629b98d8e322b6f9332b182c7cbff3df added smp_call_function in cpu_idle_wait() to kick cpus that are in tickless idle. Looking at cpu_idle_wait code at that time, code seemed to be over-engineered for a case which is rarely used (while changing idle handler). Below is a simplified version of cpu_idle_wait, which just makes a dummy smp_call_function to all cpus, to make them come out of old idle handler and start using the new idle handler. It eliminates code in the idle loop to handle cpu_idle_wait. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10pop previous section in alternative.cSteven Rostedt
gcc expects all toplevel assembly to return to the original section type. The code in alteranative.c does not do this. This caused some strange bugs in sched-devel where code would end up in the .rodata section and when the kernel sets the NX bit on all .rodata, the kernel would crash when executing this code. This patch adds a .previous marker to return the code back to the original section. Credit goes to Andrew Pinski for telling me it wasn't a gcc bug but a bug in the toplevel asm code in the kernel. ;-) Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-07x86: fix call to set_cyc2ns_scale() from time_cpufreq_notifier()Karsten Wiese
In time_cpufreq_notifier() the cpu id to act upon is held in freq->cpu. Use it instead of smp_processor_id() in the call to set_cyc2ns_scale(). This makes the preempt_*able() unnecessary and lets set_cyc2ns_scale() update the intended cpu's cyc2ns. Related mail/thread: http://lkml.org/lkml/2007/12/7/130 Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-07revert "x86: tsc prevent time going backwards"Ingo Molnar
revert: | commit 47001d603375f857a7fab0e9c095d964a1ea0039 | Author: Thomas Gleixner <tglx@linutronix.de> | Date: Tue Apr 1 19:45:18 2008 +0200 | | x86: tsc prevent time going backwards it has been identified to cause suspend regression - and the commit fixes a longstanding bug that existed before 2.6.25 was opened - so it can wait some more until the effects are better understood. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-06Fix booting pentium+ with dodgy TSCRusty Russell
We handle a broken tsc these days, so no need to panic. We clear the TSC bit when tsc_init decides it's unreliable (eg. under lguest w/ bad host TSC), leading to bogus panic. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04x86: revert assign IRQs to hpet timerThomas Gleixner
The commits: commit 37a47db8d7f0f38dac5acf5a13abbc8f401707fa Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers, fix and commit e3f37a54f690d3e64995ea7ecea08c5ab3070faf Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers have been identified to cause a regression on some platforms due to the assignement of legacy IRQs which makes the legacy devices connected to those IRQs disfunctional. Revert them. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: tsc prevent time going backwardsThomas Gleixner
We already catch most of the TSC problems by sanity checks, but there is a subtle bug which has been in the code for ever. This can cause time jumps in the range of hours. This was reported in: http://lkml.org/lkml/2007/8/23/96 and http://lkml.org/lkml/2008/3/31/23 I was able to reproduce the problem with a gettimeofday loop test on a dual core and a quad core machine which both have sychronized TSCs. The TSCs seems not to be perfectly in sync though, but the kernel is not able to detect the slight delta in the sync check. Still there exists an extremly small window where this delta can be observed with a real big time jump. So far I was only able to reproduce this with the vsyscall gettimeofday implementation, but in theory this might be observable with the syscall based version as well. CPU 0 updates the clock source variables under xtime/vyscall lock and CPU1, where the TSC is slighty behind CPU0, is reading the time right after the seqlock was unlocked. The clocksource reference data was updated with the TSC from CPU0 and the value which is read from TSC on CPU1 is less than the reference data. This results in a huge delta value due to the unsigned subtraction of the TSC value and the reference value. This algorithm can not be changed due to the support of wrapping clock sources like pm timer. The huge delta is converted to nanoseconds and added to xtime, which is then observable by the caller. The next gettimeofday call on CPU1 will show the correct time again as now the TSC has advanced above the reference value. To prevent this TSC specific wreckage we need to compare the TSC value against the reference value and return the latter when it is larger than the actual TSC value. I pondered to mark the TSC unstable when the readout is smaller than the reference value, but this would render an otherwise good and fast clocksource unusable without a real good reason. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86, agpgart: scary messages are fortunately obsoletePavel Machek
Fix obsolete printks in aperture-64. We used not to handle missing agpgart, but we handle it okay now. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: print message if nmi_watchdog=2 cannot be enabledIngo Molnar
right now if there's no CPU support for nmi_watchdog=2 we'll just refuse it silently. print a useful warning. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: fix nmi_watchdog=2 on Pentium-D CPUsIngo Molnar
implement nmi_watchdog=2 on this class of CPUs: cpu family : 15 model : 6 model name : Intel(R) Pentium(R) D CPU 3.00GHz the watchdog's ->setup() method is safe anyway, so if the CPU cannot support it we'll bail out safely. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-03x86 ptrace: avoid unnecessary wrmsrRoland McGrath
This avoids using wrmsr on MSR_IA32_DEBUGCTLMSR when it's not needed. No wrmsr ever needs to be done if noone has ever used block stepping. Without this change, using ptrace on 2.6.25 on an x86 KVM guest will tickle KVM's missing support for the MSR and crash the guest kernel. Though host KVM is the buggy one, this makes for a regression in the guest behavior from 2.6.24->2.6.25 that we can easily avoid. I also corrected some bad whitespace. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02vmcoreinfo: add the symbol "phys_base"Ken'ichi Ohmichi
Fix the problem that makedumpfile sometimes fails on x86_64 machine. This patch adds the symbol "phys_base" to a vmcoreinfo data. The vmcoreinfo data has the minimum debugging information only for dump filtering. makedumpfile (dump filtering command) gets it to distinguish unnecessary pages, and makedumpfile creates a small dumpfile. On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and CONFIG_RELOCATABLE=y, makedumpfile fails like the following: # makedumpfile -d31 /proc/vmcore dumpfile The kernel version is not supported. The created dumpfile may be incomplete. _exclude_free_page: Can't get next online node. makedumpfile Failed. # The cause is the lack of the symbol "phys_base" in a vmcoreinfo data. If the symbol "phys_base" does not exist, makedumpfile considers an x86_64 kernel as non relocatable. As the result, makedumpfile misunderstands the physical address where the kernel is loaded, and it cannot translate a kernel virtual address to physical address correctly. To fix this problem, this patch adds the symbol "phys_base" to a vmcoreinfo data. Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@kernel.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-27x86: ptrace.c: fix defined-but-unused warningsAndrew Morton
arch/x86/kernel/ptrace.c:548: warning: 'ptrace_bts_get_size' defined but not used arch/x86/kernel/ptrace.c:558: warning: 'ptrace_bts_read_record' defined but not used arch/x86/kernel/ptrace.c:607: warning: 'ptrace_bts_clear' defined but not used arch/x86/kernel/ptrace.c:617: warning: 'ptrace_bts_drain' defined but not used arch/x86/kernel/ptrace.c:720: warning: 'ptrace_bts_config' defined but not used arch/x86/kernel/ptrace.c:788: warning: 'ptrace_bts_status' defined but not used Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-26x86: fix trim mtrr not to setup_memory two timesYinghai Lu
we could call find_max_pfn() directly instead of setup_memory() to get max_pfn needed for mtrr trimming. otherwise setup_memory() is called two times... that is duplicated... [ mingo@elte.hu: both Thomas and me simulated a double call to setup_bootmem_allocator() and can confirm that it is a real bug which can hang in certain configs. It's not been reported yet but that is probably due to the relatively scarce nature of MTRR-trimming systems. ] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26x86: GEODE: add missing module.h includeAndres Salomon
On Wed, 26 Mar 2008 11:56:22 -0600 Jordan Crouse <jordan.crouse@amd.com> wrote: > On 26/03/08 14:31 +0100, Stefan Pfetzing wrote: > > Hello Jordan, > > > > I just tried to build your geodwdt driver for the geode watchdog. Therefore > > I pulled your repository from http://git.infradead.org/geode.git (or more, > > the git url). > > > > I tried to build the geodewdt driver as a module - which didn't work, and > > it failed with the same problem as earlier mentioned on lkmk [1]. I also > > checked the fix [2], but that seems to be already in your (or linus) tree - > > and so I'm unsure what the problem is. > > > > [1] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884074 > > [2] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884174 > > > > Building directly into the kernel seems to work. > > > > Maybe you have some idea? > > Hmm - that is strange. Exporting the symbols should work. I recommend > starting over with a clean tree. > > CCing Andres - any thoughts? > > Jordan > Er, yeah. The patch below should fix it. This should probably go into 2.6.25. Oops, EXPORT_SYMBOL_GPL wasn't being declared due to this header being missing. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26x86, cpufreq: fix Speedfreq-SMI call that clobbers ECXStephan Diestelhorst
I have found that using SMI to change the cpu's frequency on my DELL Latitude L400 clobbers the ECX register in speedstep_set_state, causing unneccessary retries because the "state" variable has changed silently (GCC assumes it is still present in ECX). play safe and avoid gcc caching any register across IO port accesses that trigger SMIs. Signed-off by: <Stephan.Diestelhorst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26x86: fix memoryless node oops during bootYinghai Lu
fix oops during boot reported in this thread: http://lkml.org/lkml/2008/2/6/65 enable booting on memoryless nodes. Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26x86: add dmi quirk for io_delayIngo Molnar
reported by mereandor@gmail.com, in: http://bugzilla.kernel.org/show_bug.cgi?id=6307 Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-26x86: convert mtrr/generic.c to kernel-docRandy Dunlap
Convert function comment blocks to kernel-doc notation. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-22x86: revert: reserve dma32 early for gartThomas Gleixner
Revert commit f62f1fc9ef94f74fda2b456d935ba2da69fa0a40 Author: Yinghai Lu <yhlu.kernel@gmail.com> Date: Fri Mar 7 15:02:50 2008 -0800 x86: reserve dma32 early for gart The patch has a dependency on bootmem modifications which are not .25 material that late in the -rc cycle. The problem which is addressed by the patch is limited to machines with 256G and more memory booted with NUMA disabled. This is not a .25 regression and the audience which is affected by this problem is very limited, so it's safer to do the revert than pulling in intrusive bootmem changes right now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: trim mtrr don't close gap for resource allocation.Yinghai Lu
fix the bug reported here: http://bugzilla.kernel.org/show_bug.cgi?id=10232 use update_memory_range() instead of add_memory_range() directly to avoid closing the gap. ( the new code only affects and runs on systems where the MTRR workaround triggers. ) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: fix reboot problem with Dell Optiplex 745, 0KW626 boardHeinz-Ado Arnolds
we have seen a little problem in rebooting Dell Optiplex 745 with the 0KW626 board. Here is a small patch enabling reboot with this board, which forces the default reboot path it into the BIOS reboot mode. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: fix fault_msg nul terminationJiri Slaby
The fault_msg text is not explictly nul terminated now in startup assembly. Do so by converting .ascii to .asciz. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: fix long standing bug with usb after hibernation with 4GB ramPavel Machek
aperture_64.c takes a piece of memory and makes it into iommu window... but such window may not be saved by swsusp -- that leads to oops during hibernation. Signed-off-by: Pavel Machek <pavel@suse.cz> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: hpet clock enable quirk on nVidia nForce 430Zbigniew Luszpinski
this patch allows hpet=force on nVidia nForce 430 southbridge. This patch was tested by me on my old Asus A8N-VM CSM (where bios does not support hpet and does not advertise it via acpi entry). My nForce430 version: lspci -nn | grep LPC 00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260] (rev a2) Kernel 2.6.24.3 after patching and using hpet=force reports this: dmesg | grep -i hpet Kernel command line: root=/dev/sda8 ro vga=773 video=vesafb:mtrr:4,ywrap vt.default_utf8=0 hpet=force Force enabled HPET at base address 0xfed00000 hpet clockevent registered Time: hpet clocksource has been installed. grep -i hpet /proc/timer_list Clock Event Device: hpet set_next_event: hpet_legacy_next_event set_mode: hpet_legacy_set_mode grep Clock /proc/timer_list (before patching) Clock Event Device: pit Clock Event Device: lapic grep Clock /proc/timer_list (after patching) Clock Event Device: hpet Clock Event Device: lapic Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: reserve dma32 early for gartYinghai Lu
a system with 256 GB of RAM, when NUMA is disabled crashes the following way: Your BIOS doesn't leave a aperture memory hole Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Cannot allocate aperture memory hole (ffff8101c0000000,65536K) Kernel panic - not syncing: Not enough memory for aperture Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33 Call Trace: [<ffffffff84037c62>] panic+0xb2/0x190 [<ffffffff840381fc>] ? release_console_sem+0x7c/0x250 [<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90 [<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50 [<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680 [<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310 [<ffffffff84506a2f>] ? _etext+0x0/0x1 [<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40 [<ffffffff847ac795>] mem_init+0x45/0x1a0 [<ffffffff8479ff35>] start_kernel+0x295/0x380 [<ffffffff8479f1c2>] _sinittext+0x1c2/0x230 the root cause is : memmap PMD is too big, [ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0 almost near 4G..., and vmemmap_alloc_block will use up the ram under 4G. solution will be: 1. make memmap allocation get memory above 4G... 2. reserve some dma32 range early before we try to set up memmap for all. and release that before pci_iommu_alloc, so gart or swiotlb could get some range under 4g limit for sure. the patch is using method 2. because method1 may need more code to handle SPARSEMEM and SPASEMEM_VMEMMAP will get Your BIOS doesn't leave a aperture memory hole Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Mapping aperture over 65536 KB of RAM @ 4000000 Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: add the DFF (Desktop Form Factor) Dell Optiplex 745 to the reboot ↵Coleman Kane
errata list We recently got some of the "Desktop Form Factor" Optiplex 745's in. I noticed that there's an entry for the SFF one's, but the BIOS model number of the DFF differs from that of the SFF. We have been reliably experiencing the same (as far as I can tell) reboot bug as the SFF boxes. Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: tight online check in setup_per_cpu_areasYinghai Lu
when numa disabled I got this compile warning: arch/x86/kernel/setup64.c: In function setup_per_cpu_areas: arch/x86/kernel/setup64.c:147: warning: the address of contig_page_data will always evaluate as true it seems we missed checking if the node is online before we try to refer NODE_DATA. Fix it. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-21x86: fix dma_alloc_pagesYinghai Lu
memory-less node support: this patch uses updated dev_to_node, because dev_to_node already makes sure it returns an online node. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-11x86: ia32 syscall restart fixRoland McGrath
The code to restart syscalls after signals depends on checking for a negative orig_ax, and for particular negative -ERESTART* values in ax. These fields are 64 bits and for a 32-bit task they get zero-extended. The syscall restart behavior is lost, a regression from a native 32-bit kernel and from 64-bit tasks' behavior. This patch fixes the problem by doing sign-extension where it matters. For orig_ax, the only time the value should be -1 but winds up as 0x0ffffffff is via a 32-bit ptrace call. So the patch changes ptrace to sign-extend the 32-bit orig_eax value when it's stored; it doesn't change the checks on orig_ax, though it uses the new current_syscall() inline to better document the subtle importance of the used of signedness there. The ax value is stored a lot of ways and it seems hard to get them all sign-extended at their origins. So for that, we use the current_syscall_ret() to sign-extend it only for 32-bit tasks at the time of the -ERESTART* comparisons. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-08x86_64: make ptrace always sign-extend orig_ax to 64 bitsRoland McGrath
This makes 64-bit ptrace calls setting the 64-bit orig_ax field for a 32-bit task sign-extend the low 32 bits up to 64. This matches what a 64-bit debugger expects when tracing a 32-bit task. This follows on my "x86_64 ia32 syscall restart fix". This didn't matter until that was fixed. The debugger ignores or zeros the high half of every register slot it sets (including the orig_rax pseudo-register) uniformly. It expects that the setting of the low 32 bits always has the same meaning as a 32-bit debugger setting those same 32 bits with native 32-bit facilities. This never arose before because the syscall restart check never matched any -ERESTART* values due to lack of sign extension. Before that fix, even 32-bit ptrace setting orig_eax to -1 failed to trigger the restart check anyway. So this was never noticed as a regression of 64-bit debuggers vs 32-bit debuggers on the same 64-bit kernel. Signed-off-by: Roland McGrath <roland@redhat.com> [ Changed to just do the sign-extension unconditionally on x86-64, since orig_ax is always just a small integer and doesn't need the full 64-bit range ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-07x86: re-add reboot fixupsIngo Molnar
Jan Beulich noticed that the reboot fixups went missing during reboot.c unification. (commit 4d022e35fd7e07c522c7863fee6f07e53cf3fc14) Geode and a few other rare boards with special reboot quirks are affected. Reported-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07x86: fix typo in step.cJan Beulich
TIF_DEBUGCTLMSR has no meaning in the actual MSR... Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07x86: fix merge mistake in i387.cJan Beulich
convert_fxsr_to_user() in 2.6.24's i387_32.c did this, and convert_to_fxsr() also does the inverse, so I assume it's an oversight that it is no longer being done. [ mingo@elte.hu: we encode it this way because there's no space for the 'FPU Last Instruction Opcode' (->fop) field in the legacy user_i387_ia32_struct that PTRACE_GETFPREGS/PTRACE_SETFPREGS uses. it's probably pure legacy - i'd be surprised if any user-space relied on the FPU Last Opcode in any way. But indeed we used to do it previously so the most conservative thing is to preserve that piece of information. ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07x86: clear DF before calling signal handlerAurelien Jarno
The Linux kernel currently does not clear the direction flag before calling a signal handler, whereas the x86/x86-64 ABI requires that. Linux had this behavior/bug forever, but this becomes a real problem with gcc version 4.3, which assumes that the direction flag is correctly cleared at the entry of a function. This patches changes the setup_frame() functions to clear the direction before entering the signal handler. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com>
2008-03-05[CPUFREQ] Remove debugging message from e_powersaverDave Jones
We don't need to printk a message every time we transition. Leave the code there, but ifdef'd out, as it's useful when adding support for new processors. Reported-by: Petr Titěra <P.Titera@century.cz> Signed-off-by: Dave Jones <davej@redhat.com>
2008-03-04x86, i387: fix ptrace leakage using init_fpu()Suresh Siddha
This bug got introduced by the recent i387 merge: commit 4421011120b2304e5c248ae4165a2704588aedf1 Author: Roland McGrath <roland@redhat.com> Date: Wed Jan 30 13:31:50 2008 +0100 x86: x86 i387 user_regset Current usage of unlazy_fpu() in ptrace specific routines is wrong. unlazy_fpu() will not init fpu if the task never used math. So the ptrace calls can expose the parent tasks FPU data in some cases. Replace it with the init_fpu() which will init the math state, if the task never used math before. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-29x86: disable BTS ptrace extensions for nowIngo Molnar
revert the BTS ptrace extension for now. based on general objections from Roland McGrath: http://lkml.org/lkml/2008/2/21/323 we'll let the BTS functionality cook some more and re-enable it in v2.6.26. We'll leave the dead code around to help the development of this code. (X86_BTS is not defined at the moment) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29x86: delay the export removal of init_mmIngo Molnar
delay the removal of this symbol export by one more kernel release, giving external modules such as VirtualBox a chance to stop using it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29x86: restore vsyscall64 prochandlerThomas Gleixner
a recent fix: commit ce28b9864b853803320c3f1d8de1b81aa4120b14 Author: Thomas Gleixner <tglx@linutronix.de> Date: Wed Feb 20 23:57:30 2008 +0100 x86: fix vsyscall wreckage removed the broken /kernel/vsyscall64 handler completely. This triggers the following debug check: sysctl table check failed: /kernel/vsyscall64 No proc_handler Restore the sane part of the proc handler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29x86: tls prevent_tail_callRoland McGrath
Fix a kernel bug (vmware boot problem) reported by Tomasz Grobelny, which occurs with certain .config variants and gccs. The x86 TLS cleanup in commit efd1ca52d04d2f6df337a3332cee56cd60e6d4c4 made the sys_set_thread_area and sys_get_thread_area functions ripe for tail call optimization. If the compiler chooses to use it for them, it can clobber the user trap frame because these are asmlinkage functions. Reported-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: fix boot failure on 486 due to TSC breakageMikael Pettersson
> Diffing dmesg between git7 and git8 doesn't sched any light since > git8 also removed the printouts of the x86 caps as they were being > initialised and updated. I'm currently adding those printouts back > in the hope of seeing where and when the caps get broken. That turned out to be very illuminating: --- dmesg-2.6.24-git7 2008-02-24 18:01:25.295851000 +0100 +++ dmesg-2.6.24-git8 2008-02-24 18:01:25.530358000 +0100 ... CPU: After generic identify, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000 CPU: After all inits, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000 +CPU: After applying cleared_cpu_caps, caps: 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000 Notice how the TSC cap bit goes from Off to On. (The first two lines are printout loops from -git7 forward-ported to -git8, the third line is the same printout loop added just after the xor-with-cleared_cpu_caps[] loop.) Here's how the breakage occurs: 1. arch/x86/kernel/tsc_32.c:tsc_init() sees !cpu_has_tsc, so bails and calls setup_clear_cpu_cap(X86_FEATURE_TSC). 2. include/asm-x86/cpufeature.h:setup_clear_cpu_cap(bit) clears the bit in boot_cpu_data and sets it in cleared_cpu_caps 3. arch/x86/kernel/cpu/common.c:identify_cpu() XORs all caps in with cleared_cpu_caps HOWEVER, at this point c->x86_capability correctly has TSC Off, cleared_cpu_caps has TSC On, so the XOR incorrectly sets TSC to On in c->x86_capability, with disastrous results. The real bug is that clearing bits with XOR only works if the bits are known to be 1 prior to the XOR, and that's not true here. A simple fix is to convert the XOR to AND-NOT instead. The following patch does that, and allows my 486 to boot 2.6.25-rc kernels again. [ mingo@elte.hu: fixed a similar bug in setup_64.c as well. ] The breakage was introduced via commit 7d851c8d3db0. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: make c_idle.work have a static address.Glauber Costa
Currently, c_idle is declared in the stack, and thus, have no static address. Peter Zijlstra points out this simple solution, in which c_idle.work is initializated separatedly. Note that the INIT_WORK macro has a static declaration of a key inside. Signed-off-by: Glauber Costa <gcosta@redhat.com> Acked-by: Peter Zijlstra <pzijlstr@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: don't save unreliable stack trace entriesVegard Nossum
Currently, there is no way for print_stack_trace() to determine whether a given stack trace entry was deemed reliable or not, simply because save_stack_trace() does not record this information. (Perhaps needless to say, this makes the saved stack traces A LOT harder to read, and probably with no other benefits, since debugging features that use save_stack_trace() most likely also require frame pointers, etc.) This patch reverts to the old behaviour of only recording the reliable trace entries for saved stack traces. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: don't make swapper_pg_pmd globalAdrian Bunk
There doesn't seem to be any reason for swapper_pg_pmd being global. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: don't print a warning when MTRR are blank and running in KVMJoerg Roedel
Inside a KVM virtual machine the MTRRs are usually blank. This confuses Linux and causes a warning message at boot. This patch removes that warning message when running Linux as a KVM guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: fix execve with -fstack-protectIngo Molnar
pointed out by pageexec@freemail.hu: > what happens here is that gcc treats the argument area as owned by the > callee, not the caller and is allowed to do certain tricks. for ssp it > will make a copy of the struct passed by value into the local variable > area and pass *its* address down, and it won't copy it back into the > original instance stored in the argument area. > > so once sys_execve returns, the pt_regs passed by value hasn't at all > changed and its default content will cause a nice double fault (FWIW, > this part took me the longest to debug, being down with cold didn't > help it either ;). To fix this we pass in pt_regs by pointer. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: fix vsyscall wreckageThomas Gleixner
based on a report from Arne Georg Gleditsch about user-space apps misbehaving after toggling /proc/sys/kernel/vsyscall64, a review of the code revealed that the "NOP patching" done there is fundamentally unsafe for a number of reasons: 1) the patching code runs without synchronizing other CPUs 2) it inserts NOPs even if there is no clock source which provides vread 3) when the clock source changes to one without vread we run in exactly the same problem as in #2 4) if nobody toggles the proc entry from 1 to 0 and to 1 again, then the syscall is not patched out as a result it is possible to break user-space via this patching. The only safe thing for now is to remove the patching. This code was broken since v2.6.21. Reported-by: Arne Georg Gleditsch <arne.gleditsch@dolphinics.no> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZEIngo Molnar
The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel text but data, bss and init sections as well. That name led me on the wrong path with the KERNEL_TEXT_SIZE regression, because i knew how big of _text_ my images have and i knew about the 40 MB "text" limit so i wrongly thought to be on the safe side of the 40 MB limit with my 29 MB of text, while the total image size was slightly above 40 MB. Signed-off-by: Ingo Molnar <mingo@elte.hu>