summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
2009-12-05Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Limit number of per cpu TSC sync messages x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes x86: Suppress stack overrun message for init_task x86: Fix cpu_devs[] initialization in early_cpu_init() x86: Remove CPU cache size output for non-Intel too x86: Minimise printk spew from per-vendor init code x86: Remove the CPU cache size printk's cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c x86: Make sure we also print a Code: line for show_regs()
2009-12-05Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) x86, apic: Enable lapic nmi watchdog on AMD Family 11h x86: Remove unnecessary mdelay() from cpu_disable_common() x86, ioapic: Document another case when level irq is seen as an edge x86, ioapic: Fix the EOI register detection mechanism x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq x86: SGI UV: Map low MMR ranges x86: apic: Print out SRAT table APIC id in hex x86: Re-get cfg_new in case reuse/move irq_desc x86: apic: Remove not needed #ifdef x86: io-apic: IO-APIC MMIO should not fail on resource insertion x86: Remove asm/apicnum.h x86: apic: Do not use stacked physid_mask_t x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit x86, ioapic: Use snrpintf while set names for IO-APIC resourses x86, apic: Use PAGE_SIZE instead of numbers x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() x86: Use EOI register in io-apic on intel platforms x86: Force irq complete move during cpu offline x86: Remove move_cleanup_count from irq_cfg x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping ...
2009-12-05Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits) x86: Fix comments of register/stack access functions perf tools: Replace %m with %a in sscanf hw-breakpoints: Keep track of user disabled breakpoints tracing/syscalls: Make syscall events print callbacks static tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook perf: Don't free perf_mmap_data until work has been done perf_event: Fix compile error perf tools: Fix _GNU_SOURCE macro related strndup() build error trace_syscalls: Remove unused syscall_name_to_nr() trace_syscalls: Simplify syscall profile trace_syscalls: Remove duplicate init_enter_##sname() trace_syscalls: Add syscall_nr field to struct syscall_metadata trace_syscalls: Remove enter_id exit_id trace_syscalls: Set event_enter_##sname->data to its metadata trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit perf_event: Initialize data.period in perf_swevent_hrtimer() perf probe: Simplify event naming perf probe: Add --list option for listing current probe events perf probe: Add argv_split() from lib/argv_split.c perf probe: Move probe event utility functions to probe-event.c ...
2009-12-03Merge branch 'perf/mce' into perf/coreIngo Molnar
Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03x86, apic: Enable lapic nmi watchdog on AMD Family 11hMikael Pettersson
The x86 lapic nmi watchdog does not recognize AMD Family 11h, resulting in: NMI watchdog: CPU not supported As far as I can see from available documentation (the BKDM), family 11h looks identical to family 10h as far as the PMU is concerned. Extending the check to accept family 11h results in: Testing NMI watchdog ... OK. I've been running with this change on a Turion X2 Ultra ZM-82 laptop for a couple of weeks now without problems. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: <stable@kernel.org> LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-26x86, mce: Add __cpuinit to hotplug callback functionsHidetoshi Seto
The mce_disable_cpu() and mce_reenable_cpu() are called only from mce_cpu_callback() which is marked as __cpuinit. So these functions can be __cpuinit too. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4B0E3C4E.4090809@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24perf_events, x86: Fix validate_event bugStephane Eranian
The validate_event() was failing on valid event combinations. The function was assuming that if x86_schedule_event() returned 0, it meant error. But x86_schedule_event() returns the counter index and 0 is a perfectly valid value. An error is returned if the function returns a negative value. Furthermore, validate_event() was also failing for event groups because the event->pmu was not set until after hw_perf_event_init(). Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net Cc: eranian@gmail.com LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> -- arch/x86/kernel/cpu/perf_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
2009-11-23x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizesBorislav Petkov
display_cacheinfo() doesn't display anything anymore and it is used to detect CPU cache sizes. Rename it accordingly. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <20091121130145.GA31357@liondog.tnic> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-11-23perf events: Do not generate function trace entries in perf codeIngo Molnar
Decreases perf overhead when function tracing is enabled, by about 50%. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-18[CPUFREQ] speedstep-ich: fix error caused by ↵Rusty Russell
394122ab144dae4b276d74644a2f11c44a60ac5c "[CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.c" changed the code to mistakenly pass the current cpu as the "processor" argument of speedstep_get_frequency(), whereas it should be the type of the processor. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14340 Based on a patch by Dave Mueller. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Dominik Brodowski <linux@brodo.de> Reported-by: Dave Mueller <dave.mueller@gmx.ch> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-18[CPUFREQ] acpi-cpufreq: blacklist Intel 0f68: Fix HT detection and put in ↵John Villalovos
notification message Removing the SMT/HT check, since the Errata doesn't mention Hyper-Threading. Adding in a printk, so that the user knows why acpi-cpufreq refuses to load. Also, once system is blacklisted, don't repeat checks to see if blacklisted. This also causes the message to only be printed once, rather than for each CPU. Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-18[CPUFREQ] powernow-k8: Fix test in get_transition_latency()Roel Kluin
Not makes it a bool before the comparison. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-18[CPUFREQ] longhaul: select Longhaul version 2 for capable CPUsKrzysztof Helt
There is a typo in the longhaul detection code so only Longhaul v1 or Longhaul v3 is selected. The Longhaul v2 is not selected even for CPUs which are capable of. Tested on PCChips Giga Pro board. Frequency changes work and the Longhaul v2 detects that the board is not capable of changing CPU voltage. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-15Merge commit 'v2.6.32-rc7' into perf/coreIngo Molnar
Merge reason: pick up perf fixlets Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14x86: Fix cpu_devs[] initialization in early_cpu_init()Ingo Molnar
Yinghai Lu noticed that this commit: 0388423: x86: Minimise printk spew from per-vendor init code mistakenly left out the initialization of cpu_devs[] in the !PROCESSOR_SELECT case. Fix it. Reported-by: Yinghai Lu <yinghai@kernel.org> Cc: Dave Jones <davej@redhat.com> LKML-Reference: <20091113203000.GA19160@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14x86: Remove CPU cache size output for non-Intel tooRoland Dreier
As Dave Jones said about the output in intel_cacheinfo.c: "They aren't useful, and pollute the dmesg output a lot (especially on machines with many cores). Also the same information can be trivially found out from userspace." Give the generic display_cacheinfo() function the same treatment. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Dave Jones <davej@redhat.com> Cc: Mike Travis <travis@sgi.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <adaocn6dp99.fsf_-_@roland-alpha.cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14x86: Minimise printk spew from per-vendor init codeDave Jones
In the default case where the kernel supports all CPU vendors, we currently print out a bunch of not useful messages on every system. 32-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC 64-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls Given that "what CPUs does the kernel support" isn't useful for the "support everything" case, we can suppress these printk's. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20091113203000.GA19160@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-13x86: Remove the CPU cache size printk'sDave Jones
They aren't really useful, and they pollute the dmesg output a lot (especially on machines with many cores). Also the same information can be trivially found out from userspace. Reported-by: Mike Travis <travis@sgi.com> Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091112231542.GA7129@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-12perf_event, x86: Annotate init functions and dataHiroshi Shimamoto
Annotate init functions and data with __init and __initconst. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <4AFB721E.8070203@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-12x86, mce: Fix __init annotationsHidetoshi Seto
The intel_init_thermal() is called from resume path, so it cannot be marked as __init. OTOH mce_banks_init() is only called from __mcheck_cpu_cap_init() which is marked as __cpuinit, so it can be also marked as __cpuinit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Yong Wang <yong.y.wang@linux.intel.com> LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11x86: Mark the thermal init functions __initYong Wang
Mark the thermal init functions __init so that the init memory can be freed. Signed-off-by: Yong Wang <yong.y.wang@intel.com> LKML-Reference: <20091111075125.GA17900@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP valueYong Wang
On platforms where the BIOS handles the thermal monitor interrupt, APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS must not touch it. Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all the LVT entries except the mask bit. Essentially this results in all LVT entries including the thermal monitoring interrupt set to masked (clearing the bios programmed value for APIC_LVTTHMR). And this leads to kernel take over the thermal monitoring interrupt on AP's but not on BSP (leaving the bios programmed value only on BSP). As a result of this, we have seen system hangs when the thermal monitoring interrupt is generated. Fix this by reading the initial value of thermal LVT entry on BSP and if bios has taken over the control, then program the same value on all AP's and leave the thermal monitoring interrupt control on all the logical cpu's to the bios. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org
2009-11-04Merge commit 'v2.6.32-rc6' into perf/coreIngo Molnar
Conflicts: tools/perf/Makefile Merge reason: Resolve the conflict, merge to upstream and merge in perf fixes so we can add a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02x86: Fix printk message typo in mtrr cleanup codeDave Jones
Trivial typo. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16x86, mce: Add a global MCE init helperBorislav Petkov
Add an early initcall (pre SMP) which sets up global MCE functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16x86, mce: Fix up MCE naming nomenclatureBorislav Petkov
Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16Merge branches 'x86/mce' and 'x86/urgent' into perf/mceIngo Molnar
Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16x86: Don't print number of MCE banks for every CPURoland Dreier
The MCE initialization code explicitly says it doesn't handle asymmetric configurations where different CPUs support different numbers of MCE banks, and it prints a big warning in that case. Therefore, printing the "mce: CPU supports <x> MCE banks" message into the kernel log for every CPU is pure redundancy that clutters the log significantly for systems with lots of CPUs. Signed-off-by: Roland Dreier <rolandd@cisco.com> LKML-Reference: <adaeip473qt.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13perf_event, x86, mce: Use TRACE_EVENT() for MCE loggingHidetoshi Seto
This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in one, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events. We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might chose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there's in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrary deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There's various parsers that can be used. - Lots of scheduling/buffering/batching modes of operandi for MCE events. poll() support. mmap() support. read() support. You name it. - Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13Merge commit 'v2.6.32-rc4' into perf/coreIngo Molnar
Merge reason: we were on an -rc1 base, merge up to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13perf_events, x86: Fix event constraints codeIngo Molnar
There was namespace overlap due to a rename i did - this caused the following build warning, reported by Stephen Rothwell against linux-next x86_64 allmodconfig: arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx': arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function This is a real bug not just a warning: fix it by renaming the global event-constraints table pointer to 'event_constraints'. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Stephane Eranian <eranian@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12mce, edac: Use an atomic notifier for MCEs decodingBorislav Petkov
Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-11headers: remove sched.h from interrupt.hAlexey Dobriyan
After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-09perf, x86: Add simple group validationPeter Zijlstra
Refuse to add events when the group wouldn't fit onto the PMU anymore. Naive implementation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <1254911461.26976.239.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09perf_events: Add event constraints support for Intel processorsStephane Eranian
On some Intel processors, not all events can be measured in all counters. Some events can only be measured in one particular counter, for instance. Assigning an event to the wrong counter does not crash the machine but this yields bogus counts, i.e., silent error. This patch changes the event to counter assignment logic to take into account event constraints for Intel P6, Core and Nehalem processors. There is no contraints on Intel Atom. There are constraints on Intel Yonah (Core Duo) but they are not provided in this patch given that this processor is not yet supported by perf_events. As a result of the constraints, it is possible for some event groups to never actually be loaded onto the PMU if they contain two events which can only be measured on a single counter. That situation can be detected with the scaling information extracted with read(). Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09perf_events: Check for filters on fixed counter eventsStephane Eranian
Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Reject filters are: inv, edge, cnt-mask. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02x86: Simplify bound checks in the MTRR codeArjan van de Ven
The current bound checks for copy_from_user in the MTRR driver are not as obvious as they could be, and gcc agrees with that. This patch simplifies the boundary checks to the point that gcc can now prove to itself that the copy_from_user() is never going past its bounds. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20090926205150.30797709@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02x86: EDAC: MCE: Fix MCE decoding callback logicIngo Molnar
Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30Revert "x86, mce: do not compile mcelog message on AMD"Linus Torvalds
This reverts commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9, as requested by Andi Kleen: "Obviously kernels compiled with AMD support can still run on non AMD systems, so messages like this can never be removed at compile time." Requsted-by: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-26Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Dont use openat() perf tools: Fix buffer allocation perf tools: .gitignore += perf*.html perf tools: Handle relative paths while loading module symbols perf tools: Fix module symbol loading bug perf_event, x86: Fix 'perf sched record' crashing the machine perf_event: Update PERF_EVENT_FORK header definition perf stat: Fix zero total printouts
2009-09-24Merge branch 'linus' into x86/urgentIngo Molnar
Merge reason: Queueing up dependent early-printk fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-23x86: mce: Use safer ways to access MCE registersIngo Molnar
Use rdmsrl_safe() when accessing MCE registers. While in theory we always 'know' which ones are safe to access from the capability bits, there's a lot of hardware variations and reality might differ from theory, as it did in this case: http://bugzilla.kernel.org/show_bug.cgi?id=14204 [ 0.010016] mce: CPU supports 5 MCE banks [ 0.011029] general protection fault: 0000 [#1] [ 0.011998] last sysfs file: [ 0.011998] Modules linked in: [ 0.011998] [ 0.011998] Pid: 0, comm: swapper Not tainted (2.6.31_router #1) HP Vectra [ 0.011998] EIP: 0060:[<c100d9b9>] EFLAGS: 00010246 CPU: 0 [ 0.011998] EIP is at mce_rdmsrl+0x19/0x60 [ 0.011998] EAX: 00000000 EBX: 00000001 ECX: 00000407 EDX: 08000000 [ 0.011998] ESI: 00000000 EDI: 8c000000 EBP: 00000405 ESP: c17d5eac So WARN_ONCE() instead of crashing the box. ( also fix a number of stylistic inconsistencies in the code. ) Note, we might still crash in wrmsrl() if we get that far, but we shouldnt if the registers are truly inaccessible. Reported-by: GNUtoo <GNUtoo@no-log.org> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <bug-14204-5438@http.bugzilla.kernel.org/> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-23perf_event, x86: Fix 'perf sched record' crashing the machinePeter Zijlstra
Chris Malley reported that 'perf sched record' sometimes crashes his box with: [ 389.272175] BUG: unable to handle kernel paging request at ffffb300 [ 389.272294] IP: [<c011b0bd>] default_send_IPI_self+0x1d/0x50 [ 389.272366] *pde = 0073f067 *pte = 00000000 [ 389.274708] Call Trace: [ 389.274752] [<c010e3b4>] ? set_perf_event_pending+0x14/0x20 [ 389.274801] [<c01b9751>] ? perf_output_unlock+0x121/0x1a0 [ 389.274848] [<c01b981a>] ? perf_output_end+0x4a/0x70 [ 389.274893] [<c01ba690>] ? __perf_event_overflow+0x240/0x2f0 [ 389.274942] [<c030963e>] ? atomic64_cmpxchg+0x1e/0x30 [ 389.274988] [<c01ba8f4>] ? perf_swevent_ctx_event+0x1b4/0x1c0 [ 389.275035] [<c01ba773>] ? perf_swevent_ctx_event+0x33/0x1c0 [ 389.275081] [<c01ba9a7>] ? do_perf_sw_event+0xa7/0x160 [ 389.275127] [<c01baae2>] ? perf_tp_event+0x82/0xa0 [ 389.275174] [<c012e9c6>] ? ftrace_profile_sched_stat_runtime+0xe6/0x120 [ 389.275224] [<c012e8e0>] ? ftrace_profile_sched_stat_runtime+0x0/0x120 [ 389.275273] [<c013c85a>] ? update_curr+0x18a/0x230 [ 389.275318] [<c013cdc5>] ? put_prev_task_fair+0x155/0x160 [ 389.275366] [<c01618b5>] ? sched_clock_cpu+0xd5/0x110 [ 389.275413] [<c04e7525>] ? _spin_lock_irq+0x45/0x50 [ 389.275458] [<c04e424e>] ? schedule+0x20e/0xb10 The problem is that the box has no lapic enabled: [ 0.042445] Local APIC not detected. Using dummy APIC emulation. The below seems like the best fix. We disabled all lapic bits, except the self-IPI-resend logic. Reported-by: Chris Malley <mail@chrismalley.co.uk> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <7863dc4c0909221409v7893bfd3o4b590d5951a233ba@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22x86: mce, inject: Use real inject-msg in raise_localHuang Ying
Current raise_local() uses a struct mce that comes from mce_write() as a parameter instead of the real inject-msg, so when we set mce.finished = 0 to clear injected MCE, the real inject stays valid. This will cause the remaining inject-msg affect the next injection, which is not desired. To fix this, real inject-msg is used in raise_local instead of the one on the stack. This patch is based on the diagnosis and the fixes by Dean Nelson. Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <1253601357.15717.757.camel@yhuang-dev.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22x86: mce: Fix thermal throttling message stormIngo Molnar
If a system switches back and forth between hot and cold mode, the MCE code will print a stream of critical kernel messages. Extend the throttling code to properly notice this, by only printing the first hot + cold transition and omitting the rest up to CHECK_INTERVAL (5 minutes). This way we'll only get a single incident of: [ 102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1) [ 102.357000] Disabling lock debugging due to kernel taint [ 102.369223] CPU0: Temperature/speed normal Every 5 minutes. The 'total events' count tells the number of cold/hot transitions detected, should overheating occur after 5 minutes again: [ 402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891) [ 402.358001] CPU0: Temperature/speed normal [ 450.704142] Machine check events logged Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22x86: mce: Clean up thermal throttling state tracking codeIngo Molnar
Instead of a mess of three separate percpu variables, consolidate the state into a single structure. Also clean up therm_throt_process(), use cleaner and more understandable variable names and a clearer logic. This, without changing the logic, makes the code more streamlined, more readable and smaller as well: text data bss dec hex filename 1487 169 4 1660 67c therm_throt.o.before 1432 176 4 1612 64c therm_throt.o.after Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) trivial: fix typo in aic7xxx comment trivial: fix comment typo in drivers/ata/pata_hpt37x.c trivial: typo in kernel-parameters.txt trivial: fix typo in tracing documentation trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c trivial: remove unnecessary semicolons trivial: Fix duplicated word "options" in comment trivial: kbuild: remove extraneous blank line after declaration of usage() trivial: improve help text for mm debug config options trivial: doc: hpfall: accept disk device to unload as argument trivial: doc: hpfall: reduce risk that hpfall can do harm trivial: SubmittingPatches: Fix reference to renumbered step trivial: fix typos "man[ae]g?ment" -> "management" trivial: media/video/cx88: add __init/__exit macros to cx88 drivers trivial: fix typo in CONFIG_DEBUG_FS in gcov doc trivial: fix missing printk space in amd_k7_smp_check trivial: fix typo s/ketymap/keymap/ in comment trivial: fix typo "to to" in multiple files trivial: fix typos in comments s/DGBU/DBGU/ ...
2009-09-21Merge branch 'perfcounters-rename-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Tidy up after the big rename perf: Do the big rename: Performance Counters -> Performance Events perf_counter: Rename 'event' to event_id/hw_event perf_counter: Rename list_entry -> group_entry, counter_list -> group_list Manually resolved some fairly trivial conflicts with the tracing tree in include/trace/ftrace.h and kernel/trace/trace_syscalls.c.
2009-09-21Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter, powerpc, sparc: Fix compilation after perf_counter_overflow() change perf_counter: x86: Fix PMU resource leak perf util: SVG performance improvements perf util: Make the timechart SVG width dynamic perf timechart: Show the duration of scheduler delays in the SVG perf timechart: Show the name of the waker/wakee in timechart
2009-09-21Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Print the hypervisor returned tsc_khz during boot x86: Correct segment permission flags in 64-bit linker script x86: cpuinit-annotate SMP boot trampolines properly x86: Increase timeout for EHCI debug port reset completion in early printk x86: Fix uaccess_32.h typo x86: Trivial whitespace cleanups x86, apic: Fix missed handling of discrete apics x86/i386: Remove duplicated #include x86, mtrr: Convert loop to a while based construct, avoid naked semicolon Revert 'x86: Fix system crash when loading with "reservetop" parameter' x86, mce: Fix compile warning in case of CONFIG_SMP=n x86, apic: Use logical flat on intel with <= 8 logical cpus x86: SGI UV: Map MMIO-High memory range x86: SGI UV: Add volatile semantics to macros that access chipset registers x86: SGI UV: Fix IPI macros x86: apic: Convert BUG() to BUG_ON() x86: Remove final bits of CONFIG_X86_OLD_MCE