summaryrefslogtreecommitdiff
path: root/arch/x86/events/intel/core.c
AgeCommit message (Collapse)Author
2017-07-05perf/x86/intel: Use ULL constant to prevent undefined shift behaviourColin King
[ Upstream commit ad5013d5699d30ded0cdbbc68b93b2aa28222c6e ] When x86_pmu.num_counters is 32 the shift of the integer constant 1 is exceeding 32bit and therefor undefined behaviour. Fix this by shifting 1ULL instead of 1. Reported-by: CoverityScan CID#1192105 ("Bad bit shift operation") Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: http://lkml.kernel.org/r/20170111114310.17928-1-colin.king@canonical.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05perf/x86/intel: Handle exclusive threadid correctly on CPU hotplugZhou Chengming
[ Upstream commit 4e71de7986386d5fd3765458f27d612931f27f5e ] The CPU hotplug function intel_pmu_cpu_starting() sets cpu_hw_events.excl_thread_id unconditionally to 1 when the shared exclusive counters data structure is already availabe for the sibling thread. This works during the boot process because the first sibling gets threadid 0 assigned and the second sibling which shares the data structure gets 1. But when the first thread of the core is offlined and onlined again it shares the data structure with the second thread and gets exclusive thread id 1 assigned as well. Prevent this by checking the threadid of the already online thread. [ tglx: Rewrote changelog ] Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com> Cc: NuoHan Qiao <qiaonuohan@huawei.com> Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: kan.liang@intel.com Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: qiaonuohan@huawei.com Cc: davidcc@google.com Cc: guohanjun@huawei.com Link: http://lkml.kernel.org/r/1484536871-3131-1-git-send-email-zhouchengming1@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-29perf/x86/intel: Add 1G DTLB load/store miss support for SKLKan Liang
commit fb3a5055cd7098f8d1dd0cd38d7172211113255f upstream. Current DTLB load/store miss events (0x608/0x649) only counts 4K,2M and 4M page size. Need to extend the events to support any page size (4K/2M/4M/1G). The complete DTLB load/store miss events are: DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08 DTLB_STORE_MISSES.WALK_COMPLETED 0xe49 Signed-off-by: Kan Liang <Kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-06perf/x86: Fix full width counter, counter overflowPeter Zijlstra (Intel)
Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM. Both these parts have full_width_write set, and that does indeed have a problem. In order to deal with counter wrap, we must sample the counter at at least half the counter period (see also the sampling theorem) such that we can unambiguously reconstruct the count. However commit: 069e0c3c4058 ("perf/x86/intel: Support full width counting") sets the sampling interval to the full period, not half. Fixing that exposes another issue, in that we must not sign extend the delta value when we shift it right; the counter cannot have decremented after all. With both these issues fixed, counter overflow functions correctly again. Reported-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Tested-by: Liang, Kan <kan.liang@intel.com> Tested-by: Odzioba, Lukasz <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org Fixes: 069e0c3c4058 ("perf/x86/intel: Support full width counting") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisorsImre Palik
perf doesn't seem to honour the number of fixed counters specified by CPUID leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters. So if some of the fixed counters are masked out by the hypervisor, it still tries to check/set them. This patch makes perf behave nicer when the kernel is running under a hypervisor that doesn't expose all the counters. This patch contains some ideas from Matt Wilson. Signed-off-by: Imre Palik <imrep@amazon.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Kozyrev <alexander.kozyrev@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Wilson <msw@amazon.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-17perf/x86/intel: Add Knights Mill CPUIDPiotr Luc
Add Knights Mill (KNM) to the list of CPUIDs supported by PMU. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161012182634.2462-1-piotr.luc@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-23Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15perf/x86/intel: Don't disable "intel_bts" around "intel" event batchingAlexander Shishkin
At the moment, intel_bts events get disabled from intel PMU's disable callback, which includes event scheduling transactions of said PMU, which have nothing to do with intel_bts events. We do want to keep intel_bts events off inside the PMI handler to avoid filling up their buffer too soon. This patch moves intel_bts enabling/disabling directly to the PMI handler. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160915082233.11065-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10perf/x86: Ensure perf_sched_cb_{inc,dec}() is only called from pmu::{add,del}()Peter Zijlstra
Currently perf_sched_cb_{inc,dec}() are called from pmu::{start,stop}(), which has the problem that this can happen from NMI context, this is making it hard to optimize perf_pmu_sched_task(). Furthermore, we really only need this accounting on pmu::{add,del}(), so doing it from pmu::{start,stop}() is doing more work than we really need. Introduce x86_pmu::{add,del}() and wire up the LBR and PEBS. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-29Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the next part of the hotplug rework. - Convert all notifiers with a priority assigned - Convert all CPU_STARTING/DYING notifiers The final removal of the STARTING/DYING infrastructure will happen when the merge window closes. Another 700 hundred line of unpenetrable maze gone :)" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) timers/core: Correct callback order during CPU hot plug leds/trigger/cpu: Move from CPU_STARTING to ONLINE level powerpc/numa: Convert to hotplug state machine arm/perf: Fix hotplug state machine conversion irqchip/armada: Avoid unused function warnings ARC/time: Convert to hotplug state machine clocksource/atlas7: Convert to hotplug state machine clocksource/armada-370-xp: Convert to hotplug state machine clocksource/exynos_mct: Convert to hotplug state machine clocksource/arm_global_timer: Convert to hotplug state machine rcu: Convert rcutree to hotplug state machine KVM/arm/arm64/vgic-new: Convert to hotplug state machine smp/cfd: Convert core to hotplug state machine x86/x2apic: Convert to CPU hotplug state machine profile: Convert to hotplug state machine timers/core: Convert to hotplug state machine hrtimer: Convert to hotplug state machine x86/tboot: Convert to hotplug state machine arm64/armv8 deprecated: Convert to hotplug state machine hwtracing/coresight-etm4x: Convert to hotplug state machine ...
2016-07-14perf/x86: Convert the core to the hotplug state machineThomas Gleixner
Replace the perf_notifier() install mechanism, which invokes magically the callback on the current CPU. Convert the hardware specific callbacks which are invoked from the x86 perf core to return proper error codes instead of totally pointless NOTIFY_BAD return values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153333.670720553@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging ↵Ingo Molnar
new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-03perf/x86/intel: Update event constraints when HT is offStephane Eranian
This patch updates the event constraints for non-PEBS mode for Intel Broadwell and Skylake processors. When HT is off, each CPU gets 8 generic counters. However, not all events can be programmed on any of the 8 counters. This patch adds the constraints for the MEM_* events which can only be measured on the bottom 4 counters. The constraints are also valid when HT is off because, then, there are only 4 generic counters and they are the bottom counters. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1467411742-13245-1-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSXDavid Carrillo-Cisneros
Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the TSX flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2). However, when the CPU has TSX support deactivated, bits 61:62 actually behave as follows: - For wrmsr(), bits 61:62 are considered part of the sign extension. - When capturing branches, the LBR hw will always clear bits 61:62. regardless of the sign extension. Therefore, if: 1) LBR has TSX format. 2) CPU has no TSX support enabled. ... then any value passed to wrmsr() must be sign extended to 63 bits and any value from rdmsr() must be converted to have a sign extension of 61 bits, ignoring the values at TSX flags. This bug was masked by the work-around to the Intel's CPU bug: BJ94. "LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI" in Document Number: 324643-037US. The aforementioned work-around uses hw flags to filter out all kernel branches, limiting LBR callstack to user level execution only. Since user addresses are not sign extended, they do not trigger the wrmsr() bug in MSR_LAST_BRANCH_FROM_x when saved/restored at context switch. To verify the hw bug: $ perf record -b -e cycles sleep 1 $ rdmsr -p 0 0x680 0x1fffffffb0b9b0cc $ wrmsr -p 0 0x680 0x1fffffffb0b9b0cc write(): Input/output error The quirk for LBR_FROM_ MSRs is required before calls to wrmsrl() and after rdmsrl(). This patch introduces it for wrmsrl()'s done for testing LBR support. Future patch in series adds the quirk for context switch, that would be required if LBR callstack is to be enabled for ring 0. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-3-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27perf/x86/intel: Print LBR support statement after validationDavid Carrillo-Cisneros
The following commit: 338b522ca43c ("perf/x86/intel: Protect LBR and extra_regs against KVM lying") added an additional test to LBR support detection that is performed after printing the LBR support statement to dmesg. Move the LBR support output after the very last test, to make sure we print the true status of LBR support. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-2-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08perf/x86/intel: Use Intel family macros for core perf eventsDave Hansen
Use the new model number macros instead of spelling things out in the comments. Note that this is missing a Nehalem model that is mentioned in intel_idle which is fixed up in a later patch. The resulting binary (arch/x86/events/intel/core.o) is exactly the same with and without this patch modulo some harmless changes to restoring %esi in the return path of functions, even those untouched by this patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: jacob.jun.pan@intel.com Link: http://lkml.kernel.org/r/20160603001929.C5F1C079@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaroundAndi Kleen
Now that we have topology_max_smt_threads() use it to detect the HT workarounds for older CPUs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add topdown events to Intel AtomAndi Kleen
Add topdown event declarations to Silvermont / Airmont. These cores do not support the full Top Down metrics, but an useful subset (FrontendBound, Retiring, Backend Bound/Bad Speculation). The perf stat tool automatically handles the missing events and combines the available metrics. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add topdown events to Intel CoreAndi Kleen
Add declarations for the events needed for topdown to the Intel big core CPUs starting with Sandy Bridge. We need to report different values if HyperThreading is on or off. The only thing this patch does is to export some events in sysfs. topdown level 1 uses a set of abstracted metrics which are generic to out of order CPU cores (although some CPUs may not implement all of them): topdown-total-slots Available slots in the pipeline topdown-slots-issued Slots issued into the pipeline topdown-slots-retired Slots successfully retired topdown-fetch-bubbles Pipeline gaps in the frontend topdown-recovery-bubbles Pipeline gaps during recovery from misspeculation A slot is a single operation in the CPU pipe line. These metrics then allow to compute four useful metrics: FrontendBound, BackendBound, Retiring, BadSpeculation. The formulas to compute the metrics are generic, they only change based on the availability on the abstracted input values. The kernel declares the events supported by the current CPU and their scaling factors (such as the pipeline width) and perf stat then computes the formulas based on the available metrics. This is similar how existing perf metrics, such as TSC metrics or IPC, are implemented. This abstracts all CPU pipe line specific knowledge in the kernel driver, but still avoids the need for larger scale perf interface changes. For HyperThreading the any bit is needed to get accurate values when both threads are executing. This implies that the events can only be collected as root or with perf_event_paranoid=-1 for now. The basic scheme is based on the following paper: Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google) Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Change offcore response masks for Knights LandingLukasz Odzioba
Due to change in register definition we need to update OCR mask: MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14 MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38 Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: kan.liang@intel.com Cc: lukasz.anaczkowski@intel.com Cc: zheng.z.yan@intel.com Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add 'static' keyword to locally used arraysLukasz Odzioba
Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[], nhm_events_attrs[] and intel_skl_event_constraints arrays[], because they are only used locally. Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: kan.liang@intel.com Cc: lukasz.anaczkowski@intel.com Cc: zheng.z.yan@intel.com Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12perf/x86: Fix undefined shift on 32-bit kernelsAndrey Ryabinin
Jim reported: UBSAN: Undefined behaviour in arch/x86/events/intel/core.c:3708:12 shift exponent 35 is too large for 32-bit type 'long unsigned int' The use of 'unsigned long' type obviously is not correct here, make it 'unsigned long long' instead. Reported-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Imre Palik <imrep@amazon.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 2c33645d366d ("perf/x86: Honor the architectural performance monitoring version") Link: http://lkml.kernel.org/r/1462974711-10037-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports itAlexander Shishkin
Not all cores prevent using Intel PT and LBRs simultaneously, although most of them still do as of today. This patch adds an opt-in flag for such cores to disable mutual exclusivity between PT and LBR; also flip it on for Goldmont. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461857746-31346-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05perf/x86: Add model numbers for Kabylake CPUsAndi Kleen
Everything the same as Skylake, just new model numbers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add LBR filter support for Silvermont and Airmont CPUsKan Liang
LBR filtering is also supported on the Silvermont and Airmont microarchitectures. The layout of MSR_LBR_SELECT is the same as Nehalem. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706825-46163-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add Goldmont CPU supportKan Liang
Add perf core PMU support for Intel Goldmont CPU cores: - The init code is based on Silvermont. - There is a new cache event list, based on the Silvermont cache event list. - Goldmont has 32 LBR entries. It also uses new LBRv6 format, which report the cycle information using upper 16-bit of the LBR_TO. - It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS for precise cycles. For details, please refer to the latest SDM058: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706167-45320-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add model number for Skylake Server to perfAndi Kleen
Everything the same as base Skylake, just a new model number. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460751933-2264-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Fix PEBS data source interpretation on Nehalem/WestmereAndi Kleen
Jiri reported some time ago that some entries in the PEBS data source table in perf do not agree with the SDM. We investigated and the bits changed for Sandy Bridge, but the SDM was not updated. perf already implements the bits correctly for Sandy Bridge and later. This patch patches it up for Nehalem and Westmere. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/pebs: Add proper PEBS constraints for BroadwellStephane Eranian
This patch adds a Broadwell specific PEBS event constraint table. Broadwell has a fix for the HT corruption bug erratum HSD29 on Haswell. Therefore, there is no need to mark events 0xd0, 0xd1, 0xd2, 0xd3 has requiring the exclusive mode across both sibling HT threads. This holds true for regular counting and sampling (see core.c) and PEBS (ds.c) which we fix in this patch. In doing so, we relax evnt scheduling for these events, they can now be programmed on any 4 counters without impacting what is measured on the sibling thread. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: adrian.hunter@intel.com Cc: jolsa@redhat.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/pebs: Add workaround for broken OVFL status on HSW+Stephane Eranian
This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on Haswell, Broadwell and Skylake processors when using PEBS. The SDM stipulates that when the PEBS iterrupt threshold is crossed, an interrupt is posted and the kernel is interrupted. The kernel will find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to drain. But the bits corresponding to the actual counters should NOT be set. The kernel follows the SDM and assumes that all PEBS events are processed in the drain_pebs() callback. The kernel then checks for remaining overflows on any other (non-PEBS) events and processes these in the for_each_bit_set(&status) loop. As it turns out, under certain conditions on HSW and later processors, on PEBS buffer interrupt, bit 62 is set but the counter bits may be set as well. In that case, the kernel drains PEBS and generates SAMPLES with the EXACT tag, then it processes the counter bits, and generates normal (non-EXACT) SAMPLES. I ran into this problem by trying to understand why on HSW sampling on a PEBS event was sometimes returning SAMPLES without the EXACT tag. This should not happen on user level code because HSW has the eventing_ip which always point to the instruction that caused the event. The workaround in this patch simply ensures that the bits for the counters used for PEBS events are cleared after the PEBS buffer has been drained. With this fix 100% of the PEBS samples on my user code report the EXACT tag. Before: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is missing After: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is set The problem tends to appear more often when multiple PEBS events are used. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmiKan Liang
This patch tries to fix a PEBS warning found in my stress test. The following perf command can easily trigger the pebs warning or spurious NMI error on Skylake/Broadwell/Haswell platforms: sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a Also the NMI watchdog must be enabled. For this case, the events number is larger than counter number. So perf has to do multiplexing. In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out old events, rotate_ctx, schedule in new events and finally perf_pmu_enable(). If the old events include precise event, the MSR_IA32_PEBS_ENABLE should be cleared when perf_pmu_disable(). The MSR_IA32_PEBS_ENABLE should keep 0 until the perf_pmu_enable() is called and the new event is precise event. However, there is a corner case which could restore PEBS_ENABLE to stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL will be set to 0 to stop overflow and followed PMI. But there may be pending PMI from an earlier overflow, which cannot be stopped. So even GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At the end of the PMI handler, __intel_pmu_enable_all() will be called, which will restore the stale values if old events haven't scheduled out. Once the stale pebs value is set, it's impossible to be corrected if the new events are non-precise. Because the pebs_enabled will be set to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE setting. As a result, the following NMI with stale PEBS_ENABLE trigger pebs warning. The pending PMI after enabled=0 will become harmless if the NMI handler does not change the state. This patch checks cpuc->enabled in pmi and only restore the state when PMU is active. Here is the dump: Call Trace: <NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85 [<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320 [<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330 [<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20 [<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0 [<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170 [<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50 [<ffffffff810310b9>] nmi_handle+0x69/0x120 [<ffffffff810316f6>] default_do_nmi+0xe6/0x100 [<ffffffff810317f2>] do_nmi+0xe2/0x130 [<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 <<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180 [<ffffffff81006eec>] x86_pmu_start+0x4c/0x100 [<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300 [<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10 [<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280 [<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0 [<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280 [<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60 [<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50 [<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0 <EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130 [<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff8119765a>] event_function_call+0x10a/0x120 [<ffffffff8119c660>] ? ctx_resched+0x90/0x90 [<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff8119772b>] _perf_event_enable+0x5b/0x70 [<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0 [<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0 [<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0 [<ffffffff81036d29>] ? sched_clock+0x9/0x10 [<ffffffff8124a919>] SyS_ioctl+0x79/0x90 [<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4 ---[ end trace aef202839fe9a71d ]--- Uhhuh. NMI received for unknown reason 2d on CPU 2. Do you have a strange power saving mode enabled? Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com [ Fixed various typos and other small details. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event.h to its new homeBorislav Petkov
Now that all functionality has been moved to arch/x86/events/, move the perf_event.h header and adjust include paths. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-18-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>