summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2013-07-02Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "Two changes: - A Kconfig dependency fix/cleanup - Introduce the 'make kvmconfig' KVM configuration helper utility that turns the current .config into a KVM-bootable config. Useful for debugging specific native kernel configs that have no KVM config options enabled on VM setups." * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform: Make X86_GOLDFISH depend on X86_EXTENDED_PLATFORM x86/platform: Add kvmconfig to the phony targets x86, platform, kvm, kconfig: Turn existing .config's into KVM-capable configs
2013-07-02Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "Misc improvements: - Fix /proc/mtrr reporting - Fix ioremap printout - Remove the unused pvclock fixmap entry on 32-bit - misc cleanups" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioremap: Correct function name output x86: Fix /proc/mtrr with base/size more than 44bits ix86: Don't waste fixmap entries x86/mm: Drop unneeded include <asm/*pgtable, page*_types.h> x86_64: Correct phys_addr in cleanup_highmap comment
2013-07-02Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loading update from Ingo Molnar: "Two main changes that improve microcode loading on AMD CPUs: - Add support for all-in-one binary microcode files that concatenate the microcode images of multiple processor families, by Jacob Shin - Add early microcode loading (embedded in the initrd) support, also by Jacob Shin" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode, amd: Another early loading fixup x86, microcode, amd: Allow multiple families' bin files appended together x86, microcode, amd: Make find_ucode_in_initrd() __init x86, microcode, amd: Fix warnings and errors on with CONFIG_MICROCODE=m x86, microcode, amd: Early microcode patch loading support for AMD x86, microcode, amd: Refactor functions to prepare for early loading x86, microcode: Vendor abstract out save_microcode_in_initrd() x86, microcode, intel: Correct typo in printk
2013-07-02Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU changes from Ingo Molnar: "There are two bigger changes in this tree: - Add an [early-use-]safe static_cpu_has() variant and other robustness improvements, including the new X86_DEBUG_STATIC_CPU_HAS configurable debugging facility, motivated by recent obscure FPU code bugs, by Borislav Petkov - Reimplement FPU detection code in C and drop the old asm code, by Peter Anvin." * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, fpu: Use static_cpu_has_safe before alternatives x86: Add a static_cpu_has_safe variant x86: Sanity-check static_cpu_has usage x86, cpu: Add a synthetic, always true, cpu feature x86: Get rid of ->hard_math and all the FPU asm fu
2013-07-02Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI changes from Ingo Molnar: "Two fixes that should in principle increase robustness of our interaction with the EFI firmware, and a cleanup" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: retry ExitBootServices() on failure efi: Convert runtime services function ptrs UEFI: Don't pass boot services regions to SetVirtualAddressMap()
2013-07-02Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 debug update from Ingo Molnar: "Misc debuggability improvements: - Optimize the x86 CPU register printout a bit - Expose the tboot TXT log via debugfs - Small do_debug() cleanup" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tboot: Provide debugfs interfaces to access TXT log x86: Remove weird PTR_ERR() in do_debug x86/debug: Only print out DR registers if they are not power-on defaults
2013-07-02Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Ingo Molnar: "Two changes: - Extend 32-bit double fault debugging aid to 64-bit - Fix a build warning" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel/cacheinfo: Shut up last long-standing warning x86: Extend #DF debugging aid to 64-bit
2013-07-02Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc x86 cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, reloc: Use xorl instead of xorq in relocate_kernel_64.S x86, cleanups: Remove extra tab in __flush_tlb_one() x86/mce: Remove check for CONFIG_X86_MCE_P4THERMAL
2013-07-02Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot build fix from Ingo Molnar: "Small fixlet for the build process" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Close opened file descriptor
2013-07-02Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull asm/x86 changes from Ingo Molnar: "Misc changes, with a bigger processor-flags cleanup/reorganization by Peter Anvin" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, asm, cleanup: Replace open-coded control register values with symbolic x86, processor-flags: Fix the datatypes and add bit number defines x86: Rename X86_CR4_RDWRGSFS to X86_CR4_FSGSBASE x86, flags: Rename X86_EFLAGS_BIT1 to X86_EFLAGS_FIXED linux/const.h: Add _BITUL() and _BITULL() x86/vdso: Convert use of typedef ctl_table to struct ctl_table x86: __force_order doesn't need to be an actual variable
2013-07-02Merge branch 'sched-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull voluntary preemption fixes from Ingo Molnar: "This tree contains a speedup which is achieved through better might_sleep()/might_fault() preemption point annotations for uaccess functions, by Michael S Tsirkin: 1. The only reason uaccess routines might sleep is if they fault. Make this explicit for all architectures. 2. A voluntary preemption point in uaccess functions means compiler can't inline them efficiently, this breaks assumptions that they are very fast and small that e.g. net code seems to make. Remove this preemption point so behaviour matches with what callers assume. 3. Accesses (e.g through socket ops) to kernel memory with KERNEL_DS like net/sunrpc does will never sleep. Remove an unconditinal might_sleep() in the might_fault() inline in kernel.h (used when PROVE_LOCKING is not set). 4. Accesses with pagefault_disable() return EFAULT but won't cause caller to sleep. Check for that and thus avoid might_sleep() when PROVE_LOCKING is set. These changes offer a nice speedup for CONFIG_PREEMPT_VOLUNTARY=y kernels, here's a network bandwidth measurement between a virtual machine and the host: before: incoming: 7122.77 Mb/s outgoing: 8480.37 Mb/s after: incoming: 8619.24 Mb/s [ +21.0% ] outgoing: 9455.42 Mb/s [ +11.5% ] I kept these changes in a separate tree, separate from scheduler changes, because it's a mixed MM and scheduler topic" * 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm, sched: Allow uaccess in atomic with pagefault_disable() mm, sched: Drop voluntary schedule from might_fault() x86: uaccess s/might_sleep/might_fault/ tile: uaccess s/might_sleep/might_fault/ powerpc: uaccess s/might_sleep/might_fault/ mn10300: uaccess s/might_sleep/might_fault/ microblaze: uaccess s/might_sleep/might_fault/ m32r: uaccess s/might_sleep/might_fault/ frv: uaccess s/might_sleep/might_fault/ arm64: uaccess s/might_sleep/might_fault/ asm-generic: uaccess s/might_sleep/might_fault/
2013-07-02Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel improvements: - watchdog driver improvements by Li Zefan - Power7 CPI stack events related improvements by Sukadev Bhattiprolu - event multiplexing via hrtimers and other improvements by Stephane Eranian - kernel stack use optimization by Andrew Hunter - AMD IOMMU uncore PMU support by Suravee Suthikulpanit - NMI handling rate-limits by Dave Hansen - various hw_breakpoint fixes by Oleg Nesterov - hw_breakpoint overflow period sampling and related signal handling fixes by Jiri Olsa - Intel Haswell PMU support by Andi Kleen Tooling improvements: - Reset SIGTERM handler in workload child process, fix from David Ahern. - Makefile reorganization, prep work for Kconfig patches, from Jiri Olsa. - Add automated make test suite, from Jiri Olsa. - Add --percent-limit option to 'top' and 'report', from Namhyung Kim. - Sorting improvements, from Namhyung Kim. - Expand definition of sysfs format attribute, from Michael Ellerman. Tooling fixes: - 'perf tests' fixes from Jiri Olsa. - Make Power7 CPI stack events available in sysfs, from Sukadev Bhattiprolu. - Handle death by SIGTERM in 'perf record', fix from David Ahern. - Fix printing of perf_event_paranoid message, from David Ahern. - Handle realloc failures in 'perf kvm', from David Ahern. - Fix divide by 0 in variance, from David Ahern. - Save parent pid in thread struct, from David Ahern. - Handle JITed code in shared memory, from Andi Kleen. - Fixes for 'perf diff', from Jiri Olsa. - Remove some unused struct members, from Jiri Olsa. - Add missing liblk.a dependency for python/perf.so, fix from Jiri Olsa. - Respect CROSS_COMPILE in liblk.a, from Rabin Vincent. - No need to do locking when adding hists in perf report, only 'top' needs that, from Namhyung Kim. - Fix alignment of symbol column in in the hists browser (top, report) when -v is given, from NAmhyung Kim. - Fix 'perf top' -E option behavior, from Namhyung Kim. - Fix bug in isupper() and islower(), from Sukadev Bhattiprolu. - Fix compile errors in bp_signal 'perf test', from Sukadev Bhattiprolu. ... and more things" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits) perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable() perf/x86: Fix shared register mutual exclusion enforcement perf/x86/intel: Support full width counting x86: Add NMI duration tracepoints perf: Drop sample rate when sampling is too slow x86: Warn when NMI handlers take large amounts of time hw_breakpoint: Introduce "struct bp_cpuinfo" hw_breakpoint: Simplify *register_wide_hw_breakpoint() hw_breakpoint: Introduce cpumask_of_bp() hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths perf/x86/intel: Add mem-loads/stores support for Haswell perf/x86/intel: Support Haswell/v4 LBR format perf/x86/intel: Move NMI clearing to end of PMI handler perf/x86/intel: Add Haswell PEBS support perf/x86/intel: Add simple Haswell PMU support perf/x86/intel: Add Haswell PEBS record support perf/x86/intel: Fix sparse warning perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation perf/x86/amd: Add IOMMU Performance Counter resource management ...
2013-07-02Merge branch 'core-mutexes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull WW mutex support from Ingo Molnar: "This tree adds support for wound/wait style locks, which the graphics guys would like to make use of in the TTM graphics subsystem. Wound/wait mutexes are used when other multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: The older tasks waits until it can acquire the contended lock. The younger tasks needs to back off and drop all the locks it is currently holding, ie the younger task is wounded. See this LWN.net description of W/W mutexes: https://lwn.net/Articles/548909/ The comments there outline specific usecases for this facility (which have already been implemented for the DRM tree). Also see Documentation/ww-mutex-design.txt for more details" * 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking-selftests: Handle unexpected failures more strictly mutex: Add more w/w tests to test EDEADLK path handling mutex: Add more tests to lib/locking-selftest.c mutex: Add w/w tests to lib/locking-selftest.c mutex: Add w/w mutex slowpath debugging mutex: Add support for wound/wait style locks arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
2013-07-02Merge tag 'driver-core-3.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the big driver core merge for 3.11-rc1 Lots of little things, and larger firmware subsystem updates, all described in the shortlog. Nice thing here is that we finally get rid of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had been always on for a number of kernel releases, now it's just removed)" * tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits) driver core: device.h: fix doc compilation warnings firmware loader: fix another compile warning with PM_SLEEP unset build some drivers only when compile-testing firmware loader: fix compile warning with PM_SLEEP set kobject: sanitize argument for format string sysfs_notify is only possible on file attributes firmware loader: simplify holding module for request_firmware firmware loader: don't export cache_firmware and uncache_firmware drivers/base: Use attribute groups to create sysfs memory files firmware loader: fix compile warning firmware loader: fix build failure with !CONFIG_FW_LOADER_USER_HELPER Documentation: Updated broken link in HOWTO Finally eradicate CONFIG_HOTPLUG driver core: firmware loader: kill FW_ACTION_NOHOTPLUG requests before suspend driver core: firmware loader: don't cache FW_ACTION_NOHOTPLUG firmware Documentation: Tidy up some drivers/base/core.c kerneldoc content. platform_device: use a macro instead of platform_driver_register firmware: move EXPORT_SYMBOL annotations firmware: Avoid deadlock of usermodehelper lock at shutdown dell_rbu: Select CONFIG_FW_LOADER_USER_HELPER explicitly ...
2013-07-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS patches (part 1) from Al Viro: "The major change in this pile is ->readdir() replacement with ->iterate(), dealing with ->f_pos races in ->readdir() instances for good. There's a lot more, but I'd prefer to split the pull request into several stages and this is the first obvious cutoff point." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits) [readdir] constify ->actor [readdir] ->readdir() is gone [readdir] convert ecryptfs [readdir] convert coda [readdir] convert ocfs2 [readdir] convert fatfs [readdir] convert xfs [readdir] convert btrfs [readdir] convert hostfs [readdir] convert afs [readdir] convert ncpfs [readdir] convert hfsplus [readdir] convert hfs [readdir] convert befs [readdir] convert cifs [readdir] convert freevxfs [readdir] convert fuse [readdir] convert hpfs reiserfs: switch reiserfs_readdir_dentry to inode reiserfs: is_privroot_deh() needs only directory inode, actually ...
2013-07-02x86/tracing: Add irq_enter/exit() in smp_trace_reschedule_interrupt()Seiji Aguchi
Reschedule vector tracepoints may be called in cpu idle state. This causes lockdep check warning below. The tracepoint requires rcu but for accuracy it also requires irq_enter() (tracepoints record the irq context), thus, the tracepoint interrupt handler should be calling irq_enter() and not rcu_irq_enter() (irq_enter() calls rcu_irq_enter()). So, add irq_enter/exit() to smp_trace_reschedule_interrupt() with common pre/post processing functions, smp_entering_irq() and exiting_irq() (exiting_irq() calls just irq_exit() in arch/x86/include/asm/apic.h), because these can be shared among reschedule, call_function, and call_function_single vectors. [ 50.720557] Testing event reschedule_exit: [ 50.721349] [ 50.721502] =============================== [ 50.721835] [ INFO: suspicious RCU usage. ] [ 50.722169] 3.10.0-rc6-00004-gcf910e8 #190 Not tainted [ 50.722582] ------------------------------- [ 50.722915] /c/kernel-tests/src/linux/arch/x86/include/asm/trace/irq_vectors.h:50 suspicious rcu_dereference_check() usage! [ 50.723770] [ 50.723770] other info that might help us debug this: [ 50.723770] [ 50.724385] [ 50.724385] RCU used illegally from idle CPU! [ 50.724385] rcu_scheduler_active = 1, debug_locks = 0 [ 50.725232] RCU used illegally from extended quiescent state! [ 50.725690] no locks held by swapper/0/0. [ 50.726010] [ 50.726010] stack backtrace: [...] Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/51CDCFA3.9080101@hds.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-29consolidate io_remap_pfn_range definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-28x86: xen: Sync the CMOS RTC as well as the Xen wallclockDavid Vrabel
Adjustments to Xen's persistent clock via update_persistent_clock() don't actually persist, as the Xen wallclock is a software only clock and modifications to it do not modify the underlying CMOS RTC. The x86_platform.set_wallclock hook is there to keep the hardware RTC synchronized. On a guest this is pointless. On Dom0 we can use the native implementaion which actually updates the hardware RTC, but we still need to keep the software emulation of RTC for the guests up to date. The subscription to the pvclock_notifier allows us to emulate this easily. The notifier is called at every tick and when the clock was set. Right now we only use that notifier when the clock was set, but due to the fact that it is called periodically from the timekeeping update code, we can utilize it to emulate the NTP driven drift compensation of update_persistant_clock() for the Xen wall (software) clock. Add a 11 minutes periodic update to the pvclock_gtod notifier callback to achieve that. The static variable 'next' which maintains that 11 minutes update cycle is protected by the core code serialization so there is no need to add a Xen specific serialization mechanism. [ tglx: Massaged changelog and added a few comments ] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: John Stultz <john.stultz@linaro.org> Cc: <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/1372329348-20841-6-git-send-email-david.vrabel@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-06-28x86: xen: Sync the wallclock when the system time is setDavid Vrabel
Currently the Xen wallclock is only updated every 11 minutes if NTP is synchronized to its clock source (using the sync_cmos_clock() work). If a guest is started before NTP is synchronized it may see an incorrect wallclock time. Use the pvclock_gtod notifier chain to receive a notification when the system time has changed and update the wallclock to match. This chain is called on every timer tick and we want to avoid an extra (expensive) hypercall on every tick. Because dom0 has historically never provided a very accurate wallclock and guests do not expect one, we can do this simply: the wallclock is only updated if the clock was set. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: John Stultz <john.stultz@linaro.org> Cc: <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/1372329348-20841-5-git-send-email-david.vrabel@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-06-28xen/time: remove blocked time accounting from xen "clockchip"Laszlo Ersek
... because the "clock_event_device framework" already accounts for idle time through the "event_handler" function pointer in xen_timer_interrupt(). The patch is intended as the completion of [1]. It should fix the double idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to stolen time accounting (the removed code seems to be isolated). The approach may be completely misguided. [1] https://lkml.org/lkml/2011/10/6/10 [2] http://lists.xensource.com/archives/html/xen-devel/2010-08/msg01068.html John took the time to retest this patch on top of v3.10 and reported: "idle time is correctly incremented for pv and hvm for the normal case, nohz=off and nohz=idle." so lets put this patch in. CC: stable@vger.kernel.org Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-06-28Merge branch 'acpi-pm'Rafael J. Wysocki
* acpi-pm: ACPI / PM: Rework and clean up acpi_dev_pm_get_state() ACPI / PM: Replace ACPI_STATE_D3 with ACPI_STATE_D3_COLD in device_pm.c ACPI / PM: Rename function acpi_device_power_state() and make it static ACPI / PM: acpi_processor_suspend() can be static xen / ACPI / sleep: Register an acpi_suspend_lowlevel callback. x86 / ACPI / sleep: Provide registration for acpi_suspend_lowlevel.
2013-06-28x86/ioremap: Correct function name outputBorislav Petkov
Infact, let the compiler enter the function name so that there are no discrepancies. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1372369996-20556-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-28Merge tag 'please-pull-mce' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull MCE cleanup from Tony Luck: "Changes to simplify the SDM means that we can also simplify the code for SRAR (software recoverable action required) errors." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-28x86/tboot: Provide debugfs interfaces to access TXT logQiaowei Ren
These logs come from tboot (Trusted Boot, an open source, pre-kernel/VMM module that uses Intel TXT to perform a measured and verified launch of an OS kernel/VMM.). Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Gang Wei <gang.wei@intel.com> Link: http://lkml.kernel.org/r/1372053333-21788-1-git-send-email-qiaowei.ren@intel.com [ Beautified the code a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27x86/mce: Update MCE severity condition checkChen Gong
Update some SRAR severity conditions check to make it clearer, according to latest Intel SDM Vol 3(June 2013), table 15-20. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-06-27KVM: Fix RTC interrupt coalescing trackingGleb Natapov
This reverts most of the f1ed0450a5fac7067590317cbf027f566b6ccbca. After the commit kvm_apic_set_irq() no longer returns accurate information about interrupt injection status if injection is done into disabled APIC. RTC interrupt coalescing tracking relies on the information to be accurate and cannot recover if it is not. Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-27kvm: Add a tracepoint write_tsc_offsetYoshihiro YUNOMAE
Add a tracepoint write_tsc_offset for tracing TSC offset change. We want to merge ftrace's trace data of guest OSs and the host OS using TSC for timestamp in chronological order. We need "TSC offset" values for each guest when merge those because the TSC value on a guest is always the host TSC plus guest's TSC offset. If we get the TSC offset values, we can calculate the host TSC value for each guest events from the TSC offset and the event TSC value. The host TSC values of the guest events are used when we want to merge trace data of guests and the host in chronological order. (Note: the trace_clock of both the host and the guest must be set x86-tsc in this case) This tracepoint also records vcpu_id which can be used to merge trace data for SMP guests. A merge tool will read TSC offset for each vcpu, then the tool converts guest TSC values to host TSC values for each vcpu. TSC offset is stored in the VMCS by vmx_write_tsc_offset() or vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots. The latter function is executed when kvm clock is updated. Only host can read TSC offset value from VMCS, so a host needs to output TSC offset value when TSC offset is changed. Since the TSC offset is not often changed, it could be overwritten by other frequent events while tracing. To avoid that, I recommend to use a special instance for getting this event: 1. set a instance before booting a guest # cd /sys/kernel/debug/tracing/instances # mkdir tsc_offset # cd tsc_offset # echo x86-tsc > trace_clock # echo 1 > events/kvm/kvm_write_tsc_offset/enable 2. boot a guest Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-27KVM: MMU: Inform users of mmio generation wraparoundTakuya Yoshikawa
Without this information, users will just see unexpected performance problems and there is little chance we will get good reports from them: note that mmio generation is increased even when we just start, or stop, dirty logging for some memory slot, in which case users cannot expect all shadow pages to be zapped. printk_ratelimited() is used for this taking into account the problems that we can see the information many times when we start multiple VMs and guests can trigger this by reading ROM in a loop for example. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: document fast invalidate all pagesXiao Guangrong
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: document write_flooding_countXiao Guangrong
Document write_flooding_count to Documentation/virtual/kvm/mmu.txt Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: document clear_spte_countXiao Guangrong
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: drop kvm_mmu_zap_mmio_sptesXiao Guangrong
Drop kvm_mmu_zap_mmio_sptes and use kvm_mmu_invalidate_zap_all_pages instead to handle mmio generation number overflow Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: init kvm generation close to mmio wrap-around valueXiao Guangrong
Then it has the chance to trigger mmio generation number wrap-around Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> [Change from MMIO_MAX_GEN - 13 to MMIO_MAX_GEN - 150, because 13 is very close to the number of calls to KVM_SET_USER_MEMORY_REGION before the guest is started and there is any chance to create any spte. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: add tracepoint for check_mmio_spteXiao Guangrong
It is useful for debug mmio spte invalidation Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: fast invalidate all mmio sptesXiao Guangrong
This patch tries to introduce a very simple and scale way to invalidate all mmio sptes - it need not walk any shadow pages and hold mmu-lock KVM maintains a global mmio valid generation-number which is stored in kvm->memslots.generation and every mmio spte stores the current global generation-number into his available bits when it is created When KVM need zap all mmio sptes, it just simply increase the global generation-number. When guests do mmio access, KVM intercepts a MMIO #PF then it walks the shadow page table and get the mmio spte. If the generation-number on the spte does not equal the global generation-number, it will go to the normal #PF handler to update the mmio spte Since 19 bits are used to store generation-number on mmio spte, we zap all mmio sptes when the number is round Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: make return value of mmio page fault handler more readableXiao Guangrong
Define some meaningful names instead of raw code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27KVM: MMU: store generation-number into mmio spteXiao Guangrong
Store the generation-number into bit3 ~ bit11 and bit52 ~ bit61, totally 19 bits can be used, it should be enough for nearly all most common cases In this patch, the generation-number is always 0, it will be changed in the later patch [Gleb: masking generation bits from spte in get_mmio_spte_gfn() and get_mmio_spte_access()] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-27Merge branch 'core/mutexes' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into drm-next Merge in the tip core/mutexes branch for future GPU driver use. Ingo will send this branch to Linus prior to drm-next. * 'core/mutexes' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) locking-selftests: Handle unexpected failures more strictly mutex: Add more w/w tests to test EDEADLK path handling mutex: Add more tests to lib/locking-selftest.c mutex: Add w/w tests to lib/locking-selftest.c mutex: Add w/w mutex slowpath debugging mutex: Add support for wound/wait style locks arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not powerpc/pci: Fix boot panic on mpc83xx (regression) s390/ipl: Fix FCP WWPN and LUN format strings for read fs: fix new splice.c kernel-doc warning spi/pxa2xx: fix memory corruption due to wrong size used in devm_kzalloc() s390/mem_detect: fix memory hole handling s390/dma: support debug_dma_mapping_error s390/dma: fix mapping_error detection s390/irq: Only define synchronize_irq() on SMP Input: xpad - fix for "Mad Catz Street Fighter IV FightPad" controllers Input: wacom - add a new stylus (0x100802) for Intuos5 and Cintiqs spi/pxa2xx: use GFP_ATOMIC in sg table allocation fuse: hold i_mutex in fuse_file_fallocate() Input: add missing dependencies on CONFIG_HAS_IOMEM ...
2013-06-27Merge tag 'v3.10-rc7' into drm-nextDave Airlie
Linux 3.10-rc7 The sdvo lvds fix in this -fixes pull commit c3456fb3e4712d0448592af3c5d644c9472cd3c1 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Jun 10 09:47:58 2013 +0200 drm/i915: prefer VBT modes for SVDO-LVDS over EDID has a silent functional conflict with commit 990256aec2f10800595dddf4d1c3441fcd6b2616 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Fri May 31 12:17:07 2013 +0000 drm: Add probed modes in probe order in drm-next. W simply need to add the vbt modes before edid modes, i.e. the other way round than now. Conflicts: drivers/gpu/drm/drm_prime.c drivers/gpu/drm/i915/intel_sdvo.c
2013-06-26x86, microcode, amd: Another early loading fixupJacob Shin
commit cd1c32ca969ebfd65e61312c988223bb14f09c2e is an early premature rendition of the patch. Augment it with this delta patch to: * correctly mark offset and size of the matching bin file * use __pa instead of __pa_nodebug during AP load * check for !initrd_start before using it Signed-off-by: Jacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/20130620152414.GA6676@jshin-Toonie Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-06-26perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()Stephane Eranian
Make sure intel_pmu_pebs_disable() and intel_pmu_pebs_enable() are symmetrical w.r.t. PEBS-LL and precise store. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1371824448-7306-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26perf/x86: Fix shared register mutual exclusion enforcementStephane Eranian
This patch fixes a problem with the shared registers mutual exclusion code and incremental event scheduling by the generic perf_event code. There was a bug whereby the mutual exclusion on the shared registers was not enforced because of incremental scheduling abort due to event constraints. As an example on Intel Nehalem, consider the following events: group1= L1D_CACHE_LD:E_STATE,OFFCORE_RESPONSE_0:PF_RFO,L1D_CACHE_LD:I_STATE group2= L1D_CACHE_LD:I_STATE The L1D_CACHE_LD event can only be measured by 2 counters. Yet, there are 3 instances here. The first group can be scheduled and is committed. Then, the generic code tries to schedule group2 and this fails (because there is no more counter to support the 3rd instance of L1D_CACHE_LD). But in x86_schedule_events() error path, put_event_contraints() is invoked on ALL the events and not just the ones that just failed. That causes the "lock" on the shared offcore_response MSR to be released. Yet the first group is actually scheduled and is exposed to reprogramming of that shared msr by the sibling HT thread. In other words, there is no guarantee on what is measured. This patch fixes the problem by tagging committed events with the PERF_X86_EVENT_COMMITTED tag. In the error path of x86_schedule_events(), only the events NOT tagged have their constraint released. The tag is eventually removed when the event in descheduled. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130620164254.GA3556@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Three small fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hw_breakpoint: Use cpu_possible_mask in {reserve,release}_bp_slot() hw_breakpoint: Fix cpu check in task_bp_pinned(cpu) kprobes: Fix arch_prepare_kprobe to handle copy insn failures
2013-06-26x86/platform: Make X86_GOLDFISH depend on X86_EXTENDED_PLATFORMBen Hutchings
All non-PC platforms are supposed to be dependent on this option. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jun Nakajima <jnakajim@gmail.com> Link: http://lkml.kernel.org/n/tip-Bcihhqhstm67fchjnkxoiJbu@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or notMaarten Lankhorst
This will allow me to call functions that have multiple arguments if fastpath fails. This is required to support ticket mutexes, because they need to be able to pass an extra argument to the fail function. Originally I duplicated the functions, by adding __mutex_fastpath_lock_retval_arg. This ended up being just a duplication of the existing function, so a way to test if fastpath was called ended up being better. This also cleaned up the reservation mutex patch some by being able to call an atomic_set instead of atomic_xchg, and making it easier to detect if the wrong unlock function was previously used. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: robclark@gmail.com Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26perf/x86/intel: Support full width countingAndi Kleen
Recent Intel CPUs like Haswell and IvyBridge have a new alternative MSR range for perfctrs that allows writing the full counter width. Enable this range if the hardware reports it using a new capability bit. Currently the perf code queries CPUID to get the counter width, and sign extends the counter values as needed. The traditional PERFCTR MSRs always limit to 32bit, even though the counter internally is larger (usually 48 bits on recent CPUs) When the new capability is set use the alternative range which do not have these restrictions. This lowers the overhead of perf stat slightly because it has to do less interrupts to accumulate the counter value. On Haswell it also avoids some problems with TSX aborting when the end of the counter range is reached. ( See the patch "perf/x86/intel: Avoid checkpointed counters causing excessive TSX aborts" for more details. ) Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1372173153-20215-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26Merge tag 'please-pull-mce-bitmap-comment' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull MCE updates from Tony Luck: "Better comments so we understand our existing machine check bank bitmaps - prelude to adding another bitmap soon." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-25x86, asm, cleanup: Replace open-coded control register values with symbolicH. Peter Anvin
Clean up an unnecessary open-coded control register values. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-um7za1nzf6brb17o0h4om6e3@git.kernel.org
2013-06-25x86, processor-flags: Fix the datatypes and add bit number definesH. Peter Anvin
The control registers are unsigned long (32 bits on i386, 64 bits on x86-64), and so make that manifest in the data type for the various constants. Add defines with a _BIT suffix which defines the bit number, as opposed to the bit mask. This should resolve some issues with ~bitmask that Linus discovered. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-cwckhbrib2aux1qbteaebij0@git.kernel.org
2013-06-25x86: Rename X86_CR4_RDWRGSFS to X86_CR4_FSGSBASEH. Peter Anvin
Rename X86_CR4_RDWRGSFS to X86_CR4_FSGSBASE to match the SDM. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Link: http://lkml.kernel.org/n/tip-buq1evi5dpykxx7ak6amaam0@git.kernel.org