summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)Author
2015-06-23Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pul x86 microcode updates from Ingo Molnar: "x86 microcode loader updates from Borislav Petkov: - early parsing of the built-in microcode - cleanups - misc smaller fixes" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Correct CPU family related variable types x86/microcode: Disable builtin microcode loading on 32-bit for now x86/microcode/intel: Rename get_matching_sig() x86/microcode/intel: Simplify get_matching_sig() x86/microcode/intel: Simplify update_match_cpu() x86/microcode/intel: Rename get_matching_microcode x86/cpu/microcode: Zap changelog x86/microcode: Parse built-in microcode early x86/microcode/intel: Remove unused @rev arg of get_matching_sig() x86/microcode/intel: Get rid of revision_is_newer()
2015-06-23Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Ingo Molnar: "This tree contains two main changes: - The big FPU code rewrite: wide reaching cleanups and reorganization that pulls all the FPU code together into a clean base in arch/x86/fpu/. The resulting code is leaner and faster, and much easier to understand. This enables future work to further simplify the FPU code (such as removing lazy FPU restores). By its nature these changes have a substantial regression risk: FPU code related bugs are long lived, because races are often subtle and bugs mask as user-space failures that are difficult to track back to kernel side backs. I'm aware of no unfixed (or even suspected) FPU related regression so far. - MPX support rework/fixes. As this is still not a released CPU feature, there were some buglets in the code - should be much more robust now (Dave Hansen)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (250 commits) x86/fpu: Fix double-increment in setup_xstate_features() x86/mpx: Allow 32-bit binaries on 64-bit kernels again x86/mpx: Do not count MPX VMAs as neighbors when unmapping x86/mpx: Rewrite the unmap code x86/mpx: Support 32-bit binaries on 64-bit kernels x86/mpx: Use 32-bit-only cmpxchg() for 32-bit apps x86/mpx: Introduce new 'directory entry' to 'addr' helper function x86/mpx: Add temporary variable to reduce masking x86: Make is_64bit_mm() widely available x86/mpx: Trace allocation of new bounds tables x86/mpx: Trace the attempts to find bounds tables x86/mpx: Trace entry to bounds exception paths x86/mpx: Trace #BR exceptions x86/mpx: Introduce a boot-time disable flag x86/mpx: Restrict the mmap() size check to bounds tables x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK x86/mpx: Clean up the code by not passing a task pointer around when unnecessary x86/mpx: Use the new get_xsave_field_ptr()API x86/fpu/xstate: Wrap get_xsave_addr() to make it safer x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions ...
2015-06-22Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU features from Ingo Molnar: "Various CPU feature support related changes: in particular the /proc/cpuinfo model name sanitization change should be monitored, it has a chance to break stuff. (but really shouldn't and there are no regression reports)" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/amd: Give access to the number of nodes in a physical package x86/cpu: Trim model ID whitespace x86/cpu: Strip any /proc/cpuinfo model name field whitespace x86/cpu/amd: Set X86_FEATURE_EXTD_APICID for future processors x86/gart: Check for GART support before accessing GART registers
2015-06-22Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Clean up types in xlate_dev_mem_ptr() some more x86: Deinline dma_free_attrs() x86: Deinline dma_alloc_attrs() x86: Remove unused TI_cpu x86: Merge common 32-bit values in asm-offsets.c
2015-06-22Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes are: - lockless wakeup support for futexes and IPC message queues (Davidlohr Bueso, Peter Zijlstra) - Replace spinlocks with atomics in thread_group_cputimer(), to improve scalability (Jason Low) - NUMA balancing improvements (Rik van Riel) - SCHED_DEADLINE improvements (Wanpeng Li) - clean up and reorganize preemption helpers (Frederic Weisbecker) - decouple page fault disabling machinery from the preemption counter, to improve debuggability and robustness (David Hildenbrand) - SCHED_DEADLINE documentation updates (Luca Abeni) - topology CPU masks cleanups (Bartosz Golaszewski) - /proc/sched_debug improvements (Srikar Dronamraju)" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits) sched/deadline: Remove needless parameter in dl_runtime_exceeded() sched: Remove superfluous resetting of the p->dl_throttled flag sched/deadline: Drop duplicate init_sched_dl_class() declaration sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target sched/deadline: Make init_sched_dl_class() __init sched/deadline: Optimize pull_dl_task() sched/preempt: Add static_key() to preempt_notifiers sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration sched/stop_machine: Fix deadlock between multiple stop_two_cpus() sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched sched/debug: Replace vruntime with wait_sum in /proc/sched_debug sched/debug: Properly format runnable tasks in /proc/sched_debug sched/numa: Only consider less busy nodes as numa balancing destinations Revert 095bebf61a46 ("sched/numa: Do not move past the balance point if unbalanced") sched/fair: Prevent throttling in early pick_next_task_fair() preempt: Reorganize the notrace definitions a bit preempt: Use preempt_schedule_context() as the official tracing preemption point sched: Make preempt_schedule_context() function-tracing safe x86: Remove cpu_sibling_mask() and cpu_core_mask() x86: Replace cpu_**_mask() with topology_**_cpumask() ...
2015-06-22Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes are: - 'qspinlock' support, enabled on x86: queued spinlocks - these are now the spinlock variant used by x86 as they outperform ticket spinlocks in every category. (Waiman Long) - 'pvqspinlock' support on x86: paravirtualized variant of queued spinlocks. (Waiman Long, Peter Zijlstra) - 'qrwlock' support, enabled on x86: queued rwlocks. Similar to queued spinlocks, they are now the variant used by x86: CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y CONFIG_QUEUED_SPINLOCKS=y CONFIG_ARCH_USE_QUEUED_RWLOCKS=y CONFIG_QUEUED_RWLOCKS=y - various lockdep fixlets - various locking primitives cleanups, further WRITE_ONCE() propagation" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) locking/lockdep: Remove hard coded array size dependency locking/qrwlock: Don't contend with readers when setting _QW_WAITING lockdep: Do not break user-visible string locking/arch: Rename set_mb() to smp_store_mb() locking/arch: Add WRITE_ONCE() to set_mb() rtmutex: Warn if trylock is called from hard/softirq context arch: Remove __ARCH_HAVE_CMPXCHG locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS locking/pvqspinlock: Replace xchg() by the more descriptive set_mb() locking/pvqspinlock, x86: Enable PV qspinlock for Xen locking/pvqspinlock, x86: Enable PV qspinlock for KVM locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching locking/pvqspinlock: Implement simple paravirt support for the qspinlock locking/qspinlock: Revert to test-and-set on hypervisors locking/qspinlock: Use a simple write to grab the lock locking/qspinlock: Optimize for smaller NR_CPUS locking/qspinlock: Extract out code snippets for the next patch locking/qspinlock: Add pending bit ...
2015-06-18x86/cpu/amd: Give access to the number of nodes in a physical packageAravind Gopalakrishnan
Stash the number of nodes in a physical processor package locally and add an accessor to be called by interested parties. The first user is the MCE injection module which uses it to find the node base core in a package for injecting a certain type of errors. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Rewrote the commit message, merged it with the accessor patch and unified naming. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Link: http://lkml.kernel.org/r/1433868317-18417-2-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Support 32-bit binaries on 64-bit kernelsDave Hansen
Right now, the kernel can only switch between 64-bit and 32-bit binaries at compile time. This patch adds support for 32-bit binaries on 64-bit kernels when we support ia32 emulation. We essentially choose which set of table sizes to use when doing arithmetic for the bounds table calculations. This also uses a different approach for calculating the table indexes than before. I think the new one makes it much more clear what is going on, and allows us to share more code between the 32-bit and 64-bit cases. Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183705.E01F21E2@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Introduce new 'directory entry' to 'addr' helper functionDave Hansen
Currently, to get from a bounds directory entry to the virtual address of a bounds table, we simply mask off a few low bits. However, the set of bits we mask off is different for 32-bit and 64-bit binaries. This breaks the operation out in to a helper function and also adds a temporary variable to store the result until we are sure we are returning one. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.007686CE@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86: Make is_64bit_mm() widely availableDave Hansen
The uprobes code has a nice helper, is_64bit_mm(), that consults both the runtime and compile-time flags for 32-bit support. Instead of reinventing the wheel, pull it in to an x86 header so we can use it for MPX. I prefer passing the 'mm' around to test_thread_flag(TIF_IA32) because it makes it explicit where the context is coming from. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.F0209999@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace allocation of new bounds tablesDave Hansen
Bounds tables are a significant consumer of memory. It is important to know when they are being allocated. Add a trace point to trace whenever an allocation occurs and also its virtual address. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.EC23A93E@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace the attempts to find bounds tablesDave Hansen
There are two different events being traced here. They are doing similar things so share a trace "EVENT_CLASS" and are presented together. 1. Trace when MPX is zapping pages "mpx_unmap_zap": When MPX can not free an entire bounds table, it will instead try to zap unused parts of a bounds table to free the backing memory. This decreases RSS (resident set size) without decreasing the virtual space allocated for bounds tables. 2. Trace attempts to find bounds tables "mpx_unmap_search": This event traces any time we go looking to unmap a bounds table for a given virtual address range. This is useful to ensure that the kernel actually "tried" to free a bounds table versus times it succeeded in finding one. It might try and fail if it realized that a table was shared with an adjacent VMA which is not being unmapped. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.B9D2468B@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace entry to bounds exception pathsDave Hansen
There are two basic things that can happen as the result of a bounds exception (#BR): 1. We allocate a new bounds table 2. We pass up a bounds exception to userspace. This patch adds a trace point for the case where we are passing the exception up to userspace with a signal. We are also explicit that we're printing out the inverse of the 'upper' that we encounter. If you want to filter, for instance, you need to ~ the value first. The reason we do this is because of how 'upper' is stored in the bounds table. If a pointer's range is: 0x1000 -> 0x2000 it is stored in the bounds table as (32-bits here for brevity): lower: 0x00001000 upper: 0xffffdfff That is so that an all 0's entry: lower: 0x00000000 upper: 0x00000000 corresponds to the "init" bounds which store a *range* of: 0x00000000 -> 0xffffffff That is, by far, the common case, and that lets us use the zero page, or deduplicate the memory, etc... The 'upper' stored in the table is gibberish to print by itself, so we print ~upper to get the *actual*, logical, human-readable value printed out. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.027BB9B0@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace #BR exceptionsDave Hansen
This is the first in a series of MPX tracing patches. I've found these extremely useful in the process of debugging applications and the kernel code itself. This exception hooks in to the bounds (#BR) exception very early and allows capturing the key registers which would influence how the exception is handled. Note that bndcfgu/bndstatus are technically still 64-bit registers even in 32-bit mode. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.5FE2619A@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASKQiaowei Ren
MPX_BNDCFG_ADDR_MASK is defined two times, so this patch removes redundant one. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183702.5F129376@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Clean up the code by not passing a task pointer around when unnecessaryDave Hansen
The MPX code can only work on the current task. You can not, for instance, enable MPX management in another process or thread. You can also not handle a fault for another process or thread. Despite this, we pass a task_struct around prolifically. This patch removes all of the task struct passing for code paths where the code can not deal with another task (which turns out to be all of them). This has no functional changes. It's just a cleanup. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183702.6A81DA2C@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Use the new get_xsave_field_ptr()APIDave Hansen
The MPX registers (bndcsr/bndcfgu/bndstatus) are not directly accessible via normal instructions. They essentially act as if they were floating point registers and are saved/restored along with those registers. There are two main paths in the MPX code where we care about the contents of these registers: 1. #BR (bounds) faults 2. the prctl() code where we are setting MPX up Both of those paths _might_ be called without the FPU having been used. That means that 'tsk->thread.fpu.state' might never be allocated. Also, fpu_save_init() is not preempt-safe. It was a bug to call it without disabling preemption. The new get_xsave_addr() calls unlazy_fpu() instead and properly disables preemption. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave@sr71.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183701.BC0D37CF@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/fpu/xstate: Wrap get_xsave_addr() to make it saferDave Hansen
The MPX code appears is calling a low-level FPU function (copy_fpregs_to_fpstate()). This function is not able to be called in all contexts, although it is safe to call directly in some cases. Although probably correct, the current code is ugly and potentially error-prone. So, add a wrapper that calls the (slightly) higher-level fpu__save() (which is preempt- safe) and also ensures that we even *have* an FPU context (in the case that this was called when in lazy FPU mode). Ingo had this to say about the details about when we need preemption disabled: > it's indeed generally unsafe to access/copy FPU registers with preemption enabled, > for two reasons: > > - on older systems that use FSAVE the instruction destroys FPU register > contents, which has to be handled carefully > > - even on newer systems if we copy to FPU registers (which this code doesn't) > then we don't want a context switch to occur in the middle of it, because a > context switch will write to the fpstate, potentially overwriting our new data > with old FPU state. > > But it's safe to access FPU registers with preemption enabled in a couple of > special cases: > > - potentially destructively saving FPU registers: the signal handling code does > this in copy_fpstate_to_sigframe(), because it can rely on the signal restore > side to restore the original FPU state. > > - reading FPU registers on modern systems: we don't do this anywhere at the > moment, mostly to keep symmetry with older systems where FSAVE is > destructive. > > - initializing FPU registers on modern systems: fpu__clear() does this. Here > it's safe because we don't copy from the fpstate. > > - directly writing FPU registers from user-space memory (!). We do this in > fpu__restore_sig(), and it's safe because neither context switches nor > irq-handler FPU use can corrupt the source context of the copy (which is > user-space memory). > > Note that the MPX code's current use of copy_fpregs_to_fpstate() was safe I think, > because: > > - MPX is predicated on eagerfpu, so the destructive F[N]SAVE instruction won't be > used. > > - the code was only reading FPU registers, and was doing it only in places that > guaranteed that an FPU state was already active (i.e. didn't do it in > kthreads) Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave@sr71.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183700.AA881696@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07preempt: Use preempt_schedule_context() as the official tracing preemption pointFrederic Weisbecker
preempt_schedule_context() is a tracing safe preemption point but it's only used when CONFIG_CONTEXT_TRACKING=y. Other configs have tracing recursion issues since commit: b30f0e3ffedf ("sched/preempt: Optimize preemption operations on __schedule() callers") introduced function based preemp_count_*() ops. Lets make it available on all configs and give it a more appropriate name for its new position. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1433432349-1021-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07x86/microcode: Correct CPU family related variable typesAndy Shevchenko
Change the type of variables and function prototypes to be in alignment with what the x86_*() / __x86_*() family/model functions return. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1433436928-31903-21-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-05Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - early_idt_handlers[] fix that fixes the build with bleeding edge tooling - build warning fix on GCC 5.1 - vm86 fix plus self-test to make it harder to break it again" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers x86/asm/entry/32, selftests: Add a selftest for kernel entries from VM86 mode x86/boot: Add CONFIG_PARAVIRT_SPINLOCKS quirk to arch/x86/boot/compressed/misc.h x86/asm/entry/32: Really make user_mode() work correctly for VM86 mode
2015-06-02x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlersAndy Lutomirski
The early_idt_handlers asm code generates an array of entry points spaced nine bytes apart. It's not really clear from that code or from the places that reference it what's going on, and the code only works in the first place because GAS never generates two-byte JMP instructions when jumping to global labels. Clean up the code to generate the correct array stride (member size) explicitly. This should be considerably more robust against screw-ups, as GAS will warn if a .fill directive has a negative count. Using '. =' to advance would have been even more robust (it would generate an actual error if it tried to move backwards), but it would pad with nulls, confusing anyone who tries to disassemble the code. The new scheme should be much clearer to future readers. While we're at it, improve the comments and rename the array and common code. Binutils may start relaxing jumps to non-weak labels. If so, this change will fix our build, and we may need to backport this change. Before, on x86_64: 0000000000000000 <early_idt_handlers>: 0: 6a 00 pushq $0x0 2: 6a 00 pushq $0x0 4: e9 00 00 00 00 jmpq 9 <early_idt_handlers+0x9> 5: R_X86_64_PC32 early_idt_handler-0x4 ... 48: 66 90 xchg %ax,%ax 4a: 6a 08 pushq $0x8 4c: e9 00 00 00 00 jmpq 51 <early_idt_handlers+0x51> 4d: R_X86_64_PC32 early_idt_handler-0x4 ... 117: 6a 00 pushq $0x0 119: 6a 1f pushq $0x1f 11b: e9 00 00 00 00 jmpq 120 <early_idt_handler> 11c: R_X86_64_PC32 early_idt_handler-0x4 After: 0000000000000000 <early_idt_handler_array>: 0: 6a 00 pushq $0x0 2: 6a 00 pushq $0x0 4: e9 14 01 00 00 jmpq 11d <early_idt_handler_common> ... 48: 6a 08 pushq $0x8 4a: e9 d1 00 00 00 jmpq 120 <early_idt_handler_common> 4f: cc int3 50: cc int3 ... 117: 6a 00 pushq $0x0 119: 6a 1f pushq $0x1f 11b: eb 03 jmp 120 <early_idt_handler_common> 11d: cc int3 11e: cc int3 11f: cc int3 Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Binutils <binutils@sourceware.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-02Merge branch 'linus' into sched/core, to resolve conflictIngo Molnar
Conflicts: arch/sparc/include/asm/topology_64.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-29x86/asm/entry/32: Really make user_mode() work correctly for VM86 modeJan Beulich
While commit efa7045103 ("x86/asm/entry: Make user_mode() work correctly if regs came from VM86 mode") claims that "user_mode() is now identical to user_mode_vm()", this wasn't actually the case - no prior commit made it so. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5566EB0D020000780007E655@mail.emea.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27tools/power turbostat: enable turbostat to support Knights Landing (KNL)Dasaratharaman Chandramouli
Changes mainly to account for minor differences in Knights Landing(KNL): 1. KNL supports C1 and C6 core states. 2. KNL supports PC2, PC3 and PC6 package states. 3. KNL has a different encoding of the TURBO_RATIO_LIMIT MSR Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27x86: Remove cpu_sibling_mask() and cpu_core_mask()Bartosz Golaszewski
These functions are arch-specific and duplicate the functionality of macros defined in linux/include/topology.h. Remove them as all the callers in x86 have now switched to using the topology_**_cpumask() family. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/1432645896-12588-10-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27sched/topology: Rename topology_thread_cpumask() to topology_sibling_cpumask()Bartosz Golaszewski
Rename topology_thread_cpumask() to topology_sibling_cpumask() for more consistency with scheduler code. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/1432645896-12588-2-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Make WARN_ON_FPU() more robust in the !CONFIG_X86_DEBUG_FPU caseIngo Molnar
Make sure the WARN_ON_FPU() macro consumes the macro argument, to avoid 'unused variable' build warnings if the only use of a variable is in debugging code. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Simplify copy_kernel_to_xregs_booting()Ingo Molnar
copy_kernel_to_xregs_booting() has a second parameter that is the mask of xfeatures that should be copied - but this parameter is always -1. Simplify the call site of this function, this also makes it more similar to the function call signature of other copy_kernel_to*regs() functions. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Standardize the parameter type of copy_kernel_to_fpregs()Ingo Molnar
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions in line with the parameter passing convention of other kernel-to-FPU-registers copying functions: pass around an in-memory FPU register state pointer, instead of struct fpu *. NOTE: This patch also changes the assembly constraint of the FXSAVE-leak workaround from 'fpu->fpregs_active' to 'fpstate' - but that is fine, as we only need a valid memory address there for the FILDL instruction. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Remove error return values from copy_kernel_to_*regs() functionsIngo Molnar
None of the copy_kernel_to_*regs() FPU register copying functions are supposed to fail, and all of them have debugging checks that enforce this. Remove their return values and simplify their call sites, which have redundant error checks and error handling code paths. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Rename copy_fpstate_to_fpregs() to copy_kernel_to_fpregs()Ingo Molnar
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions in line with the naming of other kernel-to-FPU-registers copying functions. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Add debugging checks to all copy_kernel_to_*() functionsIngo Molnar
Copying from in-kernel FPU context buffers to FPU registers are never supposed to fault. Add debugging checks to copy_kernel_to_fxregs() and copy_kernel_to_fregs() to double check this assumption. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Rename fpu__activate_fpstate() to fpu__activate_fpstate_write()Ingo Molnar
Remaining users of fpu__activate_fpstate() are all places that want to modify FPU registers, rename the function to fpu__activate_fpstate_write() according to this usage. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Split out the fpu__activate_fpstate_read() methodIngo Molnar
Currently fpu__activate_fpstate() is used for two distinct purposes: - read access by ptrace and core dumping, where in the core dumping case the current task's FPU state may be examined as well. - write access by ptrace, which modifies FPU registers and expects the modified registers to be reloaded on the next context switch. Split out the reading side into fpu__activate_fpstate_read(). ( Note that this is just a pure duplication of fpu__activate_fpstate() for the time being, we'll optimize the new function in the next patch. ) Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Fix FPU register read access to the current taskIngo Molnar
Bobby Powers reported the following FPU warning during ELF coredumping: WARNING: CPU: 0 PID: 27452 at arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x8a/0xa0() This warning unearthed an invalid assumption about fpu__activate_stopped() that I added in: 67e97fc2ec57 ("x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check") the old init_fpu() function had an (intentional but obscure) side effect: when FPU registers are accessed for the current task, for reading, then it synchronized live in-register FPU state with the fpstate by saving it. So fix this bug by saving the FPU if we are the current task. We'll still warn in fpu__save() if this is called for not yet stopped child tasks, so the debugging check is still preserved. Also rename the function to fpu__activate_fpstate(), because it's not exclusively used for stopped tasks, but for the current task as well. ( Note that this bug calls for a cleaner separation of access-for-read and access-for-modification FPU methods, but we'll do that in separate patches. ) Reported-by: Bobby Powers <bobbypowers@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Micro-optimize the copy_xregs_to_kernel*() and ↵Ingo Molnar
copy_kernel_to_xregs*() functions The copy_xregs_to_kernel*() and copy_kernel_to_xregs*() functions are used to copy FPU registers to kernel memory and vice versa. They are never expected to fail, yet they have a return code, mostly because that way they can share the assembly macros with the copy*user*() functions. This error code is then silently ignored by the context switching and other code - which made the bug in: b8c1b8ea7b21 ("x86/fpu: Fix FPU state save area alignment bug") harder to fix than necessary. So remove the return values and check for no faults when FPU debugging is enabled in the .config. This improves the eagerfpu context switching fast path by a couple of instructions, when FPU debugging is disabled: ffffffff810407fa: 89 c2 mov %eax,%edx ffffffff810407fc: 48 0f ae 2f xrstor64 (%rdi) ffffffff81040800: 31 c0 xor %eax,%eax -ffffffff81040802: eb 0a jmp ffffffff8104080e <__switch_to+0x321> +ffffffff81040802: eb 16 jmp ffffffff8104081a <__switch_to+0x32d> ffffffff81040804: 31 c0 xor %eax,%eax ffffffff81040806: 48 0f ae 8b c0 05 00 fxrstor64 0x5c0(%rbx) ffffffff8104080d: 00 Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Improve the initialization logic of 'err' around xstate_fault() ↵Ingo Molnar
constraints There's a confusing aspect of how xstate_fault() constraints are handled by the FPU register/memory copying functions in fpu/internal.h: they use "0" (0) to signal that the asm code will not always set 'err' to a valid value. But 'err' is already initialized to 0 in C code, which is duplicated by the asm() constraint. Should the initialization value ever be changed, it might become subtly inconsistent with the not too clear asm() constraint. Use 'err' as the value of the input variable instead, to clarify this all. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Improve xstate_fault() handlingIngo Molnar
There are two problems with xstate_fault handling: - The xstate_fault() macro takes an argument, but that's propagated into the assembly named label as well. This is technically correct currently but might result in failures if anytime a more complex argument is used. So use a separate '_err' name instead for the label. - All the xstate_fault() using functions have an error variable named 'err', which is an output variable to the asm() they are using. The problem is, it's not always set by the asm(), in which case the compiler might optimize out its initialization, so that the C variable 'err' might become corrupted after the asm() - confusing anyone who tries to take advantage of this variable after the asm(). Mark it an input variable as well. This is a latent bug currently, but an upcoming debug patch will make use of 'err'. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Rename xstate related 'fx' references to 'xstate'Ingo Molnar
So the xstate code was probably first copied from the fxregs code, hence it carried over the 'fx' naming for the state pointer variable. But this is slightly confusing, as we usually on call the (legacy) MMX/SSE state 'fx', both in data structures and in the functions build around FXSAVE/FXRSTOR. So rename it to 'xstate' to make it more apparent what it is related to. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Move the xstate copying functions into fpu/internal.hIngo Molnar
All the other register<-> memory copying functions are defined in fpu/internal.h, so move the xstate variants there too. Beyond being more consistent, this also allows FPU debugging checks to be added to them. (Because they can now use the macros defined in fpu/internal.h.) Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25Merge branch 'linus' into x86/fpuIngo Molnar
Resolve semantic conflict in arch/x86/kvm/cpuid.c with: c447e76b4cab ("kvm/fpu: Enable eager restore kvm FPU for MPX") By removing the FPU internal include files. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Fix FPU state save area alignment bugIngo Molnar
On most configs task-struct is cache line aligned, which makes the XSAVE area's 64-byte required alignment work out fine. But on some .config's task_struct is aligned only to 16 bytes (enforced by ARCH_MIN_TASKALIGN), which makes things like fpu__copy() (that XSAVEOPT uses) not work so well. I broke this in: 7366ed771f6e ("x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)") which embedded the fpstate in the task_struct. The alignment requirements of the FPU code were originally present in ARCH_MIN_TASKALIGN, which still has a value of 16, which was the alignment requirement of the FPU state area prior XSAVE. But this link was not documented (and not required) and the link got lost when the FPU state area was made dynamic years ago. With XSAVEOPT the minimum alignment requirment went up to 64 bytes, and the embedding of the FPU state area in task_struct exposed it again - and '16' was not increased to '64'. So fix this bug, but also try to address the underlying lost link of information that made it easier to happen: - document ARCH_MIN_TASKALIGN a bit better - use alignof() to recover the current alignment requirements. This would work in the future as well, should the alignment requirements go up to 128 bytes with things like AVX512. ( We should probably also use the vSMP alignment rules for all of x86, but that's for another patch. ) Reported-by: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "This includes a fix for two oopses, one on PPC and on x86. The rest is fixes for bugs with newer Intel processors" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm/fpu: Enable eager restore kvm FPU for MPX Revert "KVM: x86: drop fpu_activate hook" kvm: fix crash in kvm_vcpu_reload_apic_access_page KVM: MMU: fix SMAP virtualization KVM: MMU: fix CR4.SMEP=1, CR0.WP=0 with shadow pages KVM: MMU: fix smap permission check KVM: PPC: Book3S HV: Fix list traversal in error case
2015-05-20kvm/fpu: Enable eager restore kvm FPU for MPXLiang Li
The MPX feature requires eager KVM FPU restore support. We have verified that MPX cannot work correctly with the current lazy KVM FPU restore mechanism. Eager KVM FPU restore should be enabled if the MPX feature is exposed to VM. Signed-off-by: Yang Zhang <yang.z.zhang@intel.com> Signed-off-by: Liang Li <liang.z.li@intel.com> [Also activate the FPU on AMD processors. - Paolo] Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-20Revert "KVM: x86: drop fpu_activate hook"Paolo Bonzini
This reverts commit 4473b570a7ebb502f63f292ccfba7df622e5fdd3. We'll use the hook again. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-20x86/fpu/xstate: Use explicit parameter in xstate_fault()Ingo Molnar
While looking at xstate.h it took me some time to realize that 'xstate_fault' uses 'err' as a silent parameter. This is not obvious at the call site, at all. Make it an explicit macro argument, so that the syntactic connection is easier to see. Also explain xstate_fault() a bit. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Reorganize fpu/internal.hIngo Molnar
fpu/internal.h has grown organically, with not much high level structure, which hurts its readability. Organize the various definitions into 5 sections: - high level FPU state functions - FPU/CPU feature flag helpers - fpstate handling functions - FPU context switching helpers - misc helper functions Other related changes: - Move MXCSR_DEFAULT to fpu/types.h. - drop the unused X87_FSW_ES define No change in functionality. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging codeIngo Molnar
There are various internal FPU state debugging checks that never trigger in practice, but which are useful for FPU code development. Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a couple of new ones. The size difference is about 0.5K of code on defconfig: text data bss filename 15028906 2578816 1638400 vmlinux 15029430 2578816 1638400 vmlinux ( Keep this enabled by default until the new FPU code is debugged. ) Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Pass 'struct fpu' to fpu__restore()Ingo Molnar
This cleans up the call sites and the function a bit, and also makes it more symmetric with the other high level FPU state handling functions. It's still only valid for the current task, as we copy to the FPU registers of the current CPU. No change in functionality. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>