summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2017-05-25arm: KVM: Do not use stack-protector to compile HYP codeMarc Zyngier
commit 501ad27c67ed0b90df465f23d33e9aed64058a47 upstream. We like living dangerously. Nothing explicitely forbids stack-protector to be used in the HYP code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at HYP. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25arm64: KVM: Do not use stack-protector to compile EL2 codeMarc Zyngier
commit cde13b5dad60471886a3bccb4f4134c647c4a9dc upstream. We like living dangerously. Nothing explicitely forbids stack-protector to be used in the EL2 code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at EL2. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/tm: Fix FP and VMX register corruptionMichael Neuling
commit f48e91e87e67b56bef63393d1a02c6e22c1d7078 upstream. In commit dc3106690b20 ("powerpc: tm: Always use fp_state and vr_state to store live registers"), a section of code was removed that copied the current state to checkpointed state. That code should not have been removed. When an FP (Floating Point) unavailable is taken inside a transaction, we need to abort the transaction. This is because at the time of the tbegin, the FP state is bogus so the state stored in the checkpointed registers is incorrect. To fix this, we treclaim (to get the checkpointed GPRs) and then copy the thread_struct FP live state into the checkpointed state. We then trecheckpoint so that the FP state is correctly restored into the CPU. The copying of the FP registers from live to checkpointed is what was missing. This simplifies the logic slightly from the original patch. tm_reclaim_thread() will now always write the checkpointed FP state. Either the checkpointed FP state will be written as part of the actual treclaim (in tm.S), or it'll be a copy of the live state. Which one we use is based on MSR[FP] from userspace. Similarly for VMX. Fixes: dc3106690b20 ("powerpc: tm: Always use fp_state and vr_state to store live registers") Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: cyrilbur@gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/64e: Fix hang when debugging programs with relocated kernelLiuHailong
commit fd615f69a18a9d4aa5ef02a1dc83f319f75da8e7 upstream. Debug interrupts can be taken during interrupt entry, since interrupt entry does not automatically turn them off. The kernel will check whether the faulting instruction is between [interrupt_base_book3e, __end_interrupts], and if so clear MSR[DE] and return. However, when the kernel is built with CONFIG_RELOCATABLE, it can't use LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) and LOAD_REG_IMMEDIATE(r15,__end_interrupts), as they ignore relocation. Thus, if the kernel is actually running at a different address than it was built at, the address comparison will fail, and the exception entry code will hang at kernel_dbg_exc. r2(toc) is also not usable here, as r2 still holds data from the interrupted context, so LOAD_REG_ADDR() doesn't work either. So we use the *name@got* to get the EV of two labels directly. Test programs test.c shows as follows: int main(int argc, char *argv[]) { if (access("/proc/sys/kernel/perf_event_paranoid", F_OK) == -1) printf("Kernel doesn't have perf_event support\n"); } Steps to reproduce the bug, for example: 1) ./gdb ./test 2) (gdb) b access 3) (gdb) r 4) (gdb) s Signed-off-by: Liu Hailong <liu.hailong6@zte.com.cn> Signed-off-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Reviewed-by: Liu Song <liu.song11@zte.com.cn> Reviewed-by: Huang Jian <huang.jian@zte.com.cn> [scottwood: cleaned up commit message, and specified bad behavior as a hang rather than an oops to correspond to mainline kernel behavior] Fixes: 1cb6e0649248 ("powerpc/book3e: support CONFIG_RELOCATABLE") Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/iommu: Do not call PageTransHuge() on tail pagesAlexey Kardashevskiy
commit e889e96e98e8da97bd39e46b7253615eabe14397 upstream. The CMA pages migration code does not support compound pages at the moment so it performs few tests before proceeding to actual page migration. One of the tests - PageTransHuge() - has VM_BUG_ON_PAGE(PageTail()) as it is designed to be called on head pages only. Since we also test for PageCompound(), and it contains PageTail() and PageHead(), we can simplify the check by leaving just PageCompound() and therefore avoid possible VM_BUG_ON_PAGE. Fixes: 2e5bbb5461f1 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/pseries: Fix of_node_put() underflow during DLPAR removeTyrel Datwyler
commit 68baf692c435339e6295cb470ea5545cbc28160e upstream. Historically struct device_node references were tracked using a kref embedded as a struct field. Commit 75b57ecf9d1d ("of: Make device nodes kobjects so they show up in sysfs") (Mar 2014) refactored device_nodes to be kobjects such that the device tree could by more simply exposed to userspace using sysfs. Commit 0829f6d1f69e ("of: device_node kobject lifecycle fixes") (Mar 2014) followed up these changes to better control the kobject lifecycle and in particular the referecne counting via of_node_get(), of_node_put(), and of_node_init(). A result of this second commit was that it introduced an of_node_put() call when a dynamic node is detached, in of_node_remove(), that removes the initial kobj reference created by of_node_init(). Traditionally as the original dynamic device node user the pseries code had assumed responsibilty for releasing this final reference in its platform specific DLPAR detach code. This patch fixes a refcount underflow introduced by commit 0829f6d1f6, and recently exposed by the upstreaming of the recount API. Messages like the following are no longer seen in the kernel log with this patch following DLPAR remove operations of cpus and pci devices. rpadlpar_io: slot PHB 72 removed refcount_t: underflow; use-after-free. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110 Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> [mpe: Make change log commit references more verbose] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/book3s/mce: Move add_taint() later in virtual modeMahesh Salgaonkar
commit d93b0ac01a9ce276ec39644be47001873d3d183c upstream. machine_check_early() gets called in real mode. The very first time when add_taint() is called, it prints a warning which ends up calling opal call (that uses OPAL_CALL wrapper) for writing it to console. If we get a very first machine check while we are in opal we are doomed. OPAL_CALL overwrites the PACASAVEDMSR in r13 and in this case when we are done with MCE handling the original opal call will use this new MSR on it's way back to opal_return. This usually leads to unexpected behaviour or the kernel to panic. Instead move the add_taint() call later in the virtual mode where it is safe to call. This is broken with current FW level. We got lucky so far for not getting very first MCE hit while in OPAL. But easily reproducible on Mambo. Fixes: 27ea2c420cad ("powerpc: Set the correct kernel taint on machine check errors.") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/eeh: Avoid use after free in eeh_handle_special_event()Russell Currey
commit daeba2956f32f91f3493788ff6ee02fb1b2f02fa upstream. eeh_handle_special_event() is called when an EEH event is detected but can't be narrowed down to a specific PE. This function looks through every PE to find one in an erroneous state, then calls the regular event handler eeh_handle_normal_event() once it knows which PE has an error. However, if eeh_handle_normal_event() found that the PE cannot possibly be recovered, it will free it, rendering the passed PE stale. This leads to a use after free in eeh_handle_special_event() as it attempts to clear the "recovering" state on the PE after eeh_handle_normal_event() returns. Thus, make sure the PE is valid when attempting to clear state in eeh_handle_special_event(). Fixes: 8a6b1bc70dbb ("powerpc/eeh: EEH core to handle special event") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/mm: Ensure IRQs are off in switch_mm()David Gibson
commit 9765ad134a00a01cbcc69c78ff6defbfad209bc5 upstream. powerpc expects IRQs to already be (soft) disabled when switch_mm() is called, as made clear in the commit message of 9c1e105238c4 ("powerpc: Allow perf_counters to access user memory at interrupt time"). Aside from any race conditions that might exist between switch_mm() and an IRQ, there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't followed at some point by an IRQ enable then interrupts will remain disabled until we return to userspace. It is true that when switch_mm() is called from the scheduler IRQs are off, but not when it's called by use_mm(). Looking closer we see that last year in commit f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") this was made more explicit by the addition of switch_mm_irqs_off() which is now called by the scheduler, vs switch_mm() which is used by use_mm(). Arguably it is a bug in use_mm() to call switch_mm() in a different context than it expects, but fixing that will take time. This was discovered recently when vhost started throwing warnings such as: BUG: sleeping function called from invalid context at kernel/mutex.c:578 in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760 no locks held by vhost-10760/10768. irq event stamp: 10 hardirqs last enabled at (9): _raw_spin_unlock_irq+0x40/0x80 hardirqs last disabled at (10): switch_slb+0x2e4/0x490 softirqs last enabled at (0): copy_process+0x5e8/0x1260 softirqs last disabled at (0): (null) Call Trace: show_stack+0x88/0x390 (unreliable) dump_stack+0x30/0x44 __might_sleep+0x1c4/0x2d0 mutex_lock_nested+0x74/0x5c0 cgroup_attach_task_all+0x5c/0x180 vhost_attach_cgroups_work+0x58/0x80 [vhost] vhost_worker+0x24c/0x3d0 [vhost] kthread+0xec/0x100 ret_from_kernel_thread+0x5c/0xd4 Prior to commit 04b96e5528ca ("vhost: lockless enqueuing") (Aug 2016) the vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(), which had the effect of reenabling IRQs. Since that commit removed the locking in vhost_worker() the body of the vhost_worker() loop now runs with interrupts off causing the warnings. This patch addresses the problem by making the powerpc code mirror the x86 code, ie. we disable interrupts in switch_mm(), and optimise the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Flesh out/rewrite change log, add stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/cputime: fix incorrect system timeMartin Schwidefsky
commit 07a63cbe8bcb6ba72fb989dcab1ec55ec6c36c7e upstream. git commit c5328901aa1db134 "[S390] entry[64].S improvements" removed the update of the exit_timer lowcore field from the critical section cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done blocks. If the PSW is updated by the critical section cleanup to point to user space again, the interrupt entry code will do a vtime calculation after the cleanup completed with an exit_timer value which has *not* been updated. Due to this incorrect system time deltas are calculated. If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry time of the interrupt. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/kdump: Add final noteMichael Holzheu
commit dcc00b79fc3d076832f7240de8870f492629b171 upstream. Since linux v3.14 with commit 38dfac843cb6d7be1 ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreded as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25x86: fix 32-bit case of __get_user_asm_u64()Linus Torvalds
commit 33c9e9729033387ef0521324c62e7eba529294af upstream. The code to fetch a 64-bit value from user space was entirely buggered, and has been since the code was merged in early 2016 in commit b2f680380ddf ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit kernels"). Happily the buggered routine is almost certainly entirely unused, since the normal way to access user space memory is just with the non-inlined "get_user()", and the inlined version didn't even historically exist. The normal "get_user()" case is handled by external hand-written asm in arch/x86/lib/getuser.S that doesn't have either of these issues. There were two independent bugs in __get_user_asm_u64(): - it still did the STAC/CLAC user space access marking, even though that is now done by the wrapper macros, see commit 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses"). This didn't result in a semantic error, it just means that the inlined optimized version was hugely less efficient than the allegedly slower standard version, since the CLAC/STAC overhead is quite high on modern Intel CPU's. - the double register %eax/%edx was marked as an output, but the %eax part of it was touched early in the asm, and could thus clobber other inputs to the asm that gcc didn't expect it to touch. In particular, that meant that the generated code could look like this: mov (%eax),%eax mov 0x4(%eax),%edx where the load of %edx obviously was _supposed_ to be from the 32-bit word that followed the source of %eax, but because %eax was overwritten by the first instruction, the source of %edx was basically random garbage. The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark the 64-bit output as early-clobber to let gcc know that no inputs should alias with the output register. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulationWanpeng Li
commit cbfc6c9184ce71b52df4b1d82af5afc81a709178 upstream. Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation. - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only, a read should be ignored) in guest can get a random number. - "rep insb" instruction to access PIT register port 0x43 can control memcpy() in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes, which will disclose the unimportant kernel memory in host but no crash. The similar test program below can reproduce the read out-of-bounds vulnerability: void hexdump(void *mem, unsigned int len) { unsigned int i, j; for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++) { /* print offset */ if(i % HEXDUMP_COLS == 0) { printf("0x%06x: ", i); } /* print hex data */ if(i < len) { printf("%02x ", 0xFF & ((char*)mem)[i]); } else /* end of block, just aligning for ASCII dump */ { printf(" "); } /* print ASCII dump */ if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1)) { for(j = i - (HEXDUMP_COLS - 1); j <= i; j++) { if(j >= len) /* end of block, not really printing */ { putchar(' '); } else if(isprint(((char*)mem)[j])) /* printable char */ { putchar(0xFF & ((char*)mem)[j]); } else /* other char */ { putchar('.'); } } putchar('\n'); } } } int main(void) { int i; if (iopl(3)) { err(1, "set iopl unsuccessfully\n"); return -1; } static char buf[0x40]; /* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x40, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x41, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x42, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x44, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x45, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); /* ins port 0x40 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x40, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); /* ins port 0x43 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); return 0; } The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation w/o clear after using which results in some random datas are left over in the buffer. Guest reads port 0x43 will be ignored since it is write only, however, the function kernel_pio() can't distigush this ignore from successfully reads data from device's ioport. There is no new data fill the buffer from port 0x43, however, emulator_pio_in_emulated() will copy the stale data in the buffer to the guest unconditionally. This patch fixes it by clearing the buffer before in instruction emulation to avoid to grant guest the stale data in the buffer. In addition, string I/O is not supported for in kernel device. So there is no iteration to read ioport %RCX times for string I/O. The function kernel_pio() just reads one round, and then copy the io size * %RCX to the guest unconditionally, actually it copies the one round ioport data w/ other random datas which are left over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by introducing the string I/O support for in kernel device in order to grant the right ioport datas to the guest. Before the patch: 0x000000: fe 38 93 93 ff ff ab ab .8...... 0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ After the patch: 0x000000: 1e 02 f8 00 ff ff ab ab ........ 0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: d2 e2 d2 df d2 db d2 d7 ........ 0x000008: d2 d3 d2 cf d2 cb d2 c7 ........ 0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........ 0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: 00 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 00 00 00 00 ........ 0x000018: 00 00 00 00 00 00 00 00 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Moguofang <moguofang@huawei.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25KVM: x86: Fix potential preemption when get the current kvmclock timestampWanpeng Li
commit e2c2206a18993bc9f62393d49c7b2066c3845b25 upstream. BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809 caller is __this_cpu_preempt_check+0x13/0x20 CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13 Call Trace: dump_stack+0x99/0xce check_preemption_disabled+0xf5/0x100 __this_cpu_preempt_check+0x13/0x20 get_kvmclock_ns+0x6f/0x110 [kvm] get_time_ref_counter+0x5d/0x80 [kvm] kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm] kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 RIP: 0033:0x7f9d164ed357 ? __this_cpu_preempt_check+0x13/0x20 This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/ CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled. Safe access to per-CPU data requires a couple of constraints, though: the thread working with the data cannot be preempted and it cannot be migrated while it manipulates per-CPU variables. If the thread is preempted, the thread that replaces it could try to work with the same variables; migration to another CPU could also cause confusion. However there is no preemption disable when reads host per-CPU tsc rate to calculate the current kvmclock timestamp. This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both __this_cpu_read() and rdtsc() are not preempted. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25KVM: x86: Fix load damaged SSEx MXCSR registerWanpeng Li
commit a575813bfe4bc15aba511a5e91e61d242bff8b9d upstream. Reported by syzkaller: BUG: unable to handle kernel paging request at ffffffffc07f6a2e IP: report_bug+0x94/0x120 PGD 348e12067 P4D 348e12067 PUD 348e14067 PMD 3cbd84067 PTE 80000003f7e87161 Oops: 0003 [#1] SMP CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G OE 4.11.0+ #8 task: ffff92fdfb525400 task.stack: ffffbda6c3d04000 RIP: 0010:report_bug+0x94/0x120 RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202 do_trap+0x156/0x170 do_error_trap+0xa3/0x170 ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? mark_held_locks+0x79/0xa0 ? retint_kernel+0x10/0x10 ? trace_hardirqs_off_thunk+0x1a/0x1c do_invalid_op+0x20/0x30 invalid_op+0x1e/0x30 RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm] kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm] kvm_vcpu_ioctl+0x384/0x780 [kvm] ? kvm_vcpu_ioctl+0x384/0x780 [kvm] ? sched_clock+0x13/0x20 ? __do_page_fault+0x2a0/0x550 do_vfs_ioctl+0xa4/0x700 ? up_read+0x1f/0x40 ? __do_page_fault+0x2a0/0x550 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 SDM mentioned that "The MXCSR has several reserved bits, and attempting to write a 1 to any of these bits will cause a general-protection exception(#GP) to be generated". The syzkaller forks' testcase overrides xsave area w/ random values and steps on the reserved bits of MXCSR register. The damaged MXCSR register values of guest will be restored to SSEx MXCSR register before vmentry. This patch fixes it by catching userspace override MXCSR register reserved bits w/ random values and bails out immediately. Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25ARM: tegra: paz00: Mark panel regulator as enabled on bootMarc Dietrich
commit 0c18927f51f4d390abdcf385bff5f995407ee732 upstream. Current U-Boot enables the display already. Marking the regulator as enabled on boot fixes sporadic panel initialization failures. Signed-off-by: Marc Dietrich <marvin24@gmx.de> Tested-by: Misha Komarovskiy <zombah@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20pstore: Fix flags to enable dumps on powerpcAnkit Kumar
commit 041939c1ec54208b42f5cd819209173d52a29d34 upstream. After commit c950fd6f201a kernel registers pstore write based on flag set. Pstore write for powerpc is broken as flags(PSTORE_FLAGS_DMESG) is not set for powerpc architecture. On panic, kernel doesn't write message to /fs/pstore/dmesg*(Entry doesn't gets created at all). This patch enables pstore write for powerpc architecture by setting PSTORE_FLAGS_DMESG flag. Fixes: c950fd6f201a ("pstore: Split pstore fragile flags") Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20arm64: KVM: Fix decoding of Rt/Rt2 when trapping AArch32 CP accessesMarc Zyngier
commit c667186f1c01ca8970c785888868b7ffd74e51ee upstream. Our 32bit CP14/15 handling inherited some of the ARMv7 code for handling the trapped system registers, completely missing the fact that the fields for Rt and Rt2 are now 5 bit wide, and not 4... Let's fix it, and provide an accessor for the most common Rt case. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20KVM: arm/arm64: fix races in kvm_psci_vcpu_onAndrew Jones
commit 6c7a5dce22b3f3cc44be098e2837fa6797edb8b8 upstream. Fix potential races in kvm_psci_vcpu_on() by taking the kvm->lock mutex. In general, it's a bad idea to allow more than one PSCI_CPU_ON to process the same target VCPU at the same time. One such problem that may arise is that one PSCI_CPU_ON could be resetting the target vcpu, which fills the entire sys_regs array with a temporary value including the MPIDR register, while another looks up the VCPU based on the MPIDR value, resulting in no target VCPU found. Resolves both races found with the kvm-unit-tests/arm/psci unit test. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reported-by: Levente Kurusa <lkurusa@redhat.com> Suggested-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20KVM: x86: fix user triggerable warning in kvm_apic_accept_events()David Hildenbrand
commit 28bf28887976d8881a3a59491896c718fade7355 upstream. If we already entered/are about to enter SMM, don't allow switching to INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events() will report a warning. Same applies if we are already in MP state INIT_RECEIVED and SMM is requested to be turned on. Refuse to set the VCPU events in this case. Fixes: cd7764fe9f73 ("KVM: x86: latch INITs while in system management mode") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20perf/x86: Fix Broadwell-EP DRAM RAPL eventsVince Weaver
commit 33b88e708e7dfa58dc896da2a98f5719d2eb315c upstream. It appears as though the Broadwell-EP DRAM units share the special units quirk with Haswell-EP/KNL. Without this patch, you get really high results (a single DRAM using 20W of power). The powercap driver in drivers/powercap/intel_rapl.c already has this change. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20um: Fix PTRACE_POKEUSER on x86_64Richard Weinberger
commit 9abc74a22d85ab29cef9896a2582a530da7e79bf upstream. This is broken since ever but sadly nobody noticed. Recent versions of GDB set DR_CONTROL unconditionally and UML dies due to a heap corruption. It turns out that the PTRACE_POKEUSER was copy&pasted from i386 and assumes that addresses are 4 bytes long. Fix that by using 8 as address size in the calculation. Reported-by: jie cao <cj3054@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20x86, pmem: Fix cache flushing for iovec write < 8 bytesBen Hutchings
commit 8376efd31d3d7c44bd05be337adde023cc531fa1 upstream. Commit 11e63f6d920d added cache flushing for unaligned writes from an iovec, covering the first and last cache line of a >= 8 byte write and the first cache line of a < 8 byte write. But an unaligned write of 2-7 bytes can still cover two cache lines, so make sure we flush both in that case. Fixes: 11e63f6d920d ("x86, pmem: fix broken __copy_user_nocache ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startupAshish Kalra
commit d594aa0277e541bb997aef0bc0a55172d8138340 upstream. The minimum size for a new stack (512 bytes) setup for arch/x86/boot components when the bootloader does not setup/provide a stack for the early boot components is not "enough". The setup code executing as part of early kernel startup code, uses the stack beyond 512 bytes and accidentally overwrites and corrupts part of the BSS section. This is exposed mostly in the early video setup code, where it was corrupting BSS variables like force_x, force_y, which in-turn affected kernel parameters such as screen_info (screen_info.orig_video_cols) and later caused an exception/panic in console_init(). Most recent boot loaders setup the stack for early boot components, so this stack overwriting into BSS section issue has not been exposed. Signed-off-by: Ashish Kalra <ashish@bluestacks.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170419152015.10011-1-ashishkalra@Ashishs-MacBook-Pro.local Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20xen: adjust early dom0 p2m handling to xen hypervisor behaviorJuergen Gross
commit 69861e0a52f8733355ce246f0db15e1b240ad667 upstream. When booted as pv-guest the p2m list presented by the Xen is already mapped to virtual addresses. In dom0 case the hypervisor might make use of 2M- or 1G-pages for this mapping. Unfortunately while being properly aligned in virtual and machine address space, those pages might not be aligned properly in guest physical address space. So when trying to obtain the guest physical address of such a page pud_pfn() and pmd_pfn() must be avoided as those will mask away guest physical address bits not being zero in this special case. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14xen: Revert commits da72ff5bfcb0 and 72a9b186292dBoris Ostrovsky
commit 84d582d236dc1f9085e741affc72e9ba061a67c2 upstream. Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741) established that commit 72a9b186292d ("xen: Remove event channel notification through Xen PCI platform device") (and thus commit da72ff5bfcb0 ("partially revert "xen: Remove event channel notification through Xen PCI platform device"")) are unnecessary and, in fact, prevent HVM guests from booting on Xen releases prior to 4.0 Therefore we revert both of those commits. The summary of that discussion is below: Here is the brief summary of the current situation: Before the offending commit (72a9b186292): 1) INTx does not work because of the reset_watches path. 2) The reset_watches path is only taken if you have Xen > 4.0 3) The Linux Kernel by default will use vector inject if the hypervisor support. So even INTx does not work no body running the kernel with Xen > 4.0 would notice. Unless he explicitly disabled this feature either in the kernel or in Xen (and this can only be disabled by modifying the code, not user-supported way to do it). After the offending commit (+ partial revert): 1) INTx is no longer support for HVM (only for PV guests). 2) Any HVM guest The kernel will not boot on Xen < 4.0 which does not have vector injection support. Since the only other mode supported is INTx which. So based on this summary, I think before commit (72a9b186292) we were in much better position from a user point of view. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14bpf, arm64: fix jit branch offset related to ldimm64Daniel Borkmann
[ Upstream commit ddc665a4bb4b728b4e6ecec8db1b64efa9184b9c ] When the instruction right before the branch destination is a 64 bit load immediate, we currently calculate the wrong jump offset in the ctx->offset[] array as we only account one instruction slot for the 64 bit load immediate although it uses two BPF instructions. Fix it up by setting the offset into the right slot after we incremented the index. Before (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 54ffff82 b.cs 0x00000020 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] After (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 540000a2 b.cs 0x00000044 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] Also, add a couple of test cases to make sure JITs pass this test. Tested on Cavium ThunderX ARMv8. The added test cases all pass after the fix. Fixes: 8eee539ddea0 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14sparc64: fix fault handling in NGbzero.S and GENbzero.SDave Aldridge
commit 3c7f62212018b904ae17f5636ead18a4dca3a88f upstream. When any of the functions contained in NGbzero.S and GENbzero.S vector through *bzero_from_clear_user, we may end up taking a fault when executing one of the store alternate address space instructions. If this happens, the exception handler does not restore the %asi register. This commit fixes the issue by introducing a new exception handler that ensures the %asi register is restored when a fault is handled. Orabug: 25577560 Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14MIPS: R2-on-R6 MULTU/MADDU/MSUBU emulation bugfixLeonid Yegoshin
commit d65e5677ad5b3a49c43f60ec07644dc1f87bbd2e upstream. MIPS instructions MULTU, MADDU and MSUBU emulation requires registers HI/LO to be converted to signed 32bits before 64bit sign extension on MIPS64. Bug was found on running MIPS32 R2 test application on MIPS64 R6 kernel. Fixes: b0a668fb2038 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6") Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Reported-by: Nikola.Veljkovic@imgtec.com Cc: paul.burton@imgtec.com Cc: yamada.masahiro@socionext.com Cc: akpm@linux-foundation.org Cc: andrea.gelmini@gelma.net Cc: macro@imgtec.com Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/14043/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14KVM: nVMX: do not leak PML full vmexit to L1Ladi Prosek
commit ab007cc94ff9d82f5a8db8363b3becbd946e58cf upstream. The PML feature is not exposed to guests so we should not be forwarding the vmexit either. This commit fixes BSOD 0x20001 (HYPERVISOR_ERROR) when running Hyper-V enabled Windows Server 2016 in L1 on hardware that supports PML. Fixes: 843e4330573c ("KVM: VMX: Add PML support in VMX") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14KVM: nVMX: initialize PML fields in vmcs02Ladi Prosek
commit 1fb883bb827ee8efc1cc9ea0154f953f8a219d38 upstream. L2 was running with uninitialized PML fields which led to incomplete dirty bitmap logging. This manifested as all kinds of subtle erratic behavior of the nested guest. Fixes: 843e4330573c ("KVM: VMX: Add PML support in VMX") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14Revert "KVM: nested VMX: disable perf cpuid reporting"Jim Mattson
commit 0b4c208d443ba2af82b4c70f99ca8df31e9a0020 upstream. This reverts commit bc6134942dbbf31c25e9bd7c876be5da81c9e1ce. A CPUID instruction executed in VMX non-root mode always causes a VM-exit, regardless of the leaf being queried. Fixes: bc6134942dbb ("KVM: nested VMX: disable perf cpuid reporting") Signed-off-by: Jim Mattson <jmattson@google.com> [The issue solved by bc6134942dbb has been resolved with ff651cb613b4 ("KVM: nVMX: Add nested msr load/restore algorithm").] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14x86/platform/intel-mid: Correct MSI IRQ line for watchdog deviceAndy Shevchenko
commit 80354c29025833acd72ddac1ffa21c6cb50128cd upstream. The interrupt line used for the watchdog is 12, according to the official Intel Edison BSP code. And indeed after fixing it we start getting an interrupt and thus the watchdog starts working again: [ 191.699951] Kernel panic - not syncing: Kernel Watchdog Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Cohen <david.a.cohen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 78a3bb9e408b ("x86: intel-mid: add watchdog platform code for Merrifield") Link: http://lkml.kernel.org/r/20170312150744.45493-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14kprobes/x86: Fix kernel panic when certain exception-handling addresses are ↵Masami Hiramatsu
probed commit 75013fb16f8484898eaa8d0b08fed942d790f029 upstream. Fix to the exception table entry check by using probed address instead of the address of copied instruction. This bug may cause unexpected kernel panic if user probe an address where an exception can happen which should be fixup by __ex_table (e.g. copy_from_user.) Unless user puts a kprobe on such address, this doesn't cause any problem. This bug has been introduced years ago, by commit: 464846888d9a ("x86/kprobes: Fix a bug which can modify kernel code permanently"). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 464846888d9a ("x86/kprobes: Fix a bug which can modify kernel code permanently") Link: http://lkml.kernel.org/r/148829899399.28855.12581062400757221722.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14x86/pci-calgary: Fix iommu_free() comparison of unsigned expression >= 0Nikola Pajkovsky
commit 68dee8e2f2cacc54d038394e70d22411dee89da2 upstream. commit 8fd524b355da ("x86: Kill bad_dma_address variable") has killed bad_dma_address variable and used instead of macro DMA_ERROR_CODE which is always zero. Since dma_addr is unsigned, the statement dma_addr >= DMA_ERROR_CODE is always true, and not needed. arch/x86/kernel/pci-calgary_64.c: In function ‘iommu_free’: arch/x86/kernel/pci-calgary_64.c:299:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if (unlikely((dma_addr >= DMA_ERROR_CODE) && (dma_addr < badend))) { Fixes: 8fd524b355da ("x86: Kill bad_dma_address variable") Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Cc: iommu@lists.linux-foundation.org Cc: Jon Mason <jdmason@kudzu.us> Cc: Muli Ben-Yehuda <mulix@mulix.org> Link: http://lkml.kernel.org/r/7612c0f9dd7c1290407dbf8e809def922006920b.1479161177.git.npajkovsky@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14x86/ioapic: Restore IO-APIC irq_chip retrigger callbackRuslan Ruslichenko
commit a9b4f08770b415f30f2fb0f8329a370c8f554aa3 upstream. commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC chip and did not add it to the new IO-APIC-IR irq chip. There is no harm because the interrupts are resent in software when the retrigger callback is NULL, but it's less efficient. So restore them. [ tglx: Massaged changelog ] Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces") Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com> Cc: xe-linux-external@cisco.com Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14ARM: dts: sun7i: lamobo-r1: Fix CPU port RGMII settingsFlorian Fainelli
commit 0cdefd5b5485ee6eb3512a75739d09a4090176ed upstream. The CPU port of the BCM53125 is configured with RGMII (no delays) but this should actually be RGMII with transmit delay (rgmii-txid) because STMMAC takes care of inserting the transmitter delay. This fixes occasional packet loss encountered. Fixes: d7b9eaff5f0c ("ARM: dts: sun7i: Add BCM53125 switch nodes to the lamobo-r1 board") Reported-by: Hartmut Knaack <knaack.h@gmx.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14ARM: OMAP5 / DRA7: Fix HYP mode boot for thumb2 buildMatthijs van Duin
commit 448c077eeb02240c430db2a2c3bf5285a4c65d66 upstream. 'adr' yields a data-pointer, not a function-pointer. Fixes: 999f934de195 ("ARM: omap5/dra7xx: Enable booting secondary CPU in HYP mode") Signed-off-by: Matthijs van Duin <matthijsvanduin@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14ARM: dts: NSP: GPIO reboot open-sourceJon Mason
commit acfa28b3649ec07775efaac0c00de2db39d71634 upstream. The libgpio code pre-sets the GPIO values for the gpio-reset in the device tree. This results in the device being reset during bringup. To prevent this pre-setting, use the "open-source" flag in the device tree. Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: b1aaf88 ("ARM: dts: NSP: Add GPIO reboot method to bcm958625hr DTS file") Fixes: 10baed1 ("ARM: dts: NSP: Add GPIO reboot method to bcm958625xmc DTS file") Fixes: 088e3148 ("ARM: dts: NSP: Add new DT file for bcm958522er") Fixes: e3227c1 ("ARM: dts: NSP: Add new DT file for bcm958525er") Fixes: 2f8bc00 ("ARM: dts: NSP: Add new DT file for bcm958622hr") Fixes: d454c37 ("ARM: dts: NSP: Add new DT file for bcm958623hr") Fixes: f27eacf ("ARM: dts: NSP: Add new DT file for bcm988312hr") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14arm64: Improve detection of user/non-user mappings in set_pte(_at)Catalin Marinas
commit ec663d967b2276448a416406ca59ff247c0c80c5 upstream. Commit cab15ce604e5 ("arm64: Introduce execute-only page access permissions") allowed a valid user PTE to have the PTE_USER bit clear. As a consequence, the pte_valid_not_user() macro in set_pte() was replaced with pte_valid_global() under the assumption that only user pages have the nG bit set. EFI mappings, however, also have the nG bit set and set_pte() wrongly ignores issuing the DSB+ISB. This patch reinstates the pte_valid_not_user() macro and adds the PTE_UXN bit check since all kernel mappings have this bit set. For clarity, pte_exec() is renamed to pte_user_exec() as it only checks for the absence of PTE_UXN. Consequently, the user executable check in set_pte_at() drops the pte_ng() test since pte_user_exec() is sufficient. Fixes: cab15ce604e5 ("arm64: Introduce execute-only page access permissions") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14arm: dts: qcom: Fix ipq board clock ratesStephen Boyd
commit 06dbf468a2c42bf6c327a8eaf11ecb3ea96196f9 upstream. The ipq board has these rates as 25MHz, and not 19.2 and 27. I copy/pasted from other boards that have those rates but forgot to fix the rates here. Fixes: 30fc4212d541 ("arm: dts: qcom: Add more board clocks") Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14arm64: dts: r8a7795: Mark EthernetAVB device node disabledGeert Uytterhoeven
commit 0d1390ff283f6c38595288e7f74da6829896b8b7 upstream. Device nodes representing I/O devices should be marked disabled in the SoC-specific DTS, and overridden by board-specific DTSes where needed. Fixes: a92843c8a6f8c039 ("arm64: dts: r8a7795: add EthernetAVB device node") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14perf/x86/intel/pt: Add format strings for PTWRITE and power event tracingAlexander Shishkin
commit 5443624bedd0d23e112d5f2a919435182875bce9 upstream. Commit: 8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE and power event tracing") forgot to add format strings to the PT driver. So one could enable these features by setting corresponding bits in the event config, but not by their mnemonic names. This patch adds the format strings. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Fixes: 8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE...") Link: http://lkml.kernel.org/r/20170127151644.8585-2-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14powerpc: Correctly disable latent entropy GCC plugin on prom_init.oAndrew Donnellan
commit eac6f8b0c7adb003776dbad9d037ee2fc64f9d62 upstream. Commit 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") excludes certain powerpc early boot code from the latent entropy plugin by adding appropriate CFLAGS. It looks like this was supposed to cover prom_init.o, but ended up saying init.o (which doesn't exist) instead. Fix the typo. Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNELAnton Blanchard
commit 496e9cb5b2aa2ba303d2bbd08518f9be2219ab4b upstream. The final paragraph of the help text is reversed. We want to enable this option by default, and disable it if the toolchain has a working -mprofile-kernel. Fixes: 8c50b72a3b4f ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14powerpc/powernv: Fix opal_exit tracepoint opcodeMichael Ellerman
commit a7e0fb6c2029a780444d09560f739e020d54fe4d upstream. Currently the opal_exit tracepoint usually shows the opcode as 0: <idle>-0 [047] d.h. 635.654292: opal_entry: opcode=63 <idle>-0 [047] d.h. 635.654296: opal_exit: opcode=0 retval=0 kopald-1209 [019] d... 636.420943: opal_entry: opcode=10 kopald-1209 [019] d... 636.420959: opal_exit: opcode=0 retval=0 This is because we incorrectly load the opcode into r0 before calling __trace_opal_exit(), whereas it expects the opcode in r3 (first function parameter). In fact we are leaving the retval in r3, so opcode and retval will always show the same value. Instead load the opcode into r3, resulting in: <idle>-0 [040] d.h. 636.618625: opal_entry: opcode=63 <idle>-0 [040] d.h. 636.618627: opal_exit: opcode=63 retval=0 Fixes: c49f63530bb6 ("powernv: Add OPAL tracepoints") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14powerpc/mm: Fixup wrong LPCR_VRMASD valueAneesh Kumar K.V
commit 4ab2537c4204b976e4ca350bbdc193b4649cad28 upstream. In commit a4b349540a26af ("powerpc/mm: Cleanup LPCR defines") we updated LPCR_VRMASD wrongly as below. -#define LPCR_VRMASD (0x1ful << (63-16)) +#define LPCR_VRMASD_SH 47 +#define LPCR_VRMASD (ASM_CONST(1) << LPCR_VRMASD_SH) We initialize the VRMA bits in LPCR to 0x00 in kvm. Hence using a different mask value as above while updating lpcr should not have any impact. This patch updates it to the correct value. Fixes: a4b349540a26 ("powerpc/mm: Cleanup LPCR defines") Reported-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Jia He <hejianet@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03ftrace/x86: Fix triple fault with graph tracing and suspend-to-ramJosh Poimboeuf
commit 34a477e5297cbaa6ecc6e17c042a866e1cbe80d6 upstream. On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function graph tracing and then suspend to RAM, it will triple fault and reboot when it resumes. The first fault happens when booting a secondary CPU: startup_32_smp() load_ucode_ap() prepare_ftrace_return() ftrace_graph_is_dead() (accesses 'kill_ftrace_graph') The early head_32.S code calls into load_ucode_ap(), which has an an ftrace hook, so it calls prepare_ftrace_return(), which calls ftrace_graph_is_dead(), which tries to access the global 'kill_ftrace_graph' variable with a virtual address, causing a fault because the CPU is still in real mode. The fix is to add a check in prepare_ftrace_return() to make sure it's running in protected mode before continuing. The check makes sure the stack pointer is a virtual kernel address. It's a bit of a hack, but it's not very intrusive and it works well enough. For reference, here are a few other (more difficult) ways this could have potentially been fixed: - Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging is enabled. (No idea what that would break.) - Track down load_ucode_ap()'s entire callee tree and mark all the functions 'notrace'. (Probably not realistic.) - Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu() or __cpu_up(), and ensure that the pause facility can be queried from real mode. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net> Cc: linux-acpi@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03ARCv2: save r30 on kernel entry as gcc uses it for code-genVineet Gupta
commit ecd43afdbe72017aefe48080631eb625e177ef4d upstream. This is not exposed to userspace debugers yet, which can be done independently as a seperate patch ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03MIPS: Avoid BUG warning in arch_check_elfJames Cowgill
commit c46f59e90226fa5bfcc83650edebe84ae47d454b upstream. arch_check_elf contains a usage of current_cpu_data that will call smp_processor_id() with preemption enabled and therefore triggers a "BUG: using smp_processor_id() in preemptible" warning when an fpxx executable is loaded. As a follow-up to commit b244614a60ab ("MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)"), apply the same fix to arch_check_elf by using raw_current_cpu_data instead. The rationale quoted from the previous commit: "It is assumed throughout the kernel that if any CPU has an FPU, then all CPUs would have an FPU as well, so it is safe to perform the check with preemption enabled - change the code to use raw_ variant of the check to avoid the warning." Fixes: 46490b572544 ("MIPS: kernel: elf: Improve the overall ABI and FPU mode checks") Signed-off-by: James Cowgill <James.Cowgill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15951/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>