summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2016-03-22x86/extable: use generic search and sort routinesArd Biesheuvel
Replace the arch specific versions of search_extable() and sort_extable() with calls to the generic ones, which now support relative exception tables as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22s390/extable: use generic search and sort routinesArd Biesheuvel
Replace the arch specific versions of search_extable() and sort_extable() with calls to the generic ones, which now support relative exception tables as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22alpha/extable: use generic search and sort routinesArd Biesheuvel
Replace the arch specific versions of search_extable() and sort_extable() with calls to the generic ones, which now support relative exception tables as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22kernel: add kcov code coverageDmitry Vyukov
kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode'] [akpm@linux-foundation.org: unbreak allmodconfig] [akpm@linux-foundation.org: follow x86 Makefile layout standards] Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@google.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: David Drysdale <drysdale@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22rapidio: add global inbound port write interfacesAlexandre Bounine
Add new Port Write handler registration interfaces that attach PW handlers to local mport device objects. This is different from old interface that attaches PW callback to individual RapidIO device. The new interfaces are intended for use for common event handling (e.g. hot-plug notifications) while the old interface is available for individual device drivers. This patch is based on patch proposed by Andre van Herk but preserves existing per-device interface and adds lock protection for list handling. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22powerpc/fsl_rio: changes to mport registrationAlexandre Bounine
Change mport object initialization/registration sequence to match reworked version of rio_register_mport() in the core code. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22fs/coredump: prevent fsuid=0 dumps into user-controlled directoriesJann Horn
This commit fixes the following security hole affecting systems where all of the following conditions are fulfilled: - The fs.suid_dumpable sysctl is set to 2. - The kernel.core_pattern sysctl's value starts with "/". (Systems where kernel.core_pattern starts with "|/" are not affected.) - Unprivileged user namespace creation is permitted. (This is true on Linux >=3.8, but some distributions disallow it by default using a distro patch.) Under these conditions, if a program executes under secure exec rules, causing it to run with the SUID_DUMP_ROOT flag, then unshares its user namespace, changes its root directory and crashes, the coredump will be written using fsuid=0 and a path derived from kernel.core_pattern - but this path is interpreted relative to the root directory of the process, allowing the attacker to control where a coredump will be written with root privileges. To fix the security issue, always interpret core_pattern for dumps that are written under SUID_DUMP_ROOT relative to the root directory of init. Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22x86/compat: remove is_compat_task()Andy Lutomirski
x86's is_compat_task always checked the current syscall type, not the task type. It has no non-arch users any more, so just remove it to avoid confusion. On x86, nothing should really be checking the task ABI. There are legitimate users for the syscall ABI and for the mm ABI. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22sparc/syscall: fix syscall_get_archAndy Lutomirski
Sparc's syscall_get_arch was buggy: it returned the task arch, not the syscall arch. This could confuse seccomp and audit. I don't think this is as bad for seccomp as it looks: sparc's 32-bit and 64-bit syscalls are numbered the same. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22sparc/compat: provide an accurate in_compat_syscall implementationAndy Lutomirski
On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit syscalls. is_compat_task returns true in 32-bit tasks, which does not necessarily imply that the current syscall is 32-bit. Provide an in_compat_syscall implementation that checks whether the current syscall is compat. As far as I know, sparc is the only architecture on which is_compat_task checks the compat status of the task and on which the compat status of a syscall can differ from the compat status of the task. On x86, is_compat_task checks the syscall type, not the task type. [akpm@linux-foundation.org: add comment, per Sam] [akpm@linux-foundation.org: update comment, per Andy] Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22Merge tag 'for-linus-4.6-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.6: - Make earlyprintk=xen work for HVM guests - Remove module support for things never built as modules" * tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: drivers/xen: make platform-pci.c explicitly non-modular drivers/xen: make sys-hypervisor.c explicitly non-modular drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular drivers/xen: make [xen-]ballon explicitly non-modular xen: audit usages of module.h ; remove unnecessary instances xen/x86: Drop mode-selecting ifdefs in startup_xen() xen/x86: Zero out .bss for PV guests hvc_xen: make early_printk work with HVM guests hvc_xen: fix xenboot for DomUs hvc_xen: add earlycon support
2016-03-22Merge tag 'iommu-updates-v4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - updates for the Exynos IOMMU driver to make use of default domains and to add support for the SYSMMU v5 - new Mediatek IOMMU driver - support for the ARMv7 short descriptor format in the io-pgtable code - default domain support for the ARM SMMU - couple of other small fixes all over the place * tag 'iommu-updates-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits) iommu/ipmmu-vmsa: Add r8a7795 DT binding iommu/mediatek: Check for NULL instead of IS_ERR() iommu/io-pgtable-armv7s: Fix kmem_cache_alloc() flags iommu/mediatek: Fix handling of of_count_phandle_with_args result iommu/dma: Fix NEED_SG_DMA_LENGTH dependency iommu/mediatek: Mark PM functions as __maybe_unused iommu/mediatek: Select ARM_DMA_USE_IOMMU iommu/exynos: Use proper readl/writel register interface iommu/exynos: Pointers are nto physical addresses dts: mt8173: Add iommu/smi nodes for mt8173 iommu/mediatek: Add mt8173 IOMMU driver memory: mediatek: Add SMI driver dt-bindings: mediatek: Add smi dts binding dt-bindings: iommu: Add binding for mediatek IOMMU iommu/ipmmu-vmsa: Use ARCH_RENESAS iommu/exynos: Support multiple attach_device calls iommu/exynos: Add Maintainers entry for Exynos SYSMMU driver iommu/exynos: Add support for v5 SYSMMU iommu/exynos: Update device tree documentation iommu/exynos: Add support for SYSMMU controller with bogus version reg ...
2016-03-22KVM: page_track: fix access to NULL slotPaolo Bonzini
This happens when doing the reboot test from virt-tests: [ 131.833653] BUG: unable to handle kernel NULL pointer dereference at (null) [ 131.842461] IP: [<ffffffffa0950087>] kvm_page_track_is_active+0x17/0x60 [kvm] [ 131.850500] PGD 0 [ 131.852763] Oops: 0000 [#1] SMP [ 132.007188] task: ffff880075fbc500 ti: ffff880850a3c000 task.ti: ffff880850a3c000 [ 132.138891] Call Trace: [ 132.141639] [<ffffffffa092bd11>] page_fault_handle_page_track+0x31/0x40 [kvm] [ 132.149732] [<ffffffffa093380f>] paging64_page_fault+0xff/0x910 [kvm] [ 132.172159] [<ffffffffa092c734>] kvm_mmu_page_fault+0x64/0x110 [kvm] [ 132.179372] [<ffffffffa06743c2>] handle_exception+0x1b2/0x430 [kvm_intel] [ 132.187072] [<ffffffffa067a301>] vmx_handle_exit+0x1e1/0xc50 [kvm_intel] ... Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Fixes: 3d0c27ad6ee465f174b09ee99fcaf189c57d567a Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: PPC: do not compile in vfio.o unconditionallyPaolo Bonzini
Build on 32-bit PPC fails with the following error: int kvm_vfio_ops_init(void) ^ In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0: arch/powerpc/kvm/../../../virt/kvm/vfio.h:8:90: note: previous definition of ‘kvm_vfio_ops_init’ was here arch/powerpc/kvm/../../../virt/kvm/vfio.c:292:6: error: redefinition of ‘kvm_vfio_ops_exit’ void kvm_vfio_ops_exit(void) ^ In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0: arch/powerpc/kvm/../../../virt/kvm/vfio.h:12:91: note: previous definition of ‘kvm_vfio_ops_exit’ was here scripts/Makefile.build:258: recipe for target arch/powerpc/kvm/../../../virt/kvm/vfio.o failed make[3]: *** [arch/powerpc/kvm/../../../virt/kvm/vfio.o] Error 1 Check whether CONFIG_KVM_VFIO is set before including vfio.o in the build. Reported-by: Pranith Kumar <bobby.prani@gmail.com> Tested-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22kvm, rt: change async pagefault code locking for PREEMPT_RTRik van Riel
The async pagefault wake code can run from the idle task in exception context, so everything here needs to be made non-preemptible. Conversion to a simple wait queue and raw spinlock does the trick. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM/PPC: update the comment of memory barrier in the kvmppc_prepare_to_enter()Lan Tianyu
The barrier also orders the write to mode from any reads to the page tables done and so update the comment. Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM/x86: update the comment of memory barrier in the vcpu_enter_guest()Lan Tianyu
The barrier also orders the write to mode from any reads to the page tables done and so update the comment. Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM/x86: Call smp_wmb() before increasing tlbs_dirtyLan Tianyu
Update spte before increasing tlbs_dirty to make sure no tlb flush in lost after spte is zapped. This pairs with the barrier in the kvm_flush_remote_tlbs(). Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM/x86: Replace smp_mb() with smp_store_mb/release() in the ↵Lan Tianyu
walk_shadow_page_lockless_begin/end() Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()Lan Tianyu
There is already a barrier inside of kvm_flush_remote_tlbs() which can help to make sure everyone sees our modifications to the page tables and see changes to vcpu->mode here. So remove the smp_mb in the kvm_mmu_commit_zap_page() and update the comment. Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: expose CPUID/CR4 to guestHuaitong Han
X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation: CPUID.7.0.ECX[3]:PKU. X86_FEATURE_OSPKE is software support for pkeys, enumerated with CPUID.7.0.ECX[4]:OSPKE, and it reflects the setting of CR4.PKE(bit 22). This patch disables CPUID:PKU without ept, because pkeys is not yet implemented for shadow paging. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: add pkeys support for permission_faultHuaitong Han
Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59 of leaf entries of the page tables, the PKEY is an index to PKRU register(16 domains), every domain has 2 bits(write disable bit, access disable bit). Static logic has been produced in update_pkru_bitmask, dynamic logic need read pkey from page table entries, get pkru value, and deduce the correct result. [ Huaitong: Xiao helps to modify many sections. ] Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: introduce pkru_mask to cache conditionsHuaitong Han
PKEYS defines a new status bit in the PFEC. PFEC.PK (bit 5), if some conditions is true, the fault is considered as a PKU violation. pkru_mask indicates if we need to check PKRU.ADi and PKRU.WDi, and does cache some conditions for permission_fault. [ Huaitong: Xiao helps to modify many sections. ] Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: save/restore PKRU when guest/host switchesXiao Guangrong
Currently XSAVE state of host is not restored after VM-exit and PKRU is managed by XSAVE so the PKRU from guest is still controlling the memory access even if the CPU is running the code of host. This is not safe as KVM needs to access the memory of userspace (e,g QEMU) to do some emulation. So we save/restore PKRU when guest/host switches. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22x86: pkey: introduce write_pkru() for KVMXiao Guangrong
KVM will use it to switch pkru between guest and host. CC: Ingo Molnar <mingo@redhat.com> CC: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: add pkeys support for xsave stateHuaitong Han
This patch adds pkeys support for xsave state. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM, pkeys: disable pkeys for guests in non-paging modeHuaitong Han
Pkeys is disabled if CPU is in non-paging mode in hardware. However KVM always uses paging mode to emulate guest non-paging, mode with TDP. To emulate this behavior, pkeys needs to be manually disabled when guest switches to non-paging mode. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: x86: remove magic number with enum cpuid_leafsHuaitong Han
This patch removes magic number with enum cpuid_leafs. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: MMU: return page fault error code from permission_faultPaolo Bonzini
This will help in the implementation of PKRU, where the PK bit of the page fault error code cannot be computed in advance (unlike I/D, R/W and U/S). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: PPC: Create a virtual-mode only TCE table handlersAlexey Kardashevskiy
Upcoming in-kernel VFIO acceleration needs different handling in real and virtual modes which makes it hard to support both modes in the same handler. This creates a copy of kvmppc_rm_h_stuff_tce and kvmppc_rm_h_put_tce in addition to the existing kvmppc_rm_h_put_tce_indirect. This also fixes linker breakage when only PR KVM was selected (leaving HV KVM off): the kvmppc_h_put_tce/kvmppc_h_stuff_tce functions would not compile at all and the linked would fail. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: VMX: fix nested vpid for old KVM guestsPaolo Bonzini
Old KVM guests invoke single-context invvpid without actually checking whether it is supported. This was fixed by commit 518c8ae ("KVM: VMX: Make sure single type invvpid is supported before issuing invvpid instruction", 2010-08-01) and the patch after, but pre-2.6.36 kernels lack it including RHEL 6. Reported-by: jmontleo@redhat.com Tested-by: jmontleo@redhat.com Cc: stable@vger.kernel.org Fixes: 99b83ac893b84ed1a62ad6d1f2b6cc32026b9e85 Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: VMX: avoid guest hang on invalid invvpid instructionPaolo Bonzini
A guest executing an invalid invvpid instruction would hang because the instruction pointer was not updated. Reported-by: jmontleo@redhat.com Tested-by: jmontleo@redhat.com Cc: stable@vger.kernel.org Fixes: 99b83ac893b84ed1a62ad6d1f2b6cc32026b9e85 Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22KVM: VMX: avoid guest hang on invalid invept instructionPaolo Bonzini
A guest executing an invalid invept instruction would hang because the instruction pointer was not updated. Cc: stable@vger.kernel.org Fixes: bfd0a56b90005f8c8a004baf407ad90045c2b11e Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22ARM: dts: at91: sama5d4 Xplained: don't disable hsmci regulatorLudovic Desroches
If enabling the hsmci regulator on card detection, the board can reboot on sd card insertion. Keeping the regulator always enabled fixes this issue. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 8d545f32bd77 ("ARM: at91/dt: sama5d4 xplained: add regulators for v(q)mmc1 supplies") Cc: stable@vger.kernel.org #4.3 and later Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2016-03-22ARM: dts: at91: sama5d3 Xplained: don't disable hsmci regulatorLudovic Desroches
If enabling the hsmci regulator on card detection, the board can reboot on sd card insertion. Keeping the regulator always enabled fixes this issue. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 1b53e3416dd0 ("ARM: at91/dt: sama5d3 xplained: add fixed regulator for vmmc0") Cc: stable@vger.kernel.org #4.3 and later Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2016-03-21Merge tag 'arm64-perf' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm[64] perf updates from Will Deacon: "I have another mixed bag of ARM-related perf patches here. It's about 25% CPU and 75% interconnect, but with drivers/bus/ languishing without an obvious maintainer or tree, Olof and I agreed to keep all of these PMU patches together. I suspect a whole load of code from drivers/bus/arm-* can be moved under drivers/perf/, so that's on the radar for the future. Summary: - Initial support for ARMv8.1 CPU PMUs - Support for the CPU PMU in Cavium ThunderX - CPU PMU support for systems running 32-bit Linux in secure mode - Support for the system PMU in ARM CCI-550 (Cache Coherent Interconnect)" * tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (26 commits) drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree arm64: perf: Extend ARMV8_EVTYPE_MASK to include PMCR.LC arm-cci: remove unused variable arm-cci: don't return value from void function arm-cci: make private functions static arm-cci: CoreLink CCI-550 PMU driver arm-cci500: Rearrange PMU driver for code sharing with CCI-550 PMU arm-cci: CCI-500: Work around PMU counter writes arm-cci: Provide hook for writing to PMU counters arm-cci: Add helper to enable PMU without synchornising counters arm-cci: Add routines to save/restore all counters arm-cci: Get the status of a counter arm-cci: write_counter: Remove redundant check arm-cci: Delay PMU counter writes to pmu::pmu_enable arm-cci: Refactor CCI PMU enable/disable methods arm-cci: Group writes to counter arm-cci: fix handling cpumask_any_but return value arm-cci: simplify sysfs attr handling drivers/perf: arm_pmu: implement CPU_PM notifier arm64: dts: Add Cavium ThunderX specific PMU ...
2016-03-21Merge tag 'arc-4.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC architecture updates from Vineet Gupta: - Big Endian io accessors fix [Lada] - Spellos fixes [Adam] - Fix for DW GMAC breakage [Alexey] - Making DMA API 64-bit ready - Shutting up -Wmaybe-uninitialized noise for ARC - Other minor fixes here and there, comments update * tag 'arc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (21 commits) ARCv2: ioremap: Support dynamic peripheral address space ARC: dma: reintroduce platform specific dma<->phys ARC: dma: ioremap: use phys_addr_t consistenctly in code paths ARC: dma: pass_phys() not sg_virt() to cache ops ARC: dma: non-coherent pages need V-P mapping if in HIGHMEM ARC: dma: Use struct page based page allocator helpers ARC: build: Turn off -Wmaybe-uninitialized for ARC gcc 4.8 ARC: [plat-axs10x] add Ethernet PHY description in .dts arc: use of_platform_default_populate() to populate default bus ARC: thp: unbork !CONFIG_TRANSPARENT_HUGEPAGE build arc: [plat-nsimosci*] use ezchip network driver ARCv2: LLSC: software backoff is NOT needed starting HS2.1c ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern ARC: [plat-nsim] document ranges ARC: build: Better way to detect ISA compatible toolchain ARCv2: Allow enabling PAE40 w/o HIGHMEM ARC: [BE] readl()/writel() to work in Big Endian CPU configuration ARC: [*defconfig] No need to specify CONFIG_CROSS_COMPILE ARC: [BE] Select correct CROSS_COMPILE prefix ARC: bitops: Remove non relevant comments ...
2016-03-21xen: audit usages of module.h ; remove unnecessary instancesPaul Gortmaker
Code that uses no modular facilities whatsoever should not be sourcing module.h at all, since that header drags in a bunch of other headers with it. Similarly, code that is not explicitly using modular facilities like module_init() but only is declaring module_param setup variables should be using moduleparam.h and not the larger module.h file for that. In making this change, we also uncover an implicit use of BUG() in inline fcns within arch/arm/include/asm/xen/hypercall.h so we explicitly source <linux/bug.h> for that file now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-03-21Merge branches 'arm/rockchip', 'arm/exynos', 'arm/smmu', 'arm/mediatek', ↵Joerg Roedel
'arm/io-pgtable', 'arm/renesas' and 'core' into next
2016-03-21kvm: arm64: Disable compiler instrumentation for hypervisor codeCatalin Marinas
With the recent rewrite of the arm64 KVM hypervisor code in C, enabling certain options like KASAN would allow the compiler to generate memory accesses or function calls to addresses not mapped at EL2. This patch disables the compiler instrumentation on the arm64 hypervisor code for gcov-based profiling (GCOV_KERNEL), undefined behaviour sanity checker (UBSAN) and kernel address sanitizer (KASAN). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-21arm64: Split pr_notice("Virtual kernel memory layout...") into multiple ↵Catalin Marinas
pr_cont() The printk() implementation has a limit of LOG_LINE_MAX (== 1024 - 32) buffer per call which the arm64 mem_init() breaches when printing the virtual memory layout with CONFIG_KASAN enabled. The result is that the last line is no longer printed. This patch splits the call into a pr_notice() + additional pr_cont() calls. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com>
2016-03-21arm64: drop unused __local_flush_icache_all()Kefeng Wang
After commit 65da0a8e34a8 ("arm64: use non-global mappings for UEFI runtime regions"), nobody use __local_flush_icache_all() anymore, so drop it. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-03-21arm64: fix KASLR boot-time I-cache maintenanceMark Rutland
Commit f80fb3a3d50843a4 ("arm64: add support for kernel ASLR") missed a DSB necessary to complete I-cache maintenance in the primary boot path, and hence stale instructions may still be present in the I-cache and may be executed until the I-cache maintenance naturally completes. Since commit 8ec41987436d566f ("arm64: mm: ensure patched kernel text is fetched from PoU"), all CPUs invalidate their I-caches after their MMU is enabled. Prior a CPU's MMU having been enabled, arbitrary lines may have been fetched from the PoC into I-caches. We never patch text expected to be executed with the MMU off. Thus, it is unnecessary to perform broadcast I-cache maintenance in the primary boot path. This patch reduces the scope of the I-cache maintenance to the local CPU, and adds the missing DSB with similar scope, matching prior maintenance in the primary boot path. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesehvuel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-03-21arm64/kernel: fix incorrect EL0 check in inv_entry macroArd Biesheuvel
The implementation of macro inv_entry refers to its 'el' argument without the required leading backslash, which results in an undefined symbol 'el' to be passed into the kernel_entry macro rather than the index of the exception level as intended. This undefined symbol strangely enough does not result in build failures, although it is visible in vmlinux: $ nm -n vmlinux |head U el 0000000000000000 A _kernel_flags_le_hi32 0000000000000000 A _kernel_offset_le_hi32 0000000000000000 A _kernel_size_le_hi32 000000000000000a A _kernel_flags_le_lo32 ..... However, it does result in incorrect code being generated for invalid exceptions taken from EL0, since the argument check in kernel_entry assumes EL1 if its argument does not equal '0'. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-03-21perf/x86/intel/rapl: Add missing Broadwell modelsSrinivas Pandruvada
Added Broadwell-H and Broadwell-Server. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/1458517938-25308-1-git-send-email-srinivas.pandruvada@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21perf/x86/intel/uncore: Remove ev_sel_ext bit support for PCUKan Liang
The ev_sel_ext in PCU_MSR_PMON_CTL is locked on some CPU models, so despite it being documented in the SDM, if we write 1 to that bit then we can get a #GP fault. Which #GP the perf fuzzer happily triggered in Peter Zijlstra's testing. Also, there are no public events which use that bit, so remove ev_sel_ext bit support for PCU. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1458500301-3594-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21arm64: KVM: Turn kvm_ksym_ref into a NOP on VHEMarc Zyngier
When running with VHE, there is no need to translate kernel pointers to the EL2 memory space, since we're already there (and we have a much saner memory map to start with). Unfortunately, kvm_ksym_ref is getting in the way, and the first call into the "hypervisor" section is going to end up in fireworks, since we're now branching into nowhereland. Meh. A potential solution is to test if VHE is engaged or not, and only perform the translation in the negative case. With this in place, VHE is able to run again. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-21KVM: arm/arm64: disable preemption when calling smp_call_function_manyEric Auger
Preemption must be disabled when calling smp_call_function_many Reported-by: bartosz.wawrzyniak@tieto.com Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-21perf/x86/amd/power: Add AMD accumulated power reporting mechanismHuang Rui
Introduce an AMD accumlated power reporting mechanism for the Family 15h, Model 60h processor that can be used to calculate the average power consumed by a processor during a measurement interval. The feature support is indicated by CPUID Fn8000_0007_EDX[12]. This feature will be implemented both in hwmon and perf. The current design provides one event to report per package/processor power consumption by counting each compute unit power value. Here the gory details of how the computation is done: * Tsample: compute unit power accumulator sample period * Tref: the PTSC counter period (PTSC: performance timestamp counter) * N: the ratio of compute unit power accumulator sample period to the PTSC period * Jmax: max compute unit accumulated power which is indicated by MSR_C001007b[MaxCpuSwPwrAcc] * Jx/Jy: compute unit accumulated power which is indicated by MSR_C001007a[CpuSwPwrAcc] * Tx/Ty: the value of performance timestamp counter which is indicated by CU_PTSC MSR_C0010280[PTSC] * PwrCPUave: CPU average power i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007. N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]]. ii. Read the full range of the cumulative energy value from the new MSR MaxCpuSwPwrAcc. Jmax = value returned. iii. At time x, software reads CpuSwPwrAcc and samples the PTSC. Jx = value read from CpuSwPwrAcc and Tx = value read from PTSC. iv. At time y, software reads CpuSwPwrAcc and samples the PTSC. Jy = value read from CpuSwPwrAcc and Ty = value read from PTSC. v. Calculate the average power consumption for a compute unit over time period (y-x). Unit of result is uWatt: if (Jy < Jx) // Rollover has occurred Jdelta = (Jy + Jmax) - Jx else Jdelta = Jy - Jx PwrCPUave = N * Jdelta * 1000 / (Ty - Tx) Simple example: root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' make -j4 CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CHK include/generated/timeconst.h CHK include/generated/bounds.h CHK include/generated/asm-offsets.h CALL scripts/checksyscalls.sh CHK include/generated/compile.h SKIPPED include/generated/compile.h Building modules, stage 2. Kernel: arch/x86/boot/bzImage is ready (#40) MODPOST 4225 modules Performance counter stats for 'system wide': 183.44 mWatts power/power-pkg/ 341.837270111 seconds time elapsed root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' sleep 10 Performance counter stats for 'system wide': 0.18 mWatts power/power-pkg/ 10.012551815 seconds time elapsed Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Ingo Molnar <mingo@kernel.org> Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: jacob.w.shin@gmail.com Link: http://lkml.kernel.org/r/1457502306-2559-1-git-send-email-ray.huang@amd.com [ Fixed the modular build. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flagHuang Rui
AMD CPU family 15h model 0x60 introduces a mechanism for measuring accumulated power. It is used to report the processor power consumption and support for it is indicated by CPUID Fn8000_0007_EDX[12]. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andreas Herrmann <herrmann.der.user@googlemail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rric@kernel.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wan Zongshun <Vincent.Wan@amd.com> Cc: spg_linux_kernel@amd.com Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com [ Resolved conflict and moved the synthetic CPUID slot to 19. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>