summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2016-08-26arm64: errata: Pass --fix-cortex-a53-843419 to ld if workaround enabledWill Deacon
Cortex-A53 erratum 843419 is worked around by the linker, although it is a configure-time option to GCC as to whether ld is actually asked to apply the workaround or not. This patch ensures that we pass --fix-cortex-a53-843419 to the linker when both CONFIG_ARM64_ERRATUM_843419=y and the linker supports the option. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-26Revert "arm64: hibernate: Refuse to hibernate if the boot cpu is offline"James Morse
Now that we use the MPIDR to resume on the same CPU that we hibernated on, we no longer need to refuse to hibernate if the boot cpu is offline. (Which we can't possibly know if kexec causes logical CPUs to be renumbered). This reverts commit 1fe492ce6482b77807b25d29690a48c46456beee. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-26arm64: hibernate: Resume when hibernate image created on non-boot CPUJames Morse
disable_nonboot_cpus() assumes that the lowest numbered online CPU is the boot CPU, and that this is the correct CPU to run any power management code on. On arm64 CPU0 can be taken offline. For hibernate/resume this means we may hibernate on a CPU other than CPU0. If the system is rebooted with kexec 'CPU0' will be assigned to a different CPU. This complicates hibernate/resume as now we can't trust the CPU numbers. We currently forbid hibernate if CPU0 has been hotplugged out to avoid this situation without kexec. Save the MPIDR of the CPU we hibernated on in the hibernate arch-header, use hibernate_resume_nonboot_cpu_disable() to direct which CPU we should resume on based on the MPIDR of the CPU we hibernated on. This allows us to hibernate/resume on any CPU, even if the logical numbers have been shuffled by kexec. Signed-off-by: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-26arm64: always enable DEBUG_RODATA and remove the Kconfig optionMark Rutland
Follow the example set by x86 in commit 9ccaf77cf05915f5 ("x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option"), and make these protections a fundamental security feature rather than an opt-in. This also results in a minor code simplification. For those rare cases when users wish to disable this protection (e.g. for debugging), this can be done by passing 'rodata=off' on the command line. As DEBUG_RODATA_ALIGN is only intended to address a performance/memory tradeoff, and does not affect correctness, this is left user-selectable. DEBUG_MODULE_RONX is also left user-selectable until the core code provides a boot-time option to disable the protection for debugging use-cases. Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: mark reserved memblock regions explicitly in iomemAKASHI Takahiro
Kdump(kexec-tools) parses /proc/iomem to identify all the memory regions on the system. Since the current kernel names "nomap" regions, like UEFI runtime services code/data, as "System RAM," kexec-tools sets up elf core header to include them in a crash dump file (/proc/vmcore). Then crash dump kernel parses UEFI memory map again, re-marks those regions as "nomap" and does not create a memory mapping for them unlike the other areas of System RAM. In this case, copying /proc/vmcore through copy_oldmem_page() on crash dump kernel will end up with a kernel abort, as reported in [1]. This patch names all the "nomap" regions explicitly as "reserved" so that we can exclude them from a crash dump file. acpi_os_ioremap() must also be modified because those regions have WB attributes [2]. Apart from kdump, this change also matches x86's use of acpi (and /proc/iomem). [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/448186.html [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/450089.html Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: James Morse <james.morse@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: hibernate: Support DEBUG_PAGEALLOCJames Morse
DEBUG_PAGEALLOC removes the valid bit of page table entries to prevent any access to unallocated memory. Hibernate uses this as a hint that those pages don't need to be saved/restored. This patch adds the kernel_page_present() function it uses. hibernate.c copies the resume kernel's linear map for use during restore. Add _copy_pte() to fill-in the holes made by DEBUG_PAGEALLOC in the resume kernel, so we can restore data the original kernel had at these addresses. Finally, DEBUG_PAGEALLOC means the linear-map alias of KERNEL_START to KERNEL_END may have holes in it, so we can't lazily clean this whole area to the PoC. Only clean the new mmuoff region, and the kernel/kvm idmaps. This reverts commit da24eb1f3f9e2c7b75c5f8c40d8e48e2c4789596. Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: vmlinux.ld: Add mmuoff data sections and move mmuoff text into idmapJames Morse
Resume from hibernate needs to clean any text executed by the kernel with the MMU off to the PoC. Collect these functions together into the .idmap.text section as all this code is tightly coupled and also needs the same cleaning after resume. Data is more complicated, secondary_holding_pen_release is written with the MMU on, clean and invalidated, then read with the MMU off. In contrast __boot_cpu_mode is written with the MMU off, the corresponding cache line is invalidated, so when we read it with the MMU on we don't get stale data. These cache maintenance operations conflict with each other if the values are within a Cache Writeback Granule (CWG) of each other. Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write, the linker script ensures mmuoff.data.write section is aligned to the architectural maximum CWG of 2KB. Signed-off-by: James Morse <james.morse@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: Create sections.hJames Morse
Each time new section markers are added, kernel/vmlinux.ld.S is updated, and new extern char __start_foo[] definitions are scattered through the tree. Create asm/include/sections.h to collect these definitions (and include the existing asm-generic version). Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: Introduce execute-only page access permissionsCatalin Marinas
The ARMv8 architecture allows execute-only user permissions by clearing the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU implementation without User Access Override (ARMv8.2 onwards) can still access such page, so execute-only page permission does not protect against read(2)/write(2) etc. accesses. Systems requiring such protection must enable features like SECCOMP. This patch changes the arm64 __P100 and __S100 protection_map[] macros to the new __PAGE_EXECONLY attributes. A side effect is that pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't set. To work around this, the check is done on the PTE_NG bit via the pte_ng() macro. VM_READ is also checked now for page faults. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25arm64: kprobe: Always clear pstate.D in breakpoint exception handlerPratyush Anand
Whenever we are hitting a kprobe from a none-kprobe debug exception handler, we hit an infinite occurrences of "Unexpected kernel single-step exception at EL1" PSTATE.D is debug exception mask bit. It is set whenever we enter into an exception mode. When it is set then Watchpoint, Breakpoint, and Software Step exceptions are masked. However, software Breakpoint Instruction exceptions can never be masked. Therefore, if we ever execute a BRK instruction, irrespective of D-bit setting, we will be receiving a corresponding breakpoint exception. For example: - We are executing kprobe pre/post handler, and kprobe has been inserted in one of the instruction of a function called by handler. So, it executes BRK instruction and we land into the case of KPROBE_REENTER. (This case is already handled by current code) - We are executing uprobe handler or any other BRK handler such as in WARN_ON (BRK BUG_BRK_IMM), and we trace that path using kprobe.So, we enter into kprobe breakpoint handler,from another BRK handler.(This case is not being handled currently) In all such cases kprobe breakpoint exception will be raised when we were already in debug exception mode. SPSR's D bit (bit 9) shows the value of PSTATE.D immediately before the exception was taken. So, in above example cases we would find it set in kprobe breakpoint handler. Single step exception will always be followed by a kprobe breakpoint exception.However, it will only be raised gracefully if we clear D bit while returning from breakpoint exception. If D bit is set then, it results into undefined exception and when it's handler enables dbg then single step exception is generated, however it will never be handled(because address does not match and therefore treated as unexpected). This patch clears D-flag unconditionally in setup_singlestep, so that we can always get single step exception correctly after returning from breakpoint exception. Additionally, it also removes D-flag set statement for KPROBE_REENTER return path, because debug exception for KPROBE_REENTER will always take place in a debug exception state. So, D-flag will already be set in this case. Acked-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: head.S: get rid of x25 and x26 with 'global' scopeArd Biesheuvel
Currently, x25 and x26 hold the physical addresses of idmap_pg_dir and swapper_pg_dir, respectively, when running early boot code. But having registers with 'global' scope in files that contain different sections with different lifetimes, and that are called by different CPUs at different times is a bit messy, especially since stashing the values does not buy us anything in terms of code size or clarity. So simply replace each reference to x25 or x26 with an adrp instruction referring to idmap_pg_dir or swapper_pg_dir directly. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: apply __ro_after_init to some objectsJisheng Zhang
These objects are set during initialization, thereafter are read only. Previously I only want to mark vdso_pages, vdso_spec, vectors_page and cpu_ops as __read_mostly from performance point of view. Then inspired by Kees's patch[1] to apply more __ro_after_init for arm, I think it's better to mark them as __ro_after_init. What's more, I find some more objects are also read only after init. So apply __ro_after_init to all of them. This patch also removes global vdso_pagelist and tries to clean up vdso_spec[] assignment code. [1] http://www.spinics.net/lists/arm-kernel/msg523188.html Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: vdso: constify vm_special_mapping used for aarch32 vectors pageJisheng Zhang
The vm_special_mapping spec which is used for aarch32 vectors page is never modified, so mark it as const. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: vdso: add __init section marker to alloc_vectors_pageJisheng Zhang
It is not needed after booting, this patch moves the alloc_vectors_page function to the __init section. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: remove redundant "select HAVE_CLK"Masahiro Yamada
HAVE_CLK is select'ed by CLKDEV_LOOKUP, which is select'ed by COMMON_CLK, which is select'ed by ARM64. No sub-architecture needs to select HAVE_CLK explicitly. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: remove traces of perf_ops_bpMark Rutland
Even though perf_ops_bp was removed/renamed back in commit b0a873ebbf87bf38 ("perf: Register PMU implementations"), as part of v2.6.37, its definition still lives on in some arch headers. This patch removes the vestigal definition from arm64. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: perf: Use the builtin_platform_driverKefeng Wang
Use the builtin_platform_driver() to simplify code. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: mm: convert __dma_* routines to use start, sizeKwangwoo Lee
__dma_* routines have been converted to use start and size instread of start and end addresses. The patch was origianlly for adding __clean_dcache_area_poc() which will be used in pmem driver to clean dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem(). The functionality of __clean_dcache_area_poc() was equivalent to __dma_clean_range(). The difference was __dma_clean_range() uses the end address, but __clean_dcache_area_poc() uses the size to clean. Thus, __clean_dcache_area_poc() has been revised with a fallthrough function of __dma_clean_range() after the change that __dma_* routines use start and size instead of using start and end. As a consequence of using start and size, the name of __dma_* routines has also been altered following the terminology below: area: takes a start and size range: takes a start and end Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: factor work_pending state machine to CChris Metcalf
Currently ret_fast_syscall, work_pending, and ret_to_user form an ad-hoc state machine that can be difficult to reason about due to duplicated code and a large number of branch targets. This patch factors the common logic out into the existing do_notify_resume function, converting the code to C in the process, making the code more legible. This patch tries to closely mirror the existing behaviour while using the usual C control flow primitives. As local_irq_{disable,enable} may be instrumented, we balance exception entry (where we will almost most likely enable IRQs) with a call to trace_hardirqs_on just before the return to userspace. Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22arm64: hibernate: reduce TLB maintenance scopeMark Rutland
In break_before_make_ttbr_switch we perform broadcast TLB maintenance for the inner shareable domain, and use a DSB ISH to complete this. However, at the point we execute this, secondary CPUs are either physically offline, or executing code outside of the kernel. Upon entering the kernel, secondary CPUs will invalidate their TLBs before enabling their MMUs. Thus we do not need to invalidate TLBs of other CPUs, and as with idmap_cpu_replace_ttbr1 we can reduce the scope of maintenance to the TLBs of the local CPU. This keeps our TLB maintenance code consistent, and is a minor optimisation. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-20parisc: Fix order of EREFUSED define in errno.hHelge Deller
When building gccgo in userspace, errno.h gets parsed and the go include file sysinfo.go is generated. Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED is defined later on in errno.h, this leads to go complaining that EREFUSED isn't defined yet. Fix this trivial problem by moving the define of EREFUSED down after ECONNREFUSED in errno.h (and clean up the indenting while touching this line). Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
2016-08-20parisc: Fix automatic selection of cr16 clocksourceHelge Deller
Commit 54b66800907 (parisc: Add native high-resolution sched_clock() implementation) added support to use the CPU-internal cr16 counters as reliable clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK. Sadly the commit missed to remove the hack which prevented cr16 to become the default clocksource even on SMP systems. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.7+
2016-08-19Merge tag 'devicetree-fixes-for-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree fixes from Rob Herring: - a couple of DT node ref counting fixes - fix __unflatten_device_tree for PPC PCI hotplug case - rework marking irq controllers as OF_POPULATED in cases where real driver is used. - disable of_platform_default_populate_init on PPC. The change in initcall order causes problems which need to be sorted out later. * tag 'devicetree-fixes-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: fix reference counting in of_graph_get_endpoint_by_regs of/platform: disable the of_platform_default_populate_init() for all the ppc boards ARM: imx6: mark GPC node as not populated after irq init to probe pm domain driver of/irq: Mark interrupt controllers as populated before initialisation drivers/of: Validate device node in __unflatten_device_tree() of: Delete an unnecessary check before the function call "of_node_put"
2016-08-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "An initrd microcode loading fix, and an SMP bootup topology setup fix to resolve crashes on SGI/UV systems if the BIOS is configured in a certain way" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Fix __max_logical_packages value setup x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
2016-08-18Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Avoid a literal load with the MMU off on the CPU resume path (potential inconsistency between cache and RAM) - Build error with CONFIG_ACPI=n fixed - Compiler warning in the arch/arm64/mm/dump.c code fixed * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Fix shift warning in arch/arm64/mm/dump.c arm64: kernel: avoid literal load of virtual address with MMU off arm64: Fix NUMA build error when !CONFIG_ACPI
2016-08-18Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Only three fixes this time: - Emil found an overflow problem with the memory layout sanity check. - Ard Biesheuvel noticed that late-allocated page tables (for EFI) weren't being properly constructed. - Guenter Roeck reported a problem found on qemu caused by the recent addr_limit changes" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: fix address limit restoration for undefined instructions ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit
2016-08-18Merge tag 'pm-4.8-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "More hibernation-related material: one fix for a recent regression in the core, one small cleanup of the x86-64 resume code and a documentation update. Specifics: - Fix a hibernate core regression resulting from uncovering a latent bug in its implementation of memory bitmaps by a recent commit (James Morse). - Use __pa() to compute a physical address in the x86-64 code finalizing resume from hibernation (Rafael Wysocki). - Update power management documentation related to system sleep states to remove outdated information from it and to add a description of a recently introduced hibernation debug feature to it (Rafael Wysocki)" * tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / hibernate: Fix rtree_next_node() to avoid walking off list ends x86/power/64: Use __pa() for physical address computation PM / sleep: Update some system sleep documentation
2016-08-18arm64: Fix shift warning in arch/arm64/mm/dump.cCatalin Marinas
When building with 48-bit VAs and 16K page configuration, it's possible to get the following warning when building the arm64 page table dumping code: arch/arm64/mm/dump.c: In function ‘walk_pud’: arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow] This is because pud_offset(pgd, 0) performs a shift to the right by 36 while the value 0 has the type 'int' by default, therefore 32-bit. This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to use 0UL for the address argument. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-18x86/smp: Fix __max_logical_packages value setupJiri Olsa
Frank reported kernel panic when he disabled several cores in BIOS via following option: Core Disable Bitmap(Hex) [0] with number 0xFFE, which leaves 16 CPUs in system (out of 48). The kernel panic below goes along with following messages: smpboot: Max logical packages: 2^M smpboot: APIC(0) Converting physical 0 to logical package 0^M smpboot: APIC(20) Converting physical 1 to logical package 1^M smpboot: APIC(40) Package 2 exceeds logical package map^M smpboot: CPU 8 APICId 40 disabled^M smpboot: APIC(60) Package 3 exceeds logical package map^M smpboot: CPU 12 APICId 60 disabled^M ... general protection fault: 0000 [#1] SMP^M Modules linked in:^M CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1^M Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016^M task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000^M RIP: 0010:[<ffffffff81014d54>] [<ffffffff81014d54>] uncore_change_context+0xd4/0x180^M ... [<ffffffff810158ac>] uncore_event_init_cpu+0x6c/0x70^M [<ffffffff81d8c91c>] intel_uncore_init+0x1c2/0x2dd^M [<ffffffff81d8c75a>] ? uncore_cpu_setup+0x17/0x17^M [<ffffffff81002190>] do_one_initcall+0x50/0x190^M [<ffffffff810ab193>] ? parse_args+0x293/0x480^M [<ffffffff81d87365>] kernel_init_freeable+0x1a5/0x249^M [<ffffffff81d86a35>] ? set_debug_rodata+0x12/0x12^M [<ffffffff816dc19e>] kernel_init+0xe/0x110^M [<ffffffff816e93bf>] ret_from_fork+0x1f/0x40^M [<ffffffff816dc190>] ? rest_init+0x80/0x80^M The reason for the panic is wrong value of __max_logical_packages, which lets logical_package_map uninitialized and the uncore code relying on this map being properly initialized (maybe we should add some safety checks there as well). The __max_logical_packages is computed as: DIV_ROUND_UP(total_cpus, ncpus); - ncpus being number of cores With above BIOS setup we get total_cpus == 16 which set __max_logical_packages to 2 (ncpus is 12). Once topology_update_package_map processes CPU with logical pkg over 2 we display above messages and fail to initialize the physical_to_logical_pkg map, which makes the uncore code crash. The fix is to remove logical_package_map bitmap completely and keep and update the logical_packages number instead. After we enumerate all the present CPUs, we check if the enumerated logical packages count is within its computed maximum from BIOS data. If it's not the case, we set this maximum to the new enumerated value and freeze any new addition of logical packages. The freeze is because lot of init code like uncore/rapl/cqm depends on having maximum logical package value set to allocate their data, so we can't change it later on. Prarit Bhargava tested the patch and confirms that it solves the problem: From dmidecode: Core Count: 24 Core Enabled: 24 Thread Count: 48 Orig kernel boot log: [ 0.464981] smpboot: Max logical packages: 19 [ 0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0 [ 0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1 [ 0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2 [ 0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3 1. nr_cpus=8, should stop enumerating in package 0: [ 0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0 [ 0.539596] smpboot: Max logical packages: 19 2. max_cpus=8, should still enumerate all packages: [ 0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0 [ 0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1 [ 0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2 [ 0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3 [ 0.550524] smpboot: Max logical packages: 19 3. nr_cpus=49 ( 2 socket + 1 core on 3rd socket), should stop enumerating in package 2: [ 0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0 [ 0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1 [ 0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2 [ 0.539368] smpboot: Max logical packages: 19 4. maxcpus=49, should still enumerate all packages: [ 0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0 [ 0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1 [ 0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2 [ 0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3 [ 0.549624] smpboot: Max logical packages: 19 5. kdump (nr_cpus=1) works as well. Reported-by: Frank Ramsay <framsay@redhat.com> Tested-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160815101700.GA30090@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=yBorislav Petkov
Similar to: efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y") ... fix microcode loading from the initrd on AMD by adding the randomization offset to the microcode patch container within the initrd. Reported-and-tested-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-tip-commits@vger.kernel.org Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: PM / hibernate: Fix rtree_next_node() to avoid walking off list ends x86/power/64: Use __pa() for physical address computation PM / sleep: Update some system sleep documentation
2016-08-17arm64: kernel: avoid literal load of virtual address with MMU offArd Biesheuvel
Literal loads of virtual addresses are subject to runtime relocation when CONFIG_RELOCATABLE=y, and given that the relocation routines run with the MMU and caches enabled, literal loads of relocated values performed with the MMU off are not guaranteed to return the latest value unless the memory covering the literal is cleaned to the PoC explicitly. So defer the literal load until after the MMU has been enabled, just like we do for primary_switch() and secondary_switch() in head.S. Fixes: 1e48ef7fcc37 ("arm64: add support for building vmlinux as a relocatable PIE binary") Cc: <stable@vger.kernel.org> # 4.6+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-17arm64: Fix NUMA build error when !CONFIG_ACPICatalin Marinas
Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is enabled, disabling the latter leads to the following build error on arm64: arch/arm64/mm/numa.c: In function ‘arm64_numa_init’: arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function) if (!acpi_disabled && !numa_init(arm64_acpi_numa_init)) This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for the arm64_acpi_numa_init() definition. Fixes: d8b47fca8c23 ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT") Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of bug fixes, minor cleanup and a change to the default config" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: fix failing CUIR assignment under LPAR s390/pageattr: handle numpages parameter correctly s390/dasd: fix hanging device after clear subchannel s390/qdio: avoid reschedule of outbound tasklet once killed s390/qdio: remove checks for ccw device internal state s390/qdio: fix double return code evaluation s390/qdio: get rid of spin_lock_irqsave usage s390/cio: remove subchannel_id from ccw_device_private s390/qdio: obtain subchannel_id via ccw_device_get_schid() s390/cio: stop using subchannel_id from ccw_device_private s390/config: make the vector optimized crc function builtin s390/lib: fix memcmp and strstr s390/crc32-vx: Fix checksum calculation for small sizes s390: clarify compressed image code path
2016-08-15x86/power/64: Use __pa() for physical address computationRafael J. Wysocki
The value of temp_level4_pgt is the physical address of the top-level page directory, so use __pa() to compute it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2016-08-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fix from Greg Ungerer: "This contains only a single fix for a register corruption problem on certain types of m68k flat format binaries" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: fix user a5 register being overwritten
2016-08-14Merge tag 'fixes-for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull h8300 and unicore32 architecture fixes from Guenter Roeck: "Two patches to fix h8300 and unicore32 builds. unicore32 builds have been broken since v4.6. The fix has been available in -next since March of this year. h8300 builds have been broken since the last commit window. The fix has been available in -next since June of this year" * tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: h8300: Add missing include file to asm/io.h unicore32: mm: Add missing parameter to arch_vma_access_permitted
2016-08-14Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - support for nr_cpus= command line argument (maxcpus was previously changed to allow secondary CPUs to be hot-plugged) - ARM PMU interrupt handling fix - fix potential TLB conflict in the hibernate code - improved handling of EL1 instruction aborts (better error reporting) - removal of useless jprobes code for stack saving/restoring - defconfig updates * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO arm64: defconfig: add options for virtualization and containers arm64: hibernate: handle allocation failures arm64: hibernate: avoid potential TLB conflict arm64: Handle el1 synchronous instruction aborts cleanly arm64: Remove stack duplicating code from jprobes drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock arm64: Support hard limit of cpu count by nr_cpus
2016-08-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "KVM: - lock kvm_device list to prevent corruption on device creation. PPC: - split debugfs initialization from creation of the xics device to unlock the newly taken kvm lock earlier. s390: - prevent userspace from triggering two WARN_ON_ONCE. MIPS: - fix several issues in the management of TLB faults (Cc: stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MIPS: KVM: Propagate kseg0/mapped tlb fault errors MIPS: KVM: Fix gfn range check in kseg0 tlb faults MIPS: KVM: Add missing gfn range check MIPS: KVM: Fix mapped fault broken commpage handling KVM: Protect device ops->create and list_add with kvm->lock KVM: PPC: Move xics_debugfs_init out of create KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed KVM: s390: set the prefix initially properly
2016-08-13h8300: Add missing include file to asm/io.hGuenter Roeck
h8300 builds fail with arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’ arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’ arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’ and many related errors. Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix") Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-08-13unicore32: mm: Add missing parameter to arch_vma_access_permittedGuenter Roeck
unicore32 fails to compile with the following errors. mm/memory.c: In function ‘__handle_mm_fault’: mm/memory.c:3381: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘check_vma_flags’: mm/gup.c:456: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘vma_permits_fault’: mm/gup.c:640: error: too many arguments to function ‘arch_vma_access_permitted’ Fixes: d61172b4b695b ("mm/core, x86/mm/pkeys: Differentiate instruction fetches") Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
2016-08-12Merge tag 'pm-4.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Two hibernation fixes allowing it to work with the recently added randomization of the kernel identity mapping base on x86-64 and one cpufreq driver regression fix. Specifics: - Fix the x86 identity mapping creation helpers to avoid the assumption that the base address of the mapping will always be aligned at the PGD level, as it may be aligned at the PUD level if address space randomization is enabled (Rafael Wysocki). - Fix the hibernation core to avoid executing tracing functions before restoring the processor state completely during resume (Thomas Garnier). - Fix a recently introduced regression in the powernv cpufreq driver that causes it to crash due to an out-of-bounds array access (Akshay Adiga)" * tag 'pm-4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly cpufreq: powernv: Fix crash in gpstate_timer_handler()
2016-08-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This is bigger than usual - the reason is partly a pent-up stream of fixes after the merge window and partly accidental. The fixes are: - five patches to fix a boot failure on Andy Lutomirsky's laptop - four SGI UV platform fixes - KASAN fix - warning fix - documentation update - swap entry definition fix - pkeys fix - irq stats fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/x2apic, smp/hotplug: Don't use before alloc in x2apic_cluster_probe() x86/efi: Allocate a trampoline if needed in efi_free_boot_services() x86/boot: Rework reserve_real_mode() to allow multiple tries x86/boot: Defer setup_real_mode() to early_initcall time x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly x86/boot: Run reserve_bios_regions() after we initialize the memory map x86/irq: Do not substract irq_tlb_count from irq_call_count x86/mm: Fix swap entry comment and macro x86/mm/kaslr: Fix -Wformat-security warning x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems x86/platform/UV: Fix problem with UV4 BIOS providing incorrect PXM values x86/platform/UV: Fix bug with iounmap() of the UV4 EFI System Table causing a crash x86/platform/UV: Fix problem with UV4 Socket IDs not being contiguous x86/entry: Clarify the RF saving/restoring situation with SYSCALL/SYSRET x86/mm: Disable preemption during CR3 read+write x86/mm/KASLR: Increase BRK pages for KASLR memory randomization x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text
2016-08-12Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Misc fixes: a /dev/rtc regression fix, two APIC timer period calibration fixes, an ARM clocksource driver fix and a NOHZ power use regression fix" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hpet: Fix /dev/rtc breakage caused by RTC cleanup x86/timers/apic: Inform TSC deadline clockevent device about recalibration x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error timers: Fix get_next_timer_interrupt() computation clocksource/arm_arch_timer: Force per-CPU interrupt to be level-triggered
2016-08-12Merge branches 'pm-sleep' and 'pm-cpufreq'Rafael J. Wysocki
* pm-sleep: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly * pm-cpufreq: cpufreq: powernv: Fix crash in gpstate_timer_handler()
2016-08-12Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus two uncore-PMU fixes, an uprobes fix, a perf-cgroups fix and an AUX events fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add enable_box for client MSR uncore perf/x86/intel/uncore: Fix uncore num_counters uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions perf/core: Set cgroup in CPU contexts for new cgroup events perf/core: Fix sideband list-iteration vs. event ordering NULL pointer deference crash perf probe ppc64le: Fix probe location when using DWARF perf probe: Add function to post process kernel trace events tools: Sync cpufeatures headers with the kernel toops: Sync tools/include/uapi/linux/bpf.h with the kernel tools: Sync cpufeatures.h and vmx.h with the kernel perf probe: Support signedness casting perf stat: Avoid skew when reading events perf probe: Fix module name matching perf probe: Adjust map->reloc offset when finding kernel symbol from map perf hists: Trim libtraceevent trace_seq buffers perf script: Add 'bpf-output' field to usage message
2016-08-12Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "A fix for EFI capsules and an SGI UV platform fix" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/capsule: Allocate whole capsule into virtual memory x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled case
2016-08-12Merge tag 'powerpc-4.8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some powerpc fixes for 4.8: Misc: - powerpc/vdso: Fix build rules to rebuild vdsos correctly from Nicholas Piggin - powerpc/ptrace: Fix coredump since ptrace TM changes from Cyril Bur - powerpc/32: Fix csum_partial_copy_generic() from Christophe Leroy - cxl: Set psl_fir_cntl to production environment value from Frederic Barrat - powerpc/eeh: Switch to conventional PCI address output in EEH log from Guilherme G. Piccoli - cxl: Use fixed width predefined types in data structure. from Philippe Bergheaud - powerpc/vdso: Add missing include file from Guenter Roeck - powerpc: Fix unused function warning 'lmb_to_memblock' from Alastair D'Silva - powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again from Alexey Kardashevskiy - powerpc/cell: Add missing error code in spufs_mkgang() from Dan Carpenter - crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading from Anton Blanchard - powerpc/pasemi: Fix coherent_dma_mask for dma engine from Darren Stevens Benjamin Herrenschmidt: - powerpc/32: Fix crash during static key init - powerpc: Update obsolete comment in setup_32.c about early_init() - powerpc: Print the kernel load address at the end of prom_init() - powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs - powerpc/xics: Properly set Edge/Level type and enable resend Mahesh Salgaonkar: - powerpc/book3s: Fix MCE console messages for unrecoverable MCE. - powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers. - powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h - powerpc/powernv: Load correct TOC pointer while waking up from winkle. Andrew Donnellan: - cxl: Fix sparse warnings - cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests Michael Ellerman: - selftests/powerpc: Specify we expect to build with std=gnu99 - powerpc/Makefile: Use cflags-y/aflags-y for setting endian options - powerpc/pci: Fix endian bug in fixed PHB numbering" * tag 'powerpc-4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (26 commits) selftests/powerpc: Specify we expect to build with std=gnu99 powerpc/vdso: Fix build rules to rebuild vdsos correctly powerpc/Makefile: Use cflags-y/aflags-y for setting endian options powerpc/32: Fix crash during static key init powerpc: Update obsolete comment in setup_32.c about early_init() powerpc: Print the kernel load address at the end of prom_init() powerpc/ptrace: Fix coredump since ptrace TM changes powerpc/32: Fix csum_partial_copy_generic() cxl: Set psl_fir_cntl to production environment value powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs powerpc/book3s: Fix MCE console messages for unrecoverable MCE. powerpc/pci: Fix endian bug in fixed PHB numbering powerpc/eeh: Switch to conventional PCI address output in EEH log cxl: Fix sparse warnings cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests cxl: Use fixed width predefined types in data structure. powerpc/vdso: Add missing include file powerpc: Fix unused function warning 'lmb_to_memblock' powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers. powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h ...
2016-08-12arm64: defconfig: enable CONFIG_LOCALVERSION_AUTOMasahiro Yamada
When CONFIG_LOCALVERSION_AUTO is disabled, the version string is just a tag name (or with a '+' appended if HEAD is not a tagged commit). During the development (and especially when git-bisecting), longer version string would be helpful to identify the commit we are running. This is a default y option, so drop the unset to enable it. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-12arm64: defconfig: add options for virtualization and containersRiku Voipio
Enable options commonly needed by popular virtualization and container applications. Use modules when possible to avoid too much overhead for users not interested. - add namespace and cgroup options needed - add seccomp - optional, but enhances Qemu etc - bridge, nat, veth, macvtap and multicast for routing guests and containers - btfrs and overlayfs modules for container COW backends - while near it, make fuse a module instead of built-in. Generated with make saveconfig and dropping unrelated spurious change hunks while commiting. bloat-o-meter old-vmlinux vmlinux: add/remove: 905/390 grow/shrink: 767/229 up/down: 183513/-94861 (88652) .... Total: Before=10515408, After=10604060, chg +0.84% Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>