summaryrefslogtreecommitdiff
path: root/arch/arm/kernel
AgeCommit message (Collapse)Author
2016-06-19Merge tag 'fixes-rcu-fiq-signed' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Fixes for omaps for v4.7-rc cycle: - Two boot warning fixes from the RCU tree that should have gotten merged several weeks ago already but did not because of issues with who merges them. Paul has now split the RCU warning fixes into sets for various maintainers. - Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes - Fix PM for omap3 boards using timer12 and gptimer, like the original beagleboard - Fix hangs on am437x-sk-evm by lowering the I2C bus speed * tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218 ARM: OMAP2+: timer: add probe for clocksources ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ arm: Use _rcuidle for smp_cross_call() tracepoints arm: Use _rcuidle tracepoint to allow use from idle Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-14arm: Use _rcuidle for smp_cross_call() tracepointsPaul E. McKenney
Further testing with false negatives suppressed by commit 293e2421fe25 ("rcu: Remove superfluous versions of rcu_read_lock_sched_held()") identified another unprotected use of RCU from the idle loop. Because RCU actively ignores idle-loop code (for energy-efficiency reasons, among other things), using RCU from the idle loop can result in too-short grace periods, in turn resulting in arbitrary misbehavior. The resulting lockdep-RCU splat is as follows: ------------------------------------------------------------------------ =============================== [ INFO: suspicious RCU usage. ] 4.6.0-rc5-next-20160426+ #1112 Not tainted ------------------------------- include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! no locks held by swapper/0/0. stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112 Hardware name: Generic OMAP4 (Flattened Device Tree) [<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14) [<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4) [<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188) [<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c) [<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0 x38/0x9c) [<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8) [<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390) [<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0) [<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8) [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c) ------------------------------------------------------------------------ Reported-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <linux-omap@vger.kernel.org> Cc: <linux-arm-kernel@lists.infradead.org>
2016-05-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Mostly tooling and PMU driver fixes, but also a number of late updates such as the reworking of the call-chain size limiting logic to make call-graph recording more robust, plus tooling side changes for the new 'backwards ring-buffer' extension to the perf ring-buffer" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) perf record: Read from backward ring buffer perf record: Rename variable to make code clear perf record: Prevent reading invalid data in record__mmap_read perf evlist: Add API to pause/resume perf trace: Use the ptr->name beautifier as default for "filename" args perf trace: Use the fd->name beautifier as default for "fd" args perf report: Add srcline_from/to branch sort keys perf evsel: Record fd into perf_mmap perf evsel: Add overwrite attribute and check write_backward perf tools: Set buildid dir under symfs when --symfs is provided perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced perf annotate: Sort list of recognised instructions perf annotate: Fix identification of ARM blt and bls instructions perf tools: Fix usage of max_stack sysctl perf callchain: Stop validating callchains by the max_stack sysctl perf trace: Fix exit_group() formatting perf top: Use machine->kptr_restrict_warned perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1 perf machine: Do not bail out if not managing to read ref reloc symbol perf/x86/intel/p4: Trival indentation fix, remove space ...
2016-05-24vdso: make arch_setup_additional_pages wait for mmap_sem for write killableMichal Hocko
most architectures are relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso] Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-21printk/nmi: generic solution for safe printk in NMIPetr Mladek
printk() takes some locks and could not be used a safe way in NMI context. The chance of a deadlock is real especially when printing stacks from all CPUs. This particular problem has been addressed on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack trace on all CPUs"). The patchset brings two big advantages. First, it makes the NMI backtraces safe on all architectures for free. Second, it makes all NMI messages almost safe on all architectures (the temporary buffer is limited. We still should keep the number of messages in NMI context at minimum). Note that there already are several messages printed in NMI context: WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE handlers. These are not easy to avoid. This patch reuses most of the code and makes it generic. It is useful for all messages and architectures that support NMI. The alternative printk_func is set when entering and is reseted when leaving NMI context. It queues IRQ work to copy the messages into the main ring buffer in a safe context. __printk_nmi_flush() copies all available messages and reset the buffer. Then we could use a simple cmpxchg operations to get synchronized with writers. There is also used a spinlock to get synchronized with other flushers. We do not longer use seq_buf because it depends on external lock. It would be hard to make all supported operations safe for a lockless use. It would be confusing and error prone to make only some operations safe. The code is put into separate printk/nmi.c as suggested by Steven Rostedt. It needs a per-CPU buffer and is compiled only on architectures that call nmi_enter(). This is achieved by the new HAVE_NMI Kconfig flag. The are MN10300 and Xtensa architectures. We need to clean up NMI handling there first. Let's do it separately. The patch is heavily based on the draft from Peter Zijlstra, see https://lkml.org/lkml/2015/6/10/327 [arnd@arndb.de: printk-nmi: use %zu format string for size_t] [akpm@linux-foundation.org: min_t->min - all types are size_t here] Signed-off-by: Petr Mladek <pmladek@suse.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> [arm part] Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jiri Kosina <jkosina@suse.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-21exit_thread: accept a task parameter to be exitedJiri Slaby
We need to call exit_thread from copy_process in a fail path. So make it accept task_struct as a parameter. [v2] * s390: exit_thread_runtime_instr doesn't make sense to be called for non-current tasks. * arm: fix the comment in vfp_thread_copy * change 'me' to 'tsk' for task_struct * now we can change only archs that actually have exit_thread [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Changes included in this pull request: - revert pxa2xx-flash back to using ioremap_cached() and switch memremap() to use arch_memremap_wb() - remove pci=firmware command line argument handling - remove unnecessary arm_dma_set_mask() implementation, the generic implementation will do for ARM - removal of the ARM kallsyms "hack" to work around mode switching veneers and vectors located below PAGE_OFFSET - tidy up build system output a little - add L2 cache power management DT bindings - remove duplicated local_irq_disable() in reboot paths - handle AMBA primecell devices better at registration time with PM domains (needed for Samsung SoCs) - ARM specific preparation to support Keystone II kexec" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs ARM: 8570/2: Documentation: devicetree: Add PL310 PM bindings ARM: 8569/1: pl2x0: Add OF control of cache power management ARM: 8568/1: reboot: remove duplicated local_irq_disable() ARM: 8566/1: drivers: amba: properly handle devices with power domains ARM: provide arm_has_idmap_alias() helper ARM: kexec: remove 512MB restriction on kexec crashdump ARM: provide improved virt_to_idmap() functionality ARM: kexec: fix crashkernel= handling ARM: 8557/1: specify install, zinstall, and uinstall as PHONY targets ARM: 8562/1: suppress "include/generated/mach-types.h is up to date." ARM: 8553/1: kallsyms: remove --page-offset command line option ARM: 8552/1: kallsyms: remove special lower address limit for CONFIG_ARM ARM: 8555/1: kallsyms: ignore ARM mode switching veneers ARM: 8548/1: dma-mapping: remove arm_dma_set_mask() ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handling ARM: memremap: implement arch_memremap_wb() memremap: add arch specific hook for MEMREMAP_WB mappings mtd: pxa2xx-flash: switch back from memremap to ioremap_cached ARM: reintroduce ioremap_cached() for creating cached I/O mappings
2016-05-20Merge tag 'perf-core-for-mingo-20160516' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Honour the kernel.perf_event_max_stack knob more precisely by not counting PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo) - Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim) - Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim) - Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim) - Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen) - Store vdso buildid unconditionally, as it appears in callchains and we're not checking those when creating the build-id table, so we end up not being able to resolve VDSO symbols when doing analysis on a different machine than the one where recording was done, possibly of a different arch even (arm -> x86_64) (He Kuang) Infrastructure changes: - Generalize max_stack sysctl handler, will be used for configuring multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo) Cleanups: - Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using open coded strings (Masami Hiramatsu) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-19Merge branches 'amba', 'devel-stable', 'kexec-for-next' and 'misc' into ↵Russell King
for-linus
2016-05-17Merge tag 'pm-4.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The majority of changes go into the cpufreq subsystem this time. To me, quite obviously, the biggest ticket item is the new "schedutil" governor. Interestingly enough, it's the first new cpufreq governor since the beginning of the git era (except for some out-of-the-tree ones). There are two main differences between it and the existing governors. First, it uses the information provided by the scheduler directly for making its decisions, so it doesn't have to track anything by itself. Second, it can invoke drivers (supporting that feature) to adjust CPU performance right away without having to spawn work items to be executed in process context or similar. Currently, the acpi-cpufreq driver is the only one supporting that mode of operation, but then it is used on a large number of systems. The "schedutil" governor as included here is very simple and mostly regarded as a foundation for future work on the integration of the scheduler with CPU power management (in fact, there is work in progress on top of it already). Nevertheless it works and the preliminary results obtained with it are encouraging. There also is some consolidation of CPU frequency management for ARM platforms that can add their machine IDs the the new stub dt-platdev driver now and that will take care of creating the requisite platform device for cpufreq-dt, so it is not necessary to do that in platform code any more. Several ARM platforms are switched over to using this generic mechanism. In addition to that, the intel_pstate driver is now going to respect CPU frequency limits set by the platform firmware (or a BMC) and provided via the ACPI _PPC object. The devfreq subsystem is getting a new "passive" governor for SoCs subsystems that will depend on somebody else to manage their voltage rails and its support for Samsung Exynos SoCs is consolidated. The rest is support for new hardware (Intel Broxton support in intel_idle for one example), bug fixes, optimizations and cleanups in a number of places. Specifics: - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki) - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez) - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada) - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi) - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches) - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat) - cpufreq qoriq driver fixes and cleanups (Jia Hongtao) - ACPI cpufreq driver cleanups (Viresh Kumar) - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla) - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann) - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla) - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham) - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King) - Intel Broxton support for the intel_idle driver (Len Brown) - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach) - ARM cpuidle cleanups (Jisheng Zhang) - Intel Kabylake support for the RAPL power capping driver (Jacob Pan) - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner) - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger)" * tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits) intel_pstate: Clean up get_target_pstate_use_performance() intel_pstate: Use sample.core_avg_perf in get_avg_pstate() intel_pstate: Clarify average performance computation intel_pstate: Avoid unnecessary synchronize_sched() during initialization cpufreq: schedutil: Make default depend on CONFIG_SMP cpufreq: powernv: del_timer_sync when global and local pstate are equal cpufreq: powernv: Move smp_call_function_any() out of irq safe block intel_pstate: Clean up intel_pstate_get() cpufreq: schedutil: Make it depend on CONFIG_SMP cpufreq: governor: Fix handling of special cases in dbs_update() PM / OPP: Move CONFIG_OF dependent code in a separate file cpufreq: intel_pstate: Ignore _PPC processing under HWP cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table cpufreq: tango: Use generic platdev driver PM / OPP: pass cpumask by reference cpufreq: Fix GOV_LIMITS handling for the userspace governor cpupower: fix potential memory leak PM / devfreq: style/typo fixes PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus ..
2016-05-17perf core: Add a 'nr' field to perf_event_callchain_contextArnaldo Carvalho de Melo
We will use it to count how many addresses are in the entry->ip[] array, excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really return the number of entries specified by the user via the relevant sysctl, kernel.perf_event_max_contexts, or via the per event perf_event_attr.sample_max_stack knob. This way we keep the perf_sample->ip_callchain->nr meaning, that is the number of entries, be it real addresses or PERF_CONTEXT_ entries, while honouring the max_stack knobs, i.e. the end result will be max_stack entries if we have at least that many entries in a given stack trace. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-17perf core: Pass max stack as a perf_callchain_entry contextArnaldo Carvalho de Melo
This makes perf_callchain_{user,kernel}() receive the max stack as context for the perf_callchain_entry, instead of accessing the global sysctl_perf_event_max_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Bigger kernel side changes: - Add backwards writing capability to the perf ring-buffer code, which is preparation for future advanced features like robust 'overwrite support' and snapshot mode. (Wang Nan) - Add pause and resume ioctls for the perf ringbuffer (Wang Nan) - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner) - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter Zijlstra) - x86 AUX (Intel PT) related enhancements and updates (Alexander Shishkin) - x86 MSR PMU driver enhancements and updates (Huang Rui) - ... and lots of other changes spread out over 40+ commits. Biggest tooling side changes: - 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo) - BPF tooling updates (Wang Nan) - 'perf sched' updates (Jiri Olsa) - 'perf probe' updates (Masami Hiramatsu) - ... plus 200+ other enhancements, fixes and cleanups to tools/ The merge commits, the shortlog and the changelogs contain a lot more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits) perf/core: Disable the event on a truncated AUX record perf/x86/intel/pt: Generate PMI in the STOP region as well perf buildid-cache: Use lsdir() for looking up buildid caches perf symbols: Use lsdir() for the search in kcore cache directory perf tools: Use SBUILD_ID_SIZE where applicable perf tools: Fix lsdir to set errno correctly perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/ perf trace: Move flock op beautifier to tools/perf/trace/beauty/ perf build: Add build-test for debug-frame on arm/arm64 perf build: Add build-test for libunwind cross-platforms support perf script: Fix export of callchains with recursion in db-export perf script: Fix callchain addresses in db-export perf script: Fix symbol insertion behavior in db-export perf symbols: Add dso__insert_symbol function perf scripting python: Use Py_FatalError instead of die() perf tools: Remove xrealloc and ALLOC_GROW perf help: Do not use ALLOC_GROW in add_cmd_list perf pmu: Make pmu_formats_string to check return value of strbuf perf header: Make topology checkers to check return value of strbuf perf tools: Make alias handler to check return value of strbuf ...
2016-05-16Merge branch 'efi-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Drop the unused EFI_SYSTEM_TABLES efi.flags bit and ensure the ARM/arm64 EFI System Table mapping is read-only (Ard Biesheuvel) - Add a comment to explain that one of the code paths in the x86/pat code is only executed for EFI boot (Matt Fleming) - Improve Secure Boot status checks on arm64 and handle unexpected errors (Linn Crosetto) - Remove the global EFI memory map variable 'memmap' as the same information is already available in efi::memmap (Matt Fleming) - Add EFI Memory Attribute table support for ARM/arm64 (Ard Biesheuvel) - Add EFI GOP framebuffer support for ARM/arm64 (Ard Biesheuvel) - Add EFI Bootloader Control driver for storing reboot(2) data in EFI variables for consumption by bootloaders (Jeremy Compostella) - Add Core EFI capsule support (Matt Fleming) - Add EFI capsule char driver (Kweh, Hock Leong) - Unify EFI memory map code for ARM and arm64 (Ard Biesheuvel) - Add generic EFI support for detecting when firmware corrupts CPU status register bits (like IRQ flags) when performing EFI runtime service calls (Mark Rutland) ... and other misc cleanups" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) efivarfs: Make efivarfs_file_ioctl() static efi: Merge boolean flag arguments efi/capsule: Move 'capsule' to the stack in efi_capsule_supported() efibc: Fix excessive stack footprint warning efi/capsule: Make efi_capsule_pending() lockless efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver efi/runtime-wrappers: Remove ARCH_EFI_IRQ_FLAGS_MASK #ifdef x86/efi: Enable runtime call flag checking arm/efi: Enable runtime call flag checking arm64/efi: Enable runtime call flag checking efi/runtime-wrappers: Detect firmware IRQ flag corruption efi/runtime-wrappers: Remove redundant #ifdefs x86/efi: Move to generic {__,}efi_call_virt() arm/efi: Move to generic {__,}efi_call_virt() arm64/efi: Move to generic {__,}efi_call_virt() efi/runtime-wrappers: Add {__,}efi_call_virt() templates efi/arm-init: Reserve rather than unmap the memory map for ARM as well efi: Add misc char driver interface to update EFI firmware x86/efi: Force EFI reboot to process pending capsules efi: Add 'capsule' update support ...
2016-05-16Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: cpuidle: Replace ktime_get() with local_clock() drivers: firmware: psci: use const and __initconst for psci_cpuidle_ops soc: qcom: spm: Use const and __initconst for qcom_cpuidle_ops ARM: cpuidle: constify return value of arm_cpuidle_get_ops() ARM: cpuidle: add const qualifier to cpuidle_ops member in structures intel_idle: add BXT support cpuidle: Indicate when a device has been unregistered
2016-05-11Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05ARM: 8572/1: nommu: change memory reserve for the vectorsJean-Philippe Brucker
Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an additional page above the base vector one. This change wasn't taken into account by the nommu memreserve. This patch ensures that the kernel won't overwrite any vector stub on nommu. [changed the MPU side too] Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05ARM: 8568/1: reboot: remove duplicated local_irq_disable()Jisheng Zhang
Once entering machine_halt() and machine_restart(), local_irq_disable() is called, and local irq is kept disabled, so the local_irq_disable() at the end of these two functions are not necessary, remove it. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03ARM: kexec: remove 512MB restriction on kexec crashdumpRussell King
The real limit is the top of the visible physical address space with the MMU turned off. Hence, we need to limit the crash kernel allocation running-view physical address of the top of the boot-view physical address space. Reviewed-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-03ARM: kexec: fix crashkernel= handlingRussell King
When the kernel crashkernel parameter is specified with just a size, we are supposed to allocate a region from RAM to store the crashkernel. However, ARM merely reserves physical address zero with no checking that there is even RAM there. Fix this by lifting similar code from x86, importing it to ARM with the ARM specific parameters added. In the absence of any platform specific information, we allocate the crashkernel region from the first 512MB of physical memory. Update the kdump documentation to reflect this change. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: Pratyush Anand <panand@redhat.com>
2016-04-28efi/arm/libstub: Make screen_info accessible to the UEFI stubArd Biesheuvel
In order to hand over the framebuffer described by the GOP protocol and discovered by the UEFI stub, make struct screen_info accessible by the stub. This involves allocating a loader data buffer and passing it to the kernel proper via a UEFI Configuration Table, since the UEFI stub executes in the context of the decompressor, and cannot access the kernel's copy of struct screen_info directly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: David Herrmann <dh.herrmann@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-22-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28ARM/efi: Apply strict permissions for UEFI Runtime Services regionsArd Biesheuvel
Recent UEFI versions expose permission attributes for runtime services memory regions, either in the UEFI memory map or in the separate memory attributes table. This allows the kernel to map these regions with stricter permissions, rather than the RWX permissions that are used by default. So wire this up in our mapping routine. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-11-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-27perf core: Allow setting up max frame stack depth via sysctlArnaldo Carvalho de Melo
The default remains 127, which is good for most cases, and not even hit most of the time, but then for some cases, as reported by Brendan, 1024+ deep frames are appearing on the radar for things like groovy, ruby. And in some workloads putting a _lower_ cap on this may make sense. One that is per event still needs to be put in place tho. The new file is: # cat /proc/sys/kernel/perf_event_max_stack 127 Chaging it: # echo 256 > /proc/sys/kernel/perf_event_max_stack # cat /proc/sys/kernel/perf_event_max_stack 256 But as soon as there is some event using callchains we get: # echo 512 > /proc/sys/kernel/perf_event_max_stack -bash: echo: write error: Device or resource busy # Because we only allocate the callchain percpu data structures when there is a user, which allows for changing the max easily, its just a matter of having no callchain users at that point. Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26Merge branch 'cpuidle/4.7' of ↵Rafael J. Wysocki
http://git.linaro.org/people/daniel.lezcano/linux into tmp Pull ARM cpuidle changes for v4.7 from Daniel Lezcano. * 'cpuidle/4.7' of http://git.linaro.org/people/daniel.lezcano/linux: drivers: firmware: psci: use const and __initconst for psci_cpuidle_ops soc: qcom: spm: Use const and __initconst for qcom_cpuidle_ops ARM: cpuidle: constify return value of arm_cpuidle_get_ops() ARM: cpuidle: add const qualifier to cpuidle_ops member in structures
2016-04-20ARM: cpuidle: constify return value of arm_cpuidle_get_ops()Jisheng Zhang
arm_cpuidle_read_ops() just copies '*ops' to cpuidle_ops[cpu], so the structure '*ops' is not modified at all. The comment is also updated accordingly. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2016-04-19ARM: 8563/1: fix demoting HWCAP_SWPVladimir Murzin
Commit b8c9592 "ARM: 8318/1: treat CPU feature register fields as signed quantities" accidentally altered cpuid register used to demote HWCAP_SWP. ARM ARM says that SyncPrim_instrs bits in ID_ISAR3 should be used with SynchPrim_instrs_frac from ID_ISAR4. So, follow this rule. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-13Merge tag 'v4.6-rc3' into perf/core, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-07ARM: 8550/1: protect idiv patching against undefined gcc behaviorNicolas Pitre
It was reported that a kernel with CONFIG_ARM_PATCH_IDIV=y stopped booting when compiled with the upcoming gcc 6. Turns out that turning a function address into a writable array is undefined and gcc 6 decided it was OK to omit the store to the first word of the function while still preserving the store to the second word. Even though gcc 6 is now fixed to behave more coherently, it is a mystery that gcc 4 and gcc 5 actually produce wanted code in the kernel. And in fact the reduced test case to illustrate the issue does indeed break with gcc < 6 as well. In any case, let's guard the kernel against undefined compiler behavior by hiding the nature of the array location as suggested by gcc developers. Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 Signed-off-by: Nicolas Pitre <nico@linaro.org> Reported-by: Marcin Juszkiewicz <mjuszkiewicz@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org # v4.5 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-07ARM: wire up preadv2 and pwritev2 syscallsRussell King
Wire up the preadv2 and pwritev2 syscalls for ARM. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-04ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handlingLorenzo Pieralisi
According to kernel documentation, the pci=firmware command line parameter is only meant to be used on IXP2000 ARM platforms to prevent the kernel from assigning PCI resources configured by the bootloader. Since the IXP2000 ARM platforms support has been removed from the kernel in commit: commit c65f2abf54a6 ("ARM: remove ixp23xx and ixp2000 platforms") its platforms specific kernel parameters should be removed too from the kernel documentation along with the kernel code currently handling them in that they have just become obsolete. This patch removes the pci=firmware command line parameter handling from ARM code and the related kernel parameters documentation section. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Lennert Buytenhek <kernel@wantstofly.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Lennert Buytenhek <kernel@wantstofly.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-31perf/core: Set event's default ::overflow_handler()Wang Nan
Set a default event->overflow_handler in perf_event_alloc() so don't need to check event->overflow_handler in __perf_event_overflow(). Following commits can give a different default overflow_handler. Initial idea comes from Peter: http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net Since the default value of event->overflow_handler is not NULL, existing 'if (!overflow_handler)' checks need to be changed. is_default_overflow_handler() is introduced for this. No extra performance overhead is introduced into the hot path because in the original code we still need to read this handler from memory. A conditional branch is avoided so actually we remove some instructions. Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459147292-239310-3-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-25arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko
KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-21Merge tag 'arm64-perf' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm[64] perf updates from Will Deacon: "I have another mixed bag of ARM-related perf patches here. It's about 25% CPU and 75% interconnect, but with drivers/bus/ languishing without an obvious maintainer or tree, Olof and I agreed to keep all of these PMU patches together. I suspect a whole load of code from drivers/bus/arm-* can be moved under drivers/perf/, so that's on the radar for the future. Summary: - Initial support for ARMv8.1 CPU PMUs - Support for the CPU PMU in Cavium ThunderX - CPU PMU support for systems running 32-bit Linux in secure mode - Support for the system PMU in ARM CCI-550 (Cache Coherent Interconnect)" * tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (26 commits) drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree arm64: perf: Extend ARMV8_EVTYPE_MASK to include PMCR.LC arm-cci: remove unused variable arm-cci: don't return value from void function arm-cci: make private functions static arm-cci: CoreLink CCI-550 PMU driver arm-cci500: Rearrange PMU driver for code sharing with CCI-550 PMU arm-cci: CCI-500: Work around PMU counter writes arm-cci: Provide hook for writing to PMU counters arm-cci: Add helper to enable PMU without synchornising counters arm-cci: Add routines to save/restore all counters arm-cci: Get the status of a counter arm-cci: write_counter: Remove redundant check arm-cci: Delay PMU counter writes to pmu::pmu_enable arm-cci: Refactor CCI PMU enable/disable methods arm-cci: Group writes to counter arm-cci: fix handling cpumask_any_but return value arm-cci: simplify sysfs attr handling drivers/perf: arm_pmu: implement CPU_PM notifier arm64: dts: Add Cavium ThunderX specific PMU ...
2016-03-19Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Another mixture of changes this time around: - Split XIP linker file from main linker file to make it more maintainable, and various XIP fixes, and clean up a resulting macro. - Decompressor cleanups from Masahiro Yamada - Avoid printing an error for a missing L2 cache - Remove some duplicated symbols in System.map, and move vectors/stubs back into kernel VMA - Various low priority fixes from Arnd - Updates to allow bus match functions to return negative errno values, touching some drivers and the driver core. Greg has acked these changes. - Virtualisation platform udpates form Jean-Philippe Brucker. - Security enhancements from Kees Cook - Rework some Kconfig dependencies and move PSCI idle management code out of arch/arm into drivers/firmware/psci.c - ARM DMA mapping updates, touching media, acked by Mauro. - Fix places in ARM code which should be using virt_to_idmap() so that Keystone2 can work. - Fix Marvell Tauros2 to work again with non-DT boots. - Provide a delay timer for ARM Orion platforms" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (45 commits) ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0 ARM: 8547/1: dma-mapping: store buffer information ARM: 8543/1: decompressor: rename suffix_y to compress-y ARM: 8542/1: decompressor: merge piggy.*.S and simplify Makefile ARM: 8541/1: decompressor: drop redundant FORCE in Makefile ARM: 8540/1: decompressor: use clean-files instead of extra-y to clean files ARM: 8539/1: decompressor: drop more unneeded assignments to "targets" ARM: 8538/1: decompressor: drop unneeded assignments to "targets" ARM: 8532/1: uncompress: mark putc as inline ARM: 8531/1: turn init_new_context into an inline function ARM: 8530/1: remove VIRT_TO_BUS ARM: 8537/1: drop unused DEBUG_RODATA from XIP_KERNEL ARM: 8536/1: mm: hide __start_rodata_section_aligned for non-debug builds ARM: 8535/1: mm: DEBUG_RODATA makes no sense with XIP_KERNEL ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs ARM: make the physical-relative calculation more obvious ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNEL ARM: 8411/1: Add default SPARSEMEM settings ARM: 8503/1: clk_register_clkdev: remove format string interface ARM: 8529/1: remove 'i' and 'zi' targets ...
2016-03-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
2016-03-15Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...
2016-03-14Merge branch 'core-resources-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull ram resource handling changes from Ingo Molnar: "Core kernel resource handling changes to support NVDIMM error injection. This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM, for System RAM while keeping the current IORESOURCE_MEM type bit set for all memory-mapped ranges (including System RAM) for backward compatibility. With this resource flag it no longer takes a strcmp() loop through the resource tree to find "System RAM" resources. The new resource type is then used to extend ACPI/APEI error injection facility to also support NVDIMM" * 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ACPI/EINJ: Allow memory error injection to NVDIMM resource: Kill walk_iomem_res() x86/kexec: Remove walk_iomem_res() call with GART type x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search resource: Add walk_iomem_res_desc() memremap: Change region_intersects() to take @flags and @desc arm/samsung: Change s3c_pm_run_res() to use System RAM type resource: Change walk_system_ram() to use System RAM type drivers: Initialize resource entry to zero xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM kexec: Set IORESOURCE_SYSTEM_RAM for System RAM arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM ia64: Set System RAM type and descriptor x86/e820: Set System RAM type and descriptor resource: Add I/O resource descriptor resource: Handle resource flags properly resource: Add System RAM resource type
2016-03-04Merge branches 'amba', 'fixes', 'misc' and 'tauros2' into for-nextRussell King
2016-03-01arch/hotplug: Call into idle with a proper stateThomas Gleixner
Let the non boot cpus call into idle with the corresponding hotplug state, so the hotplug core can handle the further bringup. That's a first step to convert the boot side of the hotplugged cpus to do all the synchronization with the other side through the state machine. For now it'll only start the hotplug thread and kick the full bringup of the cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Rik van Riel <riel@redhat.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-29ARM: KVM: Remove unused hyp_pc fieldMarc Zyngier
This field was never populated, and the panic code already does something similar. Delete the related code. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29ARM: KVM: Cleanup asm-offsets.cMarc Zyngier
Since we don't have much assembler left, most of the KVM stuff in asm-offsets.c is now superfluous. Let's get rid of it. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29ARM: KVM: Move GP registers into the CPU context structureMarc Zyngier
Continuing our rework of the CPU context, we now move the GP registers into the CPU context structure. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29ARM: KVM: Move CP15 array into the CPU context structureMarc Zyngier
Continuing our rework of the CPU context, we now move the CP15 array into the CPU context structure. As this causes quite a bit of churn, we introduce the vcpu_cp15() macro that abstract the location of the actual array. This will probably help next time we have to revisit that code. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29ARM: KVM: Move VFP registers to a CPU context structureMarc Zyngier
In order to turn the WS code into something that looks a bit more like the arm64 version, move the VFP registers into a CPU context container for both the host and the guest. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29ARM: KVM: Move the HYP code to its own sectionMarc Zyngier
In order to be able to spread the HYP code into multiple compilation units, adopt a layout similar to that of arm64: - the HYP text is emited in its own section (.hyp.text) - two linker generated symbols are use to identify the boundaries of that section No functionnal change. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-22ARM: 8537/1: drop unused DEBUG_RODATA from XIP_KERNELKees Cook
With CONFIG_DEBUG_RODATA not being sensible under XIP_KERNEL, remove it from the XIP linker script. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-22ARM: 8536/1: mm: hide __start_rodata_section_aligned for non-debug buildsArnd Bergmann
The __start_rodata_section_aligned is only referenced by the DEBUG_RODATA code, which is only used when the MMU is enabled, but the definition fails on !MMU builds: arch/arm/kernel/vmlinux.lds:702: undefined symbol `SECTION_SHIFT' referenced in expression This hides the symbol whenever DEBUG_RODATA is disabled. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 64ac2e74f0b2 ("ARM: 8502/1: mm: mark section-aligned portion of rodata NX") Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-22ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUsJean-Philippe Brucker
ARMv6 CPUs do not have virtualisation extensions, but hyp-stub.S is still included into the image to keep it generic. In order to use ARMv7 instructions during HYP initialisation, add -march=armv7-a flag to hyp-stub's build. On an ARMv6 CPU, __hyp_stub_install returns as soon as it detects that the mode isn't HYP, so we will never reach those instructions. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-22ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUsJean-Philippe Brucker
ARMv6 CPUs do not have virtualisation extensions, but hyp-stub.S is still included into the image to keep it generic. In order to use ARMv7 instructions during HYP initialisation, add -march=armv7-a flag to hyp-stub's build. On an ARMv6 CPU, __hyp_stub_install returns as soon as it detects that the mode isn't HYP, so we will never reach those instructions. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>