summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2015-04-30Revert "powerpc/tm: Abort syscalls in active transactions"Michael Ellerman
This reverts commit feba40362b11341bee6d8ed58d54b896abbd9f84. Although the principle of this change is good, the implementation has a few issues. Firstly we can sometimes fail to abort a syscall because r12 may have been clobbered by C code if we went down the virtual CPU accounting path, or if syscall tracing was enabled. Secondly we have decided that it is safer to abort the syscall even earlier in the syscall entry path, so that we avoid the syscall tracing path when we are transactional. So that we have time to thoroughly test those changes we have decided to revert this for this merge window and will merge the fixed version in the next window. NB. Rather than reverting the selftest we just drop tm-syscall from TEST_PROGS so that it's not run by default. Fixes: feba40362b11 ("powerpc/tm: Abort syscalls in active transactions") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-26Merge tag 'powerpc-4.1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: - fix for mm_dec_nr_pmds() from Scott. - fixes for oopses seen with KVM + THP from Aneesh. - build fixes from Aneesh & Shreyas. * tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled powerpc/kvm: Fix ppc64_defconfig + PPC_POWERNV=n build error powerpc/mm/thp: Return pte address if we find trans_splitting. powerpc/mm/thp: Make page table walk safe against thp split/collapse KVM: PPC: Remove page table walk helpers KVM: PPC: Use READ_ONCE when dereferencing pte_t pointer powerpc/hugetlb: Call mm_dec_nr_pmds() in hugetlb_free_pmd_range()
2015-04-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull second batch of KVM changes from Paolo Bonzini: "This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390. The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits) KVM: arm/arm64: check IRQ number on userland injection KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi KVM: VMX: Preserve host CR4.MCE value while in guest mode. KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C KVM: PPC: Book3S HV: Streamline guest entry and exit KVM: PPC: Book3S HV: Use bitmap of active threads rather than count KVM: PPC: Book3S HV: Use decrementer to wake napping threads KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu KVM: PPC: Book3S HV: Minor cleanups KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update KVM: PPC: Book3S HV: Accumulate timing information for real-mode code KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT KVM: PPC: Book3S HV: Add ICP real mode counters KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock KVM: PPC: Book3S HV: Add guest->host real mode completion counters KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte ...
2015-04-21KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to CPaul Mackerras
This replaces the assembler code for kvmhv_commence_exit() with C code in book3s_hv_builtin.c. It also moves the IPI sending code that was in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq(). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Use bitmap of active threads rather than countPaul Mackerras
Currently, the entry_exit_count field in the kvmppc_vcore struct contains two 8-bit counts, one of the threads that have started entering the guest, and one of the threads that have started exiting the guest. This changes it to an entry_exit_map field which contains two bitmaps of 8 bits each. The advantage of doing this is that it gives us a bitmap of which threads need to be signalled when exiting the guest. That means that we no longer need to use the trick of setting the HDEC to 0 to pull the other threads out of the guest, which led in some cases to a spurious HDEC interrupt on the next guest entry. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_wokenPaul Mackerras
We can tell when a secondary thread has finished running a guest by the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there is no real need for the nap_count field in the kvmppc_vcore struct. This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu pointers of the secondary threads rather than polling vc->nap_count. Besides reducing the size of the kvmppc_vcore struct by 8 bytes, this also means that we can tell which secondary threads have got stuck and thus print a more informative error message. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpuPaul Mackerras
Rather than calling cond_resched() in kvmppc_run_core() before doing the post-processing for the vcpus that we have just run (that is, calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do that post-processing before calling cond_resched(), and that post- processing is moved out into its own function, post_guest_process(). The reschedule point is now in kvmppc_run_vcpu() and we define a new vcore state, VCORE_PREEMPT, to indicate that that the vcore's runner task is runnable but not running. (Doing the reschedule with the vcore in VCORE_INACTIVE state would be bad because there are potentially other vcpus waiting for the runner in kvmppc_wait_for_exec() which then wouldn't get woken up.) Also, we make use of the handy cond_resched_lock() function, which unlocks and relocks vc->lock for us around the reschedule. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Minor cleanupsPaul Mackerras
* Remove unused kvmppc_vcore::n_busy field. * Remove setting of RMOR, since it was only used on PPC970 and the PPC970 KVM support has been removed. * Don't use r1 or r2 in setting the runlatch since they are conventionally reserved for other things; use r0 instead. * Streamline the code a little and remove the ext_interrupt_to_host label. * Add some comments about register usage. * hcall_try_real_mode doesn't need to be global, and can't be called from C code anyway. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA updatePaul Mackerras
Previously, if kvmppc_run_core() was running a VCPU that needed a VPA update (i.e. one of its 3 virtual processor areas needed to be pinned in memory so the host real mode code can update it on guest entry and exit), we would drop the vcore lock and do the update there and then. Future changes will make it inconvenient to drop the lock, so instead we now remove it from the list of runnable VCPUs and wake up its VCPU task. This will have the effect that the VCPU task will exit kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call to kvmppc_update_vpas() and then rejoin the vcore. The one complication is that the runner VCPU (whose VCPU task is the current task) might be one of the ones that gets removed from the runnable list. In that case we just return from kvmppc_run_core() and let the code in kvmppc_run_vcpu() wake up another VCPU task to be the runner if necessary. This all means that the VCORE_STARTING state is no longer used, so we remove it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Accumulate timing information for real-mode codePaul Mackerras
This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called "timings". This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. The overhead of the extra code amounts to about 30ns for an hcall that is handled in real mode (e.g. H_SET_DABR), which is about 25%. Since production environments may not wish to incur this overhead, the new code is conditional on a new config symbol, CONFIG_KVM_BOOK3S_HV_EXIT_TIMING. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Create debugfs file for each guest's HPTPaul Mackerras
This creates a debugfs directory for each HV guest (assuming debugfs is enabled in the kernel config), and within that directory, a file by which the contents of the guest's HPT (hashed page table) can be read. The directory is named vmnnnn, where nnnn is the PID of the process that created the guest. The file is named "htab". This is intended to help in debugging problems in the host's management of guest memory. The contents of the file consist of a series of lines like this: 3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196 The first field is the index of the entry in the HPT, the second and third are the HPT entry, so the third entry contains the real page number that is mapped by the entry if the entry's valid bit is set. The fourth field is the guest's view of the second doubleword of the entry, so it contains the guest physical address. (The format of the second through fourth fields are described in the Power ISA and also in arch/powerpc/include/asm/mmu-hash64.h.) Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add helpers for lock/unlock hpteAneesh Kumar K.V
This adds helper routines for locking and unlocking HPTEs, and uses them in the rest of the code. We don't change any locking rules in this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Remove RMA-related variables from codeAneesh Kumar K.V
We don't support real-mode areas now that 970 support is removed. Remove the remaining details of rma from the code. Also rename rma_setup_done to hpte_setup_done to better reflect the changes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.Michael Ellerman
Some PowerNV systems include a hardware random-number generator. This HWRNG is present on POWER7+ and POWER8 chips and is capable of generating one 64-bit random number every microsecond. The random numbers are produced by sampling a set of 64 unstable high-frequency oscillators and are almost completely entropic. PAPR defines an H_RANDOM hypercall which guests can use to obtain one 64-bit random sample from the HWRNG. This adds a real-mode implementation of the H_RANDOM hypercall. This hypercall was implemented in real mode because the latency of reading the HWRNG is generally small compared to the latency of a guest exit and entry for all the threads in the same virtual core. Userspace can detect the presence of the HWRNG and the H_RANDOM implementation by querying the KVM_CAP_PPC_HWRNG capability. The H_RANDOM hypercall implementation will only be invoked when the guest does an H_RANDOM hypercall if userspace first enables the in-kernel H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVMDavid Gibson
On POWER, storage caching is usually configured via the MMU - attributes such as cache-inhibited are stored in the TLB and the hashed page table. This makes correctly performing cache inhibited IO accesses awkward when the MMU is turned off (real mode). Some CPU models provide special registers to control the cache attributes of real mode load and stores but this is not at all consistent. This is a problem in particular for SLOF, the firmware used on KVM guests, which runs entirely in real mode, but which needs to do IO to load the kernel. To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to a logical address (aka guest physical address). SLOF uses these for IO. However, because these are implemented within qemu, not the host kernel, these bypass any IO devices emulated within KVM itself. The simplest way to see this problem is to attempt to boot a KVM guest from a virtio-blk device with iothread / dataplane enabled. The iothread code relies on an in kernel implementation of the virtio queue notification, which is not triggered by the IO hcalls, and so the guest will stall in SLOF unable to load the guest OS. This patch addresses this by providing in-kernel implementations of the 2 hypercalls, which correctly scan the KVM IO bus. Any access to an address not handled by the KVM IO bus will cause a VM exit, hitting the qemu implementation as before. Note that a userspace change is also required, in order to enable these new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: fix compilation] Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-20Merge tag 'cpumask-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell: "This is the final removal (after several years!) of the obsolete cpus_* functions, prompted by their mis-use in staging. With these function removed, all cpu functions should only iterate to nr_cpu_ids, so we finally only allocate that many bits when cpumasks are allocated offstack" * tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits) cpumask: remove __first_cpu / __next_cpu cpumask: resurrect CPU_MASK_CPU0 linux/cpumask.h: add typechecking to cpumask_test_cpu cpumask: only allocate nr_cpumask_bits. Fix weird uses of num_online_cpus(). cpumask: remove deprecated functions. mips: fix obsolete cpumask_of_cpu usage. x86: fix more deprecated cpu function usage. ia64: remove deprecated cpus_ usage. powerpc: fix deprecated CPU_MASK_CPU0 usage. CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region. staging/lustre/o2iblnd: Don't use cpus_weight staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_ staging/lustre/ptlrpc: Do not use deprecated cpus_* functions blackfin: fix up obsolete cpu function usage. parisc: fix up obsolete cpu function usage. tile: fix up obsolete cpu function usage. arm64: fix up obsolete cpu function usage. mips: fix up obsolete cpu function usage. x86: fix up obsolete cpu function usage. ...
2015-04-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge third patchbomb from Andrew Morton: - various misc things - a couple of lib/ optimisations - provide DIV_ROUND_CLOSEST_ULL() - checkpatch updates - rtc tree - befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs - ptrace fixes - fork() fixes - seccomp cleanups - more mmap_sem hold time reductions from Davidlohr * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (138 commits) proc: show locks in /proc/pid/fdinfo/X docs: add missing and new /proc/PID/status file entries, fix typos drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic Documentation/spi/spidev_test.c: fix warning drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type .gitignore: ignore *.tar MAINTAINERS: add Mediatek SoC mailing list tomoyo: reduce mmap_sem hold for mm->exe_file powerpc/oprofile: reduce mmap_sem hold for exe_file oprofile: reduce mmap_sem hold for mm->exe_file mips: ip32: add platform data hooks to use DS1685 driver lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text x86: switch to using asm-generic for seccomp.h sparc: switch to using asm-generic for seccomp.h powerpc: switch to using asm-generic for seccomp.h parisc: switch to using asm-generic for seccomp.h mips: switch to using asm-generic for seccomp.h microblaze: use asm-generic for seccomp.h arm: use asm-generic for seccomp.h seccomp: allow COMPAT sigreturn overrides ...
2015-04-17powerpc: switch to using asm-generic for seccomp.hKees Cook
Switch to using the newly created asm-generic/seccomp.h for the seccomp strict mode syscall definitions. The obsolete sigreturn in COMPAT mode is retained as an override. Remaining definitions are identical, though they incorrectly appeared in uapi, which has been corrected. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17powerpc/mm/thp: Return pte address if we find trans_splitting.Aneesh Kumar K.V
For THP that is marked trans splitting, we return the pte. This require the callers to handle the pmd_trans_splitting scenario, if they care. All the current callers are either looking at pfn or write_ok, hence we don't need to update them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-17powerpc/mm/thp: Make page table walk safe against thp split/collapseAneesh Kumar K.V
We can disable a THP split or a hugepage collapse by disabling irq. We do send IPI to all the cpus in the early part of split/collapse, and disabling local irq ensure we don't make progress with split/collapse. If the THP is getting split we return NULL from find_linux_pte_or_hugepte(). For all the current callers it should be ok. We need to be careful if we want to use returned pte_t pointer outside the irq disabled region. W.r.t to THP split, the pfn remains the same, but then a hugepage collapse will result in a pfn change. There are few steps we can take to avoid a hugepage collapse.One way is to take page reference inside the irq disable region. Other option is to take mmap_sem so that a parallel collapse will not happen. We can also disable collapse by taking pmd_lock. Another method used by kvm subsystem is to check whether we had a mmu_notifer update in between using mmu_notifier_retry(). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-17KVM: PPC: Remove page table walk helpersAneesh Kumar K.V
This patch remove helpers which we had used only once in the code. Limiting page table walk variants help in ensuring that we won't end up with code walking page table with wrong assumptions. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-17KVM: PPC: Use READ_ONCE when dereferencing pte_t pointerAneesh Kumar K.V
pte can get updated from other CPUs as part of multiple activities like THP split, huge page collapse, unmap. We need to make sure we don't reload the pte value again and again for different checks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-16Merge tag 'powerpc-4.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: - Numerous minor fixes, cleanups etc. - More EEH work from Gavin to remove its dependency on device_nodes. - Memory hotplug implemented entirely in the kernel from Nathan Fontenot. - Removal of redundant CONFIG_PPC_OF by Kevin Hao. - Rewrite of VPHN parsing logic & tests from Greg Kurz. - A fix from Nish Aravamudan to reduce memory usage by clamping nodes_possible_map. - Support for pstore on powernv from Hari Bathini. - Removal of old powerpc specific byte swap routines by David Gibson. - Fix from Vasant Hegde to prevent the flash driver telling you it was flashing your firmware when it wasn't. - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver. - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek. - Some fixes for migration from Tyrel Datwyler. - A new syscall to switch the cpu endian by Michael Ellerman. - Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn. - A fix for the OPAL sensor driver from Cédric Le Goater. - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman. - Large series from Daniel Axtens to make our PCI hooks per PHB rather than per machine. - Small patch from Sam Bobroff to explicitly abort non-suspended transactions on syscalls, plus a test to exercise it. - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu. - Small patch to enable the hard lockup detector from Anton Blanchard. - Fix from Dave Olson for missing L2 cache information on some CPUs. - Some fixes from Michael Ellerman to get Cell machines booting again. - Freescale updates from Scott: Highlights include BMan device tree nodes, an MSI erratum workaround, a couple minor performance improvements, config updates, and misc fixes/cleanup. * tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits) powerpc/powermac: Fix build error seen with powermac smp builds powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE powerpc: Remove PPC32 code from pseries specific find_and_init_phbs() powerpc/cell: Fix iommu breakage caused by controller_ops change powerpc/eeh: Fix crash in eeh_add_device_early() on Cell powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails powerpc/pseries: Correct memory hotplug locking powerpc: Fix missing L2 cache size in /sys/devices/system/cpu powerpc: Add ppc64 hard lockup detector support oprofile: Disable oprofile NMI timer on ppc64 powerpc/perf/hv-24x7: Add missing put_cpu_var() powerpc/perf/hv-24x7: Break up single_24x7_request powerpc/perf/hv-24x7: Define update_event_count() powerpc/perf/hv-24x7: Whitespace cleanup powerpc/perf/hv-24x7: Define add_event_to_24x7_request() powerpc/perf/hv-24x7: Rename hv_24x7_event_update powerpc/perf/hv-24x7: Move debug prints to separate function powerpc/perf/hv-24x7: Drop event_24x7_request() powerpc/perf/hv-24x7: Use pr_devel() to log message ... Conflicts: tools/testing/selftests/powerpc/Makefile tools/testing/selftests/powerpc/tm/Makefile
2015-04-15Merge branch 'exec_domain_rip_v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc Pull exec domain removal from Richard Weinberger: "This series removes execution domain support from Linux. The idea behind exec domains was to support different ABIs. The feature was never complete nor stable. Let's rip it out and make the kernel signal handling code less complicated" * 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits) arm64: Removed unused variable sparc: Fix execution domain removal Remove rest of exec domains. arch: Remove exec_domain from remaining archs arc: Remove signal translation and exec_domain xtensa: Remove signal translation and exec_domain xtensa: Autogenerate offsets in struct thread_info x86: Remove signal translation and exec_domain unicore32: Remove signal translation and exec_domain um: Remove signal translation and exec_domain tile: Remove signal translation and exec_domain sparc: Remove signal translation and exec_domain sh: Remove signal translation and exec_domain s390: Remove signal translation and exec_domain mn10300: Remove signal translation and exec_domain microblaze: Remove signal translation and exec_domain m68k: Remove signal translation and exec_domain m32r: Remove signal translation and exec_domain m32r: Autogenerate offsets in struct thread_info frv: Remove signal translation and exec_domain ...
2015-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Add BQL support to via-rhine, from Tino Reichardt. 2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers can support hw switch offloading. From Floria Fainelli. 3) Allow 'ip address' commands to initiate multicast group join/leave, from Madhu Challa. 4) Many ipv4 FIB lookup optimizations from Alexander Duyck. 5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel Borkmann. 6) Remove the ugly compat support in ARP for ugly layers like ax25, rose, etc. And use this to clean up the neigh layer, then use it to implement MPLS support. All from Eric Biederman. 7) Support L3 forwarding offloading in switches, from Scott Feldman. 8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed up route lookups even further. From Alexander Duyck. 9) Many improvements and bug fixes to the rhashtable implementation, from Herbert Xu and Thomas Graf. In particular, in the case where an rhashtable user bulk adds a large number of items into an empty table, we expand the table much more sanely. 10) Don't make the tcp_metrics hash table per-namespace, from Eric Biederman. 11) Extend EBPF to access SKB fields, from Alexei Starovoitov. 12) Split out new connection request sockets so that they can be established in the main hash table. Much less false sharing since hash lookups go direct to the request sockets instead of having to go first to the listener then to the request socks hashed underneath. From Eric Dumazet. 13) Add async I/O support for crytpo AF_ALG sockets, from Tadeusz Struk. 14) Support stable privacy address generation for RFC7217 in IPV6. From Hannes Frederic Sowa. 15) Hash network namespace into IP frag IDs, also from Hannes Frederic Sowa. 16) Convert PTP get/set methods to use 64-bit time, from Richard Cochran. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits) fm10k: Bump driver version to 0.15.2 fm10k: corrected VF multicast update fm10k: mbx_update_max_size does not drop all oversized messages fm10k: reset head instead of calling update_max_size fm10k: renamed mbx_tx_dropped to mbx_tx_oversized fm10k: update xcast mode before synchronizing multicast addresses fm10k: start service timer on probe fm10k: fix function header comment fm10k: comment next_vf_mbx flow fm10k: don't handle mailbox events in iov_event path and always process mailbox fm10k: use separate workqueue for fm10k driver fm10k: Set PF queues to unlimited bandwidth during virtualization fm10k: expose tx_timeout_count as an ethtool stat fm10k: only increment tx_timeout_count in Tx hang path fm10k: remove extraneous "Reset interface" message fm10k: separate PF only stats so that VF does not display them fm10k: use hw->mac.max_queues for stats fm10k: only show actual queues, not the maximum in hardware fm10k: allow creation of VLAN on default vid fm10k: fix unused warnings ...
2015-04-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...
2015-04-14mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZEKees Cook
The arch_randomize_brk() function is used on several architectures, even those that don't support ET_DYN ASLR. To avoid bulky extern/#define tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for the architectures that support it, while still handling CONFIG_COMPAT_BRK. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Russell King <linux@arm.linux.org.uk> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "David A. Long" <dave.long@linaro.org> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Arun Chandran <achandran@mvista.com> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Min-Hua Chen <orca.chen@gmail.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Vineeth Vijayan <vvijayan@mvista.com> Cc: Jeff Bailey <jeffbailey@google.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Behan Webster <behanw@converseincode.com> Cc: Ismael Ripoll <iripoll@upv.es> Cc: Jan-Simon Mller <dl9pf@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull NOHZ changes from Ingo Molnar: "This tree adds full dynticks support to KVM guests (support the disabling of the timer tick on the guest). The main missing piece was the recognition of guest execution as RCU extended quiescent state and related changes" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest context_tracking: Export context_tracking_user_enter/exit context_tracking: Run vtime_user_enter/exit only when state == CONTEXT_USER context_tracking: Add stub context_tracking_is_enabled context_tracking: Generalize context tracking APIs to support user and guest context_tracking: Rename context symbols to prepare for transition state ppc: Remove unused cpp symbols in kvm headers
2015-04-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "Usual trivial tree updates. Nothing outstanding -- mostly printk() and comment fixes and unused identifier removals" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: goldfish: goldfish_tty_probe() is not using 'i' any more powerpc: Fix comment in smu.h qla2xxx: Fix printks in ql_log message lib: correct link to the original source for div64_u64 si2168, tda10071, m88ds3103: Fix firmware wording usb: storage: Fix printk in isd200_log_config() qla2xxx: Fix printk in qla25xx_setup_mode init/main: fix reset_device comment ipwireless: missing assignment goldfish: remove unreachable line of code coredump: Fix do_coredump() comment stacktrace.h: remove duplicate declaration task_struct smpboot.h: Remove unused function prototype treewide: Fix typo in printk messages treewide: Fix typo in printk messages mod_devicetable: fix comment for match_flags
2015-04-13Merge branch 'next-sriov' of ↵Michael Ellerman
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next Merge Richard's work to support SR-IOV on PowerNV. All generic PCI patches acked by Bjorn. Some minor conflicts with Daniel's pci_controller_ops work. Conflicts: arch/powerpc/include/asm/machdep.h arch/powerpc/platforms/powernv/pci-ioda.c
2015-04-13Merge branch 'next-dlpar' of ↵Michael Ellerman
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next Merge series from Nathan Fontenot to do memory hotplug in the kernel.
2015-04-12arch: Remove exec_domain from remaining archsRichard Weinberger
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-11powerpc: Add ppc64 hard lockup detector supportAnton Blanchard
The hard lockup detector uses a PMU event as a periodic NMI to detect if we are stuck (where stuck means no timer interrupts have occurred). Ben's rework of the ppc64 soft disable code has made ppc64 PMU exceptions a partial NMI. They can get disabled if an external interrupt comes in, but otherwise PMU interrupts will fire in interrupt disabled regions. We disable the hard lockup detector by default for a few reasons: - It breaks userspace event based branches on POWER8. - It is likely to produce false positives on KVM guests. - Since PMCs can only count to 2^31, counting cycles means we might take multiple PMU exceptions per second per hardware thread even if our hard lockup timeout is 10 seconds. It can be enabled via a boot option, or via procfs. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc/powernv: Add interfaces for flash device accessCyril Bur
This change adds the OPAL interface definitions to allow Linux to read, write and erase from system flash devices. We register platform devices for the flash devices exported by firmware. We clash with the existing opal_flash_init function, which is really for the FSP flash update functionality, so we rename that initcall to opal_flash_update_init(). A future change will add an mtd driver that uses this interface. Changes from Joel Stanley and Jeremy Kerr. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc/tm: Abort syscalls in active transactionsSam bobroff
This patch changes the syscall handler to doom (tabort) active transactions when a syscall is made and return immediately without performing the syscall. Currently, the system call instruction automatically suspends an active transaction which causes side effects to persist when an active transaction fails. This does change the kernel's behaviour, but in a way that was documented as unsupported. It doesn't reduce functionality because syscalls will still be performed after tsuspend. It also provides a consistent interface and makes the behaviour of user code substantially the same across powerpc and platforms that do not support suspended transactions (e.g. x86 and s390). Performance measurements using http://ozlabs.org/~anton/junkcode/null_syscall.c indicate the cost of a system call increases by about 0.5%. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-By: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Remove shims for pci_controller_ops operationsDaniel Axtens
Remove shims, patch callsites to use pci_controller_ops versions instead. Also move back the probe mode defines, as explained in the patch for pci_probe_mode. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: dart_iommu: optionally populate controller_ops on initDaniel Axtens
If a pci_controller_ops struct is provided to iommu_init_early_dart, populate that with the DMA setup ops, rather than ppc_md. If NULL is provided, populate ppc_md as before. This also patches the call sites for Maple and Power Mac to pass NULL, so existing behaviour is preserved. The benefit of making this optional is that it means we don't have to change dart, Maple and Power Mac over to the controller_ops system in one fell swoop. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.reset_secondary_bus and shimDaniel Axtens
Add pci_controller_ops.reset_secondary_bus, shadowing ppc_md.pcibios_reset_secondary_bus. Add a shim, and changes the callsites to use the shim. Use pcibios_reset_secondary_bus_shim, as both pcibios_reset_secondary_bus and pci_reset_secondary_bus are already taken. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.window_alignment and shimDaniel Axtens
Add pci_controller_ops.window_alignment, shadowing ppc_md.pcibios_window_alignment. Add a shim, and changes the callsites to use the shim. Here, we use pci_window_alignment, as pcibios_window_alignment is already taken. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.enable_device_hook and shimDaniel Axtens
Add pci_controller_ops.enable_device_hook, shadowing ppc_md.pcibios_enable_device_hook. Add a shim, and changes the callsites to use the shim. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.probe_mode and shimDaniel Axtens
Add pci_controller_ops.probe_mode, shadowing ppc_md.pci_probe_mode. Add a shim, and changes the callsites to use the shim. We also need to move the probe mode defines to pci-bridge.h from pci.h. They are required by the shim in order to return a sensible default. Previously, the were defined in pci.h, but pci.h includes pci-bridge.h before the relevant #defines. This means the definitions are absent if pci.h is included before pci-bridge.h. This occurs in some drivers. So, move the definitons now, and move them back when we remove the shim. Anything that wants the defines would have had to include pci.h, and since pci.h includes pci-bridge.h, nothing will lose access to the defines. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.dma_bus_setup and shimDaniel Axtens
Add pci_controller_ops.dma_bus_setup, shadowing ppc_md.pci_dma_bus_setup. Add a shim, and changes the callsites to use the shim. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: Create pci_controller_ops.dma_dev_setup and shimDaniel Axtens
Introduces the pci_controller_ops structure. Add pci_controller_ops.dma_dev_setup, shadowing ppc_md.pci_dma_dev_setup. Add a shim, and change the callsites to use the shim. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: pcibios_enable_device_hook: return bool rather than intDaniel Axtens
pcibios_enable_device_hook returned an int. Every implementation returned either -EINVAL or 0. The return value wasn't propagated by the caller: any non-zero return value caused pcibios_enable_device to return -EINVAL itself. Therefore, make the hook return a bool. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc: move find_and_init_phbs() to pSeries specific codeDaniel Axtens
Previously, find_and_init_phbs() was used in both PowerNV and pSeries setup. However, since RTAS support has been dropped from PowerNV, we can move it into a platform-specific file. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-10powerpc: Drop return value of smp_ops->probe()Michael Ellerman
smp_ops->probe() is currently supposed to return the number of cpus in the system. The last actual usage of the value was removed in May 2007 in e147ec8f1808 "[POWERPC] Simplify smp_space_timers". We still passed the value around until June 2010 when even that was finally removed in c1aa687d499a "powerpc: Clean up obsolete code relating to decrementer and timebase". So drop that requirement, probe() now returns void, and update all implementations. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-10powerpc: Replace mem_init_done with slab_is_available()Michael Ellerman
We have a powerpc specific global called mem_init_done which is "set on boot once kmalloc can be called". But that's not *quite* true. We set it at the bottom of mem_init(), and rely on the fact that mm_init() calls kmem_cache_init() immediately after that, and nothing is running in parallel. So replace it with the generic and 100% correct slab_is_available(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-10powerpc: Fix compile errors with STRICT_MM_TYPECHECKS enabledMichael Ellerman
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Fix the 32-bit code also] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-07powerpc: Remove the celleb supportMichael Ellerman
The celleb code has seen no actual development for ~7 years. We (maintainers) have no access to test hardware, and it is highly likely the code has bit-rotted. As far as we're aware the hardware was never widely available, and is certainly no longer available, and no one on the list has shown any interest in it over the years. So remove it. If anyone has one and cares please speak up. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Jeremy Kerr <jk@ozlabs.org>
2015-04-07Merge branch 'next-remove-ldst' of ↵Michael Ellerman
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next