summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm
AgeCommit message (Collapse)Author
2013-07-24powerpc/eeh: Keep PE during hotplugGavin Shan
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be kept around. However, we need to keep it if the hotplug an artifial one caused by EEH errors recovery. Since we remove EEH device through the PCI hook pcibios_release_device(), the flag "purge_pe" passed to various functions is meaningless. So the patch removes the meaningless flag and introduce new flag "EEH_PE_KEEP" to save the PE while doing hotplug during EEH error recovery. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Export functions for hotplugGavin Shan
Make some functions public in order to support hotplug on either specific PCI bus or PCI device in future. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc: Access local paca after hard irq disabledTiejun Chen
In hard_irq_disable(), we accessed the PACA before we hard disabled the interrupts, potentially causing a warning as get_paca() will us debug_smp_processor_id(). Move that to after the disabling, and also use local_paca directly rather than get_paca() to avoid several redundant and useless checks. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/modules: Module CRC relocation fix causes perf issuesAnton Blanchard
Module CRCs are implemented as absolute symbols that get resolved by a linker script. We build an intermediate .o that contains an unresolved symbol for each CRC. genksysms parses this .o, calculates the CRCs and writes a linker script that "resolves" the symbols to the calculated CRC. Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols that need relocating and relocates them at boot. Commit d4703aef (module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y) added a hook to reverse the bogus relocations. Part of this patch created a symbol at 0x0: # head -2 /proc/kallsyms 0000000000000000 T reloc_start c000000000000000 T .__start This reloc_start symbol is causing lots of confusion to perf. It thinks reloc_start is a massive function that stretches from 0x0 to 0xc000000000000000 and we get various cryptic errors out of perf, including: problem incrementing symbol count, skipping event This patch removes the reloc_start linker script label and instead defines it as PHYSICAL_START. We also need to wrap it with CONFIG_PPC64 because the ppc32 kernel can set a non zero PHYSICAL_START at compile time and we wouldn't want to subtract it from the CRCs in that case. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc: Add second POWER8 PVR entryMichael Neuling
POWER8 comes with two different PVRs. This patch enables the additional PVR in the cputable. The existing entry (PVR=0x4b) is renamed to POWER8E and the new entry (PVR=0x4d) is given POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-04Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull device tree updates from Grant Likely: "This branch contains the following changes: - Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF - Remove #ifdef from linux/of_platform.h to increase compiler syntax coverage - Bug fix for address decoding on Bimini and js2x powerpc platforms. - miscellaneous binding changes One note on the above. The binding changes going in from all kinds of different trees has gotten rather out of hand. I picked up some during this cycle, but even going though my tree isn't a great fit. Ian Campbell has prototyped splitting the bindings and .dtb files into a separate repository. The plan is to migrate to using that sometime in the next few kernel releases which should get rid of a lot of the churn on binding docs and .dts files" * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: of: Fix address decoding on Bimini and js2x machines of: remove CONFIG_OF_DEVICE usb: chipidea: depend on CONFIG_OF instead of CONFIG_OF_DEVICE of: remove of_platform_driver ibmebus: convert of_platform_driver to platform_driver driver core: move to_platform_driver to platform_device.h mfd: DT bindings for the palmas family MFD ARM: dts: omap3-devkit8000: fix NAND memory binding of/base: fix typos of: remove #ifdef from linux/of_platform.h
2013-07-04Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "This is the powerpc changes for the 3.11 merge window. In addition to the usual bug fixes and small updates, the main highlights are: - Support for transparent huge pages by Aneesh Kumar for 64-bit server processors. This allows the use of 16M pages as transparent huge pages on kernels compiled with a 64K base page size. - Base VFIO support for KVM on power by Alexey Kardashevskiy - Wiring up of our nvram to the pstore infrastructure, including putting compressed oopses in there by Aruna Balakrishnaiah - Move, rework and improve our "EEH" (basically PCI error handling and recovery) infrastructure. It is no longer specific to pseries but is now usable by the new "powernv" platform as well (no hypervisor) by Gavin Shan. - I fixed some bugs in our math-emu instruction decoding and made it usable to emulate some optional FP instructions on processors with hard FP that lack them (such as fsqrt on Freescale embedded processors). - Support for Power8 "Event Based Branch" facility by Michael Ellerman. This facility allows what is basically "userspace interrupts" for performance monitor events. - A bunch of Transactional Memory vs. Signals bug fixes and HW breakpoint/watchpoint fixes by Michael Neuling. And more ... I appologize in advance if I've failed to highlight something that somebody deemed worth it." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) pstore: Add hsize argument in write_buf call of pstore_ftrace_call powerpc/fsl: add MPIC timer wakeup support powerpc/mpic: create mpic subsystem object powerpc/mpic: add global timer support powerpc/mpic: add irq_set_wake support powerpc/85xx: enable coreint for all the 64bit boards powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig powerpc/mpic: Add get_version API both for internal and external use powerpc: Handle both new style and old style reserve maps powerpc/hw_brk: Fix off by one error when validating DAWR region end powerpc/pseries: Support compression of oops text via pstore powerpc/pseries: Re-organise the oops compression code pstore: Pass header size in the pstore write callback powerpc/powernv: Fix iommu initialization again powerpc/pseries: Inform the hypervisor we are using EBB regs powerpc/perf: Add power8 EBB support powerpc/perf: Core EBB support for 64-bit book3s powerpc/perf: Drop MMCRA from thread_struct powerpc/perf: Don't enable if we have zero events ...
2013-07-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "On the x86 side, there are some optimizations and documentation updates. The big ARM/KVM change for 3.11, support for AArch64, will come through Catalin Marinas's tree. s390 and PPC have misc cleanups and bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits) KVM: PPC: Ignore PIR writes KVM: PPC: Book3S PR: Invalidate SLB entries properly KVM: PPC: Book3S PR: Allow guest to use 1TB segments KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry KVM: PPC: Book3S PR: Fix proto-VSID calculations KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL KVM: Fix RTC interrupt coalescing tracking kvm: Add a tracepoint write_tsc_offset KVM: MMU: Inform users of mmio generation wraparound KVM: MMU: document fast invalidate all mmio sptes KVM: MMU: document fast invalidate all pages KVM: MMU: document fast page fault KVM: MMU: document mmio page fault KVM: MMU: document write_flooding_count KVM: MMU: document clear_spte_count KVM: MMU: drop kvm_mmu_zap_mmio_sptes KVM: MMU: init kvm generation close to mmio wrap-around value KVM: MMU: add tracepoint for check_mmio_spte KVM: MMU: fast invalidate all mmio sptes ...
2013-07-02Merge branch 'sched-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull voluntary preemption fixes from Ingo Molnar: "This tree contains a speedup which is achieved through better might_sleep()/might_fault() preemption point annotations for uaccess functions, by Michael S Tsirkin: 1. The only reason uaccess routines might sleep is if they fault. Make this explicit for all architectures. 2. A voluntary preemption point in uaccess functions means compiler can't inline them efficiently, this breaks assumptions that they are very fast and small that e.g. net code seems to make. Remove this preemption point so behaviour matches with what callers assume. 3. Accesses (e.g through socket ops) to kernel memory with KERNEL_DS like net/sunrpc does will never sleep. Remove an unconditinal might_sleep() in the might_fault() inline in kernel.h (used when PROVE_LOCKING is not set). 4. Accesses with pagefault_disable() return EFAULT but won't cause caller to sleep. Check for that and thus avoid might_sleep() when PROVE_LOCKING is set. These changes offer a nice speedup for CONFIG_PREEMPT_VOLUNTARY=y kernels, here's a network bandwidth measurement between a virtual machine and the host: before: incoming: 7122.77 Mb/s outgoing: 8480.37 Mb/s after: incoming: 8619.24 Mb/s [ +21.0% ] outgoing: 9455.42 Mb/s [ +11.5% ] I kept these changes in a separate tree, separate from scheduler changes, because it's a mixed MM and scheduler topic" * 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm, sched: Allow uaccess in atomic with pagefault_disable() mm, sched: Drop voluntary schedule from might_fault() x86: uaccess s/might_sleep/might_fault/ tile: uaccess s/might_sleep/might_fault/ powerpc: uaccess s/might_sleep/might_fault/ mn10300: uaccess s/might_sleep/might_fault/ microblaze: uaccess s/might_sleep/might_fault/ m32r: uaccess s/might_sleep/might_fault/ frv: uaccess s/might_sleep/might_fault/ arm64: uaccess s/might_sleep/might_fault/ asm-generic: uaccess s/might_sleep/might_fault/
2013-07-02Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes: - load-calculation cleanups and improvements, by Alex Shi - various nohz related tidying up of statisics, by Frederic Weisbecker - factor out /proc functions to kernel/sched/proc.c, by Paul Gortmaker - simplify the RT policy scheduler, by Kirill Tkhai - various fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask sched/debug: Fix formatting of /proc/<PID>/sched sched: Fix typo in struct sched_avg member description sched/fair: Fix typo describing flags in enqueue_entity sched/debug: Add load-tracking statistics to task sched: Change get_rq_runnable_load() to static and inline sched/tg: Remove tg.load_weight sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t sched/tg: Use 'unsigned long' for load variable in task group sched: Change cfs_rq load avg to unsigned long sched: Consider runnable load average in move_tasks() sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task sched: Update cpu load after task_tick sched: Fix sleep time double accounting in enqueue entity sched: Set an initial value of runnable avg for new forked task sched: Move a few runnable tg variables into CONFIG_SMP Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking" sched: Don't mix use of typedef ctl_table and struct ctl_table sched: Remove WARN_ON(!sd) from init_sched_groups_power() sched: Fix memory leakage in build_sched_groups() ...
2013-07-02Merge branch 'core-mutexes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull WW mutex support from Ingo Molnar: "This tree adds support for wound/wait style locks, which the graphics guys would like to make use of in the TTM graphics subsystem. Wound/wait mutexes are used when other multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: The older tasks waits until it can acquire the contended lock. The younger tasks needs to back off and drop all the locks it is currently holding, ie the younger task is wounded. See this LWN.net description of W/W mutexes: https://lwn.net/Articles/548909/ The comments there outline specific usecases for this facility (which have already been implemented for the DRM tree). Also see Documentation/ww-mutex-design.txt for more details" * 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking-selftests: Handle unexpected failures more strictly mutex: Add more w/w tests to test EDEADLK path handling mutex: Add more tests to lib/locking-selftest.c mutex: Add w/w tests to lib/locking-selftest.c mutex: Add w/w mutex slowpath debugging mutex: Add support for wound/wait style locks arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
2013-07-02Merge tag 'tty-3.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big TTY / Serial driver merge for 3.11-rc1. It's not all that big, nothing major changed in the tty api, which is a nice change, just a number of serial driver fixes and updates and new drivers, along with some n_tty fixes to help resolve some reported issues. All of these have been in the linux-next releases for a while, with the exception of the last revert patch, which was reported this past weekend by two different people as being needed." * tag 'tty-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (51 commits) Revert "serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller" pch_uart: Add uart_clk selection for the MinnowBoard tty: atmel_serial: prepare clk before calling enable tty: Reset itty for other pty n_tty: Buffer work should not reschedule itself n_tty: Fix unsafe update of available buffer space n_tty: Untangle read completion variables n_tty: Encapsulate minimum_to_wake within N_TTY serial: omap: Fix device tree based PM runtime serial: imx: Fix serial clock unbalance serial/mpc52xx_uart: fix kernel panic when system reboot serial: mfd: Add sysrq support serial: imx: enable the clocks for console tty: serial: add Freescale lpuart driver support serial: imx: Improve Kconfig text serial: imx: Allow module build serial: imx: Fix warning when !CONFIG_SERIAL_IMX_CONSOLE tty/serial/sirf: fix error propagation in sirfsoc_uart_probe() serial: omap: fix potential NULL pointer dereference in serial_omap_runtime_suspend() tty: serial: Enable uartlite for ARM zynq ...
2013-07-02Merge remote-tracking branch 'agust/next' into nextBenjamin Herrenschmidt
From Anatolij: "There are small cleanups and fixes for mpc512x common code, mpc512x_defconfig updates and soft reboot support for mpc5125 based boards."
2013-07-02Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt
Merge Freescale updates
2013-07-01powerpc/mpic: create mpic subsystem objectDongsheng.wang@freescale.com
Register a mpic subsystem at /sys/devices/system/ Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01powerpc/mpic: add global timer supportDongsheng.wang@freescale.com
The MPIC global timer is a hardware timer inside the Freescale PIC complying with OpenPIC standard. When the specified interval times out, the hardware timer generates an interrupt. The driver currently is only tested on fsl chip, but it can potentially support other global timers complying to OpenPIC standard. The two independent groups of global timer on fsl chip, group A and group B, are identical in their functionality, except that they appear at different locations within the PIC register map. The hardware timer can be cascaded to create timers larger than the default 31-bit global timers. Timer cascade fields allow configuration of up to two 63-bit timers. But These two groups of timers cannot be cascaded together. It can be used as a wakeup source for low power modes. It also could be used as periodical timer for protocols, drivers and etc. Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01powerpc/mpic: Add get_version API both for internal and external useHongtao Jia
MPIC version is useful information for both mpic_alloc() and mpic_init(). The patch provide an API to get MPIC version for reusing the code. Also, some other IP block may need MPIC version for their own use. The API for external use is also provided. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01Merge tag 'v3.10' into sched/coreIngo Molnar
Merge in a recent upstream commit: c2853c8df57f include/linux/math64.h: add div64_ul() because: 72a4cf20cb71 sched: Change cfs_rq load avg to unsigned long relies on it. [ We don't rebase sched/core for this, because the handful of followup commits after the broken commit are not behavioral changes so are unlikely to be needed during bisection. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-01Merge tag 'v3.10' into nextBenjamin Herrenschmidt
Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.
2013-07-01powerpc/pseries: Inform the hypervisor we are using EBB regsMichael Ellerman
On LPAR systems we need to inform the hypervisor that we are using the EBB registers. We do this by setting a bit in the Virtual Processor Area (VPA) - formerly known as the lppaca. For now we do this always, ie. we do not dynamically enable/disable. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc/perf: Core EBB support for 64-bit book3sMichael Ellerman
Add support for EBB (Event Based Branches) on 64-bit book3s. See the included documentation for more details. EBBs are a feature which allows the hardware to branch directly to a specified user space address when a PMU event overflows. This can be used by programs for self-monitoring with no kernel involvement in the inner loop. Most of the logic is in the generic book3s code, primarily to avoid a proliferation of PMU callbacks. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc/perf: Drop MMCRA from thread_structMichael Ellerman
In commit 59affcd "Context switch more PMU related SPRs" I added more PMU SPRs to thread_struct, later modified in commit b11ae95. To add insult to injury it turns out we don't need to switch MMCRA as it's only user readable, and the value is recomputed by the PMU code. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc/perf: Freeze PMC5/6 if we're not using themMichael Ellerman
On Power8 we can freeze PMC5 and 6 if we're not using them. Normally they run all the time. As noticed by Anshuman, we should unfreeze them when we disable the PMU as there are legacy tools which expect them to run all the time. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc: Remove KVMTEST from RELON exception handlersMichael Ellerman
KVMTEST is a macro which checks whether we are taking an exception from guest context, if so we branch out of line and eventually call into the KVM code to handle the switch. When running real guests on bare metal (HV KVM) the hardware ensures that we never take a relocation on exception when transitioning from guest to host. For PR KVM we disable relocation on exceptions ourself in kvmppc_core_init_vm(), as of commit a413f47 "Disable relocation on exceptions whenever PR KVM is active". So convert all the RELON macros to use NOTEST, and drop the remaining KVM_HANDLER() definitions we have for 0xe40 and 0xe80. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc: Delete __cpuinit usage from all usersPaul Gortmaker
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the powerpc uses of the __cpuinit macros. There are no __CPUINIT users in assembly files in powerpc. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc/iommu: Remove unused pci_iommu_init() and pci_direct_iommu_init()Bjorn Helgaas
pci_iommu_init() and pci_direct_iommu_init() are not referenced anywhere, so remove them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01powerpc/eeh: Avoid build warningsGavin Shan
The patch is for avoiding following build warnings: The function .pnv_pci_ioda_fixup() references the function __init .eeh_init(). This is often because .pnv_pci_ioda_fixup lacks a __init The function .pnv_pci_ioda_fixup() references the function __init .eeh_addr_cache_build(). This is often because .pnv_pci_ioda_fixup lacks a __init Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30KVM: PPC: Book3S PR: Allow guest to use 1TB segmentsPaul Mackerras
With this, the guest can use 1TB segments as well as 256MB segments. Since we now have the situation where a single emulated guest segment could correspond to multiple shadow segments (as the shadow segments are still 256MB segments), this adds a new kvmppc_mmu_flush_segment() to scan for all shadow segments that need to be removed. This restructures the guest HPT (hashed page table) lookup code to use the correct hashing and matching functions for HPTEs within a 1TB segment. We use the standard hpt_hash() function instead of open-coding the hash calculation, and we use HPTE_V_COMPARE() with an AVPN value that has the B (segment size) field included. The calculation of avpn is done a little earlier since it doesn't change in the loop starting at the do_second label. The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that it returns a 256MB VSID even if the guest SLB entry is a 1TB entry. This is because the users of this function are creating 256MB SLB entries. We set a new VSID_1T flag so that entries created from 1T segments don't collide with entries from 256MB segments. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-29consolidate io_remap_pfn_range definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-26arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or notMaarten Lankhorst
This will allow me to call functions that have multiple arguments if fastpath fails. This is required to support ticket mutexes, because they need to be able to pass an extra argument to the fail function. Originally I duplicated the functions, by adding __mutex_fastpath_lock_retval_arg. This ended up being just a duplication of the existing function, so a way to test if fastpath was called ended up being better. This also cleaned up the reservation mutex patch some by being able to call an atomic_set instead of atomic_xchg, and making it easier to detect if the wrong unlock function was previously used. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: robclark@gmail.com Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-25powerpc/eeh: Remove eeh_mutexGavin Shan
Originally, eeh_mutex was introduced to protect the PE hierarchy tree and the attached EEH devices because EEH core was possiblly running with multiple threads to access the PE hierarchy tree. However, we now have only one kthread in EEH core. So we needn't the eeh_mutex and just remove it. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25powerpc/mm: Fix build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabledNathan Fontenot
Building with CONFIG_TRANSPARENT_HUGEPAGE disabled causes the following build wearnings; powerpc/arch/powerpc/include/asm/mmu-hash64.h: In function ‘__hash_page_thp’: powerpc/arch/powerpc/include/asm/mmu-hash64.h:354: warning: no return statement in function returning non-void This patch adds a return -1 to the static inline for __hash_page_thp() to correct the warnings. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Optimize hugepage invalidateAneesh Kumar K.V
Hugepage invalidate involves invalidating multiple hpte entries. Optimize the operation using H_BULK_REMOVE on lpar platforms. On native, reduce the number of tlb flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Enable THP on PPC64Aneesh Kumar K.V
We enable only if the we support 16MB page size. Reviewed-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Add code to handle HPTE faults for hugepagesAneesh Kumar K.V
The deposted PTE page in the second half of the PMD table is used to track the state on hash PTEs. After updating the HPTE, we mark the coresponding slot in the deposted PTE page valid. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/kvm: Handle transparent hugepage in KVMAneesh Kumar K.V
We can find pte that are splitting while walking page tables. Return None pte in that case. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Replace find_linux_pte with find_linux_pte_or_hugepteAneesh Kumar K.V
Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly document why we don't need to handle transparent hugepages at callsites. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: move find_linux_pte_or_hugepte and gup_hugepte to common codeAneesh Kumar K.V
We will use this in the later patch for handling THP pages Reviewed-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Implement transparent hugepages for ppc64Aneesh Kumar K.V
We now have pmd entries covering 16MB range and the PMD table double its original size. We use the second half of the PMD table to deposit the pgtable (PTE page). The depoisted PTE page is further used to track the HPTE information. The information include [ secondary group | 3 bit hidx | valid ]. We use one byte per each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and with 4K HPTE we need 4096 entries. Both will fit in a 4K PTE page. On hugepage invalidate we need to walk the PTE page and invalidate all valid HPTEs. This patch implements necessary arch specific functions for THP support and also hugepage invalidate logic. These PMD related functions are intentionally kept similar to their PTE counter-part. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Double the PMD table size for THPAneesh Kumar K.V
THP code does PTE page allocation along with large page request and deposit them for later use. This is to ensure that we won't have any failures when we split hugepages to regular pages. On powerpc we want to use the deposited PTE page for storing hash pte slot and secondary bit information for the HPTEs. We use the second half of the pmd table to save the deposted PTE page. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/mm: handle hugepage size correctly when invalidating hpte entriesAneesh Kumar K.V
If a hash bucket gets full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". This implies that when we are invalidating or updating pte, even if HPTE entry is not valid we should do a tlb invalidate. With hugepages, we need to pass the correct actual page size value for tlb invalidation. This change update the patch 0608d692463598c1d6e826d9dd7283381b4f246c "powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle transparent hugepages correctly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powernv/opal: Notifier for OPAL eventsGavin Shan
This patch implements a notifier to receive a notification on OPAL event mask changes. The notifier is only called as a result of an OPAL interrupt, which will happen upon reception of FSP messages or PCI errors. Any event mask change detected as a result of opal_poll_events() will not result in a notifier call. [benh: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Sync OPAL API with firmwareGavin Shan
The patch synchronizes OPAL APIs between kernel and firmware. Also, we starts to replace opal_pci_get_phb_diag_data() with the similar opal_pci_get_phb_diag_data2() and the former OPAL API would return OPAL_UNSUPPORTED from now on. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: EEH core to handle special eventGavin Shan
On PowerNV platform, the EEH event caused by interrupt won't have binding PE. The patch enables EEH core to handle the special event. To avoid the current logic we have, The eeh_handle_event() is renamed to eeh_handle_normal_event(), and the eeh_handle_special_event() is introduced. The function eeh_handle_event() dispatches to above two functions according to the input parameter. Besides, new backend "next_error" added to eeh_ops and it's expected to have following return values: 4 - Dead IOC 3 - Dead PHB 2 - Fenced PHB 1 - Frozen PE 0 - No error found Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Export confirm_error_lockGavin Shan
An EEH event is created and queued to the event queue for each ingress EEH error. When there're mutiple EEH errors, we need serialize the process to keep consistent PE state (flags). The spinlock "confirm_error_lock" was introduced for the purpose. We'll inject EEH event upon error reporting interrupts on PowerNV platform. So we export the spinlock for that to use for consistent PE state. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Allow to purge EEH eventsGavin Shan
On PowerNV platform, we might run into the situation where subsequent events are duplicated events of former one, which is being processed. For the case, we need the function implemented by the patch to purge EEH events accordingly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Trace time on first error for PEGavin Shan
We're not expecting that one specific PE got frozen for over 5 times in last hour. Otherwise, the PE will be removed from the system upon newly coming EEH errors. The patch introduces time stamp to trace the first error on specific PE in last hour and function to update that accordingly. Besides, the time stamp is recovered during PE hotplug path as we did for frozen count. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Single kthread to handle eventsGavin Shan
We possiblly have multiple kthreads running for multiple EEH errors (events) and use one spinlock to make the process of handling those EEH events serialized. That's unnecessary and the patch creates only one kthread, which is started during EEH core initialization time in eeh_init(). A new semaphore introduced to count the number of existing EEH events in the queue and the kthread waiting on the semaphore. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: EEH post initialization operationGavin Shan
The patch adds new EEH operation post_init. It's used to notify the platform that EEH core has completed the EEH probe. By that, PowerNV platform starts to use the services supplied by EEH functionality. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Make eeh_init() publicGavin Shan
For EEH on PowerNV platform, we will do EEH probe based on the real PCI devices. The PCI devices are available after PCI probe. So we have to call eeh_init() explicitly on PowerNV platform after PCI probe. The patch also does EEH probe for PowerNV platform in eeh_init(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>