summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)Author
2013-05-31powerpc/cputable: Fix typo on P7+ cputable entryWill Schmidt
Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2 cpu specs table. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/pci: Remove the unused variables in pci_process_bridge_OF_rangesKevin Hao
The codes which ever used these two variables have gone. Throw away them too. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/pci: Remove the stale comments of pci_process_bridge_OF_rangesKevin Hao
These comments already don't apply to the current code. So just remove them. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/32bit:Store temporary result in r0 instead of r8Priyanka Jain
Commit a9c4e541ea9b22944da356f2a9258b4eddcc953b "powerpc/kprobe: Complete kprobe and migrate exception frame" introduced a regression: While returning from exception handling in case of PREEMPT enabled, _TIF_NEED_RESCHED bit is checked in TI_FLAGS (thread_info flag) of current task. Only if this bit is set, it should continue with the process of calling preempt_schedule_irq() to schedule highest priority task if available. Current code assumes that r8 contains TI_FLAGS and check this for _TIF_NEED_RESCHED, but as r8 is modified in the code which executes before this check, r8 no longer contains the expected TI_FLAGS information. As a result check for comparison with _TIF_NEED_RESCHED was failing even if NEED_RESCHED bit is set in the current thread_info flag. Due to this, preempt_schedule_irq() and in turn scheduler was not getting called even if highest priority task is ready for execution. So, store temporary results in r0 instead of r8 to prevent r8 from getting modified as subsequent code is dependent on its value. Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> CC: <stable@vger.kernel.org> [v3.7+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/pseries: Kill all prefetch streams on context switchMichael Neuling
On context switch, we should have no prefetch streams leak from one userspace process to another. This frees up prefetch resources for the next process. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/cputable: Fix oprofile_cpu_type on power8Nishanth Aravamudan
Maynard informed me that neither the oprofile kernel module nor oprofile userspace has been updated to support that "legacy" oprofile module interface for power8, which is indicated by "ppc64/power8." This results in no samples. The solution is to default to the "timer" type, instead. The raw entry also should be updated, as "ppc64/ibm-compat-v1" indicates to oprofile userspace to use "compatibility events" which are obsolete in ISA 2.07. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/tm: Fix userspace stack corruption on signal delivery for active ↵Michael Neuling
transactions When in an active transaction that takes a signal, we need to be careful with the stack. It's possible that the stack has moved back up after the tbegin. The obvious case here is when the tbegin is called inside a function that returns before a tend. In this case, the stack is part of the checkpointed transactional memory state. If we write over this non transactionally or in suspend, we are in trouble because if we get a tm abort, the program counter and stack pointer will be back at the tbegin but our in memory stack won't be valid anymore. To avoid this, when taking a signal in an active transaction, we need to use the stack pointer from the checkpointed state, rather than the speculated state. This ensures that the signal context (written tm suspended) will be written below the stack required for the rollback. The transaction is aborted becuase of the treclaim, so any memory written between the tbegin and the signal will be rolled back anyway. For signals taken in non-TM or suspended mode, we use the normal/non-checkpointed stack pointer. Tested with 64 and 32 bit signals Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-31powerpc/tm: Abort on emulation and alignment faultsMichael Neuling
If we are emulating an instruction inside an active user transaction that touches memory, the kernel can't emulate it as it operates in transactional suspend context. We need to abort these transactions and send them back to userspace for the hardware to rollback. We can service these if the user transaction is in suspend mode, since the kernel will operate in the same suspend context. This adds a check to all alignment faults and to specific instruction emulations (only string instructions for now). If the user process is in an active (non-suspended) transaction, we abort the transaction go back to userspace allowing the HW to roll back the transaction and tell the user of the failure. This also adds new tm abort cause codes to report the reason of the persistent error to the user. Crappy test case here http://neuling.org/devel/junkcode/aligntm.c Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc: Make radeon 32-bit MSI quirk work on powernvBenjamin Herrenschmidt
This moves the quirk itself to pci_64.c as to get built on all ppc64 platforms (the only ones with a pci_dn), factors the two implementations of get_pdn() into a single pci_get_dn() and use the quirk to do 32-bit MSIs on IODA based powernv platforms. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc: Context switch more PMU related SPRsMichael Ellerman
In commit 9353374 "Context switch the new EBB SPRs" we added support for context switching some new EBB SPRs. However despite four of us signing off on that patch we missed some. To be fair these are not actually new SPRs, but they are now potentially user accessible so need to be context switched. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc/pci: Fix bogus message at boot about empty memory resourcesBenjamin Herrenschmidt
The message is only meant to be displayed if resource 0 is empty, but was displayed if any is. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc: Fix TLB cleanup at boot on POWER8Benjamin Herrenschmidt
The TLB has 512 congruence classes (2048 entries 4 way set associative) while P7 had 128 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "This is mostly bug fixes (some of them regressions, some of them I deemed worth merging now) along with some patches from Li Zhong hooking up the new context tracking stuff (for the new full NO_HZ)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Set show_unhandled_signals to 1 by default powerpc/perf: Fix setting of "to" addresses for BHRB powerpc/pmu: Fix order of interpreting BHRB target entries powerpc/perf: Move BHRB code into CONFIG_PPC64 region powerpc: select HAVE_CONTEXT_TRACKING for pSeries powerpc: Use the new schedule_user API on userspace preemption powerpc: Exit user context on notify resume powerpc: Exception hooks for context tracking subsystem powerpc: Syscall hooks for context tracking subsystem powerpc/booke64: Fix kernel hangs at kernel_dbg_exc powerpc: Fix irq_set_affinity() return values powerpc: Provide __bswapdi2 powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 powerpc/powernv: Detect OPAL v3 API version powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS powerpc: Bring all threads online prior to migration/hibernation powerpc/rtas_flash: Fix validate_flash buffer overflow issue powerpc/kexec: Fix kexec when using VMX optimised memcpy powerpc: Fix build errors STRICT_MM_TYPECHECKS ...
2013-05-14powerpc: Set show_unhandled_signals to 1 by defaultBenjamin Herrenschmidt
Just like other architectures Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Use the new schedule_user API on userspace preemptionLi Zhong
This patch corresponds to [PATCH] x86: Use the new schedule_user API on userspace preemption commit 0430499ce9d78691f3985962021b16bf8f8a8048 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Exit user context on notify resumeLi Zhong
This patch allows RCU usage in do_notify_resume, e.g. signal handling. It corresponds to [PATCH] x86: Exit RCU extended QS on notify resume commit edf55fda35c7dc7f2d9241c3abaddaf759b457c6 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Exception hooks for context tracking subsystemLi Zhong
This is the exception hooks for context tracking subsystem, including data access, program check, single step, instruction breakpoint, machine check, alignment, fp unavailable, altivec assist, unknown exception, whose handlers might use RCU. This patch corresponds to [PATCH] x86: Exception hooks for userspace RCU extended QS commit 6ba3c97a38803883c2eee489505796cb0a727122 But after the exception handling moved to generic code, and some changes in following two commits: 56dd9470d7c8734f055da2a6bac553caf4a468eb context_tracking: Move exception handling to generic code 6c1e0256fad84a843d915414e4b5973b7443d48d context_tracking: Restore correct previous context state on exception exit it is able for exception hooks to use the generic code above instead of a redundant arch implementation. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Syscall hooks for context tracking subsystemLi Zhong
This is the syscall slow path hooks for context tracking subsystem, corresponding to [PATCH] x86: Syscall hooks for userspace RCU extended QS commit bf5a3c13b939813d28ce26c01425054c740d6731 TIF_MEMDIE is moved to the second 16-bits (with value 17), as it seems there is no asm code using it. TIF_NOHZ is added to _TIF_SYCALL_T_OR_A, so it is better for it to be in the same 16 bits with others in the group, so in the asm code, andi. with this group could work. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc/booke64: Fix kernel hangs at kernel_dbg_excScott Wood
MSR_DE is not cleared on entry to the kernel, and we don't clear it explicitly outside of debug code. If we have MSR_DE set in prime_debug_regs(), and the new thread has events enabled in DBCR0 (e.g. ICMP is set in thread->dbsr0, even though it was cleared in the real DBCR0 when the thread got scheduled out), we'll end up taking a debug exception in the kernel when DBCR0 is loaded. DSRR0 will not point to an exception vector, and the kernel ends up hanging at kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new debug state. Another observed source of kernel_dbg_exc hangs is with the branch taken event. If this event is active, but we take a non-debug trap (e.g. a TLB miss or an asynchronous interrupt) before the next branch. We end up taking a branch-taken debug exception on the initial branch instruction of the exception vector, but because the debug exception is DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even checking the DSRR0 address. Fix this by checking for DBSR_BT as well as DBSR_IC, which is what 32-bit does and what the comments suggest was intended in the 64-bit code as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Provide __bswapdi2David Woodhouse
Some versions of GCC apparently expect this to be provided by libgcc. Updates from Mikey to fix 32 bit version and adding "r" to registers. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning againLi Zhong
Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Bring all threads online prior to migration/hibernationRobert Jennings
This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Cc: <stable@kernel.org> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc/rtas_flash: Fix validate_flash buffer overflow issueVasant Hegde
ibm,validate-flash-image RTAS call output buffer contains 150 - 200 bytes of data on latest system. Presently we have output buffer size as 64 bytes and we use sprintf to copy data from RTAS buffer to local buffer. This causes kernel oops (see below call trace). This patch increases local buffer size to 256 and also uses snprintf instead of sprintf to copy data from RTAS buffer. Kernel call trace : ------------------- Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=1024 NUMA pSeries Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod Supported: Yes NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000 REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64) MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018 TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36 GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031 GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4 GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90 GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000 GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000 GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031 GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130 NIP [4520323031333130] 0x4520323031333130 LR [4520323031333130] 0x4520323031333130 Call Trace: [c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable) Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc/kexec: Fix kexec when using VMX optimised memcpyAnton Blanchard
commit b3f271e86e5a (powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch) uses VMX when it is safe to do so (ie not in interrupt). It also looks at the task struct to decide if we have to save the current tasks' VMX state. kexec calls memcpy() at a point where the task struct may have been overwritten by the new kexec segments. If it has been overwritten then when memcpy -> enable_altivec looks up current->thread.regs->msr we get a cryptic oops or lockup. I also notice we aren't initialising thread_info->cpu, which means smp_processor_id is broken. Fix that too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Fix build errors STRICT_MM_TYPECHECKSAneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-11Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit changes from Eric Paris: "Al used to send pull requests every couple of years but he told me to just start pushing them to you directly. Our touching outside of core audit code is pretty straight forward. A couple of interface changes which hit net/. A simple argument bug calling audit functions in namei.c and the removal of some assembly branch prediction code on ppc" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: fix message spacing printing auid Revert "audit: move kaudit thread start from auditd registration to kaudit init" audit: vfs: fix audit_inode call in O_CREAT case of do_last audit: Make testing for a valid loginuid explicit. audit: fix event coverage of AUDIT_ANOM_LINK audit: use spin_lock in audit_receive_msg to process tty logging audit: do not needlessly take a lock in tty_audit_exit audit: do not needlessly take a spinlock in copy_signal audit: add an option to control logging of passwords with pam_tty_audit audit: use spin_lock_irqsave/restore in audit tty code helper for some session id stuff audit: use a consistent audit helper to log lsm information audit: push loginuid and sessionid processing down audit: stop pushing loginid, uid, sessionid as arguments audit: remove the old depricated kernel interface audit: make validity checking generic audit: allow checking the type of audit message in the user filter audit: fix build break when AUDIT_DEBUG == 2 audit: remove duplicate export of audit_enabled Audit: do not print error when LSMs disabled ...
2013-05-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull stray syscall bits from Al Viro: "Several syscall-related commits that were missing from the original" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE unicore32: just use mmap_pgoff()... unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)
2013-05-09unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINEAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-07powerpc: Add an in memory udbg consoleAlistair Popple
This patch adds a new udbg early debug console which utilises statically defined input and output buffers stored within the kernel BSS. It is primarily designed to assist with bring up of new hardware which may not have a working console but which has a method of reading/writing kernel memory. This version incorporates comments made by Ben H (thanks!). Changes from v1: - Add memory barriers. - Ensure updating of read/write positions is atomic. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-06powerpc/topology: Fix spurr attribute permissionBenjamin Herrenschmidt
We are registering the attribute with permission 0600 but it doesn't have a store callback, which causes WARN_ON's during boot. Fix the permission. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-06powerpc/pci: Support per-aperture memory offsetBenjamin Herrenschmidt
The PCI core supports an offset per aperture nowadays but our arch code still has a single offset per host bridge representing the difference betwen CPU memory addresses and PCI MMIO addresses. This is a problem as new machines and hypervisor versions are coming out where the 64-bit windows will have a different offset (basically mapped 1:1) from the 32-bit windows. This fixes it by using separate offsets. In the long run, we probably want to get rid of that intermediary struct pci_controller and have those directly stored into the pci_host_bridge as they are parsed but this will be a more invasive change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc/pci: Don't add bogus empty resources to PHBsBenjamin Herrenschmidt
When converting to use the new pci_add_resource_offset() we didn't properly account for empty resources (0 flags) and add those bogons to the PHBs. The result is some annoying messages in the log. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc/cputable: Advertise support for ISEL/HTM/DSCR/TAR on POWER8Nishanth Aravamudan
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc/cputable: Advertise ISEL support on appropriate embedded processorsNishanth Aravamudan
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc/cputable: Advertise DSCR support on P7/P7+Nishanth Aravamudan
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc/pseries: Perform proper max_bus_speed detectionKleber Sacilotto de Souza
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries. This is done by overwriting the weak pcibios_root_bridge_prepare function which is called by pci_create_root_bus(). From: Lucas Kannebley Tavares <lucaskt@linux.vnet.ibm.com> Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05powerpc: Emulate non privileged DSCR read and writeAnton Blanchard
POWER8 allows read and write of the DSCR in userspace. We added kernel emulation so applications could always use the instructions regardless of the CPU type. Unfortunately there are two SPRs for the DSCR and we only added emulation for the privileged one. Add code to match the non privileged one. A simple test was created to verify the fix: http://ozlabs.org/~anton/junkcode/user_dscr_test.c Without the patch we get a SIGILL and it passes with the patch. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Gleb Natapov: "Highlights of the updates are: general: - new emulated device API - legacy device assignment is now optional - irqfd interface is more generic and can be shared between arches x86: - VMCS shadow support and other nested VMX improvements - APIC virtualization and Posted Interrupt hardware support - Optimize mmio spte zapping ppc: - BookE: in-kernel MPIC emulation with irqfd support - Book3S: in-kernel XICS emulation (incomplete) - Book3S: HV: migration fixes - BookE: more debug support preparation - BookE: e6500 support ARM: - reworking of Hyp idmaps s390: - ioeventfd for virtio-ccw And many other bug fixes, cleanups and improvements" * tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) kvm: Add compat_ioctl for device control API KVM: x86: Account for failing enable_irq_window for NMI window request KVM: PPC: Book3S: Add API for in-kernel XICS emulation kvm/ppc/mpic: fix missing unlock in set_base_addr() kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write kvm/ppc/mpic: remove users kvm/ppc/mpic: fix mmio region lists when multiple guests used kvm/ppc/mpic: remove default routes from documentation kvm: KVM_CAP_IOMMU only available with device assignment ARM: KVM: iterate over all CPUs for CPU compatibility check KVM: ARM: Fix spelling in error message ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally KVM: ARM: Fix API documentation for ONE_REG encoding ARM: KVM: promote vfp_host pointer to generic host cpu context ARM: KVM: add architecture specific hook for capabilities ARM: KVM: perform HYP initilization for hotplugged CPUs ARM: KVM: switch to a dual-step HYP init code ARM: KVM: rework HYP page table freeing ARM: KVM: enforce maximum size for identity mapped code ARM: KVM: move to a KVM provided HYP idmap ...
2013-05-02Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlights this time around are: - A pile of addition POWER8 bits and nits, such as updated performance counter support (Michael Ellerman), new branch history buffer support (Anshuman Khandual), base support for the new PCI host bridge when not using the hypervisor (Gavin Shan) and other random related bits and fixes from various contributors. - Some rework of our page table format by Aneesh Kumar which fixes a thing or two and paves the way for THP support. THP itself will not make it this time around however. - More Freescale updates, including Altivec support on the new e6500 cores, new PCI controller support, and a pile of new boards support and updates. - The usual batch of trivial cleanups & fixes" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) powerpc: Fix build error for book3e powerpc: Context switch the new EBB SPRs powerpc: Turn on the EBB H/FSCR bits powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S powerpc: Setup BHRB instructions facility in HFSCR for POWER8 powerpc: Fix interrupt range check on debug exception powerpc: Update tlbie/tlbiel as per ISA doc powerpc: Print page size info during boot powerpc: print both base and actual page size on hash failure powerpc: Fix hpte_decode to use the correct decoding for page sizes powerpc: Decode the pte-lp-encoding bits correctly. powerpc: Use encode avpn where we need only avpn values powerpc: Reduce PTE table memory wastage powerpc: Move the pte free routines from common header powerpc: Reduce the PTE_INDEX_SIZE powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format powerpc: New hugepage directory format powerpc: Don't truncate pgd_index wrongly powerpc: Don't hard code the size of pte page powerpc: Save DAR and DSISR in pt_regs on MCE ...
2013-05-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
2013-05-02powerpc: Context switch the new EBB SPRsMichael Ellerman
This context switches the new Event Based Branching (EBB) SPRs. The three new SPRs are: - Event Based Branch Handler Register (EBBHR) - Event Based Branch Return Register (EBBRR) - Branch Event Status and Control Register (BESCR) Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-02powerpc: Turn on the EBB H/FSCR bitsMichael Neuling
This turns Event Based Branching (EBB) on in the Hypervisor Facility Status and Control Register (HFSCR) and Facility Status and Control Register (FSCR). Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-02powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207SMichael Ellerman
We are getting low on cpu feature bits. So rather than add a separate bit for every new Power8 feature, add a bit for arch 2.07 server catagory and use that instead. Hijack the value we had for BCTAR, but swap the value with CFAR so that all the ARCH defines are together. Note we don't touch CPU_FTR_TM, because it is conditionally enabled if the kernel is built with TM support. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-02powerpc: Setup BHRB instructions facility in HFSCR for POWER8Anshuman Khandual
Make BHRB instructions available in problem and privileged states. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-02powerpc: Fix interrupt range check on debug exceptionBharat Bhushan
We do not want to take single step and branch-taken debug exception in kernel exception code. But the address range check was not covering all kernel exception handlers address range. With this patch we defined the interrupt_end label which defines the end on kernel exception code. So now we check interrupt_base to interrupt_end range for not handling debug exception in kernel exception entry. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-01ppc: Clean up rtas_flash driver somewhatDavid Howells
Clean up some of the problems with the rtas_flash driver: (1) It shouldn't fiddle with the internals of the procfs filesystem (altering pde->count). (2) If pid namespaces are in effect, then you can get multiple inodes connected to a single pde, thereby rendering the pde->count > 2 test useless. (3) The pde->count fudging doesn't work for forked, dup'd or cloned file descriptors, so add static mutexes and use them to wrap access to the driver through read, write and release methods. (4) The driver can only handle one device, so allocate most of the data previously attached to the pde->data as static variables instead (though allocate the validation data buffer with kmalloc). (5) We don't need to save the pde pointers as long as we have the filenames available for removal. (6) Don't try to multiplex what the update file read method does based on the filename. Instead provide separate file ops and split the function. Whilst we're at it, tabulate the procfile information and loop through it when creating or destroying them rather than manually coding each one. [Folded fixes from Vasant Hegde <hegdevasant@linux.vnet.ibm.com>] Signed-off-by: David Howells <dhowells@redhat.com> cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> cc: Paul Mackerras <paulus@samba.org> cc: Anton Blanchard <anton@samba.org> cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01proc: Supply PDE attribute setting accessor functionsDavid Howells
Supply accessor functions to set attributes in proc_dir_entry structs. The following are supplied: proc_set_size() and proc_set_user(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> cc: linuxppc-dev@lists.ozlabs.org cc: linux-media@vger.kernel.org cc: netdev@vger.kernel.org cc: linux-wireless@vger.kernel.org cc: linux-pci@vger.kernel.org cc: netfilter-devel@vger.kernel.org cc: alsa-devel@alsa-project.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull compat cleanup from Al Viro: "Mostly about syscall wrappers this time; there will be another pile with patches in the same general area from various people, but I'd rather push those after both that and vfs.git pile are in." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: syscalls.h: slightly reduce the jungles of macros get rid of union semop in sys_semctl(2) arguments make do_mremap() static sparc: no need to sign-extend in sync_file_range() wrapper ppc compat wrappers for add_key(2) and request_key(2) are pointless x86: trim sys_ia32.h x86: sys32_kill and sys32_mprotect are pointless get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC merge compat sys_ipc instances consolidate compat lookup_dcookie() convert vmsplice to COMPAT_SYSCALL_DEFINE switch getrusage() to COMPAT_SYSCALL_DEFINE switch epoll_pwait to COMPAT_SYSCALL_DEFINE convert sendfile{,64} to COMPAT_SYSCALL_DEFINE switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect make HAVE_SYSCALL_WRAPPERS unconditional consolidate cond_syscall and SYSCALL_ALIAS declarations teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long get rid of duplicate logics in __SC_....[1-6] definitions
2013-05-01dump_stack: unify debug information printed by show_regs()Tejun Heo
show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01dump_stack: consolidate dump_stack() implementations and unify their behaviorsTejun Heo
Both dump_stack() and show_stack() are currently implemented by each architecture. show_stack(NULL, NULL) dumps the backtrace for the current task as does dump_stack(). On some archs, dump_stack() prints extra information - pid, utsname and so on - in addition to the backtrace while the two are identical on other archs. The usages in arch-independent code of the two functions indicate show_stack(NULL, NULL) should print out bare backtrace while dump_stack() is used for debugging purposes when something went wrong, so it does make sense to print additional information on the task which triggered dump_stack(). There's no reason to require archs to implement two separate but mostly identical functions. It leads to unnecessary subtle information. This patch expands the dummy fallback dump_stack() implementation in lib/dump_stack.c such that it prints out debug information (taken from x86) and invokes show_stack(NULL, NULL) and drops arch-specific dump_stack() implementations in all archs except blackfin. Blackfin's dump_stack() does something wonky that I don't understand. Debug information can be printed separately by calling dump_stack_print_info() so that arch-specific dump_stack() implementation can still emit the same debug information. This is used in blackfin. This patch brings the following behavior changes. * On some archs, an extra level in backtrace for show_stack() could be printed. This is because the top frame was determined in dump_stack() on those archs while generic dump_stack() can't do that reliably. It can be compensated by inlining dump_stack() but not sure whether that'd be necessary. * Most archs didn't use to print debug info on dump_stack(). They do now. An example WARN dump follows. WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Hardware name: empty Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a071>] init_workqueues+0x35/0x505 ... v2: CPU number added to the generic debug info as requested by s390 folks and dropped the s390 specific dump_stack(). This loses %ksp from the debug message which the maintainers think isn't important enough to keep the s390-specific dump_stack() implementation. dump_stack_print_info() is moved to kernel/printk.c from lib/dump_stack.c. Because linkage is per objecct file, dump_stack_print_info() living in the same lib file as generic dump_stack() means that archs which implement custom dump_stack() - at this point, only blackfin - can't use dump_stack_print_info() as that will bring in the generic version of dump_stack() too. v1 The v1 patch broke build on blackfin due to this issue. The build breakage was reported by Fengguang Wu. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>