summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2013-04-26powerpc/pseries: Move architecture vector definitions to prom.hNathan Fontenot
As part of handling of PRRN events we need to check vector 5 of the architecture vector bits reported in the device tree to ensure PRRN event handling is enabled. To do this firmware_has_feature() is updated (in a subsequent patch) to make this check vector 5 bits. To avoid having to re-define bits in the architecture vector the bit definitions are moved to prom.h. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Add PRRN RTAS event handlerJesse Larrew
A PRRN event is signaled via the RTAS event-scan mechanism, which returns a Hot Plug Event message "fixed part" indicating "Platform Resource Reassignment". In response to the Hot Plug Event message, we must call ibm,update-nodes to determine which resources were reassigned and then ibm,update-properties to obtain the new affinity information about those resources. The PRRN event-scan RTAS message contains only the "fixed part" with the "Type" field set to the value 160 and no Extended Event Log. The four-byte Extended Event Log Length field is re-purposed (since no Extended Event Log message is included) to pass the "scope" parameter that causes the ibm,update-nodes to return the nodes affected by the specific resource reassignment. This patch adds a handler for RTAS events. The function pseries_devicetree_update() (from mobility.c) is used to make the ibm,update-nodes/ibm,update-properties RTAS calls. Updating the NUMA maps (handled by a subsequent patch) will require significant processing, so pseries_devicetree_update() is called from an asynchronous workqueue to allow event processing to continue. PRRN RTAS events on pseries systems are rare events that have to be initiated from the HMC console for the system by an IBM tech. This allows us to assume that these events are widely spaced. Additionally, all work on the queue is flushed before handling any new work to ensure we only have one event in flight being handled at a time. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Correct buffer parsing in update_dt_node()Nathan Fontenot
Correct parsing of the buffer returned from ibm,update-properties. The first element is a length and the path to the property which is slightly different from the list of properties in the buffer so we need to specifically handle this. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Expose pseries devicetree_update()Nathan Fontenot
Newer firmware on Power systems can transparently reassign platform resources (CPU and Memory) in use. For instance, if a processor or memory unit is predicted to fail, the platform may transparently move the processing to an equivalent unused processor or the memory state to an equivalent unused memory unit. However, reassigning resources across NUMA boundaries may alter the performance of the partition. When such reassignment is necessary, the Platform Resource Reassignment Notification (PRRN) option provides a mechanism to inform the Linux kernel of changes to the NUMA affinity of its platform resources. When rtasd receives a PRRN event, it needs to make a series of RTAS calls (ibm,update-nodes and ibm,update-properties) to retrieve the updated device tree information. These calls are already handled in the pseries_devicetree_update() routine used in partition migration. This patch exposes pseries_devicetree_update() to make it accessible to other pseries routines, this patch also updates pseries_devicetree_update() to take a 32-bit scope parameter. The scope value, which was previously hard coded to 1 for partition migration, is used for the RTAS calls ibm,update-nodes/properties to update the device tree. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc: Fix hardware IRQs with MMU on exceptions when HV=0Michael Neuling
POWER8 allows us to take interrupts with the MMU on. This gives us a second set of vectors offset at 0x4000. Unfortunately when coping these vectors we missed checking for MSR HV for hardware interrupts (0x500). This results in us trying to use HSRR0/1 when HV=0, rather than SRR0/1 on HW IRQs The below fixes this to check CPU_FTR_HVMODE when patching the code at 0x4500. Also we remove the check for CPU_FTR_ARCH_206 since relocation on IRQs are only available in arch 2.07 and beyond. Thanks to benh for helping find this. Signed-off-by: Michael Neuling <mikey@neuling.org> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/power8: Fix secondary CPUs hanging on boot for HV=0Michael Neuling
In __restore_cpu_power8 we determine if we are HV and if not, we return before setting HV only resources. Unfortunately we forgot to restore the link register from r11 before returning. This will happen on boot and with secondary CPUs not coming online. This adds the missing link register restore. Signed-off-by: Michael Neuling <mikey@neuling.org> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc: Add isync to copy_and_flushMichael Neuling
In __after_prom_start we copy the kernel down to zero in two calls to copy_and_flush. After the first call (copy from 0 to copy_to_here:) we jump to the newly copied code soon after. Unfortunately there's no isync between the copy of this code and the jump to it. Hence it's possible that stale instructions could still be in the icache or pipeline before we branch to it. We've seen this on real machines and it's results in no console output after: calling quiesce... returning from prom_init The below adds an isync to ensure that the copy and flushing has completed before any branching to the new instructions occurs. Signed-off-by: Michael Neuling <mikey@neuling.org> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc: Add HWCAP2 aux entryMichael Neuling
We are currently out of free bits in AT_HWCAP. With POWER8, we have several hardware features that we need to advertise. Tested on POWER and x86. Signed-off-by: Michael Neuling <michael@neuling.org> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24powerpc/powernv: Fix missing Kconfig dependency for MSIsBenjamin Herrenschmidt
We need PPC_MSI_BITMAP support Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24Merge remote-tracking branch 'origin/master' into nextBenjamin Herrenschmidt
Merge upstream to get the audit fixes
2013-04-24powerpc/spufs: Initialise inode->i_ino in spufs_new_inode()Michael Ellerman
In commit 85fe402 (fs: do not assign default i_ino in new_inode), the initialisation of i_ino was removed from new_inode() and pushed down into the callers. However spufs_new_inode() was not updated. This exhibits as no files appearing in /spu, because all our dirents have a zero inode, which readdir() seems to dislike. Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24powerpc/rtas_flash: New return code to indicate FW entitlement expiryVasant Hegde
Add new return code to rtas_flash to indicate firmware entitlement expiry. Strictly we don't need this update. But to keep it in sync with PAPR, this was added. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24powerpc/rtas_flash: Update return token commentsVasant Hegde
Add proper comment to ibm,validate-flash-image RTAS call update result tokens. Note: Only comment section is modified, no code change. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()Michael Ellerman
None of the cell platforms support CPU hotplug, so we should iterate only over online nodes when setting PMU interrupts. This also fixes a warning during boot when NODES_SHIFT is large enough: WARNING: at /scratch/michael/src/kmk/linus/kernel/irq/irqdomain.c:766 ... NIP [c0000000000db278] .irq_linear_revmap+0x30/0x58 LR [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8 Call Trace: [c0000003fc9c3af0] [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8 (unreliable) [c0000003fc9c3b80] [c000000000655c1c] .__machine_initcall_cell_cbe_init_pm_irq+0x84/0x158 [c0000003fc9c3c20] [c00000000000afb4] .do_one_initcall+0x5c/0x1e0 [c0000003fc9c3cd0] [c000000000644580] .kernel_init_freeable+0x238/0x328 [c0000003fc9c3db0] [c00000000000b784] .kernel_init+0x1c/0x120 [c0000003fc9c3e30] [c000000000009fb8] .ret_from_kernel_thread+0x64/0xac This is caused by us overflowing our linear revmap because we're requesting too many interrupts. Reported-by: Dennis Schridde <devurandom@gmx.net> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24Merge remote-tracking branch 'mpe/master' into nextBenjamin Herrenschmidt
Merge the tentative powerpc-next branch maintained by Michael while I was on vacation
2013-04-23powerpc/pseries/lparcfg: Fix possible overflow are more than 1026Chen Gang
need set '\0' for 'local_buffer'. SPLPAR_MAXLENGTH is 1026, RTAS_DATA_BUF_SIZE is 4096. so the contents of rtas_data_buf may truncated in memcpy. if contents are really truncated. the splpar_strlen is more than 1026. the next while loop checking will not find the end of buffer. that will cause memory access violation. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23ptrace/powerpc: Don't flush_ptrace_hw_breakpoint() on fork()Oleg Nesterov
arch_dup_task_struct() does flush_ptrace_hw_breakpoint(src), this destroys the parent's breakpoints for no reason. We should clear child->thread.ptrace_bps[] copied by dup_task_struct() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23powerpc: Add VDSO version of timeAdhemerval Zanella
On 04/18/2013 07:38 PM, Anton Blanchard wrote: > Since you are only reading one long you shouldn't need to check the > update count and loop, you will always see a consistent value. The > system call version of time() just does an unprotected load for example. Fixed. > With the above change and with Michael's comments covered (decent > changelog entry and Signed-off-by): > > Acked-by: Anton Blanchard <anton@samba.org> Thanks for the review, below the updated patch: From: Adhemerval Zanella <azanella@linux.vnet.ibm.com> This patch implement the time syscall as vDSO. The performance speedups are: Baseline PPC32: 380 nsec Baseline PPC64: 350 nsec vdso PPC32: 20 nsec vsdo PPC64: 20 nsec Tested on 64 bit build with both 32 bit and 64 bit userland. Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Adhemerval Zanella <azanella@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-21Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix offcore_rsp valid mask for SNB/IVB perf: Treat attr.config as u64 in perf_swevent_init()
2013-04-21Merge branch 'x86-kdump-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kdump fixes from Peter Anvin: "The kexec/kdump people have found several problems with the support for loading over 4 GiB that was introduced in this merge cycle. This is partly due to a number of design problems inherent in the way the various pieces of kdump fit together (it is pretty horrifically manual in many places.) After a *lot* of iterations this is the patchset that was agreed upon, but of course it is now very late in the cycle. However, because it changes both the syntax and semantics of the crashkernel option, it would be desirable to avoid a stable release with the broken interfaces." I'm not happy with the timing, since originally the plan was to release the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead... * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kexec: use Crash kernel for Crash kernel low x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low x86, kdump: Retore crashkernel= to allocate under 896M x86, kdump: Set crashkernel_low automatically
2013-04-21Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Three groups of fixes: 1. Make sure we don't execute the early microcode patching if family < 6, since it would touch MSRs which don't exist on those families, causing crashes. 2. The Xen partial emulation of HyperV can be dealt with more gracefully than just disabling the driver. 3. More EFI variable space magic. In particular, variables hidden from runtime code need to be taken into account too." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Verify the family before dispatching microcode patching x86, hyperv: Handle Xen emulation of Hyper-V more gracefully x86,efi: Implement efi_no_storage_paranoia parameter efi: Export efi_query_variable_store() for efivars.ko x86/Kconfig: Make EFI select UCS2_STRING efi: Distinguish between "remaining space" and actually used space efi: Pass boot services variable info to runtime code Move utf16 functions to kernel core and rename x86,efi: Check max_size only if it is non-zero. x86, efivars: firmware bug workarounds should be in platform code
2013-04-21Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "A set of fixes from various people - Will Deacon gets a prize for removing code this time around. The biggest fix in this lot is sorting out the ARM740T mess. The rest are relatively small fixes." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7699/1: sched_clock: Add more notrace to prevent recursion ARM: 7698/1: perf: fix group validation when using enable_on_exec ARM: 7697/1: hw_breakpoint: do not use __cpuinitdata for dbg_cpu_pm_nb ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch() ARM: 7692/1: iop3xx: move IOP3XX_PERIPHERAL_VIRT_BASE ARM: modules: don't export cpu_set_pte_ext when !MMU ARM: mm: remove broken condition check for v4 flushing ARM: mm: fix numerous hideous errors in proc-arm740.S ARM: cache: remove ARMv3 support code ARM: tlbflush: remove ARMv3 support
2013-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller: 1) Fix race in sparc64 TLB shootdowns, we have to synchronize with the sibling cpus completing if we are passing them a reference via pointer to a data structure. 2) Fix cleaning of bitmaps in sparc32, from Akinobu Mita. 3) Fix various sparc header mistakes, some of which resulted in userland build breakage. From Sam Ravnborg. 4) Kill ghost declarations and defines missed when several bits of code got deleted recently. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix race in TLB batch processing. sparc: use asm-generic version of types.h bbc_i2c: fix section mismatch warning sparc: use generic headers sparc:cleanup unused code in smp_32.h sparc/iommu: fix typo s/265KB/256KB/ sparc/srmmu: clear trailing edge of bitmap properly sparc:remove unused declaration smp_boot_cpus()
2013-04-20Merge remote-tracking branch 'efi/urgent' into x86/urgentH. Peter Anvin
Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-19x86, microcode: Verify the family before dispatching microcode patchingH. Peter Anvin
For each CPU vendor that implements CPU microcode patching, there will be a minimum family for which this is implemented. Verify this minimum level of support. This can be done in the dispatch function or early in the application functions. Doing the latter turned out to be somewhat awkward because of the ineviable split between the BSP and the AP paths, and rather than pushing deep into the application functions, do this in the dispatch function. Reported-by: "Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
2013-04-19sparc64: Fix race in TLB batch processing.David S. Miller
As reported by Dave Kleikamp, when we emit cross calls to do batched TLB flush processing we have a race because we do not synchronize on the sibling cpus completing the cross call. So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.) and either flushes are missed or flushes will flush the wrong addresses. Fix this by using generic infrastructure to synchonize on the completion of the cross call. This first required getting the flush_tlb_pending() call out from switch_to() which operates with locks held and interrupts disabled. The problem is that smp_call_function_many() cannot be invoked with IRQs disabled and this is explicitly checked for with WARN_ON_ONCE(). We get the batch processing outside of locked IRQ disabled sections by using some ideas from the powerpc port. Namely, we only batch inside of arch_{enter,leave}_lazy_mmu_mode() calls. If we're not in such a region, we flush TLBs synchronously. 1) Get rid of xcall_flush_tlb_pending and per-cpu type implementations. 2) Do TLB batch cross calls instead via: smp_call_function_many() tlb_pending_func() __flush_tlb_pending() 3) Batch only in lazy mmu sequences: a) Add 'active' member to struct tlb_batch b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE c) Set 'active' in arch_enter_lazy_mmu_mode() d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode() e) Check 'active' in tlb_batch_add_one() and do a synchronous flush if it's clear. 4) Add infrastructure for synchronous TLB page flushes. a) Implement __flush_tlb_page and per-cpu variants, patch as needed. b) Likewise for xcall_flush_tlb_page. c) Implement smp_flush_tlb_page() to invoke the cross-call. d) Wire up global_flush_tlb_page() to the right routine based upon CONFIG_SMP 5) It turns out that singleton batches are very common, 2 out of every 3 batch flushes have only a single entry in them. The batch flush waiting is very expensive, both because of the poll on sibling cpu completeion, as well as because passing the tlb batch pointer to the sibling cpus invokes a shared memory dereference. Therefore, in flush_tlb_pending(), if there is only one entry in the batch perform a completely asynchronous global_flush_tlb_page() instead. Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2013-04-19ARM: 7699/1: sched_clock: Add more notrace to prevent recursionStephen Boyd
cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns() is called by cyc_to_sched_clock(). I suspect that some compilers inline both of these functions into sched_clock() and so we've been getting away without having a notrace marking. It seems that my compiler isn't inlining cyc_to_sched_clock() though, so I'm hitting a recursion bug when I enable the function graph tracer, causing my system to crash. Marking these functions notrace fixes it. Technically cyc_to_ns() doesn't need the notrace because it's already marked inline, but let's just add it so that if we ever remove inline from that function it doesn't blow up. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-19Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Only one remaining fix for arm-soc platforms at this time, a small bugfix for cpu hotplug on highbank platforms that has become much easier to hit as of late. Details in the patch description, but it's small and well-contained and definitely impacts users of the platform, so 3.9 seems appropriate." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: highbank: fix cache flush ordering for cpu hotplug
2013-04-18ARM: highbank: fix cache flush ordering for cpu hotplugRob Herring
The L1 data cache flush needs to be after highbank_set_cpu_jump call which pollutes the cache with the l2x0_lock. This causes other cores to deadlock waiting for the l2x0_lock. Moving the flush of the entire data cache after highbank_set_cpu_jump fixes the problem. Use flush_cache_louis instead of flush_cache_all are that is sufficient to flush only the L1 data cache. flush_cache_louis did not exist when highbank_cpu_die was originally written. With PL310 errata 769419 enabled, a wmb is inserted into idle which takes the l2x0_lock. This makes the problem much more easily hit and causes reset to hang. Reported-by: Paolo Pisati <p.pisati@gmail.com> Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2013-04-18x86, hyperv: Handle Xen emulation of Hyper-V more gracefullyK. Y. Srinivasan
Install the Hyper-V specific interrupt handler only when needed. This would permit us to get rid of the Xen check. Note that when the vmbus drivers invokes the call to register its handler, we are sure to be running on Hyper-V. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1366299886-6399-1-git-send-email-kys@microsoft.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-18powerpc: Try to insert the hptes repeatedly in kernel_map_linear_page()Li Zhong
This patch fixes the following oops, which could be trigged by build the kernel with many concurrent threads, under CONFIG_DEBUG_PAGEALLOC. hpte_insert() might return -1, indicating that the bucket (primary here) is full. We are not necessarily reporting a BUG in this case. Instead, we could try repeatedly (try secondary, remove and try again) until we find a slot. [ 543.075675] ------------[ cut here ]------------ [ 543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239! [ 543.075714] Oops: Exception in kernel mode, sig: 5 [#1] [ 543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries [ 543.075741] Modules linked in: binfmt_misc ehea [ 543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594 [ 543.075771] REGS: c0000000a90832c0 TRAP: 0700 Not tainted (3.8.0-next-20130222) [ 543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 22224482 XER: 00000000 [ 543.075816] SOFTE: 0 [ 543.075823] CFAR: c00000000004c200 [ 543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1 GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594 GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854 GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001 GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000 GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001 GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950 GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230 [ 543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8 [ 543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 [ 543.076025] Call Trace: [ 543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable) [ 543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc [ 543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c [ 543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4 [ 543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8 [ 543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0 [ 543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4 [ 543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30 [ 543.076155] Instruction dump: [ 543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008 [ 543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000 [ 543.076224] ---[ end trace bd5807e8d6ae186b ]--- Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Split the code trying to insert hpte repeatedly as an helper functionLi Zhong
Move the logic trying to insert hpte in __hash_page_huge() to an helper function, so it could also be used by others. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Move the setting of rflags out of loop in __hash_page_hugeLi Zhong
It seems that new_pte and rflags don't get changed in the repeating loop, so move their assignment out of the loop. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Set default VGA deviceBrian King
Add a PCI quirk for VGA devices on Power to set the default VGA device. Ensures a default VGA is always set if a graphics adapter is present, even if firmware did not initialize it. If more than one graphics adapter is present, ensure the one initialized by firmware is set as the default VGA device. This ensures that X autoconfiguration will work. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc/pci: fix PCI-e devices rescan issue on powerpc platformYuanquan Chen
Powerpc initializes the DMA and IRQ information in pci_scan_child_bus()-> pcibios_fixup_bus()->pcibios_setup_bus_devices(). But for the devices which are hotpluged, bus->is added has been set for the first scan of the PCI-e bus, so the initialization code won't be called. Then the hotpluged devices' driver will fail to load. For example : The PCI-e device 0001:03:00.0 is the Intel PCI-e e1000e network card, remove it from the system: # echo 1 > /sys/bus/pci/devices/0001\:03\:00.0/remove # e1000e 0001:03:00.0 eth0: removed PHC Rescan it from it's bus: # echo 1 > /sys/bus/pci/devices/0001\:02\:00.0/rescan ... e1000e 0001:03:00.0: Disabling ASPM L0s L1 e1000e 0001:03:00.0: No usable DMA configuration, aborting e1000e: probe of 0001:03:00.0 failed with error -5 So we move the DMA & IRQ initialization code from pcibios_setup_devices() and construct a new function pcibios_enable_device. We call this function in pcibios_enable_device, which will be called by PCI-e rescan code. At the meanwhile, we avoid the the impact on cardbus. I also validate this patch with silicon's PCIe-sata which encounters the IRQ issue. Signed-off-by: Yuanquan Chen <Yuanquan.Chen@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: fix annotation of fake_numa_create_new_node()Stephen Rothwell
This function has always been marked as __cpuinit, but is only called from functions marked as __init and references an __initdata variable. So change its annotation to __init. Fixes this build warning: WARNING: arch/powerpc/mm/built-in.o(.cpuinit.text+0x86): Section mismatch in reference from the function .fake_numa_create_new_node() to the variable .init.data:cmdline The function __cpuinit .fake_numa_create_new_node() references a variable __initdata cmdline. If cmdline is only used by .fake_numa_create_new_node then annotate cmdline with a matching annotation. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: fix numa distance for form0 device treeVaidyanathan Srinivasan
The following commit breaks numa distance setup for old powerpc systems that use form0 encoding in device tree. commit 41eab6f88f24124df89e38067b3766b7bef06ddb powerpc/numa: Use form 1 affinity to setup node distance Device tree node /rtas/ibm,associativity-reference-points would index into /cpus/PowerPCxxxx/ibm,associativity based on form0 or form1 encoding detected by ibm,architecture-vec-5 property. All modern systems use form1 and current kernel code is correct. However, on older systems with form0 encoding, the numa distance will get hard coded as LOCAL_DISTANCE for all nodes. This causes task scheduling anomaly since scheduler will skip building numa level domain (topmost domain with all cpus) if all numa distances are same. (value of 'level' in sched_init_numa() will remain 0) Prior to the above commit: ((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE) Restoring compatible behavior with this patch for old powerpc systems with device tree where numa distance are encoded as form0. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc/ptrace: Add DAWR debug feature info for userspaceMichael Neuling
This adds new debug feature information so that the DAWR can be identified by userspace tools like GDB. Unfortunately the DAWR doesn't sit nicely into the current description that ptrace provides to userspace via struct ppc_debug_info. It doesn't allow for specifying that only some ranges are possible or even the end alignment constraints (DAWR only allows 512 byte wide ranges which can't cross a 512 byte boundary). After talking to Edjunior Machado (GDB ppc developer), it was decided this was the best approach. Just mark it as debug feature DAWR and tools like GDB can internally decide the constraints. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Add accounting for Doorbell interruptsIan Munsie
This patch adds a new line to /proc/interrupts to account for the doorbell interrupts that each hardware thread has received. The total interrupt count in /proc/stat will now also include doorbells. # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 16: 551 1267 281 175 XICS Level IPI LOC: 2037 1503 1688 1625 Local timer interrupts SPU: 0 0 0 0 Spurious interrupts CNT: 0 0 0 0 Performance monitoring interrupts MCE: 0 0 0 0 Machine check exceptions DBL: 42 550 20 91 Doorbell interrupts Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc/pseries: close DDW race between functions of adapterNishanth Aravamudan
Given a PCI device with multiple functions in a DDW capable slot, the following situation can be encountered: When the first function sets a 64-bit DMA mask, enable_ddw() will be called and we can fail to properly configure DDW (the most common reason being the new DMA window's size is not large enough to map all of an LPAR's memory). With the recent changes to DDW, we remove the base window in order to determine if the new window is of sufficient size to cover an LPAR's memory. We correctly replace the base window if we find that not to be the case. However, once we go through and re-configured 32-bit DMA via the IOMMU, the next function of the adapter will go through the same process. And since DDW is a characteristic of the slot itself, we are most likely going to fail again. But to determine we are going to fail the second slot, we again remove the base window -- but that is now in-use by the first function/driver, which might be issuing I/O already. To close this window, keep a list of all the failed struct device_nodes that have failed to configure DDW. If the current device_node is in that list, just fail out immediately and fall back to 32-bit DMA without doing any DDW manipulation. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc/powernv: Use MSI bitmap to manage IRQsGavin Shan
As Michael Ellerman mentioned, arch/powerpc/sysdev/msi_bitmap.c already implemented bitmap to manage (alloc/free) MSI interrupts. The patch intends to use that mechanism to manage MSI interrupts for PowerNV platform. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Setup in HFSCR for POWER8Michael Neuling
Setup the HFSCR (Hypervisor Facility Status and Control Register) for POWER8 when running HV=1. The HFSCR is the same as the FSCR except it's for hypervisors. It controls the available of various facilities in OS and userspace levels. It also indicates the cause of a hypervisor facility unavailable interrupt (although we are not using this here). This patch sets the facilities Linux knows about incase the firmware doesn't. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Add HFSCR SPR definitionsMichael Neuling
Add SPR number and bit definitions for the HFSCR (Hypervisor Facility Status and Control Register). Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: fixing ptrace_get_reg to return an errorAlexey Kardashevskiy
Currently ptrace_get_reg returns error as a value what make impossible to tell whether it is a correct value or error code. The patch adds a parameter which points to the real return data and returns an error code. As get_user_msr() never fails and it is used in multiple places so it has not been changed by this patch. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Fix build errors with UP configs in HV-style KVMPaul Mackerras
This fixes these errors when building UP with CONFIG_KVM_BOOK3S_64_HV=y: arch/powerpc/kvm/book3s_hv.c:1855:2: error: implicit declaration of function 'inhibit_secondary_onlining' [-Werror=implicit-function-declaration] arch/powerpc/kvm/book3s_hv.c:1862:2: error: implicit declaration of function 'uninhibit_secondary_onlining' [-Werror=implicit-function-declaration] cc1: all warnings being treated as errors and this error (with CONFIG_KVM_BOOK3S_64=m, or a vmlinux link error with CONFIG_KVM_BOOK3S_64=y): ERROR: "smp_send_reschedule" [arch/powerpc/kvm/kvm.ko] undefined! make[2]: *** [__modpost] Error 1 The fix for the link error is suboptimal; ideally we want a self_ipi() function from irq.c, connected at least to the MPIC code, to initiate an IPI to this cpu. The fix here at least lets the code build, and it will work, just with interrupts being delayed sometimes. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: remove cast for kmalloc/kzalloc return valueZhang Yanfei
remove cast for kmalloc/kzalloc return value. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc/kgdb: Removed kmalloc returned value castAlex Grad
Signed-off-by: Alex Grad <alex.grad@gmail.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: drop even more unused Kconfig symbolsPaul Bolle
When I submitted commit 6805ab6daa2b589fe3242d05ddc47a9dbb0c4eb1 ("powerpc: drop unused Kconfig symbols") I apparently failed to notice that my patch also made PREP_RESIDUAL and PPC_A2_DD2 unused. Drop these now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: remove CONFIG_MPC10X_OPENPICPaul Bolle
The last users of Kconfig symbol MPC10X_OPENPIC were removed in v2.6.27. Its Kconfig entry can be removed now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: remove unused CONFIG_405EPPaul Bolle
All users of Kconfig symbol 405EP were removed in release v2.6.27. Remove this symbol (and a useless select of it) too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Josh Boyer <jwboyer@gmail.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>