Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"I've been sitting on some of these fixes for a while.
- Corner case of returning to delay slot from interrupt
- Changing default interrupt prioiry level
- Kconfig'ize support for super pages
- Other minor fixes"
* tag 'arc-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: mm: Introduce explicit super page size support
ARCv2: intc: Allow interruption by lowest priority interrupt
ARCv2: Check for LL-SC livelock only if LLSC is enabled
ARC: shrink cpuinfo by not saving full timer BCR
ARCv2: clocksource: Rename GRTC -> GFRC ...
ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
|
|
MMUv4 supports 2 concurrent page sizes: Normal and Super [4K to 16M]
So far Linux supported a single super page size for a given Normal page,
depending on the software page walking address split.
e.g. we had 11:8:13 address split for 8K page, which meant super page
was 2 ^(8+13) = 2M (given that THP size has to be PMD_SHIFT)
Now we turn this around, by allowing multiple Super Pages in Kconfig
(currently 2M and 16M only) and forcing page walker address split to
PGDIR_SHIFT and PAGE_SHIFT
For configs without Super page, things are same as before and
PGDIR_SHIFT can be hacked to get non default address split
The motivation for this change is a customer who needs 16M super page
and a 8K Normal page combo.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARC HS Cores support configurable multiple interrupt priorities of upto
16 levels.
There is processor "interrupt preemption threshhold" in STATUS32.E[4:1]
And several places need to set this up:
1. seed value as kernel is booting
2. seed value for user space programs
3. Arg to SLEEP instruction in idle task (what interrupt prio can wake)
4. Per-IRQ line prioirty (i.e. what is the priority of interrupt
raised by a peripheral or timer or perf counter...
Currently above sites use the highest priority 0. This can be potential
problem when multiple priorities are supported. e.g. user space could
only be interrupted by P0 interrupt, not others...
So turn this over and instead make default interruption level to be
the lowest priority possible 15. This should be fine even if there are
fewer priority levels configured (say two: P0 HIGH, P1 LOW)
This feature also effectively disables FIRQ feature if present in
hardware config. With old code, a P0 interrupt would be FIRQ, needing
special handling (ISR or Register Banks) which is NOT supported yet.
Now it not be P0 (P15 or whatever is lowest prio) so FIRQ is not
triggered.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
... it is now called Global Free Running Counter
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Returning to delay slot, riding an interrupti, had one loose end.
AUX_USER_SP used for restoring user mode SP upon RTIE was not being
setup from orig task's saved value, causing task to use wrong SP,
leading to ProtV errors.
The reason being:
- INTERRUPT_EPILOGUE returns to a kernel trampoline, thus not expected to restore it
- EXCEPTION_EPILOGUE is not used at all
Fix that by restoring AUX_USER_SP explicitly in the trampoline.
This was broken in the original workaround, but the error scenarios got
reduced considerably since v3.14 due to following:
1. The Linuxthreads.old based userspace at the time caused many more
exceptions in delay slot than the current NPTL based one.
Infact with current userspace the error doesn't happen at all.
2. Return from interrupt (delay slot or otherwise) doesn't get exercised much
after commit 4de0e52867d8 ("Really Re-enable interrupts to avoid deadlocks")
since IRQ_ACTIVE.active being clear means most returns are as if from pure
kernel (even for active interrupts)
Infact the issue only happened in an experimental branch where I was tinkering with
reverted 4de0e52867d8
Cc: stable@kernel.org # v4.2+
Fixes: 4255b07f2c9c ("ARCv2: STAR 9000793984: Handle return from intr to Delay Slot")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.
[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
[vgupta@synopsys.com: ARC: dma mapping fixes #2]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Carlos Palminha <CARLOS.PALMINHA@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As illustrated by commit a3afe70b83fd ("[S390] latencytop s390
support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to
advertise an implementation of save_stack_trace_tsk.
However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk()
weak alias") a dummy implementation is provided if STACKTRACE=y. Given
that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Helge Deller <deller@gmx.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).
On other hand page_mapcount() return mapcount for this particular small
page.
This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.
Most users outside core-mm should use page_mapcount() instead of
page_mapped().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:
- RO/NX attribute fixes for patch module relocations from Josh
Poimboeuf. As part of this effort, module.c has been cleaned up as
well and livepatching is piggy-backing on this cleanup. Rusty is OK
with this whole lot going through livepatching tree.
- symbol disambiguation support from Chris J Arges. That series is
also
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
but this came in only after I've alredy pushed out. Didn't want to
rebase because of that, hence I am mentioning it here.
- symbol lookup fix from Miroslav Benes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Cleanup module page permission changes
module: keep percpu symbols in module's symtab
module: clean up RO/NX handling.
module: use a structure to encapsulate layout.
gcov: use within_module() helper.
module: Use the same logic for setting and unsetting RO/NX
livepatch: function,sympos scheme in livepatch sysfs directory
livepatch: add sympos as disambiguator field to klp_reloc
livepatch: add old_sympos as disambiguator field to klp_func
|
|
Instead of seeing empty stack traces, let kernel fail early so dwarf
issues can be fixed sooner
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The rudimentary CIE.version == 3 handling is already present in code
(for return address register specification)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Blingly ignoring CIE.version != 1 was a bad idea.
It still leaves "desirability" when running perf with callgraphing where libgcc
symbols might show in hotspot.
More importantly, basic CIE.version == 3 support already exists in code:
|
| retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
|
Next commit with simply add continue-not-bail for CIE.version != 1
This reverts commit 323f41f9e7d0cb5b1d1586aded6682855f1e646d.
|
|
At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
which are served by libgcc.
Modules historically are NOT linked with libgcc to avoid code bloat, reducing
runtime relocation fixups etc. I even once tried doing that but got lost
in makefile intricacies.
This means modules at -Os don't get the millicode thunks, causing build
failures below:
| MODPOST 5 modules
| ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
|....
|....
Workaround that by inhibiting millicode thunks for loadable modules
Fixes STAR 9000641864:
("Linux built with optimizations for size emits errors for modules")
Reported-by: Anton Kolesov <akolesov@synosys.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.
But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.
And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
CC arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
aux_tag = ARC_REG_IC_PTAG;
^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------
The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.
We don't use it in cache_line_loop_v2() anyways so who cares.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
| .init.text:__alloc_bootmem_low()
The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Makes it similar to smp_ops which also has callback with same name
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This will better reflect its description i.e. "any needed setup..."
and not just do an "IPI request".
Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARC dwarf unwinder only supports CIE version == 1
The boot time dwarf sanitizer (part of binary lookup table constructor)
would simply bail if it saw CIE version == 3, rendering unwinder with a
NULL lookup table.
It seems libgcc linked with kernel does have such entries.
With fallback linear search removed, and a NULL binary lookup table,
unwinder fails to generate any stack trace.
So allow graceful ignoring of unsupported CIE entries.
This problem was initially seen in Alexey's setup (and not mine) as he
was using buildroot built toolchain (libgcc) which doesn't get built with
CFLAGS_FOR_TARGET="-gdwarf-2 which is my default
Fixes STAR 9000985048: "kernel unwinder broken with stock tools"
Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Reported-by Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.
So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.
While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.
Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
HIGHMEM support bumped the default memory size for nsim platform to 1G.
Thus total memory ended at the very edge of start of peripherals address
space. With linux link base shifted, memory started bleeding into
peripheral space which caused early boot bad_page spew !
Fixes: 29e332261d2 ("ARC: mm: HIGHMEM: populate high memory from DT")
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This was the second perf intr issue
perf sampling on multicore requires intr to be enabled on all cores.
ARC perf probe code used helper arc_request_percpu_irq() which calls
- request_percpu_irq() on core0
- enable_percpu_irq() on all all cores (including core0)
genirq requires that request be made ahead of enable call.
However if perf probe happened on non core0 (observed on a 3.18 kernel),
enable would get called ahead of request, failing obviously and
rendering perf intr disabled on all such cores
[ 11.120000] 1 ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
[ 11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 0 =====> request_percpu_irq() IRQ 20
[ 11.140000] 0 -----> enable_percpu_irq() IRQ 20
Fix this fragility, by calling request_percpu_irq() on whatever core
calls probe (there is no requirement on which core calls this anyways)
and then calling enable on each cores.
Interestingly this started as invesigation of STAR 9000838902:
"sporadically IRQs enabled on perf prob"
which was about occassional boot spew as request_percpu_irq got called
non-locally (from an IPI), and re-enabled interrupts in following path
proc_mkdir -> spin_unlock_irq()
which the irq work code didn't like.
| ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
|
| BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
|
| Stack Trace:
| arc_unwind_core.constprop.1+0x94/0x104
| dump_stack+0x62/0x98
| irq_work_run_list+0xb0/0xb4
| irq_work_run+0x22/0x3c
| do_IPI+0x74/0x9c
| handle_irq_event_percpu+0x34/0x164
| handle_percpu_irq+0x58/0x78
| generic_handle_irq+0x1e/0x2c
| arch_do_IRQ+0x3c/0x60
| ret_from_exception+0x0/0x8
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org> #4.2+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
arc_request_percpu_irq() is called by all cores to request/enable percpu
irq. It has some "prep" calls needed by genirq:
- setup percpu devid
- disable IRQ_NOAUTOEN
However given that enable_percpu_irq() is called enayways, latter can be
avoided.
We are now left with irq_set_percpu_devid() quirk and that too for
ARCompact builds only, since previous patch updated ARCv2 intc to do this
in the "right" place, i.e. irq map function.
By next release, this will ultimately be fixed for ARCompact as well.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
As part of fixing another perf issue, observed that after a perf run,
the interrupt got disabled on one/more cores.
Turns out that despite requesting perf irq as percpu, the flow handler
registered was not handle_percpu_irq()
Given that on ARCv2 cores, IRQs < 24 are always private to cpu, we
register the right handler at the very onset.
Before Fix
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 0 0 0 0 ARCv2 core Intc 20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 0 522 8 51916 ARCv2 core Intc 20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 0 522 8 104368 ARCv2 core Intc 20 ARC perf counters
After Fix
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 0 0 0 0 ARCv2 core Intc 20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 64198 62012 62697 67803 ARCv2 core Intc 20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
| 20: 126014 122792 123301 133654 ARCv2 core Intc 20 ARC perf counters
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org #4.2+
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Current ARC SDP boards cannot reliably handle 1Gbit
Ethernet connections due to limitations in hardware.
To make sure networking is stable on the board we're
limiting phy to 100 Mbit.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Makes it easier to handle init vs core cleanly, though the change is
fairly invasive across random architectures.
It simplifies the rbtree code immediately, however, while keeping the
core data together in the same cachline (now iff the rbtree code is
enabled).
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:
in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates thru each of ~11K entries) thus takes 2 orders of
magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet) fail the unwinder binary lookup, hit linear search, failing
nevertheless in the end.
However the linear search is pointless as binary lookup tables are created
from it in first place. It is impossible to have binary lookup fail while
succeed the linear search. It is pure waste of cycles thus removed by
this patch.
This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was perf counter overflowing in routine lacking dwarf info (like memset)
leading to patheic 3 million cycle unwinder slow path and by the time it
returned new interrupts were already pending (Timer, IPI) and taken
rightaway. The original memset didn't make forward progress, system kept
accruing more interrupts and more unwinder delayes in a vicious feedback
loop, ultimately triggering the NMI diagnostic.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
SYNC in __switch_to() is a historic relic and not needed at all.
- In UP context it is obviously useless, why would we want to stall
the core for all updates to stack memory of t0 to complete before
loading kernel mode callee registers from t1 stack's memory.
- In SMP, there could be potential race in which outgoing task could
be concurrently picked for running on a different core, thus writes
to stack here need to be visible before the reads from stack on
other core. Peter confirmed that generic schedular already has needed
barriers (by way of rq lock) so there is no need for additional arch
barrier.
This came up when Noam was trying to replace this SYNC with EZChip
specific hardware thread scheduling instruction for their platform
support.
Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Cc: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Although kernel doesn't support the multiple IRQ priority levels provided
by HS38x core intc yet, ensure that the default prio value is used
anyways by relevant code.
SLEEP insn needs to be provided the IRQ priority level which can
interrupt it. This needs to be the default level which maynot
necessarily be 0 as assumed by current code.
This change allows a kernel with ARCV2_IRQ_DEF_PRIO = 1 to boot fine.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
No semantical changes, prepares for ARCv2 specific change in next commit
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
When building kernel with buildroot built toolchain, CROSS_COMPILE
currently needs adjustment even if minor. This is because the defconfigs
prefer "arc-linux-uclibc-" prefix from hand built (non buildroot) toolchain
while buildroot provides "arc-buildroot-linux-uclibc-"
To avoid this use the common "arc-linux-" prefix which is provided by
buildroot and has also been in hand built tools for quite some time.
Signed-off-by: sujayraaj <sujayraaj@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: updated changelog]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Found a couple of brown paper bag bugs with the prev pull request
(including a SMP build breakage report from Guenter). Since these are
urgent I also decided to send over a bunch of other pending fixes
which could have otherwise waited an rc or two.
Summary:
- A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build
failure)
- cpu_relax() now compiler barrier for UP as well
- handling of userspace Bus Errors for ARCompact builds"
* tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Fix silly typo in MAINTAINERS file
ARC: cpu_relax() to be compiler barrier even for UP
ARC: use ASL assembler mnemonic
ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
ARC: remove extraneous header include
ARCv2: lib: memcpy: use local symbols
|
|
cpu_relax() on ARC has been barrier only for SMP (and no-op for UP). Per
recent discussions, it is safer to make it a compiler barrier
unconditionally.
Link: http://lkml.kernel.org/r/53A7D3AA.9020100@synopsys.com
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARCompact and ARCv2 only have ASL, while binutils used to support LSL as
a alias mnemonic.
Newer binutils (upstream) don't want to do that so replace it.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Bus errors from userspace on ARCompact based cores are handled by core
as a high priority L2 interrupt but current code treated it as interrupt
Handling an interrupt like exception is certainly not going to go unnoticed.
(and it worked so far as we never saw a Bus error from userspace until
IPPK guys tested a DDR controller with ECC error detection etc hence
needed to explicitly trigger/handle such errors)
- So move mem_service exception handler from common code into ARCv2 code.
- In ARCompact code, define mem_service as L2 interrupt handler which
just drops down to pure kernel mode and goes of to enqueue SIGBUS
Reported-by: Nelson Pereira <npereira@synopsys.com>
Tested-by: Ana Martins <amartins@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
"A fairly large (by DT standards) pull request this time with the
majority being some overdue moving DT binding docs around to
consolidate similar bindings.
- DT binding doc consolidation moving similar bindings to common
locations. The majority of these are display related which were
scattered in video/, fb/, drm/, gpu/, and panel/ directories.
- Add new config option, CONFIG_OF_ALL_DTBS, to enable building all
dtbs in the tree for most arches with dts files (except powerpc for
now).
- OF_IRQ=n fixes for user enabled CONFIG_OF.
- of_node_put ref counting fixes from Julia Lawall.
- Common DT binding for wakeup-source and deprecation of all similar
bindings.
- DT binding for PXA LCD controller.
- Allow ignoring failed PCI resource translations in order to ignore
64-bit addresses on non-LPAE 32-bit kernels.
- Support setting the NUMA node from DT instead of only from parent
device.
- Couple of earlycon DT parsing fixes for address and options"
* tag 'devicetree-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (45 commits)
MAINTAINERS: update DT binding doc locations
devicetree: add Sigma Designs vendor prefix
of: simplify arch_find_n_match_cpu_physical_id() function
Documentation: arm: Fixed typo in socfpga fpga mgr example
Documentation: devicetree: fix reference to legacy wakeup properties
Documentation: devicetree: standardize/consolidate on "wakeup-source" property
drivers: of: removing assignment of 0 to static variable
xtensa: enable building of all dtbs
mips: enable building of all dtbs
metag: enable building of all dtbs
metag: use common make variables for dtb builds
h8300: enable building of all dtbs
arm64: enable building of all dtbs
arm: enable building of all dtbs
arc: enable building of all dtbs
arc: use common make variables for dtb builds
of: add config option to enable building of all dtbs
of/fdt: fix error checking for earlycon address
of/overlay: add missing of_node_put
of/platform: add missing of_node_put
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
"The main changes in this cycle were:
- More gradual enhancements to atomic ops: new atomic*_read_ctrl()
ops, synchronize atomic_{read,set}() ordering requirements between
architectures, add atomic_long_t bitops. (Peter Zijlstra)
- Add _{relaxed|acquire|release}() variants for inc/dec atomics and
use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
This enables weakly ordered architectures (such as arm64) to make
use of more locking related optimizations. (Davidlohr Bueso)
- Implement atomic[64]_{inc,dec}_relaxed() on ARM. (Will Deacon)
- Futex kernel data cache footprint micro-optimization. (Rasmus
Villemoes)
- pvqspinlock runtime overhead micro-optimization. (Waiman Long)
- misc smaller fixlets"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
locking/rwsem: Use acquire/release semantics
locking/mcs: Use acquire/release semantics
locking/rtmutex: Use acquire/release semantics
locking/mutex: Use acquire/release semantics
locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
atomic: Implement atomic_read_ctrl()
atomic, arch: Audit atomic_{read,set}()
atomic: Add atomic_long_t bitops
futex: Force hot variables into a single cache line
locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
locking/osq: Relax atomic semantics
locking/qrwlock: Rename ->lock to ->wait_lock
locking/Documentation/lockstat: Fix typo - lokcing -> locking
locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- Support for new MM features in ARCv2 cores (THP, PAE40) Some generic
THP bits are touched - all ACKed by Kirill
- Platform framework updates to prepare for EZChip arrival (still in works)
- ARC Public Mailing list setup finally (linux-snps-arc@lists.infraded.org)
* tag 'arc-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (42 commits)
ARC: mm: PAE40 support
ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_t
ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
ARC: mm: HIGHMEM: populate high memory from DT
ARC: mm: HIGHMEM: kmap API implementation
ARC: mm: preps ahead of HIGHMEM support #2
ARC: mm: preps ahead of HIGHMEM support
ARC: mm: use generic macros _BITUL()/_AC()
ARC: mm: Improve Duplicate PD Fault handler
MAINTAINERS: Add public mailing list for ARC
ARC: Ensure DT mem base is same as what kernel is built with
ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP once
ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
ARC: smp: Introduce smp hook @init_early_smp for Master core
ARC: remove @init_time, @init_irq platform callbacks
ARC: smp: irqchip: handle IPI as percpu irq like timer
ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modes
...
|
|
Otherwise perf profiles don't charge tme to memcpy
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This is the first working implementation of 40-bit physical address
extension on ARCv2.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
That way a single flip of phys_addr_t to 64 bit ensures all places
dealing with physical addresses get correct data
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Implement kmap* API for ARC.
This enables
- permanent kernel maps (pkmaps): :kmap() API
- fixmap : kmap_atomic()
We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap) which
the API guarantees.
Note that this patch only enables highmem for subsequent PAE40 support
as there is no real highmem for ARC in pure 32-bit paradigm as explained
below.
ARC has 2:2 address split of the 32-bit address space with lower half
being translated (virtual) while upper half unstranslated
(0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of
unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say
DDR 0x0 by external Bus Glue logic (outside the core). So kernel can
potentially access 1.75G worth of memory directly w/o need for highmem.
(the top 256M is taken by uncached peripheral space from 0xF000_0000 to
0xFFFF_FFFF)
In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32-bits. Thus highmem is required
for kernel proper to be able to access these pages for it's own purposes
(user space is agnostic to this anyways).
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Explicit'ify that all memory added so far is low memory
Nothing semantical
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|