summaryrefslogtreecommitdiff
path: root/arch/arm/mm
AgeCommit message (Collapse)Author
2013-08-12ARM: tlb: don't perform inner-shareable invalidation for local TLB opsWill Deacon
Inner-shareable TLB invalidation is typically more expensive than local (non-shareable) invalidation, so performing the broadcasting for local_flush_tlb_* operations is a waste of cycles and needlessly clobbers entries in the TLBs of other CPUs. This patch introduces __flush_tlb_* versions for many of the TLB invalidation functions, which only respect inner-shareable variants of the invalidation instructions when presented with the TLB_V7_UIS_FULL flag. The local version is also inlined to prevent SMP_ON_UP kernels from missing flushes, where the __flush variant would be called with the UP flags. This gains us around 0.5% in hackbench scores for a dual-core A15, but I would expect this to improve as more cores (and clusters) are added to the equation. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: mm: remove redundant dsb() prior to range TLB invalidationWill Deacon
The kernel TLB range invalidation functions already contain dsb instructions before and after the maintenance, so there is no need to introduce additional barriers. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-01Merge branch 'security-fixes' into fixesRussell King
2013-08-01ARM: make vectors page inaccessible from userspaceRussell King
If kuser helpers are not provided by the kernel, disable user access to the vectors page. With the kuser helpers gone, there is no reason for this page to be visible to userspace. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: allow kuser helpers to be removed from the vector pageRussell King
Provide a kernel configuration option to allow the kernel user helpers to be removed from the vector page, thereby preventing their use with ROP (return orientated programming) attacks. This option is only visible for CPU architectures which natively support all the operations which kernel user helpers would normally provide, and must be enabled with caution. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-31ARM: move vector stubsRussell King
Move the machine vector stubs into the page above the vector page, which we can prevent from being visible to userspace. Also move the reset stub, and place the swi vector at a location that the 'ldr' can get to it. This hides pointers into the kernel which could give valuable information to attackers, and reduces the number of exploitable instructions at a fixed address. Cc: <stable@vger.kernel.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7789/1: Do not run dummy_flush_tlb_a15_erratum() on non-Cortex-A15Fabio Estevam
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8): Internal error: Oops - undefined instruction: 0 [#1] SMP ARM CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881 task: df46cc00 ti: df48e000 task.ti: df48e000 PC is at check_and_switch_context+0x17c/0x4d0 LR is at check_and_switch_context+0xdc/0x4d0 This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not. To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7785/1: mm: restrict early_alloc to section-aligned memoryRussell King
When map_lowmem() runs, and processes a memory bank whose start or end is not section-aligned, memory must be allocated to store the 2nd-level page tables. Those allocations are made by calling memblock_alloc(). At this point, the only memory that is free *and* mapped is memory which has already been mapped by map_lowmem() itself. For this reason, we must calculate the first point at which map_lowmem() will need to allocate memory, and set the memblock allocation limit to a lower address, so that memblock_alloc() is guaranteed to return memory that is already mapped. This patch enhances sanity_check_meminfo() to calculate that memory address, and pass it to memblock_set_current_limit(), rather than just assuming the limit is arm_lowmem_limit. The algorithm applied is: * Default memblock_limit to arm_lowmem_limit in the absence of any other limit; arm_lowmem_limit is the highest memory that is mapped by map_lowmem(). * While walking the list of memblocks, if the start of a block is not aligned, 2nd-level page tables will need to be allocated to map the first few pages of the block. Hence, the memblock_limit must be before the start of the block. * Similarly, if the end of any block is not aligned, 2nd-level page tables will need to be allocated to map the last few pages of the block. Hence, the memblock_limit must point at the end of the block, rounded down to section-alignment. * The memory blocks are assumed to be sorted in address order, so the first unaligned block start or end is used to set the limit. With this algorithm, the start or end of almost any bank can be non- section-aligned. The only exception is that the start of bank 0 must be section-aligned, since otherwise memory would need to be allocated when mapping the start of bank 0, which occurs before any free memory is mapped. [swarren, wrote commit description, rewrote calculation of memblock_limit] Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-22ARM: 7784/1: mm: ensure SMP alternates assemble to exactly 4 bytes with Thumb-2Will Deacon
Commit ae8a8b9553bd ("ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead") added early function returns for page table cache flushing operations on ARMv7 SMP CPUs. Unfortunately, when targetting Thumb-2, these `mov pc, lr' sequences assemble to 2 bytes which can lead to corruption of the instruction stream after code patching. This patch fixes the alternates to use wide (32-bit) instructions for Thumb-2, therefore ensuring that the patching code works correctly. Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-14arm: delete __cpuinit/__CPUINIT usage from all ARM usersPaul Gortmaker
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) and are flagged as __cpuinit -- so if we remove the __cpuinit from the arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit related content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the ARM uses of the __cpuinit macros from C code, and all __CPUINIT from assembly code. It also had two ".previous" section statements that were paired off against __CPUINIT (aka .section ".cpuinit.text") that also get removed here. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-13Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "A few fixes for ARM, mostly just one liners with the exception of the missing section specification. We decided not to rely on .previous to fix this but to explicitly state the section we want the code to be in." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7778/1: smp_twd: twd_update_frequency need be run on all online CPUs ARM: 7782/1: Kconfig: Let ARM_ERRATA_364296 not depend on CONFIG_SMP ARM: mm: fix boot on SA1110 Assabet ARM: 7781/1: mmu: Add debug_ll_io_init() mappings to early mappings ARM: 7780/1: add missing linker section markup to head-common.S
2013-07-11mm: remove free_area_cacheMichel Lespinasse
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09ARM: mm: fix boot on SA1110 AssabetRussell King
Commit 83db0384 (mm/ARM: use common help functions to free reserved pages) broke booting on the Assabet by trying to convert a PFN to a virtual address using the __va() macro. This macro takes the physical address, not a PFN. Fix this. Cc: <stable@vger.kernel.org> # 3.10 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-09ARM: 7781/1: mmu: Add debug_ll_io_init() mappings to early mappingsStephen Boyd
Failure to add the mapping created in debug_ll_io_init() can lead to the BUG_ON() triggering in lib/ioremap.c:27 if the static virtual address decided for the debug_ll mapping overlaps with another mapping that is created later. This happens because the generic ioremap code has no idea there is a mapping there and it tries to place a mapping in the same location and blows up when it sees that there is a pte already present. kernel BUG at lib/ioremap.c:27! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc2-00042-g2af0c67-dirty #316 task: ef088000 ti: ef082000 task.ti: ef082000 PC is at ioremap_page_range+0x16c/0x198 LR is at ioremap_page_range+0xf0/0x198 pc : [<c04cb874>] lr : [<c04cb7f8>] psr: 20000113 sp : ef083e78 ip : af140000 fp : ef083ebc r10: ef7fc100 r9 : ef7fc104 r8 : 000af174 r7 : 00000647 r6 : beffffff r5 : f004c000 r4 : f0040000 r3 : af173417 r2 : 16440653 r1 : af173e07 r0 : ef7fc8fc Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5787d Table: 8020406a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xef082238) Stack: (0xef083e78 to 0xef084000) 3e60: 00040000 ef083eec 3e80: bf134000 f004bfff c0207c00 f004c000 c02fc120 f000c000 c15e7800 00040000 3ea0: ef083eec 00000647 c098ba9c c0953544 ef083edc ef083ec0 c021b82c c04cb714 3ec0: c09cdc50 00000040 ef0f1e00 ef1003c0 ef083f14 ef083ee0 c09535bc c021b7bc 3ee0: c0953544 c04d0c6c c094e2cc c1600be4 c07440c4 c09a6888 00000002 c0a15f00 3f00: ef082000 00000000 ef083f54 ef083f18 c0208728 c0953550 00000002 c1600bfc 3f20: c08e3fac c0839918 ef083f54 c1600b80 c09a6888 c0a15f00 0000008b c094e2cc 3f40: c098ba9c c098bab8 ef083f94 ef083f58 c094ea0c c020865c 00000002 00000002 3f60: c094e2cc 00000000 c025b674 00000000 c06ff860 00000000 00000000 00000000 3f80: 00000000 00000000 ef083fac ef083f98 c06ff878 c094e910 00000000 00000000 3fa0: 00000000 ef083fb0 c020efe8 c06ff86c 00000000 00000000 00000000 00000000 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 c0595108 [<c04cb874>] (ioremap_page_range+0x16c/0x198) from [<c021b82c>] (__alloc_remap_buffer.isra.18+0x7c/0xc4) [<c021b82c>] (__alloc_remap_buffer.isra.18+0x7c/0xc4) from [<c09535bc>] (atomic_pool_init+0x78/0x128) [<c09535bc>] (atomic_pool_init+0x78/0x128) from [<c0208728>] (do_one_initcall+0xd8/0x198) [<c0208728>] (do_one_initcall+0xd8/0x198) from [<c094ea0c>] (kernel_init_freeable+0x108/0x1d0) [<c094ea0c>] (kernel_init_freeable+0x108/0x1d0) from [<c06ff878>] (kernel_init+0x18/0xf4) [<c06ff878>] (kernel_init+0x18/0xf4) from [<c020efe8>] (ret_from_fork+0x14/0x20) Code: e50b0040 ebf54b2f e51b0040 eaffffee (e7f001f2) Fix it by telling generic layers about the static mapping via iotable_init(). This also has the nice side effect of letting you see the mapping in procfs' vmallocinfo file. Cc: Rob Herring <rob.herring@calxeda.com> Cc: Stephen Warren <swarren@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-06Merge branch 'for-v3.11' of ↵Linus Torvalds
git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull ARM DMA mapping updates from Marek Szyprowski: "This contains important bugfixes and an update for IOMMU integration support for ARM architecture" * 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: ARM: dma: Drop __GFP_COMP for iommu dma memory allocations ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach ARM: dma-mapping: convert DMA direction into IOMMU protection attributes ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
2013-07-03mm/ARM: prepare for removing num_physpages and simplify mem_init()Jiang Liu
Prepare for removing num_physpages and simplify mem_init(). Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03mm: concentrate modification of totalram_pages into the mm coreJiang Liu
Concentrate code to modify totalram_pages into the mm core, so the arch memory initialized code doesn't need to take care of it. With these changes applied, only following functions from mm core modify global variable totalram_pages: free_bootmem_late(), free_all_bootmem(), free_all_bootmem_node(), adjust_managed_page_count(). With this patch applied, it will be much more easier for us to keep totalram_pages and zone->managed_pages in consistence. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: David Howells <dhowells@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03mm: enhance free_reserved_area() to support poisoning memory with zeroJiang Liu
Address more review comments from last round of code review. 1) Enhance free_reserved_area() to support poisoning freed memory with pattern '0'. This could be used to get rid of poison_init_mem() on ARM64. 2) A previous patch has disabled memory poison for initmem on s390 by mistake, so restore to the original behavior. 3) Remove redundant PAGE_ALIGN() when calling free_reserved_area(). Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03mm: change signature of free_reserved_area() to fix building warningsJiang Liu
Change signature of free_reserved_area() according to Russell King's suggestion to fix following build warnings: arch/arm/mm/init.c: In function 'mem_init': arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default] free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL); ^ In file included from include/linux/mman.h:4:0, from arch/arm/mm/init.c:15: include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *' extern unsigned long free_reserved_area(unsigned long start, unsigned long end, mm/page_alloc.c: In function 'free_reserved_area': >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default] In file included from arch/mips/include/asm/page.h:49:0, from include/linux/mmzone.h:20, from include/linux/gfp.h:4, from include/linux/mm.h:8, from mm/page_alloc.c:18: arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int' mm/page_alloc.c: In function 'free_area_init_nodes': mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds] Also address some minor code review comments. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "This contains the usual updates from other people (listed below) and the usual random muddle of miscellaneous ARM updates which cover some low priority bug fixes and performance improvements. I've started to put the pull request wording into the merge commits, which are: - NoMMU stuff: This includes the following series sent earlier to the list: - nommu-fixes - R7 Support - MPU support I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I were discussing today until we've reached a conclusion/that's had some more review. This is rebased (and re-tested) on your devel-stable branch because otherwise there were going to be conflicts with Uwe's V7M work now that you've merged that. I've included the fix for limiting MPU to CPU_V7. - Huge page support These changes bring both HugeTLB support and Transparent HugePage (THP) support to ARM. Only long descriptors (LPAE) are supported in this series. The code has been tested on an Arndale board (Exynos 5250). - LPAE updates Please pull these miscellaneous LPAE fixes I've been collecting for a while now for 3.11. They've been tested and reviewed by quite a few people, and most of the patches are pretty trivial. -- Will Deacon. - arch_timer cleanups Please pull these arch_timer cleanups I've been holding onto for a while. They're the same as my last posting, but have been rebased to v3.10-rc3. - mpidr linearisation (multiprocessor id register - identifies which CPU number we are in the system) This patch series that implements MPIDR linearization through a simple hashing algorithm and updates current cpu_{suspend}/{resume} code to use the newly created hash structures to retrieve context pointers. It represents a stepping stone for the implementation of power management code on forthcoming multi-cluster ARM systems. It has been tested on TC2 (dual cluster A15xA7 system), iMX6q, OMAP4 and Tegra, with processors hitting low-power states requiring warm-boot resume through the cpu_resume code path" * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits) ARM: 7775/1: mm: Remove do_sect_fault from LPAE code ARM: 7777/1: Avoid extra calls to the C compiler ARM: 7774/1: Fix dtb dependency to use order-only prerequisites ARM: 7770/1: remove residual ARMv2 support from decompressor ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator ARM: 7767/1: let the ASID allocator handle suspended animation ARM: 7766/1: versatile: don't mark pen as __INIT ARM: 7765/1: perf: Record the user-mode PC in the call chain. ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork ARM: kernel: implement stack pointer save array through MPIDR hashing ARM: kernel: build MPIDR hash function data structure ARM: mpu: Ensure that MPU depends on CPU_V7 ARM: mpu: protect the vectors page with an MPU region ARM: mpu: Allow enabling of the MPU via kconfig ARM: 7758/1: introduce config HAS_BANDGAP ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting ARM: 7751/1: zImage: don't overwrite ourself with a page table ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace ...
2013-07-02Merge tag 'soc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC specific changes from Arnd Bergmann: "These changes are all to SoC-specific code, a total of 33 branches on 17 platforms were pulled into this. Like last time, Renesas sh-mobile is now the platform with the most changes, followed by OMAP and EXYNOS. Two new platforms, TI Keystone and Rockchips RK3xxx are added in this branch, both containing almost no platform specific code at all, since they are using generic subsystem interfaces for clocks, pinctrl, interrupts etc. The device drivers are getting merged through the respective subsystem maintainer trees. One more SoC (u300) is now multiplatform capable and several others (shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving towards that goal with this series but need more work. Also noteworthy is the work on PCI here, which is traditionally part of the SoC specific code. With the changes done by Thomas Petazzoni, we can now more easily have PCI host controller drivers as loadable modules and keep them separate from the platform code in drivers/pci/host. This has already led to the discovery that three platforms (exynos, spear and imx) are actually using an identical PCIe host controller and will be able to share a driver once support for spear and imx is added." * tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (480 commits) ARM: integrator: let pciv3 use mem/premem from device tree ARM: integrator: set local side PCI addresses right ARM: dts: Add pcie controller node for exynos5440-ssdk5440 ARM: dts: Add pcie controller node for Samsung EXYNOS5440 SoC ARM: EXYNOS: Enable PCIe support for Exynos5440 pci: Add PCIe driver for Samsung Exynos ARM: OMAP5: voltagedomain data: remove temporary OMAP4 voltage data ARM: keystone: Move CPU bringup code to dedicated asm file ARM: multiplatform: always pick one CPU type ARM: imx: select syscon for IMX6SL ARM: keystone: select ARM_ERRATA_798181 only for SMP ARM: imx: Synertronixx scb9328 needs to select SOC_IMX1 ARM: OMAP2+: AM43x: resolve SMP related build error dmaengine: edma: enable build for AM33XX ARM: edma: Add EDMA crossbar event mux support ARM: edma: Add DT and runtime PM support to the private EDMA API dmaengine: edma: Add TI EDMA device tree binding arm: add basic support for Rockchip RK3066a boards arm: add debug uarts for rockchip rk29xx and rk3xxx series arm: Add basic clocks for Rockchip rk3066a SoCs ...
2013-07-02Merge tag 'cleanup-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Arnd Bergmann: "This contains cleanups as preparation for other branches adding new features, we pulled 16 branches for 9 platforms into this one. Most notable here is the removal of support for ATAGS based OMAP4 systems. Since all OMAP4 machines are fully functional with DT based booting in 3.10, we can remove a lot of code here. Also noteworthy is Maxime Ripard's cleanup of the machine descriptors, which means we need no machine descriptors in a lot more cases and can boot additional machines by just having the respective device drivers enabled." * tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (76 commits) ARM: picoxcell: remove .nr_irqs reference ARM: s5p64x0: avoid build warning for uncompress.h ARM: SAMSUNG: Remove unused plat/regs-watchdog.h header ARM: SAMSUNG: Remove legacy watchdog reset code ARM: SAMSUNG: Let platforms use the new watchdog reset driver ARM: SAMSUNG: Add watchdog reset driver ARM: SAMSUNG: Use local definitions of watchdog registers watchdog: s3c2410_wdt: Use local register definitions ARM: S5P64X0: Use common uncompress.h part for plat-samsung ARM: SAMSUNG: Consolidate uncompress subroutine ARM: at91: drop rm9200dk board support ARM: dts: msm: Fix merge resolution ARM: OMAP1: Remove dma.h ARM: OMAP1: Remove legacy irda.h and irda setup from board files ARM: OMAP1: Remove duplicated DMA channel definitions ARM: OMAP1: Remove McBSP DMA channel definitions ARM: OMAP2+: Remove dma.h ARM: OMAP2+: hwmod: Remove remaining DMA channel definitions ARM: OMAP2+: Remove duplicated DMA channel definitions ARM: OMAP2+: Remove AES crypto device DMA channel definitions ...
2013-06-29Merge branch 'devel-stable' into for-nextRussell King
Conflicts: arch/arm/Makefile arch/arm/include/asm/glue-proc.h
2013-06-29Merge branches 'fixes', 'mcpm', 'misc' and 'mmci' into for-nextRussell King
2013-06-29ARM: 7775/1: mm: Remove do_sect_fault from LPAE codeSteven Capper
For LPAE, do_sect_fault used to be invoked as the second level access flag handler. When transparent huge pages were introduced for LPAE, do_page_fault was used instead. Unfortunately, do_sect_fault remains defined but not used for LPAE code resulting in a compile warning. This patch surrounds do_sect_fault with #ifndef CONFIG_ARM_LPAE to fix this warning. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-28ARM: dma: Drop __GFP_COMP for iommu dma memory allocationsRichard Zhao
__iommu_alloc_buffer wants to split pages after allocation in order to reduce the memory footprint. This does not work well with __GFP_COMP pages, so drop this flag before allocation One failure example is snd_malloc_dev_pages call dma_alloc_coherent with __GFP_COMP. Signed-off-by: Richard Zhao <rizhao@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as cleanMing Lei
It is common for one sg to include many pages, so mark all these pages as clean to avoid unnecessary flushing on them in set_pte_at() or update_mmu_cache(). The patch might improve loading performance of applciation code a bit. On the below test code to read file(~1GByte size) from usb mass storage disk to buffer created with mmap(PROT_READ | PROT_EXEC) on Pandaboard, average ~1% improvement can be observed with the patch on 10 times test. unsigned int sum = 0; static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2) { return (tv2->tv_sec - tv1->tv_sec) * 1000000 + (tv2->tv_usec - tv1->tv_usec); } int main(int argc, char *argv[]) { char *mbuffer; int fd; int i; unsigned long page_size, size; struct stat stat; struct timeval t1, t2; page_size = getpagesize(); fd = open(argv[1], O_RDONLY); assert(fd >= 0); fstat(fd, &stat); size = stat.st_size; printf("%s: file %s, file size %lu, page size %lu\n", argv[0], read_filename, size, page_size); gettimeofday(&t1, NULL); mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0); for (i = 0 ; i < size ; i += page_size) sum += mbuffer[i]; munmap(mbuffer, page_size); gettimeofday(&t2, NULL); printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2)); close(fd); } Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detachWill Deacon
The current code only clobbers a local variable, so the device is left with a stale mapping pointer. Cc: Hiroshi Doyu <hdoyu@nvidia.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Hiroshi Doyu <hdoyu@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28ARM: dma-mapping: convert DMA direction into IOMMU protection attributesWill Deacon
IOMMU mappings take a prot parameter, identifying the protection bits to enforce on the newly created mapping (READ or WRITE). The ARM dma-mapping framework currently just passes 0 as the prot argument, resulting in faulting mappings. This patch infers the protection attributes based on the direction of the DMA transfer. Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_poolYoungJun Cho
In __iommu_get_pages(), the cpu_addr is checked wheather in atomic_pool range or not. So if the cpu_addr is in atomic_pool range, it does not need to check twice. Signed-off-by: YoungJun Cho <yj44.cho@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-24ARM: 7769/1: Cortex-A15: fix erratum 798181 implementationMarc Zyngier
Looking into the active_asids array is not enough, as we also need to look into the reserved_asids array (they both represent processes that are currently running). Also, not holding the ASID allocator lock is racy, as another CPU could schedule that process and trigger a rollover, making the erratum workaround miss an IPI. Exposing this outside of context.c is a little ugly on the side, so let's define a new entry point that the erratum workaround can call to obtain the cpumask. Cc: <stable@vger.kernel.org> # 3.9 Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-24ARM: 7768/1: prevent risks of out-of-bound access in ASID allocatorMarc Zyngier
On a CPU that never ran anything, both the active and reserved ASID fields are set to zero. In this case the ASID_TO_IDX() macro will return -1, which is not a very useful value to index a bitmap. Instead of trying to offset the ASID so that ASID #1 is actually bit 0 in the asid_map bitmap, just always ignore bit 0 and start the search from bit 1. This makes the code a bit more readable, and without risk of OoB access. Cc: <stable@vger.kernel.org> # 3.9 Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-24ARM: 7767/1: let the ASID allocator handle suspended animationMarc Zyngier
When a CPU is running a process, the ASID for that process is held in a per-CPU variable (the "active ASIDs" array). When the ASID allocator handles a rollover, it copies the active ASIDs into a "reserved ASIDs" array to ensure that a process currently running on another CPU will continue to run unaffected. The active array is zero-ed to indicate that a rollover occurred. Because of this mechanism, a reserved ASID is only remembered for a single rollover. A subsequent rollover will completely refill the reserved ASIDs array. In a severely oversubscribed environment where a CPU can be prevented from running for extended periods of time (think virtual machines), the above has a horrible side effect: [P{a} denotes process P running with ASID a] CPU-0 CPU-1 A{x} [active = <x 0>] [suspended] runs B{y} [active = <x y>] [rollover: active = <0 0> reserved = <x y>] runs B{y} [active = <0 y> reserved = <x y>] [rollover: active = <0 0> reserved = <0 y>] runs C{x} [active = <0 x>] [resumes] runs A{x} At that stage, both A and C have the same ASID, with deadly consequences. The fix is to preserve reserved ASIDs across rollovers if the CPU doesn't have an active ASID when the rollover occurs. Cc: <stable@vger.kernel.org> # 3.9 Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Carinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-24ARM: 7773/1: PJ4B: Add support for errata 4742Gregory CLEMENT
This commit fixes the regression on Armada 370 (the kernal hang during boot) introduced by the commit: "ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead". When coming out of either a Wait for Interrupt (WFI) or a Wait for Event (WFE) IDLE states, a specific timing sensitivity exists between the retiring WFI/WFE instructions and the newly issued subsequent instructions. This sensitivity can result in a CPU hang scenario. The workaround is to insert either a Data Synchronization Barrier (DSB) or Data Memory Barrier (DMB) command immediately after the WFI/WFE instruction. This commit was based on the work of Lior Amsalem, but heavily modified to apply the errata fix dynamically according to the processor type thanks to the suggestions of Russell King and Nicolas Pitre. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Willy Tarreau <w@1wt.eu> Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-24ARM: 7772/1: Fix missing flush_kernel_dcache_page() for noMMUSimon Baatz
Commit 1bc3974 (ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page) moved the implementation of flush_kernel_dcache_page() into mm/flush.c but did not implement it on noMMU ARM. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Acked-by: Kevin Hilman <khilman@linaro.org> Cc: <stable@vger.kernel.org> # 3.2+: 1bc3974: ARM: 7755/1 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-24ARM: 7760/1: cpu_fa526_do_idle: remove WFIJonas Jensen
As it was already suggested by Russell King and Arnd Bergmann: https://lkml.org/lkml/2013/5/16/133 moxart and gemini seem to be the only platforms using CPU_FA526, and instead of pointing arm_pm_idle to an empty function from platform code, it makes sense to remove WFI code from the processor specific idle function. Applies to arm-soc/for-next (and 3.10-rc1). Changes since v1: 1. remove WFI but make sure cpu_fa526_do_idle do not fall through to cpu_fa526_dcache_clean_area Note: moxart boots and prints to UART without this patch, but input is broken. Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-18Merge branch 'for-rmk/lpae' of ↵Russell King
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into devel-stable Conflicts: arch/arm/kernel/smp.c Please pull these miscellaneous LPAE fixes I've been collecting for a while now for 3.11. They've been tested and reviewed by quite a few people, and most of the patches are pretty trivial. -- Will Deacon.
2013-06-18Merge branch 'for-rmk/hugepages' of ↵Russell King
git://git.linaro.org/people/stevecapper/linux into devel-stable These changes bring both HugeTLB support and Transparent HugePage (THP) support to ARM. Only long descriptors (LPAE) are supported in this series. The code has been tested on an Arndale board (Exynos 5250).
2013-06-17ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_pageSimon Baatz
Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages it needs to handle are kernel mapped only. However, for example when doing direct I/O, pages with user space mappings may occur. Thus, continue to do lazy flushing if there are no user space mappings. Otherwise, flush the kernel cache lines directly. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> # 3.2+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-17ARM: 7754/1: Fix the CPU ID and the mask associated to the PJ4BGregory CLEMENT
This commit fixes the ID and mask for the PJ4B which was too restrictive and didn't match the CPU of the Armada 370 SoC. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-17ARM: 7753/1: map_init_section flushes incorrect pmdPo-Yu Chuang
This bug was introduced in commit e651eab0. Some v4/v5 platforms failed to boot due to this. Signed-off-by: Po-Yu Chuang <ratbert.chuang@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-17ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrectJon Medhurst
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR register returns zero when it should return one. This leads to cache maintenance operations which rely on this value to not function as intended, causing data corruption. The workaround for this errata is to detect affected CPUs and correct the LoUIS value read. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-07ARM: mpu: Complete initialisation of the MPU after reaching the C-worldJonathan Austin
Much like with the MMU, MPU initialisation is performed in two stages; the first in the pre-C world and the 'real' initialisation during arch setup. This patch wires in previously added MPU initialisation functions so that the whole of memory is mapped with the appropriate region properties for 'normal' RAM (the appropriate properties depend on whether the system is SMP). Stub initialisation functions are added for the case that there MPU support is not configured in to the kernel. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> CC: Hyok S. Choi <hyok.choi@samsung.com>
2013-06-07ARM: mpu: add MPU probe and initialisation functions in CJonathan Austin
This patch adds new functions for probing and initialising the ARMv7 PMSA-compliant MPU. These use the pre-defined and reserved MPU_PROBE_REGION for establishing properties of the MPU, which is necessary because certain probe operations require modifying region properties and reading back the results. This patch also introduces a minimal sanity_check_meminfo_mpu function, that ensures that the memory set-up passed to the kernel can be used in conjunction with the MPU. The base address of a region must be aligned to the region size, otherwise behavior is unpredictable and region sizes can only be specified as a power-of-two. To simplify the satisfaction of these requirements this implementation currently enforces that all memory is contiguous from PHYS_OFFSET, merging banks that are contiguous but passed in separately. The functions are added in this patch but wired in to the boot process later in the series. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> CC: Hyok S. Choi <hyok.choi@samsung.com>
2013-06-07ARM: add Cortex-R7 Processor InfoJonathan Austin
This patch adds processor info for ARM Ltd. Cortex-R7. The R7 has many similarities to the A9 and though the ACTLR layout is not identical, the bits associated with cache operations broadcasting and SMP modes are the same for A9, A5 and R7 (Though in the A-class processors the same bits toggle TLB-ops broadcasting as well as cache-ops) Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> CC: Catalin Marinas <catalin.marinas@arm.com> CC: Stephen Boyd <sboyd@codeaurora.org>
2013-06-07ARM: select CPU_CPU15_MMU/MPU appropriatelyJonathan Austin
Currently CPU_V7 selects CPU_CP15_MMU, however in the case of a V7 CPU implementing the PMSA, such as the Cortex-R7, the CP15_MMU operations are not available. Selecting CPU_CP15_MPU is appropriate in this case. This patch makes CPU_CP15_MMU dependent on the use of the MMU, selecting CPU_CP15_MPU for v7 processors when !MMU is chosen. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
2013-06-07ARM: suspend: fix CPU suspend code for !CONFIG_MMU configurationsWill Deacon
The ARM CPU suspend code can be selected even for a !CONFIG_MMU configuration. The resulting kernel will not compile and, even if it did, would access undefined co-processor registers when executing. This patch fixes the v6 and v7 CPU suspend code for the nommu case. Signed-off-by: Will Deacon <will.deacon@arm.com> Tested-by: Jonathan Austin <jonathan.austin@arm.com> CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> (commit_signer:1/3=33%) CC: Santosh Shilimkar <santosh.shilimkar@ti.com> (commit_signer:1/3=33%) CC: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
2013-06-05ARM: 7746/1: mm: lazy cache flushing on non-mapped pagesMing Lei
Currently flush_dcache_page() thinks pages as non-mapped if mapping_mapped(mapping) return false. This approach is very coase: - mmap on part of file may cause all pages backed on the file being thought as mmaped - file-backed pages aren't mapped into user space actually if the memory mmaped on the file isn't accessed This patch uses page_mapped() to decide if the page has been mapped. From the attached test code, I find there is much performance improvement(>25%) when accessing page caches via read under this situations, so memcpy benefits a lot from not flushing cache under this situation. No. read time without the patch No. read time with the patch ================================================================ No. 0, time 22615636 us No. 0, time 22014717 us No. 1, time 4387851 us No. 1, time 3113184 us No. 2, time 4276535 us No. 2, time 3005244 us No. 3, time 4259821 us No. 3, time 3001565 us No. 4, time 4263811 us No. 4, time 3002748 us No. 5, time 4258486 us No. 5, time 3004104 us No. 6, time 4253009 us No. 6, time 3002188 us No. 7, time 4262809 us No. 7, time 2998196 us No. 8, time 4264525 us No. 8, time 3007255 us No. 9, time 4267795 us No. 9, time 3005094 us 1), No.0. is to read the file from storage device, and others are to read the file from page caches basically. 2), file size is 512M, and is on ext4 over usb mass storage. 3), the test is done on Pandaboard. unsigned int sum = 0; unsigned long sum_val = 0; static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2) { return (tv2->tv_sec - tv1->tv_sec) * 1000000 + (tv2->tv_usec - tv1->tv_usec); } int main(int argc, char *argv[]) { char *mbuf, fbuf; int fd; int i; unsigned long page_size, size; struct stat stat; struct timeval t1, t2; unsigned char *rbuf = malloc(32 * page_size); if (!rbuf) { printf(" %sn", "malloc failed"); exit(-1); } page_size = getpagesize(); fd = open(argv[1], O_RDWR); assert(fd >= 0); fstat(fd, &stat); size = stat.st_size; printf("%s: file %s, size %lu, page size %lun", argv[0], argv[1], size, page_size); gettimeofday(&t1, NULL); mbuf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); if (!mbuf) { printf(" %sn", "mmap failed"); exit(-1); } for (i = 0 ; i < size ; i += (page_size * 32)) { int rcnt; lseek(fd, i, SEEK_SET); rcnt = read(fd, rbuf, page_size * 32); if (rcnt != page_size * 32) { printf("%s: read faildn", __func__); exit(-1); } } free(rbuf); munmap(mbuf, size); gettimeofday(&t2, NULL); printf("tread mmaped time: %luusn", tv_diff(&t1, &t2)); close(fd); } Cc: Michel Lespinasse <walken@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-04ARM: mm: Transparent huge page support for LPAE systems.Catalin Marinas
The patch adds support for THP (transparent huge pages) to LPAE systems. When this feature is enabled, the kernel tries to map anonymous pages as 2MB sections where possible. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@linaro.org: symbolic constants used, value of PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h, added PROT_NONE support.] Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com>
2013-06-04ARM: mm: HugeTLB support for LPAE systems.Catalin Marinas
This patch adds support for hugetlbfs based on the x86 implementation. It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt for usage). The 64K pages configuration is not supported (section size is 512MB in this case). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@linaro.org: symbolic constants replace numbers in places. Split up into multiple files, to simplify future non-LPAE support, removed huge_pmd_share code, as this is very rarely executed, Added PROT_NONE support]. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com>