summaryrefslogtreecommitdiff
path: root/arch/arc/mm
AgeCommit message (Collapse)Author
2016-08-10ARC: Elide redundant setup of DMA callbacksVineet Gupta
For resources shared by all cores such as SLC and IOC, only the master core needs to do any setups / enabling / disabling etc. Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02treewide: replace obsolete _refok by __refFabian Frederick
There was only one use of __initdata_refok and __exit_refok __init_refok was used 46 times against 82 for __ref. Those definitions are obsolete since commit 312b1485fb50 ("Introduce new section reference annotations tags: __ref, __refdata, __refconst") This patch removes the following compatibility definitions and replaces them treewide. /* compatibility defines */ #define __init_refok __ref #define __initdata_refok __refdata #define __exit_refok __ref I can also provide separate patches if necessary. (One patch per tree and check in 1 month or 2 to remove old definitions) [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29Merge tag 'arc-4.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC updates from Vineet Gupta: "Things have been calm here - nothing much except for a few fixes" * tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: mm: don't loose PTE_SPECIAL in pte_modify() ARC: dma: fix address translation in arc_dma_free ARC: typo fix in mm/ioremap.c ARC: fix linux-next build breakage
2016-07-26mm: do not pass mm_struct into handle_mm_faultKirill A. Shutemov
We always have vma->vm_mm around. Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-20ARC: dma: fix address translation in arc_dma_freeVladimir Kondratiev
page should be calculated using physical address. If platform uses non-trivial dma-to-phys memory translation, dma_handle should be converted to physicval address before calculation of page. Failing to do so results in struct page * pointing to wrong or non-existent memory. Fixes: f2e3d55397ff ("ARC: dma: reintroduce platform specific dma<->phys") Cc: stable@vger.kernel.org #4.6+ Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-07-19ARC: typo fix in mm/ioremap.cAlexey Brodkin
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-30Fix typosAndrea Gelmini
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-09ARC: [plat-eznps] Use dedicated user stack topNoam Camus
NPS use special mapping right below TASK_SIZE. Hence we need to lower STACK_TOP so that user stack won't overlap NPS special mapping. Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-09ARC: Make vmalloc size configurableNoam Camus
On ARC, lower 2G of address space is translated and used for - user vaddr space (region 0 to 5) - unused kernel-user gutter (region 6) - kernel vaddr space (region 7) where each region simply represents 256MB of address space. The kernel vaddr space of 256MB is used to implement vmalloc, modules So far this was enough, but not on EZChip system with 4K CPUs (given that per cpu mechanism uses vmalloc for allocating chunks) So allow VMALLOC_SIZE to be configurable by expanding down into the unused kernel-user gutter region which at default 256M was excessive anyways. Also use _BITUL() to fix a build error since PGDIR_SIZE cannot use "1UL" as called from assembly code in mm/tlbex.S Signed-off-by: Noam Camus <noamc@ezchip.com> [vgupta: rewrote changelog, debugged bootup crash due to int vs. hex] Acked-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-05ARC: support HIGHMEM even without PAE40Vineet Gupta
Initial HIGHMEM support on ARC was introduced for PAE40 where the low memory (0x8000_0000 based) and high memory (0x1_0000_0000) were physically contiguous. So CONFIG_FLATMEM sufficed (despite a peipheral hole in the middle, which wasted a bit of struct page memory, but things worked). However w/o PAE, highmem was not possible and we could only reach ~1.75GB of DDR. Now there is a use case to access ~4GB of DDR w/o PAE40 The idea is to have low memory at canonical 0x8000_0000 and highmem at 0 so enire 4GB address space is available for physical addressing This needs additional platform/interconnect mapping to convert the non contiguous physical addresses into linear bus adresses. From Linux point of view, non contiguous divide means FLATMEM no longer works and DISCONTIGMEM is needed to track the pfns in the 2 regions. This scheme would also work for PAE40, only better in that we don't waste struct page memory for the peripheral hole. The DT description will be something like memory { ... reg = <0x80000000 0x200000000 /* 512MB: lowmem */ 0x00000000 0x10000000>; /* 256MB: highmem */ } Signed-off-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-04-27ARC: add support for reserved memory defined by device treeAlexey Brodkin
Enable reserved memory initialization from device tree. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-19ARCv2: ioremap: Support dynamic peripheral address spaceVineet Gupta
The peripheral address space is architectural address window which is uncached and typically used to wire up peripherals. For ARC700 cores (ARCompact ISA based) this was fixed to 1GB region 0xC000_0000 - 0xFFFF_FFFF. For ARCv2 based HS38 cores the start address is flexible and can be 0xC, 0xD, 0xE, 0xF 000_000 by programming AUX_NON_VOLATILE_LIMIT reg (typically done in bootloader) Further in cas of PAE, the physical address can extend beyond 4GB so need to confine this check, otherwise all pages beyond 4GB will be treated as uncached Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: reintroduce platform specific dma<->physVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: ioremap: use phys_addr_t consistenctly in code pathsVineet Gupta
To support dma in physical memory beyond 4GB with PAE40 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: pass_phys() not sg_virt() to cache opsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: non-coherent pages need V-P mapping if in HIGHMEMVineet Gupta
Previously a non-coherent page (hardware IOC or simply driver needs) could be handled by cpu with paddr alone (kvaddr used to be needed for coherent mappings to enforce uncached semantics via a MMU mapping). Now however such a page might still require a V-P mapping if it was in physical address space > 32bits due to PAE40, which the CPU can't access directly with a paddr So decouple decision of kvaddr allocation from type of alloc request (coh/non-coh) Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: Use struct page based page allocator helpersVineet Gupta
vs. the ones which reutne void *, so that we can handle pages > 4GB in subsequent patches Also plug a potential page leak in case ioremap fails Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11ARC: Fix misspellings in comments.Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-01-21arc: convert to dma_map_opsChristoph Hellwig
[vgupta@synopsys.com: ARC: dma mapping fixes #2] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Carlos Palminha <CARLOS.PALMINHA@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16mm: differentiate page_mapped() from page_mapcount() for compound pagesKirill A. Shutemov
Let's define page_mapped() to be true for compound pages if any sub-pages of the compound page is mapped (with PMD or PTE). On other hand page_mapcount() return mapcount for this particular small page. This will make cases like page_get_anon_vma() behave correctly once we allow huge pages to be mapped with PTE. Most users outside core-mm should use page_mapcount() instead of page_mapped(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-21ARC: mm: HIGHMEM: Fix section mismatch splatVineet Gupta
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function | .init.text:__alloc_bootmem_low() The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low(). This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASEVineet Gupta
HIGHMEM support bumped the default memory size for nsim platform to 1G. Thus total memory ended at the very edge of start of peripherals address space. With linux link base shifted, memory started bleeding into peripheral space which caused early boot bad_page spew ! Fixes: 29e332261d2 ("ARC: mm: HIGHMEM: populate high memory from DT") Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16ARC: comments updateVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14ARC: use ASL assembler mnemonicVineet Gupta
ARCompact and ARCv2 only have ASL, while binutils used to support LSL as a alias mnemonic. Newer binutils (upstream) don't want to do that so replace it. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-29ARC: mm: PAE40 supportVineet Gupta
This is the first working implementation of 40-bit physical address extension on ARCv2. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_tVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: PAE40: switch to using phys_addr_t for physical addressesVineet Gupta
That way a single flip of phys_addr_t to 64 bit ensures all places dealing with physical addresses get correct data Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: HIGHMEM: populate high memory from DTVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: HIGHMEM: kmap API implementationVineet Gupta
Implement kmap* API for ARC. This enables - permanent kernel maps (pkmaps): :kmap() API - fixmap : kmap_atomic() We use a very simple/uniform approach for both (unlike some of the other arches). So fixmap doesn't use the customary compile time address stuff. The important semantic is sleep'ability (pkmap) vs. not (fixmap) which the API guarantees. Note that this patch only enables highmem for subsequent PAE40 support as there is no real highmem for ARC in pure 32-bit paradigm as explained below. ARC has 2:2 address split of the 32-bit address space with lower half being translated (virtual) while upper half unstranslated (0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say DDR 0x0 by external Bus Glue logic (outside the core). So kernel can potentially access 1.75G worth of memory directly w/o need for highmem. (the top 256M is taken by uncached peripheral space from 0xF000_0000 to 0xFFFF_FFFF) In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while the logical/virtual addresses remain 32-bits. Thus highmem is required for kernel proper to be able to access these pages for it's own purposes (user space is agnostic to this anyways). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: preps ahead of HIGHMEM support #2Vineet Gupta
Explicit'ify that all memory added so far is low memory Nothing semantical Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: preps ahead of HIGHMEM supportVineet Gupta
Before we plug in highmem support, some of code needs to be ready for it - copy_user_highpage() needs to be using the kmap_atomic API - mk_pte() can't assume page_address() - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: Improve Duplicate PD Fault handlerVineet Gupta
- Move the verbosity knob from .data to .bss by using inverted logic - No need to readout PD1 descriptor - clip the non pfn bits of PD0 to avoid clipping inside the loop Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: Ensure DT mem base is same as what kernel is built withVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARC: boot log: decode more mmu config itemsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARC: boot log: move helper macros to header for reuseVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARC: mm: compute TLB size as needed from ways * setsVineet Gupta
This frees up some bits to hold more high level info such as PAE being present, w/o increasing the size of already bloated cpuinfo struct Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARCv2: mm: THP: flush_pmd_tlb_range make SMP safeVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARCv2: mm: THP: Implement flush_pmd_tlb_range() optimizationVineet Gupta
Implement the TLB flush routine to evict a sepcific Super TLB entry, vs. moving to a new ASID on every such flush. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARCv2: mm: THP: boot validation/reportingVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARCv2: mm: THP supportVineet Gupta
MMUv4 in HS38x cores supports Super Pages which are basis for Linux THP support. Normal and Super pages can co-exist (ofcourse not overlap) in TLB with a new bit "SZ" in TLB page desciptor to distinguish between them. Super Page size is configurable in hardware (4K to 16M), but fixed once RTL builds. The exact THP size a Linx configuration will support is a function of: - MMU page size (typical 8K, RTL fixed) - software page walker address split between PGD:PTE:PFN (typical 11:8:13, but can be changed with 1 line) So for above default, THP size supported is 8K * 256 = 2M Default Page Walker is 2 levels, PGD:PTE:PFN, which in THP regime reduces to 1 level (as PTE is folded into PGD and canonically referred to as PMD). Thus thp PMD accessors are implemented in terms of PTE (just like sparc) Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09ARC: mm: pte flags comsetic cleanups, commentsVineet Gupta
No semantical changes Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-21ARC: Eliminate some ARCv2 specific code for ARCompact buildVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: IOC: Allow boot time disableAlexey Brodkin
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: SLC: Allow boot time disableVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: Support IO Coherency and permutations involving L1 and L2 cachesAlexey Brodkin
In case of ARCv2 CPU there're could be following configurations that affect cache handling for data exchanged with peripherals via DMA: [1] Only L1 cache exists [2] Both L1 and L2 exist, but no IO coherency unit [3] L1, L2 caches and IO coherency unit exist Current implementation takes care of [1] and [2]. Moreover support of [2] is implemented with run-time check for SLC existence which is not super optimal. This patch introduces support of [3] and rework of DMA ops usage. Instead of doing run-time check every time a particular DMA op is executed we'll have 3 different implementations of DMA ops and select appropriate one during init. As for IOC support for it we need: [a] Implement empty DMA ops because IOC takes care of cache coherency with DMAed data [b] Route dma_alloc_coherent() via dma_alloc_noncoherent() This is required to make IOC work in first place and also serves as optimization as LD/ST to coherent buffers can be srviced from caches w/o going all the way to memory Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> [vgupta: -Added some comments about IOC gains -Marked dma ops as static, -Massaged changelog a bit] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06ARC: Don't memzero twice in dma_alloc_coherent for __GFP_ZEROVineet Gupta
alloc_pages_exact() get gfp flags and handle zero'ing already And while it, fix the case where ioremap fails: return rightaway. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06ARCv2: guard SLC DMA ops with spinlockAlexey Brodkin
SLC maintenance ops need to be serialized by software as there is no inherent buffering / quequing of aux commands. It can silently ignore a new aux operation if previous one is still ongoing (SLC_CTRL_BUSY) So gaurd the SLC op using a spin lock The spin lock doesn't seem to be contended even in heavy workloads such as iperf. On FPGA @ 75 MHz. [1] Before this change: ============================================================ # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.110 port 38935 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 48.4 MBytes 40.6 Mbits/sec ============================================================ [2] After this change: ============================================================ # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.243 port 60248 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 47.5 MBytes 39.8 Mbits/sec # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.243 port 60249 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 54.9 MBytes 46.0 Mbits/sec ============================================================ Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: arc-linux-dev@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-01Merge tag 'arc-4.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC architecture updates from Vineet Gupta: - support for HS38 cores based on ARCv2 ISA ARCv2 is the next generation ISA from Synopsys and basis for the HS3{4,6,8} families of processors which retain the traditional ARC mantra of low power and configurability and are now more performant and feature rich. HS38x is a 10 stage pipeline core which supports MMU (with huge pages) and SMP (upto 4 cores) among other features. + www.synopsys.com/dw/ipdir.php?ds=arc-hs38-processor + http://news.synopsys.com/2014-10-14-New-DesignWare-ARC-HS38-Processor-Doubles-Performance-for-Embedded-Linux-Applications + http://www.embedded.com/electronics-news/4435975/Synopsys-ARC-HS38-core-gives-2X-boost-to-Linux-based-apps - support for ARC SDP (Software Development platform): Main Board + CPU Cards = AXS101: CPU Card with ARC700 in silicon @ 700 MHz = AXS103: CPU Card with HS38x in FPGA - refactoring of ARCompact port to accomodate new ARCv2 ISA - misc updates/cleanups * tag 'arc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (72 commits) ARC: Fix build failures for ARCompact in linux-next after ARCv2 support ARCv2: Allow older gcc to cope with new regime of ARCv2/ARCompact support ARCv2: [vdk] dts files and defconfig for HS38 VDK ARCv2: [axs103] Support ARC SDP FPGA platform for HS38x cores ARC: [axs101] Prepare for AXS103 ARCv2: [nsim*hs*] Support simulation platforms for HS38x cores ARCv2: All bits in place, allow ARCv2 builds ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency) ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock ARC: Reduce bitops lines of code using macros ARCv2: barriers arch: conditionally define smp_{mb,rmb,wmb} ARC: add smp barriers around atomics per Documentation/atomic_ops.txt ARC: add compiler barrier to LLSC based cmpxchg ARCv2: SMP: intc: IDU 2nd level intc for dynamic IRQ distribution ARCv2: SMP: clocksource: Enable Global Real Time counter ARCv2: SMP: ARConnect debug/robustness ARCv2: SMP: Support ARConnect (MCIP) for Inter-Core-Interrupts et al ARC: make plat_smp_ops weak to allow over-rides ARCv2: clocksource: Introduce 64bit local RTC counter ...