From 9962eea9e55f797f05f20ba6448929cab2a9f018 Mon Sep 17 00:00:00 2001 From: Minfei Huang Date: Sun, 12 Jul 2015 20:18:42 +0800 Subject: x86/mm: Initialize pmd_idx in page_table_range_init_count() The variable pmd_idx is not initialized for the first iteration of the for loop. Assign the proper value which indexes the start address. Fixes: 719272c45b82 'x86, mm: only call early_ioremap_page_table_range_init() once' Signed-off-by: Minfei Huang Cc: tony.luck@intel.com Cc: wangnan0@huawei.com Cc: david.vrabel@citrix.com Reviewed-by: yinghai@kernel.org Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com Signed-off-by: Thomas Gleixner diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index c8140e1..c23ab1e 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -137,6 +137,7 @@ page_table_range_init_count(unsigned long start, unsigned long end) vaddr = start; pgd_idx = pgd_index(vaddr); + pmd_idx = pmd_index(vaddr); for ( ; (pgd_idx < PTRS_PER_PGD) && (vaddr != end); pgd_idx++) { for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end); -- cgit v0.10.2 From 8c7ea50c010b2f1e006ad37c43f98202a31de2cb Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Thu, 9 Jul 2015 17:28:16 -0700 Subject: x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default We currently have no safe way of currently defining architecture agnostic IOMMU ioremap_*() variants. The trend is for folks to *assume* that ioremap_nocache() should be the default everywhere and then add this mapping on each architectures -- this is not correct today for a variety of reasons. We have two options: 1) Sit and wait for every architecture in Linux to get a an ioremap_*() variant defined before including it upstream. 2) Gather consensus on a safe architecture agnostic ioremap_*() default. Approach 1) introduces development latencies, and since 2) will take time and work on clarifying semantics the only remaining sensible thing to do to avoid issues is returning NULL on ioremap_*() variants. In order for this to work we must have all architectures declare their own ioremap_*() variants as defined. This will take some work, do this for ioremp_uc() to set the example as its only currently implemented on x86. Document all this. We only provide implementation support for ioremap_uc() as the other ioremap_*() variants are well defined all over the kernel for other architectures already. Signed-off-by: Luis R. Rodriguez Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: arnd@arndb.de Cc: benh@kernel.crashing.org Cc: bp@suse.de Cc: dan.j.williams@intel.com Cc: geert@linux-m68k.org Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: linux-mm@kvack.org Cc: luto@amacapital.net Cc: mpe@ellerman.id.au Cc: mst@redhat.com Cc: ralf@linux-mips.org Cc: ross.zwisler@linux.intel.com Cc: stefan.bader@canonical.com Cc: tj@kernel.org Cc: tomi.valkeinen@ti.com Cc: toshi.kani@hp.com Link: http://lkml.kernel.org/r/1436488096-3165-1-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 83ec9b1..de25aad 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -180,6 +180,8 @@ static inline unsigned int isa_virt_to_bus(volatile void *address) */ extern void __iomem *ioremap_nocache(resource_size_t offset, unsigned long size); extern void __iomem *ioremap_uc(resource_size_t offset, unsigned long size); +#define ioremap_uc ioremap_uc + extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size); extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val); diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h index f56094c..eed3bbe 100644 --- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -736,6 +736,35 @@ static inline void *phys_to_virt(unsigned long address) } #endif +/** + * DOC: ioremap() and ioremap_*() variants + * + * If you have an IOMMU your architecture is expected to have both ioremap() + * and iounmap() implemented otherwise the asm-generic helpers will provide a + * direct mapping. + * + * There are ioremap_*() call variants, if you have no IOMMU we naturally will + * default to direct mapping for all of them, you can override these defaults. + * If you have an IOMMU you are highly encouraged to provide your own + * ioremap variant implementation as there currently is no safe architecture + * agnostic default. To avoid possible improper behaviour default asm-generic + * ioremap_*() variants all return NULL when an IOMMU is available. If you've + * defined your own ioremap_*() variant you must then declare your own + * ioremap_*() variant as defined to itself to avoid the default NULL return. + */ + +#ifdef CONFIG_MMU + +#ifndef ioremap_uc +#define ioremap_uc ioremap_uc +static inline void __iomem *ioremap_uc(phys_addr_t offset, size_t size) +{ + return NULL; +} +#endif + +#else /* !CONFIG_MMU */ + /* * Change "struct page" to physical address. * @@ -743,7 +772,6 @@ static inline void *phys_to_virt(unsigned long address) * you'll need to provide your own definitions. */ -#ifndef CONFIG_MMU #ifndef ioremap #define ioremap ioremap static inline void __iomem *ioremap(phys_addr_t offset, size_t size) -- cgit v0.10.2 From eacd2d542610e55cad0be445966ac8ae79124c6e Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Thu, 9 Jul 2015 18:24:56 -0700 Subject: drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper The size of the framebuffer to be used needs to be fudged to account for the different type of devices that are out there. This captures what is required to do well, we'll reuse this later. This has no functional changes. Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Linus Torvalds Cc: Mathias Krause Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Tomi Valkeinen Cc: airlied@redhat.com Cc: arnd@arndb.de Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: geert@linux-m68k.org Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: linux-fbdev@vger.kernel.org Cc: linux-mm@kvack.org Cc: luto@amacapital.net Cc: mpe@ellerman.id.au Cc: mst@redhat.com Cc: ralf@linux-mips.org Cc: ross.zwisler@linux.intel.com Cc: stefan.bader@canonical.com Cc: syrjala@sci.fi Cc: tj@kernel.org Cc: toshi.kani@hp.com Cc: ville.syrjala@linux.intel.com Link: http://lkml.kernel.org/r/1435251019-32421-1-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1436491499-3289-2-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8789e48..16936bb 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -427,6 +427,20 @@ static struct { #endif /* CONFIG_FB_ATY_CT */ }; +/* + * Last page of 8 MB (4 MB on ISA) aperture is MMIO, + * unless the auxiliary register aperture is used. + */ +static void aty_fudge_framebuffer_len(struct fb_info *info) +{ + struct atyfb_par *par = (struct atyfb_par *) info->par; + + if (!par->aux_start && + (info->fix.smem_len == 0x800000 || + (par->bus_type == ISA && info->fix.smem_len == 0x400000))) + info->fix.smem_len -= GUI_RESERVE; +} + static int correct_chipset(struct atyfb_par *par) { u8 rev; @@ -2603,14 +2617,7 @@ static int aty_init(struct fb_info *info) if (par->pll_ops->resume_pll) par->pll_ops->resume_pll(info, &par->pll); - /* - * Last page of 8 MB (4 MB on ISA) aperture is MMIO, - * unless the auxiliary register aperture is used. - */ - if (!par->aux_start && - (info->fix.smem_len == 0x800000 || - (par->bus_type == ISA && info->fix.smem_len == 0x400000))) - info->fix.smem_len -= GUI_RESERVE; + aty_fudge_framebuffer_len(info); /* * Disable register access through the linear aperture -- cgit v0.10.2 From f55de6ec375da89f89f1a76e1b998e5f14878c06 Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Thu, 9 Jul 2015 18:24:57 -0700 Subject: drivers/video/fbdev/atyfb: Clarify ioremap() base and length used MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adjust the ioremap() call for the framebuffer to use the same values we later use for the framebuffer. This will make it easier to review the next change. The size of the framebuffer varies but since this is for PCI we *know* this defaults to 0x800000. atyfb_setup_generic() is *only* used on PCI probe. No functional change. Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andrzej Hajda Cc: Andy Lutomirski Cc: Antonino Daplas Cc: Daniel Vetter Cc: Dave Airlie Cc: Davidlohr Bueso Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Juergen Gross Cc: Linus Torvalds Cc: Mathias Krause Cc: Mel Gorman Cc: Peter Zijlstra Cc: Rob Clark Cc: Suresh Siddha Cc: Thomas Gleixner Cc: Tomi Valkeinen Cc: Toshi Kani Cc: Ville Syrjälä Cc: Vlastimil Babka Cc: arnd@arndb.de Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: geert@linux-m68k.org Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: linux-fbdev@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-pci@vger.kernel.org Cc: mpe@ellerman.id.au Cc: mst@redhat.com Cc: ralf@linux-mips.org Cc: ross.zwisler@linux.intel.com Cc: stefan.bader@canonical.com Cc: tj@kernel.org Cc: ville.syrjala@linux.intel.com Link: http://lkml.kernel.org/r/1436491499-3289-3-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 16936bb..de8f7e0 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -3489,7 +3489,21 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, /* Map in frame buffer */ info->fix.smem_start = addr; - info->screen_base = ioremap(addr, 0x800000); + + /* + * The framebuffer is not always 8 MiB, that's just the size of the + * PCI BAR. We temporarily abuse smem_len here to store the size + * of the BAR. aty_init() will later correct it to match the actual + * framebuffer size. + * + * On devices that don't have the auxiliary register aperture, the + * registers are housed at the top end of the framebuffer PCI BAR. + * aty_fudge_framebuffer_len() is used to reduce smem_len to not + * overlap with the registers. + */ + info->fix.smem_len = 0x800000; + + info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len); if (info->screen_base == NULL) { ret = -ENOMEM; goto atyfb_setup_generic_fail; -- cgit v0.10.2 From 3cc2dac5be3f23414a4efdee0b26d79bed297cac Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Thu, 9 Jul 2015 18:24:58 -0700 Subject: drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace a WC MTRR call followed by a UC MTRR "hole" call with a single WC MTRR call and use strong UC to protect the MMIO region and account for the device's architecture and MTRR size requirements. The atyfb driver relies on two overlapping MTRRs. It does this to account for the fact that, on some devices, it has the MMIO region bundled together with the framebuffer on the same PCI BAR and the hardware requirement on MTRRs on both base and size to be powers of two. In the worst case, the PCI BAR is of 16 MiB while the MMIO region is on the last 4 KiB of the same PCI BAR. If we use just one MTRR for WC, we can only end up with an 8 MiB or 16 MiB framebuffer. Using a 16 MiB WC framebuffer area is unacceptable since we need the MMIO region to not be write-combined. An 8 MiB WC framebuffer option does not let use quite a bit of framebuffer space, it would reduce the resolution capability of the device considerably. An alternative is to use many MTRRs but on some systems that could mean not having enough MTRRs to cover the framebuffer. The current solution is to issue a 16 MiB WC MTRR followed by a 4 KiB UC MTRR on the last 4 KiB. Its worth mentioning and documenting that the current ioremap*() strategy as well: the first ioremap() is used only for the MMIO region, a second ioremap() call is used for the framebuffer *and* the MMIO region, the MMIO region then ends up mmapped twice. Two ioremap() calls are used since in some situations the framebuffer actually ends up on a separate auxiliary PCI BAR, but this is not always true. In the worst case, the PCI BAR is shared for both MMIO and the framebuffer. By allowing overlapping ioremap() calls, the driver enables two types of devices with one simple ioremap() strategy. See also: 2f9e897353fc ("x86/mm/mtrr, pat: Document Write Combining MTRR type effects on PAT / non-PAT pages") By default, Linux today defaults both pci_mmap_page_range() and ioremap_nocache() to use _PAGE_CACHE_MODE_UC_MINUS. On x86, ioremap() aliases ioremap_nocache(). The preferred value for Linux may soon change, however, the goal is to use _PAGE_CACHE_MODE_UC by default in the future. We can use ioremap_uc() to set PCD=1, PWT=1 on non-PAT systems and use a PAT value of UC for PAT systems. This will ensure the same settings are in place regardless of what Linux decides to use by default later and to not regress our MTRR strategy since the effective memory type will differ depending on the value used. Using a WC MTRR on such an area will be nullified. This technique can be used to protect the MMIO region in this driver's case and address the restrictions of the device's architecture as well as restrictions set upon us by powers of 2 when using MTRRs. This allows us to replace the two MTRR calls with a single 16 MiB WC MTRR and use page-attribute settings for non-PAT and PAT entry values for PAT systems to ensure the appropriate effective memory type won't have a write-combining effect on the MMIO region on both non-PAT and PAT systems. The framebuffer area will be sure to get the write-combined effective memory type by white-listing it with ioremap_wc(). We ensure the desired effective memory types are set by: 0) Using one ioremap_uc() for the MMIO region alone. This will set the page attribute settings for the MMIO region to PCD=1, PWT=1 for non-PAT systems while using a strong UC value on PAT systems. 1) Fixing the framebuffer ioremapped area to exclude the MMIO region and using ioremap_wc() instead to whitelist the area we want for write-combining. In both cases, an implementation defined (as per 2f9e897353fc) effective memory type of WC is used for the framebuffer for non-PAT systems. Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andrzej Hajda Cc: Andy Lutomirski Cc: Antonino Daplas Cc: Daniel Vetter Cc: Dave Airlie Cc: Davidlohr Bueso Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Juergen Gross Cc: Linus Torvalds Cc: Mathias Krause Cc: Mel Gorman Cc: Peter Zijlstra Cc: Rob Clark Cc: Suresh Siddha Cc: Thomas Gleixner Cc: Tomi Valkeinen Cc: Toshi Kani Cc: Ville Syrjälä Cc: Vlastimil Babka Cc: arnd@arndb.de Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: geert@linux-m68k.org Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: linux-fbdev@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-pci@vger.kernel.org Cc: mpe@ellerman.id.au Cc: mst@redhat.com Cc: ralf@linux-mips.org Cc: ross.zwisler@linux.intel.com Cc: stefan.bader@canonical.com Cc: tj@kernel.org Cc: ville.syrjala@linux.intel.com Link: http://lkml.kernel.org/r/1435196060-27350-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1436491499-3289-4-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h index 1f39a62..89ec439 100644 --- a/drivers/video/fbdev/aty/atyfb.h +++ b/drivers/video/fbdev/aty/atyfb.h @@ -184,7 +184,6 @@ struct atyfb_par { spinlock_t int_lock; #ifdef CONFIG_MTRR int mtrr_aper; - int mtrr_reg; #endif u32 mem_cntl; struct crtc saved_crtc; diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index de8f7e0..7770a84 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,13 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par->mtrr_aper = -1; - par->mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ + /* + * Only the ioremap_wc()'d area will get WC here + * since ioremap_uc() was used on the entire PCI BAR. + */ par->mtrr_aper = mtrr_add(par->res_start, par->res_size, MTRR_TYPE_WRCOMB, 1); - if (par->mtrr_aper >= 0 && !par->aux_start) { - /* Make a hole for mmio. */ - par->mtrr_reg = mtrr_add(par->res_start + 0x800000 - - GUI_RESERVE, GUI_RESERVE, - MTRR_TYPE_UNCACHABLE, 1); - if (par->mtrr_reg < 0) { - mtrr_del(par->mtrr_aper, 0, 0); - par->mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2768,6 @@ aty_init_exit: par->pll_ops->set_pll(info, &par->saved_pll); #ifdef CONFIG_MTRR - if (par->mtrr_reg >= 0) { - mtrr_del(par->mtrr_reg, 0, 0); - par->mtrr_reg = -1; - } if (par->mtrr_aper >= 0) { mtrr_del(par->mtrr_aper, 0, 0); par->mtrr_aper = -1; @@ -3466,7 +3454,11 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info->fix.mmio_start = raddr; - par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000); + /* + * By using strong UC we force the MTRR to never have an + * effect on the MMIO region on both non-PAT and PAT systems. + */ + par->ati_regbase = ioremap_uc(info->fix.mmio_start, 0x1000); if (par->ati_regbase == NULL) return -ENOMEM; @@ -3503,7 +3495,10 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, */ info->fix.smem_len = 0x800000; - info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len); + aty_fudge_framebuffer_len(info); + + info->screen_base = ioremap_wc(info->fix.smem_start, + info->fix.smem_len); if (info->screen_base == NULL) { ret = -ENOMEM; goto atyfb_setup_generic_fail; @@ -3575,6 +3570,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev, return -ENOMEM; } par = info->par; + par->bus_type = PCI; info->fix = atyfb_fix; info->device = &pdev->dev; par->pci_id = pdev->device; @@ -3744,10 +3740,6 @@ static void atyfb_remove(struct fb_info *info) #endif #ifdef CONFIG_MTRR - if (par->mtrr_reg >= 0) { - mtrr_del(par->mtrr_reg, 0, 0); - par->mtrr_reg = -1; - } if (par->mtrr_aper >= 0) { mtrr_del(par->mtrr_aper, 0, 0); par->mtrr_aper = -1; -- cgit v0.10.2 From 7d89a3cb159aecb1b363ea50cb14c967ff83b5a6 Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Thu, 9 Jul 2015 18:24:59 -0700 Subject: drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This driver uses strong UC for the MMIO region, and ioremap_wc() for the framebuffer to whitelist for the WC MTRR that can be changed to WC. On PAT systems we don't need the MTRR call so just use arch_phys_wc_add() there, this lets us remove all those ifdefs. Let's also be consistent and use ioremap_wc() for ATARI as well. There are a few motivations for this: a) Take advantage of PAT when available. b) Help bury MTRR code away, MTRR is architecture specific and on x86 it is being replaced by PAT. c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()"). The conversion done is expressed by the following Coccinelle SmPL patch, it additionally required manual intervention to address all the ifdeffery and removal of redundant things which arch_phys_wc_add() already addresses such as verbose message about when MTRR fails and doing nothing when we didn't get an MTRR: @ mtrr_found @ expression index, base, size; @@ -index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1); +index = arch_phys_wc_add(base, size); @ mtrr_rm depends on mtrr_found @ expression mtrr_found.index, mtrr_found.base, mtrr_found.size; @@ -mtrr_del(index, base, size); +arch_phys_wc_del(index); @ mtrr_rm_zero_arg depends on mtrr_found @ expression mtrr_found.index; @@ -mtrr_del(index, 0, 0); +arch_phys_wc_del(index); @ mtrr_rm_fb_info depends on mtrr_found @ struct fb_info *info; expression mtrr_found.index; @@ -mtrr_del(index, info->fix.smem_start, info->fix.smem_len); +arch_phys_wc_del(index); @ ioremap_replace_nocache depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info->screen_base = ioremap_nocache(base, size); +info->screen_base = ioremap_wc(base, size); @ ioremap_replace_default depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info->screen_base = ioremap(base, size); +info->screen_base = ioremap_wc(base, size); Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andrzej Hajda Cc: Andy Lutomirski Cc: Antonino Daplas Cc: Daniel Vetter Cc: Dave Airlie Cc: Davidlohr Bueso Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Juergen Gross Cc: Linus Torvalds Cc: Mathias Krause Cc: Mel Gorman Cc: Peter Zijlstra Cc: Rob Clark Cc: Suresh Siddha Cc: Thomas Gleixner Cc: Tomi Valkeinen Cc: Toshi Kani Cc: Ville Syrjälä Cc: Vlastimil Babka Cc: arnd@arndb.de Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: geert@linux-m68k.org Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: linux-fbdev@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-pci@vger.kernel.org Cc: mpe@ellerman.id.au Cc: mst@redhat.com Cc: ralf@linux-mips.org Cc: ross.zwisler@linux.intel.com Cc: stefan.bader@canonical.com Cc: tj@kernel.org Cc: ville.syrjala@linux.intel.com Link: http://lkml.kernel.org/r/1436491499-3289-5-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h index 89ec439..63c4842 100644 --- a/drivers/video/fbdev/aty/atyfb.h +++ b/drivers/video/fbdev/aty/atyfb.h @@ -182,9 +182,7 @@ struct atyfb_par { unsigned long irq_flags; unsigned int irq; spinlock_t int_lock; -#ifdef CONFIG_MTRR - int mtrr_aper; -#endif + int wc_cookie; u32 mem_cntl; struct crtc saved_crtc; union aty_pll saved_pll; diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 7770a84..f34ed47f 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -98,9 +98,6 @@ #ifdef CONFIG_PMAC_BACKLIGHT #include #endif -#ifdef CONFIG_MTRR -#include -#endif /* * Debug flags. @@ -303,9 +300,7 @@ static struct fb_ops atyfb_ops = { }; static bool noaccel; -#ifdef CONFIG_MTRR static bool nomtrr; -#endif static int vram; static int pll; static int mclk; @@ -2628,17 +2623,13 @@ static int aty_init(struct fb_info *info) aty_st_le32(BUS_CNTL, aty_ld_le32(BUS_CNTL, par) | BUS_APER_REG_DIS, par); -#ifdef CONFIG_MTRR - par->mtrr_aper = -1; - if (!nomtrr) { + if (!nomtrr) /* * Only the ioremap_wc()'d area will get WC here * since ioremap_uc() was used on the entire PCI BAR. */ - par->mtrr_aper = mtrr_add(par->res_start, par->res_size, - MTRR_TYPE_WRCOMB, 1); - } -#endif + par->wc_cookie = arch_phys_wc_add(par->res_start, + par->res_size); info->fbops = &atyfb_ops; info->pseudo_palette = par->pseudo_palette; @@ -2766,13 +2757,8 @@ aty_init_exit: /* restore video mode */ aty_set_crtc(par, &par->saved_crtc); par->pll_ops->set_pll(info, &par->saved_pll); + arch_phys_wc_del(par->wc_cookie); -#ifdef CONFIG_MTRR - if (par->mtrr_aper >= 0) { - mtrr_del(par->mtrr_aper, 0, 0); - par->mtrr_aper = -1; - } -#endif return ret; } @@ -3672,7 +3658,8 @@ static int __init atyfb_atari_probe(void) * Map the video memory (physical address given) * to somewhere in the kernel address space. */ - info->screen_base = ioremap(phys_vmembase[m64_num], phys_size[m64_num]); + info->screen_base = ioremap_wc(phys_vmembase[m64_num], + phys_size[m64_num]); info->fix.smem_start = (unsigned long)info->screen_base; /* Fake! */ par->ati_regbase = ioremap(phys_guiregbase[m64_num], 0x10000) + 0xFC00ul; @@ -3738,13 +3725,8 @@ static void atyfb_remove(struct fb_info *info) if (M64_HAS(MOBIL_BUS)) aty_bl_exit(info->bl_dev); #endif + arch_phys_wc_del(par->wc_cookie); -#ifdef CONFIG_MTRR - if (par->mtrr_aper >= 0) { - mtrr_del(par->mtrr_aper, 0, 0); - par->mtrr_aper = -1; - } -#endif #ifndef __sparc__ if (par->ati_regbase) iounmap(par->ati_regbase); @@ -3860,10 +3842,8 @@ static int __init atyfb_setup(char *options) while ((this_opt = strsep(&options, ",")) != NULL) { if (!strncmp(this_opt, "noaccel", 7)) { noaccel = 1; -#ifdef CONFIG_MTRR } else if (!strncmp(this_opt, "nomtrr", 6)) { nomtrr = 1; -#endif } else if (!strncmp(this_opt, "vram:", 5)) vram = simple_strtoul(this_opt + 5, NULL, 0); else if (!strncmp(this_opt, "pll:", 4)) @@ -4033,7 +4013,5 @@ module_param(comp_sync, int, 0); MODULE_PARM_DESC(comp_sync, "Set composite sync signal to low (0) or high (1)"); module_param(mode, charp, 0); MODULE_PARM_DESC(mode, "Specify resolution as \"x[-][@]\" "); -#ifdef CONFIG_MTRR module_param(nomtrr, bool, 0); MODULE_PARM_DESC(nomtrr, "bool: disable use of MTRR registers"); -#endif -- cgit v0.10.2 From 4c73e8926623ca0f64c2c6111289ab8096fa647a Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Tue, 28 Jul 2015 20:17:13 +0200 Subject: arch/*/io.h: Add ioremap_uc() to all architectures This adds ioremap_uc() only for architectures that do not include asm-generic.h/io.h as that already provides a default definition for them for both cases where you have CONFIG_MMU and you do not, and because of this, the number of architectures this patch address is less than the architectures that the ioremap_wt() patch addressed, "arch/*/io.h: Add ioremap_wt() to all architectures"). In order to reduce the number of architectures we have to modify by adding new architecture IO APIs we'll have to review the architectures in this patch, see why they can't add asm-generic.h/io.h or issues that would be created by doing so and then spread a consistent inclusion of this header towards the end of their own header. For instance arch/metag includes the asm-generic/io.h *before* the ioremap*() definitions, this should be the other way around but only once we have guard wrappers for the non-MMU case also for asm-generic/io.h. Reported-by: Stephen Rothwell Signed-off-by: Luis R. Rodriguez Cc: Abhilash Kesavan Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Chris Metcalf Cc: David Howells Cc: Fengguang Wu Cc: Geert Uytterhoeven Cc: Greg Kroah-Hartman Cc: Greg Ungerer Cc: Guenter Roeck Cc: Haavard Skinnemoen Cc: Hans-Christian Egtvedt Cc: Koichi Yasutake Cc: Kyle McMartin Cc: Linus Torvalds Cc: Michael Ellerman Cc: Paul Mackerras Cc: Peter Hurley Cc: Peter Zijlstra Cc: Rob Herring Cc: Thomas Gleixner Cc: Toshi Kani Cc: Will Deacon Cc: linux-am33-list@redhat.com Cc: linux-arch@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-sh@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20150728181713.GB30479@wotan.suse.de Signed-off-by: Ingo Molnar diff --git a/arch/avr32/include/asm/io.h b/arch/avr32/include/asm/io.h index e998ff5..f855646 100644 --- a/arch/avr32/include/asm/io.h +++ b/arch/avr32/include/asm/io.h @@ -297,6 +297,7 @@ extern void __iounmap(void __iomem *addr); #define ioremap_wc ioremap_nocache #define ioremap_wt ioremap_nocache +#define ioremap_uc ioremap_nocache #define cached(addr) P1SEGADDR(addr) #define uncached(addr) P2SEGADDR(addr) diff --git a/arch/frv/include/asm/io.h b/arch/frv/include/asm/io.h index a31b63e..70dfbea 100644 --- a/arch/frv/include/asm/io.h +++ b/arch/frv/include/asm/io.h @@ -278,6 +278,7 @@ static inline void __iomem *ioremap_fullcache(unsigned long physaddr, unsigned l } #define ioremap_wc ioremap_nocache +#define ioremap_uc ioremap_nocache extern void iounmap(void volatile __iomem *addr); diff --git a/arch/m32r/include/asm/io.h b/arch/m32r/include/asm/io.h index 0c3f25e..4c4982a 100644 --- a/arch/m32r/include/asm/io.h +++ b/arch/m32r/include/asm/io.h @@ -69,6 +69,7 @@ extern void iounmap(volatile void __iomem *addr); #define ioremap_nocache(off,size) ioremap(off,size) #define ioremap_wc ioremap_nocache #define ioremap_wt ioremap_nocache +#define ioremap_uc ioremap_nocache /* * IO bus memory addresses are also 1:1 with the physical address diff --git a/arch/m68k/include/asm/io_mm.h b/arch/m68k/include/asm/io_mm.h index 618c85d3..a0fce34 100644 --- a/arch/m68k/include/asm/io_mm.h +++ b/arch/m68k/include/asm/io_mm.h @@ -467,6 +467,7 @@ static inline void __iomem *ioremap_nocache(unsigned long physaddr, unsigned lon { return __ioremap(physaddr, size, IOMAP_NOCACHE_SER); } +#define ioremap_uc ioremap_nocache static inline void __iomem *ioremap_wt(unsigned long physaddr, unsigned long size) { diff --git a/arch/mn10300/include/asm/io.h b/arch/mn10300/include/asm/io.h index 07c5b4a..6218935 100644 --- a/arch/mn10300/include/asm/io.h +++ b/arch/mn10300/include/asm/io.h @@ -283,6 +283,7 @@ static inline void __iomem *ioremap_nocache(unsigned long offset, unsigned long #define ioremap_wc ioremap_nocache #define ioremap_wt ioremap_nocache +#define ioremap_uc ioremap_nocache static inline void iounmap(void __iomem *addr) { diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h index a8d2ef3..5879fde 100644 --- a/arch/powerpc/include/asm/io.h +++ b/arch/powerpc/include/asm/io.h @@ -721,6 +721,7 @@ extern void __iomem *ioremap_prot(phys_addr_t address, unsigned long size, unsigned long flags); extern void __iomem *ioremap_wc(phys_addr_t address, unsigned long size); #define ioremap_nocache(addr, size) ioremap((addr), (size)) +#define ioremap_uc(addr, size) ioremap((addr), (size)) extern void iounmap(volatile void __iomem *addr); diff --git a/arch/sh/include/asm/io.h b/arch/sh/include/asm/io.h index 728c4c5..93ec906 100644 --- a/arch/sh/include/asm/io.h +++ b/arch/sh/include/asm/io.h @@ -368,6 +368,7 @@ static inline int iounmap_fixed(void __iomem *addr) { return -EINVAL; } #endif #define ioremap_nocache ioremap +#define ioremap_uc ioremap #define iounmap __iounmap /* diff --git a/arch/tile/include/asm/io.h b/arch/tile/include/asm/io.h index dc61de1..322b5fe 100644 --- a/arch/tile/include/asm/io.h +++ b/arch/tile/include/asm/io.h @@ -55,6 +55,7 @@ extern void iounmap(volatile void __iomem *addr); #define ioremap_nocache(physaddr, size) ioremap(physaddr, size) #define ioremap_wc(physaddr, size) ioremap(physaddr, size) #define ioremap_wt(physaddr, size) ioremap(physaddr, size) +#define ioremap_uc(physaddr, size) ioremap(physaddr, size) #define ioremap_fullcache(physaddr, size) ioremap(physaddr, size) #define mmiowb() -- cgit v0.10.2 From d5dc861bd601b546ae6b36af54485142cca36a5e Mon Sep 17 00:00:00 2001 From: Toshi Kani Date: Wed, 22 Jul 2015 12:06:11 -0600 Subject: x86/mm/pat: Add comments to cachemode translation tables Add comments to the cachemode translation tables to clarify that the default values are set as minimal supported mode, which are necessary to handle WC and WT fallback to UC- when they are not enabled. Signed-off-by: Toshi Kani Cc: Jan Beulich Cc: Borislav Petkov Cc: Ingo Molnar Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1437588371-28223-1-git-send-email-toshi.kani@hp.com Signed-off-by: Thomas Gleixner diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 8533b46..1d8a83d 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -30,8 +30,11 @@ /* * Tables translating between page_cache_type_t and pte encoding. * - * Minimal supported modes are defined statically, they are modified - * during bootup if more supported cache modes are available. + * The default values are defined statically as minimal supported mode; + * WC and WT fall back to UC-. pat_init() updates these values to support + * more cache modes, WC and WT, when it is safe to do so. See pat_init() + * for the details. Note, __early_ioremap() used during early boot-time + * takes pgprot_t (pte encoding) and does not use these tables. * * Index into __cachemode2pte_tbl[] is the cachemode. * -- cgit v0.10.2 From 8f45fe441a9836011672436ef007654969b28ccc Mon Sep 17 00:00:00 2001 From: Paul Gortmaker Date: Mon, 24 Aug 2015 19:34:54 -0400 Subject: x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular The file pageattr.c is obj-y and it includes pageattr-test.c based on CPA_DEBUG (a bool), meaning that no code here is currently being built as a module by anyone. Lets remove the couple traces of modularity so that when reading the code there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Signed-off-by: Paul Gortmaker Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1440459295-21814-3-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar diff --git a/arch/x86/mm/pageattr-test.c b/arch/x86/mm/pageattr-test.c index 8ff686a..5f169d5 100644 --- a/arch/x86/mm/pageattr-test.c +++ b/arch/x86/mm/pageattr-test.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -256,5 +257,4 @@ static int start_pageattr_test(void) return 0; } - -module_init(start_pageattr_test); +device_initcall(start_pageattr_test); diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index 727158c..2c44c07 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -4,7 +4,6 @@ */ #include #include -#include #include #include #include -- cgit v0.10.2 From 13fe86f465b72fc9328d4f5ebc33223c011852ae Mon Sep 17 00:00:00 2001 From: Paul Gortmaker Date: Mon, 24 Aug 2015 19:34:55 -0400 Subject: x86/mm: Make kernel/check.c explicitly non-modular The Kconfig currently controlling compilation of this code is: arch/x86/Kconfig:config X86_CHECK_BIOS_CORRUPTION arch/x86/Kconfig: bool "Check for low memory corruption" ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modularity so that when reading the code there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Signed-off-by: Paul Gortmaker Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1440459295-21814-4-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar diff --git a/arch/x86/kernel/check.c b/arch/x86/kernel/check.c index 83a7995..f40cfd1 100644 --- a/arch/x86/kernel/check.c +++ b/arch/x86/kernel/check.c @@ -1,4 +1,4 @@ -#include +#include #include #include #include @@ -162,6 +162,5 @@ static int start_periodic_check_for_corruption(void) schedule_delayed_work(&bios_check_work, 0); return 0; } - -module_init(start_periodic_check_for_corruption); +device_initcall(start_periodic_check_for_corruption); -- cgit v0.10.2 From c43996f4001de629af4a4d6713782e883677e5b9 Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Mon, 24 Aug 2015 12:13:23 -0700 Subject: PCI: Add pci_ioremap_wc_bar() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This lets drivers take advantage of PAT when available. It should help with the transition of converting video drivers over to ioremap_wc() to help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache(), see: de33c442ed2a ("x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()") Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Acked-by: Arnd Bergmann Cc: Cc: Andrew Morton Cc: Andy Lutomirski Cc: Antonino Daplas Cc: Bjorn Helgaas Cc: Daniel Vetter Cc: Dave Airlie Cc: Davidlohr Bueso Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Juergen Gross Cc: Linus Torvalds Cc: Mel Gorman Cc: Peter Zijlstra Cc: Suresh Siddha Cc: Thomas Gleixner Cc: Tomi Valkeinen Cc: Toshi Kani Cc: Ville Syrjälä Cc: Vlastimil Babka Cc: airlied@linux.ie Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: mst@redhat.com Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1440443613-13696-2-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 0008c95..fdae37b 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -138,6 +138,20 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) return ioremap_nocache(res->start, resource_size(res)); } EXPORT_SYMBOL_GPL(pci_ioremap_bar); + +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar) +{ + /* + * Make sure the BAR is actually a memory resource, not an IO resource + */ + if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) { + WARN_ON(1); + return NULL; + } + return ioremap_wc(pci_resource_start(pdev, bar), + pci_resource_len(pdev, bar)); +} +EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar); #endif #define PCI_FIND_CAP_TTL 48 diff --git a/include/linux/pci.h b/include/linux/pci.h index 8a0321a..985dd39 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1661,6 +1661,7 @@ static inline void pci_mmcfg_late_init(void) { } int pci_ext_cfg_avail(void); void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar); +void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar); #ifdef CONFIG_PCI_IOV int pci_iov_virtfn_bus(struct pci_dev *dev, int id); -- cgit v0.10.2 From c112709809b2ca1e8ff2841a1958503e548b81e4 Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Mon, 24 Aug 2015 12:13:24 -0700 Subject: drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar() Convert the driver from using the x86-specific MTRR code to the architecture-agnostic arch_phys_wc_add(). It will avoid MTRR if write-combining is available, in order to take advantage of that also ensure the ioremapped area is requested as write-combining. There are a few motivations for this: a) Take advantage of PAT when available b) Help bury MTRR code away, MTRR is architecture-specific and on x86 it is being replaced by PAT. c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()") The conversion done is expressed by the following Coccinelle SmPL patch, it additionally required manual intervention to address all the ifdeffery and removal of redundant things which arch_phys_wc_add() already addresses such as verbose message about when MTRR fails and doing nothing when we didn't get an MTRR. @ mtrr_found @ expression index, base, size; @@ -index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1); +index = arch_phys_wc_add(base, size); @ mtrr_rm depends on mtrr_found @ expression mtrr_found.index, mtrr_found.base, mtrr_found.size; @@ -mtrr_del(index, base, size); +arch_phys_wc_del(index); @ mtrr_rm_zero_arg depends on mtrr_found @ expression mtrr_found.index; @@ -mtrr_del(index, 0, 0); +arch_phys_wc_del(index); @ mtrr_rm_fb_info depends on mtrr_found @ struct fb_info *info; expression mtrr_found.index; @@ -mtrr_del(index, info->fix.smem_start, info->fix.smem_len); +arch_phys_wc_del(index); @ ioremap_replace_nocache depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info->screen_base = ioremap_nocache(base, size); +info->screen_base = ioremap_wc(base, size); @ ioremap_replace_default depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info->screen_base = ioremap(base, size); +info->screen_base = ioremap_wc(base, size); Signed-off-by: Luis R. Rodriguez Signed-off-by: Borislav Petkov Acked-by: Tomi Valkeinen Cc: Andrew Morton Cc: Andy Lutomirski Cc: Antonino Daplas Cc: Arnd Bergmann Cc: Benoit Taine Cc: Bjorn Helgaas Cc: Daniel Vetter Cc: Dave Airlie Cc: Geert Uytterhoeven Cc: H. Peter Anvin Cc: Jean-Christophe Plagniol-Villard Cc: Jingoo Han Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rob Clark Cc: Suresh Siddha Cc: Thomas Gleixner Cc: airlied@linux.ie Cc: benh@kernel.crashing.org Cc: dan.j.williams@intel.com Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: mst@redhat.com Cc: toshi.kani@hp.com Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1440443613-13696-3-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar diff --git a/drivers/video/fbdev/i740fb.c b/drivers/video/fbdev/i740fb.c index a2b4204..452e116 100644 --- a/drivers/video/fbdev/i740fb.c +++ b/drivers/video/fbdev/i740fb.c @@ -27,24 +27,15 @@ #include #include