summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)Author
2016-01-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "An optimization for irq-restore, the SSM instruction is quite a bit slower than an if-statement and a STOSM. The copy_file_range system all is added. Cleanup for PCI and CIO. And a couple of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: update measurement characteristics s390/cio: ensure consistent measurement state s390/cio: fix measurement characteristics memleak s390/zcrypt: Fix cryptographic device id in kernel messages s390/pci: remove iomap sanity checks s390/pci: set error state for unusable functions s390/pci: fix bar check s390/pci: resize iomap s390/pci: improve ZPCI_* macros s390/pci: provide ZPCI_ADDR macro s390/pci: adjust IOMAP_MAX_ENTRIES s390/numa: move numa_init_late() from device to arch_initcall s390: remove all usages of PSW_ADDR_INSN s390: remove all usages of PSW_ADDR_AMODE s390: wire up copy_file_range syscall s390: remove superfluous memblock_alloc() return value checks s390/numa: allocate memory with correct alignment s390/irqflags: optimize irq restore s390/mm: use TASK_MAX_SIZE where applicable
2016-01-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "s390 and POWER bug fixes, plus enabling the KVM-VFIO interface on s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM doc: Fix KVM_SMI chapter number KVM: s390: fix memory overwrites when vx is disabled KVM: s390: Enable the KVM-VFIO device KVM: s390: fix guest fprs memory leak KVM: PPC: Fix ONE_REG AltiVec support KVM: PPC: Increase memslots to 512 KVM: PPC: Book3S PR: Remove unused variable 'vcpu_book3s' KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8 KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better
2016-01-26KVM: s390: fix memory overwrites when vx is disabledDavid Hildenbrand
The kernel now always uses vector registers when available, however KVM has special logic if support is really enabled for a guest. If support is disabled, guest_fpregs.fregs will only contain memory for the fpu. The kernel, however, will store vector registers into that area, resulting in crazy memory overwrites. Simply extending that area is not enough, because the format of the registers also changes. We would have to do additional conversions, making the code even more complex. Therefore let's directly use one place for the vector/fpu registers + fpc (in kvm_run). We just have to convert the data properly when accessing it. This makes current code much easier. Please note that vector/fpu registers are now always stored to vcpu->run->s.regs.vrs. Although this data is visible to QEMU and used for migration, we only guarantee valid values to user space when KVM_SYNC_VRS is set. As that is only the case when we have vector register support, we are on the safe side. Fixes: b5510d9b68c3 ("s390/fpu: always enable the vector facility if it is available") Cc: stable@vger.kernel.org # v4.4 d9a3a09af54d s390/kvm: remove dependency on struct save_area definition Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [adopt to d9a3a09af54d]
2016-01-26KVM: s390: Enable the KVM-VFIO deviceDong Jia Shi
The KVM-VFIO device is used by the QEMU VFIO device. It is used to record the list of in-use VFIO groups so that KVM can manipulate them. While we don't need this on s390 currently, let's try to be like everyone else. Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-01-26KVM: s390: fix guest fprs memory leakDavid Hildenbrand
fprs is never freed, therefore resulting in a memory leak if kvm_vcpu_init() fails or the vcpu is destroyed. Fixes: 9977e886cbbc ("s390/kernel: lazy restore fpu registers") Cc: stable@vger.kernel.org # v4.3+ Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-01-26s390/pci: remove iomap sanity checksSebastian Ott
Since each iomap_entry handles only one bar of one pci function (even when disjunct ranges of a bar are mapped) the sanity check in pci_iomap_range is not needed and can be removed. Also convert the remaining BUG_ONs to WARN_ONs. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: set error state for unusable functionsSebastian Ott
We receive special notifications from firmware when an error was detected and a pci function became unusable. Set the error_state accordingly to give device drivers a hint that they don't need to try error recovery. Suggested-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: fix bar checkSebastian Ott
Fix the check which bar space we should map to allow available bars only. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: resize iomapSebastian Ott
On s390 we need to maintain a mapping between iomem addresses and arch specific function identifiers. Currently the mapping table is created as such that we could span the whole iomem address space. Since we can only map each bar space from each possible function we have an upper bound for the number of mapping entries. This reduces the size of the iomap from 256K to less than 4K (using the defconfig). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: improve ZPCI_* macrosSebastian Ott
Most of the constants defined in pci_io.h depend on each other and thus can be calculated. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: provide ZPCI_ADDR macroSebastian Ott
Provide and use a ZPCI_ADDR macro as the complement of ZPCI_IDX to get rid of some constants in the code. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/pci: adjust IOMAP_MAX_ENTRIESSebastian Ott
ZPCI_IOMAP_MAX_ENTRIES is off by one. Let's adjust this for the sake of correctness. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-26s390/numa: move numa_init_late() from device to arch_initcallMichael Holzheu
Commit 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular") moves hugetlb_init() from module_init to subsys_initcall. The hugetlb_init()->hugetlb_register_node() code accesses "node->dev.kobj" which is initialized in numa_init_late(). Since numa_init_late() is a device_initcall which is called *after* subsys_initcall the above mentioned patch breaks NUMA on s390. So fix this and move numa_init_late() to arch_initcall. Fixes: 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-22wrappers for ->i_mutex accessAl Viro
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-21dma-mapping: always provide the dma_map_ops based implementationChristoph Hellwig
Move the generic implementation to <linux/dma-mapping.h> now that all architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now that everyone supports them. [valentinrothberg@gmail.com: remove leftovers in Kconfig] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-19s390: remove all usages of PSW_ADDR_INSNHeiko Carstens
Yet another leftover from the 31 bit era. The usual operation "y = x & PSW_ADDR_INSN" with the PSW_ADDR_INSN mask is a nop for CONFIG_64BIT. Therefore remove all usages and hope the code is a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2016-01-19s390: remove all usages of PSW_ADDR_AMODEHeiko Carstens
This is a leftover from the 31 bit area. For CONFIG_64BIT the usual operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all usages of PSW_ADDR_AMODE and make the code a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2016-01-19s390: wire up copy_file_range syscallHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-19s390: remove superfluous memblock_alloc() return value checksHeiko Carstens
memblock_alloc() and memblock_alloc_base() will panic on their own if they can't find free memory. Therefore remove some pointless checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-19s390/numa: allocate memory with correct alignmentHeiko Carstens
Allocating memory with a requested minimum alignment of 1 is wrong since pg_data_t contains a spinlock which requires an alignment of 4 bytes. Therefore fix this and ask for an alignment of 8 bytes like it is guarenteed for all kmalloc requests. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-19s390/irqflags: optimize irq restoreChristian Borntraeger
The ssm instruction takes longer that stnsm/stosm as it is often used to modify DAT and PER. We know that irqsave/irqrestore only deals with external and I/O interrupts and we know that irqrestore can transition only from disabled->disabled or disabled->enabled, so we can use the faster stosm. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-01-19s390/mm: use TASK_MAX_SIZE where applicableDominik Dingel
To improve readability we can use TASK_MAX_SIZE when we just check for the upper limit. All places explicitly dealing with 3 vs 4 level pgtables were left unchanged. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
2016-01-19Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio barrier rework+fixes from Michael Tsirkin: "This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits) checkpatch: add virt barriers checkpatch: check for __smp outside barrier.h checkpatch.pl: add missing memory barriers virtio: make find_vqs() checkpatch.pl-friendly virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak s390: more efficient smp barriers s390: use generic memory barriers xen/events: use virt_xxx barriers xen/io: use virt_xxx barriers xenbus: use virt_xxx barriers virtio_ring: use virt_store_mb sh: move xchg_cmpxchg to a header by itself sh: support 1 and 2 byte xchg virtio_ring: update weak barriers to use virt_xxx Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb" asm-generic: implement virt_xxx memory barriers x86: define __smp_xxx xtensa: define __smp_xxx tile: define __smp_xxx ...
2016-01-16Kconfig: remove HAVE_LATENCYTOP_SUPPORTWill Deacon
As illustrated by commit a3afe70b83fd ("[S390] latencytop s390 support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an implementation of save_stack_trace_tsk. However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk() weak alias") a dummy implementation is provided if STACKTRACE=y. Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Helge Deller <deller@gmx.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16s390/mm: enable fixup_user_fault retryingDominik Dingel
By passing a non-null flag we allow fixup_user_fault to retry, which enables userfaultfd. As during these retries we might drop the mmap_sem we need to check if that happened and redo the complete chain of actions. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16mm: bring in additional flag for fixup_user_fault to signal unlockDominik Dingel
During Jason's work with postcopy migration support for s390 a problem regarding gmap faults was discovered. The gmap code will call fixup_user_fault which will end up always in handle_mm_fault. Till now we never cared about retries, but as the userfaultfd code kind of relies on it. this needs some fix. This patchset does not take care of the futex code. I will now look closer at this. This patch (of 2): With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the faulting we ever unlocked mmap_sem. This patch brings in the logic to handle retries as well as it cleans up the current documentation. fixup_user_fault was not having the same semantics as filemap_fault. It never indicated if a retry happened and so a caller wasn't able to handle that case. So we now changed the behaviour to always retry a locked mmap_sem. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16s390, thp: remove infrastructure for handling splitting PMDsKirill A. Shutemov
With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16mm: drop tail page refcountingKirill A. Shutemov
Tail page refcounting is utterly complicated and painful to support. It uses ->_mapcount on tail pages to store how many times this page is pinned. get_page() bumps ->_mapcount on tail page in addition to ->_count on head. This information is required by split_huge_page() to be able to distribute pins from head of compound page to tails during the split. We will need ->_mapcount to account PTE mappings of subpages of the compound page. We eliminate need in current meaning of ->_mapcount in tail pages by forbidding split entirely if the page is pinned. The only user of tail page refcounting is THP which is marked BROKEN for now. Let's drop all this mess. It makes get_page() and put_page() much simpler. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge first patch-bomb from Andrew Morton: - A few hotfixes which missed 4.4 becasue I was asleep. cc'ed to -stable - A few misc fixes - OCFS2 updates - Part of MM. Including pretty large changes to page-flags handling and to thp management which have been buffered up for 2-3 cycles now. I have a lot of MM material this time. [ It turns out the THP part wasn't quite ready, so that got dropped from this series - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) zsmalloc: reorganize struct size_class to pack 4 bytes hole mm/zbud.c: use list_last_entry() instead of list_tail_entry() zram/zcomp: do not zero out zcomp private pages zram: pass gfp from zcomp frontend to backend zram: try vmalloc() after kmalloc() zram/zcomp: use GFP_NOIO to allocate streams mm: add tracepoint for scanning pages drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64 mm/page_isolation: use macro to judge the alignment mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE() mm: rework virtual memory accounting include/linux/memblock.h: fix ordering of 'flags' argument in comments mm: move lru_to_page to mm_inline.h Documentation/filesystems: describe the shared memory usage/accounting memory-hotplug: don't BUG() in register_memory_resource() hugetlb: make mm and fs code explicitly non-modular mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd() mm: make sure isolate_lru_page() is never called for tail page vmstat: make vmstat_updater deferrable again and shut down on idle ...
2016-01-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - RO/NX attribute fixes for patch module relocations from Josh Poimboeuf. As part of this effort, module.c has been cleaned up as well and livepatching is piggy-backing on this cleanup. Rusty is OK with this whole lot going through livepatching tree. - symbol disambiguation support from Chris J Arges. That series is also Reviewed-by: Miroslav Benes <mbenes@suse.cz> but this came in only after I've alredy pushed out. Didn't want to rebase because of that, hence I am mentioning it here. - symbol lookup fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Cleanup module page permission changes module: keep percpu symbols in module's symtab module: clean up RO/NX handling. module: use a structure to encapsulate layout. gcov: use within_module() helper. module: Use the same logic for setting and unsetting RO/NX livepatch: function,sympos scheme in livepatch sysfs directory livepatch: add sympos as disambiguator field to klp_reloc livepatch: add old_sympos as disambiguator field to klp_func
2016-01-15mm, shmem: add internal shmem resident memory accountingJerome Marchand
Currently looking at /proc/<pid>/status or statm, there is no way to distinguish shmem pages from pages mapped to a regular file (shmem pages are mapped to /dev/zero), even though their implication in actual memory use is quite different. The internal accounting currently counts shmem pages together with regular files. As a preparation to extend the userspace interfaces, this patch adds MM_SHMEMPAGES counter to mm_rss_stat to account for shmem pages separately from MM_FILEPAGES. The next patch will expose it to userspace - this patch doesn't change the exported values yet, by adding up MM_SHMEMPAGES to MM_FILEPAGES at places where MM_FILEPAGES was used before. The only user-visible change after this patch is the OOM killer message that separates the reported "shmem-rss" from "file-rss". [vbabka@suse.cz: forward-porting, tweak changelog] Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14Merge tag 'libnvdimm-for-4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "The bulk of this has appeared in -next and independently received a build success notification from the kbuild robot. The 'for-4.5/block- dax' topic branch was rebased over the weekend to drop the "block device end-of-life" rework that Al would like to see re-implemented with a notifier, and to address bug reports against the badblocks integration. There is pending feedback against "libnvdimm: Add a poison list and export badblocks" received last week. Linda identified some localized fixups that we will handle incrementally. Summary: - Media error handling: The 'badblocks' implementation that originated in md-raid is up-levelled to a generic capability of a block device. This initial implementation is limited to being consulted in the pmem block-i/o path. Later, 'badblocks' will be consulted when creating dax mappings. - Raw block device dax: For virtualization and other cases that want large contiguous mappings of persistent memory, add the capability to dax-mmap a block device directly. - Increased /dev/mem restrictions: Add an option to treat all io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access while a driver is actively using an address range. This behavior is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be overridden by the existing "iomem=relaxed" kernel command line option. - Miscellaneous fixes include a 'pfn'-device huge page alignment fix, block device shutdown crash fix, and other small libnvdimm fixes" * tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits) block: kill disk_{check|set|clear|alloc}_badblocks libnvdimm, pmem: nvdimm_read_bytes() badblocks support pmem, dax: disable dax in the presence of bad blocks pmem: fail io-requests to known bad blocks libnvdimm: convert to statically allocated badblocks libnvdimm: don't fail init for full badblocks list block, badblocks: introduce devm_init_badblocks block: clarify badblocks lifetime badblocks: rename badblocks_free to badblocks_exit libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h libnvdimm: Add a poison list and export badblocks nfit_test: Enable DSMs for all test NFITs md: convert to use the generic badblocks code block: Add badblock management for gendisks badblocks: Add core badblock management code block: fix del_gendisk() vs blkdev_ioctl crash block: enable dax for raw block devices block: introduce bdev_file_inode() restrict /dev/mem to idle io memory ranges arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug ...
2016-01-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "Among the traditional bug fixes and cleanups are some improvements: - A tool to generated the facility lists, generating the bit fields by hand has been a source of bugs in the past - The spinlock loop is reordered to avoid bursts of hypervisor calls - Add support for the open-for-business interface to the service element - The get_cpu call is added to the vdso - A set of tracepoints is defined for the common I/O layer - The deprecated sclp_cpi module is removed - Update default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits) s390/sclp: fix possible control register corruption s390: fix normalization bug in exception table sorting s390/configs: update default configurations s390/vdso: optimize getcpu system call s390: drop smp_mb in vdso_init s390: rename struct _lowcore to struct lowcore s390/mem_detect: use unsigned longs s390/ptrace: get rid of long longs in psw_bits s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK s390/Kconfig: remove pointless 64 bit dependencies s390/dasd: fix failfast for disconnected devices s390/con3270: testing return kzalloc retval s390/hmcdrv: constify hmcdrv_ftp_ops structs s390/cio: add NULL test s390/cio: Change I/O instructions from inline to normal functions s390/cio: Introduce common I/O layer tracepoints s390/cio: Consolidate inline assemblies and related data definitions s390/cio: Fix incorrect xsch opcode specification s390/cio: Remove unused inline assemblies ...
2016-01-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from Davic Miller: 1) Support busy polling generically, for all NAPI drivers. From Eric Dumazet. 2) Add byte/packet counter support to nft_ct, from Floriani Westphal. 3) Add RSS/XPS support to mvneta driver, from Gregory Clement. 4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes Frederic Sowa. 5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai. 6) Add support for VLAN device bridging to mlxsw switch driver, from Ido Schimmel. 7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski. 8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko. 9) Reorganize wireless drivers into per-vendor directories just like we do for ethernet drivers. From Kalle Valo. 10) Provide a way for administrators "destroy" connected sockets via the SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti. 11) Add support to add/remove multicast routes via netlink, from Nikolay Aleksandrov. 12) Make TCP keepalive settings per-namespace, from Nikolay Borisov. 13) Add forwarding and packet duplication facilities to nf_tables, from Pablo Neira Ayuso. 14) Dead route support in MPLS, from Roopa Prabhu. 15) TSO support for thunderx chips, from Sunil Goutham. 16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon. 17) Rationalize, consolidate, and more completely document the checksum offloading facilities in the networking stack. From Tom Herbert. 18) Support aborting an ongoing scan in mac80211/cfg80211, from Vidyullatha Kanchanapally. 19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits) net: bnxt: always return values from _bnxt_get_max_rings net: bpf: reject invalid shifts phonet: properly unshare skbs in phonet_rcv() dwc_eth_qos: Fix dma address for multi-fragment skbs phy: remove an unneeded condition mdio: remove an unneed condition mdio_bus: NULL dereference on allocation error net: Fix typo in netdev_intersect_features net: freescale: mac-fec: Fix build error from phy_device API change net: freescale: ucc_geth: Fix build error from phy_device API change bonding: Prevent IPv6 link local address on enslaved devices IB/mlx5: Add flow steering support net/mlx5_core: Export flow steering API net/mlx5_core: Make ipv4/ipv6 location more clear net/mlx5_core: Enable flow steering support for the IB driver net/mlx5_core: Initialize namespaces only when supported by device net/mlx5_core: Set priority attributes net/mlx5_core: Connect flow tables net/mlx5_core: Introduce modify flow table command net/mlx5_core: Managing root flow table ...
2016-01-13Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of stuff. That probably should've been 5 or 6 separate branches, but by the time I'd realized how large and mixed that bag had become it had been too close to -final to play with rebasing. Some fs/namei.c cleanups there, memdup_user_nul() introduction and switching open-coded instances, burying long-dead code, whack-a-mole of various kinds, several new helpers for ->llseek(), assorted cleanups and fixes from various people, etc. One piece probably deserves special mention - Neil's lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets called without ->i_mutex and tries to avoid ever taking it. That, of course, means that it's not useful for any directory modifications, but things like getting inode attributes in nfds readdirplus are fine with that. I really should've asked for moratorium on lookup-related changes this cycle, but since I hadn't done that early enough... I *am* asking for that for the coming cycle, though - I'm going to try and get conversion of i_mutex to rwsem with ->lookup() done under lock taken shared. There will be a patch closer to the end of the window, along the lines of the one Linus had posted last May - mechanical conversion of ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/ inode_is_locked()/inode_lock_nested(). To quote Linus back then: ----- | This is an automated patch using | | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/' | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/' | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/' | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/' | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/' | | with a very few manual fixups ----- I'm going to send that once the ->i_mutex-affecting stuff in -next gets mostly merged (or when Linus says he's about to stop taking merges)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) nfsd: don't hold i_mutex over userspace upcalls fs:affs:Replace time_t with time64_t fs/9p: use fscache mutex rather than spinlock proc: add a reschedule point in proc_readfd_common() logfs: constify logfs_block_ops structures fcntl: allow to set O_DIRECT flag on pipe fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE fs: xattr: Use kvfree() [s390] page_to_phys() always returns a multiple of PAGE_SIZE nbd: use ->compat_ioctl() fs: use block_device name vsprintf helper lib/vsprintf: add %*pg format specifier fs: use gendisk->disk_name where possible poll: plug an unused argument to do_poll amdkfd: don't open-code memdup_user() cdrom: don't open-code memdup_user() rsxx: don't open-code memdup_user() mtip32xx: don't open-code memdup_user() [um] mconsole: don't open-code memdup_user_nul() [um] hostaudio: don't open-code memdup_user() ...
2016-01-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "PPC changes will come next week. - s390: Support for runtime instrumentation within guests, support of 248 VCPUs. - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM identifiers. Performance counter virtualization missed the boat. - x86: Support for more Hyper-V features (synthetic interrupt controller), MMU cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits) kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL kvm/x86: Hyper-V SynIC timers tracepoints kvm/x86: Hyper-V SynIC tracepoints kvm/x86: Update SynIC timers on guest entry only kvm/x86: Skip SynIC vector check for QEMU side kvm/x86: Hyper-V fix SynIC timer disabling condition kvm/x86: Reorg stimer_expiration() to better control timer restart kvm/x86: Hyper-V unify stimer_start() and stimer_restart() kvm/x86: Drop stimer_stop() function kvm/x86: Hyper-V timers fix incorrect logical operation KVM: move architecture-dependent requests to arch/ KVM: renumber vcpu->request bits KVM: document which architecture uses each request bit KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock KVM: s390: implement the RI support of guest kvm/s390: drop unpaired smp_mb kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table() arm/arm64: KVM: Detect vGIC presence at runtime ...
2016-01-12s390: more efficient smp barriersMichael S. Tsirkin
As per: lkml.kernel.org/r/20150921112252.3c2937e1@mschwide atomics imply a barrier on s390, so s390 should change smp_mb__before_atomic and smp_mb__after_atomic to barrier() instead of smp_mb() and hence should not use the generic versions. Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12s390: use generic memory barriersMichael S. Tsirkin
The s390 kernel is SMP to 99.99%, we just didn't bother with a non-smp variant for the memory-barriers. If the generic header is used we'd get the non-smp version for free. It will save a small amount of text space for CONFIG_SMP=n. Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12s390: define __smp_xxxMichael S. Tsirkin
This defines __smp_xxx barriers for s390, for use by virtualization. Some smp_xxx barriers are removed as they are defined correctly by asm-generic/barriers.h Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers unconditionally on this architecture. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12s390: reuse asm-generic/barrier.hMichael S. Tsirkin
On s390 read_barrier_depends, smp_read_barrier_depends smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the asm-generic variants exactly. Drop the local definitions and pull in asm-generic/barrier.h instead. This is in preparation to refactoring this code area. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12lcoking/barriers, arch: Use smp barriers in smp_store_release()Davidlohr Bueso
With commit b92b8b35a2e ("locking/arch: Rename set_mb() to smp_store_mb()") it was made clear that the context of this call (and thus set_mb) is strictly for CPU ordering, as opposed to IO. As such all archs should use the smp variant of mb(), respecting the semantics and saving a mandatory barrier on UP. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-01-11Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "So we have a laundry list of locking subsystem changes: - continuing barrier API and code improvements - futex enhancements - atomics API improvements - pvqspinlock enhancements: in particular lock stealing and adaptive spinning - qspinlock micro-enhancements" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op futex: Cleanup the goto confusion in requeue_pi() futex: Remove pointless put_pi_state calls in requeue() futex: Document pi_state refcounting in requeue code futex: Rename free_pi_state() to put_pi_state() futex: Drop refcount if requeue_pi() acquired the rtmutex locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation lcoking/barriers, arch: Use smp barriers in smp_store_release() locking/cmpxchg, arch: Remove tas() definitions locking/pvqspinlock: Queue node adaptive spinning locking/pvqspinlock: Allow limited lock stealing locking/pvqspinlock: Collect slowpath lock statistics sched/core, locking: Document Program-Order guarantees locking, sched: Introduce smp_cond_acquire() and use it locking/pvqspinlock, x86: Optimize the PV unlock code path locking/qspinlock: Avoid redundant read of next pointer locking/qspinlock: Prefetch the next node cacheline locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg() atomics: Add test for atomic operations with _relaxed variants
2016-01-11s390: fix normalization bug in exception table sortingArd Biesheuvel
The normalization pass in the sorting routine of the relative exception table serves two purposes: - it ensures that the address fields of the exception table entries are fully ordered, so that no ambiguities arise between entries with identical instruction offsets (i.e., when two instructions that are exactly 8 bytes apart each have an exception table entry associated with them) - it ensures that the offsets of both the instruction and the fixup fields of each entry are relative to their final location after sorting. Commit eb608fb366de ("s390/exceptions: switch to relative exception table entries") ported the relative exception table format from x86, but modified the sorting routine to only normalize the instruction offset field and not the fixup offset field. The result is that the fixup offset of each entry will be relative to the original location of the entry before sorting, likely leading to crashes when those entries are dereferenced. Fixes: eb608fb366de ("s390/exceptions: switch to relative exception table entries") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390/configs: update default configurationsHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390/vdso: optimize getcpu system callMartin Schwidefsky
Add the CPU number to the per-cpu vdso data page and add the __kernel_getcpu function to the vdso object to retrieve the CPU number in user space. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390: drop smp_mb in vdso_initMichael S. Tsirkin
The initial s390 vdso code is heavily influenced by the powerpc version which does have a smp_wmb in vdso_init right before the vdso_ready=1 assignment. s390 has no need for that. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <1452010645-25380-1-git-send-email-mst@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390: rename struct _lowcore to struct lowcoreHeiko Carstens
Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390/mem_detect: use unsigned longsHeiko Carstens
The memory detection code historically had to use unsigned long long since the machine reported the true memory size (>4GB) even if the virtual machine was running in ESA/390 mode. Since the old code is gone use unsigned long everywhere and also get rid of an unused ADDR2G define. (this patch converts all long longs within sclp_info to longs) There are many more possible conversions, however that can be done if somebody touches the corresponding code. Since people started to convert unrelated long types to long longs because of the types within struct sclp_info convert this now. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390/ptrace: get rid of long longs in psw_bitsHeiko Carstens
The long longs were introduced by me in order to have a working definition of the struct psw_bits also in 31 bit mode. Since that is gone also get rid of the long longs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-11s390/sysinfo: add missing SYSIB 1.2.2 multithreading fieldsHeiko Carstens
Add missing multithreading fields of SYSIB 1.2.2 (Basic-Machine CPUs) to the output of /proc/sysinfo. Also use bitfields for SYSIB 2.2.2 to simplify the C code a bit. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>