summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)Author
2017-11-15s390/topology: make "topology=off" parameter workHeiko Carstens
[ Upstream commit 68cc795d1933285705ced6d841ef66c00ce98cbe ] The "topology=off" kernel parameter is supposed to prevent the kernel to use hardware topology information to generate scheduling domains etc. For an unknown reason I implemented this in a very odd way back then: instead of simply clearing the MACHINE_HAS_TOPOLOGY flag within the lowcore I added a second variable which indicated that topology information should not be used. This is more than suboptimal since it partially doesn't work. For the fake NUMA case topology information is still considered and scheduling domains will be created based on this. To fix this and to simplify the code get rid of the extra variable and implement the "topology=off" case like it is done for other features. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-08s390/crypto: Extend key length check for AES-XTS in fips mode.Harald Freudenberger
[ Upstream commit a4f2779ecf2f42b0997fedef6fd20a931c40a3e3 ] In fips mode only xts keys with 128 bit or 125 bit are allowed. This fix extends the xts_aes_set_key function to check for these valid key lengths in fips mode. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-08s390/prng: Adjust generation of entropy to produce real 256 bits.Harald Freudenberger
[ Upstream commit d34b1acb78af41b8b8d5c60972b6555ea19f7564 ] The generate_entropy function used a sha256 for compacting together 256 bits of entropy into 32 bytes hash. However, it is questionable if a sha256 can really be used here, as potential collisions may reduce the max entropy fitting into a 32 byte hash value. So this batch introduces the use of sha512 instead and the required buffer adjustments for the calling functions. Further more the working buffer for the generate_entropy function has been widened from one page to two pages. So now 1024 stckf invocations are used to gather 256 bits of entropy. This has been done to be on the save side if the jitters of stckf values isn't as good as supposed. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-08s390/mm: make pmdp_invalidate() do invalidation onlyGerald Schaefer
commit 91c575b335766effa6103eba42a82aea560c365f upstream. Commit 227be799c39a ("s390/mm: uninline pmdp_xxx functions from pgtable.h") inadvertently changed the behavior of pmdp_invalidate(), so that it now clears the pmd instead of just marking it as invalid. Fix this by restoring the original behavior. A possible impact of the misbehaving pmdp_invalidate() would be the MADV_DONTNEED races (see commits ced10803 and 58ceeb6b), although we should not have any negative impact on the related dirty/young flags, since those flags are not set by the hardware on s390. Fixes: 227be799c39a ("s390/mm: uninline pmdp_xxx functions from pgtable.h") Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-05s390/mm: fix write access check in gup_huge_pmd()Gerald Schaefer
commit ba385c0594e723d41790ecfb12c610e6f90c7785 upstream. The check for the _SEGMENT_ENTRY_PROTECT bit in gup_huge_pmd() is the wrong way around. It must not be set for write==1, and not be checked for write==0. Fix this similar to how it was fixed for ptes long time ago in commit 25591b070336 ("[S390] fix get_user_pages_fast"). One impact of this bug would be unnecessarily using the gup slow path for write==0 on r/w mappings. A potentially more severe impact would be that gup_huge_pmd() will succeed for write==1 on r/o mappings. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27s390/mm: fix race on mm->context.flush_mmMartin Schwidefsky
commit 60f07c8ec5fae06c23e9fd7bab67dabce92b3414 upstream. The order in __tlb_flush_mm_lazy is to flush TLB first and then clear the mm->context.flush_mm bit. This can lead to missed flushes as the bit can be set anytime, the order needs to be the other way aronud. But this leads to a different race, __tlb_flush_mm_lazy may be called on two CPUs concurrently. If mm->context.flush_mm is cleared first then another CPU can bypass __tlb_flush_mm_lazy although the first CPU has not done the flush yet. In a virtualized environment the time until the flush is finally completed can be arbitrarily long. Add a spinlock to serialize __tlb_flush_mm_lazy and use the function in finish_arch_post_lock_switch as well. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27s390/mm: fix local TLB flushing vs. detach of an mm address spaceMartin Schwidefsky
commit b3e5dc45fd1ec2aa1de6b80008f9295eb17e0659 upstream. The local TLB flushing code keeps an additional mask in the mm.context, the cpu_attach_mask. At the time a global flush of an address space is done the cpu_attach_mask is copied to the mm_cpumask in order to avoid future global flushes in case the mm is used by a single CPU only after the flush. Trouble is that the reset of the mm_cpumask is racy against the detach of an mm address space by switch_mm. The current order is first the global TLB flush and then the copy of the cpu_attach_mask to the mm_cpumask. The order needs to be the other way around. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-09s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangsChristian Borntraeger
commit fa41ba0d08de7c975c3e94d0067553f9b934221f upstream. Right now there is a potential hang situation for postcopy migrations, if the guest is enabling storage keys on the target system during the postcopy process. For storage key virtualization, we have to forbid the empty zero page as the storage key is a property of the physical page frame. As we enable storage key handling lazily we then drop all mappings for empty zero pages for lazy refaulting later on. This does not work with the postcopy migration, which relies on the empty zero page never triggering a fault again in the future. The reason is that postcopy migration will simply read a page on the target system if that page is a known zero page to fault in an empty zero page. At the same time postcopy remembers that this page was already transferred - so any future userfault on that page will NOT be retransmitted again to avoid races. If now the guest enters the storage key mode while in postcopy, we will break this assumption of postcopy. The solution is to disable the empty zero page for KVM guests early on and not during storage key enablement. With this change, the postcopy migration process is guaranteed to start after no zero pages are left. As guest pages are very likely not empty zero pages anyway the memory overhead is also pretty small. While at it this also adds proper page table locking to the zero page removal. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-30KVM: s390: sthyi: fix specification exception detectionHeiko Carstens
commit 857b8de96795646c5891cf44ae6fb19b9ff74bf9 upstream. sthyi should only generate a specification exception if the function code is zero and the response buffer is not on a 4k boundary. The current code would also test for unknown function codes if the response buffer, that is currently only defined for function code 0, is not on a 4k boundary and incorrectly inject a specification exception instead of returning with condition code 3 and return code 4 (unsupported function code). Fix this by moving the boundary check. Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation") Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-30KVM: s390: sthyi: fix sthyi inline assemblyHeiko Carstens
commit 4a4eefcd0e49f9f339933324c1bde431186a0a7d upstream. The sthyi inline assembly misses register r3 within the clobber list. The sthyi instruction will always write a return code to register "R2+1", which in this case would be r3. Due to that we may have register corruption and see host crashes or data corruption depending on how gcc decided to allocate and use registers during compile time. Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation") Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-13bpf, s390: fix jit branch offset related to ldimm64Daniel Borkmann
[ Upstream commit b0a0c2566f28e71e5e32121992ac8060cec75510 ] While testing some other work that required JIT modifications, I run into test_bpf causing a hang when JIT enabled on s390. The problematic test case was the one from ddc665a4bb4b (bpf, arm64: fix jit branch offset related to ldimm64), and turns out that we do have a similar issue on s390 as well. In bpf_jit_prog() we update next instruction address after returning from bpf_jit_insn() with an insn_count. bpf_jit_insn() returns either -1 in case of error (e.g. unsupported insn), 1 or 2. The latter is only the case for ldimm64 due to spanning 2 insns, however, next address is only set to i + 1 not taking actual insn_count into account, thus fix is to use insn_count instead of 1. bpf_jit_enable in mode 2 provides also disasm on s390: Before fix: 000003ff800349b6: a7f40003 brc 15,3ff800349bc ; target 000003ff800349ba: 0000 unknown 000003ff800349bc: e3b0f0700024 stg %r11,112(%r15) 000003ff800349c2: e3e0f0880024 stg %r14,136(%r15) 000003ff800349c8: 0db0 basr %r11,%r0 000003ff800349ca: c0ef00000000 llilf %r14,0 000003ff800349d0: e320b0360004 lg %r2,54(%r11) 000003ff800349d6: e330b03e0004 lg %r3,62(%r11) 000003ff800349dc: ec23ffeda065 clgrj %r2,%r3,10,3ff800349b6 ; jmp 000003ff800349e2: e3e0b0460004 lg %r14,70(%r11) 000003ff800349e8: e3e0b04e0004 lg %r14,78(%r11) 000003ff800349ee: b904002e lgr %r2,%r14 000003ff800349f2: e3b0f0700004 lg %r11,112(%r15) 000003ff800349f8: e3e0f0880004 lg %r14,136(%r15) 000003ff800349fe: 07fe bcr 15,%r14 After fix: 000003ff80ef3db4: a7f40003 brc 15,3ff80ef3dba 000003ff80ef3db8: 0000 unknown 000003ff80ef3dba: e3b0f0700024 stg %r11,112(%r15) 000003ff80ef3dc0: e3e0f0880024 stg %r14,136(%r15) 000003ff80ef3dc6: 0db0 basr %r11,%r0 000003ff80ef3dc8: c0ef00000000 llilf %r14,0 000003ff80ef3dce: e320b0360004 lg %r2,54(%r11) 000003ff80ef3dd4: e330b03e0004 lg %r3,62(%r11) 000003ff80ef3dda: ec230006a065 clgrj %r2,%r3,10,3ff80ef3de6 ; jmp 000003ff80ef3de0: e3e0b0460004 lg %r14,70(%r11) 000003ff80ef3de6: e3e0b04e0004 lg %r14,78(%r11) ; target 000003ff80ef3dec: b904002e lgr %r2,%r14 000003ff80ef3df0: e3b0f0700004 lg %r11,112(%r15) 000003ff80ef3df6: e3e0f0880004 lg %r14,136(%r15) 000003ff80ef3dfc: 07fe bcr 15,%r14 test_bpf.ko suite runs fine after the fix. Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27s390/syscalls: Fix out of bounds arguments accessJiri Olsa
commit c46fc0424ced3fb71208e72bd597d91b9169a781 upstream. Zorro reported following crash while having enabled syscall tracing (CONFIG_FTRACE_SYSCALLS): Unable to handle kernel pointer dereference at virtual ... Oops: 0011 [#1] SMP DEBUG_PAGEALLOC SNIP Call Trace: ([<000000000024d79c>] ftrace_syscall_enter+0xec/0x1d8) [<00000000001099c6>] do_syscall_trace_enter+0x236/0x2f8 [<0000000000730f1c>] sysc_tracesys+0x1a/0x32 [<000003fffcf946a2>] 0x3fffcf946a2 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000000000022dd44>] rb_event_data+0x34/0x40 ---[ end trace 8c795f86b1b3f7b9 ]--- The crash happens in syscall_get_arguments function for syscalls with zero arguments, that will try to access first argument (args[0]) in event entry, but it's not allocated. Bail out of there are no arguments. Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21s390: reduce ELF_ET_DYN_BASEKees Cook
commit a73dc5370e153ac63718d850bddf0c9aa9d871e6 upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For s390 the position could be 0x10000, but that is needlessly close to the NULL address. Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05s390/ctl_reg: make __ctl_load a full memory barrierHeiko Carstens
[ Upstream commit e991c24d68b8c0ba297eeb7af80b1e398e98c33f ] We have quite a lot of code that depends on the order of the __ctl_load inline assemby and subsequent memory accesses, like e.g. disabling lowcore protection and the writing to lowcore. Since the __ctl_load macro does not have memory barrier semantics, nor any other dependencies the compiler is, theoretically, free to shuffle code around. Or in other words: storing to lowcore could happen before lowcore protection is disabled. In order to avoid this class of potential bugs simply add a full memory barrier to the __ctl_load macro. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-29KVM: s390: gaccess: fix real-space designation asce handling for gmap shadowsHeiko Carstens
commit addb63c18a0d52a9ce2611d039f981f7b6148d2b upstream. For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-24mm: larger stack guard gap, between vmasHugh Dickins
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [wt: backport to 4.11: adjust context] [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17s390/kvm: do not rely on the ILC on kvm host protection faulsChristian Borntraeger
commit c0e7bb38c07cbd8269549ee0a0566021a3c729de upstream. For most cases a protection exception in the host (e.g. copy on write or dirty tracking) on the sie instruction will indicate an instruction length of 4. Turns out that there are some corner cases (e.g. runtime instrumentation) where this is not necessarily true and the ILC is unpredictable. Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for all possible ILCs. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/cputime: fix incorrect system timeMartin Schwidefsky
commit 07a63cbe8bcb6ba72fb989dcab1ec55ec6c36c7e upstream. git commit c5328901aa1db134 "[S390] entry[64].S improvements" removed the update of the exit_timer lowcore field from the critical section cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done blocks. If the PSW is updated by the critical section cleanup to point to user space again, the interrupt entry code will do a vtime calculation after the cleanup completed with an exit_timer value which has *not* been updated. Due to this incorrect system time deltas are calculated. If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry time of the interrupt. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/kdump: Add final noteMichael Holzheu
commit dcc00b79fc3d076832f7240de8870f492629b171 upstream. Since linux v3.14 with commit 38dfac843cb6d7be1 ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreded as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-27s390/mm: fix CMMA vs KSM vs othersChristian Borntraeger
commit a8f60d1fadf7b8b54449fcc9d6b15248917478ba upstream. On heavy paging with KSM I see guest data corruption. Turns out that KSM will add pages to its tree, where the mapping return true for pte_unused (or might become as such later). KSM will unmap such pages and reinstantiate with different attributes (e.g. write protected or special, e.g. in replace_page or write_protect_page)). This uncovered a bug in our pagetable handling: We must remove the unused flag as soon as an entry becomes present again. Signed-of-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12s390/uaccess: get_user() should zero on failure (again)Heiko Carstens
commit d09c5373e8e4eaaa09233552cbf75dc4c4f21203 upstream. Commit fd2d2b191fe7 ("s390: get_user() should zero on failure") intended to fix s390's get_user() implementation which did not zero the target operand if the read from user space faulted. Unfortunately the patch has no effect: the corresponding inline assembly specifies that the operand is only written to ("=") and the previous value is discarded. Therefore the compiler is free to and actually does omit the zero initialization. To fix this simply change the contraint modifier to "+", so the compiler cannot omit the initialization anymore. Fixes: c9ca78415ac1 ("s390/uaccess: provide inline variants of get_user/put_user") Fixes: fd2d2b191fe7 ("s390: get_user() should zero on failure") Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12s390/decompressor: fix initrd corruption caused by bss clearMarcelo Henrique Cerri
commit d82c0d12c92705ef468683c9b7a8298dd61ed191 upstream. Reorder the operations in decompress_kernel() to ensure initrd is moved to a safe location before the bss section is zeroed. During decompression bss can overlap with the initrd and this can corrupt the initrd contents depending on the size of the compressed kernel (which affects where the initrd is placed by the bootloader) and the size of the bss section of the decompressor. Also use the correct initrd size when checking for overlaps with parmblock. Fixes: 06c0dd72aea3 ([S390] fix boot failures with compressed kernels) Reviewed-by: Joy Latten <joy.latten@canonical.com> Reviewed-by: Vineetha HariPai <vineetha.hari.pai@canonical.com> Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-18KVM: s390: Fix guest migration for huge guests resulting in panicJanosch Frank
commit 2e4d88009f57057df7672fa69a32b5224af54d37 upstream. While we can technically not run huge page guests right now, we can setup a guest with huge pages. Trying to migrate it will trigger a VM_BUG_ON and, if the kernel is not configured to panic on a BUG, it will happily try to work on non-existing page table entries. With this patch, we always return "dirty" if we encounter a large page when migrating. This at least fixes the immediate problem until we have proper handling for both kind of pages. Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.") Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: use correct input data address for setup_randomnessHeiko Carstens
commit 4920e3cf77347d7d7373552d4839e8d832321313 upstream. The current implementation of setup_randomness uses the stack address and therefore the pointer to the SYSIB 3.2.2 block as input data address. Furthermore the length of the input data is the number of virtual-machine description blocks which is typically one. This means that typically a single zero byte is fed to add_device_randomness. Fix both of these and use the address of the first virtual machine description block as input data address and also use the correct length. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: make setup_randomness workHeiko Carstens
commit da8fd820f389a0e29080b14c61bf5cf1d8ef5ca1 upstream. Commit bcfcbb6bae64 ("s390: add system information as device randomness") intended to add some virtual machine specific information to the randomness pool. Unfortunately it uses the page allocator before it is ready to use. In result the page allocator always returns NULL and the setup_randomness function never adds anything to the randomness pool. To fix this use memblock_alloc and memblock_free instead. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: TASK_SIZE for kernel threadsMartin Schwidefsky
commit fb94a687d96c570d46332a4a890f1dcb7310e643 upstream. Return a sensible value if TASK_SIZE if called from a kernel thread. This gets us around an issue with copy_mount_options that does a magic size calculation "TASK_SIZE - (unsigned long)data" while in a kernel thread and data pointing to kernel space. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390/kdump: Use "LINUX" ELF note name instead of "CORE"Michael Holzheu
commit a4a81d8eebdc1d209d034f62a082a5131e4242b5 upstream. In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified as note name. Otherwise the notes are ignored. For /proc/vmcore we currently use "CORE" for these notes. Up to now this has not been a real problem because the dump analysis tool "crash" does not check the note name. But it will break all programs that use libbfd for processing ELF notes. So fix this and use "LINUX" for all s390 specific notes to comply with libbfd. Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15KVM: s390: Disable dirty log retrieval for UCONTROL guestsJanosch Frank
commit e1e8a9624f7ba8ead4f056ff558ed070e86fa747 upstream. User controlled KVM guests do not support the dirty log, as they have no single gmap that we can check for changes. As they have no single gmap, kvm->arch.gmap is NULL and all further referencing to it for dirty checking will result in a NULL dereference. Let's return -EINVAL if a caller tries to sync dirty logs for a UCONTROL guest. Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.") Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01s390/ptrace: Preserve previous registers for short regset writeMartin Schwidefsky
commit 9dce990d2cf57b5ed4e71a9cdbd7eae4335111ff upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. convert_vx_to_fp() is adapted to handle only a specified number of registers rather than unconditionally handling all of them: other callers of this function are adapted appropriately. Based on an initial patch by Dave Martin. Reported-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01s390/mm: Fix cmma unused transfer from pgste into pteChristian Borntraeger
commit 0d6da872d3e4a60f43c295386d7ff9a4cdcd57e9 upstream. The last pgtable rework silently disabled the CMMA unused state by setting a local pte variable (a parameter) instead of propagating it back into the caller. Fix it. Fixes: ebde765c0e85 ("s390/mm: uninline ptep_xxx functions from pgtable.h") Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26KVM: s390: do not expose random data via facility bitmapChristian Borntraeger
commit 04478197416e3a302e9ebc917ba1aa884ef9bfab upstream. kvm_s390_get_machine() populates the facility bitmap by copying bytes from the host results that are stored in a 256 byte array in the prefix page. The KVM code does use the size of the target buffer (2k), thus copying and exposing unrelated kernel memory (mostly machine check related logout data). Let's use the size of the source buffer instead. This is ok, as the target buffer will always be greater or equal than the source buffer as the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover the maximum possible size that is allowed by STFLE, which is 256 doublewords. All structures are zero allocated so we can leave bytes 256-2047 unchanged. Add a similar fix for kvm_arch_init_vm(). Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> [found with smatch] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12s390/pci: fix dma address calculation in map_sgSebastian Ott
commit 6b7df3ce92ac82ec3f4a2953b6fed77da7b38aaa upstream. __s390_dma_map_sg maps a dma-contiguous area. Although we only map whole pages we have to take into account that the area doesn't start or stop at a page boundary because we use the dma address to loop over the individual sg entries. Failing to do that might lead to an access of the wrong sg entry. Fixes: ee877b81c6b9 ("s390/pci_dma: improve map_sg") Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12s390/topology: always use s390 specific sched_domain_topology_levelHeiko Carstens
commit ebb299a51059017ec253bd30781a83d1f6e11b24 upstream. The s390 specific sched_domain_topology_level should always be used, not only if the machine provides topology information. Luckily this odd behaviour, that was by accident introduced with git commit d05d15da18f5 ("s390/topology: delay initialization of topology cpu masks") has currently no side effect. Fixes: d05d15da18f5 ("s390/topology: delay initialization of topology cpumasks") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12s390/crypto: unlock on error in prng_tdes_read()Dan Carpenter
commit 9e6e7c74315095fd40f41003850690c711e44420 upstream. We added some new locking but forgot to unlock on error. Fixes: 57127645d79d ("s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09s390/kexec: use node 0 when re-adding crash kernel memoryHeiko Carstens
commit 9f88eb4df728aebcd2ddd154d99f1d75b428b897 upstream. When re-adding crash kernel memory within setup_resources() the function memblock_add() is used. That function will add memory by default to node "MAX_NUMNODES" instead of node 0, like the memory detection code does. In case of !NUMA this will trigger this warning when the kernel generates the vmemmap: Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220 CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16 Call Trace: [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8 [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50 [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0 [<0000000000840136>] sparse_mem_map_populate+0x46/0x68 [<0000000000d0c59c>] sparse_init+0x184/0x238 [<0000000000cf45f6>] paging_init+0xbe/0xf8 [<0000000000cf1d4a>] setup_arch+0xa02/0xae0 [<0000000000ced75a>] start_kernel+0x72/0x450 [<0000000000100020>] _stext+0x20/0x80 If NUMA is selected numa_setup_memory() will fix the node assignments before the vmemmap will be populated; so this warning will only appear if NUMA is not selected. To fix this simply use memblock_add_node() and re-add crash kernel memory explicitly to node 0. Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: 4e042af463f8 ("s390/kexec: fix crash on resize of reserved memory") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-11Merge branch 'maybe-uninitialized' (patches from Arnd)Linus Torvalds
Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann: "It took a while for some patches to make it into mainline through maintainer trees, but the 28-patch series is now reduced to 10, with one tiny patch added at the end. Aside from patches that are no longer required, I did these changes compared to version 1: - Dropped "iio: maxim_thermocouple: detect invalid storage size in read()", which is currently in linux-next as commit 32cb7d27e65d. This is the only remaining warning I see for a couple of corner cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc bot both report it on arm64) - Dropped "brcmfmac: avoid maybe-uninitialized warning in brcmf_cfg80211_start_ap", which is currently in net/master merge pending. - Dropped two x86 patches, "x86: math-emu: possible uninitialized variable use" and "x86: mark target address as output in 'insb' asm" as they do not seem to trigger for a default build, and I got no feedback on them. Both of these are ancient issues and seem harmless, I will send them again to the x86 maintainers once the rest is merged. - Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on feedback from Ilya Dryomov, who already has a different fix queued up for v4.10. The kbuild bot reports this as a warning for xtensa. - Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with a simpler patch, this one always triggers but my first solution would not be safe for linux-4.9 any more at this point. I'll follow up with the larger patch as a cleanup for 4.10. - Replaced "dib0700: fix nec repeat handling" with a better one, contributed by Sean Young" * -Wmaybe-uninitialized fixes: Kbuild: enable -Wmaybe-uninitialized warnings by default pcmcia: fix return value of soc_pcmcia_regulator_set infiniband: shut up a maybe-uninitialized warning crypto: aesni: shut up -Wmaybe-uninitialized warning rc: print correct variable for z8f0811 dib0700: fix nec repeat handling s390: pci: don't print uninitialized data for debugging nios2: fix timer initcall return value x86: apm: avoid uninitialized data NFSv4.1: work around -Wmaybe-uninitialized warning Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
2016-11-11s390: pci: don't print uninitialized data for debuggingArnd Bergmann
gcc correctly warns about an incorrect use of the 'pa' variable in case we pass an empty scatterlist to __s390_dma_map_sg: arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg': arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized] This adds a bogus initialization to the function to sanitize the debug output. I would have preferred a solution without the initialization, but I only got the report from the kbuild bot after turning on the warning again, and didn't manage to reproduce it myself. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11mm: kmemleak: scan .data.ro_after_initJakub Kicinski
Limit the number of kmemleak false positives by including .data.ro_after_init in memory scanning. To achieve this we need to add symbols for start and end of the section to the linker scripts. The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark families as __ro_after_init"). Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two bug fixes - a memory alignment fix in the s390 only hypfs code - a fix for the generic percpu code that caused ftrace to break on s390. This is not relevant for x86 but for all architectures that use the generic percpu code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: percpu: use notrace variant of preempt_disable/preempt_enable s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
2016-11-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "One NULL pointer dereference, and two fixes for regressions introduced during the merge window. The rest are fixes for MIPS, s390 and nested VMX" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86: Check memopp before dereference (CVE-2016-8630) kvm: nVMX: VMCLEAR an active shadow VMCS after last use KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK KVM: x86: fix wbinvd_dirty_mask use-after-free kvm/x86: Show WRMSR data is in hex kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types KVM: document lock orders KVM: fix OOPS on flush_work KVM: s390: Fix STHYI buffer alignment for diag224 KVM: MIPS: Precalculate MMIO load resume PC KVM: MIPS: Make ERET handle ERL before EXL KVM: MIPS: Fix lazy user ASID regenerate for SMP
2016-10-28s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignmentMichael Holzheu
Since commit d86bd1bece6f ("mm/slub: support left redzone") it is no longer guaranteed that kmalloc(PAGE_SIZE) returns page aligned memory. After the above commit we get an error for diag224 because aligned memory is required. This leads to the following user visible error: # mount none -t s390_hypfs /sys/hypervisor/ mount: unknown filesystem type 's390_hypfs' # dmesg | grep hypfs hypfs.cccfb8: The hardware system does not provide all functions required by hypfs hypfs.7a79f0: Initialization of hypfs failed with rc=-61 Fix this problem and use get_free_page() instead of kmalloc() to get correctly aligned memory. Cc: stable@vger.kernel.org # v3.6+ Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A few more s390 patches for 4.9: - a fix for an overflow in the dasd driver reported by UBSAN - fix a regression and add hotplug memory to the zone movable again - add ignore defines for the pkey system calls - fix the ouput of the merged stack tracer - replace printk with pr_cont in arch/s390 where appropriate - remove the arch specific return_address function again - ignore reserved channel paths at boot time - add a missing hugetlb_bad_size call to the arch backend" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: fix zone calculation in arch_add_memory() s390/dumpstack: use pr_cont within show_stack and die s390/dumpstack: get rid of return_address again s390/disassambler: use pr_cont where appropriate s390/dumpstack: use pr_cont where appropriate s390/dumpstack: restore reliable indicator for call traces s390/mm: use hugetlb_bad_size() s390/cio: don't register chpids in reserved state s390: ignore pkey system calls s390/dasd: avoid undefined behaviour
2016-10-27Merge tag 'kvm-s390-master-4.9-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix wrong memory allocation With commit d86bd1bece6f ("mm/slub: support left redzone") or with slab debugging the allocation of our diag224 buffer is not aligned properly. Let's fix this.
2016-10-26KVM: s390: Fix STHYI buffer alignment for diag224Janosch Frank
Diag224 requires a page-aligned 4k buffer to store the name table into. kmalloc does not guarantee page alignment, hence we replace it with __get_free_page for the buffer allocation. Cc: stable@vger.kernel.org # v4.8+ Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-24s390/mm: fix zone calculation in arch_add_memory()Gerald Schaefer
Standby (hotplug) memory should be added to ZONE_MOVABLE on s390. After commit 199071f1 "s390/mm: make arch_add_memory() NUMA aware", arch_add_memory() used memblock_end_of_DRAM() to find out the end of ZONE_NORMAL and the beginning of ZONE_MOVABLE. However, commit 7f36e3e5 "memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node." moved the call of memblock_add_node() before the call of arch_add_memory() in add_memory_resource(), and thus changed the return value of memblock_end_of_DRAM() when called in arch_add_memory(). As a result, arch_add_memory() will think that all memory blocks should be added to ZONE_NORMAL. Fix this by changing the logic in arch_add_memory() so that it will manually iterate over all zones of a given node to find out which zone a memory block should be added to. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24s390/dumpstack: use pr_cont within show_stack and dieHeiko Carstens
Use pr_cont instead of printk calls also within show_stack and die in order to avoid extra line breaks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "ARM: - avoid livelock when walking guest page tables - fix HYP mode static keys without CC_HAVE_ASM_GOTO MIPS: - fix a build error without TRACEPOINTS_ENABLED s390: - reject a malformed userspace configuration x86: - suppress a warning without CONFIG_CPU_FREQ - initialize whole irq_eoi array" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: arm/arm64: KVM: Map the BSS at HYP arm64: KVM: Take S1 walks into account when determining S2 write faults KVM: s390: reject invalid modes for runtime instrumentation kvm: x86: memset whole irq_eoi kvm/x86: Fix unused variable warning in kvm_timer_init() KVM: MIPS: Add missing uaccess.h include
2016-10-20KVM: s390: reject invalid modes for runtime instrumentationChristian Borntraeger
Usually a validity intercept is a programming error of the host because of invalid entries in the state description. We can get a validity intercept if the mode of the runtime instrumentation control block is wrong. As the host does not know which modes are valid, this can be used by userspace to trigger a WARN. Instead of printing a WARN let's return an error to userspace as this can only happen if userspace provides a malformed initial value (e.g. on migration). The kernel should never warn on bogus input. Instead let's log it into the s390 debug feature. While at it, let's return -EINVAL for all validity intercepts as this will trigger an error in QEMU like error: kvm run failed Invalid argument PSW=mask 0404c00180000000 addr 000000000063c226 cc 00 R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000 R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041 R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0 [...] This will avoid an endless loop of validity intercepts. Cc: stable@vger.kernel.org # v4.5+ Fixes: c6e5f166373a ("KVM: s390: implement the RI support of guest") Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-19Merge branch 'gup_flag-cleanups'Linus Torvalds
Merge the gup_flags cleanups from Lorenzo Stoakes: "This patch series adjusts functions in the get_user_pages* family such that desired FOLL_* flags are passed as an argument rather than implied by flags. The purpose of this change is to make the use of FOLL_FORCE explicit so it is easier to grep for and clearer to callers that this flag is being used. The use of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the VMA whose pages we are reading from/writing to, which can result in surprising behaviour. The patch series came out of the discussion around commit 38e088546522 ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"), which addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE set but having been overridden by FOLL_FORCE. do_numa_page() was run on the assumption the page _must_ be one marked for NUMA node migration as an actual PROT_NONE page would have been dealt with prior to this code path, however FOLL_FORCE introduced a situation where this assumption did not hold. See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal" Additionally, there's a fix for an ancient bug related to FOLL_FORCE and FOLL_WRITE by me. [ This branch was rebased recently to add a few more acked-by's and reviewed-by's ] * gup_flag-cleanups: mm: replace access_process_vm() write parameter with gup_flags mm: replace access_remote_vm() write parameter with gup_flags mm: replace __access_remote_vm() write parameter with gup_flags mm: replace get_user_pages_remote() write/force parameters with gup_flags mm: replace get_user_pages() write/force parameters with gup_flags mm: replace get_vaddr_frames() write/force parameters with gup_flags mm: replace get_user_pages_locked() write/force parameters with gup_flags mm: replace get_user_pages_unlocked() write/force parameters with gup_flags mm: remove write/force parameters from __get_user_pages_unlocked() mm: remove write/force parameters from __get_user_pages_locked() mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
2016-10-18mm: replace get_user_pages_unlocked() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' use from get_user_pages_unlocked() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>