Age | Commit message (Collapse) | Author |
|
When out-of-memory the tprot code incorrectly injected a program check
for the guest which reported an addressing exception even if the guest
address was valid.
Let's use the new gmap_translate() which translates a guest address to
a user space address whithout the chance of running into an out-of-memory
situation.
Also make it more explicit that for -EFAULT we won't find a vma.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Guest access functions like copy_to/from_guest() call __guestaddr_to_user()
which in turn call gmap_fault() in order to translate a guest address to a
user space address.
In error case __guest_addr_to_user() returns either -EFAULT or -ENOMEM.
The copy_to/from_guest functions just pass these return values down to the
callers.
The -ENOMEM case however is problematic since there are several places
which access guest memory like:
rc = copy_to_guest(...);
if (rc == -EFAULT)
error_handling();
So in case of -ENOMEM the code assumes that the guest memory access
succeeded even though it failed.
This can cause guest data or state corruption.
If __guestaddr_to_user() returns -ENOMEM the meaning is that a valid user
space mapping exists, but there was not enough memory available when trying
to build the guest mapping. In other words an out-of-memory situation
occured.
For normal user space accesses an out-of-memory situation causes the page
fault handler to map -ENOMEM to -EFAULT (see fixup code in do_no_context()).
We need to do exactly the same for the kvm gaccess functions.
So __guestaddr_to_user() should just map all error codes to -EFAULT.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Enable ioeventfd support on s390 and hook up diagnose 500 virtio-ccw
notifications.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
This patch makes the parameter old a const pointer to the old memory
slot and adds a new parameter named change to know the change being
requested: the former is for removing extra copying and the latter is
for cleaning up the code.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
This patch drops the parameter old, a copy of the old memory slot, and
adds a new parameter named change to know the change being requested.
This not only cleans up the code but also removes extra copying of the
memory slot structure.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
X86 does not use this any more. The remaining user, s390's !user_alloc
check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no
longer supported.
Note: fixed powerpc's indentations with spaces to suppress checkpatch
errors.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Pull KVM updates from Marcelo Tosatti:
"KVM updates for the 3.9 merge window, including x86 real mode
emulation fixes, stronger memory slot interface restrictions, mmu_lock
spinlock hold time reduction, improved handling of large page faults
on shadow, initial APICv HW acceleration support, s390 channel IO
based virtio, amongst others"
* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
Revert "KVM: MMU: lazily drop large spte"
x86: pvclock kvm: align allocation size to page size
KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
x86 emulator: fix parity calculation for AAD instruction
KVM: PPC: BookE: Handle alignment interrupts
booke: Added DBCR4 SPR number
KVM: PPC: booke: Allow multiple exception types
KVM: PPC: booke: use vcpu reference from thread_struct
KVM: Remove user_alloc from struct kvm_memory_slot
KVM: VMX: disable apicv by default
KVM: s390: Fix handling of iscs.
KVM: MMU: cleanup __direct_map
KVM: MMU: remove pt_access in mmu_set_spte
KVM: MMU: cleanup mapping-level
KVM: MMU: lazily drop large spte
KVM: VMX: cleanup vmx_set_cr0().
KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
KVM: VMX: disable SMEP feature when guest is in non-paging mode
KVM: Remove duplicate text in api.txt
Revert "KVM: MMU: split kvm_mmu_free_page"
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 update from Martin Schwidefsky:
"The most prominent change in this patch set is the software dirty bit
patch for s390. It removes __HAVE_ARCH_PAGE_TEST_AND_CLEAR_DIRTY and
the page_test_and_clear_dirty primitive which makes the common memory
management code a bit less obscure.
Heiko fixed most of the PCI related fallout, more often than not
missing GENERIC_HARDIRQS dependencies. Notable is one of the 3270
patches which adds an export to tty_io to be able to resize a tty.
The rest is the usual bunch of cleanups and bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
s390/module: Add missing R_390_NONE relocation type
drivers/gpio: add missing GENERIC_HARDIRQ dependency
drivers/input: add couple of missing GENERIC_HARDIRQS dependencies
s390/cleanup: rename SPP to LPP
s390/mm: implement software dirty bits
s390/mm: Fix crst upgrade of mmap with MAP_FIXED
s390/linker skript: discard exit.data at runtime
drivers/media: add missing GENERIC_HARDIRQS dependency
s390/bpf,jit: add vlan tag support
drivers/net,AT91RM9200: add missing GENERIC_HARDIRQS dependency
iucv: fix kernel panic at reboot
s390/Kconfig: sort list of arch selected config options
phylib: remove !S390 dependeny from Kconfig
uio: remove !S390 dependency from Kconfig
dasd: fix sysfs cleanup in dasd_generic_remove
s390/pci: fix hotplug module init
s390/pci: cleanup clp page allocation
s390/pci: cleanup clp inline assembly
s390/perf: cpum_cf: fallback to software sampling events
s390/mm: provide PAGE_SHARED define
...
|
|
The s390 architecture is unique in respect to dirty page detection,
it uses the change bit in the per-page storage key to track page
modifications. All other architectures track dirty bits by means
of page table entries. This property of s390 has caused numerous
problems in the past, e.g. see git commit ef5d437f71afdf4a
"mm: fix XFS oops due to dirty pages without buffers on s390".
To avoid future issues in regard to per-page dirty bits convert
s390 to a fault based software dirty bit detection mechanism. All
user page table entries which are marked as clean will be hardware
read-only, even if the pte is supposed to be writable. A write by
the user process will trigger a protection fault which will cause
the user pte to be marked as dirty and the hardware read-only bit
is removed.
With this change the dirty bit in the storage key is irrelevant
for Linux as a host, but the storage key is still required for
KVM guests. The effect is that page_test_and_clear_dirty and the
related code can be removed. The referenced bit in the storage
key is still used by the page_test_and_clear_young primitive to
provide page age information.
For page cache pages of mappings with mapping_cap_account_dirty
there will not be any change in behavior as the dirty bit tracking
already uses read-only ptes to control the amount of dirty pages.
Only for swap cache pages and pages of mappings without
mapping_cap_account_dirty there can be additional protection faults.
To avoid an excessive number of additional faults the mk_pte
primitive checks for PageDirty if the pgprot value allows for writes
and pre-dirties the pte. That avoids all additional faults for
tmpfs and shmem pages until these pages are added to the swap cache.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Fix name clash with some common code device drivers and add "tod"
to all tod clock access function names.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
There are two ways to express an interruption subclass:
- As a bitmask, as used in cr6.
- As a number, as used in the I/O interruption word.
Unfortunately, we have treated the I/O interruption word as if it
contained the bitmask as well, which went unnoticed so far as
- (not-yet-released) qemu made the same mistake, and
- Linux guest kernels don't check the isc value in the I/O interruption
word for subchannel interrupts.
Make sure that we treat the I/O interruption word correctly.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
Instructions with long displacement have a signed displacement.
Currently the sign bit is interpreted as 2^20: Lets fix it by doing the
sign extension from 20bit to 32bit and then use it as a signed variable
in the addition (see kvm_s390_get_base_disp_rsy).
Furthermore, there are lots of "int" in that code. This is problematic,
because shifting on a signed integer is undefined/implementation defined
if the bit value happens to be negative.
Fortunately the promotion rules will make the right hand side unsigned
anyway, so there is no real problem right now.
Let's convert them anyway to unsigned where appropriate to avoid
problems if the code is changed or copy/pasted later on.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
On store status we need to copy the current state of registers
into a save area. Currently we might save stale versions:
The sie state descriptor doesnt have fields for guest ACRS,FPRS,
those registers are simply stored in the host registers. The host
program must copy these away if needed. We do that in vcpu_put/load.
If we now do a store status in KVM code between vcpu_put/load, the
saved values are not up-to-date. Lets collect the ACRS/FPRS before
saving them.
This also fixes some strange problems with hotplug and virtio-ccw,
since the low level machine check handler (on hotplug a machine check
will happen) will revalidate all registers with the content of the
save area.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
This is to fix up a build problem with a wireless driver due to the
dynamic-debug patches in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 patches from Martin Schwidefsky:
"A couple of bug fixes: one of the transparent huge page primitives is
broken, the sched_clock function overflows after 417 days, the XFS
module has grown too large for -fpic and the new pci code has broken
normal channel subsystem notifications."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/chsc: fix SEI usage
s390/time: fix sched_clock() overflow
s390: use -fPIC for module compile
s390/mm: fix pmd_pfn() for thp
|
|
the variable inti should be freed in the branch CPUSTAT_STOPPED.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
Converting a 64 Bit TOD format value to nanoseconds means that the value
must be divided by 4.096. In order to achieve that we multiply with 125
and divide by 512.
When used within sched_clock() this triggers an overflow after appr.
417 days. Resulting in a sched_clock() return value that is much smaller
than previously and therefore may cause all sort of weird things in
subsystems that rely on a monotonic sched_clock() behaviour.
To fix this implement a tod_to_ns() helper function which converts TOD
values without overflow and call this function from both places that
open coded the conversion: sched_clock() and kvm_s390_handle_wait().
Cc: stable@kernel.org
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
CC: Avi Kivity <avi@redhat.com>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
|
|
commit b080935c8638e08134629d0a9ebdf35669bec14d
kvm: Directly account vtime to system on guest switch
also removed the irq_disable/enable around kvm guest switch, which
is correct in itself. Unfortunately, there is a BUG ON that (correctly)
checks for preemptible to cover the call to rcu later on.
(Introduced with commit 8fa2206821953a50a3a02ea33fcfb3ced2fd9997
KVM: make guest mode entry to be rcu quiescent state)
This check might trigger depending on the kernel config.
Lets make sure that no preemption happens during kvm_guest_enter.
We can enable preemption again after the call to
rcu_virt_note_context_switch returns.
Please note that we continue to run s390 guests with interrupts
enabled.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Add a new capability, KVM_CAP_S390_CSS_SUPPORT, which will pass
intercepts for channel I/O instructions to userspace. Only I/O
instructions interacting with I/O interrupts need to be handled
in-kernel:
- TEST PENDING INTERRUPTION (tpi) dequeues and stores pending
interrupts entirely in-kernel.
- TEST SUBCHANNEL (tsch) dequeues pending interrupts in-kernel
and exits via KVM_EXIT_S390_TSCH to userspace for subchannel-
related processing.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Make s390 support KVM_ENABLE_CAP.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Explicitely catch all channel I/O related instructions intercepts
in the kernel and set condition code 3 for them.
This paves the way for properly handling these instructions later
on.
Note: This is not architecture compliant (the previous code wasn't
either) since setting cc 3 is not the correct thing to do for some
of these instructions. For Linux guests, however, it still has the
intended effect of stopping css probing.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Add support for injecting machine checks (only repressible
conditions for now).
This is a bit more involved than I/O interrupts, for these reasons:
- Machine checks come in both floating and cpu varieties.
- We don't have a bit for machine checks enabling, but have to use
a roundabout approach with trapping PSW changing instructions and
watching for opened machine checks.
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Add support for handling I/O interrupts (standard, subchannel-related
ones and rudimentary adapter interrupts).
The subchannel-identifying parameters are encoded into the interrupt
type.
I/O interrupts are floating, so they can't be injected on a specific
vcpu.
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Introduce helper functions for decoding the various base/displacement
instruction formats.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
These tables are never modified.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
There's no need for this to be an int, it holds a boolean.
Move to the end of the struct for alignment.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Pull KVM updates from Marcelo Tosatti:
"Considerable KVM/PPC work, x86 kvmclock vsyscall support,
IA32_TSC_ADJUST MSR emulation, amongst others."
Fix up trivial conflict in kernel/sched/core.c due to cross-cpu
migration notifier added next to rq migration call-back.
* tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits)
KVM: emulator: fix real mode segment checks in address linearization
VMX: remove unneeded enable_unrestricted_guest check
KVM: VMX: fix DPL during entry to protected mode
x86/kexec: crash_vmclear_local_vmcss needs __rcu
kvm: Fix irqfd resampler list walk
KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump
x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary
KVM: MMU: optimize for set_spte
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
KVM: PPC: bookehv: Add guest computation mode for irq delivery
KVM: PPC: Make EPCR a valid field for booke64 and bookehv
KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
KVM: PPC: e500: Add emulation helper for getting instruction ea
KVM: PPC: bookehv64: Add support for interrupt handling
KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
KVM: PPC: booke: Fix get_tb() compile error on 64-bit
KVM: PPC: e500: Silence bogus GCC warning in tlb code
...
|
|
TSC initialization will soon make use of online_vcpus.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Switching to or from guest context is done on ioctl context.
So by the time we call kvm_guest_enter() or kvm_guest_exit()
we know we are not running the idle task.
As a result, we can directly account the cputime using
vtime_account_system().
There are two good reasons to do this:
* We avoid some useless checks on guest switch. It optimizes
a bit this fast path.
* In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
checks for irq time to account. This is pointless since we know
we are not in an irq on guest switch. This is wasting cpu cycles
for no good reason. vtime_account_system() OTOH is a no-op in
this config option.
* We can remove the irq disable/enable around kvm guest switch in s390.
A further optimization may consist in introducing a vtime_account_guest()
that directly calls account_guest_time().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
Newer kernels (linux-next with the transparent huge page patches)
use rrbm if the feature is announced via feature bit 66.
RRBM will cause intercepts, so KVM does not handle it right now,
causing an illegal instruction in the guest.
The easy solution is to disable the feature bit for the guest.
This fixes bugs like:
Kernel BUG at 0000000000124c2a [verbose debug info unavailable]
illegal operation: 0001 [#1] SMP
Modules linked in: virtio_balloon virtio_net ipv6 autofs4
CPU: 0 Not tainted 3.5.4 #1
Process fmempig (pid: 659, task: 000000007b712fd0, ksp: 000000007bed3670)
Krnl PSW : 0704d00180000000 0000000000124c2a (pmdp_clear_flush_young+0x5e/0x80)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
00000000003cc000 0000000000000004 0000000000000000 0000000079800000
0000000000040000 0000000000000000 000000007bed3918 000000007cf40000
0000000000000001 000003fff7f00000 000003d281a94000 000000007bed383c
000000007bed3918 00000000005ecbf8 00000000002314a6 000000007bed36e0
Krnl Code:>0000000000124c2a: b9810025 ogr %r2,%r5
0000000000124c2e: 41343000 la %r3,0(%r4,%r3)
0000000000124c32: a716fffa brct %r1,124c26
0000000000124c36: b9010022 lngr %r2,%r2
0000000000124c3a: e3d0f0800004 lg %r13,128(%r15)
0000000000124c40: eb22003f000c srlg %r2,%r2,63
[ 2150.713198] Call Trace:
[ 2150.713223] ([<00000000002312c4>] page_referenced_one+0x6c/0x27c)
[ 2150.713749] [<0000000000233812>] page_referenced+0x32a/0x410
[...]
CC: stable@vger.kernel.org
CC: Alex Graf <agraf@suse.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
EXTERNAL_CALL and EMERGENCY type interrupts need to preserve their interrupt
code parameter when being injected from user space.
Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Pull KVM updates from Avi Kivity:
"Highlights of the changes for this release include support for vfio
level triggered interrupts, improved big real mode support on older
Intels, a streamlines guest page table walker, guest APIC speedups,
PIO optimizations, better overcommit handling, and read-only memory."
* tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
KVM: s390: Fix vcpu_load handling in interrupt code
KVM: x86: Fix guest debug across vcpu INIT reset
KVM: Add resampling irqfds for level triggered interrupts
KVM: optimize apic interrupt delivery
KVM: MMU: Eliminate pointless temporary 'ac'
KVM: MMU: Avoid access/dirty update loop if all is well
KVM: MMU: Eliminate eperm temporary
KVM: MMU: Optimize is_last_gpte()
KVM: MMU: Simplify walk_addr_generic() loop
KVM: MMU: Optimize pte permission checks
KVM: MMU: Update accessed and dirty bits after guest pagetable walk
KVM: MMU: Move gpte_access() out of paging_tmpl.h
KVM: MMU: Optimize gpte_access() slightly
KVM: MMU: Push clean gpte write protection out of gpte_access()
KVM: clarify kvmclock documentation
KVM: make processes waiting on vcpu mutex killable
KVM: SVM: Make use of asm.h
KVM: VMX: Make use of asm.h
KVM: VMX: Make lto-friendly
KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()
...
Conflicts:
arch/s390/include/asm/processor.h
arch/x86/kvm/i8259.c
|
|
Recent changes (KVM: make processes waiting on vcpu mutex killable)
now requires to check the return value of vcpu_load. This triggered
a warning in s390 specific kvm code. Turns out that we can actually
remove the put/load, since schedule will do the right thing via
the preempt notifiers.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Change return code handling of the stsi() function:
In case function code 0 was specified the return value is the
current configuration level (already shifted). That way all
the code that actually copied the stsi_0() function can go
away.
Otherwise the return value is 0 (success) or negative to
indicate an error (currently only -EOPNOTSUPP).
Also stsi() is no longer an inline function. The function is
not performance critical, but every caller would generate an
exception table entry for this function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Split the base menu into more descriptive categories like processor,
memory, etc. and move virtualization related entries to the
Virtualization menu.
Also select KEXEC unconditionally.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Introduce a new trace system, kvm-s390, for some kvm/s390 specific
trace points:
- injection of interrupts
- delivery of interrupts to the guest
- creation/destruction of kvm machines and vcpus
- stop actions for vcpus
- reset requests for userspace
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Add trace events for several s390 architecture specifics:
- SIE entry/exit
- common intercepts
- common instructions (sigp/diagnose)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Merge patches queued during the run-up to the merge window.
* queue: (25 commits)
KVM: Choose better candidate for directed yield
KVM: Note down when cpu relax intercepted or pause loop exited
KVM: Add config to support ple or cpu relax optimzation
KVM: switch to symbolic name for irq_states size
KVM: x86: Fix typos in pmu.c
KVM: x86: Fix typos in lapic.c
KVM: x86: Fix typos in cpuid.c
KVM: x86: Fix typos in emulate.c
KVM: x86: Fix typos in x86.c
KVM: SVM: Fix typos
KVM: VMX: Fix typos
KVM: remove the unused parameter of gfn_to_pfn_memslot
KVM: remove is_error_hpa
KVM: make bad_pfn static to kvm_main.c
KVM: using get_fault_pfn to get the fault pfn
KVM: MMU: track the refcount when unmap the page
KVM: x86: remove unnecessary mark_page_dirty
KVM: MMU: Avoid handling same rmap_pde in kvm_handle_hva_range()
KVM: MMU: Push trace_kvm_age_page() into kvm_age_rmapp()
KVM: MMU: Add memslot parameter to hva handlers
...
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Pull KVM updates from Avi Kivity:
"Highlights include
- full big real mode emulation on pre-Westmere Intel hosts (can be
disabled with emulate_invalid_guest_state=0)
- relatively small ppc and s390 updates
- PCID/INVPCID support in guests
- EOI avoidance; 3.6 guests should perform better on 3.6 hosts on
interrupt intensive workloads)
- Lockless write faults during live migration
- EPT accessed/dirty bits support for new Intel processors"
Fix up conflicts in:
- Documentation/virtual/kvm/api.txt:
Stupid subchapter numbering, added next to each other.
- arch/powerpc/kvm/booke_interrupts.S:
PPC asm changes clashing with the KVM fixes
- arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c:
Duplicated commits through the kvm tree and the s390 tree, with
subsequent edits in the KVM tree.
* tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
KVM: fix race with level interrupts
x86, hyper: fix build with !CONFIG_KVM_GUEST
Revert "apic: fix kvm build on UP without IOAPIC"
KVM guest: switch to apic_set_eoi_write, apic_write
apic: add apic_set_eoi_write for PV use
KVM: VMX: Implement PCID/INVPCID for guests with EPT
KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check
KVM: PPC: Critical interrupt emulation support
KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests
KVM: PPC64: booke: Set interrupt computation mode for 64-bit host
KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt
KVM: PPC: bookehv64: Add support for std/ld emulation.
booke: Added crit/mc exception handler for e500v2
booke/bookehv: Add host crit-watchdog exception support
KVM: MMU: document mmu-lock and fast page fault
KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint
KVM: MMU: trace fast page fault
KVM: MMU: fast path of handling guest page fault
KVM: MMU: introduce SPTE_MMU_WRITEABLE bit
KVM: MMU: fold tlb flush judgement into mmu_spte_update
...
|
|
Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
If sigp sense doesn't have any status bits to report, it should set
cc 0 and leave the register as-is.
Since we know about the external call pending bit, we should report
it if it is set as well.
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Just use the defines instead of using plain numbers and adding
a comment behind each line.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
If an invalid parameter is passed or the addressed cpu is in an
incorrect state sigp set prefix will store a status.
This status must only have bits set as defined by the architecture.
The current kvm implementation missed to clear bits and also did
not set the intended status bit ("and" instead of "or" operation).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Only if the sensed cpu is not running a status is stored, which
is reflected by condition code 1. If the cpu is running, condition
code 0 should be returned.
Just the opposite of what the code is doing.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
The smp and the kvm code have different defines for the sigp order codes.
Let's just have a single place where these are defined.
Also move the sigp condition code and sigp cpu status bits to the new
sigp.h header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
The initial cpu reset sets the cpu in the stopped state.
Several places check for the cpu state (e.g. sigp set prefix) and
not setting the STOPPED state triggered errors with newer guest
kernels after reboot.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
The smp and the kvm code have different defines for the sigp order codes.
Let's just have a single place where these are defined.
Also move the sigp condition code and sigp cpu status bits to the new
sigp.h header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|