summaryrefslogtreecommitdiff
path: root/arch/powerpc/kvm/book3s.c
AgeCommit message (Collapse)Author
2012-03-05KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bitPaul Mackerras
This changes the implementation of kvm_vm_ioctl_get_dirty_log() for Book3s HV guests to use the hardware C (changed) bits in the guest hashed page table. Since this makes the implementation quite different from the Book3s PR case, this moves the existing implementation from book3s.c to book3s_pr.c and creates a new implementation in book3s_hv.c. That implementation calls kvmppc_hv_get_dirty_log() to do the actual work by calling kvm_test_clear_dirty on each page. It iterates over the HPTEs, clearing the C bit if set, and returns 1 if any C bit was set (including the saved C bit in the rmap entry). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05KVM: PPC: booke: Improve timer register emulationScott Wood
Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: Liu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASnScott Wood
This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05KVM: PPC: Rename deliver_interrupts to prepare_to_enterScott Wood
This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-27KVM: introduce id_to_memslot functionXiao Guangrong
Introduce id_to_memslot to get memslot by slot id Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-10-31powerpc: add export.h to files making use of EXPORT_SYMBOLPaul Gortmaker
With module.h being implicitly everywhere via device.h, the absence of explicitly including something for EXPORT_SYMBOL went unnoticed. Since we are heading to fix things up and clean module.h from the device.h file, we need to explicitly include these files now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-07-12KVM: PPC: Deliver program interrupts right away instead of queueing themPaul Mackerras
Doing so means that we don't have to save the flags anywhere and gets rid of the last reference to to_book3s(vcpu) in arch/powerpc/kvm/book3s.c. Doing so is OK because a program interrupt won't be generated at the same time as any other synchronous interrupt. If a program interrupt and an asynchronous interrupt (external or decrementer) are generated at the same time, the program interrupt will be delivered, which is correct because it has a higher priority, and then the asynchronous interrupt will be masked. We don't ever generate system reset or machine check interrupts to the guest, but if we did, then we would need to make sure they got delivered rather than the program interrupt. The current code would be wrong in this situation anyway since it would deliver the program interrupt as well as the reset/machine check interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12KVM: PPC: Split out code from book3s.c into book3s_pr.cPaul Mackerras
In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12KVM: PPC: Move fields between struct kvm_vcpu_arch and kvmppc_vcpu_book3sPaul Mackerras
This moves the slb field, which represents the state of the emulated SLB, from the kvmppc_vcpu_book3s struct to the kvm_vcpu_arch, and the hpte_hash_[v]pte[_long] fields from kvm_vcpu_arch to kvmppc_vcpu_book3s. This is in accord with the principle that the kvm_vcpu_arch struct represents the state of the emulated CPU, and the kvmppc_vcpu_book3s struct holds the auxiliary data structures used in the emulation. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12KVM: PPC: Fix machine checks on 32-bit Book3SPaul Mackerras
Commit 69acc0d3ba ("KVM: PPC: Resolve real-mode handlers through function exports") resulted in vcpu->arch.trampoline_lowmem and vcpu->arch.trampoline_enter ending up with kernel virtual addresses rather than physical addresses. This is OK on 64-bit Book3S machines, which ignore the top 4 bits of the effective address in real mode, but on 32-bit Book3S machines, accessing these addresses in real mode causes machine check interrupts, as the hardware uses the whole effective address as the physical address in real mode. This fixes the problem by using __pa() to convert these addresses to physical addresses. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12KVM: PPC: Resolve real-mode handlers through function exportsAlexander Graf
Up until now, Book3S KVM had variables stored in the kernel that a kernel module or the kvm code in the kernel could read from to figure out where some real mode helper functions are located. This is all unnecessary. The high bits of the EA get ignore in real mode, so we can just use the pointer as is. Also, it's a lot easier on relocations when we use the normal way of resolving the address to a function, instead of jumping through hoops. This patch fixes compilation with CONFIG_RELOCATABLE=y. Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-20powerpc/kvm: Fix kvmppc_core_pending_decPaul Mackerras
The vcpu->arch.pending_exceptions field is a bitfield indexed by interrupt priority number as returned by kvmppc_book3s_vec2irqprio. However, kvmppc_core_pending_dec was using an interrupt vector shifted by 7 as the bit index. Fix it to use the irqprio value for the decrementer interrupt instead. This problem was found by code inspection. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-17KVM: PPC: Fix SPRG get/set for Book3S and BookEPeter Tyser
Previously SPRGs 4-7 were improperly read and written in kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs(); Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: replace vmalloc and memset with vzallocTakuya Yoshikawa
Let's use newly introduced vzalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-24KVM: PPC: Implement Level interrupts on Book3SAlexander Graf
The current interrupt logic is just completely broken. We get a notification from user space, telling us that an interrupt is there. But then user space expects us that we just acknowledge an interrupt once we deliver it to the guest. This is not how real hardware works though. On real hardware, the interrupt controller pulls the external interrupt line until it gets notified that the interrupt was received. So in reality we have two events: pulling and letting go of the interrupt line. To maintain backwards compatibility, I added a new request for the pulling part. The letting go part was implemented earlier already. With this in place, we can now finally start guests that do not randomly stall and stop to work at random times. This patch implements above logic for Book3S. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Don't put MSR_POW in MSRAlexander Graf
On Book3S a mtmsr with the MSR_POW bit set indicates that the OS is in idle and only needs to be waked up on the next interrupt. Now, unfortunately we let that bit slip into the stored MSR value which is not what the real CPU does, so that we ended up executing code like this: r = mfmsr(); /* r containts MSR_POW */ mtmsr(r | MSR_EE); This obviously breaks, as we're going into idle mode in code sections that don't expect to be idling. This patch masks MSR_POW out of the stored MSR value on wakeup, making guests happy again. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Update int_pending also on dequeueAlexander Graf
When having a decrementor interrupt pending, the dequeuing happens manually through an mtdec instruction. This instruction simply calls dequeue on that interrupt, so the int_pending hint doesn't get updated. This patch enables updating the int_pending hint also on dequeue, thus correctly enabling guests to stay in guest contexts more often. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Put segment registers in shared pageAlexander Graf
Now that the actual mtsr doesn't do anything anymore, we can move the sr contents over to the shared page, so a guest can directly read and write its sr contents from guest context. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Interpret SR registers on demandAlexander Graf
Right now we're examining the contents of Book3s_32's segment registers when the register is written and put the interpreted contents into a struct. There are two reasons this is bad. For starters, the struct has worse real-time performance, as it occupies more ram. But the more important part is that with segment registers being interpreted from their raw values, we can put them in the shared page, allowing guests to mess with them directly. This patch makes the internal representation of SRs be u32s. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Don't flush PTEs on NX/RO hitAlexander Graf
When hitting a no-execute or read-only data/inst storage interrupt we were flushing the respective PTE so we're sure it gets properly overwritten next. According to the spec, this is unnecessary though. The guest issues a tlbie anyways, so we're safe to just keep the PTE around and have it manually removed from the guest, saving us a flush. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Preload magic page when in kernel modeAlexander Graf
When the guest jumps into kernel mode and has the magic page mapped, theres a very high chance that it will also use it. So let's detect that scenario and map the segment accordingly. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: Move EXIT_DEBUG partially to tracepointsAlexander Graf
We have a debug printk on every exit that is usually #ifdef'ed out. Using tracepoints makes a lot more sense here though, as they can be dynamically enabled. This patch converts the most commonly used debug printks of EXIT_DEBUG to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24KVM: PPC: fix leakage of error page in kvmppc_patch_dcbz()Wei Yongjun
Add kvm_release_page_clean() after is_error_page() to avoid leakage of error page. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Magic Page Book3s supportAlexander Graf
We need to override EA as well as PA lookups for the magic page. When the guest tells us to project it, the magic page overrides any guest mappings. In order to reflect that, we need to hook into all the MMU layers of KVM to force map the magic page if necessary. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Make PAM a defineAlexander Graf
On PowerPC it's very normal to not support all of the physical RAM in real mode. To check if we're matching on the shared page or not, we need to know the limits so we can restrain ourselves to that range. So let's make it a define instead of open-coding it. And while at it, let's also increase it. Signed-off-by: Alexander Graf <agraf@suse.de> v2 -> v3: - RMO -> PAM (non-magic page) Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Tell guest about pending interruptsAlexander Graf
When the guest turns on interrupts again, it needs to know if we have an interrupt pending for it. Because if so, it should rather get out of guest context and get the interrupt. So we introduce a new field in the shared page that we use to tell the guest that there's a pending interrupt lying around. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Add PV guest critical sectionsAlexander Graf
When running in hooked code we need a way to disable interrupts without clobbering any interrupts or exiting out to the hypervisor. To achieve this, we have an additional critical field in the shared page. If that field is equal to the r1 register of the guest, it tells the hypervisor that we're in such a critical section and thus may not receive any interrupts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Implement hypervisor interfaceAlexander Graf
To communicate with KVM directly we need to plumb some sort of interface between the guest and KVM. Usually those interfaces use hypercalls. This hypercall implementation is described in the last patch of the series in a special documentation file. Please read that for further information. This patch implements stubs to handle KVM PPC hypercalls on the host and guest side alike. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Convert SPRG[0-4] to shared pageAlexander Graf
When in kernel mode there are 4 additional registers available that are simple data storage. Instead of exiting to the hypervisor to read and write those, we can just share them with the guest using the page. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Convert SRR0 and SRR1 to shared pageAlexander Graf
The SRR0 and SRR1 registers contain cached values of the PC and MSR respectively. They get written to by the hypervisor when an interrupt occurs or directly by the kernel. They are also used to tell the rfi(d) instruction where to jump to. Because it only gets touched on defined events that, it's very simple to share with the guest. Hypervisor and guest both have full r/w access. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Convert DAR to shared page.Alexander Graf
The DAR register contains the address a data page fault occured at. This register behaves pretty much like a simple data storage register that gets written to on data faults. There is no hypervisor interaction required on read or write. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Convert DSISR to shared pageAlexander Graf
The DSISR register contains information about a data page fault. It is fully read/write from inside the guest context and we don't need to worry about interacting based on writes of this register. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Convert MSR to shared pageAlexander Graf
One of the most obvious registers to share with the guest directly is the MSR. The MSR contains the "interrupts enabled" flag which the guest has to toggle in critical sections. So in order to bring the overhead of interrupt en- and disabling down, let's put msr into the shared page. Keep in mind that even though you can fully read its contents, writing to it doesn't always update all state. There are a few safe fields that don't require hypervisor interaction. See the documentation for a list of MSR bits that are safe to be set from inside the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24KVM: PPC: Introduce shared pageAlexander Graf
For transparent variable sharing between the hypervisor and guest, I introduce a shared page. This shared page will contain all the registers the guest can read and write safely without exiting guest context. This patch only implements the stubs required for the basic structure of the shared page. The actual register moving follows. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: PPC: Make use of hash based Shadow MMUAlexander Graf
We just introduced generic functions to handle shadow pages on PPC. This patch makes the respective backends make use of them, getting rid of a lot of duplicate code along the way. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01KVM: PPC: elide struct thread_struct instances from stackAndreas Schwab
Instead of instantiating a whole thread_struct on the stack use only the required parts of it. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Tested-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01KVM: move vcpu locking to dispatcher for generic vcpu ioctlsAvi Kivity
All vcpu ioctls need to be locked, so instead of locking each one specifically we lock at the generic dispatcher. This patch only updates generic ioctls and leaves arch specific ioctls alone. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctlsAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: Let vcpu structure alignment be determined at runtimeAvi Kivity
vmx and svm vcpus have different contents and therefore may have different alignmment requirements. Let each specify its required alignment. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: powerpc: use of kzalloc/kfree requires including slab.hStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: PPC: Enable native paired singlesAlexander Graf
When we're on a paired single capable host, we can just always enable paired singles and expose them to the guest directly. This approach breaks when multiple VMs run and access PS concurrently, but this should suffice until we get a proper framework for it in Linux. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Improve split modeAlexander Graf
When in split mode, instruction relocation and data relocation are not equal. So far we implemented this mode by reserving a special pseudo-VSID for the two cases and flushing all PTEs when going into split mode, which is slow. Unfortunately 32bit Linux and Mac OS X use split mode extensively. So to not slow down things too much, I came up with a different idea: Mark the split mode with a bit in the VSID and then treat it like any other segment. This means we can just flush the shadow segment cache, but keep the PTEs intact. I verified that this works with ppc32 Linux and Mac OS X 10.4 guests and does speed them up. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Make Performance Counters workAlexander Graf
When we get a performance counter interrupt we need to route it on to the Linux handler after we got out of the guest context. We also need to tell our handling code that this particular interrupt doesn't need treatment. So let's add those two bits in, making perf work while having a KVM guest running. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Convert u64 -> ulongAlexander Graf
There are some pieces in the code that I overlooked that still use u64s instead of longs. This slows down 32 bit hosts unnecessarily, so let's just move them to ulong. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Check max IRQ prioAlexander Graf
We have a define on what the highest bit of IRQ priorities is. So we can just as well use it in the bit checking code and avoid invalid IRQ values to be triggered. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Add Book3S compatibility codeAlexander Graf
Some code we had so far required defines and had code that was completely Book3S_64 specific. Since we now opened book3s.c to Book3S_32 too, we need to take care of these pieces. So let's add some minor code where it makes sense to not go the Book3S_64 code paths and add compat defines on others. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Emulate segment faultAlexander Graf
Book3S_32 doesn't know about segment faults. It only knows about page faults. So in order to know that we didn't map a segment, we need to fake segment faults. We do this by setting invalid segment registers to an invalid VSID and then check for that VSID on normal page faults. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Extract MMU initAlexander Graf
The host shadow mmu code needs to get initialized. It needs to fetch a segment it can use to put shadow PTEs into. That initialization code was in generic code, which is icky. Let's move it over to the respective MMU file. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Improve indirect svcpu accessorsAlexander Graf
We already have some inline fuctions we use to access vcpu or svcpu structs, depending on whether we're on booke or book3s. Since we just put a few more registers into the svcpu, we also need to make sure the respective callbacks are available and get used. So this patch moves direct use of the now in the svcpu struct fields to inline function calls. While at it, it also moves the definition of those inline function calls to respective header files for booke and book3s, greatly improving readability. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Disable MSR_FEx for Cell hostsAlexander Graf
Cell can't handle MSR_FE0 and MSR_FE1 too well. It gets dog slow. So let's just override the guest whenever we see one of the two and mask them out. See commit ddf5f75a16b3e7460ffee881795aa168dffcd0cf for reference. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>