summaryrefslogtreecommitdiff
path: root/drivers/xen
AgeCommit message (Collapse)Author
2017-06-14xen/privcmd: Support correctly 64KB page granularity when mapping memoryJulien Grall
commit 753c09b5652bb4fe53e2db648002ec64b32b8827 upstream. Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did not go far enough to support 64KB in mmap_batch_fn. The variable 'nr' is the number of 4KB chunk to map. However, when Linux is using 64KB page granularity the array of pages (vma->vm_private_data) contain one page per 64KB. Fix it by incrementing st->index correctly. Furthermore, st->va is not correctly incremented as PAGE_SIZE != XEN_PAGE_SIZE. Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity") Reported-by: Feng Kan <fkan@apm.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14xen: Revert commits da72ff5bfcb0 and 72a9b186292dBoris Ostrovsky
commit 84d582d236dc1f9085e741affc72e9ba061a67c2 upstream. Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741) established that commit 72a9b186292d ("xen: Remove event channel notification through Xen PCI platform device") (and thus commit da72ff5bfcb0 ("partially revert "xen: Remove event channel notification through Xen PCI platform device"")) are unnecessary and, in fact, prevent HVM guests from booting on Xen releases prior to 4.0 Therefore we revert both of those commits. The summary of that discussion is below: Here is the brief summary of the current situation: Before the offending commit (72a9b186292): 1) INTx does not work because of the reset_watches path. 2) The reset_watches path is only taken if you have Xen > 4.0 3) The Linux Kernel by default will use vector inject if the hypervisor support. So even INTx does not work no body running the kernel with Xen > 4.0 would notice. Unless he explicitly disabled this feature either in the kernel or in Xen (and this can only be disabled by modifying the code, not user-supported way to do it). After the offending commit (+ partial revert): 1) INTx is no longer support for HVM (only for PV guests). 2) Any HVM guest The kernel will not boot on Xen < 4.0 which does not have vector injection support. Since the only other mode supported is INTx which. So based on this summary, I think before commit (72a9b186292) we were in much better position from a user point of view. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-30xen/acpi: upload PM state from init-domain to XenAnkur Arora
commit 1914f0cd203c941bba72f9452c8290324f1ef3dc upstream. This was broken in commit cd979883b9ed ("xen/acpi-processor: fix enabling interrupts on syscore_resume"). do_suspend (from xen/manage.c) and thus xen_resume_notifier never get called on the initial-domain at resume (it is if running as guest.) The rationale for the breaking change was that upload_pm_data() potentially does blocking work in syscore_resume(). This patch addresses the original issue by scheduling upload_pm_data() to execute in workqueue context. Cc: Stanislaw Gruszka <sgruszka@redhat.com> Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26swiotlb: Convert swiotlb_force from int to enumGeert Uytterhoeven
commit ae7871be189cb41184f1e05742b4a99e2c59774d upstream. Convert the flag swiotlb_force from an int to an enum, to prepare for the advent of more possible values. Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-06xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancingBoris Ostrovsky
commit 30faaafdfa0c754c91bac60f216c9f34a2bfdf7e upstream. Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to NUMA balancing") set VM_IO flag to prevent grant maps from being subjected to NUMA balancing. It was discovered recently that this flag causes get_user_pages() to always fail with -EFAULT. check_vma_flags __get_user_pages __get_user_pages_locked __get_user_pages_unlocked get_user_pages_fast iov_iter_get_pages dio_refill_pages do_direct_IO do_blockdev_direct_IO do_blockdev_direct_IO ext4_direct_IO_read generic_file_read_iter aio_run_iocb (which can happen if guest's vdisk has direct-io-safe option). To avoid this let's use VM_MIXEDMAP flag instead --- it prevents NUMA balancing just as VM_IO does and has no effect on check_vma_flags(). Reported-by: Olaf Hering <olaf@aepfle.de> Suggested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Hugh Dickins <hughd@google.com> Tested-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-25Merge tag 'for-linus-4.9-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from David Vrabel: - advertise control feature flags in xenstore - fix x86 build when XEN_PVHVM is disabled * tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: check return value of xenbus_scanf() xenbus: prefer list_for_each() x86: xen: move cpu_up functions out of ifdef xenbus: advertise control feature flags
2016-10-24xenbus: check return value of xenbus_scanf()Jan Beulich
Don't ignore errors here: Set backend state to unknown when unsuccessful. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-24xenbus: prefer list_for_each()Jan Beulich
This is more efficient than list_for_each_safe() when list modification is accompanied by breaking out of the loop. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-24xenbus: advertise control feature flagsJuergen Gross
The Xen docs specify several flags which a guest can set to advertise which values of the xenstore control/shutdown key it will recognize. This patch adds code to write all the relevant feature-flag keys. Based-on-patch-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-06Merge tag 'for-linus-4.9-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "xen features and fixes for 4.9: - switch to new CPU hotplug mechanism - support driver_override in pciback - require vector callback for HVM guests (the alternate mechanism via the platform device has been broken for ages)" * tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/x86: Update topology map for PV VCPUs xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier xen/pciback: support driver_override xen/pciback: avoid multiple entries in slot list xen/pciback: simplify pcistub device handling xen: Remove event channel notification through Xen PCI platform device xen/events: Convert to hotplug state machine xen/x86: Convert to hotplug state machine x86/xen: add missing \n at end of printk warning message xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc() xen: Make VPMU init message look less scary xen: rename xen_pmu_init() in sys-hypervisor.c hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again) xen/x86: Move irq allocation from Xen smp_op.cpu_up()
2016-09-30xen/pciback: support driver_overrideJuergen Gross
Support the driver_override scheme introduced with commit 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") As pcistub_probe() is called for all devices (it has to check for a match based on the slot address rather than device type) it has to check for driver_override set to "pciback" itself. Up to now for assigning a pci device to pciback you need something like: echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot echo 0000:07:10.0 > /sys/bus/pci/drivers_probe while with the patch you can use the same mechanism as for similar drivers like pci-stub and vfio-pci: echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers_probe So e.g. libvirt doesn't need special handling for pciback. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30xen/pciback: avoid multiple entries in slot listJuergen Gross
The Xen pciback driver has a list of all pci devices it is ready to seize. There is no check whether a to be added entry already exists. While this might be no problem in the common case it might confuse those which consume the list via sysfs. Modify the handling of this list by not adding an entry which already exists. As this will be needed later split out the list handling into a separate function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30xen/pciback: simplify pcistub device handlingJuergen Gross
The Xen pciback driver maintains a list of all its seized devices. There are two functions searching the list for a specific device with basically the same semantics just returning different structures in case of a match. Split out the search function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30xen: Remove event channel notification through Xen PCI platform deviceKarimAllah Ahmed
Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches from old kernel") using the INTx interrupt from Xen PCI platform device for event channel notification would just lockup the guest during bootup. postcore_initcall now calls xs_reset_watches which will eventually try to read a value from XenStore and will get stuck on read_reply at XenBus forever since the platform driver is not probed yet and its INTx interrupt handler is not registered yet. That means that the guest can not be notified at this moment of any pending event channels and none of the per-event handlers will ever be invoked (including the XenStore one) and the reply will never be picked up by the kernel. The exact stack where things get stuck during xenbus_init: -xenbus_init -xs_init -xs_reset_watches -xenbus_scanf -xenbus_read -xs_single -xs_single -xs_talkv Vector callbacks have always been the favourite event notification mechanism since their introduction in commit 38e20b07efd5 ("x86/xen: event channels delivery on HVM.") and the vector callback feature has always been advertised for quite some time by Xen that's why INTx was broken for several years now without impacting anyone. Luckily this also means that event channel notification through INTx is basically dead-code which can be safely removed without impacting anybody since it has been effectively disabled for more than 4 years with nobody complaining about it (at least as far as I'm aware of). This commit removes event channel notification through Xen PCI platform device. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Julien Grall <julien.grall@citrix.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30xen/events: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24xen: rename xen_pmu_init() in sys-hypervisor.cJuergen Gross
There are two functions with name xen_pmu_init() in the kernel. Rename the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in arch/x86/xen/pmu.c To avoid the same problem in future rename some more functions. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24xenbus: don't look up transaction IDs for ordinary writesJan Beulich
This should really only be done for XS_TRANSACTION_END messages, or else at least some of the xenstore-* tools don't work anymore. Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-27Merge tag 'for-linus-4.8-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.8-rc0: - ACPI support for guests on ARM platforms. - Generic steal time support for arm and x86. - Support cases where kernel cpu is not Xen VCPU number (e.g., if in-guest kexec is used). - Use the system workqueue instead of a custom workqueue in various places" * tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits) xen: add static initialization of steal_clock op to xen_time_ops xen/pvhvm: run xen_vcpu_setup() for the boot CPU xen/evtchn: use xen_vcpu_id mapping xen/events: fifo: use xen_vcpu_id mapping xen/events: use xen_vcpu_id mapping in events_base x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op xen: introduce xen_vcpu_id mapping x86/acpi: store ACPI ids from MADT for future usage x86/xen: update cpuid.h from Xen-4.7 xen/evtchn: add IOCTL_EVTCHN_RESTRICT xen-blkback: really don't leak mode property xen-blkback: constify instance of "struct attribute_group" xen-blkfront: prefer xenbus_scanf() over xenbus_gather() xen-blkback: prefer xenbus_scanf() over xenbus_gather() xen: support runqueue steal time on xen arm/xen: add support for vm_assist hypercall xen: update xen headers xen-pciback: drop superfluous variables xen-pciback: short-circuit read path used for merging write values ...
2016-07-26mm, frontswap: convert frontswap_enabled to static keyVlastimil Babka
I have noticed that frontswap.h first declares "frontswap_enabled" as extern bool variable, and then overrides it with "#define frontswap_enabled (1)" for CONFIG_FRONTSWAP=Y or (0) when disabled. The bool variable isn't actually instantiated anywhere. This all looks like an unfinished attempt to make frontswap_enabled reflect whether a backend is instantiated. But in the current state, all frontswap hooks call unconditionally into frontswap.c just to check if frontswap_ops is non-NULL. This should at least be checked inline, but we can further eliminate the overhead when CONFIG_FRONTSWAP is enabled and no backend registered, using a static key that is initially disabled, and gets enabled only upon first backend registration. Thus, checks for "frontswap_enabled" are replaced with "frontswap_enabled()" wrapping the static key check. There are two exceptions: - xen's selfballoon_process() was testing frontswap_enabled in code guarded by #ifdef CONFIG_FRONTSWAP, which was effectively always true when reachable. The patch just removes this check. Using frontswap_enabled() does not sound correct here, as this can be true even without xen's own backend being registered. - in SYSCALL_DEFINE2(swapon), change the check to IS_ENABLED(CONFIG_FRONTSWAP) as it seems the bitmap allocation cannot currently be postponed until a backend is registered. This means that frontswap will still have some memory overhead by being configured, but without a backend. After the patch, we can expect that some functions in frontswap.c are called only when frontswap_ops is non-NULL. Change the checks there to VM_BUG_ONs. While at it, convert other BUG_ONs to VM_BUG_ONs as frontswap has been stable for some time. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1463152235-9717-1-git-send-email-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26xen: add static initialization of steal_clock op to xen_time_opsJuergen Gross
pv_time_ops might be overwritten with xen_time_ops after the steal_clock operation has been initialized already. To prevent calling a now uninitialized function pointer add the steal_clock static initialization to xen_time_ops. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25xen/evtchn: use xen_vcpu_id mappingVitaly Kuznetsov
Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU id for CPU0. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25xen/events: fifo: use xen_vcpu_id mappingVitaly Kuznetsov
EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of vCPU id should be used. Use the newly introduced xen_vcpu_id mapping to convert it from Linux's id. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25xen/events: use xen_vcpu_id mapping in events_baseVitaly Kuznetsov
EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter and Xen's idea of vCPU id should be used. Use the newly introduced xen_vcpu_id mapping to convert it from Linux's id. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_opVitaly Kuznetsov
HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter while Xen's idea is expected. In some cases these ideas diverge so we need to do remapping. Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr(). Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as they're only being called by PV guests before perpu areas are initialized. While the issue could be solved by switching to early_percpu for xen_vcpu_id I think it's not worth it: PV guests will probably never get to the point where their idea of vCPU id diverges from Xen's. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25xen/evtchn: add IOCTL_EVTCHN_RESTRICTDavid Vrabel
IOCTL_EVTCHN_RESTRICT limits the file descriptor to being able to bind to interdomain event channels from a specific domain. Event channels that are already bound continue to work for sending and receiving notifications. This is useful as part of deprivileging a user space PV backend or device model (QEMU). e.g., Once the device model as bound to the ioreq server event channels it can restrict the file handle so an exploited DM cannot use it to create or bind to arbitrary event channels. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2016-07-08xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7Jan Beulich
As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on both Intel and AMD systems. Doing any kind of hardware capability checks in the driver as a prerequisite was wrong anyway: With the hypervisor being in charge, all such checking should be done by it. If ACPI data gets uploaded despite some missing capability, the hypervisor is free to ignore part or all of that data. Ditch the entire check_prereq() function, and do the only valid check (xen_initial_domain()) in the caller in its place. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-08xenbus: simplify xenbus_dev_request_and_reply()Jan Beulich
No need to retain a local copy of the full request message, only the type is really needed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-08xenbus: don't bail early from xenbus_dev_request_and_reply()Jan Beulich
xenbus_dev_request_and_reply() needs to track whether a transaction is open. For XS_TRANSACTION_START messages it calls transaction_start() and for XS_TRANSACTION_END messages it calls transaction_end(). If sending an XS_TRANSACTION_START message fails or responds with an an error, the transaction is not open and transaction_end() must be called. If sending an XS_TRANSACTION_END message fails, the transaction is still open, but if an error response is returned the transaction is closed. Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart") introduced a regression where failed XS_TRANSACTION_START messages were leaving the transaction open. This can cause problems with suspend (and migration) as all transactions must be closed before suspending. It appears that the problematic change was added accidentally, so just remove it. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-07xenbus: don't BUG() on user mode induced conditionJan Beulich
Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: support runqueue steal time on xenJuergen Gross
Up to now reading the stolen time of a remote cpu was not possible in a performant way under Xen. This made support of runqueue steal time via paravirt_steal_rq_enabled impossible. With the addition of an appropriate hypervisor interface this is now possible, so add the support. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: drop superfluous variablesJan Beulich
req_start is simply an alias of the "offset" function parameter, and req_end is being used just once in each function. (And both variables were loop invariant anyway, so should at least have got initialized outside the loop.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: short-circuit read path used for merging write valuesJan Beulich
There's no point calling xen_pcibk_config_read() here - all it'll do is return whatever conf_space_read() returns for the field which was found here (and which would be found there again). Also there's no point clearing tmp_val before the call. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: use const and unsigned in bar_init()Jan Beulich
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: simplify determination of 64-bit memory resourceJan Beulich
Other than for raw BAR values, flags are properly separated in the internal representation. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: fold read_dev_bar() into its now single callerJan Beulich
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: drop rom_init()Jan Beulich
It is now identical to bar_init(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen-pciback: drop unused function parameter of read_dev_bar()Jan Beulich
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: xenbus: Remove create_workqueueBhaktipriya Shridhar
System workqueues have been able to handle high level of concurrency for a long time now and there's no reason to use dedicated workqueues just to gain concurrency. Replace dedicated xenbus_frontend_wq with the use of system_wq. Unlike a dedicated per-cpu workqueue created with create_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantees unless the target CPU is explicitly specified and the increase of local concurrency shouldn't make any difference. In this case, there is only a single work item, increase of concurrency level by switching to system_wq should not make any difference. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: xen-pciback: Remove create_workqueueBhaktipriya Shridhar
System workqueues have been able to handle high level of concurrency for a long time now and there's no reason to use dedicated workqueues just to gain concurrency. Replace dedicated xen_pcibk_wq with the use of system_wq. Unlike a dedicated per-cpu workqueue created with create_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantees unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Since the work items could be pending, flush_work() has been used in xen_pcibk_disconnect(). xen_pcibk_xenbus_remove() calls free_pdev() which in turn calls xen_pcibk_disconnect() for every pdev to ensure that there is no pending task while disconnecting the driver. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: add steal_clock support on x86Juergen Gross
The pv_time_ops structure contains a function pointer for the "steal_clock" functionality used only by KVM and Xen on ARM. Xen on x86 uses its own mechanism to account for the "stolen" time a thread wasn't able to run due to hypervisor scheduling. Add support in Xen arch independent time handling for this feature by moving it out of the arm arch into drivers/xen and remove the x86 Xen hack. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: use vma_pages().Muhammad Falak R Wani
Replace explicit computation of vma page count by a call to vma_pages(). Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06ARM64: XEN: Add a function to initialize Xen specific UEFI runtime servicesShannon Zhao
When running on Xen hypervisor, runtime services are supported through hypercall. Add a Xen specific function to initialize runtime services. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-06XEN: EFI: Move x86 specific codes to architecture directoryShannon Zhao
Move x86 specific codes to architecture directory and export those EFI runtime service functions. This will be useful for initializing runtime service on ARM later. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-07-06Xen: ARM: Add support for mapping AMBA device mmioShannon Zhao
Add a bus_notifier for AMBA bus device in order to map the device mmio regions when DOM0 booting with ACPI. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>
2016-07-06Xen: ARM: Add support for mapping platform device mmioShannon Zhao
Add a bus_notifier for platform bus device in order to map the device mmio regions when DOM0 booting with ACPI. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com>
2016-07-06Xen: xlate: Use page_to_xen_pfn instead of page_to_pfnShannon Zhao
Make xen_xlate_map_ballooned_pages work with 64K pages. In that case Kernel pages are 64K in size but Xen pages remain 4K in size. Xen pfns refer to 4K pages. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>
2016-07-06xen/grant-table: Move xlated_setup_gnttab_pages to common placeShannon Zhao
Move xlated_setup_gnttab_pages to common place, so it can be reused by ARM to setup grant table. Rename it to xen_xlate_map_ballooned_pages. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>
2016-06-24xen-pciback: return proper values during BAR sizingJan Beulich
Reads following writes with all address bits set to 1 should return all changeable address bits as one, not the BAR size (nor, as was the case for the upper half of 64-bit BARs, the high half of the region's end address). Presumably this didn't cause any problems so far because consumers use the value to calculate the size (usually via val & -val), and do nothing else with it. But also consider the exception here: Unimplemented BARs should always return all zeroes. And finally, the check for whether to return the sizing address on read for the ROM BAR should ignore all non-address bits, not just the ROM Enable one. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-06-23xen/pciback: Fix conf_space read/write overlap check.Andrey Grodzovsky
Current overlap check is evaluating to false a case where a filter field is fully contained (proper subset) of a r/w request. This change applies classical overlap check instead to include all the scenarios. More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver the logic is such that the entire confspace is read and written in 4 byte chunks. In this case as an example, CACHE_LINE_SIZE, LATENCY_TIMER and PCI_BIST are arriving together in one call to xen_pcibk_config_write() with offset == 0xc and size == 4. With the exsisting overlap check the LATENCY_TIMER field (offset == 0xd, length == 1) is fully contained in the write request and hence is excluded from write, which is incorrect. Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>