author: Alex Williamson <alex.williamson@redhat.com> 2015-02-06 17:58:56 (GMT)
committer: Alex Williamson <alex.williamson@redhat.com> 2015-02-06 17:58:56 (GMT)
commit: 6fe1010d6d9c02cf3556ab076585104551a6ee7e
tree: a4067ec65d2adef950cd233db2998c725b0a6905
parent: e36f014edff70fc02b3d3d79cead1d58f289332e
vfio/type1: DMA unmap chunking

When unmapping DMA entries we try to rely on the IOMMU API behavior that allows the IOMMU to unmap a larger area than requested, up to the size of the original mapping. This works great when the IOMMU supports superpages *and* they're in use. Otherwise, each PAGE_SIZE increment is unmapped separately, resulting in poor performance.

Instead we can use the IOVA-to-physical-address translation provided by the IOMMU API and unmap using the largest contiguous physical memory chunk available, which is also how vfio/type1 would have mapped the region. For a synthetic 1TB guest VM mapping and shutdown test on Intel VT-d (2M IOMMU pagesize support), this achieves about a 30% overall improvement mapping standard 4K pages, regardless of IOMMU superpage enabling, and about a 40% improvement mapping 2M hugetlbfs pages when IOMMU superpages are not available. Hugetlbfs with IOMMU superpages enabled is effectively unchanged.

Unfortunately the same algorithm does not work well on IOMMUs with fine-grained superpages, like AMD-Vi, costing about 25% extra since the IOMMU will automatically unmap any power-of-two contiguous mapping we've provided it. We add a routine and a domain flag to detect this feature, leaving AMD-Vi unaffected by this unmap optimization.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
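
The chunking idea described in the message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from the patch: `toy_domain`, `toy_iova_to_phys`, `unmap_chunked` and `unmap_per_page` are hypothetical names, the IOVA-to-physical translation is a plain lookup table standing in for the IOMMU API, and an unmap is merely counted rather than performed. The fine-grained superpage detection for AMD-Vi is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical toy domain: page i of the IOVA range starting at
 * iova_base is backed by physical address phys_of_page[i]. */
typedef struct {
    const uint64_t *phys_of_page;
    size_t npages;
    uint64_t iova_base;
} toy_domain;

/* Stand-in for the IOMMU API's IOVA-to-physical translation. */
static uint64_t toy_iova_to_phys(const toy_domain *d, uint64_t iova)
{
    size_t idx = (size_t)((iova - d->iova_base) / PAGE_SIZE);

    assert(idx < d->npages);
    return d->phys_of_page[idx];
}

/* Walk [iova, iova + size) and issue one (simulated) unmap per
 * physically contiguous chunk. Returns the number of unmap calls. */
static size_t unmap_chunked(const toy_domain *d, uint64_t iova, uint64_t size)
{
    uint64_t end = iova + size;
    size_t calls = 0;

    assert(size % PAGE_SIZE == 0);

    while (iova < end) {
        uint64_t phys = toy_iova_to_phys(d, iova);
        uint64_t len = PAGE_SIZE;

        /* Grow the chunk while the next page follows on physically. */
        while (iova + len < end &&
               toy_iova_to_phys(d, iova + len) == phys + len)
            len += PAGE_SIZE;

        /* A real implementation would unmap [iova, iova + len) here. */
        calls++;
        iova += len;
    }
    return calls;
}

/* Baseline: one unmap call per PAGE_SIZE increment. */
static size_t unmap_per_page(uint64_t size)
{
    return (size_t)(size / PAGE_SIZE);
}
```

In this model, a six-page range backed by three physically contiguous runs needs three unmap calls with chunking versus six when each PAGE_SIZE increment is unmapped separately.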