linux - NXP/Freescale LSDK linux tree with Scalys patches

Age	Commit message (Collapse)	Author
2016-01-28	drm/i915: Fix VCS ring selection after uapi decoupling	Tvrtko Ursulin
	This got broken in: commit de1add360522c876c25ef2bbbbab1c94bdb509ab Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Fri Jan 15 15:12:50 2016 +0000 drm/i915: Decouple execbuf uAPI from internal implementation BSD ring flags need to be shifted before they can be considered indices into the ring array. Reported by Zhipeng Gong. v2: Simplify the code. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453902069-31353-1-git-send-email-tvrtko.ursulin@linux.intel.com Testcase: igt/gem_exec_basic # bdw-gt3
2016-01-21	drm/i915: Seal busy-ioctl uABI and prevent leaking of internal ids	Chris Wilson
	Tvrtko was looking through the execbuffer-ioctl and noticed that the uABI was tightly coupled to our internal engine identifiers. Close inspection also revealed that we leak those internal engine identifiers through the busy-ioctl, and those internal identifiers already do not match the user identifiers. Fortuitiously, there is only one user of the set of busy rings from the busy-ioctl, and they only wish to choose between the RENDER and the BLT engines. Let's fix the userspace ABI while we still can. v2: Update the uAPI documentation to explain the identifiers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Testcase: igt/gem_busy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1452876706-21620-1-git-send-email-chris@chris-wilson.co.uk
2015-12-23	Merge tag 'drm-intel-next-2015-12-18' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm-intel into drm-next - fix atomic watermark recomputation logic (Maarten) - modeset sequence fixes for LPT (Ville) - more kbl enabling&prep work (Rodrigo, Wayne) - first bits for mst audio - page dirty tracking fixes from Dave Gordon - new get_eld hook from Takashi, also included in the sound tree - fixup cursor handling when placed at address 0 (Ville) - refactor VBT parsing code (Jani) - rpm wakelock debug infrastructure ( Imre) - fbdev is pinned again (Chris) - tune the busywait logic to avoid wasting cpu cycles (Chris) * tag 'drm-intel-next-2015-12-18' of git://anongit.freedesktop.org/drm-intel: (81 commits) drm/i915: Update DRIVER_DATE to 20151218 drm/i915/skl: Default to noncoherent access up to F0 drm/i915: Only spin whilst waiting on the current request drm/i915: Limit the busy wait on requests to 5us not 10ms! drm/i915: Break busywaiting for requests on pending signals drm/i915: don't enable autosuspend on platforms without RPM support drm/i915/backlight: prefer dev_priv over dev pointer drm/i915: Disable primary plane if we fail to reconstruct BIOS fb (v2) drm/i915: Pin the ifbdev for the info->system_base GGTT mmapping drm/i915: Set the map-and-fenceable flag for preallocated objects drm/i915: mdelay(10) considered harmful drm/i915: check that we are in an RPM atomic section in GGTT PTE updaters drm/i915: add support for checking RPM atomic sections drm/i915: check that we hold an RPM wakelock ref before we put it drm/i915: add support for checking if we hold an RPM reference drm/i915: use assert_rpm_wakelock_held instead of opencoding it drm/i915: add assert_rpm_wakelock_held helper drm/i915: remove HAS_RUNTIME_PM check from RPM get/put/assert helpers drm/i915: get a permanent RPM reference on platforms w/o RPM support drm/i915: refactor RPM disabling due to RC6 being disabled ...
2015-12-11	Merge branch 'drm-header-fixes' of https://github.com/GabrielL/linux into ↵	Dave Airlie
	drm-next Fix all the problems with the header files and userspace builds off them. I really care so little about this, but hey who am I to stop progress. * 'drm-header-fixes' of https://github.com/GabrielL/linux: (30 commits) drm: fix inclusion of drm.h in via_drm.h drm: fix inclusion of drm.h in vmwgfx_drm.h drm: fix inclusion of drm.h in virtgpu_drm.h drm: fix inclusion of drm.h in tegra_drm.h drm: fix inclusion of drm.h in savage_drm.h drm: fix inclusion of drm.h in r128_drm.h drm: fix inclusion of drm.h in qxl_drm.h drm: fix inclusion of drm.h in omap_drm.h drm: fix inclusion of drm.h in msm_drm.h drm: fix inclusion of drm.h in mga_drm.h drm: fix inclusion of drm.h in exynos_sarea.h drm: fix inclusion of drm.h in i810_drm.h drm: fix inclusion of drm.h in exynos_sarea.h drm: fix inclusion of drm.h in drm_sarea.h drm: drm_mode.h fix includes drm: drm_fourcc.h fix includes drm: include drm.h in armada_drm.h include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from <linux/types.h> drm: Kbuild: add admgpu_drm.h to the installed headers drm: use __u{32,64} instead of uint{32,64}_t in virtgpu_drm.h ...
2015-12-10	drm: fix inclusion of drm.h in exynos_sarea.h	Gabriel Laskar
	Using `#include "drm.h"` instead of `#include <drm/drm.h>` allow drm headers to be moved in another directory without changes, like for the libdrm imports. Signed-off-by: Gabriel Laskar <gabriel@lse.epita.fr> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Mikko Rapeli <mikko.rapeli@iki.fi>
2015-12-09	drm/i915: Add soft-pinning API for execbuffer	Chris Wilson
	Userspace can pass in an offset that it presumes the object is located at. The kernel will then do its utmost to fit the object into that location. The assumption is that userspace is handling its own object locations (for example along with full-ppgtt) and that the kernel will rarely have to make space for the user's requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris Wilson. Fixed incorrect error paths causing crash found by Michal Winiarski. (Not published externally) v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed eviction to allow eviction of soft-pinned objects when another soft-pinned object used by a subsequent execbuffer overlaps reported by Michal Winiarski. (Not published externally) v4: Moved soft-pinned objects to the front of ordered_vmas so that they are pinned first after an address conflict happens to avoid repeated conflicts in rare cases (Suggested by Chris Wilson). Expanded comment on drm_i915_gem_exec_object2.offset to cover this new API. v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability (Kristian). Added check for multiple pinnings on eviction (Akash). Made sure buffers are not considered misplaced without the user specifying EXEC_OBJECT_SUPPORTS_48B_ADDRESS. User must assume responsibility for any addressing workarounds. Updated object2.offset field comment again to clarify NO_RELOC case (Chris). checkpatch cleanup. v6: Trivial rebase on latest drm-intel-nightly v7: Catch attempts to pin above the max virtual address size and return EINVAL (Tvrtko). Decouple EXEC_OBJECT_SUPPORTS_48B_ADDRESS and EXEC_OBJECT_PINNED flags, user must pass both flags in any attempt to pin something at an offset above 4GB (Chris, Daniel Vetter). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Zou Nanhai <nanhai.zou@intel.com> Cc: Kristian Høgsberg <hoegsberg@gmail.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Acked-by: PDT Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1449575707-20933-1-git-send-email-thomas.daniel@intel.com
2015-11-18	drm/i915: Make the high dword offset more explicit in i915_reg_read_ioctl	Ville Syrjälä
	Store the upper dword of the register offset in the whitelist as well. This would allow it to read register where the two halves aren't sitting right next to each other, and it'll make it easier to make register access type safe. While at it change the register offsets to u32 from u64. Our register space isn't quite that big, yet :) v2: Use ldw/udw as the suffixes, and add a note about 64bit wide split regs (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446839021-18599-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-19	drm/i915: Report context GTT size	Chris Wilson
	Since the beginning we have conflated the size of the global GTT with that of the per-process context sizes. In recent times (gen8+), those are no longer the same where the global GTT is limited to 2/4GiB but the per-process GTT may be anything up to 256TiB. Userspace knows nothing of this discrepancy and outside of one or two hacks, uses the getaperture ioctl to determine the maximum size it can use. Let's leave that as reporting the global GTT and use the context reporting method to describe the per-process value (which naturally fallsback to reporting the aliasing or global on older platforms, so userspace can always use this method where available). Testcase: igt/gem_userptr_blits/minor-normal-sync Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90065 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-01	drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset	Michel Thierry
	There are some allocations that must be only referenced by 32-bit offsets. To limit the chances of having the first 4GB already full, objects not requiring this workaround use DRM_MM_SEARCH_BELOW/ DRM_MM_CREATE_TOP flags In specific, any resource used with flat/heapless (0x00000000-0xfffff000) General State Heap (GSH) or Instruction State Heap (ISH) must be in a 32-bit range, because the General State Offset and Instruction State Offset are limited to 32-bits. Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if they can be allocated above the 32-bit address range. To limit the chances of having the first 4GB already full, objects will use DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible. The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here: http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html v2: Changed flag logic from neeeds_32b, to supports_48b. v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel) v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK to use last PIN_ defined instead of hard-coded value; use correct limit check in eb_vma_misplaced. (Chris) v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris) v6: Apply pin-high for ggtt too (Chris) v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash) Fix check for entries currently using +4GB addresses, use min_t and other polish in object_bind_to_vm (Chris) v8: Commit message updated to point to libdrm patch. v9: vmas are allocated in the correct ozone, so only check flag when the vma has not been allocated. (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4) Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-09-02	uapi/drm/i915_drm.h: fix userspace compilation.	Artem Savkov
	commit 346add7834557b5b9628b9bf2387106d42e631d4 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jul 14 18:07:30 2015 +0200 drm/i915: Use expcitly fixed type in compat32 structs changed the type of param field in drm_i915_getparam from int to s32. This header is exported to userspace and needs to use userspace type __s32 instead. This fixes userspace compilation errors like the following: include/drm/i915_drm.h:361:2: error: unknown type name 's32' s32 param; Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-08-17	Merge tag 'v4.2-rc7' into drm-next	Dave Airlie
	Linux 4.2-rc7 Backmerge master for i915 fixes
2015-07-21	drm/i915: Use two 32bit reads for select 64bit REG_READ ioctls	Chris Wilson
	Since the hardware sometimes mysteriously totally flummoxes the 64bit read of a 64bit register when read using a single instruction, split the read into two instructions. Since the read here is of automatically incrementing timestamp counters, we also have to be very careful in order to make sure that it does not increment between the two instructions. However, since userspace tried to workaround this issue and so enshrined this ABI for a broken hardware read and in the process neglected that the read only fails in some environments, we have to introduce a new uABI flag for userspace to request the 2x32 bit accurate read of the timestamp. v2: Fix alignment check and include details of the workaround for userspace. Reported-by: Karol Herbst <freedesktop@karolherbst.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91317 Testcase: igt/gem_reg_read Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Tested-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-15	drm/i915: Use expcitly fixed type in compat32 structs	Daniel Vetter
	I was confused shortly whether the compat was needed for the int, until I noticed the pointer in the original. Also remove typedef. v2: Review from Chris. - Add comments. - Also change the int param in the original structure. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06	drm/i915: Expose I915_EXEC_RESOURCE_STREAMER flag and getparam	Abdiel Janulgue
	Ensures that the batch buffer is executed by the resource streamer. And will let userspace know whether Resource Streamer is supported in the kernel. v2: Don't skip 1<<15 for the exec flags (Jani Nikula) v3: Use HAS_RESOURCE_STREAMER macro for execbuf validation (Chris Wilson) (from getparam patch) v2: Update I915_PARAM_HAS_RESOURCE_STREAMER so it's after I915_PARAM_HAS_GPU_RESET. v3: Only advertise RS support for hardware that supports it. v4: Add HAS_RESOURCE_STREAMER() macro (Chris) Testcase: igt/gem_exec_params Cc: Jani Nikula <jani.nikula@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> [danvet: squash in getparam patch since it'd break bisect, suggested by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-15	drm/i915: Report to userspace if we have a (presumed) working GPU reset	Chris Wilson
	In igt, we want to test handling of GPU hangs, both for recovery purposes and for reporting. However, we don't want to inject a genuine GPU hang onto a machine that cannot recover and so be permenantly wedged. Rather than embed heuristics into igt, have the kernel report exactly when it expects the GPU reset to work. This can also be usefully extended in future to indicate different levels of fine-grained resets. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tim Gore <tim.gore@intel.com> Cc: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-29	drm/i915: add a context parameter to {en, dis}able zero address mapping	David Weinehall
	Export a new context parameter that can be set/queried through the context_{get,set}param ioctls. This parameter is passed as a context flag and decides whether or not a GPU address mapping is allowed to be made at address zero. The default is to allow such mappings. Signed-off-by: David Weinehall <david.weinehall@intel.com> Acked-by: "Zou, Nanhai" <nanhai.zou@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-26	drm/i915: Fix the confusing comment about the ioctl limits	Damien Lespiau
	It was reported that this comment was confusing, and indeed it is. v2: (one year later!) Add the range for the DRM_I915_* iotcl defines (Daniel) Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10	drm/i915: Allow disabling the destination colorkey for overlay	Chris Wilson
	Sometimes userspace wants a true overlay that is never clipped. In such cases, we need to disable the destination colorkey. However, it is currently unconditionally enabled in the overlay with no means of disabling. So rectify that by always default to on, and extending the UPDATE_ATTR ioctl to support explicit disabling of the colorkey. This is contrast to the spite code which requires explicit enabling of either the destination or source colorkey. Handling source colorkey is still todo for the overlay. (Of course it may be worth migrating overlay to sprite before then.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-27	drm/i915: fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl	Tommi Rantala
	Fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl, so that it is different from the DRM_IOCTL_I915_SET_SPRITE_COLORKEY ioctl. Note that this is just for accuracy, the ioctl implementation itself is totally unused and already ripped out. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> [danvet: Add note that this is a dead ioctl.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-17	drm/i915: Export total subslice and EU counts	Jeff McGee
	Setup new I915_GETPARAM ioctl entries for subslice total and EU total. Userspace drivers need these values when constructing GPGPU commands. This kernel query method is intended to replace the PCI ID-based tables that userspace drivers currently maintain. The kernel driver can employ fuse register reads as needed to ensure the most accurate determination of GT config attributes. This first became important with Cherryview in which the config could differ between devices with the same PCI ID. The kernel detection of these values is device-specific and not included in this patch. Because zero is not a valid value for any of these parameters, a value of zero is interpreted as unknown for the device. Userspace drivers should continue to maintain ID-based tables for older devices not supported by the new query method. v2: Increment our I915_GETPARAM indices to fit after REVISION which was merged ahead of us. For: VIZ-4636 Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com> Acked-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-17	drm/i915: Add I915_PARAM_REVISION	Neil Roberts
	Adds a parameter which can be used with DRM_I915_GETPARAM to query the GPU revision. The intention is to use this in Mesa to implement the WaDisableSIMD16On3SrcInstr workaround on Skylake but only for revision 2. Signed-off-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27	drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam	Zhipeng Gong
	This will let userland only try to use the new ring when the appropriate kernel is present v2: change the number to be consistent with upstream (Zhipeng) Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed--by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27	drm/i915: Specify bsd rings through exec flag	Zhipeng Gong
	On Skylake GT3 we have 2 Video Command Streamers (VCS), which is asymmetrical. For example, HEVC GPU commands can be only dispatched to VCS1 ring. But userspace has no control when using VCS1 or VCS2. This patch introduces a mechanism to avoid the default ping-pong mode and use one specific ring through execution flag. This mechanism is usable for all the platforms with 2 VCS rings. The open source usage is from these two commits in vaapi/intel: commit 702050f04131a44ef8ac16651708ce8a8d98e4b8 Author: Zhao, Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:19 2014 +0800 Allow the batchbuffer to be submitted with override flag commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8 Author: Zhao Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:22 2014 +0800 Add the override flag to assure that HEVC video command always uses BSD ring0 for SKL GT3 machine v2: fix whitespace (Rodrigo) v3: remove incorrect chunk that came on -collector rebase. (Rodrigo) v4: change the comment (Zhipeng) v5: address Daniel's comment (Zhipeng) Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-07	drm/i915: Add ioctl to set per-context parameters	Chris Wilson
	Sometimes we wish to tweak how an individual context behaves. Since we always create a context for every filp, this means that individual processes can fine tune their behaviour even if they do not explicitly create a context. The first example parameter here is to enable multi-process GPU testing, but the interface should be able to cope with passing arbitrarily complex parameters. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Testcase: igt/gem_reset_stats/ban-period-* Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-06	drm/i915: Support creation of unbound wc user mappings for objects	Akash Goel
	This patch provides support to create write-combining virtual mappings of GEM object. It intends to provide the same funtionality of 'mmap_gtt' interface without the constraints and contention of a limited aperture space, but requires clients handles the linear to tile conversion on their own. This is for improving the CPU write operation performance, as with such mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache flush after update from CPU side, when object is passed onto GPU. This type of mapping is specially useful in case of sub-region update, i.e. when only a portion of the object is to be updated. Using a CPU mmap in such cases would normally incur a clflush of the whole object, and using a GTT mmapping would likely require eviction of an active object or fence and thus stall. The write-combining CPU mmap avoids both. To ensure the cache coherency, before using this mapping, the GTT domain has been reused here. This provides the required cache flush if the object is in CPU domain or synchronization against the concurrent rendering. Although the access through an uncached mmap should automatically invalidate the cache lines, this may not be true for non-temporal write instructions and also not all pages of the object may be updated at any given point of time through this mapping. Having a call to get_pages in set_to_gtt_domain function, as added in the earlier patch 'drm/i915: Broaden application of set-domain(GTT)', would guarantee the clflush and so there will be no cachelines holding the data for the object before it is accessed through this map. The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been extended with a new flags field (defaulting to 0 for existent users). In order for userspace to detect the extended ioctl, a new parameter I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface. v2: Fix error handling, invalid flag detection, renaming (ickle) v3: Rebase to latest drm-intel-nightly codebase The new mmapping is exercised by igt/gem_mmap_wc, igt/gem_concurrent_blit and igt/gem_gtt_speed. Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14	drm/i915: Make the physical object coherent with GTT	Chris Wilson
	Currently objects for which the hardware needs a contiguous physical address are allocated a shadow backing storage to satisfy the contraint. This shadow buffer is not wired into the normal obj->pages and so the physical object is incoherent with accesses via the GPU, GTT and CPU. By setting up the appropriate scatter-gather table, we can allow userspace to access the physical object via either a GTT mmaping of or by rendering into the GEM bo. However, keeping the CPU mmap of the shmemfs backing storage coherent with the contiguous shadow is not yet possible. Fortuituously, CPU mmaps of objects requiring physical addresses are not expected to be coherent anyway. This allows the physical constraint of the GEM object to be transparent to userspace and allow it to efficiently render into or update them via the GTT and GPU. v2: Fix leak of pci handle spotted by Ville v3: Remove the now duplicate call to detach_phys_object during free. v4: Wait for rendering before pwrite. As this patch makes it possible to render into the phys object, we should make it correct as well! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-07	drm/i915: Report the actual swizzling back to userspace	Chris Wilson
	Userspace cares about whether or not swizzling depends on the page address for its direct access into bound objects. Extend the get_tiling ioctl to report the physical swizzling value in addition to the logical swizzling value so that userspace can accurately determine when it is possible for manual detiling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Testcase: igt/gem_tiled_wc Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-16	drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl	Chris Wilson
	By exporting the ability to map user address and inserting PTEs representing their backing pages into the GTT, we can exploit UMA in order to utilize normal application data as a texture source or even as a render target (depending upon the capabilities of the chipset). This has a number of uses, with zero-copy downloads to the GPU and efficient readback making the intermixed streaming of CPU and GPU operations fairly efficient. This ability has many widespread implications from faster rendering of client-side software rasterisers (chromium), mitigation of stalls due to read back (firefox) and to faster pipelining of texture data (such as pixel buffer objects in GL or data blobs in CL). v2: Compile with CONFIG_MMU_NOTIFIER v3: We can sleep while performing invalidate-range, which we can utilise to drop our page references prior to the kernel manipulating the vma (for either discard or cloning) and so protect normal users. v4: Only run the invalidate notifier if the range intercepts the bo. v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers v6: Recheck after reacquire mutex for lost mmu. v7: Fix implicit padding of ioctl struct by rounding to next 64bit boundary. v8: Fix rebasing error after forwarding porting the back port. v9: Limit the userptr to page aligned entries. We now expect userspace to handle all the offset-in-page adjustments itself. v10: Prevent vma from being copied across fork to avoid issues with cow. v11: Drop vma behaviour changes -- locking is nigh on impossible. Use a worker to load user pages to avoid lock inversions. v12: Use get_task_mm()/mmput() for correct refcounting of mm. v13: Use a worker to release the mmu_notifier to avoid lock inversion v14: Decouple mmu_notifier from struct_mutex using a custom mmu_notifer with its own locking and tree of objects for each mm/mmu_notifier. v15: Prevent overlapping userptr objects, and invalidate all objects within the mmu_notifier range v16: Fix a typo for iterating over multiple objects in the range and rearrange error path to destroy the mmu_notifier locklessly. Also close a race between invalidate_range and the get_pages_worker. v17: Close a race between get_pages_worker/invalidate_range and fresh allocations of the same userptr range - and notice that struct_mutex was presumed to be held when during creation it wasn't. v18: Sigh. Fix the refactor of st_set_pages() to allocate enough memory for the struct sg_table and to clear it before reporting an error. v19: Always error out on read-only userptr requests as we don't have the hardware infrastructure to support them at the moment. v20: Refuse to implement read-only support until we have the required infrastructure - but reserve the bit in flags for future use. v21: use_mm() is not required for get_user_pages(). It is only meant to be used to fix up the kernel thread's current->mm for use with copy_user(). v22: Use sg_alloc_table_from_pages for that chunky feeling v23: Export a function for sanity checking dma-buf rather than encode userptr details elsewhere, and clean up comments based on suggestions by Bradley. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com> Cc: Akash Goel <akash.goel@intel.com> Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Frob ioctl allocation to pick the next one - will cause a bit of fuss with create2 apparently, but such are the rules.] [danvet2: oops, forgot to git add after manual patch application] [danvet3: Appease sparse.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-04-01	drm/i915: Add a CMD_PARSER_VERSION getparam	Brad Volkin
	So userspace can query the kernel for command parser support. v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc OTC-Tracker: AXIA-4631 Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-22	drm/i915: Spelling s/auxilliary/auxiliary/	Geert Uytterhoeven
	Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-12	drm/i915: add i915_get_reset_stats_ioctl	Mika Kuoppala
	This ioctl returns reset stats for specified context. The struct returned contains context loss counters. reset_count: all resets across all contexts batch_active: active batches lost on resets batch_pending: pending batches lost on resets v2: get rid of state tracking completely and deliver only counts. Idea from Chris Wilson. v3: fix commit message v4: default context handled inside i915_gem_context_get_hang_stats v5: reset_count only for priviledged process v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson) v7: context hang stats never returns NULL v8: rebased on top of reworked context hang stats DRM_RENDER_ALLOW for ioctl v9: use DEFAULT_CONTEXT_ID. Improve comments for ioctl struct members Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ian Romanick <idr@freedesktop.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-19	drm/i915: Add second slice l3 remapping	Ben Widawsky
	Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22	drm/i915: Use Write-Through cacheing for the display plane on Iris	Chris Wilson
	Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. However, we still need to be careful as the CPU does not see the WT when accessing the cache. In particular, this means that we need to flush the cache lines after writing to an object through the CPU, and on transitioning from a cached state to WT. v2: Actually do the clflush on transition to WT, nagging by Ville. v3: Flush the CPU cache after writes into WT objects. v4: Rease onto LLC updates and report WT as "uncached" for get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22	drm/i915: reserve I915_CACHING_DISPLAY and document cache modes	Daniel Vetter
	Resolve the catch-22 of igt needing a stable number and patches first needing testcases by reserving the interface number up-front. v2: Improve the spelling a bit. v3: More spelling fail spotted by Chris. Requested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-19	drm/i915: Make i915 events part of uapi	Ben Widawsky
	Make the uevent strings part of the user API for people who wish to write their own listeners. v2: Make a space in the string concatenation. (Chad) Use the "UEVENT" suffix intead of "EVENT" (Chad) Make kernel-doc parseable Docbook comments (Daniel) v3: Undid reset change introduced in last submission (Daniel) Fixed up comments to address removal changes. Thanks to Daniel Vetter for a majority of the parity error comments. CC: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-31	drm/i915: add I915_PARAM_HAS_VEBOX to i915_getparam	Xiang, Haihao
	This will let userland only try to use the new ring when the appropriate kernel is present Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-31	drm/i915: add I915_EXEC_VEBOX to i915_gem_do_execbuffer()	Xiang, Haihao
	A user can run batchbuffer via VEBOX ring. Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17	drm/i915: Use the reloc.handle as an index into the execbuffer array	Chris Wilson
	Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 623000.0/sec. i3-330m: 748000.0/sec to 789000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17	drm/i915: Allow userspace to hint that the relocations were known	Daniel Vetter
	Userspace is able to hint to the kernel that its command stream and auxiliary state buffers already hold the correct presumed addresses and so the relocation process may be skipped if the kernel does not need to move any buffers in preparation for the execbuffer. Thus for the common case where the allotment of buffers is static between batches, we can avoid the overhead of individually checking the relocation entries. Note that this requires userspace to supply the domain tracking and requests for workarounds itself that would otherwise be computed based upon the relocation entries. Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 632000.0/sec. i3-330m: 748000.0/sec to 830000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Fixup merge conflict in userspace header due to different baseline trees.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-17	drm/i915: Implement workaround for broken CS tlb on i830/845	Daniel Vetter
	Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-10-22	Merge tag 'v3.7-rc2' into drm-intel-next-queued	Daniel Vetter
	Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-10-04	UAPI: (Scripted) Disintegrate include/drm	David Howells
	Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>