summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2017-06-14drm: Fix oops + Xserver hang when unplugging USB drm devicesHans de Goede
commit 75fb636324a839c2c31be9f81644034c6142e469 upstream. commit a39be606f99d ("drm: Do a full device unregister when unplugging") causes backtraces like this one when unplugging an usb drm device while it is in use: usb 2-3: USB disconnect, device number 25 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:424 drm_mode_config_cleanup+0x220/0x280 [drm] ... RIP: 0010:drm_mode_config_cleanup+0x220/0x280 [drm] ... Call Trace: gm12u320_modeset_cleanup+0xe/0x10 [gm12u320] gm12u320_driver_unload+0x35/0x70 [gm12u320] drm_dev_unregister+0x3c/0xe0 [drm] drm_unplug_dev+0x12/0x60 [drm] gm12u320_usb_disconnect+0x36/0x40 [gm12u320] usb_unbind_interface+0x72/0x280 device_release_driver_internal+0x158/0x210 device_release_driver+0x12/0x20 bus_remove_device+0x104/0x180 device_del+0x1d2/0x350 usb_disable_device+0x9f/0x270 usb_disconnect+0xc6/0x260 ... [drm:drm_mode_config_cleanup [drm]] *ERROR* connector Unknown-1 leaked! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:458 drm_mode_config_cleanup+0x268/0x280 [drm] ... <same Call Trace> ---[ end trace 80df975dae439ed6 ]--- general protection fault: 0000 [#1] SMP ... Call Trace: ? __switch_to+0x225/0x450 drm_mode_rmfb_work_fn+0x55/0x70 [drm] process_one_work+0x193/0x3c0 worker_thread+0x4a/0x3a0 ... RIP: drm_framebuffer_remove+0x62/0x3f0 [drm] RSP: ffffb776c39dfd98 ---[ end trace 80df975dae439ed7 ]--- After which the system is unusable this is caused by drm_dev_unregister getting called immediately on unplug, which calls the drivers unload function which calls drm_mode_config_cleanup which removes the framebuffer object while userspace is still holding a reference to it. Reverting commit a39be606f99d ("drm: Do a full device unregister when unplugging") leads to the following oops on unplug instead, when userspace closes the last fd referencing the drm_dev: sysfs group 'power' not found for kobject 'card1-Unknown-1' ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2459 at fs/sysfs/group.c:237 sysfs_remove_group+0x80/0x90 ... RIP: 0010:sysfs_remove_group+0x80/0x90 ... Call Trace: dpm_sysfs_remove+0x57/0x60 device_del+0xfd/0x350 device_unregister+0x1a/0x60 drm_sysfs_connector_remove+0x39/0x50 [drm] drm_connector_unregister+0x5a/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbe ]--- BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: down_write+0x1f/0x40 ... Call Trace: debugfs_remove_recursive+0x55/0x1b0 drm_debugfs_connector_remove+0x21/0x40 [drm] drm_connector_unregister+0x62/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbf ]--- This is caused by the revert moving back to drm_unplug_dev calling drm_minor_unregister which does: device_del(minor->kdev); dev_set_drvdata(minor->kdev, NULL); /* safety belt */ drm_debugfs_cleanup(minor); Causing the sysfs entries to already be removed even though we still have references to them in e.g. drm_connector. Note we must call drm_minor_unregister to notify userspace of the unplug of the device, so calling drm_dev_unregister is not completely wrong the problem is that drm_dev_unregister does too much. This commit fixes drm_unplug_dev by not only reverting commit a39be606f99d ("drm: Do a full device unregister when unplugging") but by also adding a call to drm_modeset_unregister_all before the drm_minor_unregister calls to make sure all sysfs entries are removed before calling device_del(minor->kdev) thereby also fixing the second set of oopses caused by just reverting the commit. Fixes: a39be606f99d ("drm: Do a full device unregister when unplugging") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeffy <jeffy.chen@rock-chips.com> Cc: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170601115430.4113-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)Alex Deucher
commit 0a646f331db0eb9efc8d3a95a44872036d441d58 upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07drm/gma500/psb: Actually use VBT mode when it is foundPatrik Jakobsson
commit 82bc9a42cf854fdf63155759c0aa790bd1f361b0 upstream. With LVDS we were incorrectly picking the pre-programmed mode instead of the prefered mode provided by VBT. Make sure we pick the VBT mode if one is provided. It is likely that the mode read-out code is still wrong but this patch fixes the immediate problem on most machines. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562 Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctlMichel Dänzer
commit 51964e9e12d0a054002a1a0d1dec4f661c7aaf28 upstream. vram_size is supposed to be the total amount of VRAM that can be used by userspace, which corresponds to the TTM VRAM manager size (which is normally the full amount of VRAM, but can be just the visible VRAM when DMA can't be used for BO migration for some reason). The above was incorrectly used for vram_visible before, resulting in generally too large values being reported. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07drm/radeon: Unbreak HPD handling for r600+Lyude
commit 3d18e33735a02b1a90aecf14410bf3edbfd4d3dc upstream. We end up reading the interrupt register for HPD5, and then writing it to HPD6 which on systems without anything using HPD5 results in permanently disabling hotplug on one of the display outputs after the first time we acknowledge a hotplug interrupt from the GPU. This code is really bad. But for now, let's just fix this. I will hopefully have a large patch series to refactor all of this soon. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude <lyude@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07drm/radeon/ci: disable mclk switching for high refresh rates (v2)Alex Deucher
commit 58d7e3e427db1bd68f33025519a9468140280a75 upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/i915/gvt: Disable access to stolen memory as a guestChris Wilson
commit 04a68a35ce6d7b54749989f943993020f48fed62 upstream. Explicitly disable stolen memory when running as a guest in a virtual machine, since the memory is not mediated between clients and reserved entirely for the host. The actual size should be reported as zero, but like every other quirk we want to tell the user what is happening. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.uk Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2Mario Kleiner
commit e345da82bd6bdfa8492f80b3ce4370acfd868d95 upstream. The builtin eDP panel in the HP zBook 17 G2 supports 10 bpc, as advertised by the Laptops product specs and verified via injecting a fixed edid + photometer measurements, but edid reports unknown depth, so drivers fall back to 6 bpc. Add a quirk to get the full 10 bpc. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1492787108-23959-1-git-send-email-mario.kleiner.de@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: handle races with hw when updating the next alarm timeBen Skeggs
commit 1b0f84380b10ee97f7d2dd191294de9017e94d1d upstream. If the time to the next alarm is short enough, we could race with HW and end up with an ~4 second delay until it triggers. Fix this by checking again after we update HW. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: avoid processing completed alarms when adding a new oneBen Skeggs
commit 330bdf62fe6a6c5b99a647f7bf7157107c9348b3 upstream. The idea here was to avoid having to "manually" program the HW if there's a new earliest alarm. This was lazy and bad, as it leads to loads of fun races between inter-related callers (ie. therm). Turns out, it's not so difficult after all. Go figure ;) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarmBen Skeggs
commit 9fc64667ee48c9a25e7dca1a6bcb6906fec5bcc5 upstream. At least therm/fantog "attempts" to work around this issue, which could lead to corruption of the pending alarm list. Fix it properly by not updating the timestamp without the lock held, or trying to add an already pending alarm to the pending alarm list.... Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: ack interrupt before processing alarmsBen Skeggs
commit 3733bd8b407211739e72d051e5f30ad82a52c4bc upstream. Fixes a race where we can miss an alarm that triggers while we're already processing previous alarms. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/therm: remove ineffective workarounds for alarm bugsBen Skeggs
commit e4311ee51d1e2676001b2d8fcefd92bdd79aad85 upstream. These were ineffective due to touching the list without the alarm lock, but should no longer be required. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path.Mario Kleiner
commit effaf848b957fbf72a3b6a1ad87f5e031eda0b75 upstream. This apparently got lost when implementing the new DCE-6 support and would cause failures in pageflip scheduling and timestamping. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations.Mario Kleiner
commit e190ed1ea7458e446230de4113cc5d53b8dc4ec8 upstream. At dot clocks > approx. 250 Mhz, some of these calcs will overflow and cause miscalculation of latency watermarks, and for some overflows also divide-by-zero driver crash ("divide error: 0000 [#1] PREEMPT SMP" in "dce_v10_0_latency_watermark+0x12d/0x190"). This zero-divide happened, e.g., on AMD Tonga Pro under DCE-10, on a Displayport panel when trying to set a video mode of 2560x1440 at 165 Hz vrefresh with a dot clock of 635.540 Mhz. Refine calculations to avoid the overflows. Tested for DCE-10 with R9 380 Tonga + ASUS ROG PG279 panel. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/amdgpu: Make display watermark calculations more accurateMario Kleiner
commit d63c277dc672e0c568481af043359420fa9d4736 upstream. Avoid big roundoff errors in scanline/hactive durations for high pixel clocks, especially for >= 500 Mhz, and thereby program more accurate display fifo watermarks. Implemented here for DCE 6,8,10,11. Successfully tested on DCE 10 with AMD R9 380 Tonga. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14drm/ttm: fix use-after-free races in vm fault handlingNicolai Hähnle
commit 3089c1df10e2931b1d72d2ffa7d86431084c86b3 upstream. The vm fault handler relies on the fact that the VMA owns a reference to the BO. However, once mmap_sem is released, other tasks are free to destroy the VMA, which can lead to the BO being freed. Fix two code paths where that can happen, both related to vm fault retries. Found via a lock debugging warning which flagged &bo->wu_mutex as locked while being destroyed. Fixes: cbe12e74ee4e ("drm/ttm: Allow vm fault retries") Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14drm/sti: fix GDP size to support up to UHD resolutionVincent Abriou
commit 2f410f88c0a1c7e19762918d2614f506a7b63a82 upstream. On stih407-410 chip family the GDP layers are able to support up to UHD resolution (3840 x 2160). Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Acked-by: Lee Jones <lee.jones@linaro.org> Tested-by: Lee Jones <lee.jones@linaro.org> Link: http://patchwork.freedesktop.org/patch/msgid/1490280292-30466-1-git-send-email-vincent.abriou@st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/nouveau/disp/mcp7x: disable dptmds workaroundBen Skeggs
commit 7dfee6827780d4228148263545af936d0cae8930 upstream. The workaround appears to cause regressions on these boards, and from inspection of RM traces, NVIDIA don't appear to do it on them either. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Roy Spliet <nouveau@spliet.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/etnaviv: fix missing unlock on error in etnaviv_gpu_submit()Wei Yongjun
commit 45abdf35cf82e4270328c7237e7812de960ac560 upstream. Add the missing unlock before return from function etnaviv_gpu_submit() in the error handling case. lst: fixed label name. Fixes: f3cd1b064f11 ("drm/etnaviv: (re-)protect fence allocation with GPU mutex") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/nouveau/mmu/nv4a: use nv04 mmu rather than the nv44 oneIlia Mirkin
commit f94773b9f5ecd1df7c88c2e921924dd41d2020cc upstream. The NV4A (aka NV44A) is an oddity in the family. It only comes in AGP and PCI varieties, rather than a core PCIE chip with a bridge for AGP/PCI as necessary. As a result, it appears that the MMU is also non-functional. For AGP cards, the vast majority of the NV4A lineup, this worked out since we force AGP cards to use the nv04 mmu. However for PCI variants, this did not work. Switching to the NV04 MMU makes it work like a charm. Thanks to mwk for the suggestion. This should be a no-op for NV4A AGP boards, as they were using it already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/nouveau/mpeg: mthd returns true on success nowIlia Mirkin
commit 83bce9c2baa51e439480a713119a73d3c8b61083 upstream. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Fixes: 590801c1a3 ("drm/nouveau/mpeg: remove dependence on namedb/engctx lookup") Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18Revert "drm/i915/execlists: Reset RING registers upon resume"Greg Kroah-Hartman
This reverts commit f2a0409a08502d64fbe3990354dff5902b08d2fb which is commit bafb2f7d4755bf1571bd5e9a03b97f3fc4fe69ae upstream. It was reported to have problems. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Eric Blau <eblau1@gmail.com> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
2017-04-18drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)Chris Wilson
commit 3d3d18f086cdda72ee18a454db70ca72c6e3246c upstream. The rcu_barrier() takes the cpu_hotplug mutex which itself is not reclaim-safe, and so rcu_barrier() is illegal from inside the shrinker. [ 309.661373] ========================================================= [ 309.661376] [ INFO: possible irq lock inversion dependency detected ] [ 309.661380] 4.11.0-rc1-CI-CI_DRM_2333+ #1 Tainted: G W [ 309.661383] --------------------------------------------------------- [ 309.661386] gem_exec_gttfil/6435 just changed the state of lock: [ 309.661389] (rcu_preempt_state.barrier_mutex){+.+.-.}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160 [ 309.661399] but this lock took another, RECLAIM_FS-unsafe lock in the past: [ 309.661402] (cpu_hotplug.lock){+.+.+.} [ 309.661404] and interrupts could create inverse lock ordering between them. [ 309.661410] other info that might help us debug this: [ 309.661414] Possible interrupt unsafe locking scenario: [ 309.661417] CPU0 CPU1 [ 309.661419] ---- ---- [ 309.661421] lock(cpu_hotplug.lock); [ 309.661425] local_irq_disable(); [ 309.661432] lock(rcu_preempt_state.barrier_mutex); [ 309.661441] lock(cpu_hotplug.lock); [ 309.661446] <Interrupt> [ 309.661448] lock(rcu_preempt_state.barrier_mutex); [ 309.661453] *** DEADLOCK *** [ 309.661460] 4 locks held by gem_exec_gttfil/6435: [ 309.661464] #0: (sb_writers#10){.+.+.+}, at: [<ffffffff8120d83d>] vfs_write+0x17d/0x1f0 [ 309.661475] #1: (debugfs_srcu){......}, at: [<ffffffff81320491>] debugfs_use_file_start+0x41/0xa0 [ 309.661486] #2: (&attr->mutex){+.+.+.}, at: [<ffffffff8123a3e7>] simple_attr_write+0x37/0xe0 [ 309.661495] #3: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0091b4a>] i915_drop_caches_set+0x3a/0x150 [i915] [ 309.661540] the shortest dependencies between 2nd lock and 1st lock: [ 309.661547] -> (cpu_hotplug.lock){+.+.+.} ops: 829 { [ 309.661553] HARDIRQ-ON-W at: [ 309.661560] __lock_acquire+0x5e5/0x1b50 [ 309.661565] lock_acquire+0xc9/0x220 [ 309.661572] __mutex_lock+0x6e/0x990 [ 309.661576] mutex_lock_nested+0x16/0x20 [ 309.661583] get_online_cpus+0x61/0x80 [ 309.661590] kmem_cache_create+0x25/0x1d0 [ 309.661596] debug_objects_mem_init+0x30/0x249 [ 309.661602] start_kernel+0x341/0x3fe [ 309.661607] x86_64_start_reservations+0x2a/0x2c [ 309.661612] x86_64_start_kernel+0x173/0x186 [ 309.661619] verify_cpu+0x0/0xfc [ 309.661622] SOFTIRQ-ON-W at: [ 309.661627] __lock_acquire+0x611/0x1b50 [ 309.661632] lock_acquire+0xc9/0x220 [ 309.661636] __mutex_lock+0x6e/0x990 [ 309.661641] mutex_lock_nested+0x16/0x20 [ 309.661646] get_online_cpus+0x61/0x80 [ 309.661650] kmem_cache_create+0x25/0x1d0 [ 309.661655] debug_objects_mem_init+0x30/0x249 [ 309.661660] start_kernel+0x341/0x3fe [ 309.661664] x86_64_start_reservations+0x2a/0x2c [ 309.661669] x86_64_start_kernel+0x173/0x186 [ 309.661674] verify_cpu+0x0/0xfc [ 309.661677] RECLAIM_FS-ON-W at: [ 309.661682] mark_held_locks+0x6f/0xa0 [ 309.661687] lockdep_trace_alloc+0xb3/0x100 [ 309.661693] kmem_cache_alloc_trace+0x31/0x2e0 [ 309.661699] __smpboot_create_thread.part.1+0x27/0xe0 [ 309.661704] smpboot_create_threads+0x61/0x90 [ 309.661709] cpuhp_invoke_callback+0x9c/0x8a0 [ 309.661713] cpuhp_up_callbacks+0x31/0xb0 [ 309.661718] _cpu_up+0x7a/0xc0 [ 309.661723] do_cpu_up+0x5f/0x80 [ 309.661727] cpu_up+0xe/0x10 [ 309.661734] smp_init+0x71/0xb3 [ 309.661738] kernel_init_freeable+0x94/0x19e [ 309.661743] kernel_init+0x9/0xf0 [ 309.661748] ret_from_fork+0x2e/0x40 [ 309.661752] INITIAL USE at: [ 309.661757] __lock_acquire+0x234/0x1b50 [ 309.661761] lock_acquire+0xc9/0x220 [ 309.661766] __mutex_lock+0x6e/0x990 [ 309.661771] mutex_lock_nested+0x16/0x20 [ 309.661775] get_online_cpus+0x61/0x80 [ 309.661780] __cpuhp_setup_state+0x44/0x170 [ 309.661785] page_alloc_init+0x23/0x3a [ 309.661790] start_kernel+0x124/0x3fe [ 309.661794] x86_64_start_reservations+0x2a/0x2c [ 309.661799] x86_64_start_kernel+0x173/0x186 [ 309.661804] verify_cpu+0x0/0xfc [ 309.661807] } [ 309.661813] ... key at: [<ffffffff81e37690>] cpu_hotplug+0xb0/0x100 [ 309.661817] ... acquired at: [ 309.661821] lock_acquire+0xc9/0x220 [ 309.661825] __mutex_lock+0x6e/0x990 [ 309.661829] mutex_lock_nested+0x16/0x20 [ 309.661833] get_online_cpus+0x61/0x80 [ 309.661837] _rcu_barrier+0x9f/0x160 [ 309.661841] rcu_barrier+0x10/0x20 [ 309.661847] netdev_run_todo+0x5f/0x310 [ 309.661852] rtnl_unlock+0x9/0x10 [ 309.661856] default_device_exit_batch+0x133/0x150 [ 309.661862] ops_exit_list.isra.0+0x4d/0x60 [ 309.661866] cleanup_net+0x1d8/0x2c0 [ 309.661872] process_one_work+0x1f4/0x6d0 [ 309.661876] worker_thread+0x49/0x4a0 [ 309.661881] kthread+0x107/0x140 [ 309.661884] ret_from_fork+0x2e/0x40 [ 309.661890] -> (rcu_preempt_state.barrier_mutex){+.+.-.} ops: 179 { [ 309.661896] HARDIRQ-ON-W at: [ 309.661901] __lock_acquire+0x5e5/0x1b50 [ 309.661905] lock_acquire+0xc9/0x220 [ 309.661910] __mutex_lock+0x6e/0x990 [ 309.661914] mutex_lock_nested+0x16/0x20 [ 309.661919] _rcu_barrier+0x31/0x160 [ 309.661923] rcu_barrier+0x10/0x20 [ 309.661928] netdev_run_todo+0x5f/0x310 [ 309.661932] rtnl_unlock+0x9/0x10 [ 309.661936] default_device_exit_batch+0x133/0x150 [ 309.661941] ops_exit_list.isra.0+0x4d/0x60 [ 309.661946] cleanup_net+0x1d8/0x2c0 [ 309.661951] process_one_work+0x1f4/0x6d0 [ 309.661955] worker_thread+0x49/0x4a0 [ 309.661960] kthread+0x107/0x140 [ 309.661964] ret_from_fork+0x2e/0x40 [ 309.661968] SOFTIRQ-ON-W at: [ 309.661972] __lock_acquire+0x611/0x1b50 [ 309.661977] lock_acquire+0xc9/0x220 [ 309.661981] __mutex_lock+0x6e/0x990 [ 309.661986] mutex_lock_nested+0x16/0x20 [ 309.661990] _rcu_barrier+0x31/0x160 [ 309.661995] rcu_barrier+0x10/0x20 [ 309.661999] netdev_run_todo+0x5f/0x310 [ 309.662003] rtnl_unlock+0x9/0x10 [ 309.662008] default_device_exit_batch+0x133/0x150 [ 309.662013] ops_exit_list.isra.0+0x4d/0x60 [ 309.662017] cleanup_net+0x1d8/0x2c0 [ 309.662022] process_one_work+0x1f4/0x6d0 [ 309.662027] worker_thread+0x49/0x4a0 [ 309.662031] kthread+0x107/0x140 [ 309.662035] ret_from_fork+0x2e/0x40 [ 309.662039] IN-RECLAIM_FS-W at: [ 309.662043] __lock_acquire+0x638/0x1b50 [ 309.662048] lock_acquire+0xc9/0x220 [ 309.662053] __mutex_lock+0x6e/0x990 [ 309.662058] mutex_lock_nested+0x16/0x20 [ 309.662062] _rcu_barrier+0x31/0x160 [ 309.662067] rcu_barrier+0x10/0x20 [ 309.662089] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662109] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662114] simple_attr_write+0xc7/0xe0 [ 309.662119] full_proxy_write+0x4f/0x70 [ 309.662124] __vfs_write+0x23/0x120 [ 309.662128] vfs_write+0xc6/0x1f0 [ 309.662133] SyS_write+0x44/0xb0 [ 309.662138] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662142] INITIAL USE at: [ 309.662147] __lock_acquire+0x234/0x1b50 [ 309.662151] lock_acquire+0xc9/0x220 [ 309.662156] __mutex_lock+0x6e/0x990 [ 309.662160] mutex_lock_nested+0x16/0x20 [ 309.662165] _rcu_barrier+0x31/0x160 [ 309.662169] rcu_barrier+0x10/0x20 [ 309.662174] netdev_run_todo+0x5f/0x310 [ 309.662178] rtnl_unlock+0x9/0x10 [ 309.662183] default_device_exit_batch+0x133/0x150 [ 309.662188] ops_exit_list.isra.0+0x4d/0x60 [ 309.662192] cleanup_net+0x1d8/0x2c0 [ 309.662197] process_one_work+0x1f4/0x6d0 [ 309.662202] worker_thread+0x49/0x4a0 [ 309.662206] kthread+0x107/0x140 [ 309.662210] ret_from_fork+0x2e/0x40 [ 309.662214] } [ 309.662220] ... key at: [<ffffffff81e4e1c8>] rcu_preempt_state+0x508/0x780 [ 309.662225] ... acquired at: [ 309.662229] check_usage_forwards+0x12b/0x130 [ 309.662233] mark_lock+0x360/0x6f0 [ 309.662237] __lock_acquire+0x638/0x1b50 [ 309.662241] lock_acquire+0xc9/0x220 [ 309.662245] __mutex_lock+0x6e/0x990 [ 309.662249] mutex_lock_nested+0x16/0x20 [ 309.662253] _rcu_barrier+0x31/0x160 [ 309.662257] rcu_barrier+0x10/0x20 [ 309.662279] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662298] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662303] simple_attr_write+0xc7/0xe0 [ 309.662307] full_proxy_write+0x4f/0x70 [ 309.662311] __vfs_write+0x23/0x120 [ 309.662315] vfs_write+0xc6/0x1f0 [ 309.662319] SyS_write+0x44/0xb0 [ 309.662323] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662329] stack backtrace: [ 309.662335] CPU: 1 PID: 6435 Comm: gem_exec_gttfil Tainted: G W 4.11.0-rc1-CI-CI_DRM_2333+ #1 [ 309.662342] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011 [ 309.662348] Call Trace: [ 309.662354] dump_stack+0x67/0x92 [ 309.662359] print_irq_inversion_bug.part.19+0x1a4/0x1b0 [ 309.662365] check_usage_forwards+0x12b/0x130 [ 309.662369] mark_lock+0x360/0x6f0 [ 309.662374] ? print_shortest_lock_dependencies+0x1a0/0x1a0 [ 309.662379] __lock_acquire+0x638/0x1b50 [ 309.662383] ? __mutex_unlock_slowpath+0x3e/0x2e0 [ 309.662388] ? trace_hardirqs_on+0xd/0x10 [ 309.662392] ? _rcu_barrier+0x31/0x160 [ 309.662396] lock_acquire+0xc9/0x220 [ 309.662400] ? _rcu_barrier+0x31/0x160 [ 309.662404] ? _rcu_barrier+0x31/0x160 [ 309.662409] __mutex_lock+0x6e/0x990 [ 309.662412] ? _rcu_barrier+0x31/0x160 [ 309.662416] ? _rcu_barrier+0x31/0x160 [ 309.662421] ? synchronize_rcu_expedited+0x35/0xb0 [ 309.662426] ? _raw_spin_unlock_irqrestore+0x52/0x60 [ 309.662434] mutex_lock_nested+0x16/0x20 [ 309.662438] _rcu_barrier+0x31/0x160 [ 309.662442] rcu_barrier+0x10/0x20 [ 309.662464] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662484] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662489] simple_attr_write+0xc7/0xe0 [ 309.662494] full_proxy_write+0x4f/0x70 [ 309.662498] __vfs_write+0x23/0x120 [ 309.662503] ? rcu_read_lock_sched_held+0x75/0x80 [ 309.662507] ? rcu_sync_lockdep_assert+0x2a/0x50 [ 309.662512] ? __sb_start_write+0x102/0x210 [ 309.662516] ? vfs_write+0x17d/0x1f0 [ 309.662520] vfs_write+0xc6/0x1f0 [ 309.662524] ? trace_hardirqs_on_caller+0xe7/0x200 [ 309.662529] SyS_write+0x44/0xb0 [ 309.662533] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662537] RIP: 0033:0x7f507eac24a0 [ 309.662541] RSP: 002b:00007fffda8720e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 309.662548] RAX: ffffffffffffffda RBX: ffffffff81482bd3 RCX: 00007f507eac24a0 [ 309.662552] RDX: 0000000000000005 RSI: 00007fffda8720f0 RDI: 0000000000000005 [ 309.662557] RBP: ffffc9000048bf88 R08: 0000000000000000 R09: 000000000000002c [ 309.662561] R10: 0000000000000014 R11: 0000000000000246 R12: 00007fffda872230 [ 309.662566] R13: 00007fffda872228 R14: 0000000000000201 R15: 00007fffda8720f0 [ 309.662572] ? __this_cpu_preempt_check+0x13/0x20 Fixes: 0eafec6d3244 ("drm/i915: Enable lockless lookup of request tracking via RCU") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100192 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170314115019.18127-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit bd784b7cc41af7a19cfb705fa6d800e511c4ab02) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170321144531.12344-1-chris@chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915: Stop using RP_DOWN_EI on BaytrailChris Wilson
commit 8f68d591d4765b2e1ce9d916ac7bc5583285c4ad upstream. On Baytrail, we manually calculate busyness over the evaluation interval to avoid issues with miscaluations with RC6 enabled. However, it turns out that the DOWN_EI interrupt generator is completely bust - it operates in two modes, continuous or never. Neither of which are conducive to good behaviour. Stop unmask the DOWN_EI interrupt and just compute everything from the UP_EI which does seem to correspond to the desired interval. v2: Fixup gen6_rps_pm_mask() as well v3: Inline vlv_c0_above() to combine the now identical elapsed calculation for up/down and simplify the threshold testing Fixes: 43cf3bf084ba ("drm/i915: Improved w/a for rps on Baytrail") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170313170617.31564-1-chris@chris-wilson.co.uk (cherry picked from commit e0e8c7cb6eb68e9256de2d8cbeb481d3701c05ac) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters.Kenneth Graunke
commit 0f5418e564ac6452b9086295646e602a9addc4bf upstream. This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0 (indicating the optional feature is not supported), and makes execbuf always return -EINVAL if the flags are used. Apparently, no userspace ever shipped which used this optional feature: I checked the git history of Mesa, xf86-video-intel, libva, and Beignet, and there were zero commits showing a use of these flags. Kernel commit 72bfa19c8deb4 apparently introduced the feature prematurely. According to Chris, the intention was to use this in cairo-drm, but "the use was broken for gen6", so I don't think it ever happened. 'relative_constants_mode' has always been tracked per-device, but this has actually been wrong ever since hardware contexts were introduced, as the INSTPM register is saved (and automatically restored) as part of the render ring context. The software per-device value could therefore get out of sync with the hardware per-context value. This meant that using them is actually unsafe: a client which tried to use them could damage the state of other clients, causing the GPU to interpret their BO offsets as absolute pointers, leading to bogus memory reads. These flags were also never ported to execlist mode, making them no-ops on Gen9+ (which requires execlists), and Gen8 in the default mode. On Gen8+, userspace can write these registers directly, achieving the same effect. On Gen6-7.5, it likely makes sense to extend the command parser to support them. I don't think anyone wants this on Gen4-5. Based on a patch by Dave Gordon. v3: Return -ENODEV for the getparam, as this is what we do for other obsolete features. Suggested by Chris Wilson. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313170433.26843-1-chris@chris-wilson.co.uk (cherry picked from commit ef0f411f51475f4eebf9fc1b19a85be698af19ff) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915: Only enable hotplug interrupts if the display interrupts are enabledChris Wilson
commit 35a3abfd198e6c69a6644784bb09a2d951fc6b21 upstream. In order to prevent accessing the hpd registers outside of the display power wells, we should refrain from writing to the registers before the display interrupts are enabled. [ 4.740136] WARNING: CPU: 1 PID: 221 at drivers/gpu/drm/i915/intel_uncore.c:795 __unclaimed_reg_debug+0x44/0x50 [i915] [ 4.740155] Unclaimed read from register 0x1e1110 [ 4.740168] Modules linked in: i915(+) intel_gtt drm_kms_helper prime_numbers [ 4.740190] CPU: 1 PID: 221 Comm: systemd-udevd Not tainted 4.10.0-rc6+ #384 [ 4.740203] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 4.740220] Call Trace: [ 4.740236] dump_stack+0x4d/0x6f [ 4.740251] __warn+0xc1/0xe0 [ 4.740265] warn_slowpath_fmt+0x4a/0x50 [ 4.740281] ? insert_work+0x77/0xc0 [ 4.740355] ? fwtable_write32+0x90/0x130 [i915] [ 4.740431] __unclaimed_reg_debug+0x44/0x50 [i915] [ 4.740507] fwtable_read32+0xd8/0x130 [i915] [ 4.740575] i915_hpd_irq_setup+0xa5/0x100 [i915] [ 4.740649] intel_hpd_init+0x68/0x80 [i915] [ 4.740716] i915_driver_load+0xe19/0x1380 [i915] [ 4.740784] i915_pci_probe+0x32/0x90 [i915] [ 4.740799] pci_device_probe+0x8b/0xf0 [ 4.740815] driver_probe_device+0x2b6/0x450 [ 4.740828] __driver_attach+0xda/0xe0 [ 4.740841] ? driver_probe_device+0x450/0x450 [ 4.740853] bus_for_each_dev+0x5b/0x90 [ 4.740865] driver_attach+0x19/0x20 [ 4.740878] bus_add_driver+0x166/0x260 [ 4.740892] driver_register+0x5b/0xd0 [ 4.740906] ? 0xffffffffa0166000 [ 4.740920] __pci_register_driver+0x47/0x50 [ 4.740985] i915_init+0x5c/0x5e [i915] [ 4.740999] do_one_initcall+0x3e/0x160 [ 4.741015] ? __vunmap+0x7c/0xc0 [ 4.741029] ? kmem_cache_alloc+0xcf/0x120 [ 4.741045] do_init_module+0x55/0x1c4 [ 4.741060] load_module+0x1f3f/0x25b0 [ 4.741073] ? __symbol_put+0x40/0x40 [ 4.741086] ? kernel_read_file+0x100/0x190 [ 4.741100] SYSC_finit_module+0xbc/0xf0 [ 4.741112] SyS_finit_module+0x9/0x10 [ 4.741125] entry_SYSCALL_64_fastpath+0x17/0x98 [ 4.741135] RIP: 0033:0x7f8559a140f9 [ 4.741145] RSP: 002b:00007fff7509a3e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 4.741161] RAX: ffffffffffffffda RBX: 00007f855aba02d1 RCX: 00007f8559a140f9 [ 4.741172] RDX: 0000000000000000 RSI: 000055b6db0914f0 RDI: 0000000000000011 [ 4.741183] RBP: 0000000000020000 R08: 0000000000000000 R09: 000000000000000e [ 4.741193] R10: 0000000000000011 R11: 0000000000000246 R12: 000055b6db0854d0 [ 4.741204] R13: 000055b6db091150 R14: 0000000000000000 R15: 000055b6db035924 v2: Set dev_priv->display_irqs_enabled to true for all platforms other than vlv/chv that manually control the display power domain. Fixes: 19625e85c6ec ("drm/i915: Enable polling when we don't have hpd") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97798 Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lyude <cpaul@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Hans de Goede <jwrdegoede@fedoraproject.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215131547.5064-1-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/20170313170231.18633-1-chris@chris-wilson.co.uk (cherry picked from commit 262fd485ac6b476479f41f00bb104f6a1766ae66) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915: Avoid tweaking evaluation thresholds on Baytrail v3Mika Kuoppala
commit 34dc8993eef63681b062871413a9484008a2a78f upstream. Certain Baytrails, namely the 4 cpu core variants, have been plaqued by spurious system hangs, mostly occurring with light loads. Multiple bisects by various people point to a commit which changes the reclocking strategy for Baytrail to follow its bigger brethen: commit 8fb55197e64d ("drm/i915: Agressive downclocking on Baytrail") There is also a review comment attached to this commit from Deepak S on avoiding punit access on Cherryview and thus it was excluded on common reclocking path. By taking the same approach and omitting the punit access by not tweaking the thresholds when the hardware has been asked to move into different frequency, considerable gains in stability have been observed. With J1900 box, light render/video load would end up in system hang in usually less than 12 hours. With this patch applied, the cumulative uptime has now been 34 days without issues. To provoke system hang, light loads on both render and bsd engines in parallel have been used: glxgears >/dev/null 2>/dev/null & mpv --vo=vaapi --hwdec=vaapi --loop=inf vid.mp4 So far, author has not witnessed system hang with above load and this patch applied. Reports from the tenacious people at kernel bugzilla are also promising. Considering that the punit access frequency with this patch is considerably less, there is a possibility that this will push the, still unknown, root cause past the triggering point on most loads. But as we now can reliably reproduce the hang independently, we can reduce the pain that users are having and use a static thresholds until a root cause is found. v3: don't break debugfs and simplification (Chris Wilson) References: https://bugzilla.kernel.org/show_bug.cgi?id=109051 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@intel.com> Cc: fritsch@xbmc.org Cc: miku@iki.fi Cc: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> CC: Michal Feix <michal@feix.cz> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1487166779-26945-1-git-send-email-mika.kuoppala@intel.com (cherry picked from commit 6067a27d1f0184596d51decbac1c1fdc4acb012f) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915: Nuke debug messages from the pipe update critical sectionVille Syrjälä
commit edd06b8353772dca7afcd4640dafa83b521edd55 upstream. printks are slow so we should not be doing them from the vblank evade critical section. These could explain why we sometimes seem to blow past our 100 usec deadline. The problem has been there ever since commit bfd16b2a23dc ("drm/i915: Make updating pipe without modeset atomic.") but it may not have been readily visible until commit e1edbd44e23b ("drm/i915: Complain if we take too long under vblank evasion.") increased our chances of noticing it. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: bfd16b2a23dc ("drm/i915: Make updating pipe without modeset atomic.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170307205419.19447-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit c3f8ad57a01a31397e5a0349a226a32f35ddc19c) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-18drm/i915/gen9: Increase PCODE request timeout to 50msImre Deak
commit d253371c4c2f5fc2d884ef25f64decd7549aff5a upstream. After commit 2c7d0602c815277f7cb7c932b091288710d8aba7 Author: Imre Deak <imre.deak@intel.com> Date: Mon Dec 5 18:27:37 2016 +0200 drm/i915/gen9: Fix PCODE polling during CDCLK change notification there is still one report of the CDCLK-change request timing out on a KBL machine, see the Reference link. On that machine the maximum time the request took to succeed was 34ms, so increase the timeout to 50ms. v2: - Change timeout from 100 to 50 ms to maintain the current 50 ms limit for atomic waits in the driver. (Chris, Tvrtko) Reference: https://bugs.freedesktop.org/show_bug.cgi?id=99345 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1487946730-17162-1-git-send-email-imre.deak@intel.com (cherry picked from commit 0129936ddda26afd5d9d207c4e86b2425952579f) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/msm/adreno: move function declarations to header fileBaoyou Xie
[ Upstream commit a5725ab0497ad91a2df7c01a78bf1a0cc5be4526 ] We get 2 warnings when building kernel with W=1: drivers/gpu/drm/msm/adreno/a3xx_gpu.c:535:17: warning: no previous prototype for 'a3xx_gpu_init' [-Wmissing-prototypes] drivers/gpu/drm/msm/adreno/a4xx_gpu.c:624:17: warning: no previous prototype for 'a4xx_gpu_init' [-Wmissing-prototypes] In fact, both functions are declared in drivers/gpu/drm/msm/adreno/adreno_device.c, but should be declared in a header file. So this patch moves both function declarations to drivers/gpu/drm/msm/adreno/adreno_gpu.h. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1477127865-9381-1-git-send-email-baoyou.xie@linaro.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/mga: remove device_is_agp callbackDaniel Vetter
[ Upstream commit 858b2c1bf820ebfba89c5e2867ab882bdb5b2f5a ] It's only for a device quirk, and we might as well do that in the load callback. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170125062657.19270-10-daniel.vetter@ffwll.ch Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/i915: actually drive the BDW reserved IDsPaulo Zanoni
[ Upstream commit 98b2f01c8dfc8922a2af1fe82a1c40cac4911634 ] Back in 2014, commit fb7023e0e248 ("drm/i915: BDW: Adding Reserved PCI IDs.") added the reserved PCI IDs in order to try to make sure we had working drivers in case we ever released products using these IDs (since we had instances of this type of problem in the past). The problem is that the patch only touched the macros used by early-quirks.c and by the user space components that rely on i915_pciids.h, it didn't touch the macros used by i915_pci.c. So we correctly handled the stolen memory for these theoretical IDs, but we didn't actually drive the devices from i915.ko. So this patch fixes the original commit by actually making i915.ko drive these IDs, which was the goal. There's no information on what would be the GT count on these IDs, so we just go with the safer intel_broadwell_info, at the risk of ignoring a possibly inexistent BSD2_RING. I did some checking, and it seems that these IDs are driven by intel-gpu-tools, xf86-video-intel and libdrm (since they contain old copies of i915_pciids.h), but they are not checked by mesa. The alternative to this patch would be to just assume we're actually never going to use these IDs, and then remove them from our ID lists and make sure our user space components sync the latest i915_pciids.h copy. I'm fine with either approaches, as long as we make sure that every component tries to drive the same list of PCI IDs. Fixes: fb7023e0e248 ("drm/i915: BDW: Adding Reserved PCI IDs.") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-3-git-send-email-paulo.r.zanoni@intel.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/edid: constify edid quirk listJani Nikula
[ Upstream commit 23c4cfbdab494568600ae6073a2bf02be4b10f4e ] No reason not to be const. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1482923186-22430-1-git-send-email-jani.nikula@intel.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)Chen-Yu Tsai
commit 93a5ec14da24a8abbac5bcb953b45cc7a5d0198a upstream. The A31 TCON has mux controls for how TCON outputs are routed to the HDMI and MIPI DSI blocks. Since the A31s does not have MIPI DSI, it only has a mux for the HDMI controller input. This patch only adds support for the compatible strings. Actual support for the mux controls should be added with HDMI and MIPI DSI support. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/sun4i: Add compatible strings for A31/A31s display pipelinesChen-Yu Tsai
commit 49c440e87cd6f547f93d0dc53571ae0e11d9ec8f upstream. The A31's display pipeline has 2 frontends, 2 backends, and 2 TCONs. It also has new display enhancement blocks, such as the DRC (Dynamic Range Controller), the DEU (Display Enhancement Unit), and the CMU (Color Management Unit). It supports HDMI, MIPI DSI, and 2 LCD/LVDS channels. The A31s display pipeline is almost the same, just without MIPI DSI. Only the TCON seems to be different, due to the missing mux for MIPI DSI. Add compatible strings for both of them. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/sun4i: tcon: Move SoC specific quirks to a DT matched data structureChen-Yu Tsai
commit 91ea2f29cba6a7fe035ea232e4f981211a9fce5d upstream. We already have some differences between the 2 supported SoCs. More will be added as we support other SoCs. To avoid bloating the probe function with even more conditionals, move the quirks to a separate data structure that's tied to the compatible string. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl()Li Qiang
commit e7e11f99564222d82f0ce84bd521e57d78a6b678 upstream. In vmw_surface_define_ioctl(), the 'num_sizes' is the sum of the 'req->mip_levels' array. This array can be assigned any value from the user space. As both the 'num_sizes' and the array is uint32_t, it is easy to make 'num_sizes' overflow. The later 'mip_levels' is used as the loop count. This can lead an oob write. Add the check of 'req->mip_levels' to avoid this. Signed-off-by: Li Qiang <liqiang6-s@360.cn> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/vmwgfx: Remove getparam error messageThomas Hellstrom
commit 53e16798b0864464c5444a204e1bb93ae246c429 upstream. The mesa winsys sometimes uses unimplemented parameter requests to check for features. Remove the error message to avoid bloating the kernel log. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/ttm, drm/vmwgfx: Relax permission checking when opening surfacesThomas Hellstrom
commit fe25deb7737ce6c0879ccf79c99fa1221d428bf2 upstream. Previously, when a surface was opened using a legacy (non prime) handle, it was verified to have been created by a client in the same master realm. Relax this so that opening is also allowed recursively if the client already has the surface open. This works around a regression in svga mesa where opening of a shared surface is used recursively to obtain surface information. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl()Murray McAllister
commit 63774069d9527a1aeaa4aa20e929ef5e8e9ecc38 upstream. In vmw_get_cap_3d_ioctl(), a user can supply 0 for a size that is used in vzalloc(). This eventually calls dump_stack() (in warn_alloc()), which can leak useful addresses to dmesg. Add check to avoid a size of 0. Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl()Murray McAllister
commit 36274ab8c596f1240c606bb514da329add2a1bcd upstream. Before memory allocations vmw_surface_define_ioctl() checks the upper-bounds of a user-supplied size, but does not check if the supplied size is 0. Add check to avoid NULL pointer dereferences. Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12drm/vmwgfx: Type-check lookups of fence objectsThomas Hellstrom
commit f7652afa8eadb416b23eb57dec6f158529942041 upstream. A malicious caller could otherwise hand over handles to other objects causing all sorts of interesting problems. Testing done: Ran a Fedora 25 desktop using both Xorg and gnome-shell/Wayland. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08drm/etnaviv: (re-)protect fence allocation with GPU mutexLucas Stach
commit f3cd1b064f1179d9e6188c6d67297a2360880e10 upstream. The fence allocation needs to be protected by the GPU mutex, otherwise the fence seqnos of concurrent submits might not match the insertion order of the jobs in the kernel ring. This breaks the assumption that jobs complete with monotonically increasing fence seqnos. Fixes: d9853490176c (drm/etnaviv: take GPU lock later in the submit process) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08drm/vc4: Allocate the right amount of space for boot-time CRTC state.Eric Anholt
commit 6d6e500391875cc372336c88e9a8af377be19c36 upstream. Without this, the first modeset would dereference past the allocation when trying to free the mm node. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170328201343.4884-1-eric@anholt.net Fixes: d8dbf44f13b9 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.") Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08drm/radeon: Override fpfn for all VRAM placements in radeon_evict_flagsMichel Dänzer
commit ce4b4f228e51219b0b79588caf73225b08b5b779 upstream. We were accidentally only overriding the first VRAM placement. For BOs with the RADEON_GEM_NO_CPU_ACCESS flag set, radeon_ttm_placement_from_domain creates a second VRAM placment with fpfn == 0. If VRAM is almost full, the first VRAM placement with fpfn > 0 may not work, but the second one with fpfn == 0 always will (the BO's current location trivially satisfies it). Because "moving" the BO to its current location puts it back on the LRU list, this results in an infinite loop. Fixes: 2a85aedd117c ("drm/radeon: Try evicting from CPU accessible to inaccessible VRAM first") Reported-by: Zachary Michaels <zmichaels@oblong.com> Reported-and-Tested-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-30drm: reference count event->completionDaniel Vetter
commit 24835e442f289813aa568d142a755672a740503c upstream. When writing the generic nonblocking commit code I assumed that through clever lifetime management I can assure that the completion (stored in drm_crtc_commit) only gets freed after it is completed. And that worked. I also wanted to make nonblocking helpers resilient against driver bugs, by having timeouts everywhere. And that worked too. Unfortunately taking boths things together results in oopses :( Well, at least sometimes: What seems to happen is that the drm event hangs around forever stuck in limbo land. The nonblocking helpers eventually time out, move on and release it. Now the bug I tested all this against is drivers that just entirely fail to deliver the vblank events like they should, and in those cases the event is simply leaked. But what seems to happen, at least sometimes, on i915 is that the event is set up correctly, but somohow the vblank fails to fire in time. Which means the event isn't leaked, it's still there waiting for eventually a vblank to fire. That tends to happen when re-enabling the pipe, and then the trap springs and the kernel oopses. The correct fix here is simply to refcount the crtc commit to make sure that the event sticks around even for drivers which only sometimes fail to deliver vblanks for some arbitrary reasons. Since crtc commits are already refcounted that's easy to do. References: https://bugs.freedesktop.org/show_bug.cgi?id=96781 Cc: Jim Rees <rees@umich.edu> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161221102331.31033-1-daniel.vetter@ffwll.ch Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-30drm/bridge: analogix dp: Fix runtime PM state on driver bindMarek Szyprowski
commit f0a8b49c03d22a511a601dc54b2a3425a41e35fa upstream. Analogix_dp_bind() can be called from component framework, which doesn't guarantee proper runtime PM state of the device during bind operation, so ensure that device is runtime active before doing any register access. This ensures that the power domain, to which DP module belongs, is turned on. While at it, also fix the unbalanced call to phy_power_on() in analogix_dp_bind() function. This patch solves the following kernel oops on Samsung Exynos5250 Snow board: Unhandled fault: imprecise external abort (0x406) at 0x00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: : 406 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 75 Comm: kworker/0:2 Not tainted 4.9.0 #1046 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: events deferred_probe_work_func task: ee272300 task.stack: ee312000 PC is at analogix_dp_enable_sw_function+0x18/0x2c LR is at analogix_dp_init_dp+0x2c/0x50 ... [<c03fcb38>] (analogix_dp_enable_sw_function) from [<c03fa9c4>] (analogix_dp_init_dp+0x2c/0x50) [<c03fa9c4>] (analogix_dp_init_dp) from [<c03fab6c>] (analogix_dp_bind+0x184/0x42c) [<c03fab6c>] (analogix_dp_bind) from [<c03fdb84>] (component_bind_all+0xf0/0x218) [<c03fdb84>] (component_bind_all) from [<c03ed64c>] (exynos_drm_load+0x134/0x200) [<c03ed64c>] (exynos_drm_load) from [<c03d5058>] (drm_dev_register+0xa0/0xd0) [<c03d5058>] (drm_dev_register) from [<c03d66b8>] (drm_platform_init+0x58/0xb0) [<c03d66b8>] (drm_platform_init) from [<c03fe0c4>] (try_to_bring_up_master+0x14c/0x188) [<c03fe0c4>] (try_to_bring_up_master) from [<c03fe188>] (component_add+0x88/0x138) [<c03fe188>] (component_add) from [<c0403a38>] (platform_drv_probe+0x50/0xb0) [<c0403a38>] (platform_drv_probe) from [<c0402470>] (driver_probe_device+0x1f0/0x2a8) [<c0402470>] (driver_probe_device) from [<c0400a54>] (bus_for_each_drv+0x44/0x8c) [<c0400a54>] (bus_for_each_drv) from [<c04021f8>] (__device_attach+0x9c/0x100) [<c04021f8>] (__device_attach) from [<c04018e8>] (bus_probe_device+0x84/0x8c) [<c04018e8>] (bus_probe_device) from [<c0401d1c>] (deferred_probe_work_func+0x60/0x8c) [<c0401d1c>] (deferred_probe_work_func) from [<c012fc14>] (process_one_work+0x120/0x318) [<c012fc14>] (process_one_work) from [<c012fe34>] (process_scheduled_works+0x28/0x38) [<c012fe34>] (process_scheduled_works) from [<c0130048>] (worker_thread+0x204/0x4ac) [<c0130048>] (worker_thread) from [<c01352c4>] (kthread+0xd8/0xf4) [<c01352c4>] (kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c) Code: e59035f0 e5935018 f57ff04f e3c55001 (f57ff04e) ---[ end trace 3d1d0d87796de344 ]--- Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Archit Taneja <architt@codeaurora.org> Link: http://patchwork.freedesktop.org/patch/msgid/1483091866-1088-1-git-send-email-m.szyprowski@samsung.com Cc: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-30drm/amdgpu: reinstate oland workaround for sclkAlex Deucher
commit e11ddff68a7c455e63c4b46154a3e75c699a7b55 upstream. Higher sclks seem to be unstable on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-26drm/amdgpu/si: add dpm quirk for OlandAlex Deucher
commit 18a8de1bc37e97dff1c96ee6cf49adbd02a0f775 upstream. OLAND 0x1002:0x6604 0x1028:0x066F 0x00 seems to have problems with higher sclks. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>