drm/i915: Beware temporary wedging when determining -EIO
At a few points in our uABI, we check to see if the driver is wedged and
report -EIO back to the user in that case. However, as we perform the
check and reset asynchronously (where once before they were both
serialised by the struct_mutex), we may instead see the temporary wedging
used to cancel inflight rendering to avoid a deadlock during reset
(caused by either us timing out in our reset handler,
i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
for a stuck modeset). If we suspect this is the case, that is we see a
wedged driver *and* reset in progress, then wait until the reset is
resolved before reporting upon the wedged status.
v2: might_sleep() (Mika)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2aeea97..3717541 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3833,11 +3833,18 @@ static const struct file_operations i915_cur_wm_latency_fops = {
static int
i915_wedged_get(void *data, u64 *val)
{
- struct drm_i915_private *dev_priv = data;
+ int ret = i915_terminally_wedged(data);
- *val = i915_terminally_wedged(&dev_priv->gpu_error);
-
- return 0;
+ switch (ret) {
+ case -EIO:
+ *val = 1;
+ return 0;
+ case 0:
+ *val = 0;
+ return 0;
+ default:
+ return ret;
+ }
}
static int
@@ -3918,7 +3925,7 @@ i915_drop_caches_set(void *data, u64 val)
mutex_unlock(&i915->drm.struct_mutex);
}
- if (val & DROP_RESET_ACTIVE && i915_terminally_wedged(&i915->gpu_error))
+ if (val & DROP_RESET_ACTIVE && i915_terminally_wedged(i915))
i915_handle_error(i915, ALL_ENGINES, 0, NULL);
fs_reclaim_acquire(GFP_KERNEL);