[SCSI] Handle disk devices which can not process medium access commands
We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.
The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.
If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.
Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.
[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f66e90d..2cfcbff 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -30,6 +30,7 @@
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_dbg.h>
#include <scsi/scsi_device.h>
+#include <scsi/scsi_driver.h>
#include <scsi/scsi_eh.h>
#include <scsi/scsi_transport.h>
#include <scsi/scsi_host.h>
@@ -141,11 +142,11 @@
else if (host->hostt->eh_timed_out)
rtn = host->hostt->eh_timed_out(scmd);
+ scmd->result |= DID_TIME_OUT << 16;
+
if (unlikely(rtn == BLK_EH_NOT_HANDLED &&
- !scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))) {
- scmd->result |= DID_TIME_OUT << 16;
+ !scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD)))
rtn = BLK_EH_HANDLED;
- }
return rtn;
}
@@ -778,6 +779,7 @@
int cmnd_size, int timeout, unsigned sense_bytes)
{
struct scsi_device *sdev = scmd->device;
+ struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
struct Scsi_Host *shost = sdev->host;
DECLARE_COMPLETION_ONSTACK(done);
unsigned long timeleft;
@@ -832,6 +834,10 @@
}
scsi_eh_restore_cmnd(scmd, &ses);
+
+ if (sdrv->eh_action)
+ rtn = sdrv->eh_action(scmd, cmnd, cmnd_size, rtn);
+
return rtn;
}