scsi: ufs: Fix a race condition between error handler and runtime PM ops
The current IRQ handler blocks SCSI requests before scheduling eh_work,
when error handler calls pm_runtime_get_sync, if ufshcd_suspend/resume
sends a SCSI cmd, most likely the SSU cmd, since SCSI requests are blocked,
pm_runtime_get_sync() will never return because ufshcd_suspend/resume is
blocked by the SCSI cmd.
- In queuecommand path, hba->ufshcd_state check and ufshcd_send_command
should stay under the same spin lock. This is to make sure that no more
commands leak into doorbell after hba->ufshcd_state is changed.
- Don't block SCSI requests before error handler starts to run, let error
handler block SCSI requests when it is ready to start error recovery.
- Don't let SCSI layer keep requeuing the SCSI cmds sent from HBA runtime
PM ops, let them pass or fail them. Let them pass if eh_work is
scheduled due to non-fatal errors. Fail them if eh_work is scheduled due
to fatal errors, otherwise the cmds may eventually time out since UFS is
in bad state, which gets error handler blocked for too long. If we fail
the SCSI cmds sent from HBA runtime PM ops, HBA runtime PM ops fails
too, but it does not hurt since error handler can recover HBA runtime PM
error.
Link: https://lore.kernel.org/r/1596975355-39813-9-git-send-email-cang@codeaurora.org
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1 file changed