scsi: add support for a blk-mq based I/O path.
This patch adds support for an alternate I/O path in the scsi midlayer
which uses the blk-mq infrastructure instead of the legacy request code.
Use of blk-mq is fully transparent to drivers, although for now a host
template field is provided to opt out of blk-mq usage in case any unforeseen
incompatibilities arise.
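As an illustration of the opt-out, here is a minimal userspace sketch, not
kernel code. The `*_model` types and `host_model_init()` are hypothetical
stand-ins. Only the `disable_blk_mq` and `use_blk_mq` field names come from
this patch, and the rule that combines them with the module-wide setting is
an assumption about how the flags interact.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the host template; only the opt-out flag. */
struct host_template_model {
	bool disable_blk_mq;	/* temporary flag to disable blk-mq I/O path */
};

/* Hypothetical stand-in for the host; only the per-host result bit. */
struct host_model {
	const struct host_template_model *hostt;
	unsigned use_blk_mq:1;
};

/*
 * Assumed policy: a host uses blk-mq only if it is enabled module-wide
 * and the driver's template has not opted out.
 */
static void host_model_init(struct host_model *shost,
			    const struct host_template_model *hostt,
			    bool module_wide_enable)
{
	shost->hostt = hostt;
	shost->use_blk_mq = module_wide_enable && !hostt->disable_blk_mq;
}

/* Mirrors the shape of the shost_use_blk_mq() helper added below. */
static bool host_model_use_blk_mq(const struct host_model *shost)
{
	return shost->use_blk_mq;
}
```

In this model a driver that sets `disable_blk_mq` keeps the legacy path
regardless of the module-wide setting.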
In general, replacing the legacy request code with blk-mq is a simple and
mostly mechanical transformation. The biggest exception is the new code
that deals with the fact that I/O submissions in blk-mq must happen from
process context, which slightly complicates the I/O completion handler.
The second biggest difference is that blk-mq is built around the concept
of preallocated requests that also include driver-specific data, which
in the SCSI context means the scsi_cmnd structure. This completely avoids
dynamic memory allocations in the fast path through I/O submission.
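The preallocation idea can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the blk-mq implementation:
`req_model`, `req_pool` and all `req_*` functions are hypothetical names,
and the linear slot search stands in for the real tag allocator. The point
is that each slot holds a request header plus a driver payload area, all
allocated once up front, so obtaining a request never allocates memory.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/* Hypothetical request header; the driver payload follows it in memory. */
struct req_model {
	unsigned int tag;
	bool in_use;
};

struct req_pool {
	unsigned char *mem;	/* depth slots of stride bytes each */
	size_t hdr_size;	/* header size rounded up for alignment */
	size_t stride;		/* header plus payload, per slot */
	unsigned int depth;
};

static struct req_model *req_slot(const struct req_pool *pool, unsigned int i)
{
	return (struct req_model *)(pool->mem + (size_t)i * pool->stride);
}

/* One allocation at setup time covers every request and its payload. */
static int req_pool_init(struct req_pool *pool, unsigned int depth,
			 size_t payload_size)
{
	size_t a = alignof(max_align_t);

	pool->hdr_size = ALIGN_UP(sizeof(struct req_model), a);
	pool->stride = pool->hdr_size + ALIGN_UP(payload_size, a);
	pool->depth = depth;
	pool->mem = calloc(depth, pool->stride);
	if (!pool->mem)
		return -1;
	for (unsigned int i = 0; i < depth; i++)
		req_slot(pool, i)->tag = i;
	return 0;
}

/* Fast path: claim a free slot; no memory allocation happens here. */
static struct req_model *req_pool_get(struct req_pool *pool)
{
	for (unsigned int i = 0; i < pool->depth; i++) {
		struct req_model *rq = req_slot(pool, i);

		if (!rq->in_use) {
			rq->in_use = true;
			return rq;
		}
	}
	return NULL;	/* pool exhausted */
}

static void req_pool_put(struct req_model *rq)
{
	rq->in_use = false;
}

/* The driver payload (scsi_cmnd in the SCSI case) sits after the header. */
static void *req_to_payload(const struct req_pool *pool, struct req_model *rq)
{
	return (unsigned char *)rq + pool->hdr_size;
}

static void req_pool_destroy(struct req_pool *pool)
{
	free(pool->mem);
	pool->mem = NULL;
}
```

The model is single-threaded; a real allocator would need atomic or locked
slot claiming.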
Due to the preallocated requests, the MQ code path exclusively uses the
host-wide shared tag allocator instead of a per-LUN one. This only
affects drivers actually using the block layer provided tag allocator
instead of their own. Unlike the old path, blk-mq always provides a tag,
although drivers don't have to use it.
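To make the host-wide tag space concrete, here is a small userspace sketch.
It is an illustration only: `host_tags`, `lun_model` and the functions below
are hypothetical, the 64-entry bitmap is an arbitrary size, and the code is
not thread-safe. It shows two LUNs drawing from one allocator owned by the
host, so a tag identifies a command uniquely across the whole host.

```c
#include <stdint.h>

#define HOST_TAG_COUNT 64	/* arbitrary depth chosen for this sketch */

/* One tag bitmap per host, shared by every LUN behind it. */
struct host_tags {
	uint64_t busy;
};

/* Hypothetical LUN model: it only holds a pointer to the host's tags. */
struct lun_model {
	struct host_tags *tags;
	unsigned int lun;
};

/* Returns the lowest free tag, or -1 if all tags are in use. */
static int host_tag_get(struct host_tags *t)
{
	for (int tag = 0; tag < HOST_TAG_COUNT; tag++) {
		uint64_t bit = UINT64_C(1) << tag;

		if (!(t->busy & bit)) {
			t->busy |= bit;
			return tag;
		}
	}
	return -1;
}

static void host_tag_put(struct host_tags *t, int tag)
{
	t->busy &= ~(UINT64_C(1) << tag);
}

/* Every command gets a tag from the host-wide space, whatever its LUN. */
static int lun_start_cmd(struct lun_model *l)
{
	return host_tag_get(l->tags);
}

static void lun_end_cmd(struct lun_model *l, int tag)
{
	host_tag_put(l->tags, tag);
}
```

With a per-LUN allocator, two LUNs could both hand out tag 0 at the same
time; in this shared model they cannot.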
For now, the blk-mq path is disabled by default and must be enabled using
the "use_blk_mq" module parameter. Once the remaining work in the block
layer to make blk-mq more suitable for slow devices is complete, I hope
to make it the default and eventually even remove the old code path.
Based on the earlier scsi-mq prototype by Nicholas Bellinger.
Thanks to Bart Van Assche and Robert Elliott for testing, benchmarking and
various suggestions and code contributions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Webb Scales <webbnh@hp.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Robert Elliott <elliott@hp.com>
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 5e8ebc1..ba20347 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -7,6 +7,7 @@
#include <linux/workqueue.h>
#include <linux/mutex.h>
#include <linux/seq_file.h>
+#include <linux/blk-mq.h>
#include <scsi/scsi.h>
struct request_queue;
@@ -510,6 +511,9 @@
*/
unsigned int cmd_size;
struct scsi_host_cmd_pool *cmd_pool;
+
+ /* temporary flag to disable blk-mq I/O path */
+ bool disable_blk_mq;
};
/*
@@ -580,7 +584,10 @@
* Area to keep a shared tag map (if needed, will be
* NULL if not).
*/
- struct blk_queue_tag *bqt;
+ union {
+ struct blk_queue_tag *bqt;
+ struct blk_mq_tag_set tag_set;
+ };
atomic_t host_busy; /* commands actually active on low-level */
atomic_t host_blocked;
@@ -672,6 +679,8 @@
/* The controller does not support WRITE SAME */
unsigned no_write_same:1;
+ unsigned use_blk_mq:1;
+
/*
* Optional work queue to be utilized by the transport
*/
@@ -772,6 +781,13 @@
shost->tmf_in_progress;
}
+extern bool scsi_use_blk_mq;
+
+static inline bool shost_use_blk_mq(struct Scsi_Host *shost)
+{
+ return shost->use_blk_mq;
+}
+
extern int scsi_queue_work(struct Scsi_Host *, struct work_struct *);
extern void scsi_flush_work(struct Scsi_Host *);