==============================
Deadline IO scheduler tunables
==============================

This little file attempts to document how the deadline io scheduler works.
In particular, it will clarify the meaning of the exposed tunables that may be
of interest to power users.

Selecting IO schedulers
-----------------------

Refer to Documentation/block/switching-sched.rst for information on
selecting an io scheduler on a per-device basis.
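
For example, assuming a hypothetical disk sda, the active scheduler can be
inspected and changed through sysfs (the scheduler is called mq-deadline on
blk-mq kernels, deadline on older ones)::

    # List the available schedulers; the active one is shown in brackets.
    cat /sys/block/sda/queue/scheduler

    # Select the deadline scheduler for this device.
    echo mq-deadline > /sys/block/sda/queue/scheduler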

------------------------------------------------------------------------------

read_expire (in ms)
-------------------

The goal of the deadline io scheduler is to attempt to guarantee a start
service time for a request. As we focus mainly on read latencies, this is
tunable. When a read request first enters the io scheduler, it is assigned
a deadline that is the current time + the read_expire value in units of
milliseconds.
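
As a sketch, assuming the hypothetical sda, this and the other tunables in
this file live under the per-device iosched directory in sysfs (the default
read_expire is typically 500 ms)::

    # Show the current read deadline, in milliseconds.
    cat /sys/block/sda/queue/iosched/read_expire

    # Tighten the read deadline to 100 ms.
    echo 100 > /sys/block/sda/queue/iosched/read_expire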


write_expire (in ms)
--------------------

Similar to read_expire mentioned above, but for writes.
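
For writes, correspondingly (the default is typically 5000 ms)::

    # Relax the write deadline to 10 seconds.
    echo 10000 > /sys/block/sda/queue/iosched/write_expire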


fifo_batch (number of requests)
-------------------------------

Requests are grouped into ``batches`` of a particular data direction (read or
write) which are serviced in increasing sector order. To limit extra seeking,
deadline expiries are only checked between batches. fifo_batch controls the
maximum number of requests per batch.

This parameter tunes the balance between per-request latency and aggregate
throughput. When low latency is the primary concern, smaller is better (where
a value of 1 yields first-come first-served behaviour). Increasing fifo_batch
generally improves throughput, at the cost of latency variation.
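
For example (the default is typically 16)::

    # Favour latency: dispatch requests one at a time.
    echo 1 > /sys/block/sda/queue/iosched/fifo_batch

    # Favour throughput: allow larger batches between deadline checks.
    echo 32 > /sys/block/sda/queue/iosched/fifo_batch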


writes_starved (number of dispatches)
-------------------------------------

When we have to move requests from the io scheduler queue to the block
device dispatch queue, we always give preference to reads. However, we
don't want to starve writes indefinitely either. So writes_starved controls
how many times we give preference to reads over writes. When that has been
done writes_starved number of times, we dispatch some writes based on the
same criteria as reads.
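
For example (the default is typically 2)::

    # Let reads win four dispatch rounds before writes must be serviced.
    echo 4 > /sys/block/sda/queue/iosched/writes_starved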


front_merges (bool)
-------------------

Sometimes it happens that a request enters the io scheduler that is contiguous
with a request that is already on the queue. Either it fits in the back of that
request, or it fits at the front. That is called either a back merge candidate
or a front merge candidate. Due to the way files are typically laid out,
back merges are much more common than front merges. For some workloads, you
may even know that it is a waste of time to spend any time attempting to
front merge requests. Setting front_merges to 0 disables this functionality.
Front merges may still occur due to the cached last_merge hint, but since
that comes at basically zero cost we leave it on. We simply disable the
rbtree front sector lookup when the io scheduler merge function is called.
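
For example (front merging is enabled by default)::

    # Skip the rbtree front merge lookup for workloads that never
    # produce front merge candidates.
    echo 0 > /sys/block/sda/queue/iosched/front_merges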


Nov 11 2002, Jens Axboe <jens.axboe@oracle.com>