==================================
DMAengine controller documentation
==================================

Hardware Introduction
=====================

Most of the Slave DMA controllers have the same general principles of
operation.

They have a given number of channels to use for the DMA transfers, and
a given number of request lines.

Requests and channels are pretty much orthogonal. Channels can be used
to serve any of the requests. To simplify, channels are the entities
that will be doing the copy, and requests define which endpoints are
involved.

The request lines actually correspond to physical lines going from the
DMA-eligible devices to the controller itself. Whenever the device
wants to start a transfer, it will assert a DMA request (DRQ) by
asserting that request line.

A very simple DMA controller would only take into account a single
parameter: the transfer size. At each clock cycle, it would transfer a
byte of data from one buffer to another, until the transfer size has
been reached.

That wouldn't work well in the real world, since slave devices might
require a specific number of bits to be transferred in a single
cycle. For example, we may want to transfer as much data as the
physical bus allows to maximize performance when doing a simple
memory copy operation, but our audio device could have a narrower FIFO
that requires data to be written exactly 16 or 24 bits at a time. This
is why most if not all of the DMA controllers can adjust this, using a
parameter called the transfer width.

Moreover, some DMA controllers, whenever the RAM is used as a source
or destination, can group the reads or writes in memory into a buffer,
so instead of having a lot of small memory accesses, which is not
really efficient, you'll get several bigger transfers. This is done
using a parameter called the burst size, which defines how many single
reads/writes the controller is allowed to do in a row before splitting
the transfer into smaller sub-transfers. For example, with a transfer
width of 32 bits and a burst size of 8, each burst moves 32 bytes in
eight back-to-back accesses.

Our theoretical DMA controller would then only be able to do transfers
that involve a single contiguous block of data. However, some of the
transfers we usually have are not contiguous, and want to copy data
from non-contiguous buffers to a contiguous buffer, which is called
scatter-gather.

DMAEngine, at least for mem2dev transfers, requires support for
scatter-gather. So we're left with two cases here: either we have a
quite simple DMA controller that doesn't support it, and we'll have to
implement it in software, or we have a more advanced DMA controller
that implements scatter-gather in hardware.

The latter are usually programmed using a collection of chunks to
transfer, and whenever the transfer is started, the controller will go
over that collection, doing whatever we programmed there.

This collection is usually either a table or a linked list. You will
then push either the address of the table and its number of elements,
or the first item of the list to one channel of the DMA controller,
and whenever a DRQ is asserted, it will go through the collection to
know where to fetch the data from.

Either way, the format of this collection is completely dependent on
your hardware. Each DMA controller will require a different structure,
but all of them will require, for every chunk, at least the source and
destination addresses, whether it should increment these addresses or
not, and the three parameters we saw earlier: the burst size, the
transfer width and the transfer size.
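
To make this more concrete, here is a purely illustrative sketch of what
one chunk of such a collection could look like for a linked-list based
controller. The structure name, field names and layout are hypothetical:
every controller defines its own format.

.. code-block:: c

   /* Purely illustrative: each controller defines its own layout. */
   struct example_hw_desc {
           u32 src_addr;   /* where to read the chunk from */
           u32 dst_addr;   /* where to write the chunk to */
           u32 len;        /* transfer size, in bytes */
           u32 cfg;        /* transfer width, burst size, and whether the
                            * source/destination addresses increment */
           u32 next_desc;  /* bus address of the next chunk, or 0 to stop */
   };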

The one last thing is that usually, slave devices won't issue DRQ by
default, and you have to enable this in your slave device driver first
whenever you're willing to use DMA.

These were just the general memory-to-memory (also called mem2mem) or
memory-to-device (mem2dev) kinds of transfers. Most devices support
other kinds of transfers or memory operations that dmaengine supports,
and these will be detailed later in this document.

DMA Support in Linux
====================

Historically, DMA controller drivers have been implemented using the
async TX API, to offload operations such as memory copy, XOR,
cryptography, etc., basically any memory to memory operation.

Over time, the need for memory to device transfers arose, and
dmaengine was extended. Nowadays, the async TX API is written as a
layer on top of dmaengine, and acts as a client. Still, dmaengine
accommodates that API in some cases, and made some design choices to
ensure that it stayed compatible.

For more information on the Async TX API, please look at the relevant
documentation file in Documentation/crypto/async-tx-api.rst.

DMAEngine APIs
==============

``struct dma_device`` Initialization
------------------------------------

Just like any other kernel framework, the whole DMAEngine registration
relies on the driver filling a structure and registering against the
framework. In our case, that structure is dma_device.

The first thing you need to do in your driver is to allocate this
structure. Any of the usual memory allocators will do, but you'll also
need to initialize a few fields in there (a minimal sketch follows the
list below):

- ``channels``: should be initialized as a list using the
  INIT_LIST_HEAD macro for example

- ``src_addr_widths``:
  should contain a bitmask of the supported source transfer widths

- ``dst_addr_widths``:
  should contain a bitmask of the supported destination transfer widths

- ``directions``:
  should contain a bitmask of the supported slave directions
  (i.e. excluding mem2mem transfers)

- ``residue_granularity``:
  granularity of the transfer residue reported to dma_set_residue.
  This can be either:

  - Descriptor:
    your device doesn't support any kind of residue
    reporting. The framework will only know that a particular
    transaction descriptor is done.

  - Segment:
    your device is able to report which chunks have been transferred

  - Burst:
    your device is able to report which bursts have been transferred

- ``dev``: should hold the pointer to the ``struct device`` associated
  to your current driver instance.
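
A minimal sketch of this initialization, at probe time, could look like
the following. The ``foo_`` names and the surrounding probe function are
hypothetical; only the ``dma_device`` fields are what matters here:

.. code-block:: c

   #include <linux/dmaengine.h>
   #include <linux/platform_device.h>

   /* Hypothetical driver private structure embedding a dma_device. */
   struct foo_dma_dev {
           struct dma_device ddev;
           /* ... controller specific state ... */
   };

   static int foo_dma_probe(struct platform_device *pdev)
   {
           struct foo_dma_dev *fd;
           struct dma_device *dd;

           fd = devm_kzalloc(&pdev->dev, sizeof(*fd), GFP_KERNEL);
           if (!fd)
                   return -ENOMEM;

           dd = &fd->ddev;
           INIT_LIST_HEAD(&dd->channels);
           dd->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
           dd->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
           dd->directions = BIT(DMA_MEM_TO_DEV) | BIT(DMA_DEV_TO_MEM);
           dd->residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
           dd->dev = &pdev->dev;

           /* capabilities and callbacks are filled in below... */

           return 0;
   }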

Supported transaction types
---------------------------

The next thing you need is to set which transaction types your device
(and driver) supports.

Our ``dma_device`` structure has a field called cap_mask that holds the
various types of transaction supported, and you need to modify this
mask using the dma_cap_set function, with various flags depending on
the transaction types you support as an argument.

All those capabilities are defined in the ``dma_transaction_type`` enum,
in ``include/linux/dmaengine.h``
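
For example, a driver for a controller that can do both slave and plain
memory-to-memory transfers might set its mask like this (continuing the
hypothetical ``dd`` pointer from the sketch above):

.. code-block:: c

   dma_cap_zero(dd->cap_mask);
   dma_cap_set(DMA_SLAVE, dd->cap_mask);
   dma_cap_set(DMA_CYCLIC, dd->cap_mask);
   dma_cap_set(DMA_MEMCPY, dd->cap_mask);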

Currently, the types available are:

- DMA_MEMCPY

  - The device is able to do memory to memory copies

- DMA_MEMCPY_SG

  - The device supports memory to memory scatter-gather transfers.

  - Even though a plain memcpy can look like a particular case of a
    scatter-gather transfer, with a single chunk to copy, it's a distinct
    transaction type in the mem2mem transfer case. This is because some very
    simple devices might be able to do contiguous single-chunk memory copies,
    but have no support for more complex SG transfers.

  - No matter what the overall size of the combined chunks for source and
    destination is, only as many bytes as the smallest of the two will be
    transmitted. That means the number and size of the scatter-gather buffers
    in both lists need not be the same, and that the operation functionally is
    equivalent to a ``strncpy`` where the ``count`` argument equals the
    smallest total size of the two scatter-gather list buffers.

  - It's usually used for copying pixel data between host memory and
    memory-mapped GPU device memory, such as found on modern PCI video
    graphics cards. The most immediate example is the OpenGL API function
    ``glReadPixels()``, which might require a verbatim copy of a huge
    framebuffer from local device memory onto host memory.

- DMA_XOR

  - The device is able to perform XOR operations on memory areas

  - Used to accelerate XOR intensive tasks, such as RAID5

- DMA_XOR_VAL

  - The device is able to perform parity check using the XOR
    algorithm against a memory buffer.

- DMA_PQ

  - The device is able to perform RAID6 P+Q computations, P being a
    simple XOR, and Q being a Reed-Solomon algorithm.

- DMA_PQ_VAL

  - The device is able to perform parity check using RAID6 P+Q
    algorithm against a memory buffer.

- DMA_INTERRUPT

  - The device is able to trigger a dummy transfer that will
    generate periodic interrupts

  - Used by the client drivers to register a callback that will be
    called on a regular basis through the DMA controller interrupt

- DMA_PRIVATE

  - The device only supports slave transfers, and as such isn't
    available for async transfers.

- DMA_ASYNC_TX

  - Must not be set by the device, and will be set by the framework
    if needed

  - TODO: What is it about?

- DMA_SLAVE

  - The device can handle device to memory transfers, including
    scatter-gather transfers.

  - While in the mem2mem case we had two distinct types to deal
    with a single chunk to copy or a collection of them, here we
    just have a single transaction type that is supposed to
    handle both.

  - If you want to transfer a single contiguous memory buffer,
    simply build a scatter list with only one item.

- DMA_CYCLIC

  - The device can handle cyclic transfers.

  - A cyclic transfer is a transfer where the chunk collection will
    loop over itself, with the last item pointing to the first.

  - It's usually used for audio transfers, where you want to operate
    on a single ring buffer that you will fill with your audio data.

- DMA_INTERLEAVE

  - The device supports interleaved transfers.

  - These transfers can transfer data from a non-contiguous buffer
    to a non-contiguous buffer, as opposed to DMA_SLAVE, which can
    only transfer data from a non-contiguous data set to a contiguous
    destination buffer.

  - It's usually used for 2d content transfers, in which case you
    want to transfer a portion of uncompressed data directly to the
    display to show it.

- DMA_COMPLETION_NO_ORDER

  - The device does not support in order completion.

  - The driver should return DMA_OUT_OF_ORDER for device_tx_status if
    the device is setting this capability.

  - All cookie tracking and checking API should be treated as invalid if
    the device exports this capability.

  - At this point, this is incompatible with the polling option for dmatest.

  - If this cap is set, the user is recommended to provide a unique
    identifier for each descriptor sent to the DMA device in order to
    properly track the completion.

- DMA_REPEAT

  - The device supports repeated transfers. A repeated transfer, indicated by
    the DMA_PREP_REPEAT transfer flag, is similar to a cyclic transfer in that
    it gets automatically repeated when it ends, but can additionally be
    replaced by the client.

  - This feature is limited to interleaved transfers, this flag should thus not
    be set if the DMA_INTERLEAVE flag isn't set. This limitation is based on
    the current needs of DMA clients, support for additional transfer types
    should be added in the future if and when the need arises.

- DMA_LOAD_EOT

  - The device supports replacing repeated transfers at end of transfer (EOT)
    by queuing a new transfer with the DMA_PREP_LOAD_EOT flag set.

  - Support for replacing a currently running transfer at another point (such
    as end of burst instead of end of transfer) will be added in the future
    based on DMA clients' needs, if and when the need arises.

These various types will also affect how the source and destination
addresses change over time.

Addresses pointing to RAM are typically incremented (or decremented)
after each transfer. In case of a ring buffer, they may loop
(DMA_CYCLIC). Addresses pointing to a device's register (e.g. a FIFO)
are typically fixed.

Per descriptor metadata support
-------------------------------

Some data movement architectures (DMA controller and peripherals) use metadata
associated with a transaction. The DMA controller's role is to transfer the
payload and the metadata alongside.
The metadata itself is not used by the DMA engine, but it contains parameters,
keys, vectors, etc. for the peripheral or from the peripheral.

The DMAengine framework provides a generic way to facilitate metadata for
descriptors. Depending on the architecture, the DMA driver can implement
either or both of the methods (a provider-side sketch follows the list below)
and it is up to the client driver to choose which one to use.

- DESC_METADATA_CLIENT

  The metadata buffer is allocated/provided by the client driver and it is
  attached (via the dmaengine_desc_attach_metadata() helper) to the
  descriptor.

  From the DMA driver the following is expected for this mode:

  - DMA_MEM_TO_DEV / DMA_MEM_TO_MEM

    The data from the provided metadata buffer should be prepared for the DMA
    controller to be sent alongside of the payload data, either by copying it
    to a hardware descriptor or as part of a highly coupled packet.

  - DMA_DEV_TO_MEM

    On transfer completion the DMA driver must copy the metadata to the
    client-provided metadata buffer before notifying the client about the
    completion. After the transfer completion, DMA drivers must not touch
    the metadata buffer provided by the client.

- DESC_METADATA_ENGINE

  The metadata buffer is allocated/managed by the DMA driver. The client
  driver can ask for the pointer, maximum size and the currently used size of
  the metadata and can directly update or read it.
  dmaengine_desc_get_metadata_ptr() and dmaengine_desc_set_metadata_len() are
  provided as helper functions.

  From the DMA driver the following is expected for this mode:

  - get_metadata_ptr()

    Should return a pointer to the metadata buffer, the maximum size of the
    metadata buffer and the currently used / valid (if any) bytes in the
    buffer.

  - set_metadata_len()

    It is called by the client after it has placed the metadata in the buffer
    to let the DMA driver know the number of valid bytes provided.

  Note: since the client will ask for the metadata pointer in the completion
  callback (in the DMA_DEV_TO_MEM case) the DMA driver must ensure that the
  descriptor is not freed up before the callback is called.
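
On the provider side, supporting these modes boils down to advertising them in
the ``desc_metadata_modes`` field of ``struct dma_device`` and, for
DESC_METADATA_ENGINE, pointing each descriptor at a
``struct dma_descriptor_metadata_ops``. The following is only a hedged
sketch: the ``foo_*`` names, the fixed-size metadata area and the surrounding
driver structure are all hypothetical.

.. code-block:: c

   #include <linux/dmaengine.h>

   /* Hypothetical per-descriptor state with a driver-owned metadata area. */
   struct foo_desc {
           struct dma_async_tx_descriptor txd;
           u8 metadata[128];
           size_t metadata_len;
   };

   static void *foo_get_metadata_ptr(struct dma_async_tx_descriptor *txd,
                                     size_t *payload_len, size_t *max_len)
   {
           struct foo_desc *d = container_of(txd, struct foo_desc, txd);

           *payload_len = d->metadata_len;
           *max_len = sizeof(d->metadata);
           return d->metadata;
   }

   static int foo_set_metadata_len(struct dma_async_tx_descriptor *txd,
                                   size_t payload_len)
   {
           struct foo_desc *d = container_of(txd, struct foo_desc, txd);

           if (payload_len > sizeof(d->metadata))
                   return -EINVAL;
           d->metadata_len = payload_len;
           return 0;
   }

   static struct dma_descriptor_metadata_ops foo_metadata_ops = {
           .get_ptr = foo_get_metadata_ptr,
           .set_len = foo_set_metadata_len,
   };

   /* In probe: dd->desc_metadata_modes = DESC_METADATA_ENGINE; */
   /* In each device_prep_*() callback, before returning the descriptor: */
   /* d->txd.metadata_ops = &foo_metadata_ops; */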

Device operations
-----------------

Our ``dma_device`` structure also requires a few function pointers in
order to implement the actual logic, now that we have described what
operations we are able to perform.

The functions that we have to fill in there, and hence have to
implement, obviously depend on the transaction types you reported as
supported.

- ``device_alloc_chan_resources``

- ``device_free_chan_resources``

  - These functions will be called whenever a driver calls
    ``dma_request_channel`` or ``dma_release_channel`` for the first/last
    time on the channel associated to that driver.

  - They are in charge of allocating/freeing all the needed
    resources in order for that channel to be useful for your driver.

  - These functions can sleep.

- ``device_prep_dma_*``

  - These functions match the capabilities you registered
    previously (a skeleton combining them with ``tx_submit`` and
    ``device_issue_pending`` is sketched after this list).

  - These functions all take the buffer or the scatterlist relevant
    for the transfer being prepared, and should create a hardware
    descriptor or a list of hardware descriptors from it

  - These functions can be called from an interrupt context

  - Any allocation you might do should be using the GFP_NOWAIT
    flag, in order not to potentially sleep, but without depleting
    the emergency pool either.

  - Drivers should try to pre-allocate any memory they might need
    during the transfer setup at probe time to avoid putting too
    much pressure on the nowait allocator.

  - It should return a unique instance of the
    ``dma_async_tx_descriptor`` structure, that further represents this
    particular transfer.

  - This structure can be initialized using the function
    ``dma_async_tx_descriptor_init``.

  - You'll also need to set two fields in this structure:

    - flags:
      TODO: Can it be modified by the driver itself, or
      should it always be the flags passed in the arguments

    - tx_submit: A pointer to a function you have to implement,
      that is supposed to push the current transaction descriptor to a
      pending queue, waiting for issue_pending to be called.

  - In this structure the function pointer callback_result can be
    initialized in order for the submitter to be notified that a
    transaction has completed. In the earlier code the function pointer
    callback has been used. However, it does not provide any status for the
    transaction and will be deprecated. The result structure defined as
    ``dmaengine_result`` that is passed in to callback_result
    has two fields:

    - result: This provides the transfer result defined by
      ``dmaengine_tx_result``. Either success or some error condition.

    - residue: Provides the residue bytes of the transfer for those that
      support residue.

- ``device_issue_pending``

  - Takes the first transaction descriptor in the pending queue,
    and starts the transfer. Whenever that transfer is done, it
    should move to the next transaction in the list.

  - This function can be called in an interrupt context

- ``device_tx_status``

  - Should report the bytes left to go over on the given channel

  - Should only care about the transaction descriptor passed as
    argument, not the currently active one on a given channel

  - The tx_state argument might be NULL

  - Should use dma_set_residue to report it

  - In the case of a cyclic transfer, it should only take into
    account the current period.

  - Should return DMA_OUT_OF_ORDER if the device does not support in order
    completion and is completing the operation out of order.

  - This function can be called in an interrupt context.

- ``device_config``

  - Reconfigures the channel with the configuration given as argument

  - This command should NOT perform synchronously, or on any
    currently queued transfers, but only on subsequent ones

  - In this case, the function will receive a ``dma_slave_config``
    structure pointer as an argument, that will detail which
    configuration to use.

  - Even though that structure contains a direction field, this
    field is deprecated in favor of the direction argument given to
    the prep_* functions

  - This call is mandatory for slave operations only. This should NOT be
    set or expected to be set for memcpy operations.
    If a driver supports both, it should use this call for slave
    operations only and not for memcpy ones.

- ``device_pause``

  - Pauses a transfer on the channel

  - This command should operate synchronously on the channel,
    pausing right away the work of the given channel

- ``device_resume``

  - Resumes a transfer on the channel

  - This command should operate synchronously on the channel,
    resuming right away the work of the given channel

- ``device_terminate_all``

  - Aborts all the pending and ongoing transfers on the channel

  - For aborted transfers the complete callback should not be called

  - Can be called from atomic context or from within a complete
    callback of a descriptor. Must not sleep. Drivers must be able
    to handle this correctly.

  - Termination may be asynchronous. The driver does not have to
    wait until the currently active transfer has completely stopped.
    See device_synchronize.

- ``device_synchronize``

  - Must synchronize the termination of a channel to the current
    context.

  - Must make sure that memory for previously submitted
    descriptors is no longer accessed by the DMA controller.

  - Must make sure that all complete callbacks for previously
    submitted descriptors have finished running and none are
    scheduled to run.

  - May sleep.

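
Putting a few of these callbacks together, a minimal and deliberately
simplified skeleton for a slave-capable provider might look like the sketch
below. All ``foo_*`` names are hypothetical, locking is reduced to the bare
minimum and the actual hardware programming is elided:

.. code-block:: c

   #include <linux/dmaengine.h>
   #include <linux/slab.h>
   #include <linux/spinlock.h>

   #include "dmaengine.h"          /* drivers/dma/ private helpers */

   /* Hypothetical channel and descriptor types. */
   struct foo_chan {
           struct dma_chan chan;
           spinlock_t lock;
           struct list_head pending;
   };

   struct foo_desc {
           struct dma_async_tx_descriptor txd;
           struct list_head node;
           /* ... hardware descriptors, sizes, ... */
   };

   static dma_cookie_t foo_tx_submit(struct dma_async_tx_descriptor *txd)
   {
           struct foo_chan *fc = container_of(txd->chan, struct foo_chan, chan);
           struct foo_desc *d = container_of(txd, struct foo_desc, txd);
           dma_cookie_t cookie;
           unsigned long flags;

           spin_lock_irqsave(&fc->lock, flags);
           cookie = dma_cookie_assign(txd);
           /* Only queue it: nothing starts until device_issue_pending(). */
           list_add_tail(&d->node, &fc->pending);
           spin_unlock_irqrestore(&fc->lock, flags);

           return cookie;
   }

   static struct dma_async_tx_descriptor *
   foo_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
                     unsigned int sg_len, enum dma_transfer_direction dir,
                     unsigned long flags, void *context)
   {
           struct foo_desc *d;

           /* May run in interrupt context: no sleeping allocations here. */
           d = kzalloc(sizeof(*d), GFP_NOWAIT);
           if (!d)
                   return NULL;

           /* ... translate sgl into hardware descriptors here ... */

           dma_async_tx_descriptor_init(&d->txd, chan);
           d->txd.flags = flags;
           d->txd.tx_submit = foo_tx_submit;

           return &d->txd;
   }

   static void foo_issue_pending(struct dma_chan *chan)
   {
           struct foo_chan *fc = container_of(chan, struct foo_chan, chan);
           unsigned long flags;

           spin_lock_irqsave(&fc->lock, flags);
           if (!list_empty(&fc->pending)) {
                   /* ... start the first pending descriptor on the hardware ... */
           }
           spin_unlock_irqrestore(&fc->lock, flags);
   }

These would then be hooked up at probe time through the corresponding
``device_prep_slave_sg``, ``device_issue_pending`` (and so on) fields of
``struct dma_device`` before registering it.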

Misc notes
==========

(stuff that should be documented, but we don't really know
where to put it)

``dma_run_dependencies``

- Should be called at the end of an async TX transfer, and can be
  ignored in the slave transfers case.

- Makes sure that dependent operations are run before marking it
  as complete.

dma_cookie_t

- it's a DMA transaction ID that will increment over time.

- Not really relevant any more since the introduction of ``virt-dma``
  that abstracts it away.

DMA_CTRL_ACK

- If clear, the descriptor cannot be reused by the provider until the
  client acknowledges receipt, i.e. has a chance to establish any
  dependency chains

- This can be acked by invoking async_tx_ack()

- If set, it does not mean the descriptor can be reused

DMA_CTRL_REUSE

- If set, the descriptor can be reused after being completed. It should
  not be freed by the provider if this flag is set.

- The descriptor should be prepared for reuse by invoking
  ``dmaengine_desc_set_reuse()`` which will set DMA_CTRL_REUSE.

- ``dmaengine_desc_set_reuse()`` will succeed only when the channel supports
  reusable descriptors, as exhibited by its capabilities (a client-side
  sketch follows this list).

- As a consequence, if a device driver wants to skip the
  ``dma_map_sg()`` and ``dma_unmap_sg()`` in between 2 transfers,
  because the DMA'd data wasn't used, it can resubmit the transfer right after
  its completion.

- A descriptor can be freed in a few ways:

  - Clearing DMA_CTRL_REUSE by invoking
    ``dmaengine_desc_clear_reuse()`` and submitting it for the last txn

  - Explicitly invoking ``dmaengine_desc_free()``, this can succeed only
    when DMA_CTRL_REUSE is already set

  - Terminating the channel
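
From the client's point of view, the reuse flow looks roughly like the
following hedged sketch; ``desc`` is assumed to come from a prior
``dmaengine_prep_*()`` call and ``foo_submit_reusable()`` is a hypothetical
helper, not an existing API:

.. code-block:: c

   #include <linux/dmaengine.h>

   static int foo_submit_reusable(struct dma_chan *chan,
                                  struct dma_async_tx_descriptor *desc)
   {
           dma_cookie_t cookie;

           /* Fails if the channel doesn't advertise descriptor reuse. */
           if (dmaengine_desc_set_reuse(desc))
                   return -EINVAL;

           cookie = dmaengine_submit(desc);
           if (dma_submit_error(cookie))
                   return -EIO;

           dma_async_issue_pending(chan);
           return 0;
   }

   /* Once the transfer completes, the same descriptor can be submitted
    * again without re-preparing it. When it is no longer needed:
    * dmaengine_desc_free(desc);
    */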

- DMA_PREP_CMD

  - If set, the client driver tells the DMA controller that the data passed
    to the DMA API is command data.

  - Interpretation of command data is DMA controller specific. It can be
    used for issuing commands to other peripherals/register reads/register
    writes for which the descriptor should be in a different format from
    normal data descriptors.

- DMA_PREP_REPEAT

  - If set, the transfer will be automatically repeated when it ends until a
    new transfer is queued on the same channel with the DMA_PREP_LOAD_EOT flag.
    If the next transfer to be queued on the channel does not have the
    DMA_PREP_LOAD_EOT flag set, the current transfer will be repeated until the
    client terminates all transfers.

  - This flag is only supported if the channel reports the DMA_REPEAT
    capability.

- DMA_PREP_LOAD_EOT

  - If set, the transfer will replace the transfer currently being executed
    when that transfer ends.

  - This is the default behaviour for non-repeated transfers, specifying
    DMA_PREP_LOAD_EOT for non-repeated transfers will thus make no difference.

  - When using repeated transfers, DMA clients will usually need to set the
    DMA_PREP_LOAD_EOT flag on all transfers, otherwise the channel will keep
    repeating the last repeated transfer and ignore the new transfers being
    queued. Failure to set DMA_PREP_LOAD_EOT will appear as if the channel was
    stuck on the previous transfer.

  - This flag is only supported if the channel reports the DMA_LOAD_EOT
    capability (a usage sketch follows below).
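
For instance, a display-like client replacing one repeated frame with the
next one might do something along the lines of the hedged sketch below;
``foo_queue_frame()`` is hypothetical and ``xt`` is assumed to be an
interleaved template describing the frame, filled in elsewhere:

.. code-block:: c

   #include <linux/dmaengine.h>

   static int foo_queue_frame(struct dma_chan *chan,
                              struct dma_interleaved_template *xt)
   {
           struct dma_async_tx_descriptor *desc;

           /* The new frame repeats itself, and replaces the previously
            * repeating frame at its end of transfer. */
           desc = dmaengine_prep_interleaved_dma(chan, xt,
                                                 DMA_PREP_REPEAT |
                                                 DMA_PREP_LOAD_EOT);
           if (!desc)
                   return -ENOMEM;

           dmaengine_submit(desc);
           dma_async_issue_pending(chan);
           return 0;
   }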

General Design Notes
====================

Most of the DMAEngine drivers you'll see are based on a similar design
that handles end-of-transfer interrupts in the interrupt handler, but
defers most work to a tasklet, including the start of a new transfer
whenever the previous transfer ended.

This is a rather inefficient design though, because the inter-transfer
latency is not only the interrupt latency, but also the scheduling
latency of the tasklet, which leaves the channel idle in between and
slows down the global transfer rate.

You should avoid this kind of practice, and instead of electing a new
transfer in your tasklet, move that part to the interrupt handler in
order to have a shorter idle window (that we can't really avoid
anyway).

Glossary
========

- Burst: A number of consecutive read or write operations that
  can be queued to buffers before being flushed to memory.

- Chunk: A contiguous collection of bursts

- Transfer: A collection of chunks (be it contiguous or not)