In Linux 2.5 kernels (and later), USB device drivers have additional control
over how DMA may be used to perform I/O operations. The APIs are detailed
in the kernel USB programming guide (kerneldoc, from the source code).


API OVERVIEW

The big picture is that USB drivers can continue to ignore most DMA issues,
though they still must provide DMA-ready buffers (see
Documentation/PCI/PCI-DMA-mapping.txt). That's how they've worked through
the 2.4 (and earlier) kernels.

Alternatively, they can now be DMA-aware:

- New calls enable DMA-aware drivers, letting them allocate DMA buffers and
  manage DMA mappings for existing DMA-ready buffers (see below).

- URBs have an additional "transfer_dma" field, as well as a transfer_flags
  bit saying whether it's valid. (Control requests also have "setup_dma" and
  a corresponding transfer_flags bit.)

- "usbcore" will map those DMA addresses, if a DMA-aware driver didn't do
  it first and set URB_NO_TRANSFER_DMA_MAP or URB_NO_SETUP_DMA_MAP. HCDs
  don't manage DMA mappings for URBs.

- There's a new "generic DMA API", parts of which are usable by USB device
  drivers. Never use dma_set_mask() on any USB interface or device; that
  would potentially break all devices sharing that bus.


ELIMINATING COPIES

It's good to avoid making CPUs copy data needlessly. The costs can add up,
and effects like cache thrashing can impose subtle penalties.

- If you're doing lots of small data transfers from the same buffer all
  the time, that can really burn up resources on systems which use an
  IOMMU to manage the DMA mappings. It can cost MUCH more to set up and
  tear down the IOMMU mappings with each request than to perform the I/O!

  For those specific cases, USB has primitives to allocate less expensive
  memory. They work like versions of kmalloc and kfree that give you the
  right kind of addresses to store in urb->transfer_buffer and
  urb->transfer_dma. You'd also set URB_NO_TRANSFER_DMA_MAP in
  urb->transfer_flags:

	void *usb_alloc_coherent (struct usb_device *dev, size_t size,
		int mem_flags, dma_addr_t *dma);

	void usb_free_coherent (struct usb_device *dev, size_t size,
		void *addr, dma_addr_t dma);

  Most drivers should *NOT* be using these primitives; they don't need
  to use this type of memory ("dma-coherent"), and memory returned from
  kmalloc() will work just fine.

  For control transfers you can decide independently, for the transfer
  buffer and for the setup buffer, whether to use the buffer primitives.
  Set the flag bits URB_NO_TRANSFER_DMA_MAP and URB_NO_SETUP_DMA_MAP to
  indicate which buffers you have prepared. For non-control transfers
  URB_NO_SETUP_DMA_MAP is ignored.

  The memory buffer returned is "dma-coherent"; sometimes you might need to
  force a consistent memory access ordering by using memory barriers. It's
  not using a streaming DMA mapping, so it's good for small transfers on
  systems where the I/O would otherwise thrash an IOMMU mapping. (See
  Documentation/PCI/PCI-DMA-mapping.txt for definitions of "coherent" and
  "streaming" DMA mappings.)

  Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
  space-efficient.

  On most systems the memory returned will be uncached, because the
  semantics of dma-coherent memory require either bypassing CPU caches
  or using cache hardware with bus-snooping support. While x86 hardware
  has such bus-snooping, many other systems use software to flush cache
  lines to prevent DMA conflicts.

- Devices on some EHCI controllers could handle DMA to/from high memory.

  Unfortunately, the current Linux DMA infrastructure doesn't have a sane
  way to expose these capabilities ... and in any case, HIGHMEM is mostly a
  design wart specific to x86_32. So your best bet is to ensure you never
  pass a highmem buffer into a USB driver. That's easy; it's the default
  behavior. Just don't override it, e.g. with NETIF_F_HIGHDMA.

  This may force your callers to do some bounce buffering, copying from
  high memory to "normal" DMA memory. If you can come up with a good way
  to fix this issue (for x86_32 machines with over 1 GByte of memory),
  feel free to submit patches.
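
As a rough illustration of the coherent-buffer primitives in the first item
above, the sketch below allocates a buffer, stores both addresses in the URB,
and sets URB_NO_TRANSFER_DMA_MAP. It is a userspace mock, not driver code:
struct urb, struct usb_device, dma_addr_t, the flag value, and the bodies of
the two usb_*_coherent() functions are stand-ins so that the flow compiles
outside the kernel, and example_setup_urb() and example_teardown_urb() are
hypothetical helper names. A real driver would include <linux/usb.h> and pass
real allocation flags such as GFP_KERNEL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for kernel definitions (assumptions, for illustration only). */
typedef uint64_t dma_addr_t;
struct usb_device { int unused; };
#define URB_NO_TRANSFER_DMA_MAP 0x0004u

struct urb {
	void *transfer_buffer;			/* CPU address of the buffer */
	dma_addr_t transfer_dma;		/* DMA address of the same buffer */
	unsigned int transfer_flags;
	unsigned int transfer_buffer_length;
};

/* Stub using the prototype quoted in the text; plain malloc() stands in
 * for the dma-coherent allocation a kernel would perform. */
static void *usb_alloc_coherent(struct usb_device *dev, size_t size,
		int mem_flags, dma_addr_t *dma)
{
	void *buf = malloc(size);

	(void)dev;
	(void)mem_flags;
	if (buf)
		*dma = (dma_addr_t)(uintptr_t)buf;	/* fake DMA address */
	return buf;
}

static void usb_free_coherent(struct usb_device *dev, size_t size,
		void *addr, dma_addr_t dma)
{
	(void)dev;
	(void)size;
	(void)dma;
	free(addr);
}

/* Hypothetical helper: give the urb a coherent buffer and tell usbcore
 * not to map it again. Returns 0 on success, -1 on allocation failure. */
static int example_setup_urb(struct usb_device *dev, struct urb *urb,
		size_t len)
{
	urb->transfer_buffer = usb_alloc_coherent(dev, len, 0,
			&urb->transfer_dma);
	if (!urb->transfer_buffer)
		return -1;
	urb->transfer_buffer_length = (unsigned int)len;
	urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
	return 0;
}

/* Hypothetical helper: release the buffer with the same size and the
 * same address pair that the allocation produced. */
static void example_teardown_urb(struct usb_device *dev, struct urb *urb)
{
	assert(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP);
	usb_free_coherent(dev, urb->transfer_buffer_length,
			urb->transfer_buffer, urb->transfer_dma);
	urb->transfer_buffer = NULL;
	urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
}
```

The point of pairing the two helpers is that the size, CPU address, and DMA
address handed to usb_free_coherent() must be the ones obtained at allocation
time, so keeping them together in the URB avoids a separate bookkeeping
structure.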


WORKING WITH EXISTING BUFFERS

Existing buffers aren't usable for DMA without first being mapped into the
DMA address space of the device. However, most buffers passed to your
driver can safely be used with such DMA mapping. (See the first section
of Documentation/PCI/PCI-DMA-mapping.txt, titled "What memory is DMA-able?")

- When you're using scatterlists, you can map everything at once. On some
  systems, this kicks in an IOMMU and turns the scatterlists into single
  DMA transactions:

	int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int nents);

	void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

	void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

  It's probably easier to use the new usb_sg_*() calls, which do the DMA
  mapping and apply other tweaks to make scatterlist I/O fast.

- Some drivers may prefer a model in which they map large buffers once and
  synchronize them for safe re-use. (If there's no re-use, then let
  usbcore do the map/unmap.) Large periodic transfers make good examples
  here, since it's cheaper to just synchronize the buffer than to unmap it
  each time an urb completes and then re-map it during resubmission.

  These calls all work with initialized urbs: urb->dev, urb->pipe,
  urb->transfer_buffer, and urb->transfer_buffer_length must all be
  valid when these calls are used (urb->setup_packet must be valid too
  if urb is a control request):

	struct urb *usb_buffer_map (struct urb *urb);

	void usb_buffer_dmasync (struct urb *urb);

	void usb_buffer_unmap (struct urb *urb);

  The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP
  so that usbcore won't map or unmap the buffer. The same goes for
  urb->setup_dma and URB_NO_SETUP_DMA_MAP for control requests.

Note that several of those interfaces are currently commented out, since
they don't have current users. See the source code. Other than the dmasync
calls (where the underlying DMA primitives have changed), most of them can
easily be commented back in if you want to use them.
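
The map-once, synchronize-on-reuse model for large periodic transfers can be
sketched roughly as follows. Because several of these interfaces are commented
out in the source, the three usb_buffer_*() functions here are userspace stubs
that only count calls and toggle the flag; they reuse the prototypes quoted
above but are not the kernel implementations. The reduced struct urb, the flag
value, the counters, and example_periodic_reuse() are likewise assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel definitions (assumptions, for illustration only). */
#define URB_NO_TRANSFER_DMA_MAP 0x0004u

struct urb {
	void *transfer_buffer;
	unsigned int transfer_buffer_length;
	unsigned int transfer_flags;
};

/* Call counters, so the cost of each strategy can be observed. */
static int map_calls, sync_calls, unmap_calls;

/* Stub: pretend to create the DMA mapping and mark the urb as pre-mapped. */
static struct urb *usb_buffer_map(struct urb *urb)
{
	if (!urb || !urb->transfer_buffer)
		return NULL;
	map_calls++;
	urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
	return urb;
}

/* Stub: pretend to synchronize an existing mapping before re-use. */
static void usb_buffer_dmasync(struct urb *urb)
{
	assert(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP);
	sync_calls++;
}

/* Stub: pretend to tear the mapping down and clear the flag. */
static void usb_buffer_unmap(struct urb *urb)
{
	unmap_calls++;
	urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
}

/* Hypothetical helper: run `rounds` periodic transfers over one mapping.
 * Returns 0 on success, -1 if the buffer could not be mapped. */
static int example_periodic_reuse(struct urb *urb, int rounds)
{
	int i;

	if (!usb_buffer_map(urb))		/* map once, up front */
		return -1;
	for (i = 0; i < rounds; i++) {
		/* ... submit the urb and wait for completion (omitted) ... */
		usb_buffer_dmasync(urb);	/* sync instead of unmap/re-map */
	}
	usb_buffer_unmap(urb);			/* unmap once, at the end */
	return 0;
}
```

With this structure the number of map and unmap operations stays at one each
regardless of how many times the urb is resubmitted; only the synchronize
step scales with the number of transfers.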