<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

<book id="DoingIO">
 <bookinfo>
  <title>Bus-Independent Device Accesses</title>

  <authorgroup>
   <author>
    <firstname>Matthew</firstname>
    <surname>Wilcox</surname>
    <affiliation>
     <address>
      <email>matthew@wil.cx</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <authorgroup>
   <author>
    <firstname>Alan</firstname>
    <surname>Cox</surname>
    <affiliation>
     <address>
      <email>alan@redhat.com</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <copyright>
   <year>2001</year>
   <holder>Matthew Wilcox</holder>
  </copyright>

  <legalnotice>
   <para>
     This documentation is free software; you can redistribute
     it and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
   </para>

   <para>
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     See the GNU General Public License for more details.
   </para>

   <para>
     You should have received a copy of the GNU General Public
     License along with this program; if not, write to the Free
     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
     MA 02111-1307 USA
   </para>

   <para>
     For more details see the file COPYING in the source
     distribution of Linux.
   </para>
  </legalnotice>
 </bookinfo>

 <toc></toc>

 <chapter id="intro">
  <title>Introduction</title>
  <para>
	Linux provides an API which abstracts performing IO across all buses
	and devices, allowing device drivers to be written independently of
	bus type.
  </para>
 </chapter>

 <chapter id="bugs">
  <title>Known Bugs And Assumptions</title>
  <para>
	None.
  </para>
 </chapter>

 <chapter id="mmio">
  <title>Memory Mapped IO</title>
  <sect1>
   <title>Getting Access to the Device</title>
   <para>
	The most widely supported form of IO is memory mapped IO.
	That is, a part of the CPU's address space is interpreted
	not as accesses to memory, but as accesses to a device. Some
	architectures define devices to be at a fixed address, but most
	have some method of discovering devices. The PCI bus walk is a
	good example of such a scheme. This document does not cover how
	to receive such an address, but assumes you are starting with one.
	Physical addresses are of type unsigned long.
   </para>

   <para>
	This address should not be used directly. Instead, to get an
	address suitable for passing to the accessor functions described
	below, call <function>ioremap</function>. It returns an address
	suitable for accessing the device.
   </para>

   <para>
	After you've finished using the device (say, in your module's
	exit routine), call <function>iounmap</function> to return
	the address space to the kernel. Most architectures allocate new
	address space each time you call <function>ioremap</function>, and
	they can run out unless you call <function>iounmap</function>.
   </para>
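   <para>
	The pairing of <function>ioremap</function> and
	<function>iounmap</function> can be sketched in user space as
	follows. This is only an illustrative model: the
	<function>sim_</function> names, the counter, the example physical
	address and the use of <function>calloc</function> as a stand-in for
	mapped address space are all assumptions made for the sketch, not
	kernel interfaces.
   </para>

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-ins. The real ioremap()/iounmap() exist
 * only inside the kernel; these merely model "a mapping consumes a
 * resource until it is released".
 */
static int sim_live_mappings;	/* mappings handed out and not yet returned */

static void *sim_ioremap(unsigned long phys_addr, size_t size)
{
	void *virt;

	(void)phys_addr;	/* a real mapping would depend on this address */
	virt = calloc(1, size);
	if (virt)
		sim_live_mappings++;
	return virt;
}

static void sim_iounmap(void *virt)
{
	if (virt) {
		free(virt);
		sim_live_mappings--;
	}
}

/* Map at start-up, unmap at exit: each successful map has one unmap. */
static int sim_driver_lifecycle(void)
{
	void *regs = sim_ioremap(0xfebf0000UL, 4096);	/* made-up address */

	if (!regs)
		return -1;
	/* ... device accesses through the accessor functions go here ... */
	sim_iounmap(regs);
	return sim_live_mappings;	/* 0 when nothing has leaked */
}
```

   <para>
	A driver that returned early without calling the unmap routine
	would leave the counter above zero, which is the user-space
	analogue of leaking kernel address space.
   </para>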
  </sect1>

  <sect1>
   <title>Accessing the Device</title>
   <para>
	The part of the interface most used by drivers is reading and
	writing memory-mapped registers on the device. Linux provides
	interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit
	quantities. Due to a historical accident, these are named byte,
	word, long and quad accesses. Both read and write accesses are
	supported; there is no prefetch support at this time.
   </para>

   <para>
	The functions are named <function>readb</function>,
	<function>readw</function>, <function>readl</function>,
	<function>readq</function>, <function>readb_relaxed</function>,
	<function>readw_relaxed</function>, <function>readl_relaxed</function>,
	<function>readq_relaxed</function>, <function>writeb</function>,
	<function>writew</function>, <function>writel</function> and
	<function>writeq</function>.
   </para>
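   <para>
	As a rough illustration of the naming scheme, the sketch below
	models the byte and long accessors in user space as volatile loads
	and stores on an ordinary array. The <function>sim_</function>
	names and the <varname>sim_regs</varname> array are hypothetical;
	the real accessors take an address obtained from
	<function>ioremap</function> and may do considerably more than a
	plain load or store. Note that the write functions take the value
	first and the address second.
   </para>

```c
#include <stdint.h>

/* Hypothetical stand-in for a small window of device registers. */
static uint32_t sim_regs[4];

/* 8-bit ("byte") accessors */
static inline uint8_t sim_readb(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline void sim_writeb(uint8_t val, volatile void *addr)
{
	*(volatile uint8_t *)addr = val;
}

/* 32-bit ("long") accessors; the word and quad variants follow the same shape */
static inline uint32_t sim_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

static inline void sim_writel(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Write a long and a byte to different registers, then read the long back. */
static uint32_t sim_accessor_demo(void)
{
	sim_writel(0x11223344u, &sim_regs[0]);
	sim_writeb(0xff, (uint8_t *)&sim_regs[1]);
	return sim_readl(&sim_regs[0]);
}
```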

   <para>
	Some devices (such as framebuffers) would like to use larger
	transfers than 8 bytes at a time. For these devices, the
	<function>memcpy_toio</function>, <function>memcpy_fromio</function>
	and <function>memset_io</function> functions are provided.
	Do not use memset or memcpy on IO addresses; they
	are not guaranteed to copy data in order.
   </para>
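   <para>
	The in-order property can be pictured with the following user-space
	sketch, which copies one byte at a time through a volatile pointer
	so that the stores are issued in ascending address order. This is
	an assumption-laden model, not the kernel implementation: the
	<function>sim_</function> names and the
	<varname>sim_framebuffer</varname> array are invented here, and real
	architectures are free to use wider accesses.
   </para>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a small region of device memory. */
static uint8_t sim_framebuffer[8];

/* Copy from normal memory to "device" memory, lowest address first. */
static void sim_memcpy_toio(volatile void *dst, const void *src, size_t count)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	while (count--)
		*d++ = *s++;
}

/* Copy from "device" memory to normal memory, lowest address first. */
static void sim_memcpy_fromio(void *dst, const volatile void *src, size_t count)
{
	uint8_t *d = dst;
	const volatile uint8_t *s = src;

	while (count--)
		*d++ = *s++;
}

/* Fill "device" memory with one byte value, lowest address first. */
static void sim_memset_io(volatile void *dst, int value, size_t count)
{
	volatile uint8_t *d = dst;

	while (count--)
		*d++ = (uint8_t)value;
}

/* Returns 1 if the copies round-trip and the untouched tail keeps its fill. */
static int sim_copy_demo(void)
{
	const uint8_t pixels[4] = { 1, 2, 3, 4 };
	uint8_t back[4] = { 0 };

	sim_memset_io(sim_framebuffer, 0xaa, sizeof(sim_framebuffer));
	sim_memcpy_toio(sim_framebuffer, pixels, sizeof(pixels));
	sim_memcpy_fromio(back, sim_framebuffer, sizeof(back));
	return back[0] == 1 && back[3] == 4 && sim_framebuffer[4] == 0xaa;
}
```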

   <para>
	The read and write functions are defined to be ordered. That is, the
	compiler is not permitted to reorder the I/O sequence. When the
	ordering can be compiler optimised, you can use
	<function>__readb</function> and friends to indicate the relaxed
	ordering. Use this with care.
   </para>

   <para>
	While the basic functions are defined to be synchronous with respect
	to each other and ordered with respect to each other, the buses the
	devices sit on may themselves have asynchronicity. In particular, many
	authors are burned by the fact that PCI bus writes are posted
	asynchronously. A driver author must issue a read from the same
	device to ensure that writes have occurred in the specific cases the
	author cares about. This kind of property cannot be hidden from driver
	writers in the API. In some cases, the read used to flush the device
	may be expected to fail (if the card is resetting, for example). In
	that case, the read should be done from config space, which is
	guaranteed to soft-fail if the card doesn't respond.
   </para>

   <para>
	The following is an example of flushing a write to a device when
	the driver would like to ensure the write's effects are visible prior
	to continuing execution.
   </para>

<programlisting>
static inline void
qla1280_disable_intrs(struct scsi_qla_host *ha)
{
	struct device_reg *reg;

	reg = ha->iobase;
	/* disable risc and host interrupts */
	WRT_REG_WORD(&amp;reg->ictrl, 0);
	/*
	 * The following read will ensure that the above write
	 * has been received by the device before we return from this
	 * function.
	 */
	RD_REG_WORD(&amp;reg->ictrl);
	ha->flags.ints_enabled = 0;
}
</programlisting>
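
   <para>
	Why the read-back works can be pictured with a toy model of a bus
	that queues writes and only delivers them when the queue fills or a
	read is issued. Everything in this sketch is an assumption made for
	illustration: the <structname>sim_bus</structname> structure, the
	queue depth and the <function>posted_</function> function names are
	invented, and real bridges are far more complicated.
   </para>

```c
#include <stdint.h>

#define SIM_QUEUE_DEPTH 4

/* Hypothetical model of a bus bridge sitting in front of one device register. */
struct sim_bus {
	uint32_t device_reg;			/* value the device has actually seen */
	uint32_t posted[SIM_QUEUE_DEPTH];	/* writes still in flight */
	int nposted;
};

/* Deliver every queued write to the device, oldest first. */
static void sim_drain(struct sim_bus *bus)
{
	for (int i = 0; i < bus->nposted; i++)
		bus->device_reg = bus->posted[i];
	bus->nposted = 0;
}

/* A posted write returns as soon as it is queued, not when it arrives. */
static void posted_writel(struct sim_bus *bus, uint32_t val)
{
	if (bus->nposted == SIM_QUEUE_DEPTH)
		sim_drain(bus);
	bus->posted[bus->nposted++] = val;
}

/* A read cannot complete until all earlier writes have reached the device. */
static uint32_t posted_readl(struct sim_bus *bus)
{
	sim_drain(bus);
	return bus->device_reg;
}

/* Returns 1 if the write is invisible before the read-back and visible after. */
static int sim_flush_demo(void)
{
	struct sim_bus bus = { .device_reg = 1 };
	int seen_before, seen_after;

	posted_writel(&bus, 0);			/* e.g. "disable interrupts" */
	seen_before = (bus.device_reg == 0);	/* still queued, so false */
	(void)posted_readl(&bus);		/* read-back forces delivery */
	seen_after = (bus.device_reg == 0);
	return !seen_before && seen_after;
}
```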

   <para>
	In addition to write posting, on some large multiprocessing systems
	(e.g. SGI Challenge, Origin and Altix machines) posted writes from
	different CPUs are not strongly ordered. Thus it's important
	to properly protect parts of your driver that do memory-mapped writes
	with locks and use <function>mmiowb</function> to make sure they
	arrive in the order intended. Issuing a regular
	<function>readX</function> will also ensure write ordering, but should
	only be used when the driver has to be sure that the write has actually
	arrived at the device (not that it's simply ordered with respect to
	other writes), since a full <function>readX</function> is a relatively
	expensive operation.
   </para>

   <para>
	Generally, one should use <function>mmiowb</function> prior to
	releasing a spinlock that protects regions using
	<function>writeb</function> or similar functions, unless those writes
	are already surrounded by <function>readb</function> calls, which
	ensure ordering and flushing by themselves. The
	following pseudocode illustrates what might occur if write ordering
	isn't guaranteed via <function>mmiowb</function> or one of the
	<function>readX</function> functions.
   </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags);
</programlisting>

   <para>
	In the case above, newval2 could be written to ring_ptr before
	newval. Fixing it is easy though:
   </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  mmiowb(); /* ensure no other writes beat us to the device */
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  mmiowb();
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags);
</programlisting>

   <para>
	See tg3.c for a real world example of how to use
	<function>mmiowb</function>.
   </para>

   <para>
	PCI ordering rules also guarantee that PIO read responses arrive
	after any outstanding DMA writes from that bus, since for some devices
	the result of a <function>readb</function> call may signal to the
	driver that a DMA transaction is complete. In many cases, however,
	the driver may want to indicate that the next
	<function>readb</function> call has no relation to any previous DMA
	writes performed by the device. The driver can use
	<function>readb_relaxed</function> for these cases, although only
	some platforms will honor the relaxed semantics. Using the relaxed
	read functions will provide significant performance benefits on
	platforms that support them. The qla2xxx driver provides examples
	of how to use <function>readX_relaxed</function>. In many cases,
	a majority of the driver's <function>readX</function> calls can
	safely be converted to <function>readX_relaxed</function> calls, since
	only a few will indicate or depend on DMA completion.
   </para>
  </sect1>

 </chapter>

 <chapter>
  <title>Port Space Accesses</title>
  <sect1>
   <title>Port Space Explained</title>

   <para>
	Another form of IO commonly supported is Port Space. This is a
	range of addresses separate from the normal memory address space.
	Access to these addresses is generally not as fast as access
	to memory mapped addresses, and port space is potentially
	smaller than the memory address space.
   </para>

   <para>
	Unlike memory mapped IO, no preparation is required
	to access port space.
   </para>

  </sect1>
  <sect1>
   <title>Accessing Port Space</title>
   <para>
	Accesses to this space are provided through a set of functions
	which allow 8-bit, 16-bit and 32-bit accesses; also
	known as byte, word and long. These functions are
	<function>inb</function>, <function>inw</function>,
	<function>inl</function>, <function>outb</function>,
	<function>outw</function> and <function>outl</function>.
   </para>

   <para>
	Some variants are provided for these functions. Some devices
	require that accesses to their ports are slowed down. This
	functionality is provided by appending a <function>_p</function>
	to the end of the function name. There are also equivalents to memcpy.
	The <function>ins</function> functions copy bytes, words or longs
	from the given port into memory, and the <function>outs</function>
	functions copy them from memory to the given port.
   </para>
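   <para>
	The relationship between the single-access and string functions can
	be sketched in user space by treating each port as a one-byte latch.
	This is a simplified model with invented <function>sim_</function>
	names: real port accesses use dedicated CPU instructions or a
	platform-specific mapping, and a real device decides what a read
	returns. The point shown is that the string variants access the
	same port repeatedly while walking through the memory buffer.
   </para>

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_NPORTS 0x10000

/* Hypothetical port space: each port remembers the last byte written to it. */
static uint8_t sim_port_latch[SIM_NPORTS];

static uint8_t sim_inb(uint16_t port)
{
	return sim_port_latch[port];
}

static void sim_outb(uint8_t value, uint16_t port)
{
	sim_port_latch[port] = value;
}

/* Write count bytes from buf to one port; the port number never advances. */
static void sim_outsb(uint16_t port, const void *buf, size_t count)
{
	const uint8_t *p = buf;

	while (count--)
		sim_outb(*p++, port);
}

/* Read count bytes from one port into buf; the port number never advances. */
static void sim_insb(uint16_t port, void *buf, size_t count)
{
	uint8_t *p = buf;

	while (count--)
		*p++ = sim_inb(port);
}

/* Returns 1 if the latch holds the last byte sent and both reads return it. */
static int sim_port_demo(void)
{
	const uint8_t out[3] = { 0x10, 0x20, 0x30 };
	uint8_t in[2] = { 0, 0 };

	sim_outsb(0x3f8, out, sizeof(out));
	sim_insb(0x3f8, in, sizeof(in));
	return sim_inb(0x3f8) == 0x30 && in[0] == 0x30 && in[1] == 0x30;
}
```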
  </sect1>

 </chapter>

 <chapter id="pubfunctions">
  <title>Public Functions Provided</title>
!Iinclude/asm-x86/io_32.h
!Elib/iomap.c
 </chapter>

</book>