                         How To Write Linux PCI Drivers

                   by Martin Mares <mj@ucw.cz> on 07-Feb-2000

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The world of PCI is vast and it's full of (mostly unpleasant) surprises.
Different PCI devices have different requirements and different bugs --
because of this, the PCI support layer in the Linux kernel is not as trivial
as one would wish. This short pamphlet tries to help all potential driver
authors find their way through the deep forests of PCI handling.


0. Structure of PCI drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~
There exist two kinds of PCI drivers: new-style ones (which leave most of
probing for devices to the PCI layer and support online insertion and removal
of devices [thus supporting PCI, hot-pluggable PCI and CardBus in a single
driver]) and old-style ones which just do all the probing themselves. Unless
you have a very good reason to do so, please don't use the old way of probing
in any new code. After the driver finds the devices it wishes to operate
on (either the old or the new way), it needs to perform the following steps:

        Enable the device
        Access device configuration space
        Discover resources (addresses and IRQ numbers) provided by the device
        Allocate these resources
        Communicate with the device
        Disable the device

Most of these topics are covered by the following sections; for the rest,
look at <linux/pci.h>, which is hopefully well commented.

If the PCI subsystem is not configured (CONFIG_PCI is not set), most of
the functions described below are defined as inline functions that are either
completely empty or just return an appropriate error code, to avoid lots of
ifdefs in the drivers.


1. New-style drivers
~~~~~~~~~~~~~~~~~~~~
The new-style drivers just call pci_register_driver during their initialization
with a pointer to a structure describing the driver (struct pci_driver) which
contains:

        name            Name of the driver
        id_table        Pointer to table of device IDs the driver is
                        interested in. Most drivers should export this
                        table using MODULE_DEVICE_TABLE(pci,...).
        probe           Pointer to a probing function which gets called (during
                        execution of pci_register_driver for already existing
                        devices or later if a new device gets inserted) for all
                        PCI devices which match the ID table and are not handled
                        by the other drivers yet. This function gets passed a
                        pointer to the pci_dev structure representing the device
                        and also which entry in the ID table did the device
                        match. It returns zero when the driver has accepted the
                        device or an error code (negative number) otherwise.
                        This function always gets called from process context,
                        so it can sleep.
        remove          Pointer to a function which gets called whenever a
                        device being handled by this driver is removed (either
                        during deregistration of the driver or when it's
                        manually pulled out of a hot-pluggable slot). This
                        function always gets called from process context, so it
                        can sleep.
        save_state      Save a device's state before it is suspended.
        suspend         Put device into low power state.
        resume          Wake device from low power state.
        enable_wake     Enable device to generate wake events from a low power
                        state.

                        (Please see Documentation/power/pci.txt for descriptions
                        of PCI Power Management and the related functions)

The ID table is an array of struct pci_device_id ending with an all-zero entry.
Each entry consists of:

        vendor, device  Vendor and device ID to match (or PCI_ANY_ID)
        subvendor,      Subsystem vendor and device ID to match (or PCI_ANY_ID)
        subdevice
        class,          Device class to match. The class_mask tells which bits
        class_mask      of the class are honored during the comparison.
        driver_data     Data private to the driver.
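
The matching rules for one ID table entry can be sketched in plain user-space
C. This is only a model, not the kernel's code: struct id_entry, struct
dev_ids and the function names are made up for illustration, and the field
called class in struct pci_device_id is spelled class_code here. It assumes
the table is terminated by an entry whose vendor, subvendor and class_mask
are all zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_ANY_ID 0xffffffffu	/* wildcard value for the four ID fields */

/* Hypothetical stand-in for struct pci_device_id. */
struct id_entry {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class_code, class_mask;
	unsigned long driver_data;
};

/* The identification fields of one device, as read by the PCI layer. */
struct dev_ids {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class_code;	/* base class, sub-class, prog-if (24 bits) */
};

static bool field_matches(uint32_t want, uint32_t have)
{
	return want == PCI_ANY_ID || want == have;
}

static bool id_matches(const struct id_entry *id, const struct dev_ids *dev)
{
	return field_matches(id->vendor, dev->vendor) &&
	       field_matches(id->device, dev->device) &&
	       field_matches(id->subvendor, dev->subvendor) &&
	       field_matches(id->subdevice, dev->subdevice) &&
	       /* only the bits set in class_mask take part in the comparison */
	       ((id->class_code ^ dev->class_code) & id->class_mask) == 0;
}

/* Return the first matching entry, or NULL if the table has none. */
static const struct id_entry *match_table(const struct id_entry *tbl,
					  const struct dev_ids *dev)
{
	for (; tbl->vendor || tbl->subvendor || tbl->class_mask; tbl++)
		if (id_matches(tbl, dev))
			return tbl;
	return NULL;
}
```

Note that a class_mask of 0 makes the class comparison always succeed, so an
entry that only names vendor and device does not need to care about the class.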

Most drivers don't need to use the driver_data field. Best practice
for use of driver_data is to use it as an index into a static list of
equivalent device types, not to use it as a pointer.
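
A minimal sketch of the index approach, again as plain C rather than driver
code. The board types, the board_info layout and board_lookup() are all
hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical board types; the enum value is what goes into driver_data. */
enum board_type { BOARD_BASIC, BOARD_FAST, BOARD_COUNT };

struct board_info {
	const char *name;
	int num_ports;
};

/* One entry per board type, indexed by enum board_type. */
static const struct board_info board_tbl[BOARD_COUNT] = {
	[BOARD_BASIC] = { "example basic", 1 },
	[BOARD_FAST]  = { "example fast",  4 },
};

/* Translate driver_data into a table entry, rejecting bad indexes. */
static const struct board_info *board_lookup(unsigned long driver_data)
{
	if (driver_data >= BOARD_COUNT)
		return NULL;
	return &board_tbl[driver_data];
}
```

One reason to prefer an index is that it can be range-checked like this,
which a pointer value cannot.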

Have a table entry {PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID}
to have probe() called for every PCI device known to the system.

New PCI IDs may be added to a device driver at runtime by writing
to the file /sys/bus/pci/drivers/{driver}/new_id. When added, the
driver will probe for all devices it can support.

echo "vendor device subvendor subdevice class class_mask driver_data" > \
        /sys/bus/pci/drivers/{driver}/new_id
where all fields are passed in as hexadecimal values (no leading 0x).
Users need pass only as many fields as necessary; vendor, device,
subvendor, and subdevice fields default to PCI_ANY_ID (FFFFFFFF),
class and class_mask fields default to 0, and driver_data defaults to
0UL. Device drivers must initialize use_driver_data in the dynids struct
in their pci_driver struct prior to calling pci_register_driver in order
for the driver_data field to get passed to the driver. Otherwise, only a
0 is passed in that field.
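
The defaulting rules for omitted fields can be illustrated with a small
user-space parser. This is a sketch under assumptions, not the kernel's
implementation: struct new_id and parse_new_id() are invented for the
example, and the class field is spelled class_code.

```c
#include <stdio.h>

#define PCI_ANY_ID 0xffffffffu

/* Hypothetical holder for the seven fields of one new_id line. */
struct new_id {
	unsigned int vendor, device, subvendor, subdevice;
	unsigned int class_code, class_mask;
	unsigned long driver_data;
};

/*
 * Parse up to seven whitespace-separated hexadecimal fields.
 * Returns the number of fields read, or -1 if the line held none.
 */
static int parse_new_id(const char *line, struct new_id *id)
{
	int fields;

	/* defaults for whatever the user leaves out */
	id->vendor = id->device = PCI_ANY_ID;
	id->subvendor = id->subdevice = PCI_ANY_ID;
	id->class_code = id->class_mask = 0;
	id->driver_data = 0UL;

	/* sscanf() leaves the remaining targets untouched when it stops early */
	fields = sscanf(line, "%x %x %x %x %x %x %lx",
			&id->vendor, &id->device,
			&id->subvendor, &id->subdevice,
			&id->class_code, &id->class_mask,
			&id->driver_data);
	return fields > 0 ? fields : -1;
}
```

For instance, the line "8086 1229" sets only vendor and device; the subsystem
IDs stay at PCI_ANY_ID and the class fields and driver_data stay at 0.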

When the driver exits, it just calls pci_unregister_driver() and the PCI layer
automatically calls the remove hook for all devices handled by the driver.

Please mark the initialization and cleanup functions where appropriate
(the corresponding macros are defined in <linux/init.h>):

        __init          Initialization code. Thrown away after the driver
                        initializes.
        __exit          Exit code. Ignored for non-modular drivers.
        __devinit       Device initialization code. Identical to __init if
                        the kernel is not compiled with CONFIG_HOTPLUG, normal
                        function otherwise.
        __devexit       The same for __exit.

Tips:
        The module_init()/module_exit() functions (and all initialization
          functions called only from these) should be marked __init/__exit.
        The struct pci_driver shouldn't be marked with any of these tags.
        The ID table array should be marked __devinitdata.
        The probe() and remove() functions (and all initialization
          functions called only from these) should be marked
          __devinit/__devexit.
        If you are sure the driver is not a hotplug driver then use only
          __init/__exit and __initdata/__exitdata.

        Pointers to functions marked as __devexit must be created using
        __devexit_p(function_name). That will generate the function
        name or NULL if the __devexit function will be discarded.


2. How to find PCI devices manually (the old style)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PCI drivers not using the pci_register_driver() interface search
for PCI devices manually using the following constructs:

Searching by vendor and device ID:

        struct pci_dev *dev = NULL;
        while ((dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev)) != NULL)
                configure_device(dev);

Searching by class ID (iterate in a similar way):

        pci_get_class(CLASS_ID, dev)

Searching by both vendor/device and subsystem vendor/device ID:

        pci_get_subsys(VENDOR_ID, DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev)

You can use the constant PCI_ANY_ID as a wildcard replacement for
VENDOR_ID or DEVICE_ID. This allows searching for any device from a
specific vendor, for example.

These functions are hotplug-safe. They increment the reference count on
the pci_dev that they return. You must eventually (possibly at module unload)
decrement the reference count on these devices by calling pci_dev_put().


3. Enabling and disabling devices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you do anything with the device you've found, you need to enable
it by calling pci_enable_device() which enables I/O and memory regions of
the device, allocates an IRQ if necessary, assigns missing resources if
needed and wakes up the device if it was in suspended state. Please note
that this function can fail.

If you want to use the device in bus mastering mode, call pci_set_master()
which enables the bus master bit in PCI_COMMAND register and also fixes
the latency timer value if it's set to something bogus by the BIOS.

If you want to use the PCI Memory-Write-Invalidate transaction,
call pci_set_mwi(). This enables the PCI_COMMAND bit for Mem-Wr-Inval
and also ensures that the cache line size register is set correctly.
Make sure to check the return value of pci_set_mwi(), because not all
architectures support Memory-Write-Invalidate.

If your driver decides to stop using the device (e.g., there was an
error while setting it up or the driver module is being unloaded), it
should call pci_disable_device() to deallocate any IRQ resources, disable
PCI bus-mastering, etc. You should not do anything with the device after
calling pci_disable_device().

4. How to access PCI config space
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can use pci_(read|write)_config_(byte|word|dword) to access the config
space of a device represented by struct pci_dev *. All these functions return 0
when successful or an error code (PCIBIOS_...) which can be translated to a text
string by pcibios_strerror. Most drivers expect that accesses to valid PCI
devices don't fail.

If you don't have a struct pci_dev available, you can call
pci_bus_(read|write)_config_(byte|word|dword) to access a given device
and function on that bus.

If you access fields in the standard portion of the config header, please
use symbolic names of locations and bits declared in <linux/pci.h>.

If you need to access Extended PCI Capability registers, just call
pci_find_capability() for the particular capability and it will find the
corresponding register block for you.
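
To show what such a capability lookup involves, here is a user-space sketch
that walks the capability list in a 256-byte snapshot of config space. It is
a model only: a driver should call pci_find_capability() and never parse
config space by hand. The offsets and bit values are the standard header
ones; cfg_read_word() and find_capability() are hypothetical helpers, and the
cap of 48 steps on the walk is an assumption made to survive a looping list.

```c
#include <stdint.h>

/* Offsets and bits from the standard PCI configuration header. */
#define PCI_STATUS		0x06
#define PCI_STATUS_CAP_LIST	0x10	/* device has a capability list */
#define PCI_CAPABILITY_LIST	0x34	/* offset of the first list entry */
#define PCI_CAP_LIST_ID		0	/* within an entry: capability ID */
#define PCI_CAP_LIST_NEXT	1	/* within an entry: next pointer */
#define PCI_CAP_ID_PM		0x01	/* power management */
#define PCI_CAP_ID_MSI		0x05	/* message signalled interrupts */

/* Read a little-endian 16-bit field from a config space snapshot. */
static uint16_t cfg_read_word(const uint8_t cfg[256], unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Return the offset of capability `cap`, or 0 if it is not present. */
static unsigned int find_capability(const uint8_t cfg[256], uint8_t cap)
{
	unsigned int pos;
	int ttl = 48;	/* bound the walk in case the list loops */

	if (!(cfg_read_word(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	/* entries live above the standard header and are dword aligned */
	pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
	while (ttl-- > 0 && pos >= 0x40) {
		if (cfg[pos + PCI_CAP_LIST_ID] == cap)
			return pos;
		pos = cfg[pos + PCI_CAP_LIST_NEXT] & ~3u;
	}
	return 0;
}
```

The returned offset is where the capability's register block starts, which is
the value a driver would then pass to the config space accessors.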


5. Addresses and interrupts
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Memory and port addresses and interrupt numbers should NOT be read from the
config space. You should use the values in the pci_dev structure as they might
have been remapped by the kernel.

See Documentation/IO-mapping.txt for how to access device memory.

You still need to call request_region() for I/O regions and
request_mem_region() for memory regions to make sure nobody else is using the
same device.

All interrupt handlers should be registered with SA_SHIRQ and use the devid
to map IRQs to devices (remember that all PCI interrupts are shared).


6. Other interesting functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pci_find_slot()         Find pci_dev corresponding to given bus and
                        slot numbers (obsolete, see section 8).
pci_set_power_state()   Set PCI Power Management state (0=D0 ... 3=D3)
pci_find_capability()   Find specified capability in device's capability
                        list.
pci_module_init()       Inline helper function for ensuring correct
                        pci_driver initialization and error handling.
pci_resource_start()    Returns bus start address for a given PCI region
pci_resource_end()      Returns bus end address for a given PCI region
pci_resource_len()      Returns the byte length of a PCI region
pci_set_drvdata()       Set private driver data pointer for a pci_dev
pci_get_drvdata()       Return private driver data pointer for a pci_dev
pci_set_mwi()           Enable Memory-Write-Invalidate transactions.
pci_clear_mwi()         Disable Memory-Write-Invalidate transactions.
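
The relation between the start, end and length of a region is easy to get
wrong by one, so here is a tiny user-space model. It assumes the end address
is inclusive and that a region with start and end both 0 is unassigned;
struct region and region_len() are made-up names, not kernel API.

```c
/* Hypothetical model of one PCI region as recorded for a device. */
struct region {
	unsigned long start;
	unsigned long end;	/* inclusive: last valid address */
};

/* Byte length of the region, or 0 if the region is not assigned. */
static unsigned long region_len(const struct region *r)
{
	if (r->start == 0 && r->end == 0)
		return 0;
	return r->end - r->start + 1;
}
```

A region from 0xfebf0000 to 0xfebf0fff is therefore 0x1000 bytes long.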


7. Miscellaneous hints
~~~~~~~~~~~~~~~~~~~~~~
When displaying PCI slot names to the user (for example when a driver wants
to tell the user which card it has found), please use pci_name(pci_dev)
for this purpose.
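
For orientation, the names printed this way usually look like 0000:00:1f.2,
that is domain:bus:slot.function in hexadecimal. The user-space sketch below
builds such a string from raw numbers, assuming that layout and the usual
packing of slot and function into devfn; format_pci_name() is a hypothetical
helper, and a driver should simply print whatever pci_name() returns.

```c
#include <stdio.h>

/* devfn packs the slot number in bits 7..3 and the function in bits 2..0. */
#define PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)	((devfn) & 0x07)

/* Format "domain:bus:slot.func" into buf; returns snprintf()'s result. */
static int format_pci_name(char *buf, size_t len, unsigned int domain,
			   unsigned int bus, unsigned int devfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%u",
			domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```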

Always refer to the PCI devices by a pointer to the pci_dev structure.
All PCI layer functions use this identification and it's the only
reasonable one. Don't use bus/slot/function numbers except for very
special purposes -- on systems with multiple primary buses their semantics
can be pretty complex.

If you're going to use PCI bus mastering DMA, take a look at
Documentation/DMA-mapping.txt.

Don't try to turn on Fast Back to Back writes in your driver. All devices
on the bus need to be capable of doing it, so this is something which needs
to be handled by platform and generic code, not individual drivers.


8. Obsolete functions
~~~~~~~~~~~~~~~~~~~~~
There are several functions which you might come across when trying to
port an old driver to the new PCI interface. They are no longer present
in the kernel because they are not compatible with hotplug, PCI domains,
or sane locking.

pci_find_device()       Superseded by pci_get_device()
pci_find_subsys()       Superseded by pci_get_subsys()
pci_find_slot()         Superseded by pci_get_slot()