.. SPDX-License-Identifier: GPL-2.0

=====================================================
sysfs - _The_ filesystem for exporting kernel objects
=====================================================

Patrick Mochel <mochel@osdl.org>

Mike Murphy <mamurph@cs.clemson.edu>

:Revised:    16 August 2011
:Original:   10 January 2003


What it is:
~~~~~~~~~~~

sysfs is a ram-based filesystem initially based on ramfs. It provides
a means to export kernel data structures, their attributes, and the
linkages between them to userspace.

sysfs is tied inherently to the kobject infrastructure. Please read
Documentation/kobject.txt for more information concerning the kobject
interface.


Using sysfs
~~~~~~~~~~~

sysfs is always compiled in if CONFIG_SYSFS is defined. You can access
it by doing::

    mount -t sysfs sysfs /sys


Directory Creation
~~~~~~~~~~~~~~~~~~

For every kobject that is registered with the system, a directory is
created for it in sysfs. That directory is created as a subdirectory
of the kobject's parent, expressing internal object hierarchies to
userspace. Top-level directories in sysfs represent the common
ancestors of object hierarchies; i.e. the subsystems the objects
belong to.

Sysfs internally stores a pointer to the kobject that implements a
directory in the kernfs_node object associated with the directory. In
the past this kobject pointer has been used by sysfs to do reference
counting directly on the kobject whenever the file is opened or closed.
With the current sysfs implementation the kobject reference count is
only modified directly by the function sysfs_schedule_callback().


Attributes
~~~~~~~~~~

Attributes can be exported for kobjects in the form of regular files in
the filesystem. Sysfs forwards file I/O operations to methods defined
for the attributes, providing a means to read and write kernel
attributes.

Attributes should be ASCII text files, preferably with only one value
per file. It is noted that it may not be efficient to contain only one
value per file, so it is socially acceptable to express an array of
values of the same type.

Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
you publicly humiliated and your code rewritten without notice.


An attribute definition is simply::

    struct attribute {
        char * name;
        struct module *owner;
        umode_t mode;
    };


    int sysfs_create_file(struct kobject * kobj, const struct attribute * attr);
    void sysfs_remove_file(struct kobject * kobj, const struct attribute * attr);


A bare attribute contains no means to read or write the value of the
attribute. Subsystems are encouraged to define their own attribute
structure and wrapper functions for adding and removing attributes for
a specific object type.

For example, the driver model defines struct device_attribute like::

    struct device_attribute {
        struct attribute attr;
        ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                        char *buf);
        ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                         const char *buf, size_t count);
    };

    int device_create_file(struct device *, const struct device_attribute *);
    void device_remove_file(struct device *, const struct device_attribute *);

It also defines this helper for defining device attributes::

    #define DEVICE_ATTR(_name, _mode, _show, _store) \
    struct device_attribute dev_attr_##_name = __ATTR(_name, _mode, _show, _store)

For example, declaring::

    static DEVICE_ATTR(foo, S_IWUSR | S_IRUGO, show_foo, store_foo);

is equivalent to doing::

    static struct device_attribute dev_attr_foo = {
        .attr = {
            .name = "foo",
            .mode = S_IWUSR | S_IRUGO,
        },
        .show = show_foo,
        .store = store_foo,
    };

Note, as stated in include/linux/kernel.h, "OTHER_WRITABLE? Generally
considered a bad idea.", so trying to make a sysfs file writable for
everyone will fail, reverting to RO mode for "Others".

For the common cases sysfs.h provides convenience macros to make
defining attributes easier as well as making code more concise and
readable. The above case could be shortened to::

    static struct device_attribute dev_attr_foo = __ATTR_RW(foo);

The list of helpers available to define your wrapper function is:

__ATTR_RO(name):
    assumes default name_show and mode 0444
__ATTR_WO(name):
    assumes a name_store only and is restricted to mode
    0200, that is, root write access only.
__ATTR_RO_MODE(name, mode):
    for more restrictive RO access; currently the
    only use case is the EFI System Resource Table
    (see drivers/firmware/efi/esrt.c)
__ATTR_RW(name):
    assumes default name_show, name_store and setting
    mode to 0644.
__ATTR_NULL:
    which sets the name to NULL and is used as end of list
    indicator (see: kernel/workqueue.c)
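
The naming convention these helpers rely on can be tried out in
userspace. The following is only a sketch: the ``DEMO_ATTR_*`` macros,
the cut-down structures and ``attr_is_user_writable()`` are hypothetical
stand-ins that mirror the behaviour described above, not the kernel's
actual definitions in sysfs.h.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Cut-down userspace stand-ins for the kernel structures (assumption). */
struct attribute {
    const char *name;
    unsigned short mode;
};

struct device;  /* opaque in this sketch */

struct device_attribute {
    struct attribute attr;
    ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                    char *buf);
    ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                     const char *buf, size_t count);
};

/* Hypothetical equivalents of __ATTR_RO/__ATTR_WO/__ATTR_RW: the file
 * name is the stringified argument and the callbacks are looked up as
 * <name>_show and <name>_store. */
#define DEMO_ATTR_RO(_name) {                           \
    .attr = { .name = #_name, .mode = 0444 },           \
    .show = _name##_show,                               \
}
#define DEMO_ATTR_WO(_name) {                           \
    .attr = { .name = #_name, .mode = 0200 },           \
    .store = _name##_store,                             \
}
#define DEMO_ATTR_RW(_name) {                           \
    .attr = { .name = #_name, .mode = 0644 },           \
    .show = _name##_show,                               \
    .store = _name##_store,                             \
}

static ssize_t foo_show(struct device *dev, struct device_attribute *attr,
                        char *buf)
{
    (void)dev;
    (void)attr;
    memcpy(buf, "42\n", 4);     /* value plus terminating NUL */
    return 3;
}

static ssize_t foo_store(struct device *dev, struct device_attribute *attr,
                         const char *buf, size_t count)
{
    (void)dev;
    (void)attr;
    (void)buf;
    return (ssize_t)count;      /* pretend the whole buffer was used */
}

/* Expands to name "foo", mode 0644, foo_show and foo_store. */
static struct device_attribute dev_attr_foo = DEMO_ATTR_RW(foo);

/* Hypothetical helper: true if the owner may write the file. */
int attr_is_user_writable(const struct attribute *attr)
{
    assert(attr != NULL);
    return (attr->mode & 0200) != 0;
}
```

The point of the convention is that the attribute name alone determines
the file name, the mode and the callback names, which keeps declarations
short and uniform.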

Subsystem-Specific Callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a subsystem defines a new attribute type, it must implement a
set of sysfs operations for forwarding read and write calls to the
show and store methods of the attribute owners::

    struct sysfs_ops {
        ssize_t (*show)(struct kobject *, struct attribute *, char *);
        ssize_t (*store)(struct kobject *, struct attribute *, const char *, size_t);
    };

[ Subsystems should have already defined a struct kobj_type as a
descriptor for this type, which is where the sysfs_ops pointer is
stored. See the kobject documentation for more information. ]

When a file is read or written, sysfs calls the appropriate method
for the type. The method then translates the generic struct kobject
and struct attribute pointers to the appropriate pointer types, and
calls the associated methods.


To illustrate::

    #define to_dev(obj) container_of(obj, struct device, kobj)
    #define to_dev_attr(_attr) container_of(_attr, struct device_attribute, attr)

    static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
                                 char *buf)
    {
        struct device_attribute *dev_attr = to_dev_attr(attr);
        struct device *dev = to_dev(kobj);
        ssize_t ret = -EIO;

        if (dev_attr->show)
            ret = dev_attr->show(dev, dev_attr, buf);
        if (ret >= (ssize_t)PAGE_SIZE) {
            printk("dev_attr_show: %pS returned bad count\n",
                   dev_attr->show);
        }
        return ret;
    }
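
The pointer translation performed by to_dev() and to_dev_attr() can be
exercised outside the kernel. The sketch below is a userspace
approximation: the structures are reduced to the members needed for the
demonstration, container_of() is reimplemented with offsetof(), and
``id_show()`` is a hypothetical attribute method.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Userspace reimplementation: recover the enclosing structure from a
 * pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins for the kernel structures (assumption). */
struct kobject { const char *name; };
struct attribute { const char *name; };

struct device {
    int id;
    struct kobject kobj;        /* embedded generic object */
};

struct device_attribute {
    struct attribute attr;      /* embedded generic attribute */
    ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                    char *buf);
};

#define to_dev(obj) container_of(obj, struct device, kobj)
#define to_dev_attr(_attr) container_of(_attr, struct device_attribute, attr)

/* Hypothetical type-specific method: prints the device id. */
static ssize_t id_show(struct device *dev, struct device_attribute *attr,
                       char *buf)
{
    (void)attr;
    return sprintf(buf, "%d\n", dev->id);
}

/* Generic entry point: receives only the generic pointers and forwards
 * the call to the type-specific show() method. */
ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr, char *buf)
{
    struct device_attribute *dev_attr;
    struct device *dev;

    assert(kobj != NULL && attr != NULL && buf != NULL);
    dev_attr = to_dev_attr(attr);
    dev = to_dev(kobj);
    return dev_attr->show ? dev_attr->show(dev, dev_attr, buf) : -EIO;
}
```

Because the generic structures are embedded rather than pointed to, no
extra lookup table is needed: the address arithmetic alone recovers the
typed object.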



Reading/Writing Attribute Data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To read or write attributes, show() or store() methods must be
specified when declaring the attribute. The method types should be as
simple as those defined for device attributes::

    ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf);
    ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                     const char *buf, size_t count);

IOW, they should take only an object, an attribute, and a buffer as parameters.


sysfs allocates a buffer of size (PAGE_SIZE) and passes it to the
method. Sysfs will call the method exactly once for each read or
write. This forces the following behavior on the method
implementations:

- On read(2), the show() method should fill the entire buffer.
  Recall that an attribute should only be exporting one value, or an
  array of similar values, so this shouldn't be that expensive.

  This allows userspace to do partial reads and forward seeks
  arbitrarily over the entire file at will. If userspace seeks back to
  zero or does a pread(2) with an offset of '0' the show() method will
  be called again, rearmed, to fill the buffer.

- On write(2), sysfs expects the entire buffer to be passed during the
  first write. Sysfs then passes the entire buffer to the store() method.
  A terminating null is added after the data on stores. This makes
  functions like sysfs_streq() safe to use.

  When writing sysfs files, userspace processes should first read the
  entire file, modify the values it wishes to change, then write the
  entire buffer back.

  Attribute method implementations should operate on an identical
  buffer when reading and writing values.
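
From the userspace side, the rules above amount to reading an attribute
with one read(2) and writing the complete new value with one write(2).
A minimal sketch, assuming the hypothetical helper names ``attr_read()``
and ``attr_write()`` and a caller-supplied path:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a whole attribute with a single read(2). The result is
 * NUL-terminated. Returns the number of bytes read, or -1 on error. */
ssize_t attr_read(const char *path, char *buf, size_t len)
{
    ssize_t n;
    int fd;

    assert(path != NULL && buf != NULL && len > 0);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, buf, len - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}

/* Write the complete value with a single write(2), so that store()
 * sees the whole buffer at once. Returns 0 on success, -1 on error or
 * short write. */
int attr_write(const char *path, const char *value)
{
    size_t len;
    ssize_t n;
    int fd;

    assert(path != NULL && value != NULL);
    len = strlen(value);
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    n = write(fd, value, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

A caller would pass a buffer of at least PAGE_SIZE bytes to
``attr_read()``, since that is the most a show() method can produce.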

Other notes:

- Writing causes the show() method to be rearmed regardless of current
  file position.

- The buffer will always be PAGE_SIZE bytes in length. On i386, this
  is 4096.

- show() methods should return the number of bytes printed into the
  buffer. This is the return value of scnprintf().

- show() must not use snprintf() when formatting the value to be
  returned to user space. If you can guarantee that an overflow
  will never happen, you can use sprintf(); otherwise, you must use
  scnprintf().

- store() should return the number of bytes used from the buffer. If the
  entire buffer has been used, just return the count argument.

- show() or store() can always return errors. If a bad value comes
  through, be sure to return an error.

- The object passed to the methods will be pinned in memory via sysfs
  reference counting its embedded object. However, the physical
  entity (e.g. device) the object represents may not be present. Be
  sure to have a way to check this, if necessary.
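
The reason snprintf() is ruled out for show() is its return value: on
truncation it reports the length the output would have had, not the
number of bytes actually stored. The userspace sketch below illustrates
the difference. ``demo_scnprintf()`` is a hypothetical stand-in written
for this example that follows the documented scnprintf() semantics; it
is not the kernel function itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for scnprintf(): returns the number of characters
 * actually written into buf, not counting the terminating NUL. */
int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && fmt != NULL);
    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;
    /* vsnprintf() reports the untruncated length; clamp it to what
     * fits in the buffer. */
    return ((size_t)n >= size) ? (int)(size - 1) : n;
}
```

With an 8-byte buffer and a 10-character string, snprintf() returns 10
while ``demo_scnprintf()`` returns 7. A show() method returning the
former would claim more bytes than the buffer holds.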


A very simple (and naive) implementation of a device attribute is::

    static ssize_t show_name(struct device *dev, struct device_attribute *attr,
                             char *buf)
    {
        return scnprintf(buf, PAGE_SIZE, "%s\n", dev->name);
    }

    static ssize_t store_name(struct device *dev, struct device_attribute *attr,
                              const char *buf, size_t count)
    {
        snprintf(dev->name, sizeof(dev->name), "%.*s",
                 (int)min(count, sizeof(dev->name) - 1), buf);
        return count;
    }

    static DEVICE_ATTR(name, S_IRUGO, show_name, store_name);


(Note that the real implementation doesn't allow userspace to set the
name for a device.)
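
A store() method normally also has to validate its input and return an
error for bad values, as noted earlier. The following userspace sketch
shows the shape of such a method. ``struct demo_device``, the ``level``
attribute and its 0..100 range are invented for illustration, and
strtol() stands in for the kernel's kstrto*() helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical device with a single integer setting. */
struct demo_device {
    int level;
};

/* Parse one decimal value from a NUL-terminated buffer, accept an
 * optional trailing newline, and reject anything else. Returns count
 * on success (the whole buffer was consumed) or -EINVAL on bad input,
 * leaving the device unchanged. */
ssize_t level_store(struct demo_device *dev, const char *buf, size_t count)
{
    char *end;
    long val;

    assert(dev != NULL && buf != NULL);

    errno = 0;
    val = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)
        return -EINVAL;         /* no digits, or out of range for long */
    if (*end == '\n')
        end++;                  /* "echo 42 > file" appends a newline */
    if (*end != '\0')
        return -EINVAL;         /* trailing garbage */
    if (val < 0 || val > 100)
        return -EINVAL;         /* hypothetical valid range */

    dev->level = (int)val;
    return (ssize_t)count;
}
```

Returning count tells sysfs that the entire buffer was used; returning a
negative errno value makes the write(2) in userspace fail with that
error.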


Top Level Directory Layout
~~~~~~~~~~~~~~~~~~~~~~~~~~

The sysfs directory arrangement exposes the relationship of kernel
data structures.

The top level sysfs directory looks like::

    block/
    bus/
    class/
    dev/
    devices/
    firmware/
    net/
    fs/

devices/ contains a filesystem representation of the device tree. It maps
directly to the internal kernel device tree, which is a hierarchy of
struct device.

bus/ contains a flat directory layout of the various bus types in the
kernel. Each bus's directory contains two subdirectories::

    devices/
    drivers/

devices/ contains symlinks for each device discovered in the system
that point to the device's directory under root/.

drivers/ contains a directory for each device driver that is loaded
for devices on that particular bus (this assumes that drivers do not
span multiple bus types).

fs/ contains a directory for some filesystems. Currently each
filesystem wanting to export attributes must create its own hierarchy
below fs/ (see ./fuse.txt for an example).

dev/ contains two directories, char/ and block/. Inside these two
directories there are symlinks named <major>:<minor>. These symlinks
point to the sysfs directory for the given device. /sys/dev provides a
quick way to look up the sysfs interface for a device from the result of
a stat(2) operation.

More information on driver-model specific features can be found in
Documentation/driver-api/driver-model/.


TODO: Finish this section.


Current Interfaces
~~~~~~~~~~~~~~~~~~

The following interface layers currently exist in sysfs:


devices (include/linux/device.h)
--------------------------------
Structure::

    struct device_attribute {
        struct attribute attr;
        ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                        char *buf);
        ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                         const char *buf, size_t count);
    };

Declaring::

    DEVICE_ATTR(_name, _mode, _show, _store);

Creation/Removal::

    int device_create_file(struct device *dev, const struct device_attribute * attr);
    void device_remove_file(struct device *dev, const struct device_attribute * attr);


bus drivers (include/linux/device.h)
------------------------------------
Structure::

    struct bus_attribute {
        struct attribute attr;
        ssize_t (*show)(struct bus_type *, char * buf);
        ssize_t (*store)(struct bus_type *, const char * buf, size_t count);
    };

Declaring::

    static BUS_ATTR_RW(name);
    static BUS_ATTR_RO(name);
    static BUS_ATTR_WO(name);

Creation/Removal::

    int bus_create_file(struct bus_type *, struct bus_attribute *);
    void bus_remove_file(struct bus_type *, struct bus_attribute *);


device drivers (include/linux/device.h)
---------------------------------------

Structure::

    struct driver_attribute {
        struct attribute attr;
        ssize_t (*show)(struct device_driver *, char * buf);
        ssize_t (*store)(struct device_driver *, const char * buf,
                         size_t count);
    };

Declaring::

    DRIVER_ATTR_RO(_name)
    DRIVER_ATTR_RW(_name)

Creation/Removal::

    int driver_create_file(struct device_driver *, const struct driver_attribute *);
    void driver_remove_file(struct device_driver *, const struct driver_attribute *);


Documentation
~~~~~~~~~~~~~

The sysfs directory structure and the attributes in each directory define an
ABI between the kernel and user space. As for any ABI, it is important that
this ABI is stable and properly documented. All new sysfs attributes must be
documented in Documentation/ABI. See also Documentation/ABI/README for more
information.