=======================================================
Configfs - Userspace-driven Kernel Object Configuration
=======================================================

Joel Becker <joel.becker@oracle.com>

Updated: 31 March 2005

Copyright (c) 2005 Oracle Corporation,
	Joel Becker <joel.becker@oracle.com>


What is configfs?
=================

configfs is a ram-based filesystem that provides the converse of
sysfs's functionality.  Where sysfs is a filesystem-based view of
kernel objects, configfs is a filesystem-based manager of kernel
objects, or config_items.

With sysfs, an object is created in kernel (for example, when a device
is discovered) and it is registered with sysfs.  Its attributes then
appear in sysfs, allowing userspace to read the attributes via
readdir(3)/read(2).  It may allow some attributes to be modified via
write(2).  The important point is that the object is created and
destroyed in kernel, the kernel controls the lifecycle of the sysfs
representation, and sysfs is merely a window on all this.

A configfs config_item is created via an explicit userspace operation:
mkdir(2).  It is destroyed via rmdir(2).  The attributes appear at
mkdir(2) time, and can be read or modified via read(2) and write(2).
As with sysfs, readdir(3) queries the list of items and/or attributes.
symlink(2) can be used to group items together.  Unlike sysfs, the
lifetime of the representation is completely driven by userspace.  The
kernel modules backing the items must respond to this.

Both sysfs and configfs can and should exist together on the same
system.  One is not a replacement for the other.

Using configfs
==============

configfs can be compiled as a module or into the kernel.  You can access
it by doing::

	mount -t configfs none /config

The configfs tree will be empty unless client modules are also loaded.
These are modules that register their item types with configfs as
subsystems.  Once a client subsystem is loaded, it will appear as a
subdirectory (or more than one) under /config.  Like sysfs, the
configfs tree is always there, whether mounted on /config or not.

An item is created via mkdir(2).  The item's attributes will also
appear at this time.  readdir(3) can determine what the attributes are,
read(2) can query their default values, and write(2) can store new
values.  Don't mix more than one attribute in one attribute file.

There are two types of configfs attributes:

* Normal attributes, which, similar to sysfs attributes, are small ASCII text
  files, with a maximum size of one page (PAGE_SIZE, 4096 on i386).  Preferably
  only one value per file should be used, and the same caveats from sysfs apply.
  Configfs expects write(2) to store the entire buffer at once.  When writing to
  normal configfs attributes, userspace processes should first read the entire
  file, modify the portions they wish to change, and then write the entire
  buffer back.

* Binary attributes, which are somewhat similar to sysfs binary attributes,
  but with a few slight changes to semantics.  The PAGE_SIZE limitation does not
  apply, but the whole binary item must fit in a single kernel vmalloc'ed buffer.
  The write(2) calls from user space are buffered, and the attribute's
  write_bin_attribute method will be invoked on the final close.  It is
  therefore imperative for user space to check the return code of close(2) in
  order to verify that the operation finished successfully.
  To avoid a malicious user OOMing the kernel, there's a per-binary attribute
  maximum buffer value.
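
The write rules above can be sketched from the userspace side.  This is a
minimal illustration, not part of configfs: the helper name
attr_write_all() is made up, and treating a short write as -EIO is an
assumption of this sketch.  It stores the value with one write(2) and
also reports a failing close(2), which is where a binary attribute's
write method would surface its error.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: store a complete value in an attribute file with
 * a single write(2), and report errors from write(2) or close(2).
 * Returns 0 on success or a negative errno value.
 */
int attr_write_all(const char *path, const void *buf, size_t len)
{
	int fd, err = 0;
	ssize_t n;

	/* No O_CREAT: on configfs the attribute file exists after mkdir(2). */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, len);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;	/* assumption: a partial store counts as failure */

	/* For binary attributes the store happens on the final close. */
	if (close(fd) < 0 && !err)
		err = -errno;

	return err;
}
```

A caller would use it as attr_write_all("/config/fakenbd/disk1/rw", "1\n", 2)
and treat any non-zero result as a failed store.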

When an item needs to be destroyed, remove it with rmdir(2).  An
item cannot be destroyed if any other item has a link to it (via
symlink(2)).  Links can be removed via unlink(2).

Configuring FakeNBD: an Example
===============================

Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices.  Call it FakeNBD.  FakeNBD uses configfs
for its configuration.  Obviously, there will be a nice program that
sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it.  Here's where configfs comes in.

When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine::

	# ls /config
	fakenbd

A fakenbd connection can be created with mkdir(2).  The name is
arbitrary, but likely the tool will make some use of the name.  Perhaps
it is a uuid or a disk name::

	# mkdir /config/fakenbd/disk1
	# ls /config/fakenbd/disk1
	target device rw

The target attribute contains the IP address of the server FakeNBD will
connect to.  The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write::

	# echo 10.0.0.1 > /config/fakenbd/disk1/target
	# echo /dev/sda1 > /config/fakenbd/disk1/device
	# echo 1 > /config/fakenbd/disk1/rw

That's it.  That's all there is.  Now the device is configured, via the
shell no less.

Coding With configfs
====================

Every object in configfs is a config_item.  A config_item reflects an
object in the subsystem.  It has attributes that match values on that
object.  configfs handles the filesystem representation of that object
and its attributes, allowing the subsystem to ignore all but the
basic show/store interaction.

Items are created and destroyed inside a config_group.  A group is a
collection of items that share the same attributes and operations.
Items are created by mkdir(2) and removed by rmdir(2), but configfs
handles that.  The group has a set of operations to perform these tasks.

A subsystem is the top level of a client module.  During initialization,
the client module registers the subsystem with configfs, and the subsystem
then appears as a directory at the top of the configfs filesystem.  A
subsystem is also a config_group, and can do everything a config_group
can.

struct config_item
==================

::

	struct config_item {
		char			*ci_name;
		char			ci_namebuf[UOBJ_NAME_LEN];
		struct kref		ci_kref;
		struct list_head	ci_entry;
		struct config_item	*ci_parent;
		struct config_group	*ci_group;
		struct config_item_type	*ci_type;
		struct dentry		*ci_dentry;
	};

	void config_item_init(struct config_item *);
	void config_item_init_type_name(struct config_item *,
					const char *name,
					struct config_item_type *type);
	struct config_item *config_item_get(struct config_item *);
	void config_item_put(struct config_item *);

Generally, struct config_item is embedded in a container structure, a
structure that actually represents what the subsystem is doing.  The
config_item portion of that structure is how the object interacts with
configfs.

Whether statically defined in a source file or created by a parent
config_group, a config_item must have one of the _init() functions
called on it.  This initializes the reference count and sets up the
appropriate fields.

All users of a config_item should have a reference on it via
config_item_get(), and drop the reference when they are done via
config_item_put().

By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things.  For that, it needs a type.

struct config_item_type
=======================

::

	struct configfs_item_operations {
		void (*release)(struct config_item *);
		int (*allow_link)(struct config_item *src,
				  struct config_item *target);
		void (*drop_link)(struct config_item *src,
				  struct config_item *target);
	};

	struct config_item_type {
		struct module				*ct_owner;
		struct configfs_item_operations		*ct_item_ops;
		struct configfs_group_operations	*ct_group_ops;
		struct configfs_attribute		**ct_attrs;
		struct configfs_bin_attribute		**ct_bin_attrs;
	};

The most basic function of a config_item_type is to define what
operations can be performed on a config_item.  All items that have been
allocated dynamically will need to provide the ct_item_ops->release()
method.  This method is called when the config_item's reference count
reaches zero.
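
The embedding and reference-counting pattern described so far can be
modelled in plain userspace C.  The sketch below is only an illustration:
model_item, fake_disk and the model_item_*() helpers are invented
stand-ins, not the kernel's config_item API, and the plain int counter
replaces struct kref.  It shows an item embedded in a container, an
_init() call that sets up the first reference, and release() running
when the last reference is dropped.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for struct config_item; not the real definition. */
struct model_item {
	int refcount;
	void (*release)(struct model_item *item);
};

/* Hypothetical container: the object the subsystem really cares about. */
struct fake_disk {
	int rw;
	struct model_item item;
};

/* Recover the container from the embedded item (container_of pattern). */
#define to_fake_disk(ptr) \
	((struct fake_disk *)((char *)(ptr) - offsetof(struct fake_disk, item)))

int fake_disk_released;	/* counts release() calls, for illustration only */

static void fake_disk_release(struct model_item *item)
{
	free(to_fake_disk(item));
	fake_disk_released++;
}

/* Like the _init() functions: set up fields and the initial reference. */
void model_item_init(struct model_item *item,
		     void (*release)(struct model_item *))
{
	item->refcount = 1;
	item->release = release;
}

struct model_item *model_item_get(struct model_item *item)
{
	item->refcount++;
	return item;
}

void model_item_put(struct model_item *item)
{
	/* release() runs only when the count reaches zero. */
	if (--item->refcount == 0 && item->release)
		item->release(item);
}

/* Allocate the container dynamically and hand back the embedded item. */
struct model_item *fake_disk_make(void)
{
	struct fake_disk *disk = calloc(1, sizeof(*disk));

	if (!disk)
		return NULL;
	model_item_init(&disk->item, fake_disk_release);
	return &disk->item;
}
```

Because the container is allocated dynamically, it is freed from
release() rather than by whoever happens to drop a reference.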

struct configfs_attribute
=========================

::

	struct configfs_attribute {
		char			*ca_name;
		struct module		*ca_owner;
		umode_t			ca_mode;
		ssize_t (*show)(struct config_item *, char *);
		ssize_t (*store)(struct config_item *, const char *, size_t);
	};

When a config_item wants an attribute to appear as a file in the item's
configfs directory, it must define a configfs_attribute describing it.
It then adds the attribute to the NULL-terminated array
config_item_type->ct_attrs.  When the item appears in configfs, the
attribute file will appear with the configfs_attribute->ca_name
filename.  configfs_attribute->ca_mode specifies the file permissions.

If an attribute is readable and provides a ->show method, that method will
be called whenever userspace asks for a read(2) on the attribute.  If an
attribute is writable and provides a ->store method, that method will be
called whenever userspace asks for a write(2) on the attribute.
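
A show/store pair for a single integer attribute might look roughly like
the following.  This is a userspace model, not kernel code: the
fake_disk_cfg structure, the rw_show()/rw_store() names and
MODEL_PAGE_SIZE are invented for illustration, the functions take the
container directly instead of a config_item, and the sketch assumes the
buffer handed to store is NUL-terminated.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

/* Hypothetical object backing the attribute. */
struct fake_disk_cfg {
	int rw;
};

/* show: format one value into the page buffer, return bytes produced. */
ssize_t rw_show(const struct fake_disk_cfg *cfg, char *page)
{
	return snprintf(page, MODEL_PAGE_SIZE, "%d\n", cfg->rw);
}

/*
 * store: parse the whole buffer as one value.  Returns count on success
 * or a negative errno value.  Assumes page is NUL-terminated.
 */
ssize_t rw_store(struct fake_disk_cfg *cfg, const char *page, size_t count)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(page, &end, 10);
	if (errno || end == page || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	if (val != 0 && val != 1)
		return -ERANGE;	/* rw is a flag in this sketch */

	cfg->rw = (int)val;
	return (ssize_t)count;
}
```

Returning the full count from store tells the caller that the entire
buffer was consumed, which matches the expectation that write(2) stores
the whole value at once.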

struct configfs_bin_attribute
=============================

::

	struct configfs_bin_attribute {
		struct configfs_attribute	cb_attr;
		void				*cb_private;
		size_t				cb_max_size;
	};

The binary attribute is used when a binary blob needs to appear as the
contents of a file in the item's configfs directory.
To do so, add the binary attribute to the NULL-terminated array
config_item_type->ct_bin_attrs.  When the item appears in configfs, the
attribute file will appear with the configfs_bin_attribute->cb_attr.ca_name
filename.  configfs_bin_attribute->cb_attr.ca_mode specifies the file
permissions.
The cb_private member is provided for use by the driver, while the
cb_max_size member specifies the maximum amount of vmalloc buffer
to be used.

If a binary attribute is readable and the config_item provides a
ct_item_ops->read_bin_attribute() method, that method will be called
whenever userspace asks for a read(2) on the attribute.  The converse
will happen for write(2).  The reads/writes are buffered, so only a
single read/write will occur; the attribute need not concern itself
with it.

struct config_group
===================

A config_item cannot live in a vacuum.  The only way one can be created
is via mkdir(2) on a config_group.  This will trigger creation of a
child item::

	struct config_group {
		struct config_item		cg_item;
		struct list_head		cg_children;
		struct configfs_subsystem	*cg_subsys;
		struct list_head		default_groups;
		struct list_head		group_entry;
	};

	void config_group_init(struct config_group *group);
	void config_group_init_type_name(struct config_group *group,
					 const char *name,
					 struct config_item_type *type);


The config_group structure contains a config_item.  Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups.  This is
accomplished via the group operations specified on the group's
config_item_type::

	struct configfs_group_operations {
		struct config_item *(*make_item)(struct config_group *group,
						 const char *name);
		struct config_group *(*make_group)(struct config_group *group,
						   const char *name);
		int (*commit_item)(struct config_item *item);
		void (*disconnect_notify)(struct config_group *group,
					  struct config_item *item);
		void (*drop_item)(struct config_group *group,
				  struct config_item *item);
	};

A group creates child items by providing the
ct_group_ops->make_item() method.  If provided, this method is called from
mkdir(2) in the group's directory.  The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs.  Configfs will then populate the filesystem
tree to reflect the new item.

If the subsystem wants the child to be a group itself, the subsystem
provides ct_group_ops->make_group().  Everything else behaves the same,
using the group _init() functions on the group.

Finally, when userspace calls rmdir(2) on the item or group,
ct_group_ops->drop_item() is called.  As a config_group is also a
config_item, a separate drop_group() method is not necessary.
The subsystem must config_item_put() the reference that was initialized
upon item allocation.  If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.

Important:
   drop_item() is void, and as such cannot fail.  When rmdir(2)
   is called, configfs WILL remove the item from the filesystem tree
   (assuming that it has no children to keep it busy).  The subsystem is
   responsible for responding to this.  If the subsystem has references to
   the item in other threads, the memory is safe.  It may take some time
   for the item to actually disappear from the subsystem's usage.  But it
   is gone from configfs.

When drop_item() is called, the item's linkage has already been torn
down.  It no longer has a reference on its parent and has no place in
the item hierarchy.  If a client needs to do some cleanup before this
teardown happens, the subsystem can implement the
ct_group_ops->disconnect_notify() method.  The method is called after
configfs has removed the item from the filesystem view but before the
item is removed from its parent group.  Like drop_item(),
disconnect_notify() is void and cannot fail.  Client subsystems should
not drop any references here, as they still must do it in drop_item().

A config_group cannot be removed while it still has child items.  This
is implemented in the configfs rmdir(2) code.  ->drop_item() will not be
called, as the item has not been dropped.  rmdir(2) will fail, as the
directory is not empty.

struct configfs_subsystem
=========================

A subsystem must register itself, usually at module_init time.  This
tells configfs to make the subsystem appear in the file tree::

	struct configfs_subsystem {
		struct config_group	su_group;
		struct mutex		su_mutex;
	};

	int configfs_register_subsystem(struct configfs_subsystem *subsys);
	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);

A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created.  For a subsystem,
this group is usually defined statically.  Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.

When the register call returns, the subsystem is live, and it
will be visible via configfs.  At that point, mkdir(2) can be called and
the subsystem must be ready for it.

An Example
==========

The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in
samples/configfs/configfs_sample.c.  It shows a trivial object displaying
and storing an attribute, and a simple group creating and destroying
these children.

Hierarchy Navigation and the Subsystem Mutex
============================================

There is an extra bonus that configfs provides.  The config_groups and
config_items are arranged in a hierarchy due to the fact that they
appear in a filesystem.  A subsystem is NEVER to touch the filesystem
parts, but the subsystem might be interested in this hierarchy.  For
this reason, the hierarchy is mirrored via the config_group->cg_children
and config_item->ci_parent structure members.

A subsystem can navigate the cg_children list and the ci_parent pointer
to see the tree created by the subsystem.  This can race with configfs'
management of the hierarchy, so configfs uses the subsystem mutex to
protect modifications.  Whenever a subsystem wants to navigate the
hierarchy, it must do so under the protection of the subsystem
mutex.

A subsystem will be prevented from acquiring the mutex while a newly
allocated item has not been linked into this hierarchy.  Similarly, it
will not be able to acquire the mutex while a dropping item has not
yet been unlinked.  This means that an item's ci_parent pointer will
never be NULL while the item is in configfs, and that an item will only
be in its parent's cg_children list for the same duration.  This allows
a subsystem to trust ci_parent and cg_children while it holds the
mutex.

Item Aggregation Via symlink(2)
===============================

configfs provides a simple group via the group->item parent/child
relationship.  Often, however, a larger environment requires aggregation
outside of the parent/child connection.  This is implemented via
symlink(2).

A config_item may provide the ct_item_ops->allow_link() and
ct_item_ops->drop_link() methods.  If the ->allow_link() method exists,
symlink(2) may be called with the config_item as the source of the link.
These links are only allowed between configfs config_items.  Any
symlink(2) attempt outside the configfs filesystem will be denied.

When symlink(2) is called, the source config_item's ->allow_link()
method is called with itself and a target item.  If the source item
allows linking to target item, it returns 0.  A source item may wish to
reject a link if it only wants links to a certain type of object (say,
in its own subsystem).

When unlink(2) is called on the symbolic link, the source item is
notified via the ->drop_link() method.  Like the ->drop_item() method,
this is a void function and cannot return failure.  The subsystem is
responsible for responding to the change.

A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it.  Dangling symlinks are not
allowed in configfs.
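
The link rules above can be modelled in a few lines of userspace C.
Everything here is hypothetical: link_item, link_type and the model_*()
functions are not configfs interfaces, and the -EPERM and -EBUSY return
values are assumptions chosen for the sketch.  It shows an allow_link()
that only accepts targets of its own type, and a removal check that
refuses items with links in either direction.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's config_item types. */
struct link_type {
	const char *name;
};

struct link_item {
	const struct link_type *type;
	int links_out;	/* symlinks this item is the source of */
	int links_in;	/* symlinks that point at this item */
};

/* allow_link: return 0 to accept the target, non-zero to reject it. */
int model_allow_link(struct link_item *src, struct link_item *target)
{
	if (src->type != target->type)
		return -EPERM;	/* assumption: only same-type links allowed */
	return 0;
}

/* What symlink(2) would do: ask the source, then record the link. */
int model_symlink(struct link_item *src, struct link_item *target)
{
	int ret = model_allow_link(src, target);

	if (ret)
		return ret;
	src->links_out++;
	target->links_in++;
	return 0;
}

/* What unlink(2) would do; like drop_link(), it cannot fail. */
void model_unlink(struct link_item *src, struct link_item *target)
{
	src->links_out--;
	target->links_in--;
}

/* Removal is refused while the item is either end of a link. */
int model_rmdir(const struct link_item *item)
{
	if (item->links_out || item->links_in)
		return -EBUSY;	/* assumption: error code chosen for the sketch */
	return 0;
}
```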

Automatically Created Subgroups
===============================

A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
more explicit to have a method whereby userspace sees this divergence.

Rather than have a group where some items behave differently than
others, configfs provides a method whereby one or many subgroups are
automatically created inside the parent at its creation.  Thus,
mkdir("parent") results in "parent", "parent/subgroup1", up through
"parent/subgroupN".  Items of type 1 can now be created in
"parent/subgroup1", and items of type N can be created in
"parent/subgroupN".

These automatic subgroups, or default groups, do not preclude other
children of the parent group.  If ct_group_ops->make_group() exists,
other child groups can be created on the parent group directly.

A configfs subsystem specifies default groups by adding them using the
configfs_add_default_group() function to the parent config_group
structure.  Each added group is populated in the configfs tree at the same
time as the parent group.  Similarly, they are removed at the same time
as the parent.  No extra notification is provided.  When a ->drop_item()
method call notifies the subsystem the parent group is going away, it
also means every default group child associated with that parent group.

As a consequence of this, default groups cannot be removed directly via
rmdir(2).  They also are not considered when rmdir(2) on the parent
group is checking for children.

Dependent Subsystems
====================

Sometimes other drivers depend on particular configfs items.  For
example, ocfs2 mounts depend on a heartbeat region item.  If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
readonly.  Not happy.

configfs provides two additional API calls: configfs_depend_item() and
configfs_undepend_item().  A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on.  configfs will then return -EBUSY from rmdir(2) for that
item.  When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.

These APIs cannot be called underneath any configfs callbacks, as
they will conflict.  They can block and allocate.  A client driver
probably shouldn't be calling them of its own gumption.  Rather it should
be providing an API that external subsystems call.

How does this work?  Imagine the ocfs2 mount process.  When it mounts,
it asks for a heartbeat region item.  This is done via a call into the
heartbeat code.  Inside the heartbeat code, the region item is looked
up.  Here, the heartbeat code calls configfs_depend_item().  If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
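
The interaction between dependencies and rmdir(2) can be pictured with a
small userspace model.  It is only a sketch: dep_item and the model_*()
functions are invented names, not the configfs_depend_item() API, and
returning -ENOENT for an item that is already gone is an assumption.
The -EBUSY result for a depended-on item follows the text above.

```c
#include <errno.h>

/* Hypothetical stand-in for an item that other drivers may depend on. */
struct dep_item {
	int dependents;	/* number of outstanding depend calls */
	int removed;	/* set once rmdir has succeeded */
};

/* Pin the item.  Fails if it has already been torn down. */
int model_depend_item(struct dep_item *item)
{
	if (item->removed)
		return -ENOENT;	/* assumption: error code for a missing item */
	item->dependents++;
	return 0;
}

/* Drop a dependency taken with model_depend_item(). */
void model_undepend_item(struct dep_item *item)
{
	item->dependents--;
}

/* rmdir is refused with -EBUSY while anything depends on the item. */
int model_rmdir(struct dep_item *item)
{
	if (item->dependents > 0)
		return -EBUSY;
	item->removed = 1;
	return 0;
}
```

In the ocfs2 picture, the heartbeat code would play the role of the
caller of model_depend_item() and would pass a failure up to the mount.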

Committable Items
=================

Note:
     Committable items are currently unimplemented.

Some config_items cannot have a valid initial state.  That is, no
default values can be specified for the item's attributes such that the
item can do its work.  Userspace must configure one or more attributes,
after which the subsystem can start whatever entity this item
represents.

Consider the FakeNBD device from above.  Without a target address *and*
a target device, the subsystem has no idea what block device to import.
The simple example assumes that the subsystem merely waits until all the
appropriate attributes are configured, and then connects.  This will,
indeed, work, but now every attribute store must check if the attributes
are initialized.  Every attribute store must fire off the connection if
that condition is met.

Far better would be an explicit action notifying the subsystem that the
config_item is ready to go.  More importantly, an explicit action allows
the subsystem to provide feedback as to whether the attributes are
initialized in a way that makes sense.  configfs provides this as
committable items.

configfs still uses only normal filesystem operations.  An item is
committed via rename(2).  The item is moved from a directory where it
can be modified to a directory where it cannot.

Any group that provides the ct_group_ops->commit_item() method has
committable items.  When this group appears in configfs, mkdir(2) will
not work directly in the group.  Instead, the group will have two
subdirectories: "live" and "pending".  The "live" directory does not
support mkdir(2) or rmdir(2) either.  It only allows rename(2).  The
"pending" directory does allow mkdir(2) and rmdir(2).  An item is
created in the "pending" directory.  Its attributes can be modified at
will.  Userspace commits the item by renaming it into the "live"
directory.  At this point, the subsystem receives the ->commit_item()
callback.  If all required attributes are filled to satisfaction, the
method returns zero and the item is moved to the "live" directory.

As rmdir(2) does not work in the "live" directory, an item must be
shut down, or "uncommitted".  Again, this is done via rename(2), this
time from the "live" directory back to the "pending" one.  The subsystem
is notified by the ct_group_ops->uncommit_object() method.
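
From userspace, the commit and uncommit steps would each be a single
rename(2).  Since committable items are unimplemented, the following is
purely a sketch of the flow described above: the helper names are made
up, and the group path is whatever directory holds the "pending" and
"live" subdirectories.

```c
#include <errno.h>
#include <stdio.h>

/* Move <group>/<from_dir>/<name> to <group>/<to_dir>/<name>. */
static int move_item(const char *group, const char *from_dir,
		     const char *to_dir, const char *name)
{
	char from[4096], to[4096];
	int n;

	n = snprintf(from, sizeof(from), "%s/%s/%s", group, from_dir, name);
	if (n < 0 || (size_t)n >= sizeof(from))
		return -ENAMETOOLONG;
	n = snprintf(to, sizeof(to), "%s/%s/%s", group, to_dir, name);
	if (n < 0 || (size_t)n >= sizeof(to))
		return -ENAMETOOLONG;

	/* A rejected commit would be reported here as a rename(2) error. */
	if (rename(from, to) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: commit an item by moving it into "live". */
int commit_item_by_rename(const char *group, const char *name)
{
	return move_item(group, "pending", "live", name);
}

/* Hypothetical helper: uncommit by moving it back into "pending". */
int uncommit_item_by_rename(const char *group, const char *name)
{
	return move_item(group, "live", "pending", name);
}
```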