The seq_file interface

Copyright 2003 Jonathan Corbet <corbet@lwn.net>
This file is originally from the LWN.net Driver Porting series at
http://lwn.net/Articles/driver-porting/


There are numerous ways for a device driver (or other kernel component) to
provide information to the user or system administrator. One useful
technique is the creation of virtual files, in debugfs, /proc or elsewhere.
Virtual files can provide human-readable output that is easy to get at
without any special utility programs; they can also make life easier for
script writers. It is not surprising that the use of virtual files has
grown over the years.

Creating those files correctly has always been a bit of a challenge,
however. It is not that hard to make a virtual file which returns a
string. But life gets trickier if the output is long - anything greater
than an application is likely to read in a single operation. Handling
multiple reads (and seeks) requires careful attention to the reader's
position within the virtual file - that position is, likely as not, in the
middle of a line of output. The kernel has traditionally had a number of
implementations that got this wrong.

The 2.6 kernel contains a set of functions (implemented by Alexander Viro)
which are designed to make it easy for virtual file creators to get it
right.

The seq_file interface is available via <linux/seq_file.h>. There are
three aspects to seq_file:

     * An iterator interface which lets a virtual file implementation
       step through the objects it is presenting.

     * Some utility functions for formatting objects for output without
       needing to worry about things like output buffers.

     * A set of canned file_operations which implement most operations on
       the virtual file.

We'll look at the seq_file interface via an extremely simple example: a
loadable module which creates a file called /proc/sequence. The file, when
read, simply produces a set of increasing integer values, one per line. The
sequence will continue until the user loses patience and finds something
better to do. The file is seekable, in that one can do something like the
following:

	dd if=/proc/sequence of=out1 count=1
	dd if=/proc/sequence skip=1 of=out2 count=1

Then concatenate the output files out1 and out2 and get the right
result. Yes, it is a thoroughly useless module, but the point is to show
how the mechanism works without getting lost in other details. (Those
wanting to see the full source for this module can find it at
http://lwn.net/Articles/22359/).

Deprecated create_proc_entry

Note that the above article uses create_proc_entry(), which was removed in
kernel 3.10. Current versions require the following update:

	-	entry = create_proc_entry("sequence", 0, NULL);
	-	if (entry)
	-		entry->proc_fops = &ct_file_ops;
	+	entry = proc_create("sequence", 0, NULL, &ct_file_ops);

The iterator interface

Modules implementing a virtual file with seq_file must implement an
iterator object that allows stepping through the data of interest
during a "session" (roughly one read() system call). If the iterator
is able to move to a specific position - like the file they implement,
though with freedom to map the position number to a sequence location
in whatever way is convenient - the iterator need only exist
transiently during a session. If the iterator cannot easily find a
numerical position but works well with a first/next interface, the
iterator can be stored in the private data area and continue from one
session to the next.

A seq_file implementation that is formatting firewall rules from a
table, for example, could provide a simple iterator that interprets
position N as the Nth rule in the chain. A seq_file implementation
that presents the content of a potentially volatile linked list
might record a pointer into that list, provided that can be done
without risk of the current location being removed.

Positioning can thus be done in whatever way makes the most sense for
the generator of the data, which need not be aware of how a position
translates to an offset in the virtual file. The one obvious exception
is that a position of zero should indicate the beginning of the file.

The /proc/sequence iterator just uses the count of the next number it
will output as its position.

Four functions must be implemented to make the iterator work. The
first, called start(), starts a session and takes a position as an
argument, returning an iterator which will start reading at that
position. The pos passed to start() will always be either zero, or
the most recent pos used in the previous session.

For our simple sequence example, the start() function looks like:

	static void *ct_seq_start(struct seq_file *s, loff_t *pos)
	{
		loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL);
		if (!spos)
			return NULL;
		*spos = *pos;
		return spos;
	}

The entire data structure for this iterator is a single loff_t value
holding the current position. There is no upper bound for the sequence
iterator, but that will not be the case for most other seq_file
implementations; in most cases the start() function should check for a
"past end of file" condition and return NULL if need be.
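
The bounds check is easier to see outside the kernel. What follows is a
userspace sketch, not kernel code: demo_rules, demo_start() and demo_next()
are made-up names, and a plain long long stands in for loff_t. It treats
position N as the Nth entry of a small table and returns NULL once the
position runs past the end.

```c
#include <stddef.h>

/* Hypothetical table standing in for the data a real driver would expose. */
static const char *const demo_rules[] = { "accept", "drop", "log" };
#define DEMO_NR_RULES (sizeof(demo_rules) / sizeof(demo_rules[0]))

/*
 * Userspace model of a bounded start(): position N means "the Nth
 * rule". Returning NULL tells the caller there is nothing at pos.
 */
static const char *const *demo_start(long long pos)
{
	if (pos < 0 || (unsigned long long)pos >= DEMO_NR_RULES)
		return NULL;	/* past end of file */
	return &demo_rules[pos];
}

/* Model of next(): advance *pos, then reuse the same bounds check. */
static const char *const *demo_next(long long *pos)
{
	++*pos;
	return demo_start(*pos);
}
```

Funnelling next() through the same check as start() keeps the "end of
data" rule in one place, which is one reasonable way to avoid the two
functions disagreeing about where the file ends.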

For more complicated applications, the private field of the seq_file
structure can be used to hold state from session to session. There is
also a special value which can be returned by the start() function
called SEQ_START_TOKEN; it can be used if you wish to instruct your
show() function (described below) to print a header at the top of the
output. SEQ_START_TOKEN should only be used if the offset is zero,
however.
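
The header trick can be modelled in userspace as follows. This is only a
sketch under assumptions: DEMO_START_TOKEN is a local stand-in for
SEQ_START_TOKEN, and demo_start() and demo_show() are hypothetical
functions that write into a caller-supplied buffer rather than a struct
seq_file. Position zero yields the token, so every later position is
shifted by one.

```c
#include <stdio.h>
#include <stddef.h>

/* Local stand-in for SEQ_START_TOKEN; a pointer that is never real data. */
#define DEMO_START_TOKEN ((void *)1)

static const int demo_values[] = { 10, 20, 30 };
#define DEMO_NR (sizeof(demo_values) / sizeof(demo_values[0]))

/* pos 0 is the header; pos N (N >= 1) is element N - 1. */
static void *demo_start(long long pos)
{
	if (pos == 0)
		return DEMO_START_TOKEN;
	if (pos < 0 || (unsigned long long)pos > DEMO_NR)
		return NULL;
	return (void *)&demo_values[pos - 1];
}

/* Writes one record into buf; returns the number of characters produced. */
static int demo_show(char *buf, size_t len, void *v)
{
	if (v == DEMO_START_TOKEN)
		return snprintf(buf, len, "value\n");	/* header line */
	return snprintf(buf, len, "%d\n", *(const int *)v);
}
```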

The next function to implement is called, amazingly, next(); its job is to
move the iterator forward to the next position in the sequence. The
example module can simply increment the position by one; more useful
modules will do what is needed to step through some data structure. The
next() function returns a new iterator, or NULL if the sequence is
complete. Here's the example version:

	static void *ct_seq_next(struct seq_file *s, void *v, loff_t *pos)
	{
		loff_t *spos = v;
		*pos = ++*spos;
		return spos;
	}

The stop() function closes a session; its job, of course, is to clean
up. If dynamic memory is allocated for the iterator, stop() is the
place to free it; if a lock was taken by start(), stop() must release
that lock. The value that *pos was set to by the last next() call
before stop() is remembered, and used for the first start() call of
the next session unless lseek() has been called on the file; in that
case the next start() call will be asked to start at position zero.

	static void ct_seq_stop(struct seq_file *s, void *v)
	{
		kfree(v);
	}

Finally, the show() function should format the object currently pointed to
by the iterator for output. The example module's show() function is:

	static int ct_seq_show(struct seq_file *s, void *v)
	{
		loff_t *spos = v;
		seq_printf(s, "%lld\n", (long long)*spos);
		return 0;
	}

If all is well, the show() function should return zero. A negative error
code in the usual manner indicates that something went wrong; it will be
passed back to user space. This function can also return SEQ_SKIP, which
causes the current item to be skipped; if the show() function has already
generated output before returning SEQ_SKIP, that output will be dropped.
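
To make "that output will be dropped" concrete, here is a small userspace
model. Everything in it is an assumption made for illustration:
DEMO_SKIP stands in for SEQ_SKIP, struct demo_buf for the seq_file
buffer, and demo_emit() for the caller of show(). The caller remembers
how full the buffer was and rolls back to that mark when an item is
skipped.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_SKIP 1	/* local stand-in for SEQ_SKIP */

struct demo_buf {
	char data[64];
	size_t count;		/* bytes of valid output */
};

/* Appends text to the buffer (silently truncating; enough for a sketch). */
static void demo_puts(struct demo_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (n > sizeof(b->data) - 1 - b->count)
		n = sizeof(b->data) - 1 - b->count;
	memcpy(b->data + b->count, s, n);
	b->count += n;
	b->data[b->count] = '\0';
}

/* Hypothetical show(): emits output first, then rejects odd items. */
static int demo_show(struct demo_buf *b, int item)
{
	char line[16];

	snprintf(line, sizeof(line), "%d\n", item);
	demo_puts(b, line);
	return (item % 2) ? DEMO_SKIP : 0;
}

/* Caller side: roll the buffer back if the item asked to be skipped. */
static void demo_emit(struct demo_buf *b, int item)
{
	size_t mark = b->count;

	if (demo_show(b, item) == DEMO_SKIP) {
		b->count = mark;
		b->data[mark] = '\0';
	}
}
```

Feeding the items 0 through 4 to demo_emit() leaves only the even
numbers in the buffer, even though demo_show() wrote a line for every
item.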

We will look at seq_printf() in a moment. But first, the definition of the
seq_file iterator is finished by creating a seq_operations structure with
the four functions we have just defined:

	static const struct seq_operations ct_seq_ops = {
		.start = ct_seq_start,
		.next  = ct_seq_next,
		.stop  = ct_seq_stop,
		.show  = ct_seq_show
	};

This structure will be needed to tie our iterator to the /proc file in
a little bit.

It's worth noting that the iterator value returned by start() and
manipulated by the other functions is considered to be completely opaque by
the seq_file code. It can thus be anything that is useful in stepping
through the data to be output. Counters can be useful, but it could also be
a direct pointer into an array or linked list. Anything goes, as long as
the programmer is aware that things can happen between calls to the
iterator function. However, the seq_file code (by design) will not sleep
between the calls to start() and stop(), so holding a lock during that time
is a reasonable thing to do. The seq_file code will also avoid taking any
other locks while the iterator is active.
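
The way the four callbacks fit together in one session can be sketched in
userspace. This is a simplified model, not the seq_file implementation:
struct demo_ops, demo_session() and the other demo_ names are
hypothetical, long long stands in for loff_t, and a counter stands in for
a real lock. The point to notice is that stop() runs exactly once per
start(), even when start() returns NULL.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical callbacks, shaped like seq_operations but for userspace. */
struct demo_ops {
	void *(*start)(long long *pos);
	void *(*next)(void *v, long long *pos);
	void  (*stop)(void *v);
	int   (*show)(char *buf, size_t len, void *v);
};

static int demo_lock_depth;		/* models a lock taken in start() */
static long long demo_cursor;

static void *demo_start(long long *pos)
{
	demo_lock_depth++;		/* "lock" held until stop() */
	if (*pos >= 3)
		return NULL;		/* past the end of a 3-item sequence */
	demo_cursor = *pos;
	return &demo_cursor;
}

static void *demo_next(void *v, long long *pos)
{
	long long *cur = v;

	*pos = ++*cur;
	return *cur < 3 ? cur : NULL;
}

static void demo_stop(void *v)
{
	(void)v;
	demo_lock_depth--;		/* always paired with start() */
}

static int demo_show(char *buf, size_t len, void *v)
{
	return snprintf(buf, len, "%lld\n", *(long long *)v);
}

static const struct demo_ops demo_seq_ops = {
	.start = demo_start,
	.next  = demo_next,
	.stop  = demo_stop,
	.show  = demo_show
};

/* One session: start(), then show()/next() until NULL, then stop(). */
static size_t demo_session(const struct demo_ops *ops, long long *pos,
			   char *out, size_t len)
{
	size_t used = 0;
	void *v = ops->start(pos);

	while (v && used < len) {
		int n = ops->show(out + used, len - used, v);

		if (n < 0 || (size_t)n >= len - used)
			break;		/* error, or record did not fit */
		used += (size_t)n;
		v = ops->next(v, pos);
	}
	if (used < len)
		out[used] = '\0';
	ops->stop(v);
	return used;
}
```

A first call with *pos set to zero fills the buffer with the lines 0, 1
and 2 and leaves *pos at 3; a second call with that same pos produces
nothing, and demo_lock_depth is back at zero after each call.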


Formatted output

The seq_file code manages positioning within the output created by the
iterator and getting it into the user's buffer. But, for that to work, that
output must be passed to the seq_file code. Some utility functions have
been defined which make this task easy.

Most code will simply use seq_printf(), which works pretty much like
printk(), but which requires the seq_file pointer as an argument.

For straight character output, the following functions may be used:

	seq_putc(struct seq_file *m, char c);
	seq_puts(struct seq_file *m, const char *s);
	seq_escape(struct seq_file *m, const char *s, const char *esc);

The first two output a single character and a string, just like one would
expect. seq_escape() is like seq_puts(), except that any character in s
which is in the string esc will be represented in octal form in the output.
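
The escaping rule can be illustrated with a userspace function.
demo_escape() is a hypothetical name and writes to a plain character
buffer rather than a seq_file; it assumes the octal form is a backslash
followed by three octal digits.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace model of the escaping rule: bytes of s that appear in esc
 * are written as a backslash and three octal digits. Returns 0 on
 * success, -1 if out is too small.
 */
static int demo_escape(char *out, size_t len, const char *s, const char *esc)
{
	size_t used = 0;

	for (; *s; s++) {
		unsigned char c = (unsigned char)*s;
		size_t need = strchr(esc, c) ? 4 : 1;

		if (used + need >= len)
			return -1;	/* keep room for the final NUL */
		if (need == 4)
			snprintf(out + used, 5, "\\%03o", (unsigned int)c);
		else
			out[used] = (char)c;
		used += need;
	}
	if (used >= len)
		return -1;
	out[used] = '\0';
	return 0;
}
```

With esc set to a space and a newline, the input "a b" followed by a
newline comes out as a\040b\012, which keeps each record on a single
line with no embedded whitespace.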

There is also a pair of functions for printing filenames:

	int seq_path(struct seq_file *m, const struct path *path,
		     const char *esc);
	int seq_path_root(struct seq_file *m, const struct path *path,
			  const struct path *root, const char *esc);

Here, path indicates the file of interest, and esc is a set of characters
which should be escaped in the output. A call to seq_path() will output
the path relative to the current process's filesystem root. If a different
root is desired, it can be used with seq_path_root(). If it turns out that
path cannot be reached from root, seq_path_root() returns SEQ_SKIP.

A function producing complicated output may want to check

	bool seq_has_overflowed(struct seq_file *m);

and avoid further seq_<output> calls if true is returned.

A true return from seq_has_overflowed() means that the seq_file buffer will
be discarded and the seq_file read code will attempt to allocate a larger
buffer and retry printing.
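
The discard-and-retry behaviour can be modelled in userspace. The sketch
below is an illustration under assumptions, not the kernel's code:
struct demo_seq, demo_has_overflowed(), demo_print_int(), demo_show() and
demo_render() are hypothetical names, malloc() stands in for the kernel
allocator, and the buffer is simply doubled on each retry. A full buffer
(count equal to size) is used as the overflow marker.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

struct demo_seq {
	char *buf;
	size_t size;	/* capacity of buf */
	size_t count;	/* bytes used; count == size marks overflow */
};

static bool demo_has_overflowed(const struct demo_seq *m)
{
	return m->count == m->size;
}

/* Appends "<n>\n"; if it does not fit, marks the buffer as overflowed. */
static void demo_print_int(struct demo_seq *m, int n)
{
	if (m->count < m->size) {
		int len = snprintf(m->buf + m->count, m->size - m->count,
				   "%d\n", n);

		if (len >= 0 && m->count + (size_t)len < m->size) {
			m->count += (size_t)len;
			return;
		}
	}
	m->count = m->size;
}

/* Hypothetical show(): many output calls, stopping early on overflow. */
static void demo_show(struct demo_seq *m)
{
	for (int i = 0; i < 100 && !demo_has_overflowed(m); i++)
		demo_print_int(m, i);
}

/* Caller side: discard the buffer, double it, and run show() again. */
static char *demo_render(size_t initial_size)
{
	struct demo_seq m = { .size = initial_size ? initial_size : 1 };

	for (;;) {
		m.buf = malloc(m.size);
		if (!m.buf)
			return NULL;
		m.count = 0;
		demo_show(&m);
		if (!demo_has_overflowed(&m)) {
			m.buf[m.count] = '\0';
			return m.buf;	/* caller frees */
		}
		free(m.buf);
		m.size *= 2;
	}
}
```

Because show() is rerun from scratch after every overflow, it should be
cheap to repeat and free of side effects; checking for overflow inside
the loop merely avoids wasted formatting work on a buffer that is going
to be thrown away.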


Making it all work

So far, we have a nice set of functions which can produce output within the
seq_file system, but we have not yet turned them into a file that a user
can see. Creating a file within the kernel requires, of course, the
creation of a set of file_operations which implement the operations on that
file. The seq_file interface provides a set of canned operations which do
most of the work. The virtual file author still must implement the open()
method, however, to hook everything up. The open function is often a single
line, as in the example module:

	static int ct_open(struct inode *inode, struct file *file)
	{
		return seq_open(file, &ct_seq_ops);
	}

Here, the call to seq_open() takes the seq_operations structure we created
before, and gets set up to iterate through the virtual file.

On a successful open, seq_open() stores the struct seq_file pointer in
file->private_data. If you have an application where the same iterator can
be used for more than one file, you can store an arbitrary pointer in the
private field of the seq_file structure; that value can then be retrieved
by the iterator functions.

There is also a wrapper function to seq_open() called seq_open_private(). It
kmallocs a zero-filled block of memory and stores a pointer to it in the
private field of the seq_file structure, returning 0 on success. The
block size is specified in a third parameter to the function, e.g.:

	static int ct_open(struct inode *inode, struct file *file)
	{
		return seq_open_private(file, &ct_seq_ops,
					sizeof(struct mystruct));
	}

There is also a variant function, __seq_open_private(), which is functionally
identical except that, if successful, it returns the pointer to the allocated
memory block, allowing further initialisation, e.g.:

	static int ct_open(struct inode *inode, struct file *file)
	{
		struct mystruct *p =
			__seq_open_private(file, &ct_seq_ops, sizeof(*p));

		if (!p)
			return -ENOMEM;

		p->foo = bar; /* initialize my stuff */
			...
		p->baz = true;

		return 0;
	}

A corresponding close function, seq_release_private(), is available which
frees the memory allocated in the corresponding open.

The other operations of interest - read(), llseek(), and release() - are
all implemented by the seq_file code itself. So a virtual file's
file_operations structure will look like:

	static const struct file_operations ct_file_ops = {
		.owner   = THIS_MODULE,
		.open    = ct_open,
		.read    = seq_read,
		.llseek  = seq_lseek,
		.release = seq_release
	};

There is also a seq_release_private() which passes the contents of the
seq_file private field to kfree() before releasing the structure.

The final step is the creation of the /proc file itself. In the example
code, that is done in the initialization code in the usual way:

	static int ct_init(void)
	{
		/* proc_create() returns NULL if the entry cannot be created. */
		if (!proc_create("sequence", 0, NULL, &ct_file_ops))
			return -ENOMEM;
		return 0;
	}

	module_init(ct_init);

And that is pretty much it.


seq_list

If your file will be iterating through a linked list, you may find these
routines useful:

	struct list_head *seq_list_start(struct list_head *head,
					 loff_t pos);
	struct list_head *seq_list_start_head(struct list_head *head,
					      loff_t pos);
	struct list_head *seq_list_next(void *v, struct list_head *head,
					loff_t *ppos);

These helpers will interpret pos as a position within the list and iterate
accordingly. Your start() and next() functions need only invoke the
seq_list_* helpers with a pointer to the appropriate list_head structure.
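
What "interpret pos as a position within the list" means can be shown with
a userspace model. The code below is a sketch with made-up names: struct
demo_list imitates the kernel's circular list_head, long long stands in
for loff_t, and demo_list_start() and demo_list_next() mimic the roles of
seq_list_start() and seq_list_next() without being their actual source.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct demo_list {
	struct demo_list *next, *prev;
};

static void demo_list_init(struct demo_list *head)
{
	head->next = head->prev = head;
}

static void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Return the entry at index pos, or NULL when pos is past the end. */
static struct demo_list *demo_list_start(struct demo_list *head, long long pos)
{
	struct demo_list *lh;

	for (lh = head->next; lh != head; lh = lh->next)
		if (pos-- == 0)
			return lh;
	return NULL;
}

/* Step to the following entry and bump *ppos; NULL once we wrap to head. */
static struct demo_list *demo_list_next(void *v, struct demo_list *head,
					long long *ppos)
{
	struct demo_list *lh = ((struct demo_list *)v)->next;

	++*ppos;
	return lh == head ? NULL : lh;
}
```

Note that finding the starting entry walks the list from the head, so
each new session costs time proportional to the current position. That
is usually acceptable for short lists; very long lists may be better
served by an iterator kept in the private data area.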


The extra-simple version

For extremely simple virtual files, there is an even easier interface. A
module can define only the show() function, which should create all the
output that the virtual file will contain. The file's open() method then
calls:

	int single_open(struct file *file,
			int (*show)(struct seq_file *m, void *p),
			void *data);

When output time comes, the show() function will be called once. The data
value given to single_open() can be found in the private field of the
seq_file structure. When using single_open(), the programmer should use
single_release() instead of seq_release() in the file_operations structure
to avoid a memory leak.
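
The "called once" behaviour amounts to a sequence with exactly one record
at position zero. The userspace sketch below models that idea only; it is
not how single_open() is written. demo_show_fn, demo_single_start(),
demo_single_read() and demo_show() are hypothetical names, the data
pointer is passed directly instead of through a seq_file, and long long
stands in for loff_t.

```c
#include <stdio.h>
#include <stddef.h>

typedef int (*demo_show_fn)(char *buf, size_t len, void *data);

/* Position 0 is the one and only record. */
static int demo_single_start(long long pos)
{
	return pos == 0;
}

/* One read session over a single-record file; returns bytes produced. */
static size_t demo_single_read(demo_show_fn show, void *data, long long *pos,
			       char *buf, size_t len)
{
	int n;

	if (!demo_single_start(*pos))
		return 0;		/* already consumed: end of file */
	n = show(buf, len, data);
	++*pos;				/* nothing follows the record */
	return (n < 0 || (size_t)n >= len) ? 0 : (size_t)n;
}

/* Hypothetical show(): formats everything the file contains in one go. */
static int demo_show(char *buf, size_t len, void *data)
{
	return snprintf(buf, len, "answer: %d\n", *(int *)data);
}
```

The first call to demo_single_read() with *pos at zero runs show() and
advances the position; any later call with the same pos reports end of
file without calling show() again.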