.. SPDX-License-Identifier: GPL-2.0

Index Nodes
-----------

In a regular UNIX filesystem, the inode stores all the metadata
pertaining to the file (time stamps, block maps, extended attributes,
etc.), not the directory entry. To find the information associated with a
file, one must traverse the directory files to find the directory entry
associated with the file, then load the inode to find the metadata for
that file. ext4 cheats a little (for performance reasons) by storing a
copy of the file type (normally stored in the inode) in the directory
entry. (Compare all this to FAT, which stores all the file information
directly in the directory entry, but does not support hard links and is
in general more seek-happy than ext4 due to its simpler block allocator
and extensive use of linked lists.)

The inode table is a linear array of ``struct ext4_inode``. The table is
sized to have enough blocks to store at least
``sb.s_inode_size * sb.s_inodes_per_group`` bytes. The number of the
block group containing an inode can be calculated as
``(inode_number - 1) / sb.s_inodes_per_group``, and the offset into the
group's table is ``(inode_number - 1) % sb.s_inodes_per_group``. There
is no inode 0.

The inode checksum is calculated against the FS UUID, the inode number,
and the inode structure itself.

The inode table entry is laid out in ``struct ext4_inode``.

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1
   :class: longtable

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - i\_mode
     - File mode. See the table i_mode_ below.
   * - 0x2
     - \_\_le16
     - i\_uid
     - Lower 16-bits of Owner UID.
   * - 0x4
     - \_\_le32
     - i\_size\_lo
     - Lower 32-bits of size in bytes.
   * - 0x8
     - \_\_le32
     - i\_atime
     - Last access time, in seconds since the epoch. However, if the EA\_INODE
       inode flag is set, this inode stores an extended attribute value and
       this field contains the checksum of the value.
   * - 0xC
     - \_\_le32
     - i\_ctime
     - Last inode change time, in seconds since the epoch. However, if the
       EA\_INODE inode flag is set, this inode stores an extended attribute
       value and this field contains the lower 32 bits of the attribute value's
       reference count.
   * - 0x10
     - \_\_le32
     - i\_mtime
     - Last data modification time, in seconds since the epoch. However, if the
       EA\_INODE inode flag is set, this inode stores an extended attribute
       value and this field contains the number of the inode that owns the
       extended attribute.
   * - 0x14
     - \_\_le32
     - i\_dtime
     - Deletion Time, in seconds since the epoch.
   * - 0x18
     - \_\_le16
     - i\_gid
     - Lower 16-bits of GID.
   * - 0x1A
     - \_\_le16
     - i\_links\_count
     - Hard link count. Normally, ext4 does not permit an inode to have more
       than 65,000 hard links. This applies to files as well as directories,
       which means that there cannot be more than 64,998 subdirectories in a
       directory (each subdirectory's '..' entry counts as a hard link, as does
       the '.' entry in the directory itself). With the DIR\_NLINK feature
       enabled, ext4 supports more than 64,998 subdirectories by setting this
       field to 1 to indicate that the number of hard links is not known.
   * - 0x1C
     - \_\_le32
     - i\_blocks\_lo
     - Lower 32-bits of “block” count. If the huge\_file feature flag is not
       set on the filesystem, the file consumes ``i_blocks_lo`` 512-byte blocks
       on disk. If huge\_file is set and EXT4\_HUGE\_FILE\_FL is NOT set in
       ``inode.i_flags``, then the file consumes
       ``i_blocks_lo + (i_blocks_hi << 32)`` 512-byte blocks on disk. If
       huge\_file is set and EXT4\_HUGE\_FILE\_FL IS set in
       ``inode.i_flags``, then the file consumes
       ``i_blocks_lo + (i_blocks_hi << 32)`` filesystem blocks on disk.
   * - 0x20
     - \_\_le32
     - i\_flags
     - Inode flags. See the table i_flags_ below.
   * - 0x24
     - 4 bytes
     - i\_osd1
     - See the table i_osd1_ for more details.
   * - 0x28
     - 60 bytes
     - i\_block[EXT4\_N\_BLOCKS=15]
     - Block map or extent tree. See the section “The Contents of inode.i\_block”.
   * - 0x64
     - \_\_le32
     - i\_generation
     - File version (for NFS).
   * - 0x68
     - \_\_le32
     - i\_file\_acl\_lo
     - Lower 32-bits of extended attribute block. ACLs are only one of many
       possible extended attributes; the field name is presumably a result
       of ACLs being the first use of extended attributes.
   * - 0x6C
     - \_\_le32
     - i\_size\_high / i\_dir\_acl
     - Upper 32-bits of file/directory size. In ext2/3 this field was named
       i\_dir\_acl, though it was usually set to zero and never used.
   * - 0x70
     - \_\_le32
     - i\_obso\_faddr
     - (Obsolete) fragment address.
   * - 0x74
     - 12 bytes
     - i\_osd2
     - See the table i_osd2_ for more details.
   * - 0x80
     - \_\_le16
     - i\_extra\_isize
     - Size of this inode - 128. Alternatively, the size of the extended inode
       fields beyond the original ext2 inode, including this field.
   * - 0x82
     - \_\_le16
     - i\_checksum\_hi
     - Upper 16-bits of the inode checksum.
   * - 0x84
     - \_\_le32
     - i\_ctime\_extra
     - Extra change time bits. This provides sub-second precision. See the
       Inode Timestamps section.
   * - 0x88
     - \_\_le32
     - i\_mtime\_extra
     - Extra modification time bits. This provides sub-second precision.
   * - 0x8C
     - \_\_le32
     - i\_atime\_extra
     - Extra access time bits. This provides sub-second precision.
   * - 0x90
     - \_\_le32
     - i\_crtime
     - File creation time, in seconds since the epoch.
   * - 0x94
     - \_\_le32
     - i\_crtime\_extra
     - Extra file creation time bits. This provides sub-second precision.
   * - 0x98
     - \_\_le32
     - i\_version\_hi
     - Upper 32-bits for version number.
   * - 0x9C
     - \_\_le32
     - i\_projid
     - Project ID.

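The three cases described for ``i_blocks_lo`` can be sketched as follows.
This is only an illustration: the function name and its parameters are
hypothetical, ``i_blocks_hi`` stands for the 16-bit ``l_i_blocks_high``
value from ``i_osd2``, and the caller is assumed to already know whether
the huge\_file feature is enabled and what the filesystem block size is.

```python
EXT4_HUGE_FILE_FL = 0x40000  # value taken from the i_flags table


def disk_usage_bytes(i_blocks_lo, i_blocks_hi, i_flags,
                     huge_file_feature, fs_block_size):
    """Return the on-disk space consumed by a file, in bytes (hypothetical helper)."""
    if not huge_file_feature:
        # Without huge_file, only the lower 32 bits count, in 512-byte units.
        return i_blocks_lo * 512
    count = i_blocks_lo + (i_blocks_hi << 32)
    if i_flags & EXT4_HUGE_FILE_FL:
        # The count is expressed in filesystem blocks.
        return count * fs_block_size
    # The count is expressed in 512-byte units.
    return count * 512
```
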
.. _i_mode:

The ``i_mode`` value is a combination of the following flags:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 0x1
     - S\_IXOTH (Others may execute)
   * - 0x2
     - S\_IWOTH (Others may write)
   * - 0x4
     - S\_IROTH (Others may read)
   * - 0x8
     - S\_IXGRP (Group members may execute)
   * - 0x10
     - S\_IWGRP (Group members may write)
   * - 0x20
     - S\_IRGRP (Group members may read)
   * - 0x40
     - S\_IXUSR (Owner may execute)
   * - 0x80
     - S\_IWUSR (Owner may write)
   * - 0x100
     - S\_IRUSR (Owner may read)
   * - 0x200
     - S\_ISVTX (Sticky bit)
   * - 0x400
     - S\_ISGID (Set GID)
   * - 0x800
     - S\_ISUID (Set UID)
   * -
     - These are mutually-exclusive file types:
   * - 0x1000
     - S\_IFIFO (FIFO)
   * - 0x2000
     - S\_IFCHR (Character device)
   * - 0x4000
     - S\_IFDIR (Directory)
   * - 0x6000
     - S\_IFBLK (Block device)
   * - 0x8000
     - S\_IFREG (Regular file)
   * - 0xA000
     - S\_IFLNK (Symbolic link)
   * - 0xC000
     - S\_IFSOCK (Socket)

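Because the file types are mutually exclusive values rather than
independent bits (0x6000 is 0x2000 combined with 0x4000, for example), a
decoder has to compare the whole upper nibble instead of testing single
bits. A minimal sketch, in which ``decode_mode`` is a hypothetical helper
and not a kernel function:

```python
# File type values from the i_mode table; they occupy the top four bits.
FILE_TYPES = {
    0x1000: "S_IFIFO",
    0x2000: "S_IFCHR",
    0x4000: "S_IFDIR",
    0x6000: "S_IFBLK",
    0x8000: "S_IFREG",
    0xA000: "S_IFLNK",
    0xC000: "S_IFSOCK",
}


def decode_mode(i_mode):
    """Split i_mode into (file type name, lower 12 permission bits)."""
    file_type = FILE_TYPES.get(i_mode & 0xF000, "unknown")
    permissions = i_mode & 0x0FFF  # rwx bits plus sticky, setgid, setuid
    return file_type, permissions
```

For instance, ``decode_mode(0x81A4)`` yields ``("S_IFREG", 0o644)``.
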
.. _i_flags:

The ``i_flags`` field is a combination of these values:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 0x1
     - This file requires secure deletion (EXT4\_SECRM\_FL). (not implemented)
   * - 0x2
     - This file should be preserved, should undeletion be desired
       (EXT4\_UNRM\_FL). (not implemented)
   * - 0x4
     - File is compressed (EXT4\_COMPR\_FL). (not really implemented)
   * - 0x8
     - All writes to the file must be synchronous (EXT4\_SYNC\_FL).
   * - 0x10
     - File is immutable (EXT4\_IMMUTABLE\_FL).
   * - 0x20
     - File can only be appended (EXT4\_APPEND\_FL).
   * - 0x40
     - The dump(1) utility should not dump this file (EXT4\_NODUMP\_FL).
   * - 0x80
     - Do not update access time (EXT4\_NOATIME\_FL).
   * - 0x100
     - Dirty compressed file (EXT4\_DIRTY\_FL). (not used)
   * - 0x200
     - File has one or more compressed clusters (EXT4\_COMPRBLK\_FL). (not used)
   * - 0x400
     - Do not compress file (EXT4\_NOCOMPR\_FL). (not used)
   * - 0x800
     - Encrypted inode (EXT4\_ENCRYPT\_FL). This bit value previously was
       EXT4\_ECOMPR\_FL (compression error), which was never used.
   * - 0x1000
     - Directory has hashed indexes (EXT4\_INDEX\_FL).
   * - 0x2000
     - AFS magic directory (EXT4\_IMAGIC\_FL).
   * - 0x4000
     - File data must always be written through the journal
       (EXT4\_JOURNAL\_DATA\_FL).
   * - 0x8000
     - File tail should not be merged (EXT4\_NOTAIL\_FL). (not used by ext4)
   * - 0x10000
     - All directory entry data should be written synchronously (see
       ``dirsync``) (EXT4\_DIRSYNC\_FL).
   * - 0x20000
     - Top of directory hierarchy (EXT4\_TOPDIR\_FL).
   * - 0x40000
     - This is a huge file (EXT4\_HUGE\_FILE\_FL).
   * - 0x80000
     - Inode uses extents (EXT4\_EXTENTS\_FL).
   * - 0x100000
     - Verity protected file (EXT4\_VERITY\_FL).
   * - 0x200000
     - Inode stores a large extended attribute value in its data blocks
       (EXT4\_EA\_INODE\_FL).
   * - 0x400000
     - This file has blocks allocated past EOF (EXT4\_EOFBLOCKS\_FL).
       (deprecated)
   * - 0x01000000
     - Inode is a snapshot (``EXT4_SNAPFILE_FL``). (not in mainline)
   * - 0x04000000
     - Snapshot is being deleted (``EXT4_SNAPFILE_DELETED_FL``). (not in
       mainline)
   * - 0x08000000
     - Snapshot shrink has completed (``EXT4_SNAPFILE_SHRUNK_FL``). (not in
       mainline)
   * - 0x10000000
     - Inode has inline data (EXT4\_INLINE\_DATA\_FL).
   * - 0x20000000
     - Create children with the same project ID (EXT4\_PROJINHERIT\_FL).
   * - 0x80000000
     - Reserved for ext4 library (EXT4\_RESERVED\_FL).
   * -
     - Aggregate flags:
   * - 0x705BDFFF
     - User-visible flags.
   * - 0x604BC0FF
     - User-modifiable flags. Note that while EXT4\_JOURNAL\_DATA\_FL and
       EXT4\_EXTENTS\_FL can be set with setattr, they are not in the kernel's
       EXT4\_FL\_USER\_MODIFIABLE mask, since it needs to handle the setting of
       these flags in a special manner and they are masked out of the set of
       flags that are saved directly to i\_flags.

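Unlike the file types in ``i_mode``, each ``i_flags`` value is an
independent bit, so decoding is a matter of testing every bit in turn.
The sketch below covers only a subset of the table above; ``FLAG_NAMES``
and ``decode_flags`` are illustrative names, and any bit the subset does
not cover is returned separately rather than interpreted.

```python
# A subset of the i_flags table; extend with further rows as needed.
FLAG_NAMES = {
    0x8: "EXT4_SYNC_FL",
    0x10: "EXT4_IMMUTABLE_FL",
    0x20: "EXT4_APPEND_FL",
    0x40: "EXT4_NODUMP_FL",
    0x80: "EXT4_NOATIME_FL",
    0x800: "EXT4_ENCRYPT_FL",
    0x1000: "EXT4_INDEX_FL",
    0x4000: "EXT4_JOURNAL_DATA_FL",
    0x40000: "EXT4_HUGE_FILE_FL",
    0x80000: "EXT4_EXTENTS_FL",
    0x100000: "EXT4_VERITY_FL",
    0x200000: "EXT4_EA_INODE_FL",
    0x10000000: "EXT4_INLINE_DATA_FL",
    0x20000000: "EXT4_PROJINHERIT_FL",
}


def decode_flags(i_flags):
    """Return (names of set flags in ascending bit order, bits not in FLAG_NAMES)."""
    names = [name for bit, name in sorted(FLAG_NAMES.items()) if i_flags & bit]
    covered = 0
    for bit in FLAG_NAMES:
        covered |= bit
    return names, i_flags & ~covered
```
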
.. _i_osd1:

The ``osd1`` field has multiple meanings depending on the creator:

Linux:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - l\_i\_version
     - Inode version. However, if the EA\_INODE inode flag is set, this inode
       stores an extended attribute value and this field contains the upper 32
       bits of the attribute value's reference count.

Hurd:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - h\_i\_translator
     - ??

Masix:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - m\_i\_reserved
     - ??

.. _i_osd2:

The ``osd2`` field has multiple meanings depending on the filesystem creator:

Linux:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - l\_i\_blocks\_high
     - Upper 16-bits of the block count. Please see the note attached to
       i\_blocks\_lo.
   * - 0x2
     - \_\_le16
     - l\_i\_file\_acl\_high
     - Upper 16-bits of the extended attribute block (historically, the file
       ACL location). See the Extended Attributes section below.
   * - 0x4
     - \_\_le16
     - l\_i\_uid\_high
     - Upper 16-bits of the Owner UID.
   * - 0x6
     - \_\_le16
     - l\_i\_gid\_high
     - Upper 16-bits of the GID.
   * - 0x8
     - \_\_le16
     - l\_i\_checksum\_lo
     - Lower 16-bits of the inode checksum.
   * - 0xA
     - \_\_le16
     - l\_i\_reserved
     - Unused.

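Several values are split between the base inode and the Linux ``osd2``
area, and the halves differ in width. The following sketch shows how the
pieces fit together; the function names are hypothetical, and the inputs
are assumed to have already been converted from little-endian.

```python
def linux_owner_ids(i_uid, l_i_uid_high, i_gid, l_i_gid_high):
    """Combine the 16-bit halves into 32-bit UID and GID values."""
    uid = i_uid | (l_i_uid_high << 16)
    gid = i_gid | (l_i_gid_high << 16)
    return uid, gid


def xattr_block(i_file_acl_lo, l_i_file_acl_high):
    """48-bit extended attribute block number: 32 low bits plus 16 high bits."""
    return i_file_acl_lo | (l_i_file_acl_high << 32)


def stored_checksum(l_i_checksum_lo, i_checksum_hi):
    """32-bit stored inode checksum assembled from its two 16-bit halves."""
    return l_i_checksum_lo | (i_checksum_hi << 16)
```
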
Hurd:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - h\_i\_reserved1
     - ??
   * - 0x2
     - \_\_u16
     - h\_i\_mode\_high
     - Upper 16-bits of the file mode.
   * - 0x4
     - \_\_le16
     - h\_i\_uid\_high
     - Upper 16-bits of the Owner UID.
   * - 0x6
     - \_\_le16
     - h\_i\_gid\_high
     - Upper 16-bits of the GID.
   * - 0x8
     - \_\_u32
     - h\_i\_author
     - Author code?

Masix:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - h\_i\_reserved1
     - ??
   * - 0x2
     - \_\_u16
     - m\_i\_file\_acl\_high
     - Upper 16-bits of the extended attribute block (historically, the file
       ACL location).
   * - 0x4
     - \_\_u32
     - m\_i\_reserved2[2]
     - ??

Inode Size
~~~~~~~~~~

In ext2 and ext3, the inode structure size was fixed at 128 bytes
(``EXT2_GOOD_OLD_INODE_SIZE``) and each inode had a disk record size of
128 bytes. Starting with ext4, it is possible to allocate a larger
on-disk inode at format time for all inodes in the filesystem to provide
space beyond the end of the original ext2 inode. The on-disk inode
record size is recorded in the superblock as ``s_inode_size``. The
number of bytes actually used by struct ext4\_inode beyond the original
128-byte ext2 inode is recorded in the ``i_extra_isize`` field for each
inode, which allows struct ext4\_inode to grow for a new kernel without
having to upgrade all of the on-disk inodes. Access to fields beyond
EXT2\_GOOD\_OLD\_INODE\_SIZE should be verified to be within
``i_extra_isize``. By default, ext4 inode records are 256 bytes, and (as
of August 2019) the inode structure is 160 bytes
(``i_extra_isize = 32``). The extra space between the end of the inode
structure and the end of the inode record can be used to store extended
attributes. Each inode record can be as large as the filesystem block
size, though this is not terribly efficient.

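The bounds check on fields beyond the first 128 bytes can be sketched as
follows. ``field_fits`` is a hypothetical helper; the offsets and sizes
passed to it come from the inode table above.

```python
EXT2_GOOD_OLD_INODE_SIZE = 128


def field_fits(field_offset, field_size, i_extra_isize):
    """True if the whole field lies within the bytes this inode actually uses."""
    end = field_offset + field_size
    return end <= EXT2_GOOD_OLD_INODE_SIZE + i_extra_isize
```

With ``i_extra_isize = 32``, ``i_projid`` (offset 0x9C, 4 bytes) ends
exactly at byte 160 and is therefore usable; with ``i_extra_isize = 28``
it is not, while ``i_crtime`` (offset 0x90) still is.
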
Finding an Inode
~~~~~~~~~~~~~~~~

Each block group contains ``sb->s_inodes_per_group`` inodes. Because
inode 0 is defined not to exist, this formula can be used to find the
block group that an inode lives in:
``bg = (inode_num - 1) / sb->s_inodes_per_group``. The particular inode
can be found within the block group's inode table at
``index = (inode_num - 1) % sb->s_inodes_per_group``. To get the byte
address within the inode table, use
``offset = index * sb->s_inode_size``.

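The three formulas can be combined into one small helper. This is a
sketch only: ``locate_inode`` is a hypothetical name, and the
``s_inodes_per_group`` and ``s_inode_size`` values are assumed to have
been read from the superblock already.

```python
def locate_inode(inode_num, s_inodes_per_group, s_inode_size):
    """Return (block group, index in that group's table, byte offset in the table)."""
    if inode_num < 1:
        raise ValueError("inode numbers start at 1; there is no inode 0")
    bg = (inode_num - 1) // s_inodes_per_group
    index = (inode_num - 1) % s_inodes_per_group
    offset = index * s_inode_size
    return bg, index, offset
```

For example, with 8192 inodes per group and 256-byte inode records,
inode 8193 is the first entry (index 0, offset 0) of block group 1.
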
Inode Timestamps
~~~~~~~~~~~~~~~~

Four timestamps are recorded in the lower 128 bytes of the inode
structure: inode change time (ctime), access time (atime), data
modification time (mtime), and deletion time (dtime). The four fields
are 32-bit signed integers that represent seconds since the Unix epoch
(1970-01-01 00:00:00 GMT), which means that the fields will overflow in
January 2038. For inodes that are not linked from any directory but are
still open (orphan inodes), the dtime field is overloaded for use with
the orphan list. The superblock field ``s_last_orphan`` points to the
first inode in the orphan list; dtime is then the number of the next
orphaned inode, or zero if there are no more orphans.

If the inode record size ``sb->s_inode_size`` is larger than 128
bytes and the ``i_extra_isize`` field is large enough to encompass the
respective ``i_[cma]time_extra`` field, the ctime, atime, and mtime
inode fields are widened to 64 bits. Within this “extra” 32-bit field,
the lower two bits are used to extend the 32-bit seconds field to be 34
bits wide; the upper 30 bits are used to provide nanosecond timestamp
accuracy. Therefore, timestamps should not overflow until May 2446.
dtime was not widened. There is also a fifth timestamp to record inode
creation time (crtime); this field is 64 bits wide and decoded in the
same manner as 64-bit [cma]time. Neither crtime nor dtime is accessible
through the regular stat() interface, though debugfs will report them.

The decoded seconds value is the signed 32-bit time value plus
(2^32 \* (extra epoch bits)). In other words:

.. list-table::
   :widths: 20 20 20 20 20
   :header-rows: 1

   * - Extra epoch bits
     - MSB of 32-bit time
     - Adjustment for signed 32-bit to 64-bit tv\_sec
     - Decoded 64-bit tv\_sec
     - Valid time range
   * - 0 0
     - 1
     - 0
     - ``-0x80000000 - -0x00000001``
     - 1901-12-13 to 1969-12-31
   * - 0 0
     - 0
     - 0
     - ``0x000000000 - 0x07fffffff``
     - 1970-01-01 to 2038-01-19
   * - 0 1
     - 1
     - 0x100000000
     - ``0x080000000 - 0x0ffffffff``
     - 2038-01-19 to 2106-02-07
   * - 0 1
     - 0
     - 0x100000000
     - ``0x100000000 - 0x17fffffff``
     - 2106-02-07 to 2174-02-25
   * - 1 0
     - 1
     - 0x200000000
     - ``0x180000000 - 0x1ffffffff``
     - 2174-02-25 to 2242-03-16
   * - 1 0
     - 0
     - 0x200000000
     - ``0x200000000 - 0x27fffffff``
     - 2242-03-16 to 2310-04-04
   * - 1 1
     - 1
     - 0x300000000
     - ``0x280000000 - 0x2ffffffff``
     - 2310-04-04 to 2378-04-22
   * - 1 1
     - 0
     - 0x300000000
     - ``0x300000000 - 0x37fffffff``
     - 2378-04-22 to 2446-05-10

This is a somewhat odd encoding since there are effectively seven times
as many positive values as negative values. There have also been
long-standing bugs decoding and encoding dates beyond 2038, which don't
seem to be fixed as of kernel 3.12 and e2fsprogs 1.42.8. 64-bit kernels
incorrectly use the extra epoch bits 1,1 for dates between 1901 and
1970. At some point the kernel will be fixed and e2fsck will fix this
situation, assuming that it is run before 2310.
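
The decoding rule in the table above can be written out as a short
sketch. ``decode_timestamp`` is a hypothetical helper that follows the
documented encoding (signed 32-bit seconds plus 2^32 times the two epoch
bits, with nanoseconds in the upper 30 bits); it does not reproduce the
buggy behaviour mentioned above, and both inputs are assumed to be raw
unsigned 32-bit values already converted from little-endian.

```python
def decode_timestamp(seconds_field, extra_field):
    """Return (64-bit seconds since the epoch, nanoseconds)."""
    # Interpret the base field as a signed 32-bit integer.
    if seconds_field & 0x80000000:
        seconds = seconds_field - (1 << 32)
    else:
        seconds = seconds_field
    epoch_bits = extra_field & 0x3   # lower two bits extend the seconds
    nanoseconds = extra_field >> 2   # upper 30 bits hold the nanoseconds
    return seconds + (epoch_bits << 32), nanoseconds
```

For example, a base field of ``0x80000000`` with epoch bits ``0 1``
decodes to ``0x080000000``, the first second of the 2038-01-19 to
2106-02-07 row.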