.. SPDX-License-Identifier: GPL-2.0

The Contents of inode.i\_block
------------------------------

Depending on the type of file an inode describes, the 60 bytes of
storage in ``inode.i_block`` can be used in different ways. In general,
regular files and directories will use it for file block indexing
information, and special files will use it for special purposes.

Symbolic Links
~~~~~~~~~~~~~~

The target of a symbolic link will be stored in this field if the target
string is less than 60 bytes long. Otherwise, either extents or block
maps will be used to allocate data blocks to store the link target.

Direct/Indirect Block Addressing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In ext2/3, file block numbers were mapped to logical block numbers by
means of an (up to) three-level 1-1 block map. To find the logical block
that stores a particular file block, the code would navigate through
this increasingly complicated structure. Notice that there is neither a
magic number nor a checksum to provide any level of confidence that the
block isn't full of garbage.

.. ifconfig:: builder != 'latex'

   .. include:: blockmap.rst

.. ifconfig:: builder == 'latex'

   [Table omitted because LaTeX doesn't support nested tables.]

Note that with this block mapping scheme, it is necessary to fill out a
lot of mapping data even for a large contiguous file! This inefficiency
led to the creation of the extent mapping scheme, discussed below.

Notice also that a file using this mapping scheme cannot be placed
higher than 2^32 blocks, because each map entry is a 32-bit block
number.

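As a rough illustration of the cost of this scheme, the sketch below
counts how many indirect blocks must be read to reach a given file
block. It assumes the conventional ext2/3 layout of twelve direct slots
in ``i_block`` followed by one indirect, one double-indirect, and one
triple-indirect slot, with 4-byte map entries. ``blockmap_levels()`` is
a hypothetical helper for illustration, not a kernel function.

.. code-block:: c

   #include <stdint.h>

   /*
    * Return the number of indirect blocks that must be read to reach
    * file block `lblk`: 0 (direct), 1, 2 or 3, or -1 if the block lies
    * beyond what a triple-indirect map can address.
    * `block_size` is the filesystem block size in bytes.
    */
   static int blockmap_levels(uint64_t lblk, uint32_t block_size)
   {
       const uint64_t direct = 12;            /* assumed: i_block[0..11] */
       const uint64_t ptrs = block_size / 4;  /* 32-bit entries per block */
       uint64_t span = 1;                     /* blocks covered per level */
       uint64_t first = direct;               /* first block of the level */
       int level;

       if (lblk < direct)
           return 0;
       for (level = 1; level <= 3; level++) {
           span *= ptrs;
           if (lblk - first < span)
               return level;
           first += span;
       }
       return -1;
   }

With 4KiB blocks, file blocks 0-11 need no indirect block, blocks
12-1035 need one, and every block after that needs two or three.
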
Extent Tree
~~~~~~~~~~~

In ext4, the file to logical block map has been replaced with an extent
tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
requires an indirect block to map all 1,000 entries; with extents, the
mapping is reduced to a single ``struct ext4_extent`` with
``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
very large files with a single extent, at a considerable reduction in
metadata block use, and some improvement in disk efficiency. The inode
must have the extents flag (0x80000) set for this feature to be in use.

Extents are arranged as a tree. Each node of the tree begins with a
``struct ext4_extent_header``. If the node is an interior node
(``eh.eh_depth > 0``), the header is followed by ``eh.eh_entries``
instances of ``struct ext4_extent_idx``; each of these index entries
points to a block containing more nodes in the extent tree. If the node
is a leaf node (``eh.eh_depth == 0``), then the header is followed by
``eh.eh_entries`` instances of ``struct ext4_extent``; these instances
point to the file's data blocks. The root node of the extent tree is
stored in ``inode.i_block``. Its 60 bytes hold one 12-byte header and
four 12-byte entries, which allows for the first four extents to be
recorded without the use of extra metadata blocks.

The extent tree header is recorded in ``struct ext4_extent_header``,
which is 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - eh\_magic
     - Magic number, 0xF30A.
   * - 0x2
     - \_\_le16
     - eh\_entries
     - Number of valid entries following the header.
   * - 0x4
     - \_\_le16
     - eh\_max
     - Maximum number of entries that could follow the header.
   * - 0x6
     - \_\_le16
     - eh\_depth
     - Depth of this extent node in the extent tree. 0 = this extent node
       points to data blocks; otherwise, this extent node points to other
       extent nodes. The extent tree can be at most 5 levels deep: a logical
       block number can be at most ``2^32``, and the smallest ``n`` that
       satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
   * - 0x8
     - \_\_le32
     - eh\_generation
     - Generation of the tree. (Used by Lustre, but not standard ext4).

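The header layout and the depth bound above can be sketched in C as
follows. This is a minimal illustration that reads the fields from a raw
byte buffer; ``struct extent_header``, ``parse_extent_header()`` and
``max_tree_depth()`` are hypothetical names, and the sanity checks shown
are assumptions about what a careful reader would verify, not a
description of the kernel's own validation.

.. code-block:: c

   #include <stdint.h>

   /* Read little-endian fields from a raw on-disk buffer. */
   static uint16_t get_le16(const uint8_t *p)
   {
       return (uint16_t)(p[0] | (p[1] << 8));
   }

   static uint32_t get_le32(const uint8_t *p)
   {
       return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
              ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
   }

   /* Host-endian copy of the 12-byte header (hypothetical type). */
   struct extent_header {
       uint16_t magic;
       uint16_t entries;
       uint16_t max;
       uint16_t depth;
       uint32_t generation;
   };

   /* Decode a header; return 0 if it looks sane, -1 otherwise. */
   static int parse_extent_header(const uint8_t *node,
                                  struct extent_header *eh)
   {
       eh->magic = get_le16(node + 0x0);
       eh->entries = get_le16(node + 0x2);
       eh->max = get_le16(node + 0x4);
       eh->depth = get_le16(node + 0x6);
       eh->generation = get_le32(node + 0x8);

       if (eh->magic != 0xF30A)
           return -1;
       if (eh->entries > eh->max)   /* assumed check */
           return -1;
       return 0;
   }

   /*
    * Smallest n with 4 * ((block_size - 12) / 12)^n >= 2^32.
    * Assumes block_size >= 1024 so that each node holds many entries.
    */
   static int max_tree_depth(uint32_t block_size)
   {
       uint64_t per_block = (block_size - 12) / 12;
       uint64_t reach = 4;          /* entries in the in-inode root */
       int n = 0;

       while (reach < (1ULL << 32)) {
           reach *= per_block;
           n++;
       }
       return n;
   }

With 1KiB blocks each node holds 84 entries and ``max_tree_depth()``
returns 5, matching the bound in the table; with 4KiB blocks each node
holds 340 entries and the loop stops at 4.
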
Internal nodes of the extent tree, also known as index nodes, are
recorded as ``struct ext4_extent_idx``, and are 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - ei\_block
     - This index node covers file blocks from 'block' onward.
   * - 0x4
     - \_\_le32
     - ei\_leaf\_lo
     - Lower 32-bits of the block number of the extent node that is the next
       level lower in the tree. The tree node pointed to can be either another
       internal node or a leaf node, described below.
   * - 0x8
     - \_\_le16
     - ei\_leaf\_hi
     - Upper 16-bits of the previous field.
   * - 0xA
     - \_\_u16
     - ei\_unused
     -

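To show how the split ``ei_leaf_lo``/``ei_leaf_hi`` fields combine into
one 48-bit block number, here is a small sketch that decodes an index
entry from a raw buffer. ``struct extent_idx`` and
``parse_extent_idx()`` are hypothetical names used only for this
example.

.. code-block:: c

   #include <stdint.h>

   static uint16_t get_le16(const uint8_t *p)
   {
       return (uint16_t)(p[0] | (p[1] << 8));
   }

   static uint32_t get_le32(const uint8_t *p)
   {
       return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
              ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
   }

   /* Host-endian view of one index entry (hypothetical type). */
   struct extent_idx {
       uint32_t block;   /* first file block covered (ei_block) */
       uint64_t leaf;    /* 48-bit block number of the child node */
   };

   static void parse_extent_idx(const uint8_t *p, struct extent_idx *ix)
   {
       uint32_t lo = get_le32(p + 0x4);   /* ei_leaf_lo */
       uint16_t hi = get_le16(p + 0x8);   /* ei_leaf_hi */

       ix->block = get_le32(p + 0x0);
       ix->leaf = ((uint64_t)hi << 32) | lo;
   }

The two unused bytes at offset 0xA are simply skipped.
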
Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
and are also 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - ee\_block
     - First file block number that this extent covers.
   * - 0x4
     - \_\_le16
     - ee\_len
     - Number of blocks covered by extent. If the value of this field is <=
       32768, the extent is initialized. If the value of the field is > 32768,
       the extent is uninitialized and the actual extent length is ``ee_len`` -
       32768. Therefore, the maximum length of an initialized extent is 32768
       blocks, and the maximum length of an uninitialized extent is 32767.
   * - 0x6
     - \_\_le16
     - ee\_start\_hi
     - Upper 16-bits of the block number to which this extent points.
   * - 0x8
     - \_\_le32
     - ee\_start\_lo
     - Lower 32-bits of the block number to which this extent points.

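The ``ee_len`` encoding and the split start address can be illustrated
with the sketch below, which decodes one leaf entry and maps a file
block through it. ``struct extent``, ``parse_extent()`` and
``extent_map()`` are hypothetical names, and the code is a simplified
illustration rather than the kernel's lookup path.

.. code-block:: c

   #include <stdint.h>

   static uint16_t get_le16(const uint8_t *p)
   {
       return (uint16_t)(p[0] | (p[1] << 8));
   }

   static uint32_t get_le32(const uint8_t *p)
   {
       return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
              ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
   }

   /* Host-endian view of one leaf entry (hypothetical type). */
   struct extent {
       uint32_t block;    /* first file block covered (ee_block) */
       uint32_t len;      /* actual length in blocks */
       int uninit;        /* nonzero if the extent is uninitialized */
       uint64_t start;    /* 48-bit block number of the first block */
   };

   static void parse_extent(const uint8_t *p, struct extent *ex)
   {
       uint32_t raw_len = get_le16(p + 0x4);          /* ee_len */

       ex->block = get_le32(p + 0x0);
       ex->uninit = raw_len > 32768;
       ex->len = ex->uninit ? raw_len - 32768 : raw_len;
       ex->start = ((uint64_t)get_le16(p + 0x6) << 32) |  /* ee_start_hi */
                   get_le32(p + 0x8);                     /* ee_start_lo */
   }

   /*
    * If file block `lblk` falls inside the extent, store the block it
    * maps to in *pblk and return 1; otherwise return 0.
    */
   static int extent_map(const struct extent *ex, uint32_t lblk,
                         uint64_t *pblk)
   {
       if (lblk < ex->block || lblk - ex->block >= ex->len)
           return 0;
       *pblk = ex->start + (lblk - ex->block);
       return 1;
   }

A caller would still need to check ``uninit`` before trusting the data
in the mapped block, since an uninitialized extent is allocated but not
yet written.
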
Prior to the introduction of metadata checksums, the extent header +
extent entries always left at least 4 bytes of unallocated space at the
end of each extent tree data block (because (2^x % 12) >= 4). Therefore,
the 32-bit checksum is inserted into this space. The 4 extents in the
inode do not need checksumming, since the inode is already checksummed.
The checksum is calculated against the FS UUID, the inode number, the
inode generation, and the entire extent block leading up to (but not
including) the checksum itself.

``struct ext4_extent_tail`` is 4 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - eb\_checksum
     - Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)

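The arithmetic behind the spare space can be sketched as follows. The
example assumes that a full extent block holds
``(blocksize - 12) / 12`` entries and that the tail sits immediately
after the last possible entry; ``extent_block_max_entries()`` and
``extent_tail_offset()`` are hypothetical helper names. The checksum
computation itself is not shown.

.. code-block:: c

   #include <stdint.h>

   /* Entries that fit after the 12-byte header in one extent block. */
   static uint32_t extent_block_max_entries(uint32_t block_size)
   {
       return (block_size - 12) / 12;
   }

   /* Byte offset of the tail: header plus eh_max 12-byte entries. */
   static uint32_t extent_tail_offset(uint32_t eh_max)
   {
       return 12 + 12 * eh_max;
   }

For a 4KiB block this gives 340 entries and a tail offset of 4092,
leaving exactly 4 bytes for the checksum; a 2KiB block holds 169
entries with the tail at offset 2040, leaving 8 bytes.
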
Inline Data
~~~~~~~~~~~

If the inline data feature is enabled for the filesystem and the flag is
set for the inode, it is possible that the first 60 bytes of the file
data are stored here.