.. SPDX-License-Identifier: GPL-2.0

Extended Attributes
-------------------

Extended attributes (xattrs) are typically stored in a separate data
block on the disk and referenced from inodes via ``inode.i_file_acl*``.
The first use of extended attributes seems to have been for storing file
ACLs and other security data (SELinux). With the ``user_xattr`` mount
option it is possible for users to store extended attributes so long as
all attribute names begin with “user”; this restriction seems to have
disappeared as of Linux 3.0.

There are two places where extended attributes can be found. The first
place is between the end of each inode entry and the beginning of the
next inode entry. For example, if inode.i\_extra\_isize = 28 and
sb.inode\_size = 256, then there are 256 - (128 + 28) = 100 bytes
available for in-inode extended attribute storage. The second place
where extended attributes can be found is in the block pointed to by
``inode.i_file_acl``. As of Linux 3.11, it is not possible for this
block to contain a pointer to a second extended attribute block (or even
the remaining blocks of a cluster). In theory it is possible for each
attribute's value to be stored in a separate data block, though as of
Linux 3.11 the code does not permit this.

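The in-inode space calculation above can be sketched in a few lines of
Python. This is only an illustration of the arithmetic; the function name
and the ``GOOD_OLD_INODE_SIZE`` constant are hypothetical names chosen
here, with 128 taken from the formula in the text.

```python
# Size of the fixed part of an on-disk inode, as used in the formula above.
GOOD_OLD_INODE_SIZE = 128


def in_inode_xattr_space(inode_size: int, extra_isize: int) -> int:
    """Bytes between the end of the inode fields and the next inode."""
    used = GOOD_OLD_INODE_SIZE + extra_isize
    if used > inode_size:
        raise ValueError("i_extra_isize does not fit inside the inode")
    return inode_size - used


print(in_inode_xattr_space(256, 28))  # 100, matching the example in the text
```

With 128-byte inodes and no extra fields the result is zero, so such
filesystems can only keep attributes in the external block.
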
Keys are generally assumed to be ASCIIZ strings, whereas values can be
strings or binary data.

Extended attributes, when stored after the inode, have a header
``ext4_xattr_ibody_header`` that is 4 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - h\_magic
     - Magic number for identification, 0xEA020000. This value is set by the
       Linux driver, though e2fsprogs doesn't seem to check it(?)

An extended attribute block begins with a
``struct ext4_xattr_header``, which is 32 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - h\_magic
     - Magic number for identification, 0xEA020000.
   * - 0x4
     - \_\_le32
     - h\_refcount
     - Reference count.
   * - 0x8
     - \_\_le32
     - h\_blocks
     - Number of disk blocks used.
   * - 0xC
     - \_\_le32
     - h\_hash
     - Hash value of all attributes.
   * - 0x10
     - \_\_le32
     - h\_checksum
     - Checksum of the extended attribute block.
   * - 0x14
     - \_\_u32
     - h\_reserved[3]
     - Zero.

The checksum is calculated against the FS UUID, the 64-bit block number
of the extended attribute block, and the entire block (header +
entries).

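As a rough sketch, the block header table above can be decoded with
Python's ``struct`` module. The function name and the dictionary layout are
hypothetical; the reserved area is simply skipped so that the structure
comes out at the stated 32 bytes. Checksum verification is left out
because it needs the filesystem UUID and block number, which are not part
of the block itself.

```python
import struct

EXT4_XATTR_MAGIC = 0xEA020000
# Five little-endian 32-bit fields, then 12 reserved bytes: 32 bytes total.
HEADER = struct.Struct("<5I12x")


def parse_xattr_block_header(block: bytes) -> dict:
    """Decode the header at the start of an extended attribute block."""
    magic, refcount, blocks, hash_, checksum = HEADER.unpack_from(block, 0)
    if magic != EXT4_XATTR_MAGIC:
        raise ValueError(f"bad xattr magic: {magic:#x}")
    return {
        "h_magic": magic,
        "h_refcount": refcount,
        "h_blocks": blocks,
        "h_hash": hash_,
        "h_checksum": checksum,
    }


# Synthetic header, for illustration only (not taken from a real filesystem).
sample = HEADER.pack(EXT4_XATTR_MAGIC, 1, 1, 0x1234, 0)
print(parse_xattr_block_header(sample))
```
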
Following the ``struct ext4_xattr_header`` or
``struct ext4_xattr_ibody_header`` is an array of
``struct ext4_xattr_entry``; each of these entries is at least 16 bytes
long. When stored in an external block, the ``struct ext4_xattr_entry``
entries must be stored in sorted order. The sort order is
``e_name_index``, then ``e_name_len``, and finally ``e_name``.
Attributes stored inside an inode do not need to be stored in sorted order.

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - \_\_u8
     - e\_name\_len
     - Length of name.
   * - 0x1
     - \_\_u8
     - e\_name\_index
     - Attribute name index. There is a discussion of this below.
   * - 0x2
     - \_\_le16
     - e\_value\_offs
     - Location of this attribute's value on the disk block where it is stored.
       Multiple attributes can share the same value. For an inode attribute
       this value is relative to the start of the first entry; for a block this
       value is relative to the start of the block (i.e. the header).
   * - 0x4
     - \_\_le32
     - e\_value\_inum
     - The inode where the value is stored. Zero indicates the value is in the
       same block as this entry. This field is only used if the
       INCOMPAT\_EA\_INODE feature is enabled.
   * - 0x8
     - \_\_le32
     - e\_value\_size
     - Length of attribute value.
   * - 0xC
     - \_\_le32
     - e\_hash
     - Hash value of attribute name and attribute value. The kernel doesn't
       update the hash for in-inode attributes, so for that case this value
       must be zero, because e2fsck validates any non-zero hash regardless of
       where the xattr lives.
   * - 0x10
     - char
     - e\_name[e\_name\_len]
     - Attribute name. Does not include trailing NUL.

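The entry layout and the sort order for external blocks can be
illustrated with a small Python sketch. ``parse_entry`` and
``block_sort_key`` are hypothetical helper names, and the sample bytes are
synthetic.

```python
import struct

# e_name_len, e_name_index, e_value_offs, e_value_inum, e_value_size, e_hash
ENTRY = struct.Struct("<BBHIII")  # 16-byte fixed part of the entry


def parse_entry(buf: bytes, offset: int = 0) -> dict:
    """Decode one ext4_xattr_entry, including its variable-length name."""
    name_len, name_index, value_offs, value_inum, value_size, hash_ = \
        ENTRY.unpack_from(buf, offset)
    name_start = offset + ENTRY.size
    name = bytes(buf[name_start:name_start + name_len])
    if len(name) != name_len:
        raise ValueError("entry name runs past the end of the buffer")
    return {
        "e_name_len": name_len,
        "e_name_index": name_index,
        "e_value_offs": value_offs,
        "e_value_inum": value_inum,
        "e_value_size": value_size,
        "e_hash": hash_,
        "e_name": name,
    }


def block_sort_key(entry: dict) -> tuple:
    """Sort order for external blocks: index, then name length, then name."""
    return (entry["e_name_index"], entry["e_name_len"], entry["e_name"])


raw = [ENTRY.pack(len(n), idx, 0, 0, 0, 0) + n
       for idx, n in [(6, b"selinux"), (1, b"fubar"), (1, b"zz")]]
entries = sorted((parse_entry(r) for r in raw), key=block_sort_key)
print([e["e_name"] for e in entries])
```

Note that a shorter name sorts before a longer one within the same name
index, regardless of the bytes in the name.
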
Attribute values can follow the end of the entry table. There appears to
be a requirement that they be aligned to 4-byte boundaries. The values
are stored starting at the end of the block and grow towards the
xattr\_header/xattr\_entry table. When the two collide, the overflow is
put into a separate disk block. If the disk block fills up, the
filesystem returns -ENOSPC.

The first four fields of the ``ext4_xattr_entry`` are set to zero to
mark the end of the key list.

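Putting the pieces together, a reader of an external attribute block might
walk the entry list roughly as follows. This is a sketch under two
assumptions that the text does not state: entries are assumed to be padded
to a 4-byte boundary, and the end-of-list test only looks at the first 32
bits of an entry (a terminator as described above also satisfies that
test). The function name and the sample block are made up for
illustration.

```python
import struct

ENTRY = struct.Struct("<BBHIII")  # 16-byte fixed part of ext4_xattr_entry
HEADER_SIZE = 32                  # struct ext4_xattr_header


def walk_block_entries(block: bytes):
    """Yield (name_index, name, value) for each attribute in a block."""
    pos = HEADER_SIZE
    while pos + 4 <= len(block):
        # Assumption: an all-zero first 32 bits marks the end of the list.
        if block[pos:pos + 4] == b"\x00" * 4:
            return
        if pos + ENTRY.size > len(block):
            raise ValueError("truncated entry")
        name_len, name_index, value_offs, value_inum, value_size, _hash = \
            ENTRY.unpack_from(block, pos)
        name = block[pos + 16:pos + 16 + name_len]
        if value_inum == 0:
            # For a block, e_value_offs is relative to the start of the block.
            value = block[value_offs:value_offs + value_size]
        else:
            value = None  # value is stored in another inode
        yield name_index, name, value
        # Assumption: each entry is padded to a 4-byte boundary.
        pos += (16 + name_len + 3) & ~3
    raise ValueError("entry list is not terminated")


# Synthetic 128-byte "block": one entry, value stored at the end.
block = bytearray(128)
struct.pack_into("<I", block, 0, 0xEA020000)
block[124:127] = b"bar"
ENTRY.pack_into(block, 32, 5, 1, 124, 0, 3, 0)
block[48:53] = b"fubar"
print(list(walk_block_entries(bytes(block))))  # [(1, b'fubar', b'bar')]
```

In-inode attributes would need a different base for ``e_value_offs``,
since there the offset is relative to the start of the first entry.
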
Attribute Name Indices
~~~~~~~~~~~~~~~~~~~~~~

Logically speaking, extended attributes are a series of key=value pairs.
The keys are assumed to be NUL-terminated strings. To reduce the amount
of on-disk space that the keys consume, the beginning of the key string
is matched against the attribute name index. If a match is found, the
attribute name index field is set, and the matching string is removed from
the key name. Here is a map of name index values to key prefixes:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Name Index
     - Key Prefix
   * - 0
     - (no prefix)
   * - 1
     - “user.”
   * - 2
     - “system.posix\_acl\_access”
   * - 3
     - “system.posix\_acl\_default”
   * - 4
     - “trusted.”
   * - 6
     - “security.”
   * - 7
     - “system.” (inline\_data only?)
   * - 8
     - “system.richacl” (SuSE kernels only?)

For example, if the attribute key is “user.fubar”, the attribute name
index is set to 1 and the “fubar” name is recorded on disk.

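The prefix table lends itself to a simple lookup in both directions. The
sketch below is illustrative only: the helper names are hypothetical, and
choosing the longest matching prefix when several match (for example
“system.” and “system.posix\_acl\_access”) is an assumption rather than
something the text specifies.

```python
# Name index -> key prefix, copied from the table above.
NAME_INDEX_PREFIXES = {
    0: "",
    1: "user.",
    2: "system.posix_acl_access",
    3: "system.posix_acl_default",
    4: "trusted.",
    6: "security.",
    7: "system.",
    8: "system.richacl",
}


def full_key(name_index: int, stored_name: bytes) -> str:
    """Rebuild the full attribute key from what is recorded on disk."""
    return NAME_INDEX_PREFIXES[name_index] + stored_name.decode("ascii")


def split_key(key: str) -> tuple:
    """Split a full key into (name index, name to record on disk)."""
    best = 0
    for index, prefix in NAME_INDEX_PREFIXES.items():
        # Assumption: prefer the longest prefix that matches.
        if key.startswith(prefix) and len(prefix) > len(NAME_INDEX_PREFIXES[best]):
            best = index
    return best, key[len(NAME_INDEX_PREFIXES[best]):].encode("ascii")


print(split_key("user.fubar"))  # (1, b'fubar')
print(full_key(1, b"fubar"))    # user.fubar
```
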
POSIX ACLs
~~~~~~~~~~

POSIX ACLs are stored in a reduced version of the Linux kernel's (and
libacl's) internal ACL format. The key difference is that the version
number is different (1) and the ``e_id`` field is only stored for named
user and group ACLs.