Vladimir Oltean | 554aae3 | 2019-05-02 23:23:29 +0300

================================================
Generic bitfield packing and unpacking functions
================================================

Problem statement
-----------------

When working with hardware, one has to choose between several approaches to
interfacing with it.
One can memory-map a pointer to a carefully crafted struct over the hardware
device's memory region, and access its fields as struct members (potentially
declared as bitfields). But writing code this way makes it less portable,
due to potential endianness mismatches between the CPU and the hardware device.
Additionally, one has to pay close attention when translating register
definitions from the hardware documentation into bitfield indices for the
structs. Also, some hardware (typically networking equipment) tends to group
its register fields in ways that violate any reasonable word boundaries
(sometimes even 64-bit ones). This creates the inconvenience of having to
define "high" and "low" portions of register fields within the struct.
A more robust alternative to struct field definitions is to extract the
required fields by shifting the appropriate number of bits. But this still
does not protect against endianness mismatches, unless all memory accesses
are performed byte by byte. Also, the code can easily get cluttered, and the
high-level idea might get lost among the many bit shifts required.
Many drivers take the bit-shifting approach and then attempt to reduce the
clutter with tailored macros, but more often than not these macros take
shortcuts that still prevent the code from being truly portable.
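
As an illustration of the bit-shifting approach, the sketch below reads a
register byte by byte and then shifts out one field. The 4-byte register
layout, the 12-bit field at bits 19..8 and both function names are
hypothetical, chosen only for this example.

```c
#include <stdint.h>

/*
 * Hypothetical register: 4 bytes, most significant byte first in memory.
 * Reading it one byte at a time gives the same result on any CPU.
 */
static uint32_t read_be32_bytewise(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Hypothetical 12-bit field occupying bits 19..8 of that register. */
static uint32_t get_example_field(const uint8_t *buf)
{
	return (read_be32_bytewise(buf) >> 8) & 0xfff;
}
```

Even for this single field, the shift amount and the mask have to be kept in
sync with the hardware documentation by hand.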

The solution
------------

This API deals with 2 basic operations:
  - Packing a CPU-usable number into a memory buffer (with hardware
    constraints/quirks)
  - Unpacking a memory buffer (which has hardware constraints/quirks)
    into a CPU-usable number.

The API offers an abstraction over said hardware constraints and quirks and
over CPU endianness, and therefore over possible mismatches between the
two.

The basic unit of these API functions is the u64. From the CPU's
perspective, bit 63 always means bit offset 7 of byte 7, albeit only
logically. The question is: where do we lay this bit out in memory?

The following examples cover the memory layout of a packed u64 field.
The byte offsets in the packed buffer are always implicitly 0, 1, ... 7.
What the examples show is where the logical bytes and bits sit.

1. Normally (no quirks), we would do it like this:

63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
7                       6                       5                       4
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
3                       2                       1                       0

That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the
LSByte (0) of the u64 sits at memory offset 7.
This corresponds to what most folks would regard as "big endian", where
bit i corresponds to the number 2^i. This is also referred to in the code
comments as "logical" notation.
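
A minimal sketch of layout 1, assuming a plain 8-byte buffer: with no quirks,
packing a full u64 amounts to storing it most significant byte first. The
function name store_u64_no_quirks() is made up for this example and is not
part of the API.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the packing API. */
static void store_u64_no_quirks(uint8_t buf[8], uint64_t val)
{
	int offset;

	/* Memory offset 0 receives logical byte 7, offset 7 receives byte 0. */
	for (offset = 0; offset < 8; offset++)
		buf[offset] = (uint8_t)(val >> (8 * (7 - offset)));
}
```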


2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
7                       6                       5                       4
24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
3                       2                       1                       0

That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.


3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:

39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
4                       5                       6                       7
 7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
0                       1                       2                       3

Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every
byte from each 4-byte word is placed at its mirrored position compared to
the boundary of that word.

4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it
   like this:

32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
4                       5                       6                       7
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0                       1                       2                       3


5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
3                       2                       1                       0
63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
7                       6                       5                       4

In this case the 8 byte memory region is interpreted as follows: first
4 bytes correspond to the least significant 4-byte word, next 4 bytes to
the more significant 4-byte word.


6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like
   this:

24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
3                       2                       1                       0
56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
7                       6                       5                       4


7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like
   this:

 7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
0                       1                       2                       3
39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
4                       5                       6                       7


8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT
   are set, it looks like this:

 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0                       1                       2                       3
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
4                       5                       6                       7


We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.
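
The translation step can be sketched for a single bit of an 8-byte buffer as
follows. This is an illustration derived from the eight layouts above, not the
code in the library: the SKETCH_* flag values, struct sketch_bit_pos and
sketch_locate_bit() are all hypothetical names. Bit 7 of a memory byte is
taken to be the leftmost position in the diagrams.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values are assumptions made for this sketch only. */
#define SKETCH_QUIRK_MSB_ON_THE_RIGHT	(1u << 0)
#define SKETCH_QUIRK_LITTLE_ENDIAN	(1u << 1)
#define SKETCH_QUIRK_LSW32_IS_FIRST	(1u << 2)

struct sketch_bit_pos {
	int byte_offset;	/* 0..7, offset inside the packed buffer */
	int bit_in_byte;	/* 0..7, 7 being the leftmost bit drawn */
};

/* Map a logical bit (0..63) of the u64 to its place in memory. */
static struct sketch_bit_pos sketch_locate_bit(int logical_bit,
						unsigned int quirks)
{
	struct sketch_bit_pos pos;

	assert(logical_bit >= 0 && logical_bit < 64);

	/* Start from the "no quirks" layout: logical byte 7 at offset 0. */
	pos.byte_offset = 7 - logical_bit / 8;
	pos.bit_in_byte = logical_bit % 8;

	/* Mirror the byte inside its 4-byte word. */
	if (quirks & SKETCH_QUIRK_LITTLE_ENDIAN)
		pos.byte_offset = (pos.byte_offset / 4) * 4 +
				  (3 - pos.byte_offset % 4);

	/* Swap the two 4-byte words. */
	if (quirks & SKETCH_QUIRK_LSW32_IS_FIRST)
		pos.byte_offset ^= 4;

	/* Invert the bit offset inside the byte. */
	if (quirks & SKETCH_QUIRK_MSB_ON_THE_RIGHT)
		pos.bit_in_byte = 7 - pos.bit_in_byte;

	return pos;
}
```

For instance, with all three quirks set, logical bit 0 lands in the leftmost
bit of memory offset 0, which matches layout 8.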

Intended use
------------

Drivers that opt to use this API first need to identify which combination of
the above 3 quirks (8 combinations in total) matches what the hardware
documentation describes. Then they should wrap the packing() function,
creating a new xxx_packing() that calls it with the proper QUIRK_* one-hot
bits set.

The packing() function returns an int-encoded error code, which protects the
programmer against incorrect API use. The errors are not expected to occur
during runtime, therefore it is reasonable for xxx_packing() to return void
and simply swallow those errors. Optionally it can dump the stack or print the
error description.
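
The wrapper pattern might look roughly like the sketch below. To keep the
example self-contained, sketch_packing() is a simplified stand-in written for
this document: it handles only an 8-byte buffer and does not reproduce the
prototype of the real packing(). The SKETCH_* names, enum sketch_op and the
choice of QUIRK_LSW32_IS_FIRST for the imaginary device are all assumptions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for this sketch only; values and names are assumptions. */
#define SKETCH_QUIRK_MSB_ON_THE_RIGHT	(1u << 0)
#define SKETCH_QUIRK_LITTLE_ENDIAN	(1u << 1)
#define SKETCH_QUIRK_LSW32_IS_FIRST	(1u << 2)

enum sketch_op { SKETCH_PACK, SKETCH_UNPACK };

/*
 * Simplified stand-in for packing(), limited to an 8-byte buffer.
 * Moves the field one bit at a time, following the layouts above.
 */
static int sketch_packing(uint8_t *pbuf, uint64_t *uval, int startbit,
			  int endbit, enum sketch_op op, unsigned int quirks)
{
	int width, i;

	if (startbit < endbit || startbit > 63 || endbit < 0)
		return -EINVAL;

	width = startbit - endbit + 1;
	if (op == SKETCH_PACK && width < 64 && *uval >= (1ULL << width))
		return -ERANGE;
	if (op == SKETCH_UNPACK)
		*uval = 0;

	for (i = 0; i < width; i++) {
		int byte = 7 - (endbit + i) / 8;
		int bit = (endbit + i) % 8;

		if (quirks & SKETCH_QUIRK_LITTLE_ENDIAN)
			byte = (byte / 4) * 4 + (3 - byte % 4);
		if (quirks & SKETCH_QUIRK_LSW32_IS_FIRST)
			byte ^= 4;
		if (quirks & SKETCH_QUIRK_MSB_ON_THE_RIGHT)
			bit = 7 - bit;

		if (op == SKETCH_PACK) {
			pbuf[byte] &= (uint8_t)~(1u << bit);
			pbuf[byte] |= (uint8_t)(((*uval >> i) & 1) << bit);
		} else {
			*uval |= (uint64_t)((pbuf[byte] >> bit) & 1) << i;
		}
	}
	return 0;
}

/* Hypothetical wrapper for a device whose documentation matches layout 5. */
static void xxx_packing(uint8_t *buf, uint64_t *val, int start, int end,
			enum sketch_op op)
{
	int rc = sketch_packing(buf, val, start, end, op,
				SKETCH_QUIRK_LSW32_IS_FIRST);

	/* Errors indicate a programming mistake, so report and carry on. */
	if (rc)
		fprintf(stderr, "xxx_packing: cannot %s bits %d-%d (%d)\n",
			op == SKETCH_PACK ? "pack" : "unpack", start, end, rc);
}
```

With this wrapper, a call such as xxx_packing(buf, &val, 15, 8, SKETCH_PACK)
places an 8-bit value at memory offset 2 of the buffer, and callers never
have to mention the quirk flags themselves.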