.. SPDX-License-Identifier: GPL-2.0

.. Blame: Satya Tangirala, commit 54b259f, 2020-05-14 00:37:16 +0000

=================
Inline Encryption
=================

Background
==========

Inline encryption hardware sits logically between memory and the disk, and can
en/decrypt data as it goes in/out of the disk. Inline encryption hardware has a
fixed number of "keyslots" - slots into which encryption contexts (i.e. the
encryption key, encryption algorithm, data unit size) can be programmed by the
kernel at any time. Each request sent to the disk can be tagged with the index
of a keyslot (and also a data unit number to act as an encryption tweak), and
the inline encryption hardware will en/decrypt the data in the request with the
encryption context programmed into that keyslot. This is very different from
full disk encryption solutions like self-encrypting drives/TCG OPAL/ATA
Security standards, since with inline encryption, any block on disk could be
encrypted with any encryption context the kernel chooses.
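To make the tagging scheme concrete, here is a minimal user-space model of
keyslots and tagged requests. Every name in it (``toy_ie_hw``,
``toy_request``, ``toy_ie_hw_crypt``) is hypothetical, and the XOR "cipher" is
an insecure placeholder used only to show how the keyslot index and the data
unit number (DUN) enter the computation. It is not how real hardware or the
kernel implements anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_KEYSLOTS 4  /* hypothetical; real hardware has its own count */
#define TOY_KEY_BYTES    8  /* toy key size, far too small for real use */

/* An encryption context (the algorithm field is omitted in this model). */
struct toy_crypt_ctx {
    uint8_t  key[TOY_KEY_BYTES];
    uint32_t data_unit_size;    /* bytes covered by one data unit number */
    int      programmed;
};

struct toy_ie_hw {
    struct toy_crypt_ctx slots[TOY_NUM_KEYSLOTS];
};

/* A request is tagged with a keyslot index and the DUN of its first unit. */
struct toy_request {
    unsigned int slot;
    uint64_t     dun;
    uint8_t     *data;
    size_t       len;           /* a multiple of data_unit_size */
};

/* Toy, insecure stand-in for a tweakable cipher: XOR with key and DUN. */
void toy_ie_hw_crypt(const struct toy_ie_hw *hw, struct toy_request *rq)
{
    const struct toy_crypt_ctx *ctx;

    assert(rq->slot < TOY_NUM_KEYSLOTS);
    ctx = &hw->slots[rq->slot];
    assert(ctx->programmed && rq->len % ctx->data_unit_size == 0);

    for (size_t i = 0; i < rq->len; i++) {
        /* The tweak advances by one for every data unit in the request. */
        uint64_t dun = rq->dun + i / ctx->data_unit_size;

        rq->data[i] ^= ctx->key[i % TOY_KEY_BYTES] ^ (uint8_t)dun;
    }
}
```

Because the context is looked up per request, two requests aimed at
neighbouring blocks can name different keyslots, which is the property that
distinguishes inline encryption from whole-disk schemes.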


Objective
=========

We want to support inline encryption (IE) in the kernel.
To allow for testing, we also want a crypto API fallback when actual
IE hardware is absent. We also want IE to work with layered devices
like dm and loopback (i.e. we want to be able to use the IE hardware
of the underlying devices if present, or else fall back to crypto API
en/decryption).


Constraints and notes
=====================

- IE hardware has a limited number of "keyslots" that can be programmed
  with an encryption context (key, algorithm, data unit size, etc.) at any time.
  One can specify a keyslot in a data request made to the device, and the
  device will en/decrypt the data using the encryption context programmed into
  that specified keyslot. When possible, we want to make multiple requests with
  the same encryption context share the same keyslot.

- We need a way for upper layers like filesystems to specify an encryption
  context to use for en/decrypting a struct bio, and a device driver (like UFS)
  needs to be able to use that encryption context when it processes the bio.

- We need a way for device drivers to expose their inline encryption
  capabilities in a unified way to the upper layers.


Design
======

We add a :c:type:`struct bio_crypt_ctx` to :c:type:`struct bio` that can
represent an encryption context, because we need to be able to pass this
encryption context from the upper layers (like the fs layer) to the
device driver to act upon.

While IE hardware works on the notion of keyslots, the FS layer has no
knowledge of keyslots - it simply wants to specify an encryption context to
use while en/decrypting a bio.

We introduce a keyslot manager (KSM) that handles the translation from
encryption contexts specified by the FS to keyslots on the IE hardware.
This KSM also serves as the way IE hardware can expose its capabilities to
upper layers. The generic mode of operation is: each device driver that wants
to support IE will construct a KSM and set it up in its struct request_queue.
Upper layers that want to use IE on this device can then use this KSM in
the device's struct request_queue to translate an encryption context into
a keyslot. The presence of the KSM in the request queue shall be used to mean
that the device supports IE.

The KSM uses refcounts to track which keyslots are idle (either they have no
encryption context programmed, or there are no in-flight struct bios
referencing that keyslot). When a new encryption context needs a keyslot, it
tries to find a keyslot that has already been programmed with the same
encryption context, and if there is no such keyslot, it evicts the least
recently used idle keyslot and programs the new encryption context into that
one. If no idle keyslots are available, then the caller will sleep until there
is at least one.
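The allocation policy described above (reuse a matching slot, otherwise evict
the least recently used idle slot) can be sketched in user space as follows.
This is a simplified model, not the kernel's implementation: the names
``toy_ksm``, ``toy_ksm_get_slot`` and ``toy_ksm_put_slot`` are hypothetical, a
plain integer ``key_id`` stands in for a full encryption context, and the
function returns ``-1`` at the point where a real caller would sleep.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_KSM_NUM_SLOTS 2      /* hypothetical slot count */

struct toy_slot {
    uint64_t key_id;             /* stands in for a full encryption context */
    int      programmed;
    unsigned refcount;           /* in-flight users of this slot */
    uint64_t last_used;          /* logical timestamp for LRU eviction */
};

struct toy_ksm {
    struct toy_slot slots[TOY_KSM_NUM_SLOTS];
    uint64_t clock;
    unsigned programs;           /* how often the "hardware" was programmed */
};

/*
 * Return the index of a slot holding key_id and take a reference on it,
 * or return -1 if every slot is busy (a real caller would sleep here).
 */
int toy_ksm_get_slot(struct toy_ksm *ksm, uint64_t key_id)
{
    int victim = -1;

    for (int i = 0; i < TOY_KSM_NUM_SLOTS; i++) {
        struct toy_slot *s = &ksm->slots[i];

        if (s->programmed && s->key_id == key_id) {
            s->refcount++;                  /* share the existing slot */
            s->last_used = ++ksm->clock;
            return i;
        }
        if (s->refcount == 0 &&
            (victim < 0 || s->last_used < ksm->slots[victim].last_used))
            victim = i;                     /* oldest idle slot so far */
    }
    if (victim < 0)
        return -1;

    ksm->slots[victim].key_id = key_id;     /* evict, then program */
    ksm->slots[victim].programmed = 1;
    ksm->slots[victim].refcount = 1;
    ksm->slots[victim].last_used = ++ksm->clock;
    ksm->programs++;
    return victim;
}

/* Drop a reference; the slot becomes idle again when the count hits zero. */
void toy_ksm_put_slot(struct toy_ksm *ksm, int slot)
{
    assert(slot >= 0 && slot < TOY_KSM_NUM_SLOTS);
    assert(ksm->slots[slot].refcount > 0);
    ksm->slots[slot].refcount--;
}
```

Note that an idle slot keeps its key until it is chosen as a victim, so a
context that is used again soon after its last bio completes usually finds its
slot still programmed.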


blk-mq changes, other block layer changes and blk-crypto-fallback
=================================================================

We add a pointer to a ``bi_crypt_context`` and ``keyslot`` to
:c:type:`struct request`. These will be referred to as the ``crypto fields``
for the request. This ``keyslot`` is the keyslot into which the
``bi_crypt_context`` has been programmed in the KSM of the ``request_queue``
that this request is being sent to.

We introduce ``block/blk-crypto-fallback.c``, which allows upper layers to remain
blissfully unaware of whether or not real inline encryption hardware is present
underneath. When a bio is submitted with a target ``request_queue`` that doesn't
support the encryption context specified with the bio, the block layer will
en/decrypt the bio with the blk-crypto-fallback.

If the bio is a ``WRITE`` bio, a bounce bio is allocated, and the data in the bio
is encrypted and stored in the bounce bio - blk-mq will then proceed to process
the bounce bio as if it were not encrypted at all (except when blk-integrity is
concerned). ``blk-crypto-fallback`` sets the bounce bio's ``bi_end_io`` to an
internal function that cleans up the bounce bio and ends the original bio.
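The write path just described can be illustrated with a small user-space
model. The sketch below assumes a hypothetical ``toy_bio`` type and helper
names, and uses a one-byte XOR in place of the crypto API; it only shows the
bounce-buffer pattern, in which the caller's plaintext is left untouched and
the completion of the bounce bio ends the original.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_bio;
typedef void (*toy_end_io_t)(struct toy_bio *bio);

struct toy_bio {
    uint8_t      *data;
    size_t        len;
    toy_end_io_t  end_io;     /* completion callback */
    void         *priv;       /* owner-defined pointer */
    int           completed;
};

/* Completion handler of the bounce bio: free it, then end the original. */
void toy_bounce_end_io(struct toy_bio *bounce)
{
    struct toy_bio *orig = bounce->priv;

    free(bounce->data);
    free(bounce);
    orig->completed = 1;
    if (orig->end_io)
        orig->end_io(orig);
}

/*
 * Build an encrypted bounce copy of a write bio. The XOR is a toy,
 * insecure placeholder for a real cipher. Returns NULL on allocation
 * failure; otherwise the caller submits the returned bio instead of orig.
 */
struct toy_bio *toy_fallback_prep_write(struct toy_bio *orig, uint8_t key)
{
    struct toy_bio *bounce;

    assert(orig && orig->data && orig->len > 0);
    bounce = calloc(1, sizeof(*bounce));
    if (!bounce)
        return NULL;
    bounce->data = malloc(orig->len);
    if (!bounce->data) {
        free(bounce);
        return NULL;
    }
    for (size_t i = 0; i < orig->len; i++)
        bounce->data[i] = orig->data[i] ^ key;  /* plaintext stays intact */
    bounce->len = orig->len;
    bounce->end_io = toy_bounce_end_io;
    bounce->priv = orig;
    return bounce;
}
```

Encrypting into a separate buffer matters because the pages behind the
original bio may still be read by their owner while the write is in flight.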

If the bio is a ``READ`` bio, the bio's ``bi_end_io`` (and also ``bi_private``)
is saved and overwritten by ``blk-crypto-fallback`` to
``bio_crypto_fallback_decrypt_bio``. The bio's ``bi_crypt_context`` is also
overwritten with ``NULL``, so that to the rest of the stack, the bio looks
as if it was a regular bio that never had an encryption context specified.
``bio_crypto_fallback_decrypt_bio`` will decrypt the bio, restore the original
``bi_end_io`` (and also ``bi_private``) and end the bio again.
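The save, overwrite and restore sequence for reads can be modelled in user
space as shown below. All types and function names are hypothetical, the XOR
is an insecure placeholder for real decryption, and the real fallback differs
in detail (for instance in where the decryption work is run); the sketch only
shows how the completion callback is borrowed and handed back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_bio;
typedef void (*toy_end_io_t)(struct toy_bio *bio);

struct toy_bio {
    uint8_t      *data;
    size_t        len;
    const void   *crypt_ctx;  /* non-NULL while a context is attached */
    toy_end_io_t  end_io;
    void         *priv;
    int           done;
};

/* What the fallback must remember while it owns the completion callback. */
struct toy_read_state {
    toy_end_io_t saved_end_io;
    void        *saved_priv;
    uint8_t      key;
};

/* Runs when the read completes: decrypt, restore, then end the bio again. */
void toy_fallback_decrypt_end_io(struct toy_bio *bio)
{
    struct toy_read_state *st = bio->priv;

    for (size_t i = 0; i < bio->len; i++)
        bio->data[i] ^= st->key;            /* toy "decryption" */
    bio->end_io = st->saved_end_io;
    bio->priv = st->saved_priv;
    free(st);
    if (bio->end_io)
        bio->end_io(bio);
}

/* Hijack the completion path of a read bio before it is sent down. */
int toy_fallback_prep_read(struct toy_bio *bio, uint8_t key)
{
    struct toy_read_state *st;

    assert(bio->crypt_ctx != NULL);
    st = malloc(sizeof(*st));
    if (!st)
        return -1;
    st->saved_end_io = bio->end_io;
    st->saved_priv = bio->priv;
    st->key = key;
    bio->end_io = toy_fallback_decrypt_end_io;
    bio->priv = st;
    bio->crypt_ctx = NULL;                  /* lower layers see a plain bio */
    return 0;
}

/* Example owner callback, used to observe the final completion. */
void toy_mark_done(struct toy_bio *bio)
{
    bio->done = 1;
}
```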

Regardless of whether real inline encryption hardware is used or the
blk-crypto-fallback is used, the ciphertext written to disk (and hence the
on-disk format of data) will be the same (assuming the hardware's implementation
of the algorithm being used adheres to spec and functions correctly).

If a ``request_queue``'s inline encryption hardware claims to support the
encryption context specified with a bio, then the bio will not be handled by
the ``blk-crypto-fallback``. We will eventually reach a point in blk-mq when a
:c:type:`struct request` needs to be allocated for that bio. At that point,
blk-mq tries to program the encryption context into the ``request_queue``'s
keyslot manager, and obtain a keyslot, which it stores in its newly added
``keyslot`` field. This keyslot is released when the request is completed.

When the first bio is added to a request, ``blk_crypto_rq_bio_prep`` is called,
which sets the request's ``crypt_ctx`` to a copy of the bio's
``bi_crypt_context``. ``bio_crypt_do_front_merge`` is called whenever a
subsequent bio is merged to the front of the request, which updates the
``crypt_ctx`` of the request so that it matches the newly merged bio's
``bi_crypt_context``. In particular, the request keeps a copy of the
``bi_crypt_context`` of the first bio in its bio-list (blk-mq needs to be
careful to maintain this invariant during bio and request merges).
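The invariant (the request always carries a copy of the context of the first
bio in its list) can be shown with a small list model. The types and function
names below are hypothetical stand-ins, and the context is reduced to a key
identifier plus a starting DUN; checks that decide whether two bios may be
merged at all are outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_crypt_ctx {
    uint64_t key_id;
    uint64_t dun;                    /* DUN of the bio's first data unit */
};

struct toy_bio {
    struct toy_crypt_ctx ctx;
    struct toy_bio *next;
};

struct toy_request {
    struct toy_bio *head;            /* first bio in the bio-list */
    struct toy_bio *tail;
    struct toy_crypt_ctx crypt_ctx;  /* must always equal head->ctx */
};

/* First bio added to the request: copy its context. */
void toy_rq_bio_prep(struct toy_request *rq, struct toy_bio *bio)
{
    bio->next = NULL;
    rq->head = rq->tail = bio;
    rq->crypt_ctx = bio->ctx;
}

/* Back merge: the head is unchanged, so the copy stays valid. */
void toy_rq_back_merge(struct toy_request *rq, struct toy_bio *bio)
{
    assert(rq->head != NULL);
    bio->next = NULL;
    rq->tail->next = bio;
    rq->tail = bio;
}

/* Front merge: there is a new head, so the copy must be refreshed. */
void toy_rq_front_merge(struct toy_request *rq, struct toy_bio *bio)
{
    assert(rq->head != NULL);
    bio->next = rq->head;
    rq->head = bio;
    rq->crypt_ctx = bio->ctx;
}
```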

To make it possible for inline encryption to work with request queue based
layered devices, when a request is cloned, its ``crypto fields`` are cloned as
well. When the cloned request is submitted, blk-mq programs the
``bi_crypt_context`` of the request into the clone's request_queue's keyslot
manager, and stores the returned keyslot in the clone's ``keyslot``.


API presented to users of the block layer
=========================================

``struct blk_crypto_key`` represents a crypto key (the raw key, size of the
key, the crypto algorithm to use, the data unit size to use, and the number of
bytes required to represent data unit numbers that will be specified with the
``bi_crypt_context``).

``blk_crypto_init_key`` allows upper layers to initialize such a
``blk_crypto_key``.

``bio_crypt_set_ctx`` should be called on any bio that a user of
the block layer wants en/decrypted via inline encryption (or the
blk-crypto-fallback, if hardware support isn't available for the desired
crypto configuration). This function takes the ``blk_crypto_key`` and the
data unit number (DUN) to use when en/decrypting the bio.

``blk_crypto_config_supported`` allows upper layers to query whether or not an
encryption context passed to a request queue can be handled by blk-crypto
(either by real inline encryption hardware, or by the blk-crypto-fallback).
This is useful e.g. when blk-crypto-fallback is disabled, and the upper layer
wants to use an algorithm that may not be supported by hardware - this function
lets the upper layer know ahead of time that the algorithm isn't supported,
and the upper layer can fall back to something else if appropriate.

``blk_crypto_start_using_key`` - Upper layers must call this function on a
``blk_crypto_key`` and a ``request_queue`` before using the key with any bio
headed for that ``request_queue``. This function ensures that either the
hardware supports the key's crypto settings, or the crypto API fallback has
transforms for the needed mode allocated and ready to go. Note that this
function may allocate an ``skcipher``, and must not be called from the data
path, since allocating ``skciphers`` from the data path can deadlock.

``blk_crypto_evict_key`` *must* be called by upper layers before a
``blk_crypto_key`` is freed. Further, it *must* be called only once
there are no more in-flight requests that use that ``blk_crypto_key``.
``blk_crypto_evict_key`` will ensure that a key is removed from any keyslots in
inline encryption hardware that the key might have been programmed into (or the
blk-crypto-fallback).
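The ordering rules above (prepare the key before any I/O, evict it only after
all I/O has drained) can be summarised as a tiny state machine. This is a
user-space model with hypothetical ``toy_*`` names, not the blk-crypto API
itself; the return value ``-1`` is an arbitrary error marker chosen for the
sketch.

```c
#include <assert.h>

struct toy_key {
    int ready;        /* set once the "start using" step succeeded */
    int in_flight;    /* bios currently using this key */
    int evicted;
};

/* Setup step: may allocate, so it belongs outside the data path. */
int toy_start_using_key(struct toy_key *key, int hw_supported,
                        int fallback_enabled)
{
    if (!hw_supported && !fallback_enabled)
        return -1;    /* nothing can handle this key */
    key->ready = 1;
    return 0;
}

/* Data path: attaching the key to a bio is only legal after setup. */
int toy_submit_bio(struct toy_key *key)
{
    if (!key->ready || key->evicted)
        return -1;
    key->in_flight++;
    return 0;
}

void toy_complete_bio(struct toy_key *key)
{
    assert(key->in_flight > 0);
    key->in_flight--;
}

/* Teardown: refuse eviction while I/O is still using the key. */
int toy_evict_key(struct toy_key *key)
{
    if (key->in_flight > 0)
        return -1;
    key->ready = 0;
    key->evicted = 1;
    return 0;
}
```

In this model a caller that wants to free a key first waits for its own
outstanding bios, then evicts, and only then releases the memory.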


API presented to device drivers
===============================

A :c:type:`struct blk_keyslot_manager` should be set up by device drivers in
the ``request_queue`` of the device. The device driver needs to call
``blk_ksm_init`` on the ``blk_keyslot_manager``, specifying the number of
keyslots supported by the hardware.

The device driver also needs to tell the KSM how to actually manipulate the
IE hardware in the device to do things like programming a crypto key into
a particular keyslot of the IE hardware. All this is achieved through the
:c:type:`struct blk_ksm_ll_ops` field in the KSM that the device driver
must fill in after initializing the ``blk_keyslot_manager``.

The KSM also handles runtime power management for the device when applicable
(e.g. when it wants to program a crypto key into the IE hardware, the device
must be runtime powered on) - so the device driver must also set the ``dev``
field in the KSM to point to the ``struct device`` for the KSM to use for
runtime power management.

``blk_ksm_reprogram_all_keys`` can be called by device drivers if the device
needs every one of its keyslots to be reprogrammed with the key it
"should have" at the point in time when the function is called. This is useful
e.g. if a device loses all its keys on runtime power down/up.

``blk_ksm_destroy`` should be called to free up all resources that a keyslot
manager acquired in ``blk_ksm_init``, once the ``blk_keyslot_manager`` is no
longer needed.
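As an illustration of how driver-supplied low-level operations and
reprogramming fit together, here is a user-space sketch. The callback names
``keyslot_program`` and ``keyslot_evict``, their signatures, and all ``toy_*``
types are assumptions made for this example; consult the kernel headers for
the actual layout of ``struct blk_ksm_ll_ops``. A plain ``key_id`` of ``0``
means "no key" in this model.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_SLOTS 3

struct toy_ksm;

/* Low-level operations supplied by the driver (names are assumptions). */
struct toy_ll_ops {
    int (*keyslot_program)(struct toy_ksm *ksm, uint64_t key_id,
                           unsigned int slot);
    int (*keyslot_evict)(struct toy_ksm *ksm, unsigned int slot);
};

struct toy_ksm {
    struct toy_ll_ops ops;
    unsigned int num_slots;
    uint64_t want[TOY_MAX_SLOTS];   /* key each slot "should have" */
    uint64_t hw[TOY_MAX_SLOTS];     /* simulated hardware registers */
};

/* Example driver callbacks that write to the simulated registers. */
int toy_hw_program(struct toy_ksm *ksm, uint64_t key_id, unsigned int slot)
{
    ksm->hw[slot] = key_id;
    return 0;
}

int toy_hw_evict(struct toy_ksm *ksm, unsigned int slot)
{
    ksm->hw[slot] = 0;
    return 0;
}

/* Zero the manager and record how many slots the hardware offers. */
void toy_ksm_init(struct toy_ksm *ksm, unsigned int num_slots)
{
    assert(num_slots <= TOY_MAX_SLOTS);
    *ksm = (struct toy_ksm){ .num_slots = num_slots };
}

/* After the hardware lost its state, push every tracked key back in. */
void toy_ksm_reprogram_all_keys(struct toy_ksm *ksm)
{
    for (unsigned int slot = 0; slot < ksm->num_slots; slot++) {
        if (ksm->want[slot] == 0)
            continue;               /* slot has no key to restore */
        ksm->ops.keyslot_program(ksm, ksm->want[slot], slot);
    }
}
```

The driver fills in ``ops`` after calling the init function, which mirrors the
order described above.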


Layered Devices
===============

Request queue based layered devices like dm-rq that wish to support IE need to
create their own keyslot manager for their request queue, and expose whatever
functionality they choose. When a layered device wants to pass a clone of a
request to another ``request_queue``, blk-crypto will initialize and prepare the
clone as necessary - see ``blk_crypto_insert_cloned_request`` in
``blk-crypto.c``.


Future Optimizations for layered devices
========================================

Creating a keyslot manager for a layered device uses up memory for each
keyslot, and in general, a layered device merely passes the request on to a
"child" device, so the keyslots in the layered device itself are completely
unused, and don't need any refcounting or keyslot programming. We can instead
define a new type of KSM, the "passthrough KSM", that layered devices can use
to advertise an unlimited number of keyslots, and support for any encryption
algorithms they choose, while not actually using any memory for each keyslot.
Another use case for the "passthrough KSM" is for IE devices that do not have a
limited number of keyslots.


Interaction between inline encryption and blk integrity
=======================================================

At the time of this patch, there is no real hardware that supports both these
features. However, these features do interact with each other, and it's not
completely trivial to make them both work together properly. In particular,
when a WRITE bio wants to use inline encryption on a device that supports both
features, the bio will have an encryption context specified, after which
its integrity information is calculated (using the plaintext data, since
the encryption will happen while data is being written), and the data and
integrity info are sent to the device. Obviously, the integrity info must be
verified before the data is encrypted. After the data is encrypted, the device
must not store the integrity info that it received with the plaintext data
since that might reveal information about the plaintext data. As such, it must
re-generate the integrity info from the ciphertext data and store that on disk
instead. Another issue with storing the integrity info of the plaintext data is
that it changes the on-disk format depending on whether hardware inline
encryption support is present or the kernel crypto API fallback is used (since
if the fallback is used, the device will receive the integrity info of the
ciphertext, not that of the plaintext).

Because there isn't any real hardware yet, it seems prudent to assume that
hardware implementations might not implement both features together correctly,
and disallow the combination for now. Whenever a device supports integrity, the
kernel will pretend that the device does not support hardware inline encryption
(by essentially setting the keyslot manager in the request_queue of the device
to NULL). When the crypto API fallback is enabled, this means that all bios with
an encryption context will use the fallback, and IO will complete as usual.
When the fallback is disabled, a bio with an encryption context will be failed.
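
The resulting routing of a bio can be written down as a small decision
function. The sketch below is a user-space model with hypothetical names; it
simplifies "the hardware supports this encryption context" to "the queue has a
keyslot manager", which is an assumption of the example rather than the
kernel's full check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_path {
    TOY_PATH_PLAIN,      /* bio has no encryption context */
    TOY_PATH_HW,         /* inline encryption hardware handles it */
    TOY_PATH_FALLBACK,   /* crypto API fallback handles it */
    TOY_PATH_FAIL        /* bio is failed */
};

struct toy_queue {
    bool has_ie_hw;      /* device offers inline encryption */
    bool has_integrity;  /* device supports blk-integrity */
};

/* A queue with integrity behaves as if its keyslot manager were NULL. */
bool toy_queue_has_ksm(const struct toy_queue *q)
{
    assert(q != NULL);
    return q->has_ie_hw && !q->has_integrity;
}

/* Decide how a bio is handled on the given queue. */
enum toy_path toy_route_bio(const struct toy_queue *q, bool bio_has_ctx,
                            bool fallback_enabled)
{
    if (!bio_has_ctx)
        return TOY_PATH_PLAIN;
    if (toy_queue_has_ksm(q))
        return TOY_PATH_HW;
    return fallback_enabled ? TOY_PATH_FALLBACK : TOY_PATH_FAIL;
}
```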