.. SPDX-License-Identifier: GPL-2.0

==============================
drm/komeda Arm display driver
==============================

The drm/komeda driver supports the Arm display processor D71 and later
products. This document gives a brief overview of the driver design: how it
works and why it is designed this way.

Overview of D71 like display IPs
================================

Starting with D71, Arm display IP adopts a flexible, modularized
architecture. A display pipeline is made up of multiple individual,
functional pipeline stages called components, and every component has
specific capabilities for processing the pixel data that flows through the
pipeline.

Typical D71 components:

Layer
-----
Layer is the first pipeline stage, which prepares the pixel data for the next
stage. It fetches pixels from memory, decodes them if they are AFBC encoded,
rotates the source image, unpacks or converts YUV pixels to the device's
internal RGB pixels, then adjusts the color space of the pixels if needed.

Scaler
------
As its name suggests, the scaler is responsible for scaling; on D71 it also
supports image enhancement.
The scaler can be used flexibly: it can be connected to a layer output for
layer scaling, or connected to the compositor to scale the whole display
frame and then feed the output data into wb_layer, which writes it to
memory.

Compositor (compiz)
-------------------
The compositor blends multiple layers or pixel data flows into a single display
frame. Its output frame can be fed into the post image processor for display on
the monitor, or fed into wb_layer and written to memory at the same time.
The user can also insert a scaler between the compositor and wb_layer to
downscale the display frame first and then write it to memory.

Writeback Layer (wb_layer)
--------------------------
The writeback layer does the opposite of Layer: it connects to compiz and
writes the composition result to memory.

Post image processor (improc)
-----------------------------
The post image processor adjusts frame data such as gamma and color space to
fit the requirements of the monitor.

Timing controller (timing_ctrlr)
--------------------------------
The final stage of the display pipeline. The timing controller does not handle
pixels; it only controls the display timing.

Merger
------
A D71 scaler mostly has only half the horizontal input/output capability of
a Layer. For example, if a Layer supports a 4K input size, the scaler can only
support 2K input/output at the same time. To achieve full frame scaling, D71
introduces Layer Split, which splits the whole image into two halves and feeds
them to two Layers, A and B, which are scaled independently. After scaling, the
results need to be fed to the merger, which merges the two partial images and
then outputs the merged result to compiz.
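
As a rough illustration of the split decision described above, the sketch below
halves a source width when it exceeds what a single scaler accepts. The names
(``struct hsplit``, ``split_width()``) and the limit passed in are hypothetical
and not taken from the driver, and the extra calculation needed to avoid
artifacts at the seam is deliberately left out.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical result of a split decision; not a komeda type. */
    struct hsplit {
        bool needed;          /* true when the image must be split */
        unsigned int left;    /* width handled by the first scaler */
        unsigned int right;   /* width handled by the second scaler */
    };

    /*
     * Decide whether an image of width w needs Layer Split, assuming one
     * scaler accepts at most scaler_max pixels per line.
     */
    struct hsplit split_width(unsigned int w, unsigned int scaler_max)
    {
        struct hsplit s = { .needed = w > scaler_max, .left = w, .right = 0 };

        if (s.needed) {
            s.left = (w + 1) / 2;    /* left half takes the odd pixel */
            s.right = w - s.left;
        }
        return s;
    }

With an assumed scaler limit of 2048 pixels, a 3840 pixel wide image is split
into two 1920 pixel halves, while a 1920 pixel wide image is left unsplit.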

Splitter
--------
Similar to Layer Split, but the Splitter is used for writeback: it splits the
compiz result into two parts and then feeds them to two scalers.

Possible D71 Pipeline usage
===========================

Benefitting from the modularized architecture, D71 pipelines can be easily
adjusted to fit different usages. D71 has two pipelines, which support two
working modes:

-   Dual display mode
    Two pipelines work independently and separately to drive two display outputs.

-   Single display mode
    Two pipelines work together to drive only one display output.

    In this mode, pipeline_B doesn't work independently, but outputs its
    composition result into pipeline_A, and its pixel timing is also derived
    from pipeline_A.timing_ctrlr. pipeline_B works just like a "slave" of
    pipeline_A (the master).

Single pipeline data flow
-------------------------

.. kernel-render:: DOT
   :alt: Single pipeline digraph
   :caption: Single pipeline data flow

   digraph single_ppl {
      rankdir=LR;

      subgraph {
         "Memory";
         "Monitor";
      }

      subgraph cluster_pipeline {
         style=dashed
         node [shape=box]
         {
            node [bgcolor=grey style=dashed]
            "Scaler-0";
            "Scaler-1";
            "Scaler-0/1"
         }

         node [bgcolor=grey style=filled]
         "Layer-0" -> "Scaler-0"
         "Layer-1" -> "Scaler-0"
         "Layer-2" -> "Scaler-1"
         "Layer-3" -> "Scaler-1"

         "Layer-0" -> "Compiz"
         "Layer-1" -> "Compiz"
         "Layer-2" -> "Compiz"
         "Layer-3" -> "Compiz"
         "Scaler-0" -> "Compiz"
         "Scaler-1" -> "Compiz"

         "Compiz" -> "Scaler-0/1" -> "Wb_layer"
         "Compiz" -> "Improc" -> "Timing Controller"
      }

      "Wb_layer" -> "Memory"
      "Timing Controller" -> "Monitor"
   }

Dual pipeline with Slave enabled
--------------------------------

.. kernel-render:: DOT
   :alt: Slave pipeline digraph
   :caption: Slave pipeline enabled data flow

   digraph slave_ppl {
      rankdir=LR;

      subgraph {
         "Memory";
         "Monitor";
      }
      node [shape=box]
      subgraph cluster_pipeline_slave {
         style=dashed
         label="Slave Pipeline_B"
         node [shape=box]
         {
            node [bgcolor=grey style=dashed]
            "Slave.Scaler-0";
            "Slave.Scaler-1";
         }

         node [bgcolor=grey style=filled]
         "Slave.Layer-0" -> "Slave.Scaler-0"
         "Slave.Layer-1" -> "Slave.Scaler-0"
         "Slave.Layer-2" -> "Slave.Scaler-1"
         "Slave.Layer-3" -> "Slave.Scaler-1"

         "Slave.Layer-0" -> "Slave.Compiz"
         "Slave.Layer-1" -> "Slave.Compiz"
         "Slave.Layer-2" -> "Slave.Compiz"
         "Slave.Layer-3" -> "Slave.Compiz"
         "Slave.Scaler-0" -> "Slave.Compiz"
         "Slave.Scaler-1" -> "Slave.Compiz"
      }

      subgraph cluster_pipeline_master {
         style=dashed
         label="Master Pipeline_A"
         node [shape=box]
         {
            node [bgcolor=grey style=dashed]
            "Scaler-0";
            "Scaler-1";
            "Scaler-0/1"
         }

         node [bgcolor=grey style=filled]
         "Layer-0" -> "Scaler-0"
         "Layer-1" -> "Scaler-0"
         "Layer-2" -> "Scaler-1"
         "Layer-3" -> "Scaler-1"

         "Slave.Compiz" -> "Compiz"
         "Layer-0" -> "Compiz"
         "Layer-1" -> "Compiz"
         "Layer-2" -> "Compiz"
         "Layer-3" -> "Compiz"
         "Scaler-0" -> "Compiz"
         "Scaler-1" -> "Compiz"

         "Compiz" -> "Scaler-0/1" -> "Wb_layer"
         "Compiz" -> "Improc" -> "Timing Controller"
      }

      "Wb_layer" -> "Memory"
      "Timing Controller" -> "Monitor"
   }

Sub-pipelines for input and output
----------------------------------

A complete display pipeline can be easily divided into three sub-pipelines
according to the in/out usage.

Layer(input) pipeline
~~~~~~~~~~~~~~~~~~~~~

.. kernel-render:: DOT
   :alt: Layer data digraph
   :caption: Layer (input) data flow

   digraph layer_data_flow {
      rankdir=LR;
      node [shape=box]

      {
         node [bgcolor=grey style=dashed]
         "Scaler-n";
      }

      "Layer-n" -> "Scaler-n" -> "Compiz"
   }

.. kernel-render:: DOT
   :alt: Layer Split digraph
   :caption: Layer Split pipeline

   digraph layer_data_flow {
      rankdir=LR;
      node [shape=box]

      "Layer-0/1" -> "Scaler-0" -> "Merger"
      "Layer-2/3" -> "Scaler-1" -> "Merger"
      "Merger" -> "Compiz"
   }

Writeback(output) pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-render:: DOT
   :alt: writeback digraph
   :caption: Writeback(output) data flow

   digraph writeback_data_flow {
      rankdir=LR;
      node [shape=box]

      {
         node [bgcolor=grey style=dashed]
         "Scaler-n";
      }

      "Compiz" -> "Scaler-n" -> "Wb_layer"
   }

.. kernel-render:: DOT
   :alt: split writeback digraph
   :caption: Writeback(output) Split data flow

   digraph writeback_data_flow {
      rankdir=LR;
      node [shape=box]

      "Compiz" -> "Splitter"
      "Splitter" -> "Scaler-0" -> "Merger"
      "Splitter" -> "Scaler-1" -> "Merger"
      "Merger" -> "Wb_layer"
   }

Display output pipeline
~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-render:: DOT
   :alt: display digraph
   :caption: display output data flow

   digraph single_ppl {
      rankdir=LR;
      node [shape=box]

      "Compiz" -> "Improc" -> "Timing Controller"
   }

The following sections show that these three sub-pipelines are handled by
KMS-plane/wb_conn/crtc respectively.

Komeda Resource abstraction
===========================

struct komeda_pipeline/component
--------------------------------

To fully utilize and easily access/configure the HW, the driver side uses a
similar architecture, Pipeline/Component, to describe the HW features and
capabilities. A specific component includes two parts:

-   Data flow controlling.
-   Specific component capabilities and features.

So the driver defines a common header, struct komeda_component, to describe the
data flow control, and every specific component is a subclass of this base
structure.
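
The base-structure pattern described above can be sketched in plain C as
follows. This is a minimal illustration, not the driver's definition: the
fields shown are a reduced, assumed subset, ``to_scaler()`` is used here as an
illustrative helper name, and ``container_of()`` is re-implemented so that the
example builds outside the kernel.

.. code-block:: c

    #include <stddef.h>

    /* Simplified stand-in for the kernel's container_of(). */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    /* Reduced, assumed version of the common header. */
    struct komeda_component {
        int id;    /* data flow control part, shared by all components */
    };

    /* A specific component embeds the common header as its first part. */
    struct komeda_scaler {
        struct komeda_component base;
        unsigned int max_upscaling;    /* assumed scaler specific capability */
    };

    /* Recover the specific component from a pointer to its common header. */
    struct komeda_scaler *to_scaler(struct komeda_component *c)
    {
        return container_of(c, struct komeda_scaler, base);
    }

Generic code can then pass ``struct komeda_component`` pointers around and
only the component specific code needs to convert back to the subclass.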

.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h
   :internal:

Resource discovery and initialization
=====================================

Pipeline and component describe how to handle the pixel data. We still need
struct komeda_dev to describe the whole view of the device and its control
abilities.

We now have struct komeda_dev, struct komeda_pipeline and struct
komeda_component, and the device must be filled with pipelines. Since komeda is
intended not only for D71 but also for later products, it should share as much
as possible between different products. To achieve this, the komeda device is
split into two layers: CORE and CHIP.

-   CORE: for common features and capabilities handling.
-   CHIP: for register programming and HW specific feature (limitation) handling.

CORE can access CHIP through three chip function structures:

-   struct komeda_dev_funcs
-   struct komeda_pipeline_funcs
-   struct komeda_component_funcs
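
The CORE/CHIP split can be pictured as a table of function pointers that the
common code calls without knowing which product sits behind it. The sketch
below is an assumption-laden illustration only: ``struct chip_dev``,
``struct chip_dev_funcs``, ``enum_resources`` and the ``d71_like_*`` names are
hypothetical and do not reproduce the members of the real function structures.

.. code-block:: c

    struct chip_dev;    /* hypothetical device handle */

    /* Hypothetical CHIP function table, in the spirit of komeda_dev_funcs. */
    struct chip_dev_funcs {
        int (*enum_resources)(struct chip_dev *dev);
    };

    struct chip_dev {
        const struct chip_dev_funcs *funcs;
        int n_pipelines;
    };

    /* CHIP layer: product specific resource discovery. */
    int d71_like_enum_resources(struct chip_dev *dev)
    {
        dev->n_pipelines = 2;    /* D71 has two pipelines, as stated earlier */
        return 0;
    }

    const struct chip_dev_funcs d71_like_funcs = {
        .enum_resources = d71_like_enum_resources,
    };

    /* CORE layer: product independent, only calls through the table. */
    int core_init(struct chip_dev *dev, const struct chip_dev_funcs *funcs)
    {
        dev->funcs = funcs;
        return dev->funcs->enum_resources(dev);
    }

Supporting a later product would then mean supplying another function table,
while ``core_init()`` stays unchanged.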

.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_dev.h
   :internal:

Format handling
===============

.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h
   :internal:
.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h
   :internal:

Attach komeda_dev to DRM-KMS
============================

Komeda abstracts resources by pipeline/component, but DRM-KMS uses
crtc/plane/connector. One KMS-obj cannot represent only one single component,
since the requirements of a single KMS object usually cannot be met by a
single component; multiple components are needed. For example, set mode, gamma
and ctm for KMS all target the CRTC-obj, but komeda needs compiz, improc and
timing_ctrlr to work together to fit these requirements. A KMS-Plane may
likewise require multiple komeda resources: layer/scaler/compiz.

So, one KMS-Obj represents a sub-pipeline of komeda resources.

-   Plane: `Layer(input) pipeline`_
-   Wb_connector: `Writeback(output) pipeline`_
-   Crtc: `Display output pipeline`_

So, for komeda, we treat KMS crtc/plane/connector as users of pipeline and
component, and at any one time a pipeline/component can only be used by one
user. Pipeline/component is treated as a private object of DRM-KMS; its
state is managed by drm_atomic_state as well.

How to map plane to Layer(input) pipeline
-----------------------------------------

Komeda has multiple Layer input pipelines, see:

-   `Single pipeline data flow`_
-   `Dual pipeline with Slave enabled`_

The easiest way is to bind a plane to a fixed Layer pipeline, but consider the
komeda capabilities:

-   Layer Split, see `Layer(input) pipeline`_

    Layer Split is quite a complicated feature, which splits a big image into
    two parts and handles them with two layers and two scalers individually.
    But the split introduces an edge problem or effect in the middle of the
    image. To avoid such a problem, it needs a complicated Split calculation
    and some special configuration of the layer and scaler. It is better to
    hide such HW related complexity from user mode.

-   Slave pipeline, see `Dual pipeline with Slave enabled`_

    Since the compiz component doesn't output an alpha value, the slave
    pipeline can only be used for bottom layers composition. The komeda driver
    wants to hide this limitation from the user. The way to do this is to pick
    a suitable Layer according to plane_state->zpos.

So for komeda, the KMS-plane doesn't represent a fixed komeda layer pipeline,
but multiple Layers with the same capabilities. Komeda will select one or more
Layers to fit the requirement of one KMS-plane.
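
One way to picture the zpos based selection is the sketch below. It is a
hypothetical policy written for illustration, not the driver's algorithm:
``pick_pipeline()``, ``enum pipe_id`` and the ``slave_layers`` parameter are
invented names, and zpos values are assumed to be normalized to 0..n-1 with 0
at the bottom.

.. code-block:: c

    enum pipe_id { PIPE_MASTER, PIPE_SLAVE };

    /*
     * Hypothetical policy: the lowest slave_layers planes go to the slave
     * pipeline and all others to the master, so the slave output is always
     * blended beneath every master layer.
     */
    enum pipe_id pick_pipeline(unsigned int zpos, unsigned int slave_layers)
    {
        return zpos < slave_layers ? PIPE_SLAVE : PIPE_MASTER;
    }

With two layers assumed to be handled by the slave, planes at zpos 0 and 1 land
on the slave pipeline and a plane at zpos 2 lands on the master.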

Make component/pipeline a drm_private_obj
-----------------------------------------

Add :c:type:`drm_private_obj` to :c:type:`komeda_component`, :c:type:`komeda_pipeline`

.. code-block:: c

    struct komeda_component {
        struct drm_private_obj obj;
        ...
    };

    struct komeda_pipeline {
        struct drm_private_obj obj;
        ...
    };

Tracking component_state/pipeline_state by drm_atomic_state
-----------------------------------------------------------

Add :c:type:`drm_private_state` and user to :c:type:`komeda_component_state`,
:c:type:`komeda_pipeline_state`

.. code-block:: c

    struct komeda_component_state {
        struct drm_private_state obj;
        void *binding_user;
        ...
    };

    struct komeda_pipeline_state {
        struct drm_private_state obj;
        struct drm_crtc *crtc;
        ...
    };
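
The "one user at a time" rule can be expressed with the ``binding_user``
pointer alone. The following is a minimal userspace sketch under stated
assumptions: ``struct component_state`` is a reduced stand-in for the real
state structure, ``component_bind_user()`` is a hypothetical helper, and
allowing the same user to bind again is a choice made for this example.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>

    /* Reduced stand-in for a component state; only the user binding. */
    struct component_state {
        void *binding_user;    /* NULL while the component is free */
    };

    /*
     * Bind user to the component. Rebinding the same user succeeds;
     * a different user gets -EBUSY and the binding is left unchanged.
     */
    int component_bind_user(struct component_state *st, void *user)
    {
        if (st->binding_user && st->binding_user != user)
            return -EBUSY;
        st->binding_user = user;
        return 0;
    }

Because the state lives in the atomic state, a rejected binding only fails the
check phase of the commit being validated.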

komeda component validation
---------------------------

Komeda has multiple types of components, but the validation process is
similar for all of them, usually including the following steps:

.. code-block:: c

    int komeda_xxxx_validate(struct komeda_component_xxx xxx_comp,
                struct komeda_component_output *input_dflow,
                struct drm_plane/crtc/connector *user,
                struct drm_plane/crtc/connector_state *user_state)
    {
         /*
          * Step 1: check if the component is needed, e.g. the scaler is
          *         optional depending on the user_state; if unneeded, just
          *         return, and the caller will put the data flow into the
          *         next stage.
          * Step 2: check user_state against the component features and
          *         capabilities to see if the requirements can be met; if
          *         not, return fail.
          * Step 3: get component_state from drm_atomic_state, and try to set
          *         the user on the component; fail if the component has been
          *         assigned to another user already.
          * Step 4: configure the component_state, e.g. set its input
          *         component, convert user_state to component specific state.
          * Step 5: adjust the input_dflow and prepare it for the next stage.
          */
    }
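
To make the five steps listed above concrete, here is a compact userspace
sketch for a scaler-like component that only looks at widths. Every type and
name in it (``struct dflow``, ``struct scaler``, ``scaler_validate()``) is a
hypothetical reduction, and the state is stored inside the component for
brevity instead of being fetched from drm_atomic_state.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>

    /* Hypothetical reductions used only for this sketch. */
    struct dflow { unsigned int in_w, out_w; };    /* data flow between stages */
    struct scaler_state { void *binding_user; unsigned int in_w, out_w; };
    struct scaler { unsigned int max_in_w; struct scaler_state state; };

    int scaler_validate(struct scaler *sc, struct dflow *dflow, void *user)
    {
        struct scaler_state *st = &sc->state;

        /* Step 1: skip this stage if no scaling is requested. */
        if (dflow->in_w == dflow->out_w)
            return 0;

        /* Step 2: check the request against the capabilities. */
        if (dflow->in_w > sc->max_in_w)
            return -EINVAL;

        /* Step 3: claim the component; fail if another user holds it. */
        if (st->binding_user && st->binding_user != user)
            return -EBUSY;
        st->binding_user = user;

        /* Step 4: convert the request into component specific state. */
        st->in_w = dflow->in_w;
        st->out_w = dflow->out_w;

        /* Step 5: the next stage sees the scaled size as its input. */
        dflow->in_w = dflow->out_w;
        return 0;
    }

A caller would run such validators stage by stage, handing the same data flow
description from one component to the next.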

komeda_kms Abstraction
----------------------

.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_kms.h
   :internal:

komeda_kms Functions
--------------------
.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
   :internal:
.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_plane.c
   :internal:

Build komeda to be a Linux module driver
========================================

Now we have two levels of devices:

-   komeda_dev: describes the real display hardware.
-   komeda_kms_dev: attaches or connects komeda_dev to DRM-KMS.

All komeda operations are supplied or operated by komeda_dev or komeda_kms_dev;
the module driver is only a simple wrapper that passes the Linux commands
(probe/remove/pm) to komeda_dev or komeda_kms_dev.