drm/vc4: Add CTM support

The hardware has a single block for applying a CTM prior to gamma lut.
It can be fed with pixels from one of our CRTC at a time and uses a
matrix with S0.9 scalars. Use private atomic state to reject attempts
from userland to apply CTM for more than one CRTC at a time and reject
matrices with scalars that we can't approximate without integer bits.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/218067/
4 files changed