.. SPDX-License-Identifier: CC-BY-SA-4.0

Media Controller and V4L2 Subdevice APIs
========================================

The term `Media Controller` usually covers two distinct APIs:

- The Media Controller (MC) API itself, whose task it is to expose the internal
  topology of the device to applications.

- The V4L2 subdevice userspace API, which exposes low-level control of
  individual subdevices to applications.

Collectively, and in collaboration with the V4L2 API, these offer the features
needed by applications to control complex video capture devices.


The Media Controller API
------------------------

.. _media-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/media-ctl

The Media Controller kernel framework and userspace API model devices as a
directed acyclic graph of `entities`. Each entity represents a hardware block,
which can be an external on-board component, an IP core in the SoC, or a piece
of either of those. The API doesn't precisely define how a device should be
split into entities. Individual drivers decide on the exact model they want to
expose, balancing fine-grained control of the hardware blocks against the
complexity of a large number of entities.

Entities include `pads`, which model input and output ports through which
entities receive or produce data. Data inputs are called `sinks`, and data
outputs `sources`. The data flow through the graph is modelled by `links` that
connect sources to sinks. Each link connects one source pad of an entity to a
sink pad of another entity. Cycles in the graph are not allowed.

.. note::

    The Media Controller API is not limited to video capture devices and has
    been designed to model any type of data flow in a media device. This
    includes, for instance, audio and display devices. However, as of version
    6.0 of the kernel, the API is only used in the Linux media subsystem, by
    V4L2 and DVB drivers, and hasn't made its way to the ALSA and DRM/KMS
    subsystems.

When used in a V4L2 driver, an entity models either a video device (``struct
video_device``) or a subdevice (``struct v4l2_subdevice``). For video capture
devices, subdevices represent video sources (camera sensors, input connectors,
...) or processing elements, and video devices represent the connection to
system memory at the end of a pipeline (typically a DMA engine, but it can also
be a USB connection for USB webcams). The entity type is exposed to
applications as an entity `function`, for instance

- ``MEDIA_ENT_F_CAM_SENSOR`` for a camera sensor
- ``MEDIA_ENT_F_PROC_VIDEO_SCALER`` for a video scaler
- ``MEDIA_ENT_F_IO_V4L`` for a connection to system memory through a V4L2 video
  device

The kernel media controller device (``struct media_device``) is exposed to
userspace through a media device node, typically named ``/dev/media[0-9]+``.
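The device topology can be queried programmatically with the
``MEDIA_IOC_G_TOPOLOGY`` ioctl, which is typically called twice: once with all
object pointers left null to retrieve the number of graph objects, and a second
time with suitably sized buffers to retrieve the objects themselves. The
following minimal sketch lists the entities and their functions; error handling
is abbreviated, and the ``/dev/media0`` path is just an example.

.. code-block:: c

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>

    #include <linux/media.h>

    int main(void)
    {
        struct media_v2_topology topo = { 0 };
        struct media_v2_entity *ents;
        unsigned int i;
        int fd;

        fd = open("/dev/media0", O_RDWR);
        if (fd < 0)
            return 1;

        /* First call, with all object pointers left to zero, only fills
         * the number of objects of each type. */
        if (ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0)
            return 1;

        ents = calloc(topo.num_entities, sizeof(*ents));
        topo.ptr_entities = (uintptr_t)ents;

        /* Second call fills the entity array. */
        if (ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0)
            return 1;

        for (i = 0; i < topo.num_entities; i++)
            printf("entity %u: %s (function 0x%08x)\n",
                   ents[i].id, ents[i].name, ents[i].function);

        return 0;
    }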
The `media-ctl`_ tool can query the topology of a Media Controller device and
display it in either plain text (``--print-topology`` or ``-p``) or `DOT format
<https://graphviz.org/doc/info/lang.html>`_ (``--print-dot``).

.. code-block:: sh

    $ media-ctl -d /dev/media0 --print-dot | dot -Tsvg > omap3isp.svg

The following graph represents the TI OMAP3 ISP, with entities corresponding to
subdevices in green and entities corresponding to video devices in yellow.

.. graphviz:: omap3isp.dot
    :caption: Media graph of the TI OMAP3 ISP

The ``mt9p031 2-0048`` entity on the top row is a camera sensor; all other
entities are internal to the OMAP3 SoC and part of the ISP.

Entities, their pads, and the links are intrinsic properties of the device.
They are created by the driver at initialization time to model the hardware
topology. Unless parts of the device are hot-pluggable, no entities or links
are created or removed after initialization. Only their properties can be
modified by applications.

Data flow routing is controlled by enabling or disabling links, using the
``MEDIA_LNK_FL_ENABLED`` link flag. Links that model immutable connections at
the hardware level are displayed as a thick plain line in the media graph. They
have the ``MEDIA_LNK_FL_IMMUTABLE`` and ``MEDIA_LNK_FL_ENABLED`` flags set and
can't be modified. Links that model configurable routing options can be
controlled, and are displayed as a dotted line if they are disabled or as a
thin plain line if they are enabled.
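The state of a configurable link is changed with the ``MEDIA_IOC_SETUP_LINK``
ioctl, which takes a ``struct media_link_desc`` identifying the source and sink
pads. The sketch below wraps it in a hypothetical helper; the entity IDs and
pad indexes are placeholders that a real application would obtain by
enumerating the graph first (for instance with ``MEDIA_IOC_G_TOPOLOGY`` as
shown earlier).

.. code-block:: c

    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/media.h>

    /* Hypothetical helper: enable (flags = MEDIA_LNK_FL_ENABLED) or
     * disable (flags = 0) the link between the given source and sink pads. */
    static int setup_link(int media_fd, __u32 src_entity, __u16 src_pad,
                          __u32 sink_entity, __u16 sink_pad, __u32 flags)
    {
        struct media_link_desc link;

        memset(&link, 0, sizeof(link));
        link.source.entity = src_entity;
        link.source.index = src_pad;
        link.sink.entity = sink_entity;
        link.sink.index = sink_pad;
        link.flags = flags;

        return ioctl(media_fd, MEDIA_IOC_SETUP_LINK, &link);
    }

The kernel rejects attempts to change the state of links flagged as immutable.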
As the Media Controller API can model any type of data flow, it doesn't expose
any property specific to a particular device type, such as, for instance, pixel
formats or frame rates. This is left to other, device-specific APIs.


The V4L2 Subdevice Userspace API
--------------------------------

.. _V4L2 Subdevice Userspace API: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/dev-subdev.html
.. _V4L2 controls ioctls: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/vidioc-g-ext-ctrls.html
.. _v4l2-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/v4l2-ctl

The `V4L2 Subdevice Userspace API`_ (often shortened to just V4L2 Subdevice API
when this doesn't cause any ambiguity with the in-kernel V4L2 subdevice
operations) has been developed alongside the Media Controller API to expose to
applications the properties of entities corresponding to V4L2 subdevices. It
allows accessing V4L2 controls directly on a subdevice, as well as formats and
selection rectangles on the subdevice pads.

Subdevices are exposed to userspace through V4L2 subdevice nodes, typically
named ``/dev/v4l-subdev[0-9]+``. They are controlled using ioctls, in a similar
fashion to V4L2 video devices. The `v4l2-ctl`_ tool supports a wide range of
subdevice-specific options to access subdevices from the command line (see
``v4l2-ctl --help-subdev`` for a detailed list).

The rest of this document will use the NXP i.MX8MP ISP as an example. Its media
graph is as follows:

.. graphviz:: imx8mp-isp.dot
    :caption: Media graph of the NXP i.MX8MP

It contains the following V4L2 subdevices:

- A raw camera sensor (``imx290 2-001a``), with a single source pad connected
  to the SoC through a MIPI CSI-2 link.
- A MIPI CSI-2 receiver (``csis-32e40000.csi``), internal to the SoC, that
  receives data from the sensor on its sink pad and provides it to the ISP on
  its source pad.
- An ISP (``rkisp1_isp``), with two sink pads that receive image data and
  processing parameters (0 and 1 respectively) and two source pads that output
  image data and statistics (2 and 3 respectively).
- A scaler (``rkisp1_resizer_mainpath``) that can scale the frames up or down.

It also contains the following video devices:

- A capture device that writes video frames to memory (``rkisp1_mainpath``).
- A capture device that writes statistics to memory (``rkisp1_stats``).
- An output device that reads ISP parameters from memory (``rkisp1_params``).


V4L2 Subdevice Controls
~~~~~~~~~~~~~~~~~~~~~~~

Subdevice controls are accessed using the `V4L2 controls ioctls`_ in exactly
the same way as for video devices, except that the ioctls should be issued on
the subdevice node. Tools that access controls on video devices can usually be
used unmodified on subdevices. For instance, to list the controls supported by
the IMX290 camera sensor subdevice,

.. code-block:: none

    $ v4l2-ctl -d /dev/v4l-subdev3 -l

    User Controls

                          exposure 0x00980911 (int)    : min=1 max=1123 step=1 default=1123 value=1123

    Camera Controls

                camera_orientation 0x009a0922 (menu)   : min=0 max=2 default=0 value=0 (Front) flags=read-only
            camera_sensor_rotation 0x009a0923 (int)    : min=0 max=0 step=1 default=0 value=0 flags=read-only

    Image Source Controls

                 vertical_blanking 0x009e0901 (int)    : min=45 max=45 step=1 default=45 value=45 flags=read-only
               horizontal_blanking 0x009e0902 (int)    : min=280 max=280 step=1 default=280 value=280 flags=read-only
                     analogue_gain 0x009e0903 (int)    : min=0 max=240 step=1 default=0 value=0

    Image Processing Controls

                    link_frequency 0x009f0901 (intmenu): min=0 max=1 default=0 value=0 (222750000 0xd46e530) flags=read-only
                        pixel_rate 0x009f0902 (int64)  : min=1 max=2147483647 step=1 default=178200000 value=178200000 flags=read-only
                      test_pattern 0x009f0903 (menu)   : min=0 max=7 default=0 value=0 (Disabled)

By accessing controls on subdevices, applications can control the behaviour of
each subdevice independently. If multiple subdevices in the graph implement the
same control (such as a digital gain), those controls can be set individually.
This wouldn't be possible with the traditional V4L2 API on video devices, as
the identical controls from two different subdevices would conflict.
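The same extended control ioctls used on video device nodes apply unchanged.
As an illustration, the following minimal sketch sets the ``analogue_gain``
control from the listing above, assuming the sensor is still reachable at
``/dev/v4l-subdev3``.

.. code-block:: c

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/videodev2.h>

    int main(void)
    {
        struct v4l2_ext_control ctrl;
        struct v4l2_ext_controls ctrls;
        int fd;

        fd = open("/dev/v4l-subdev3", O_RDWR);
        if (fd < 0)
            return 1;

        memset(&ctrl, 0, sizeof(ctrl));
        ctrl.id = V4L2_CID_ANALOGUE_GAIN;
        ctrl.value = 100;

        memset(&ctrls, 0, sizeof(ctrls));
        ctrls.which = V4L2_CTRL_WHICH_CUR_VAL;
        ctrls.count = 1;
        ctrls.controls = &ctrl;

        /* Same ioctl as on video device nodes, issued on the subdevice. */
        return ioctl(fd, VIDIOC_S_EXT_CTRLS, &ctrls);
    }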
.. _v4l2-subdevice-formats:

V4L2 Subdevice Formats
~~~~~~~~~~~~~~~~~~~~~~

Where video devices expose only the format of the frames being captured to
memory, subdevices allow fine-grained configuration of formats on every pad in
the pipeline. This enables setting up pipelines with different internal
configurations to match precise use cases. To understand why this is needed,
let's consider the following simplified example, where a 12MP camera sensor
(IMX477) is connected to an SoC that includes an ISP and a scaler.

.. graphviz:: scaler.dot
    :caption: Scaling pipeline

All three components can affect the image size:

- The camera sensor can subsample the image through mechanisms such as binning
  and skipping.
- The ISP can subsample the image horizontally through averaging.
- The scaler uses a polyphase filter for high quality scaling.

All these components can further crop the image if desired.

Different use cases will call for cropping and resizing the image in different
ways through the pipeline. Let's assume that, in all cases, we want to capture
1.5MP images from the 12MP native sensor resolution. When frame rate is more
important than quality, the sensor will typically subsample the image to comply
with the bandwidth limitations of the ISP. As the subsampling factor is
restricted to powers of two, the scaler is further used to achieve the exact
desired size.

.. graphviz:: scaler-fast.dot
    :caption: Fast scaling

On the other hand, when capturing still images, the full image should be
processed through the pipeline and resized at the very end using the higher
quality scaler.

.. graphviz:: scaler-hq.dot
    :caption: High quality scaling

Using the traditional V4L2 API on video nodes, the bridge driver configures the
internal pipeline based on the desired capture format. As the use cases above
produce the same format at the output of the pipeline, the bridge driver won't
be able to differentiate between them and configure the pipeline appropriately
for each use case. To solve this problem, the V4L2 subdevice userspace API lets
applications access formats on pads directly.

Formats on subdevice pads are called `media bus formats`. They are described by
the ``v4l2_mbus_framefmt`` structure:

.. code-block:: c

    struct v4l2_mbus_framefmt {
        __u32 width;
        __u32 height;
        __u32 code;
        __u32 field;
        __u32 colorspace;
        union {
            __u16 ycbcr_enc;
            __u16 hsv_enc;
        };
        __u16 quantization;
        __u16 xfer_func;
        __u16 flags;
        __u16 reserved[10];
    };

Unlike the pixel formats used on video devices, which describe how image data
is stored in memory (using the ``v4l2_pix_format`` and
``v4l2_pix_format_mplane`` structures), media bus formats describe how image
data is transmitted on buses between subdevices. The ``bytesperline`` and
``sizeimage`` fields of the pixel format are thus not found in the media bus
formats, as they refer to memory sizes.

This difference between the two concepts causes a second difference between the
media bus and pixel format structures. The FourCC values used to describe
pixel formats are not applicable to bus formats, as they also describe data
organization in memory. Media bus formats instead use `format codes` that
describe how individual bits are organized and transferred on a bus. The format
codes are 32-bit numerical values defined by the ``MEDIA_BUS_FMT_*`` macros and
are documented in the `Media Bus Formats`_ section of the V4L2 API
documentation.

.. _Media Bus Formats: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/subdev-formats.html

.. note::

    In the remainder of this document, the terms `media bus format`, `bus
    format` or `format`, when applied to subdevice pads, refer to the
    combination of all fields of the ``v4l2_mbus_framefmt`` structure. To refer
    to the media bus format code specifically, the terms `media bus code`,
    `format code` or `code` will be used.

In general, there is no 1:1 universal mapping between pixel formats and media
bus formats. To understand this, let's consider the
``MEDIA_BUS_FMT_UYVY8_1X16`` media bus code that describes one common way to
transmit YUV 4:2:2 data on a 16-bit parallel bus. When the image data reaches
the DMA engine at the end of the pipeline and is written to memory, it can be
rearranged in different ways, producing for instance the ``V4L2_PIX_FMT_UYVY``
packed pixel format that seems to be a direct match, but also the semi-planar
``V4L2_PIX_FMT_NV16`` format by writing the luma and chroma data to separate
memory planes. How a media bus code is translated to pixel formats depends on
the capabilities of the DMA engine, and is thus device-specific.
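Putting the pieces together, a format is set on a subdevice pad with the
``VIDIOC_SUBDEV_S_FMT`` ioctl and a ``struct v4l2_subdev_format`` that wraps
the ``v4l2_mbus_framefmt`` structure shown earlier. The minimal sketch below
requests ``MEDIA_BUS_FMT_UYVY8_1X16`` at 1920x1080 on pad 0 of a hypothetical
subdevice node; as with video devices, the driver may adjust the requested
format and returns the effective values in the same structure.

.. code-block:: c

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/media-bus-format.h>
    #include <linux/v4l2-subdev.h>

    int main(void)
    {
        struct v4l2_subdev_format fmt;
        int fd;

        fd = open("/dev/v4l-subdev0", O_RDWR);
        if (fd < 0)
            return 1;

        memset(&fmt, 0, sizeof(fmt));
        fmt.which = V4L2_SUBDEV_FORMAT_ACTIVE;
        fmt.pad = 0;
        fmt.format.width = 1920;
        fmt.format.height = 1080;
        fmt.format.code = MEDIA_BUS_FMT_UYVY8_1X16;
        fmt.format.field = V4L2_FIELD_NONE;

        /* The driver may adjust the format; the effective values are
         * returned in the same structure. */
        if (ioctl(fd, VIDIOC_SUBDEV_S_FMT, &fmt) < 0)
            return 1;

        return 0;
    }

Configuring a complete pipeline repeats this operation on every pad, from the
sensor down to the video device, which is how configurations such as the fast
and high quality scaling use cases above are expressed explicitly by the
application.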