.. SPDX-License-Identifier: CC-BY-SA-4.0

Media Controller and V4L2 Subdevice APIs
========================================

The term `Media Controller` usually covers two distinct APIs:

- The Media Controller (MC) API itself, whose task it is to expose the internal
  topology of the device to applications.

- The V4L2 subdevice userspace API, which exposes low-level control of
  individual subdevices to applications.

Collectively, and in collaboration with the V4L2 API, these offer the features
needed by applications to control complex video capture devices.


The Media Controller API
------------------------

.. _media-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/media-ctl

The Media Controller kernel framework and userspace API model devices as a
directed acyclic graph of `entities`. Each entity represents a hardware block,
which can be an external on-board component, an IP core in the SoC, or a piece
of either of those. The API doesn't precisely define how a device should be
split into entities. Individual drivers decide on the exact model they want to
expose, balancing fine-grained control of the hardware blocks against the
complexity of a large number of entities.

Entities include `pads`, which model input and output ports through which
entities receive or produce data. Data inputs are called `sinks`, and data
outputs `sources`. The data flow through the graph is modelled by `links` that
connect sources to sinks. Each link connects one source pad of an entity to a
sink pad of another entity. Cycles in the graph are not allowed.

.. note::

    The Media Controller API is not limited to video capture devices and has
    been designed to model any type of data flow in a media device. This
    includes, for instance, audio and display devices. However, as of version
    6.0 of the kernel, the API is only used in the Linux media subsystem, by
    V4L2 and DVB drivers, and hasn't made its way to the ALSA and DRM/KMS
    subsystems.

When used in a V4L2 driver, an entity models either a video device (``struct
video_device``) or a subdevice (``struct v4l2_subdevice``). For video capture
devices, subdevices represent video sources (camera sensors, input connectors,
...) or processing elements, and video devices represent the connection to
system memory at the end of a pipeline (typically a DMA engine, but it can also
be a USB connection for USB webcams). The entity type is exposed to
applications as an entity `function`, for instance

- ``MEDIA_ENT_F_CAM_SENSOR`` for a camera sensor
- ``MEDIA_ENT_F_PROC_VIDEO_SCALER`` for a video scaler
- ``MEDIA_ENT_F_IO_V4L`` for a connection to system memory through a V4L2 video
  device

The kernel media controller device (``struct media_device``) is exposed to
userspace through a media device node, typically named ``/dev/media[0-9]+``.
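The device topology can be queried programmatically with the
``MEDIA_IOC_G_TOPOLOGY`` ioctl, which is typically called twice: once with all
object pointers left null to retrieve the number of graph objects, and a second
time with suitably sized buffers to retrieve the objects themselves. The
following minimal sketch lists the entities and their functions; error handling
is abbreviated, and the ``/dev/media0`` path is just an example.

.. code-block:: c

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>

    #include <linux/media.h>

    int main(void)
    {
        struct media_v2_topology topo = { 0 };
        struct media_v2_entity *ents;
        unsigned int i;
        int fd;

        fd = open("/dev/media0", O_RDWR);
        if (fd < 0)
            return 1;

        /* First call, with all object pointers left to zero, only fills
         * the number of objects of each type. */
        if (ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0)
            return 1;

        ents = calloc(topo.num_entities, sizeof(*ents));
        topo.ptr_entities = (uintptr_t)ents;

        /* Second call fills the entity array. */
        if (ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0)
            return 1;

        for (i = 0; i < topo.num_entities; i++)
            printf("entity %u: %s (function 0x%08x)\n",
                   ents[i].id, ents[i].name, ents[i].function);

        return 0;
    }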
The `media-ctl`_ tool can query the topology of a Media Controller device and
display it in either plain text (``--print-topology`` or ``-p``) or `DOT format
<https://graphviz.org/doc/info/lang.html>`_ (``--print-dot``).

.. code-block:: sh

    $ media-ctl -d /dev/media0 --print-dot | dot -Tsvg > omap3isp.svg

The following graph represents the TI OMAP3 ISP, with entities corresponding to
subdevices in green and entities corresponding to video devices in yellow.

.. graphviz:: omap3isp.dot
    :caption: Media graph of the TI OMAP3 ISP

The ``mt9p031 2-0048`` entity on the top row is a camera sensor; all other
entities are internal to the OMAP3 SoC and part of the ISP.

Entities, their pads, and the links are intrinsic properties of the device.
They are created by the driver at initialization time to model the hardware
topology. Unless parts of the device are hot-pluggable, no entities or links
are created or removed after initialization. Only their properties can be
modified by applications.

Data flow routing is controlled by enabling or disabling links, using the
``MEDIA_LNK_FL_ENABLED`` link flag. Links that model immutable connections at
the hardware level are displayed as a thick plain line in the media graph. They
have the ``MEDIA_LNK_FL_IMMUTABLE`` and ``MEDIA_LNK_FL_ENABLED`` flags set and
can't be modified. Links that model configurable routing options can be
controlled, and are displayed as a dotted line if they are disabled or as a
thin plain line if they are enabled.
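The state of a configurable link is changed with the ``MEDIA_IOC_SETUP_LINK``
ioctl, which takes a ``struct media_link_desc`` identifying the source and sink
pads. The sketch below wraps it in a hypothetical helper; the entity IDs and
pad indexes are placeholders that a real application would obtain by
enumerating the graph first (for instance with ``MEDIA_IOC_G_TOPOLOGY`` as
shown earlier).

.. code-block:: c

    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/media.h>

    /* Hypothetical helper: enable (flags = MEDIA_LNK_FL_ENABLED) or
     * disable (flags = 0) the link between the given source and sink pads. */
    static int setup_link(int media_fd, __u32 src_entity, __u16 src_pad,
                          __u32 sink_entity, __u16 sink_pad, __u32 flags)
    {
        struct media_link_desc link;

        memset(&link, 0, sizeof(link));
        link.source.entity = src_entity;
        link.source.index = src_pad;
        link.sink.entity = sink_entity;
        link.sink.index = sink_pad;
        link.flags = flags;

        return ioctl(media_fd, MEDIA_IOC_SETUP_LINK, &link);
    }

The kernel rejects attempts to change the state of links flagged as immutable.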
As the Media Controller API can model any type of data flow, it doesn't expose
any property specific to a particular device type, such as, for instance, pixel
formats or frame rates. This is left to other, device-specific APIs.


The V4L2 Subdevice Userspace API
--------------------------------

.. _V4L2 Subdevice Userspace API: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/dev-subdev.html
.. _V4L2 controls ioctls: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/vidioc-g-ext-ctrls.html
.. _v4l2-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/v4l2-ctl

The `V4L2 Subdevice Userspace API`_ (often shortened to just V4L2 Subdevice API
when this doesn't cause any ambiguity with the in-kernel V4L2 subdevice
operations) has been developed alongside the Media Controller API to expose to
applications the properties of entities corresponding to V4L2 subdevices. It
allows accessing V4L2 controls directly on a subdevice, as well as formats and
selection rectangles on the subdevice pads.

Subdevices are exposed to userspace through V4L2 subdevice nodes, typically
named ``/dev/v4l-subdev[0-9]+``. They are controlled using ioctls, in a similar
fashion to V4L2 video devices. The `v4l2-ctl`_ tool supports a wide range of
subdevice-specific options to access subdevices from the command line (see
``v4l2-ctl --help-subdev`` for a detailed list).

The rest of this document will use the NXP i.MX8MP ISP as an example. Its media
graph is as follows:

.. graphviz:: imx8mp-isp.dot
    :caption: Media graph of the NXP i.MX8MP

It contains the following V4L2 subdevices:

- A raw camera sensor (``imx290 2-001a``), with a single source pad connected
  to the SoC through a MIPI CSI-2 link.
- A MIPI CSI-2 receiver (``csis-32e40000.csi``), internal to the SoC, that
  receives data from the sensor on its sink pad and provides it to the ISP on
  its source pad.
- An ISP (``rkisp1_isp``), with two sink pads that receive image data and
  processing parameters (0 and 1 respectively) and two source pads that output
  image data and statistics (2 and 3 respectively).
- A scaler (``rkisp1_resizer_mainpath``) that can scale the frames up or down.

It also contains the following video devices:

- A capture device that writes video frames to memory (``rkisp1_mainpath``).
- A capture device that writes statistics to memory (``rkisp1_stats``).
- An output device that reads ISP parameters from memory (``rkisp1_params``).


V4L2 Subdevice Controls
~~~~~~~~~~~~~~~~~~~~~~~

Subdevice controls are accessed using the `V4L2 controls ioctls`_ in exactly
the same way as for video devices, except that the ioctls should be issued on
the subdevice node. Tools that access controls on video devices can usually be
used unmodified on subdevices. For instance, to list the controls supported by
the IMX290 camera sensor subdevice,

.. code-block:: none

    $ v4l2-ctl -d /dev/v4l-subdev3 -l

    User Controls

                          exposure 0x00980911 (int)    : min=1 max=1123 step=1 default=1123 value=1123

    Camera Controls

                camera_orientation 0x009a0922 (menu)   : min=0 max=2 default=0 value=0 (Front) flags=read-only
            camera_sensor_rotation 0x009a0923 (int)    : min=0 max=0 step=1 default=0 value=0 flags=read-only

    Image Source Controls

                 vertical_blanking 0x009e0901 (int)    : min=45 max=45 step=1 default=45 value=45 flags=read-only
               horizontal_blanking 0x009e0902 (int)    : min=280 max=280 step=1 default=280 value=280 flags=read-only
                     analogue_gain 0x009e0903 (int)    : min=0 max=240 step=1 default=0 value=0

    Image Processing Controls

                    link_frequency 0x009f0901 (intmenu): min=0 max=1 default=0 value=0 (222750000 0xd46e530) flags=read-only
                        pixel_rate 0x009f0902 (int64)  : min=1 max=2147483647 step=1 default=178200000 value=178200000 flags=read-only
                      test_pattern 0x009f0903 (menu)   : min=0 max=7 default=0 value=0 (Disabled)

By accessing controls on subdevices, applications can control the behaviour of
each subdevice independently. If multiple subdevices in the graph implement the
same control (such as a digital gain), those controls can be set individually.
This wouldn't be possible with the traditional V4L2 API on video devices, as
the identical controls from two different subdevices would conflict.
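The same extended control ioctls used on video device nodes apply unchanged.
As an illustration, the following minimal sketch sets the ``analogue_gain``
control from the listing above, assuming the sensor is still reachable at
``/dev/v4l-subdev3``.

.. code-block:: c

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/videodev2.h>

    int main(void)
    {
        struct v4l2_ext_control ctrl;
        struct v4l2_ext_controls ctrls;
        int fd;

        fd = open("/dev/v4l-subdev3", O_RDWR);
        if (fd < 0)
            return 1;

        memset(&ctrl, 0, sizeof(ctrl));
        ctrl.id = V4L2_CID_ANALOGUE_GAIN;
        ctrl.value = 100;

        memset(&ctrls, 0, sizeof(ctrls));
        ctrls.which = V4L2_CTRL_WHICH_CUR_VAL;
        ctrls.count = 1;
        ctrls.controls = &ctrl;

        /* Same ioctl as on video device nodes, issued on the subdevice. */
        return ioctl(fd, VIDIOC_S_EXT_CTRLS, &ctrls);
    }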
.. _v4l2-subdevice-formats:

V4L2 Subdevice Formats
~~~~~~~~~~~~~~~~~~~~~~

Where video devices expose only the format of the frames being captured to
memory, subdevices allow fine-grained configuration of formats on every pad in
the pipeline. This enables setting up pipelines with different internal
configurations to match precise use cases. To understand why this is needed,
let's consider the following simplified example, where a 12MP camera sensor
(IMX477) is connected to an SoC that includes an ISP and a scaler.

.. graphviz:: scaler.dot
    :caption: Scaling pipeline

All three components can affect the image size:

- The camera sensor can subsample the image through mechanisms such as binning
  and skipping.
- The ISP can subsample the image horizontally through averaging.
- The scaler uses a polyphase filter for high quality scaling.

All these components can further crop the image if desired.

Different use cases will call for cropping and resizing the image in different
ways through the pipeline. Let's assume that, in all cases, we want to capture
1.5MP images from the 12MP native sensor resolution. When frame rate is more
important than quality, the sensor will typically subsample the image to comply
with the bandwidth limitations of the ISP. As the subsampling factor is
restricted to powers of two, the scaler is further used to achieve the exact
desired size.

.. graphviz:: scaler-fast.dot
    :caption: Fast scaling

On the other hand, when capturing still images, the full image should be
processed through the pipeline and resized at the very end using the higher
quality scaler.

.. graphviz:: scaler-hq.dot
    :caption: High quality scaling

Using the traditional V4L2 API on video nodes, the bridge driver configures the
internal pipeline based on the desired capture format. As the use cases above
produce the same format at the output of the pipeline, the bridge driver won't
be able to differentiate between them and configure the pipeline appropriately
for each use case. To solve this problem, the V4L2 subdevice userspace API lets
applications access formats on pads directly.

Formats on subdevice pads are called `media bus formats`. They are described by
the ``v4l2_mbus_framefmt`` structure:

.. code-block:: c

    struct v4l2_mbus_framefmt {
        __u32 width;
        __u32 height;
        __u32 code;
        __u32 field;
        __u32 colorspace;
        union {
            __u16 ycbcr_enc;
            __u16 hsv_enc;
        };
        __u16 quantization;
        __u16 xfer_func;
        __u16 flags;
        __u16 reserved[10];
    };

Unlike the pixel formats used on video devices, which describe how image data
is stored in memory (using the ``v4l2_pix_format`` and
``v4l2_pix_format_mplane`` structures), media bus formats describe how image
data is transmitted on buses between subdevices. The ``bytesperline`` and
``sizeimage`` fields of the pixel format are thus not found in the media bus
formats, as they refer to memory sizes.

This difference between the two concepts causes a second difference between the
media bus and pixel format structures. The FourCC values used to describe
pixel formats are not applicable to bus formats, as they also describe data
organization in memory. Media bus formats instead use `format codes` that
describe how individual bits are organized and transferred on a bus. The format
codes are 32-bit numerical values defined by the ``MEDIA_BUS_FMT_*`` macros and
are documented in the `Media Bus Formats`_ section of the V4L2 API
documentation.

.. _Media Bus Formats: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/subdev-formats.html

.. note::

    In the remainder of this document, the terms `media bus format`, `bus
    format` or `format`, when applied to subdevice pads, refer to the
    combination of all fields of the ``v4l2_mbus_framefmt`` structure. To refer
    to the media bus format code specifically, the terms `media bus code`,
    `format code` or `code` will be used.

In general, there is no 1:1 universal mapping between pixel formats and media
bus formats. To understand this, let's consider the
``MEDIA_BUS_FMT_UYVY8_1X16`` media bus code that describes one common way to
transmit YUV 4:2:2 data on a 16-bit parallel bus. When the image data reaches
the DMA engine at the end of the pipeline and is written to memory, it can be
rearranged in different ways, producing for instance the ``V4L2_PIX_FMT_UYVY``
packed pixel format that seems to be a direct match, but also the semi-planar
``V4L2_PIX_FMT_NV16`` format by writing the luma and chroma data to separate
memory planes. How a media bus code is translated to pixel formats depends on
the capabilities of the DMA engine, and is thus device-specific.
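Putting the pieces together, a format is set on a subdevice pad with the
``VIDIOC_SUBDEV_S_FMT`` ioctl and a ``struct v4l2_subdev_format`` that wraps
the ``v4l2_mbus_framefmt`` structure shown earlier. The minimal sketch below
requests ``MEDIA_BUS_FMT_UYVY8_1X16`` at 1920x1080 on pad 0 of a hypothetical
subdevice node; as with video devices, the driver may adjust the requested
format and returns the effective values in the same structure.

.. code-block:: c

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>

    #include <linux/media-bus-format.h>
    #include <linux/v4l2-subdev.h>

    int main(void)
    {
        struct v4l2_subdev_format fmt;
        int fd;

        fd = open("/dev/v4l-subdev0", O_RDWR);
        if (fd < 0)
            return 1;

        memset(&fmt, 0, sizeof(fmt));
        fmt.which = V4L2_SUBDEV_FORMAT_ACTIVE;
        fmt.pad = 0;
        fmt.format.width = 1920;
        fmt.format.height = 1080;
        fmt.format.code = MEDIA_BUS_FMT_UYVY8_1X16;
        fmt.format.field = V4L2_FIELD_NONE;

        /* The driver may adjust the format; the effective values are
         * returned in the same structure. */
        if (ioctl(fd, VIDIOC_SUBDEV_S_FMT, &fmt) < 0)
            return 1;

        return 0;
    }

Configuring a complete pipeline repeats this operation on every pad, from the
sensor down to the video device, which is how configurations such as the fast
and high quality scaling use cases above are expressed explicitly by the
application.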