.. SPDX-License-Identifier: CC-BY-SA-4.0

Media Controller and V4L2 Subdevice APIs
========================================

The term `Media Controller` usually covers two distinct APIs:

- The Media Controller (MC) API itself, whose task it is to expose the internal
  topology of the device to applications.

- The V4L2 subdevice userspace API, which exposes low-level control of
  individual subdevices to applications.

Collectively, and in collaboration with the V4L2 API, these offer the features
needed by applications to control complex video capture devices.


The Media Controller API
------------------------

.. _media-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/media-ctl

The Media Controller kernel framework and userspace API model devices as a
directed acyclic graph of `entities`. Each entity represents a hardware block,
which can be an external on-board component, an IP core in the SoC, or a piece
of either of those. The API doesn't precisely define how a device should be
split into entities. Individual drivers decide on the exact model they want to
expose, to allow fine-grained control of the hardware blocks while minimizing
the number of entities to avoid unnecessary complexity.

Entities include `pads`, which model input and output ports through which
entities receive or produce data. Data inputs are called `sinks`, and data
outputs `sources`. The data flow through the graph is modelled by `links` that
connect sources to sinks. Each link connects one source pad of an entity to a
sink pad of another entity. Cycles in the graph are not allowed.

.. note::

   The Media Controller API is not limited to video capture devices and has
   been designed to model any type of data flow in a media device. This
   includes, for instance, audio and display devices.
   However, as of version 6.0 of the kernel, the API is only used in the Linux
   media subsystem, by V4L2 and DVB drivers, and hasn't made its way to the
   ALSA and DRM/KMS subsystems.

When used in a V4L2 driver, an entity models either a video device (``struct
video_device``) or a subdevice (``struct v4l2_subdevice``). For video capture
devices, subdevices represent video sources (camera sensors, input connectors,
...) or processing elements, and video devices represent the connection to
system memory at the end of a pipeline (typically a DMA engine, but it can
also be a USB connection for USB webcams). The entity type is exposed to
applications as an entity `function`, for instance:

- ``MEDIA_ENT_F_CAM_SENSOR`` for a camera sensor
- ``MEDIA_ENT_F_PROC_VIDEO_SCALER`` for a video scaler
- ``MEDIA_ENT_F_IO_V4L`` for a connection to system memory through a V4L2
  video device

The kernel media controller device (``struct media_device``) is exposed to
userspace through a media device node, typically named ``/dev/media[0-9]+``.
The `media-ctl`_ tool can query the topology of a Media Controller device and
display it in either plain text (``--print-topology`` or ``-p``) or `DOT
format <https://graphviz.org/doc/info/lang.html>`_ (``--print-dot``).

.. code-block:: sh

   $ media-ctl -d /dev/media0 --print-dot | dot -Tsvg > omap3isp.svg

The following graph represents the TI OMAP3 ISP, with entities corresponding
to subdevices in green and entities corresponding to video devices in yellow.

.. graphviz:: omap3isp.dot
   :caption: Media graph of the TI OMAP3 ISP

The ``mt9p031 2-0048`` entity on the top row is a camera sensor; all other
entities are internal to the OMAP3 SoC and part of the ISP.

Entities, their pads, and the links between them are intrinsic properties of
the device. They are created by the driver at initialization time to model the
hardware topology.
Unless parts of the device are hot-pluggable, no entities or links are created
or removed after initialization. Only their properties can be modified by
applications.

Data flow routing is controlled by enabling or disabling links, using the
``MEDIA_LNK_FL_ENABLED`` link flag. Links that model immutable connections at
the hardware level are displayed as a thick plain line in the media graph.
They have the ``MEDIA_LNK_FL_IMMUTABLE`` and ``MEDIA_LNK_FL_ENABLED`` flags
set and can't be modified. Links that model configurable routing options can
be controlled, and are displayed as a dotted line if they are disabled or as a
thin plain line if they are enabled.

As the Media Controller API can model any type of data flow, it doesn't expose
any property specific to a particular device type, such as, for instance,
pixel formats or frame rates. This is left to other, device-specific APIs.


The V4L2 Subdevice Userspace API
--------------------------------

.. _V4L2 Subdevice Userspace API: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/dev-subdev.html
.. _V4L2 controls ioctls: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/vidioc-g-ext-ctrls.html
.. _v4l2-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/v4l2-ctl

The `V4L2 Subdevice Userspace API`_ (often shortened to just V4L2 Subdevice
API when this doesn't cause any ambiguity with the in-kernel V4L2 subdevice
operations) has been developed alongside the Media Controller API to expose to
applications the properties of entities corresponding to V4L2 subdevices. It
allows accessing V4L2 controls directly on a subdevice, as well as formats and
selection rectangles on the subdevice pads.

Subdevices are exposed to userspace through V4L2 subdevice nodes, typically
named ``/dev/v4l-subdev[0-9]+``. They are controlled using ioctls in a similar
fashion to V4L2 video devices.
The `v4l2-ctl`_ tool supports
a wide range of subdevice-specific options to access subdevices from the
command line (see ``v4l2-ctl --help-subdev`` for a detailed list).

The rest of this document will use the NXP i.MX8MP ISP as an example. Its
media graph is as follows:

.. graphviz:: imx8mp-isp.dot
   :caption: Media graph of the NXP i.MX8MP

It contains the following V4L2 subdevices:

- A raw camera sensor (``imx290 2-001a``), with a single source pad connected
  to the SoC through a MIPI CSI-2 link.
- A MIPI CSI-2 receiver (``csis-32e40000.csi``), internal to the SoC, that
  receives data from the sensor on its sink pad and provides it to the ISP on
  its source pad.
- An ISP (``rkisp1_isp``), with two sink pads that receive image data and
  processing parameters (0 and 1 respectively) and two source pads that output
  image data and statistics (2 and 3 respectively).
- A scaler (``rkisp1_resizer_mainpath``) that can scale the frames up or down.

It also contains the following video devices:

- A capture device that writes video frames to memory (``rkisp1_mainpath``).
- A capture device that writes statistics to memory (``rkisp1_stats``).
- An output device that reads ISP parameters from memory (``rkisp1_params``).


V4L2 Subdevice Controls
~~~~~~~~~~~~~~~~~~~~~~~

Subdevice controls are accessed using the `V4L2 controls ioctls`_ in exactly
the same way as for video devices, except that the ioctls should be issued on
the subdevice node. Tools that access controls on video devices can usually
be used unmodified on subdevices. For instance, to list the controls supported
by the IMX290 camera sensor subdevice:

.. code-block:: none

   $ v4l2-ctl -d /dev/v4l-subdev3 -l

   User Controls

                          exposure 0x00980911 (int)    : min=1 max=1123 step=1 default=1123 value=1123

   Camera Controls

                camera_orientation 0x009a0922 (menu)   : min=0 max=2 default=0 value=0 (Front) flags=read-only
            camera_sensor_rotation 0x009a0923 (int)    : min=0 max=0 step=1 default=0 value=0 flags=read-only

   Image Source Controls

                 vertical_blanking 0x009e0901 (int)    : min=45 max=45 step=1 default=45 value=45 flags=read-only
               horizontal_blanking 0x009e0902 (int)    : min=280 max=280 step=1 default=280 value=280 flags=read-only
                     analogue_gain 0x009e0903 (int)    : min=0 max=240 step=1 default=0 value=0

   Image Processing Controls

                    link_frequency 0x009f0901 (intmenu): min=0 max=1 default=0 value=0 (222750000 0xd46e530) flags=read-only
                        pixel_rate 0x009f0902 (int64)  : min=1 max=2147483647 step=1 default=178200000 value=178200000 flags=read-only
                      test_pattern 0x009f0903 (menu)   : min=0 max=7 default=0 value=0 (Disabled)

By accessing controls on subdevices, applications can control the behaviour of
each subdevice independently. If multiple subdevices in the graph implement
the same control (such as a digital gain), those controls can be set
individually. This wouldn't be possible using the traditional V4L2 API on
video devices, as the identical controls from two different subdevices would
conflict.


.. _v4l2-subdevice-formats:

V4L2 Subdevice Formats
~~~~~~~~~~~~~~~~~~~~~~

Where video devices expose only the format of the frames being captured to
memory, subdevices allow fine-grained configuration of formats on every pad in
the pipeline. This enables setting up pipelines with different internal
configurations to match precise use cases.
To understand why this is needed,
let's consider the following simplified example, where a 12MP camera sensor
(IMX477) is connected to an SoC that includes an ISP and a scaler.

.. graphviz:: scaler.dot
   :caption: Scaling pipeline

All three components can affect the image size:

- The camera sensor can subsample the image through mechanisms such as binning
  and skipping.
- The ISP can subsample the image horizontally through averaging.
- The scaler uses a polyphase filter for high quality scaling.

All these components can further crop the image if desired.

Different use cases will call for cropping and resizing the image in different
ways through the pipeline. Let's assume that, in all cases, we want to capture
1.5MP images from the 12MP native sensor resolution. When frame rate is more
important than quality, the sensor will typically subsample the image to
comply with the bandwidth limitations of the ISP. As the subsampling factor is
restricted to powers of two, the scaler is further used to achieve the exact
desired size.

.. graphviz:: scaler-fast.dot
   :caption: Fast scaling

On the other hand, when capturing still images, the full image should be
processed through the pipeline and resized at the very end using the higher
quality scaler.

.. graphviz:: scaler-hq.dot
   :caption: High quality scaling

Using the traditional V4L2 API on video nodes, the bridge driver configures
the internal pipeline based on the desired capture format. As the use cases
above produce the same format at the output of the pipeline, the bridge driver
won't be able to differentiate between them and configure the pipeline
appropriately for each use case. To solve this problem, the V4L2 subdevice
userspace API lets applications access formats on pads directly.

Formats on subdevice pads are called `media bus formats`. They are described
by the ``v4l2_mbus_framefmt`` structure:

.. code-block:: c

   struct v4l2_mbus_framefmt {
   	__u32			width;
   	__u32			height;
   	__u32			code;
   	__u32			field;
   	__u32			colorspace;
   	union {
   		__u16			ycbcr_enc;
   		__u16			hsv_enc;
   	};
   	__u16			quantization;
   	__u16			xfer_func;
   	__u16			flags;
   	__u16			reserved[10];
   };

Unlike the pixel formats used on video devices, which describe how image data
is stored in memory (using the ``v4l2_pix_format`` and
``v4l2_pix_format_mplane`` structures), media bus formats describe how image
data is transmitted on buses between subdevices. The ``bytesperline`` and
``sizeimage`` fields of the pixel format are thus not found in the media bus
formats, as they refer to memory sizes.

This difference between the two concepts causes a second difference between
the media bus and pixel format structures. The FourCC values used to describe
pixel formats are not applicable to bus formats, as they also describe data
organization in memory. Media bus formats instead use `format codes` that
describe how individual bits are organized and transferred on a bus. The
format codes are 32-bit numerical values defined by the ``MEDIA_BUS_FMT_*``
macros and are documented in the `Media Bus Formats`_ section of the V4L2 API
documentation.

.. _Media Bus Formats: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/subdev-formats.html

.. note::

   In the remainder of this document, the terms `media bus format`, `bus
   format` or `format`, when applied to subdevice pads, refer to the
   combination of all fields of the ``v4l2_mbus_framefmt`` structure. To refer
   to the media bus format code specifically, the terms `media bus code`,
   `format code` or `code` will be used.

In general, there is no 1:1 universal mapping between pixel formats and media
bus formats.
To understand this, let's consider the
``MEDIA_BUS_FMT_UYVY8_1X16`` media bus code, which describes one common way to
transmit YUV 4:2:2 data on a 16-bit parallel bus. When the image data reaches
the DMA engine at the end of the pipeline and is written to memory, it can be
rearranged in different ways, producing for instance the ``V4L2_PIX_FMT_UYVY``
packed pixel format that seems to be a direct match, but also the semi-planar
``V4L2_PIX_FMT_NV16`` format, by writing the luma and chroma data to separate
memory planes. How a media bus code is translated to pixel formats depends on
the capabilities of the DMA engine, and is thus device-specific.
