.. SPDX-License-Identifier: CC-BY-SA-4.0

Introduction
============

.. note::

   This document assumes that the reader is familiar with the concepts of the
   traditional `V4L2 API`_, excluding the Media Controller extensions.

.. _V4L2 API: https://linuxtv.org/downloads/v4l-dvb-apis/userspace-api/v4l/v4l2.html

The V4L2 history
----------------

When the original Video4Linux (V4L) API was created in 1999, the video capture
devices available for Linux were mostly analog TV capture cards and early
webcams (the first widespread USB webcam hit the market about a year later).
From the point of view of the operating system, devices provided streams of
frames ready to be consumed by applications, with a small set of high-level
parameters to control the frame size or modify the image brightness and
contrast.

Those devices have shaped the API design. As they are fairly monolithic, in
the sense that they appear to the operating system as a black box with
relatively high-level controls, the V4L API exposed a device to userspace as
one video device node in ``/dev`` with a set of ioctls to handle buffer
management, format selection, stream control and access to parameters. Many
mistakes in the original design were fixed in Video4Linux2 (V4L2), released in
2002. The original V4L API was deprecated in 2006 and removed from the Linux
kernel in 2010.

.. note::

   While the V4L2 API supports both video capture and video output, this
   document mostly focusses on the former.

V4L2 covers a wide range of features for both analog and digital video
devices, including tuner and audio control, and has grown over time to
accommodate more features as video capture devices became more complex. It can
enumerate the device capabilities and parameters (supported video and audio
inputs, formats, frame sizes and frame rates, cropping and composing, analog
video standards and digital video timings, and control parameters), expose
them to applications (with get, try and set access, and a negotiation
mechanism), manage buffers (allocate, queue, dequeue and free them, with the
ability to share buffers with other devices for zero-copy operation through
dmabuf), start and stop video streams, and report various conditions to
applications through an event mechanism.
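As a brief illustration of the enumeration mechanism, the minimal sketch below
(assuming a capture device is available as ``/dev/video0``) queries the device
capabilities with ``VIDIOC_QUERYCAP`` and lists the pixel formats supported
for video capture with ``VIDIOC_ENUM_FMT``.

.. code-block:: c

   #include <fcntl.h>
   #include <stdio.h>
   #include <string.h>
   #include <sys/ioctl.h>
   #include <unistd.h>

   #include <linux/videodev2.h>

   int main(void)
   {
           struct v4l2_capability cap;
           struct v4l2_fmtdesc fmt;
           int fd;

           /* Assumption: the capture device is exposed as /dev/video0. */
           fd = open("/dev/video0", O_RDWR);
           if (fd < 0) {
                   perror("open");
                   return 1;
           }

           /* Query the driver name, card name and capability flags. */
           memset(&cap, 0, sizeof(cap));
           if (ioctl(fd, VIDIOC_QUERYCAP, &cap) < 0) {
                   perror("VIDIOC_QUERYCAP");
                   close(fd);
                   return 1;
           }

           printf("driver '%s', card '%s'\n", cap.driver, cap.card);

           /* Enumerate the pixel formats supported for video capture. */
           memset(&fmt, 0, sizeof(fmt));
           fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

           while (ioctl(fd, VIDIOC_ENUM_FMT, &fmt) == 0) {
                   printf("format %u: %s\n", fmt.index, fmt.description);
                   fmt.index++;
           }

           close(fd);
           return 0;
   }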
The V4L2 API has proven to be extensible (compared to the 51 ioctls present in
version 2.6.12 of the kernel in 2005, 2 have been removed and 33 added as of
version 6.0 in 2022), but it still retains the same monolithic device model as
its predecessor.

Modularity with V4L2 subdevices
-------------------------------

As Linux moved towards the embedded space, video capture devices started
exposing the multiple hardware components they contained (such as camera
sensors, TV tuners, video encoders and decoders, image signal processors, ...)
to the operating system instead of hiding them in a black box. The same camera
sensor or TV tuner could be used on different systems with different SoCs,
calling for a different architecture inside the kernel that would enable code
reuse.

In 2008, the Linux media subsystem gained support for a modular model of video
capture drivers. A new V4L2 subdevice object (``struct v4l2_subdev``) was
created to model external hardware components and expose them to the rest of
the kernel through an abstract API (``struct v4l2_subdev_ops``).

The main driver, also called the bridge driver as it controls the components
that bridge external devices to system memory, still creates the video devices
(``struct video_device``) that are exposed to userspace, but translates the
API calls from applications and delegates them to the appropriate subdevices.
For instance, when an application sets a V4L2 control on the video device, the
bridge driver will locate the subdevice that implements that control and
forward the set control call to it (the sketch at the end of this section
illustrates this from the application's point of view). The bridge driver also
creates a top-level V4L2 device (``struct v4l2_device``) and registers it with
the V4L2 framework core, to bind together the subdevices and video devices
inside the kernel. This new model enabled code reuse and modularity inside the
kernel.

.. figure:: subdev.svg

   Modularity with V4L2 subdevices

The new model only addressed in-kernel issues and kept the monolithic V4L2
userspace API untouched. The relief it brought was short-lived, as development
of the first Linux kernel driver for an image signal processor (the TI OMAP3
ISP) showed a need for lower-level control of device internals from
applications.

An ISP is a complex piece of hardware made of multiple processing blocks.
Those blocks are assembled into image processing pipelines, and in many
devices data routing within pipelines is configurable. Inline pipelines
connect a video source (usually a raw Bayer camera sensor) to the ISP and
process frames on the fly, writing fully processed images to memory. Offline
pipelines first capture raw images to memory and process them in
memory-to-memory mode. Hybrid architectures are also possible, and the same
device may be configurable in different modes depending on the use case. With
different devices having different processing blocks and different routing
options, applications need to control data routing within the device.

Furthermore, similar operations can often be performed in different places in
the pipeline. For instance, both camera sensors and ISPs are able to scale
down images, with the former usually offering lower-quality scaling than the
latter, but with the ability to achieve higher frame rates. Digital gains and
colour gains are also often found in both camera sensors and ISPs. As the
choice of where to apply a given image processing operation depends on the use
case, a bridge driver can't correctly decide how to delegate V4L2 API calls
from applications to the appropriate V4L2 subdevice without hardcoding and
restricting the possible use cases.

The OMAP3 ISP driver reached the limits of the monolithic V4L2 API. Two years
of development were needed to fix this problem and finally merge the Media
Controller API into the kernel at the beginning of 2011.
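From the application's point of view, the delegation performed by the bridge
driver is invisible: a control is still set on the single video device node,
as in the minimal sketch below (assuming a device at ``/dev/video0`` that
exposes the brightness control). The application has no way to tell, or to
choose, whether the adjustment is applied in the camera sensor or in the ISP,
which is precisely the limitation described above.

.. code-block:: c

   #include <fcntl.h>
   #include <stdio.h>
   #include <string.h>
   #include <sys/ioctl.h>
   #include <unistd.h>

   #include <linux/videodev2.h>

   int main(void)
   {
           struct v4l2_control ctrl;
           int fd;

           /* Assumption: the capture device is exposed as /dev/video0. */
           fd = open("/dev/video0", O_RDWR);
           if (fd < 0) {
                   perror("open");
                   return 1;
           }

           /*
            * The control is set on the video device node. The bridge driver
            * forwards it to whichever subdevice implements it; the
            * application cannot select where in the pipeline the operation
            * is performed.
            */
           memset(&ctrl, 0, sizeof(ctrl));
           ctrl.id = V4L2_CID_BRIGHTNESS;
           /*
            * Arbitrary value; a real application would first query the
            * control's range with VIDIOC_QUERYCTRL.
            */
           ctrl.value = 128;

           if (ioctl(fd, VIDIOC_S_CTRL, &ctrl) < 0)
                   perror("VIDIOC_S_CTRL");

           close(fd);
           return 0;
   }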