Video and Audio Technology: Exploring HDR Vivid Technology Applications

Authors: Li Lin, Lighthouse team, China Mobile Migu

For the Beijing Winter Olympics live broadcast, our exploration centered on three things: a mobile-terminal solution, software decoding and rendering, and a breakthrough application in the live-broadcast scenario. The goal was a more delicate picture texture, a more prominent subject, and more realistic visual effects. The team analyzed this in detail in a previous Lighthouse article; here we take advantage of the May Day holiday to review the exploration process.

1. Understand HDR first

In computer graphics and cinematography, HDR (high dynamic range) refers to a group of techniques that achieve a larger exposure dynamic range (i.e., a greater difference between light and dark) than ordinary digital imaging. The goal is to correctly represent the real-world brightness range, from direct sunlight down to the darkest shadows.

This technology was first applied in photography to make dark scenes brighter and clearer. Later, the Ultra HD Forum listed HDR as one of the technologies the UHD TV standard needs to support. In 2020, Apple announced that the iPhone 12 series supports HDR video, extending HDR to mobile phones. At present, the main HDR technical standards used on TVs and mobile devices include Dolby Vision, HDR10, HDR10+ and HLG (Hybrid Log-Gamma). Compared with traditional standard dynamic range (SDR) images, these standards define a larger brightness range, a wider color gamut, a greater bit depth, different gamma-corrected transfer functions, and new codec methods.

In 2020, China released the HDR Vivid technology group standard, aiming to strengthen the country's core ultra-high-definition video technology standards and rapidly advance the development of the UHD video industry.

2. Key features

Generally speaking, the brightness of HDR video can come closer to the human eye's perception of the physical world, and its color closer to what the eye actually sees in real life. As shown in the figure below, compared with the SDR image on the left, the HDR video retains better detail in both bright and dark regions, so the result looks more realistic and natural.

Compared with traditional standard dynamic range (SDR), HDR Vivid has clear advantages in many technical parameters: bit depth, color gamut, maximum brightness, dynamic metadata and its adjustment, intelligent mapping, and more. At 4K/8K resolution, the peak brightness of HDR Vivid can reach 10,000 nits; based on the wide BT.2020 color gamut, it can represent up to 68.7 billion colors.

Figure 1: Comparison of SDR and HDR effects

1. Color

Color is derived from reflected light or light emitted by an object itself. Does light really have color? As Newton observed, light rays are not, properly speaking, colored; they carry nothing but a certain power and disposition to stir up the sensation of this or that color.

The color we see is a sensation produced when the photoreceptor cells in the retina respond to the information carried by light. As shown in the figure below, light sources that all appear identically white can have very different spectra. Color is therefore the subjective response of the human eye to objective light, and measuring it cannot be reduced to simple color classification.

Figure 2: Daylight, incandescent, and fluorescent light: same color, different spectra

How do humans measure color scientifically? Below, we look at it from two angles: photometry and colorimetry.

2. Photometry

At first, photography and video were black and white: there was no chroma, and imaging was realized mainly by quantizing intensity and brightness. Photometry is the metrology discipline for the visible band that takes the subjective factors of the human eye into account. It was established by Lambert in 1760, who defined the main photometric quantities such as luminous flux, luminous intensity, illuminance, and luminance, used mathematics to clarify the relationships between them, and stated several important laws of photometry, such as the superposition law of illuminance, the inverse-square law, and the cosine law of illuminance. These laws have been in use ever since, and practice has proved them correct.

The most important quantity is luminance: the luminous flux emitted per unit projected area and unit solid angle, measured in nits (cd/m²). Luminance is a measurement that incorporates the perception of the human eye, and it characterizes how bright or dark video content appears.

3. Chromaticity

Later, as technology developed, photography and video came to support the capture and presentation of color, so colorimetry was introduced to put the measurement of color on a scientific footing.

It began with the trichromatic theory. In the 1920s, the academic community carried out a series of color-matching experiments to quantify chromaticity. Based on these experiments, the International Commission on Illumination (CIE) then issued its various conclusions and standards on color.

The direct experimental result (CIE 1931 RGB) contains negative values, which are inconvenient and hard to interpret, so a mathematical transformation is needed. The three real primaries of the RGB system are replaced by the imaginary primaries X, Y, and Z, and after a coordinate transformation we obtain the familiar CIE 1931 xy chromaticity diagram.

Figure 3: Comparison of the CIE 1931 RGB diagram and the CIE 1931 xy horseshoe diagram
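The RGB-to-XYZ transformation just described can be sketched in a few lines. The matrix below is the standard linear BT.709/sRGB-to-XYZ matrix with a D65 white point; the function name is our own:

```python
import numpy as np

# Linear BT.709/sRGB -> CIE 1931 XYZ matrix (D65 white point).
RGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])

def rgb_to_xy(rgb):
    """Map a linear RGB triple to CIE 1931 xy chromaticity coordinates."""
    X, Y, Z = RGB_TO_XYZ @ np.asarray(rgb, dtype=float)
    total = X + Y + Z
    return X / total, Y / total

# Pure white should land on the D65 white point, near (0.3127, 0.3290).
print(rgb_to_xy([1.0, 1.0, 1.0]))
```

Any color a BT.709 display can produce maps to a point inside the triangle formed by the xy coordinates of its three primaries, which is why gamut comparisons are drawn on this diagram.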

This horseshoe is the complete gamut of colors humans can theoretically see. Combined with the luminance metric introduced earlier, it serves as the standard basis for both SDR and HDR.

The color gamut (BT.709) and brightness range (100 nits) of the early SDR standard were relatively small. As technology developed further, SDR could no longer meet the demand for more realistic imagery, and HDR technology emerged as the times required. HDR video has three core features: high dynamic range, wide color gamut, and high bit depth.

4. High dynamic range

Compared with conventional SDR, HDR can display a larger brightness range. For example, if you photograph toward a window from indoors during the day, you will find that exposing the room correctly blows out the window; conversely, exposing the window correctly leaves the interior very dark. This phenomenon is caused by insufficient dynamic range. The diagram below illustrates the brightest and darkest levels the human eye can perceive, together with the brightness ranges defined by HDR and SDR.


Figure 4: Brightness ranges defined by HDR and SDR relative to human perception (source: LiveVideoStack)
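The gap between the two ranges can be quantified in photographic stops, where each stop doubles the luminance. A quick sketch, assuming a 0.1-nit black level for SDR and the PQ limits of 0.0001 to 10,000 nits for HDR:

```python
import math

def stops(max_nits, min_nits):
    """Dynamic range in photographic stops: each stop doubles luminance."""
    return math.log2(max_nits / min_nits)

print(f"SDR (0.1 to 100 nits): {stops(100, 0.1):.1f} stops")
print(f"PQ HDR (0.0001 to 10000 nits): {stops(10000, 0.0001):.1f} stops")
```

Under these assumptions SDR spans roughly 10 stops while PQ-based HDR spans over 26, which is why the window-and-room scene above can be captured without clipping either end.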

5. Wide color gamut

HDR can represent a wider color gamut than current SDR video systems. SDR video uses the BT.709 color space, which covers only 35.9% of the CIE 1931 chromaticity space, while HDR video using BT.2020 covers 75.8%. The image presented on the display is thus closer to the actual physical world.

Figure 5: Comparison of HDR and SDR color volumes

6. High bit depth

Traditional SDR video is represented with an 8-bit depth; the BT.2020 standard raises this to 10 or 12 bits, which makes grayscale transitions smoother and improves the detail of the image, meaning richer color reproduction.
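The effect of bit depth on color count is simple to compute; the 68.7 billion figure quoted earlier corresponds to 12 bits per channel:

```python
# Code values per channel and total representable colors per bit depth.
for bits in (8, 10, 12):
    levels = 2 ** bits      # code values per channel
    colors = levels ** 3    # all R, G, B combinations
    print(f"{bits}-bit: {levels} levels/channel, {colors:,} colors")
```

8 bits yields about 16.7 million colors, 10 bits about 1.07 billion, and 12 bits about 68.7 billion, so each extra pair of bits quarters the step size between adjacent gray levels and suppresses visible banding.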

7. Photoelectric conversion

HDR video typically uses one of two standard formats for optical-electrical conversion, as shown in Figure 6: HLG (developed by the BBC and NHK, standardized as ARIB STD-B67) and PQ (developed by Dolby, standardized as SMPTE ST 2084). Comparing the two, HLG supports output brightness levels from 0.01 to around 5,000 nits and has the advantage of being backward compatible with SDR, while PQ supports output brightness levels from 0.0001 to 10,000 nits.


Figure 6: Comparison of PQ and HLG standard parameters

The opto-electrical transfer curves of SDR, PQ and HLG are compared in Figure 7; gamma 2.4 is the SDR curve. Compared with SDR's curve, PQ and HLG expand nonlinearly in the low-light and highlight regions and offer a higher dynamic range. The various standards provide the industry with solutions for different scenarios, but their coexistence also brings inconvenience to the industry's development.

Figure 7: Comparison of the SDR, PQ and HLG opto-electrical transfer curves
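For reference, both transfer functions can be written in a few lines. This sketch implements the PQ inverse EOTF with the SMPTE ST 2084 constants and the HLG OETF with the ARIB STD-B67 constants; note the two operate on different inputs, absolute nits for PQ and relative scene light for HLG:

```python
import math

# PQ inverse EOTF (SMPTE ST 2084): absolute luminance in nits -> signal in [0, 1].
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_oetf(nits):
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

# HLG OETF (ARIB STD-B67): relative scene light in [0, 1] -> signal in [0, 1].
A, B, C = 0.17883277, 0.28466892, 0.55991073

def hlg_oetf(e):
    # Square-root segment below 1/12, logarithmic segment above it.
    return math.sqrt(3 * e) if e <= 1 / 12 else A * math.log(12 * e - B) + C

print(f"PQ signal at 100 nits (SDR peak):   {pq_oetf(100):.3f}")
print(f"PQ signal at 10000 nits (PQ peak):  {pq_oetf(10000):.3f}")
print(f"HLG signal at full scene light:     {hlg_oetf(1.0):.3f}")
```

A 100-nit SDR peak already consumes roughly half of the PQ code range, which illustrates how PQ concentrates its code values in the darker region where the eye is most sensitive.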

3. Background of standardization

1. Limitations of current standards

Given the international environment, the country attaches great importance to the healthy, secure, and sustainable development of the ultra-high-definition industry, and has introduced a series of policies and measures to support self-controllable UHD technology. At this stage, however, the HDR technology industry faces major challenges in three respects. First, the high patent fees of some technical solutions raise costs across the industry chain; supporting equipment has not reached scale, and the ecosystem is fragmented. Second, multiple HDR standards coexist with poor compatibility between them; they cannot cover the adaptation, certification, and testing of mainstream terminals, so presentation differs significantly from terminal to terminal and users cannot get a consistent visual experience. Third, the traditional HDR production process is complicated: UHD sources using HDR technology are scarce, the supply of high-quality sources is insufficient, and UHD channels are few and their programming short, which suppresses user-side demand for UHD. Against this background, the HDR Vivid technology standard came into being.

Compared with the industry's closed HDR technologies, the HDR Vivid standard is relatively open, carries industrial-security requirements, and is suitable for end-to-end deployment by all parties in the UHD industry ecosystem.

Figure 8: Schematic of HDR Vivid industrial deployment

2. Improvement of HDR Vivid standard

From the perspective of the terminal's display-adaptation pipeline, a video source is first input, dynamic metadata is generated, and then tone mapping and chroma mapping are performed to match the display capabilities of different screens. HDR Vivid therefore rests on three core technical principles: dynamic metadata, tone mapping, and saturation adjustment. Compared with other HDR standards, HDR Vivid has the following three characteristics.

1) Openness and security: The HDR Vivid standard publicly and completely defines the HDR rendering and processing pipeline for UHD video, guaranteeing faithful restoration of high-quality video by design. On top of existing HDR, it adds dynamic metadata that gives the display terminal a more accurate dynamic-range mapping, restoring the original artistic intent of the HDR content as fully as possible. HDR Vivid embeds this dynamic metadata at the production head end, which makes terminal-side processing realizable, controllable, and predictable. Compared with proprietary HDR technologies, this is a more open and universal standard, and for all parties in the industry, its openness and security make it better suited to industrial deployment.

2) Compatibility and scalability: Dynamic metadata is added on top of the HDR video content without modifying the original content. BT.2100 PQ and HLG HDR inputs are supported, and the dynamic metadata can be tuned manually or generated automatically by post-production tools. All current and future mainstream video codecs (H.265/H.266/AVS2/AVS3, etc.) can be used. HDR Vivid defines how HDR display terminals perform tone mapping from the dynamic metadata for display adaptation; for decoding terminals without direct display capability, it defines how to convert formats to adapt to different display devices, remaining backward compatible with existing devices.

3) Industry-leading color expression: the curve used has four piecewise intervals and six interpolation points, comprising a cubic spline for the dark region, a basic curve for normal scenes, and a cubic spline for the bright region. Compared with PQ and HLG, HDR Vivid adjusts dark and bright areas more finely and retains richer detail.
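The normative HDR Vivid curve and its parameters are defined in the standard itself; purely as an illustration of metadata-driven tone mapping, the sketch below fits a piecewise cubic Hermite curve through hypothetical per-frame anchor points. The anchor values and the function name are our own, not taken from the spec:

```python
import bisect

def hermite_tone_map(anchors, luminance):
    """Evaluate a piecewise cubic Hermite curve through metadata anchor
    points (scene_nits, display_nits); tangents from neighboring secants."""
    xs, ys = zip(*anchors)
    n = len(xs)
    # Secant slopes between consecutive anchors.
    d = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(n - 1)]
    # Tangent at each anchor: average of adjacent secants, clamped at the ends.
    m = [d[0]] + [(d[i - 1] + d[i]) / 2 for i in range(1, n - 1)] + [d[-1]]
    L = min(max(luminance, xs[0]), xs[-1])          # clamp to curve domain
    i = min(bisect.bisect_right(xs, L) - 1, n - 2)  # segment index
    h = xs[i + 1] - xs[i]
    t = (L - xs[i]) / h
    # Cubic Hermite basis functions.
    h00 = (1 + 2 * t) * (1 - t) ** 2
    h10 = t * (1 - t) ** 2
    h01 = t * t * (3 - 2 * t)
    h11 = t * t * (t - 1)
    return h00 * ys[i] + h10 * h * m[i] + h01 * ys[i + 1] + h11 * h * m[i + 1]

# Hypothetical per-frame metadata: map a 4000-nit scene onto a 600-nit panel,
# keeping shadows and midtones near 1:1 and rolling off the highlights.
frame_anchors = [(0.0, 0.0), (100.0, 100.0), (1000.0, 400.0), (4000.0, 600.0)]
for nits in (50, 500, 4000):
    print(f"{nits:>5} nits -> {hermite_tone_map(frame_anchors, nits):.1f} nits")
```

Because the anchors carry per-frame statistics, a dark frame and a bright frame get different curves; a static curve, by contrast, would have to compromise across the whole program.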

4. Practice of technology research and development

1. Explore

In response to our own business needs, we implemented HDR Vivid technology end to end and provided HDR Vivid on-demand and live content services for the 2021 European Championship and the 2022 Winter Olympics.

Figure 9: End-to-end implementation scheme

The team mined the technical features of HDR Vivid in the following respects to enhance the realism of content color and bring users a more lifelike, finely detailed viewing experience: the fiery summer passion of the European Championship and the speed and charm of the ice and snow events of the Beijing Winter Olympics.

  1. Frame-level dynamic metadata adjustment, supporting a high dynamic range from 0.05 to 10,000 nits. The mapping curve introduces more piecewise intervals and more complex cubic splines than the PQ curve, giving finer luminance mapping than PQ in dark areas below 100 nits and bright areas above 2,000 nits, and hence finer texture detail.
  2. Special optimization for Migu's sports business: Winter Olympics, football, and basketball scenes. Ice-and-snow highlight scenes receive dynamic processing, and close-up and slow-motion shots use an ROI-based skin-tone optimization algorithm for a more lifelike appearance.
  3. Frame-by-frame color correction: outdoor highlight areas (high brightness) and night scenes (low brightness) receive different processing to lift underexposure, reveal more shadow information, and fill in simulated highlight detail. Temporal information is combined so that more detail can be shown without flickering. For midtone colors, a color-enhancement algorithm uses a light map as an intermediary to balance dynamic-range compression, color enhancement, and color constancy, avoiding distortion, noise, and color blocking when colors are boosted.
  4. With chips not yet widely supporting HDR Vivid, the team was the first to verify an HDR Vivid soft-rendering live broadcast at the Beijing Winter Olympics, providing a new solution and practical experience for HDR Vivid content playback on mobile terminals.

Figure 10: Migu Video HDR Vivid vs. SDR comparison

2. Industry

HDR technology is not a single-point technology but an end-to-end ecosystem: from front-end camera capture through post-production, encoding and decoding, to presentation, every link must support and apply the corresponding HDR standard.

Every link of this ecosystem is supported by many core technology suppliers and manufacturers. Because the technologies behind the various standards are inconsistent, competition among companies is fierce, producing the current situation in which tensions persist between openness and commercial licensing, and between effect and compatibility. See the table below for a comparison of the main standards.

Table 1: Comparison of current mainstream HDR standards

In the chip field, HiSilicon and MediaTek have released chips that support HDR Vivid, and mobile SoCs such as the Kirin 990 and Dimensity 9000 support the HDR Vivid technical standard.

In the terminal field, with chip support in place, TV makers such as Sharp, Skyworth, Xiaomi, Hisense, TCL, Changhong, OPPO, Samsung, and LG added more supporting models in 2022. Mobile makers such as Honor, vivo, OPPO, and Xiaomi have begun planning HDR Vivid-capable product versions. Innovative IPTV/OTT demonstration applications by telecom operators such as China Mobile are driving set-top-box makers such as Jiulian, Skyworth, FiberHome, Chaoge, Inspur, and ZTE to launch products supporting the standard.

In content production, audio-visual platforms such as Migu, Tencent, iQiyi, Huawei, and Youku began rolling out HDR Vivid on-demand and live content services from 2021. We have also made key deployments and many attempts on the content side: for the 2021 European Championship we used HDR Vivid for the first time to broadcast a major international sports event live, and at the 2022 Beijing Winter Olympics we verified the HDR Vivid live broadcast service on mobile through software decoding and rendering. As a main drafter, Migu participated in formulating HDR Vivid-related technical standards, covering full-link production standards for HDR technology and promoting the standardized development of HDR Vivid.

3. Challenge

The HDR Vivid industry still faces certain challenges, and with them, opportunities.

1) The efficiency of HDR Vivid acquisition and production urgently needs improvement. Production efficiency must rise substantially and the production threshold must fall to enable high-quality HDR Vivid content at scale. Ideally, non-linear editing software could generate HDR Vivid dynamic metadata and master files easily and quickly: for example, adding just one processing node to the existing production environment after HDR grading to generate the master file, with fine-tuning available at the same step, would greatly reduce the cumbersome steps of large-scale HDR Vivid production and further advance the industry.

2) Terminals with HDR Vivid display capability are gradually reaching scale. Content service platforms will continue to explore HDR Vivid deployment scenarios, with layouts and exploration for both large and small screens as well as cloud gaming. These new scenarios drive the industrial implementation of HDR Vivid's technical capabilities and promote a diversified supply of HDR Vivid content.

3) The advantages of HDR Vivid technology have yet to be fully exploited. HDR Vivid will face the test of users across many scenarios. Tapping its full potential and truly achieving world-class HDR display rendering will require the joint effort of the entire industry.


Origin blog.csdn.net/weixin_47700780/article/details/124554643