A high-performance Unity player implementation for mobile devices

Summary:
The audio-visual experience is evolving again. How do you comprehensively upgrade your video application within 24 hours, and how do you build a new generation of on-device playback products?

With the rise of new scenarios such as VR, AR, and the Metaverse, demand for video playback on the Unity platform has been growing. For example, the following two animations are real cases from Baidu: the first is a concert scene, and the second is a lecture hall.

These two cases give a real sense of how blending the virtual and the real changes the traditional audio-visual experience.
Current Unity player solutions fall into two categories:

  1. Unity's built-in VideoPlayer, whose drawback is that it supports only a few formats
  2. Unity player plug-ins from overseas companies, such as AVPro and EasyMovieTexture, which offer rich format support but are expensive, and technical support from overseas companies is also limited.

As a result, you can find many articles online about implementing a Unity player SDK yourself. They basically follow the same idea:
pull the YUV data out of the hardware decoder -> hand it to the Unity context for rendering.

This idea is simple and direct, but its problems are just as obvious:

  1. At high resolutions such as 4K, both the data copy and the data transfer between CPU and GPU cause serious performance loss. Even at low resolutions, there are serious performance issues on some low-end devices.
  2. For Android MediaCodec, extracting YUV data is, strictly speaking, not the standard usage, so some vendors' implementations (such as MTK's) exhibit various problems, leaving developers with well-hidden pitfalls.
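
For reference, the sketch below shows roughly what this YUV-extraction path looks like on Android. It is not taken from any particular SDK; it assumes a MediaCodec decoder configured without an output Surface, so decoded frames land in CPU-accessible buffers that then have to be copied out (plane row/pixel strides are ignored for brevity).

```java
import android.media.Image;
import android.media.MediaCodec;

import java.nio.ByteBuffer;

// Illustrative sketch of the "copy YUV out of the decoder" approach.
public final class YuvExtractionExample {

    // Drains one decoded frame and copies its Y/U/V planes into a byte array
    // that would then be handed to Unity for upload into a texture.
    // This per-frame copy is exactly the CPU cost described in point 1 above.
    public static byte[] drainOneFrame(MediaCodec decoder, MediaCodec.BufferInfo info) {
        int index = decoder.dequeueOutputBuffer(info, 10_000 /* us */);
        if (index < 0) {
            return null; // no frame ready yet, or output format/buffers changed
        }
        Image image = decoder.getOutputImage(index); // flexible YUV_420_888 planes
        byte[] out = null;
        if (image != null) {
            int total = 0;
            for (Image.Plane plane : image.getPlanes()) {
                total += plane.getBuffer().remaining();
            }
            out = new byte[total];
            int offset = 0;
            for (Image.Plane plane : image.getPlanes()) {
                ByteBuffer buf = plane.getBuffer();
                int len = buf.remaining();
                buf.get(out, offset, len); // copy #1: decoder output -> Java heap
                offset += len;
            }
            image.close();
        }
        decoder.releaseOutputBuffer(index, false /* not rendered to a Surface */);
        return out; // copy #2 happens later, when Unity uploads this to a GPU texture
    }
}
```

Each frame is copied once into the Java heap here and copied again when it is uploaded to a GPU texture on the Unity side, which is where the CPU/GPU transfer cost comes from.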

High-performance solution

For these reasons, we looked for a higher-performance solution in our real business scenarios. The approach is as follows:

  1. Let the Unity environment and the player kernel share the rendering context
  2. The player kernel draws the video data onto one or more texture IDs
  3. Unity gets the texture IDs and renders them

This approach avoids copying data between the CPU and GPU and completely eliminates the rendering lag of high-resolution video. In our measurements, CPU usage can be reduced by 50% at 4K resolution.
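
As an illustration, the sketch below shows one common way to realize this on Android (the class and method names here are my own, not the actual SDK's): an OES texture is created in a GL context shared with Unity, a SurfaceTexture/Surface is wrapped around it, and MediaCodec decodes directly onto that texture, so the pixel data never passes through the CPU. The codec feed/drain loop is omitted for brevity.

```java
import android.graphics.SurfaceTexture;
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.opengl.GLES11Ext;
import android.opengl.GLES20;
import android.view.Surface;

import java.io.IOException;

// Illustrative sketch of the shared-context, zero-copy decode path.
public final class SharedTextureDecoder {
    private int oesTextureId;
    private SurfaceTexture surfaceTexture;
    private MediaCodec decoder;

    // Must run with the EGL context (shared with Unity) current, so the
    // texture created here is visible to Unity's renderer.
    public int start(MediaFormat format) throws IOException {
        int[] tex = new int[1];
        GLES20.glGenTextures(1, tex, 0);
        oesTextureId = tex[0];
        GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, oesTextureId);

        surfaceTexture = new SurfaceTexture(oesTextureId);
        Surface surface = new Surface(surfaceTexture);

        decoder = MediaCodec.createDecoderByType(format.getString(MediaFormat.KEY_MIME));
        decoder.configure(format, surface, null, 0); // decode straight onto the texture
        decoder.start();

        // The caller hands this ID to Unity, which can wrap it
        // (e.g. with Texture2D.CreateExternalTexture on the C# side).
        return oesTextureId;
    }

    // Called once per rendered frame, again with the shared context current,
    // to latch the latest decoded frame into the OES texture.
    public void updateFrame() {
        surfaceTexture.updateTexImage();
    }
}
```

Because the decoder writes into a GPU texture that Unity can sample directly, there is no readback and no re-upload, which is what removes the cost described in the previous section.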

Going one step further: a complete 3D experience

In VR, AR, and Metaverse scenarios, the video already offers a 3D experience, so why not go a step further and give the audio a 3D effect as well?

Based on this idea, we introduced a 6DoF panoramic spatial audio solution. The x, y, and z parameters of 6DoF carry displacement information, while yaw, pitch, and roll carry rotation information. In the Unity scene, by setting the appropriate 6DoF parameters according to the spatial positions of the audience and the video picture, the sounds in the scene also gain a sense of space.

In this way, we can bring users a complete visual and auditory 3D experience.
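
To make the 6DoF parameters concrete, the sketch below is illustrative math only (it is not the SDK's actual API): given the listener's displacement (x, y, z) and rotation (yaw, pitch, roll), it expresses a sound source's world position in the listener's local frame, which is the input a spatial audio renderer needs to place the sound in 3D.

```java
// Illustrative 6DoF math: map a world-space sound source into the listener's
// local frame. The rotation convention (yaw about Y, pitch about X, roll about Z)
// is an assumption made for this example.
public final class SpatialAudioPose {

    // Returns {x, y, z} of the source in the listener's coordinate frame.
    public static double[] sourceRelativeToListener(
            double lx, double ly, double lz,         // listener displacement (x, y, z)
            double yaw, double pitch, double roll,   // listener rotation, in radians
            double sx, double sy, double sz) {       // source position in world space
        // Translate so the listener sits at the origin.
        double dx = sx - lx, dy = sy - ly, dz = sz - lz;

        // Undo the listener's orientation by applying the inverse rotations.
        double[] v = rotateY(dx, dy, dz, -yaw);
        v = rotateX(v[0], v[1], v[2], -pitch);
        v = rotateZ(v[0], v[1], v[2], -roll);
        return v;
    }

    private static double[] rotateX(double x, double y, double z, double a) {
        return new double[]{x, y * Math.cos(a) - z * Math.sin(a), y * Math.sin(a) + z * Math.cos(a)};
    }

    private static double[] rotateY(double x, double y, double z, double a) {
        return new double[]{x * Math.cos(a) + z * Math.sin(a), y, -x * Math.sin(a) + z * Math.cos(a)};
    }

    private static double[] rotateZ(double x, double y, double z, double a) {
        return new double[]{x * Math.cos(a) - y * Math.sin(a), x * Math.sin(a) + y * Math.cos(a), z};
    }
}
```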

How to use

The technologies described above are available in the Baidu Smart Cloud Unity Player SDK and have been put into practical use in Baidu's business. The two examples at the beginning of this article were built with our SDK.

Developers are welcome to integrate it, try it out, and build an even better VR/AR/Metaverse audio-visual experience.

Source: blog.csdn.net/nonmarking/article/details/130229986