Elastic Computing Product Expert Zhang Xintao: Thinking and Practice of Alibaba Cloud Visual Computing

On March 23, 2023, the Alibaba Cloud developer community viewing portal for the NVIDIA GTC developer conference officially opened. Alibaba Cloud Elastic Computing Product Expert Zhang Xintao gave a talk titled "Reconstructing Computing, Driving Vision: Thinking and Practice of Alibaba Cloud Visual Computing". The following is compiled from his talk.

Writing appeared thousands of years ago, followed by newspapers and other media. The public could get information from newspapers, but only in very limited amounts. More than 20 years ago, people entered the PC Internet era and could get information through images, text, and even audio and video, which made for a much better experience.

About 10 years ago, we entered the mobile Internet era, and both access to information and interaction with it improved further. We can now make video calls, shop, and order food anytime and anywhere, things that more than 30 years ago appeared only in science fiction films and TV dramas.

Today, we are beginning to explore the next-generation Internet based on immersive interaction, which can bring us more innovative experiences, but it also brings more technical and commercial challenges.

To achieve an immersive experience, you will face challenges at four levels, namely infrastructure support, material construction, application development, and application release.

First, a high-quality immersive experience in large scenes requires high-quality image rendering, physics simulation, and AI computing, which is a huge challenge for computing infrastructure.

Second, high-quality content requires high-precision models and materials, but traditional manual construction is time-consuming and laborious, so the industry needs construction methods that are more precise and less costly.

Third, a thriving immersive application ecosystem requires development tools and platforms with lower barriers to entry, so that developers can create better 3D applications.

Finally, immersive 3D applications must be adapted to hundreds of terminal types, and developers rely on that adaptation to deliver the user experience they intend. Because the computing power of terminal devices is limited, bringing immersive experiences to a mass audience is also a huge challenge.

So, how does Alibaba Cloud help the industry to meet the challenges?

Over the past three years, Alibaba Cloud has launched its Visual Computing product solutions: a product matrix for content creation in visual computing scenarios. Together with partners, it provides a full-process solution for immersive interaction on the cloud.

In Alibaba Cloud's visual computing product solution, IaaS capability is the foundation of a high-quality immersive experience, and innovation at this level is the key to solving the infrastructure-layer challenges.

At the IaaS layer, we have created an SDK for AI computing and image computing aimed at immersive experience, to help the whole industry solve computing-efficiency problems and further improve infrastructure capabilities. The bigger challenge for immersive experience, however, is still content development and creation. We have therefore built new products for material creation, application development, and application distribution. From creative design to release and operation, they help customers build a complete business process and help the industry improve creation efficiency.

In addition, we have also introduced industry editors and industry SaaS for various industries, providing low-code product solutions for the promotion of immersive experience in various industries.

With this combined product solution, over the past three years we have worked with dozens of high-quality ISVs and completed the development of dozens of online immersive-experience businesses.

The computing and communication requirements of immersive experience are unprecedented. There are three main types of computing load: real-time 3D rendering with encoding and streaming, physical simulation, and real-time AI computing.

To meet these requirements on the computing side, we have built a new generation of GPU cloud services based on NVIDIA's A10 GPU, Alibaba Cloud's CIPU chip, and IaaS+ software acceleration capabilities, which greatly improve AI computing and visual computing capabilities.

Besides powerful computing, network communication is the other key link that determines the experience. On the user access side, Alibaba Cloud provides global network acceleration to help users reach services. Inside the data center, Alibaba Cloud accelerates forwarding and reduces latency through its self-developed CIPU technology architecture. The minimum latency of Alibaba Cloud's VPC network can be as low as 16 µs. Codec computation for streaming media has also been optimized, which further reduces latency and improves the user experience.
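
As a rough illustration of why both data-center forwarding latency and codec optimization matter, the following Python sketch adds up a hypothetical end-to-end latency budget for one cloud-rendered frame. Only the 16 µs VPC figure comes from the talk. Every other stage name and value, as well as the 50 ms target, is a made-up placeholder and not a measurement of any Alibaba Cloud service.

```python
# Hypothetical per-stage latencies in milliseconds for one cloud-rendered frame.
# Only "vpc_forwarding" (16 us = 0.016 ms) is taken from the talk; all other
# numbers are illustrative placeholders, not measurements.
STAGES_MS = {
    "input_uplink": 10.0,
    "vpc_forwarding": 0.016,
    "render": 8.0,
    "encode": 4.0,
    "downlink": 12.0,
    "decode": 4.0,
}


def total_latency_ms(stages):
    """Sum the per-stage latencies of a sequential pipeline."""
    return sum(stages.values())


def largest_stage(stages):
    """Return the name of the stage that contributes the most latency."""
    return max(stages, key=stages.get)


def fits_budget(stages, budget_ms):
    """Check whether the whole pipeline stays within a latency budget."""
    return total_latency_ms(stages) <= budget_ms


if __name__ == "__main__":
    print(f"total: {total_latency_ms(STAGES_MS):.3f} ms")
    print(f"largest stage: {largest_stage(STAGES_MS)}")
    # The 50 ms target is an assumed value for illustration only.
    print(f"fits a 50 ms budget: {fits_budget(STAGES_MS, 50.0)}")
```

With placeholder numbers like these, the in-data-center hop is negligible next to the access network and the codec stages, which is consistent with the talk's emphasis on access acceleration and codec optimization alongside CIPU forwarding.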

Today, when customers carry out business innovation, we still recommend that users build and develop based on the cloud, because only the cloud can provide the computing infrastructure capabilities required for immersive experience.

With the computing infrastructure in place, it's time to start building your business.

In the 3D Internet era, 3D materials of people, objects, and places are the raw materials for building new applications. However, building a high-definition 3D model is still very expensive, because the labor cost of modelers is considerable. Meanwhile, the accuracy and efficiency of algorithms for converting 2D to 3D are still unsatisfactory.

Over the past few decades, many game studios and 3D engine communities have accumulated a large amount of high-quality materials. But because the materials differ in format and interface, they cannot be traded or reused effectively, and the resulting reinvention of the wheel has seriously held back the industry.

Therefore, we began to explore putting the construction and material management of 3D materials on the cloud, which has many new advantages.

Why put it on the cloud?

The reason is that new advances have been made in AI, 3D construction, and material format conversion, and putting them into practice requires large-scale computing and storage.

First, efficient and high-precision conversion from 2D to 3D can be achieved through AI inverse rendering, and more high-quality materials and content can be generated through AIGC.

Secondly, materials in different formats can be converted into a unified format through the super computing power and storage capacity on the cloud for further application development.

The application development stage still has many problems. Take traditional game development as an example: when developing a large scene, multiple developers cannot edit it at the same time and must wait for one another, which is extremely inefficient. In addition, time-consuming computations such as baking take up a lot of development time and reduce development efficiency. The traditional development environment also has many restrictions and ties work to a fixed location, which is not conducive to efficient creation.

Therefore, we began to consider moving the development of 3D applications to the cloud, based on the unlimited computing power and real-time online capabilities on the cloud, to help users improve development efficiency.

After completing the application development, you still need to face the problem of publishing.

Traditional 3D games and applications have to be adapted to hundreds of terminals before release. The terminals come in different form factors, such as mobile phones, tablets, and XR devices, and their computing capabilities also differ. As a result, it is hard to deliver the experience developers intend, and the cost of terminal adaptation is huge.

We have therefore launched the cloud XR platform for the release scenario. Users can deploy their 3D applications on the cloud XR platform within a few minutes and publish the service to any corner of the world. The cloud XR platform integrates NVIDIA's CloudXR suite, which further improves rendering, encoding, and streaming capabilities. The platform also provides user management, application management, and resource management, further improving the efficiency of business deployment and operation.

The cloud XR platform can support all current mainstream terminal devices. By deploying 3D applications in the cloud, the computing pressure on terminal devices is reduced. Users do not need to download huge client installation packages, and can access them anytime and anywhere, greatly improving the user experience. For developers, it also greatly improves development efficiency.

Based on the above product capabilities, we have helped users achieve many business breakthroughs and innovations in development scenarios such as virtual activities in the Internet industry, digital human construction, and 3D applications on the cloud.

During the recent 2022 Double Eleven Shopping Festival, Alimama and Jiangsu Satellite TV jointly produced the metaverse star concert "2060 Vowel Realm", in which the performers' digital avatars performed on a stage in the virtual space of the "Mantavos" continent.

The Mantavos continent is a very rich scene, with many venues such as brand pavilions, a digital venue, and a center stage. The "Logos" spacecraft is the main teleport point. Users can listen to the concert, view collections, and visit brand pavilions. The concert is also interactive in real time, and audience members can walk around the scene as virtual avatars.

Such a huge scene means that the amount of high-fidelity model data related to the scene, characters and maps is also huge. If the material is packaged into an installation package and run on the client terminal, the installation package will reach tens of GB.

"2060 Vowel Realm", supported by the Alibaba Cloud XR platform, lets tens of thousands of people be on the scene at the same time. Viewers do not need to install any software in advance, are not limited to a particular terminal such as a mobile phone, PC, or tablet, and use no local storage space. They only need to scan a code or open a link to enter.

Together with our partners Bizhen Technology and Axis Factory, we created a multi-scene 3D sci-fi concert and overcame three major challenges.

First, in performance applications, the models of venues and characters are too large to render on the client, so a light client or no client at all is required.

Second, for high-fidelity model rendering and audio transmission, the cloud and the device must stay in sync to deliver an excellent sense of immersion, which places extremely high demands on computing power and communication.

Third, the concert is open online, and large numbers of players may flood in at any time, so a huge resource pool is needed to carry hundreds of thousands or even millions of players and spectators.

Based on the Alibaba Cloud XR platform, we can easily deploy performances on the cloud and complete rendering and streaming there. Relying on Alibaba Cloud's infrastructure around the world, the platform keeps the cloud and the terminal in sync and can accommodate a large number of players and audience members at the same time.

Digital Life is a digital human technology service provider, as well as our customer and partner. It has produced famous virtual human IPs for many well-known companies.

In digital human scenarios, the realism and responsiveness of the digital human matter most.

Realism requires extremely detailed models. A virtual human made by Digital Life involves hundreds of thousands of face, hair, and clothing renderings in total. It also needs AI to drive hundreds of landmarks on the face to imitate expressions, dozens of joints on the body to imitate movements, and the natural fluttering of hair and clothing, all of which adds up to a huge amount of computation.

The other key to digital humans is smooth interaction and communication. Besides precise facial expressions and highly realistic voices, this requires extremely low-latency transmission, and terminals must quickly recognize and respond to human facial expressions and emotions.

To meet these computing and communication requirements, Alibaba Cloud provides powerful rendering and AI computing capabilities and relies on its CIPU architecture to reduce latency and make interactions smoother. Alibaba Cloud's cloud XR platform has also greatly helped Digital Life improve its software delivery and update efficiency. The entire process of AI model, 3D model, and material iteration, software delivery, and deployment is completed on the cloud, which greatly improves development and business delivery efficiency.

The cloud XR platform can automatically schedule resources on the cloud according to policies, and Digital Life does not need to care about the underlying resource planning.
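
The talk does not describe the scheduling policies themselves. As a minimal sketch of what policy-driven resource scheduling could look like, the Python example below computes how many rendering instances a policy would request for a given number of active sessions. The `ScalingPolicy` class, its field names, and all numbers are hypothetical and are not part of any Alibaba Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy; field names and values are illustrative only."""
    sessions_per_instance: int  # concurrent streams one GPU instance can serve
    headroom_percent: int       # spare capacity to keep, e.g. 20 for 20%
    min_instances: int          # never scale below this floor
    max_instances: int          # never scale above this ceiling


def desired_instances(active_sessions: int, policy: ScalingPolicy) -> int:
    """Return how many instances the policy asks for at the current load."""
    if active_sessions < 0:
        raise ValueError("active_sessions must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = active_sessions * (100 + policy.headroom_percent)
    denominator = 100 * policy.sessions_per_instance
    needed = (numerator + denominator - 1) // denominator
    # Clamp the result to the bounds configured in the policy.
    return max(policy.min_instances, min(policy.max_instances, needed))


if __name__ == "__main__":
    policy = ScalingPolicy(
        sessions_per_instance=4,
        headroom_percent=20,
        min_instances=2,
        max_instances=1000,
    )
    for sessions in (0, 100, 101, 10000):
        print(sessions, "->", desired_instances(sessions, policy))
```

A scheduler built this way would re-evaluate `desired_instances` as load changes and add or release instances to match, so the application team only maintains the policy rather than planning resources by hand.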

Besides text and voice, human-to-human communication also includes gestures and sign language. Digital Life and Qianbo Information have released their own sign language anchor, Qianyan, based on the cloud XR platform. Communicating with Qianyan is very natural: users speak or type to Qianyan through the cloud XR platform, and Qianyan converts the voice and text into sign language that hearing-impaired people can understand.

Through AI computing and XR computing, a form of human-computer interaction that is closer to natural human communication, such as Qianyan, can help more people enter the digital world and close the digital divide.

Red Star Macalline is a leading home furnishing company in China. It uses a leading real-time 3D engine backed by Alibaba Cloud to deliver home design as SaaS. With simple drag-and-drop operations, what you imagine is what you get. Once the store designer and the customer finish the design, they can submit it for rendering.

Thanks to the rendering capability of the cloud and the powerful ray-tracing capability of the GPU, the final result can be produced within 10 minutes, and customers can experience every detail and corner of their ideal home first-hand.

We believe that similar scenarios and applications will emerge in other industries. That's all for my talk today. I hope it brings you some help and inspiration. Thank you all.

Origin blog.csdn.net/bjchenxu/article/details/129841516