"Special Express" Multimedia content understanding, video cloud large model algorithm practice, AI computing power cloud exploration, FreeSWITCH docking artificial intelligence

AI is increasingly used in multimedia. With deep learning and neural networks, it can sharpen video images, reduce noise, and enhance color, giving viewers clearer and more realistic pictures. Trained models and algorithms can also identify and understand content, making multimedia content easier to search and manage.

At this conference, we will explore the integration of AI and multimedia and share the latest technical progress and application cases. We will also discuss practices such as large AI models and edge computing, and look ahead to the future of AI in the multimedia field.

01

Mango TV long video content understanding, retrieval and application innovation

Zheng Xiaozhi

Mango TV Algorithm Product Manager

Mango TV holds a massive library of media resources. Quickly and accurately understanding or locating target content in that library, so as to improve efficiency and produce rich content applications such as fixed-point member interaction, precision advertising, content production, and program operations, is both a difficulty and a priority in Mango's day-to-day business. Improving the reuse rate of this large content inventory, and responding quickly to high-quality business demands while maintaining service efficiency and accuracy, depends on continued development and innovation in long video content understanding and retrieval technology.

This talk is divided into four parts. The first part introduces the business scenarios and the specific reasons long video content understanding and retrieval are needed. The second part covers the difficulties and challenges in technical practice, including multi-modal information representation and fusion, fusion of time-series information in retrieval, high-precision or high-recall retrieval standards, and fast large-model response without sacrificing accuracy. The third part introduces the core technical solutions, including key feature extraction from storyboard-level clips, multi-modal visual language model training, and structured analysis of time-series information. The fourth part presents application cases and results from Mango TV's actual business. Together, the four parts give a systematic introduction to Mango's innovative practice in long video content understanding and retrieval.

02

Alibaba Cloud Video Cloud Large Model Algorithm Practice under the New AI Paradigm

Liu Guodong

Alibaba Cloud Intelligence Senior Algorithm Expert

In the era of artificial intelligence, AI technology has penetrated every part of the audio and video industry, covering the entire link of capture, production, processing, transmission, distribution, and consumption. Even so, the industry continues to face demands for a better experience, greater intelligence, and wider accessibility. The outstanding performance of disruptive large model technologies, represented by ChatGPT and Midjourney, in video understanding and generation has given the industry hope of meeting these higher demands. By exploring and practicing innovative large-model technologies, Alibaba Cloud Video Cloud is looking for new possibilities for video cloud under the new AI paradigm.

This talk will share the system architecture of Alibaba Cloud Video Cloud's large model algorithms and the key technologies used in practice, present typical practical cases, and explore further possibilities for deploying large models.

03

Heterogeneous Convergence, Future Ready: Exploration of Netcenter Technology's AI Computing Power Cloud

Qu Xin

Vice President of Netcenter Technology

Large models are redefining thousands of industries. How can companies break through computing power bottlenecks and blockades? Starting from the latest trends in the upstream and downstream of the global large model industry, this talk examines two challenges: the persistently high cost of inference computing power, and the inability of existing resources to meet growing demand. Relying on its distributed edge inference platform, Netcenter Technology integrates diverse computing resources and uses cloud native and virtualization technologies to optimize resource allocation, significantly reducing the cost of AI inference and helping all industries move faster into the AGI era.

04

FreeSWITCH connects SIP, RTC and artificial intelligence

Du Jinfang

Little Cherry CTO

The emergence of ChatGPT and various large models has pushed artificial intelligence to new heights, and AI tools and chat applications keep appearing one after another. This talk uses the FreeSWITCH open source project to connect to various large models and AI interfaces, allowing RTC clients and traditional SIP equipment (video conferencing terminals, PSTN phones, etc.) to hold conversations with large models.

The talk covers the characteristics and technical points of communication between RTC and large models, includes video demonstrations, and shares implementation approaches and lessons learned from pitfalls.

LiveVideoStackCon 2023 Audio and Video Technology Conference Shenzhen Station sincerely invites you to participate.

Time: November 24-25, 2023

Location: Shenzhen Sentosa Hotel (Jade Branch)

Inquiry: 13520771810 (same number on WeChat), [email protected]

Origin my.oschina.net/u/3521704/blog/10149290