Immerse yourself in a new horizon · "Listen" to what you think, "See" what you want to see

In the blink of an eye, August is coming to an end. Looking back on the first half of the year, ChatGPT set the technology world alight, and AIGC tools in the audio and video fields, such as Midjourney and Google MusicLM, then appeared one after another; the iPhone, as always, remained a focus of attention.

Hot topics bring dizzying novelty, but also fatigue.

Fortunately, LiveVideoStackCon arrived in Shanghai's rainy July. As a way station for audio and video engineers, a technology conference is an opportunity to slow down. The content here is not the disorganized fragments found in the flood of information, but material that technical experts have honed for months. Like-minded friends come here to meet, chat, and recharge.

Over the past month, the LiveVideoStackCon Organizing Committee conducted a comprehensive review of the Shanghai conference and carefully sorted through the valuable suggestions from attendees. To our surprise, video codecs and audio technology are still the topics that engineers most want to hear about at the conference. Some attendees suggested that a short topic is not enough for audio technology, and that the content should be expanded to cover all aspects of it.

Therefore, at the upcoming LiveVideoStackCon Shenzhen in November, the two topics "New Audio Experience" and "Video Codec and AI" will both be presented as long topics. Here is a preview of some of the talks. An audio-visual technical feast needs a menu, and we hope you can appreciate the ingenuity of the "ingredients" and "methods" from it.

Part.01

New Audio Experience

Shen Houzheng

Director of Audio Algorithm Group, Vivo Mobile Communications Co., Ltd. (vivo)

"Mobile Phone External Speaker Enhancement (Super Audio®) Algorithm"

Smartphones are the smart devices people use most, and sound is an important part of the listening experience. Improving the sound quality of the external speakers and the immersive stereo effect can significantly improve the experience of using a phone. Because phones are small and pursue a refined appearance, their speakers are small, which leads to low loudness, missing low frequencies, noise when playing piano music, poor frequency response, and a tendency toward nonlinear distortion. The speakers are also close together and asymmetrical between top and bottom, so the sound field is narrow and the left and right channels sound unbalanced. Through long-term research on speaker cavities and consumer preferences, vivo has developed algorithms for virtual bass, adaptive loudness control, multi-segment dynamic range control, adaptive equalization, amplitude and temperature control, nonlinear compensation, and stereo enhancement.
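
As a rough illustration of what dynamic range control does, the sketch below computes the gain of a textbook static compressor curve, applied separately to a few frequency bands. It is not vivo's algorithm: reading "multi-segment" as "per-band" is an assumption, and the function names, band names, thresholds, and ratios are all hypothetical values chosen for the example.

```python
# Textbook static compressor curve, for illustration only.
# This is NOT vivo's Super Audio algorithm; every name and value is hypothetical.


def compressor_gain_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Return the gain in dB to apply to a signal whose level is level_db."""
    if level_db <= threshold_db:
        return 0.0  # below the threshold the signal passes unchanged
    # Above the threshold, the output rises 1 dB for every `ratio` dB of input.
    compressed_db = threshold_db + (level_db - threshold_db) / ratio
    return compressed_db - level_db


# Hypothetical per-band settings: (threshold in dBFS, compression ratio).
BAND_SETTINGS = {
    "low": (-24.0, 4.0),
    "mid": (-18.0, 2.0),
    "high": (-12.0, 3.0),
}


def band_gains_db(band_levels_db: dict) -> dict:
    """Apply each band's own compressor curve to that band's measured level."""
    return {
        band: compressor_gain_db(level, *BAND_SETTINGS[band])
        for band, level in band_levels_db.items()
    }


if __name__ == "__main__":
    print(band_gains_db({"low": -12.0, "mid": -10.0, "high": -20.0}))
```

Giving each band its own threshold and ratio lets loud low-frequency content be tamed without also pulling down quieter high-frequency content, which is one common motivation for splitting the control by band on a small speaker.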

Chen Chao

Technical Expert, Baidu YY Live

"Design and Development of Ultra-Low Latency Audio Effect Algorithm for YY Live Streaming"

The development of the metaverse and VR technology is bringing new opportunities and challenges to the online live streaming business. Ultra-low audio latency is one of the key factors in the live streaming experience. For scenarios that require low latency, such as online karaoke duets and ensemble performances, common sound effect modules can each add tens of milliseconds of delay, which is a serious challenge.

After carefully analyzing the sound effect algorithms commonly used in live streaming, and guided by the goal of "zero delay", we combined signal processing and deep learning methods to minimize the delay of the YY Live sound effect module, and successfully supported the launch of YY Live's ultra-low-latency scenarios. We also released a VST version of the audio plug-in for standalone use.
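
To see where "tens of milliseconds" can come from, the sketch below computes the buffering delay of a block-based audio effect: an effect that processes audio in blocks must wait for a full block of samples before it can produce output. The function names, block sizes, and sample rate are illustrative assumptions and are not taken from the YY implementation.

```python
# Buffering delay of block-based audio processing, for illustration only.
# Block sizes and sample rate are assumed values, not YY's configuration.


def block_latency_ms(block_size: int, sample_rate_hz: int) -> float:
    """Delay in milliseconds spent waiting for one full block of samples."""
    return 1000.0 * block_size / sample_rate_hz


def chain_latency_ms(block_sizes: list, sample_rate_hz: int) -> float:
    """Worst-case delay when each effect in a chain buffers independently."""
    return block_latency_ms(sum(block_sizes), sample_rate_hz)


if __name__ == "__main__":
    # A 1024-sample block at 48 kHz already costs a little over 21 ms.
    print(round(block_latency_ms(1024, 48000), 2))
    # Three hypothetical effects chained one after another.
    print(chain_latency_ms([1024, 512, 480], 48000))
```

Under these assumptions, shrinking the block size or sharing one buffer across effects reduces the delay, which suggests why per-module latency matters in duet and ensemble scenarios.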

Ma Jinlong

Head of Media Algorithm, Fun Pill Technology

"Practice of Smart Audio Capabilities on Mobile Terminals"

With the continued popularity of pan-entertainment and social networking and the rise of AIGC, more and more scenarios rely on intelligent voice processing to assist content understanding and intelligent interaction. Building intelligent voice technology on the device has therefore become both important and urgent. For example, on-device audio event detection and on-device speech recognition make it possible to calibrate content in real time, providing technical support for understanding user intent. End-to-end speech recognition also offers a low-cost way to build AIGC-based intelligent interactive assistants.

Part.02

Video Codec and AI

Wang Shiqi

Associate Professor, City University of Hong Kong

"Video Coding Based on Deep Learning"

Video coding is the core technology of digital video applications and has driven the rapid development of the multimedia industry. With the advance of ultra-high-definition video and virtual reality, high-efficiency video coding is urgently needed to cope with massive volumes of video data. In addition, as smart city technologies are deployed, the demand for high-efficiency video coding for machine vision is growing steadily.

Fan Zhixing

Video Codec Tech Lead, Shopee

"Shopee Video Coding Technology and Best Practices of Ultra-fast HD"

At the last LVS Shanghai conference, we introduced Shopee's internal audio and video businesses and how they were developed and deployed. As economic growth slows, major Internet companies have adopted the slogan of cutting costs and improving efficiency, and Shopee is no exception. Over the past two years, the biggest challenge for Shopee's audio and video technology team has been to improve, or at least preserve, the picture quality users experience while reducing bandwidth and computing costs.

This talk gives an in-depth look at how Shopee achieves end-to-end picture quality improvement together with bandwidth and computing cost savings. We reached this goal by combining AI enhancement, capture-side encoding strategy optimization (combining software and hardware encoding), backend transcoding optimization (improvements in encoder BD-rate and encoding efficiency), playback-side enhancement, and other techniques.

Li Li

Distinguished Professor, University of Science and Technology of China

"End-to-end image and video coding and its standardization"

Traditional image and video coding is based on a hybrid coding framework. After decades of development, its performance gains have reached a bottleneck. As a new coding framework, end-to-end image and video coding has matched the performance of traditional image and video coding after only a few years of development. This talk introduces the basic idea of end-to-end image and video coding, along with its development status and standardization for multiple modalities such as images, video, and 3D biomedical images.

The above is only part of the program, and you are welcome to follow along for more highlights. As a "diner", you are also welcome to offer suggestions for this feast of audio and video technology. If you have first-hand audio and video engineering practice to share, or if you enjoy technical exchange and the collision of ideas, you are welcome to become a speaker and work with us to create first-class audio and video technology content.

As an offline technical conference, we firmly believe that seeing once is better than hearing a hundred times. The 20% discount on Shenzhen tickets is now available, and this limited-time offer ends on September 3. We have also secured a special 40% discount for current students (to buy student tickets, please contact the secretary, WeChat ID: LVSgogo).

Are you ready? Join many senior audio and video engineers and see the future together.

20% off tickets: 12 days left.

Click "Read the original text" below to jump to the official website ticket purchase page.

↓↓↓

Origin blog.csdn.net/vn9PLgZvnPs1522s82g/article/details/132463342