Audio and Video Technology Development Weekly | 293

Once a week, a roundup of substantive content from the field of audio and video technology.

News submissions: [email protected].

Google strikes back at ChatGPT across the board: PaLM 2 and Gemini land a double blow, and Bard officially opens

These were the key announcements at Google I/O 2023, where AI dominated the agenda.

Google unveils a supercomputer with 26,000 H100s to accelerate the AI arms race

Cloud providers are building up armies of GPUs to provide more AI firepower. At its annual Google I/O developer conference today, Google announced the A3, an AI supercomputer with 26,000 GPUs, as part of its push to dedicate more resources to its battle with Microsoft for AI supremacy. It is further evidence of Google's counterattack.

OpenAI releases its latest open source project, Shap-E, which generates 3D models from text

On May 6, OpenAI, the company behind ChatGPT, released its latest open source project, Shap-E, which generates 3D models from text. The GitHub repository has already passed 2,000 stars.

https://github.com/openai/shap-e 

Why is Hinton, the Cambridge-educated "Godfather of AI", worried?

Human society is stable and robust; it is not so fragile that it collapses at the first touch. As the saying goes, "when the devil grows one foot taller, the Way grows ten feet taller": taking precautions and staying alert to danger in times of peace have always accompanied the development of human civilization. What is happening now has happened before, more than once, and this time will probably be no exception...

HugNLP is open source: a hands-on framework for a wide range of NLP tasks that can also train ChatGPT-like models

With large-model training now hugely popular, HugChat, a product built on the HugNLP framework that supports training and deploying ChatGPT-like models, has also been launched.

Aligning all modalities with images: Meta open-sources a multi-sensory AI foundation model that unifies them all

The Orillusion engine is officially open source: a lightweight WebGPU 3D rendering engine for the AIGC era

IBM plays its trump card: the large-model platform Watsonx, set to become available in July

Watsonx consists of three major components: watsonx.ai, for foundation models; watsonx.data, a purpose-built data store based on an open lakehouse architecture; and watsonx.governance, for AI security governance. Together, these three platforms provide users with one-stop, safe and reliable generative AI services.

"AI Stefanie Sun" takes the internet by storm: AI cover songs go viral, and the entire Chinese-language music scene is "revived"

Spring 2023 "Computational Conformal Geometry" course summary

Douyin's platform rules and industry initiative on AI-generated content

The rapid development of artificial intelligence technology has opened up more possibilities for the Internet industry, but it has also brought problems such as false information and infringement. With reference to laws and regulations such as the "Regulations on the Administration of Deep Synthesis of Internet Information Services", Douyin has proposed eleven platform rules and industry initiatives.

Agora's in-house encoders a264 & a265: better picture quality and lower power consumption, further tailored to real-time interactive scenarios

The "Linglong" fused codec architecture serves diverse video needs

LiveVideoStackCon 2022 Beijing invited Dong Feng, multimedia product manager at Anmou Technology (Arm China), to explain how the "Linglong" fused codec architecture serves diverse video needs.

What impact will the merger of the MPEG LA and Via Licensing patent pools have?

According to the article, the merger puts the combined entity, Via LA, in charge of managing the patent pools for major video codecs such as HEVC, AV1 and VVC, the core technologies that many current and future streaming applications rely on. The article also says the merger will lower the cost of using these codecs, because licensees will no longer need to deal with multiple companies to obtain licenses.

https://www.streamingmedia.com/Articles/News/Online-Video-News/Via-LAs-Heath-Hoglund-Talks-MPEG-LA-Via-Licensing-Patent-Pool-Merger-158547.aspx

Terminal Architecture Design and Key Technologies of Metaverse Live Streaming

At 19:00 on May 16, Li Minglu, a senior R&D engineer at Baidu Smart Cloud Video Cloud, will trace the development and evolution of terminal engine technology, give a detailed introduction to the Metaverse live streaming technology stack, terminal architecture design and key technologies, and share Baidu Smart Cloud's practical exploration of Metaverse live streaming scenarios.

AVIF image encoder adds experimental AV2 support (code pull request)

Work on AV2 appears to be going well, and it is good to see the AVIF image format gaining support for it in good time too.

https://github.com/AOMediaCodec/libavif/pull/1361 

Metal vs. OpenGL ES: a quick start to Metal development

This article introduces Metal and the Metal Shading Language, explains the differences between Metal and OpenGL ES, and summarizes the author's walkthrough of an introductory tutorial.

Point2Pix: Realistic Point Cloud Rendering via Neural Radiance Fields

The authors combine point clouds with NeRF to propose a new point cloud renderer, Point2Pix, which can synthesize realistic images from colored point clouds.

Memory Chip Roadmap

The types of memory considered in this article are DRAM and non-volatile memory (NVM). The focus is on commodity, stand-alone chips because those chips tend to drive memory technology. However, embedded memory chips are expected to follow the same trend as commodity memory chips, usually with some time lag. For both DRAM and NVM, detailed technical requirements and potential solutions are considered.

Architecture design and evolution of cloud editing, an online editing tool for business customers

We ran into many challenges while exploring online editing products for business customers: how can both rapid and customized integration scenarios be supported? How can the efficiency and quality of cloud video synthesis be guaranteed? LiveVideoStackCon 2022 Beijing invited Cheng Ruilin from Tencent Cloud Audio and Video to share how his team answered these questions.

Audio and Video Talk - AI Tool Competition

The author asked several AI tools, including Impression AI, ChatGPT and Bard, to explain the difference between TCP and UDP; the article collects their answers.
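For readers who would rather see the difference than read a chatbot's description of it, here is a minimal Python sketch. It is not taken from the article; the helper names `tcp_round_trip` and `udp_round_trip` are made up for illustration, and everything runs over the loopback interface. It shows the core contrast: TCP needs a connection before any data flows, while UDP simply sends a standalone datagram.

```python
import socket
import threading


def _tcp_echo_once(server_sock):
    """Accept one TCP connection and echo back what it receives."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)  # a short message arrives in one read here
        conn.sendall(data)


def tcp_round_trip(message: bytes) -> bytes:
    """TCP: connect first (handshake), then exchange a reliable byte stream."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=_tcp_echo_once, args=(server,))
    worker.start()
    with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply


def udp_round_trip(message: bytes) -> bytes:
    """UDP: no connection; each sendto() is an independent datagram."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2)  # UDP gives no delivery guarantee, so never wait forever
    port = receiver.getsockname()[1]
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(message, ("127.0.0.1", port))  # no connect(), no handshake
    data, _ = receiver.recvfrom(1024)
    sender.close()
    receiver.close()
    return data


if __name__ == "__main__":
    print(tcp_round_trip(b"hello over TCP"))
    print(udp_round_trip(b"hello over UDP"))
```

On loopback both calls succeed; on a real lossy network only the TCP path would retransmit, which is why the UDP receiver sets a timeout.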

How does video technology help property insurance claims?

This is an article about the application of video technology in the insurance industry. Throughout the claims process, digital tools are key to reducing wait times and increasing customer satisfaction.

Audio and video communication QoS technology and its evolution

QoS applies a variety of algorithms and strategies to control network transmission so as to maximize the audio and video user experience on weak networks.

Remote teaching through DRM (Digital Radio Mondiale) digital broadcasting

This article describes in detail how DRM broadcasting is applied to education, showing how the characteristics and features of DRM technology help realize the goal of making education widely accessible.

https://www.audioblog.iis.fraunhofer.com/cn/radioschooling

Nanyang Technological University proposes VR-SLAM based on monocular camera and ultra-wideband sensor: to achieve high-precision indoor positioning and mapping

This paper proposes a SLAM system that uses a monocular camera and a UWB sensor. The system, called VR-SLAM, is a multi-stage framework that exploits each sensor's strengths and compensates for its weaknesses.

The first systematic review! The latest research progress of camera calibration technology based on deep learning!

This review is the first to systematically summarize camera calibration driven by deep learning, covering the latest research on calibrating various camera models and on their applications since the start of the deep learning era (an eight-year span).

"Extreme value" measurement and application of RTC experience optimization

LiveVideoStackCon 2022 Beijing invited Yang Zhichao, head of the Volcano Engine RTC team, to present how Volcano Engine RTC understands and applies experience optimization in real-time communication scenarios.

NSDI 2015 | PCC: Re-architecting Congestion Control for Consistent High Performance

The authors propose Performance-oriented Congestion Control (PCC), a new congestion control architecture in which the sender continuously observes the link between its actions and the performance it actually experiences, so that it can consistently choose actions that lead to high performance.
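The idea of learning from observed performance can be sketched in a few lines of Python. This is a heavily simplified illustration, not the paper's algorithm: the link model `observe`, the utility function, the loss penalty and the step size `epsilon` are all assumptions chosen for readability. At each step the sender compares the utility of a slightly higher and a slightly lower rate and moves toward whichever performed better.

```python
def observe(rate, capacity):
    """Hypothetical link model: return (throughput, loss_rate) for a sending rate."""
    if rate <= capacity:
        return rate, 0.0
    # Anything sent beyond the link capacity is dropped.
    return capacity, (rate - capacity) / rate


def utility(rate, capacity, loss_penalty=10.0):
    """Assumed utility: reward delivered throughput, penalize lost traffic."""
    throughput, loss = observe(rate, capacity)
    return throughput - loss_penalty * rate * loss


def pcc_step(rate, capacity, epsilon=0.05):
    """Try a slightly higher and a slightly lower rate; keep the better one."""
    higher = rate * (1 + epsilon)
    lower = rate * (1 - epsilon)
    if utility(higher, capacity) > utility(lower, capacity):
        return higher
    return lower


def run(initial_rate, capacity, steps=200):
    """Repeat the experiment-and-move loop for a fixed number of steps."""
    rate = initial_rate
    for _ in range(steps):
        rate = pcc_step(rate, capacity)
    return rate


if __name__ == "__main__":
    # The sender never sees the capacity directly; it only sees utility.
    print(run(initial_rate=1.0, capacity=100.0))
```

With this toy utility the rate climbs while sending faster pays off and backs away once losses appear, so it ends up hovering just below the link capacity regardless of where it started.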

How to perfect the 5G in-venue experience with real-time precision

This article discusses how to deliver flawless 5G experiences in venues. The author believes that 5G can greatly improve the user experience in the venue, including video streaming, VR/AR, real-time interaction, etc.

https://www.red5pro.com/blog/perfect-5g-in-venue-experiences/


LiveVideoStackCon 2023 Shanghai lecturer recruitment

LiveVideoStackCon is a stage for everyone. If you lead a team or company, have years of hands-on experience in a particular field or technology, and enjoy technical exchange, you are welcome to apply to be a speaker at LiveVideoStackCon. Please send your proposed talk to: [email protected].

Origin blog.csdn.net/vn9PLgZvnPs1522s82g/article/details/130675866