Real-time audio codecs No. 19: AI-based speech coding (LPCNet)

Please do not reprint this article in any form, thank you.
Nearly ten years after the Opus codec was released in 2012, the COVID-19 pandemic that began in 2020 further increased the demand for real-time audio and video conferencing and for virtual enhanced conferencing. Opus is a very good audio codec for these scenarios, but AI technology can further improve audio and video quality.

Satin

Satin is an AI-based speech codec officially announced by Microsoft in February 2021. Its goal is to replace the Silk codec. Silk is the speech codec used by Skype, and the LPC (linear prediction) part of Opus is also based on Silk. Satin has the following characteristics:

Supports super-wideband speech starting from 6 kbps

Supports fullband audio starting from 17 kbps

Quality improves progressively at higher bitrates

High audio quality even under heavy packet loss

A better redundancy algorithm, with better protection against burst loss
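To put the bitrates in the list above in perspective, the following minimal sketch computes how many bits a codec has available per frame at a constant bitrate. The 20 ms frame duration is an assumption chosen for illustration (it is a common frame size in real-time speech codecs); the source does not state Satin's frame size, and `bits_per_frame` is a hypothetical helper.

```python
def bits_per_frame(bitrate_bps: int, frame_ms: float) -> int:
    """Whole bits available for one frame at a constant bitrate."""
    return int(bitrate_bps * frame_ms / 1000)


FRAME_MS = 20  # assumed frame duration; not stated in the source

for label, bitrate in [("super-wideband, 6 kbps", 6000),
                       ("fullband, 17 kbps", 17000)]:
    bits = bits_per_frame(bitrate, FRAME_MS)
    print(f"{label}: {bits} bits ({bits // 8} bytes) per {FRAME_MS} ms frame")
```

Under this assumption, 6 kbps leaves only 120 bits per frame, which is why the codec has to rely on a very compact, sparse representation of the signal.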
Satin is already used for two-party calls in Microsoft Teams and Skype, and it is expected to be extended to multi-party calls in the future. The goal of Satin is to replace the Silk/Opus codec.

To achieve super-wideband quality at a bitrate of 6 kbps, Satin extracts and encodes a sparse representation of the signal, based on a deep understanding of speech production, modeling, and psychoacoustics. To further reduce the required bitrate, Satin encodes only the lower frequency bands and transmits a few additional parameters. On the decoding side, Satin uses a deep learning network to estimate the high-band parameters from the received low-band parameters and this additional information. Although this method encodes super-wideband signals at a very low bitrate, it greatly increases computational complexity: analyzing the input speech signal to extract a low-dimensional representation is computationally intensive, and real-time inference on deep neural networks adds even more complexity.
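The decoder-side step described above, estimating high-band parameters from low-band parameters plus side information, can be sketched as a small feed-forward network. This is only an illustration of the general idea, not Satin's actual architecture, which the source does not describe: all dimensions, the function names, and the two-layer structure are hypothetical, and the weights are random placeholders where a real decoder would load trained ones.

```python
import math
import random

LOW_DIM = 8    # hypothetical number of decoded low-band parameters
SIDE_DIM = 2   # hypothetical number of transmitted side-information values
HIDDEN = 16    # hypothetical hidden-layer width
HIGH_DIM = 4   # hypothetical number of high-band parameters to estimate


def make_layer(n_in, n_out, rng):
    """Build one dense layer with random placeholder weights (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply a dense layer: weights @ x + biases, then an optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out


def estimate_high_band(low_params, side_info, layers):
    """Map low-band parameters plus side information to high-band parameters."""
    x = list(low_params) + list(side_info)
    hidden = dense(x, layers[0], math.tanh)
    return dense(hidden, layers[1])


rng = random.Random(0)  # fixed seed so the placeholder weights are reproducible
layers = [make_layer(LOW_DIM + SIDE_DIM, HIDDEN, rng),
          make_layer(HIDDEN, HIGH_DIM, rng)]

low_params = [0.1 * i for i in range(LOW_DIM)]  # stand-in for decoded values
side_info = [0.5, -0.2]                         # stand-in for side information
high_params = estimate_high_band(low_params, side_info, layers)
print(len(high_params), "estimated high-band parameters")
```

Even this toy network needs a few hundred multiply-accumulate operations per frame; a production-sized model running on every frame is what makes real-time inference a significant part of the decoder's cost.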

Origin blog.csdn.net/shichaog/article/details/124780180