Open source projects used in audio and video development

There are many open source projects that you can refer to for real-time audio and video development and learning.

Audio and video streaming has become ubiquitous in today’s life, and having a large number of top-notch audio/video tools really comes in handy. Trim files, edit videos, maximize audio – we need to meet the distribution needs of social media streams, and companies will always need audio/video content to communicate with users most effectively.


A real-time audio and video application involves several stages: capture, encoding, pre- and post-processing, transmission, decoding, buffering, rendering, and more. Each stage breaks down further into finer-grained technical modules. For example, pre- and post-processing includes beautification, filters, echo cancellation, and noise suppression; capture includes microphone arrays; and the codecs include VP8, VP9, H.264, and H.265.

Today we have put together a set of open source projects that can help developers who are learning or building real-time audio and video applications, along with several commercial services that also contribute to the open source community. The projects are divided into several categories: audio and video codecs, video pre- and post-processing, server side, and so on.

Audio and video codec open source projects

After the device's camera captures an image and pre-processing is applied, the video codec compresses the image and digitally encodes it for transmission. Codecs are mainly judged on compression efficiency, speed, and power consumption.

Mainstream video codecs currently fall into three families: VPx (VP8, VP9), H.26x (H.264, H.265), and AVS (AVS1.0, AVS2.0). The VPx family is a set of video codecs open sourced by Google; at the same quality, VP9 reduces the bitrate by about 50% compared to VP8. The H.26x family has relatively broad hardware support. H.265 improves encoding efficiency by 30-50% over the previous generation, but its complexity and power consumption are much higher, so pure software encoding hits bottlenecks and, with current technology, hardware encoding and decoding is still the main option. AVS is China's source coding standard with independent intellectual property rights, and it is now in its second generation.
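
To make these percentages concrete, here is a minimal arithmetic sketch. It assumes the efficiency gains translate directly into bitrate savings at equal quality, and the 2000 kbps baselines are hypothetical numbers chosen only for illustration.

```python
def reduced_bitrate(baseline_kbps: float, saving: float) -> float:
    """Bitrate needed by the newer codec if it saves `saving` (0..1) at equal quality."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return baseline_kbps * (1.0 - saving)


# Hypothetical baseline: a 2000 kbps VP8 stream, with the ~50% saving quoted for VP9.
vp8_kbps = 2000.0
vp9_kbps = reduced_bitrate(vp8_kbps, 0.50)

# Hypothetical baseline: a 2000 kbps H.264 stream, with the 30-50% range quoted for H.265.
h264_kbps = 2000.0
h265_low = reduced_bitrate(h264_kbps, 0.50)   # best case (50% saving)
h265_high = reduced_bitrate(h264_kbps, 0.30)  # worst case (30% saving)

print(f"VP9: about {vp9_kbps:.0f} kbps")
print(f"H.265: about {h265_low:.0f} to {h265_high:.0f} kbps")
```

Real savings depend heavily on content, encoder settings, and resolution, so these figures are only a rough planning aid.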

WebRTC

WebRTC is probably the first project you will reach for. It is an open source project that enables real-time voice and video conversations in web browsers, covering audio and video capture, encoding and decoding, network transmission, and rendering. Note that WebRTC does not include a server-side design or deployment solution, so building a real-time audio and video application on it also requires combining it with a server-side open source project such as Janus.

Official website address: webrtc.org/

x264

H.264 is currently the most widely used bitstream standard. x264 is an encoder that produces bitstreams compliant with the H.264 standard, encoding video into the H.264/MPEG-4 AVC format. It provides both a command line interface and an API. The former is used by graphical front ends such as StaxRip and MeGUI, and the latter is called by FFmpeg, HandBrake, and others. Just as x264 targets H.264, x265 is the corresponding encoder for HEVC/H.265.

Official website address: https://www.videolan.org/developers/x264.html

FFmpeg

FFmpeg should be familiar to everyone. It provides encoding, decoding, transcoding, and muxing, as well as post-processing such as cropping, scaling, and color space conversion. It supports almost all current audio and video coding standards (there are too many formats to list one by one here; they can be found on Wikipedia).

FFmpeg was also forked as the Libav project, and the LAV Filters video decoder is built on FFmpeg's libav* libraries. Many players can call LAV for decoding, and LAV itself also supports hardware video decoding on graphics cards. Many mainstream video players use FFmpeg as their playback core. Beyond video players, browsers such as Chrome that can play web videos also benefit from FFmpeg. Many developers have built and open sourced their own work on top of FFmpeg, such as the well-known developer Lei Xiaohua (his code can be found on his SourceForge page).
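
As a rough illustration of the scaling and encoding functions described above, the sketch below builds an `ffmpeg` command line in Python. The file names are hypothetical, and it assumes an FFmpeg build that includes libx264; the command is only printed here, not executed.

```python
import shlex
from typing import List


def build_transcode_cmd(src: str, dst: str, width: int, height: int) -> List[str]:
    """Build an ffmpeg argument list that scales the video and encodes H.264 + AAC."""
    return [
        "ffmpeg",
        "-i", src,                         # input file
        "-vf", f"scale={width}:{height}",  # scaling filter
        "-c:v", "libx264",                 # H.264 video via x264 (build must include libx264)
        "-c:a", "aac",                     # AAC audio with FFmpeg's built-in encoder
        dst,                               # output file; container inferred from extension
    ]


# Hypothetical file names, for illustration only.
cmd = build_transcode_cmd("input.mp4", "output.mp4", 1280, 720)
print(shlex.join(cmd))

# To actually run it (requires ffmpeg on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.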

Official website address: ffmpeg.org/

ijkplayer

Before introducing ijkplayer, we must first mention ffplay. ffplay is a portable media player built on the FFmpeg and SDL libraries. ijkplayer is Bilibili's open source lightweight iOS/Android video player based on ffplay.c. Its API is easy to integrate, and its build configuration can be trimmed down, which helps control the size of the installation package.

In terms of encoding and decoding, ijkplayer supports video soft decoding and hard decoding, which can be configured before playback, but cannot be switched during playback. Video hard decoding on iOS and Android can use the familiar VideoToolbox and MediaCodec respectively. But ijkplayer only supports soft decoding of audio.

Github address: https://github.com/Bilibili/ijkplayer

JSMpeg

JSMpeg is a JavaScript-based MPEG1 video decoder. If you want to do live video streaming in H5 (HTML5) pages, you can consider using JSMpeg for video decoding on mobile devices. This is also the mainstream approach in the H5 online claw machine (doll-catching) apps that have become popular recently.

Github address: https://github.com/phoboslab/jsmpeg

Opus

Opus is a highly flexible audio codec written in C. It is specifically optimized for ARM and x86 and has a fixed-point implementation. It supports both speech and music, at bitrates from 6 kbps to 510 kbps, by combining the SILK and CELT coding methods. SILK was originally used in Skype; it is based on linear prediction (LPC) of speech signals and does not handle music well. CELT is suitable for full-bandwidth audio but is not efficient at encoding low-bitrate speech, so the two complement each other in Opus.
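
To get a feel for what this bitrate range means per packet, here is a small arithmetic sketch. The 20 ms frame duration is an assumption for illustration, and the result is an average payload size that ignores RTP/UDP/IP overhead and variable bitrate effects.

```python
OPUS_MIN_BPS = 6_000    # lower bound of the range quoted above
OPUS_MAX_BPS = 510_000  # upper bound of the range quoted above


def avg_frame_bytes(bitrate_bps: int, frame_ms: float = 20.0) -> float:
    """Average encoded bytes per frame at a given bitrate and frame duration."""
    if not OPUS_MIN_BPS <= bitrate_bps <= OPUS_MAX_BPS:
        raise ValueError("bitrate outside the 6-510 kbps range")
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    # bits per frame = bitrate * (frame_ms / 1000); divide by 8 for bytes
    return bitrate_bps * frame_ms / 8000.0


for bps in (OPUS_MIN_BPS, 32_000, OPUS_MAX_BPS):
    print(f"{bps / 1000:.0f} kbps -> {avg_frame_bytes(bps):.0f} bytes per 20 ms frame")
```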

Opus has effectively replaced Speex. However, Speex has features that Opus does not, such as echo cancellation, which Opus leaves outside the codec. Therefore, if you need good echo cancellation, you can build on WebRTC's AEC and AECM modules.

Official website address: opus-codec.org/

live555

live555 is an open source C++ streaming media project. It includes transport protocols (SIP, RTP) and support for audio and video formats such as H.264 and MPEG4, along with example streaming media servers. It is a common first choice for streaming media projects, and its transport module is a useful reference for video conferencing development.

Official website address: www.live555.com/

Audio and video pre- and post-processing open source projects

Pre- and post-processing covers many specialized techniques which, applied correctly, can improve video quality to varying degrees. However, each additional processing step inevitably adds computation and latency, so each team has to weigh the trade-off for its own application.

Seetaface

Seetaface is a complete set of face detection, face alignment, and face verification solutions open sourced by the team of Professor Shan Shiguang at the Chinese Academy of Sciences. The code is implemented in C++ and released under the BSD-2 license, so it can be used free of charge by both academia and industry. It does not depend on any third-party libraries. On LFW images detected and aligned with its own open source modules, it reaches 97.1% accuracy.

Github address: https://github.com/seetaface/SeetaFaceEngine

GPUImage

On iOS, GPUImage is the usual choice today for beauty effects and watermarks. It has 125 built-in filters and supports custom filters. The project implements both still-image filters and real-time camera filters. Its advantage is that processing runs on the GPU, which gives higher performance than CPU processing.

Github address: https://github.com/BradLarson/GPUImage

Open nsfw model

Open nsfw model is a Yahoo open source project; its full name is Open Not Suitable For Work model. It is designed to identify images that are not suitable for viewing at work (in other words, pornographic images). It is a model trained with the Caffe framework and can be used in audio and video post-processing. However, it cannot identify frightening or gory images.

Github address: https://github.com/yahoo/open_nsfw

Soundtouch

SoundTouch is an open source audio processing library. Its main function is to change the speed and pitch of audio, for example to produce voice-changing effects, and it can also process media streams in real time. It works with 32-bit floating point or 16-bit fixed point samples, supports mono or stereo, and handles sampling rates from 8 kHz to 48 kHz.
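
To show what changing speed and pitch means numerically, here is a deliberately naive sketch. It is not SoundTouch's algorithm or API: it resamples with linear interpolation, which changes speed and pitch together, whereas a dedicated library is what you would use to adjust them separately. The function names are hypothetical.

```python
from typing import List


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def naive_resample(samples: List[float], ratio: float) -> List[float]:
    """Read through `samples` `ratio` times faster using linear interpolation.

    Played back at the original sample rate, the result is both faster
    and higher pitched (for ratio > 1); the two effects are coupled.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out: List[float] = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += ratio
    return out


ramp = [float(n) for n in range(9)]  # toy signal: 0.0 .. 8.0
octave_up = naive_resample(ramp, semitones_to_ratio(12))
print(len(ramp), "samples ->", len(octave_up), "samples")
```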

Official website address: www.surina.net/soundtouch/

Server-side open source projects

As mentioned earlier, WebRTC lacks a server-side design and deployment solution. Implementing multi-party chat with an MCU (Multipoint Control Unit) or SFU (Selective Forwarding Unit), and improving transmission quality, is left to developers. The following open source projects can help.
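
The difference between these server models is easiest to see by counting media streams per client. The sketch below is a simplified model: it adds a serverless full mesh for comparison (not discussed above), assumes one stream per participant, and ignores simulcast, screen sharing, and separate audio and video tracks.

```python
from typing import Tuple


def streams_per_client(participants: int, topology: str) -> Tuple[int, int]:
    """Return (uplink_streams, downlink_streams) for one client in a call."""
    n = participants
    if n < 2:
        raise ValueError("a call needs at least 2 participants")
    if topology == "mesh":  # every client sends to every other client directly
        return (n - 1, n - 1)
    if topology == "sfu":   # server forwards each sender's stream unchanged
        return (1, n - 1)
    if topology == "mcu":   # server mixes everything into a single stream
        return (1, 1)
    raise ValueError(f"unknown topology: {topology}")


for topo in ("mesh", "sfu", "mcu"):
    up, down = streams_per_client(6, topo)
    print(f"{topo}: {up} up, {down} down per client in a 6-person call")
```

The trade-off is that an MCU pays for its low client bandwidth with server-side decoding and mixing, while an SFU keeps the server cheap but leaves more downlink work to each client.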

Jitsi

Jitsi is an open source video conferencing system that supports online video conferencing, document sharing, and instant messaging. It uses the SFU model to act as a video router and is developed in Java. It supports SIP account registration for phone calls, and can be installed either locally on a single machine or on a cloud platform.

Official website address: jitsi.org/

JsSIP

JsSIP is a JavaScript SIP library built on WebRTC that runs in browsers and Node.js. It works with SIP servers such as OverSIP, Kamailio, Asterisk, and OfficeSIP.

Github address: https://github.com/versatica/JsSIP

SRS

SRS is a simple RTMP/HLS live streaming server developed in China and released under the MIT license. Recent versions also support HTTP-FLV, which combines the low latency of RTMP with HTTP delivery as used by HLS, adapting well to various network environments and supporting more players. Its functionality is similar to nginx-rtmp-module, and it can distribute streams over RTMP and HLS.
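
Part of why HLS adapts so well to ordinary HTTP infrastructure is that a stream is just a text playlist plus media segment files. The sketch below generates a minimal live media playlist; it is not SRS's own output, and the segment names and durations are hypothetical.

```python
import math
from typing import List, Tuple


def build_live_playlist(segments: List[Tuple[str, float]], media_sequence: int) -> str:
    """Build a minimal live HLS media playlist from (uri, duration_seconds) pairs."""
    if not segments:
        raise ValueError("at least one segment is required")
    # Target duration must be an integer no smaller than any segment duration.
    target = max(math.ceil(duration) for _, duration in segments)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    # No #EXT-X-ENDLIST tag: the stream is live, so players keep reloading.
    return "\n".join(lines) + "\n"


# Hypothetical sliding window of three segments.
playlist = build_live_playlist(
    [("live-100.ts", 9.8), ("live-101.ts", 10.0), ("live-102.ts", 9.6)],
    media_sequence=100,
)
print(playlist)
```

A player polls this playlist over plain HTTP and fetches new segments as they appear, which is also where the extra latency of HLS relative to RTMP comes from.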

Github address: https://github.com/ossrs/srs

JRTPLIB

JRTPLIB is an open source RTP library that supports Windows and Unix platforms. It supports multi-threading and performs well. It also supports RFC 3550, UDP over IPv6, and custom transport protocols. However, it does not support TCP transport, which developers need to implement themselves. It also does not packetize audio and video for you, so that code has to be written by the developer as well.
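
For readers who end up writing their own packetization, it helps to see what the RFC 3550 fixed header looks like on the wire. The sketch below is independent of JRTPLIB's API: it packs and parses the 12-byte fixed header only, with no padding, extension, or CSRC entries, and the field values in the example are hypothetical.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def pack_rtp_header(h: RtpHeader) -> bytes:
    """Pack the 12-byte RTP fixed header (P=0, X=0, CC=0)."""
    byte0 = (h.version & 0x03) << 6
    byte1 = (int(h.marker) << 7) | (h.payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: 1 + 1 + 2 + 4 + 4 bytes
        byte0, byte1,
        h.sequence & 0xFFFF,
        h.timestamp & 0xFFFFFFFF,
        h.ssrc & 0xFFFFFFFF,
    )


def parse_rtp_header(data: bytes) -> RtpHeader:
    """Parse the fixed header from the start of an RTP packet."""
    if len(data) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    byte0, byte1, seq, ts, ssrc = struct.unpack("!BBHII", data[:12])
    return RtpHeader(byte0 >> 6, bool(byte1 >> 7), byte1 & 0x7F, seq, ts, ssrc)


# Hypothetical values: dynamic payload type 96, marker set on the last packet of a frame.
header = RtpHeader(version=2, marker=True, payload_type=96,
                   sequence=4321, timestamp=90000, ssrc=0x12345678)
wire = pack_rtp_header(header)
print(wire.hex(), parse_rtp_header(wire))
```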

Github address: https://github.com/j0r1/JRTPLIB

OPAL

OPAL is the successor to OpenH323 and inherits its H.323 protocol support. It adds a SIP protocol stack and is a first choice for implementing the SIP protocol. Its drawback is that there are few reference examples.

Code address: http://sourceforge.net/projects/opalvoip/files/

Kurento

Kurento is a media server based on WebRTC and includes a series of APIs that can simplify the development of real-time video applications on web and mobile terminals.

Github address: https://github.com/Kurento

Janus

Janus is a WebRTC media gateway. Whether it is streaming media, video conferencing, recording, or gateway, it can all be implemented based on Janus.

Github address: https://github.com/meetecho/janus-gateway

Callstats.io

In real-time communication, quality problems such as latency, packet loss, connection rate, and drop rate all affect the user experience, and commercial projects need to pay particular attention to them. Callstats is a service provider that helps users collect call data and improve call quality by monitoring WebRTC calls.
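
Two of the metrics such monitoring relies on can be sketched in a few lines. This is not Callstats' API: the jitter function follows the interarrival jitter estimator from RFC 3550, and the loss function is a simplification that ignores sequence number wraparound. The sample values are made up for illustration.

```python
from typing import List, Tuple


def interarrival_jitter(packets: List[Tuple[float, float]]) -> float:
    """Running jitter estimate from (send_time, arrival_time) pairs in arrival order.

    Both times must use the same units; the result is in those units.
    """
    jitter = 0.0
    for (s_prev, r_prev), (s_cur, r_cur) in zip(packets, packets[1:]):
        # Difference in transit time between consecutive packets.
        d = (r_cur - r_prev) - (s_cur - s_prev)
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def loss_fraction(received_seqs: List[int]) -> float:
    """Fraction of packets missing between the lowest and highest sequence seen."""
    if not received_seqs:
        raise ValueError("no packets received")
    expected = max(received_seqs) - min(received_seqs) + 1
    return (expected - len(set(received_seqs))) / expected


# Made-up sample: packets sent every 20 ms, the second one arrives 16 ms late.
sample = [(0.0, 100.0), (20.0, 136.0)]
print("jitter (ms):", interarrival_jitter(sample))
print("loss:", loss_fraction([1, 2, 4, 5]))
```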

Callstats also publishes many examples on GitHub, which developers using Jitsi-videobridge, turn-server, and JsSIP can use as a reference.

Github address: https://github.com/callstats-io

Meetecho

Meetecho is the developer of Janus, the well-known open source WebRTC gateway project. The company also provides technical consulting and deployment services based on Janus, and builds video conference live streaming and recording services.

Github address: https://github.com/meetecho

Agora

Agora provides a full set of services from encoding and decoding to end-to-end transmission. Developers can integrate the open source pre- and post-processing projects mentioned above and use the Agora SDK to build high-quality real-time audio and video applications. On the web, the Agora Web SDK helps WebRTC developers deal with problems such as stuttering, latency, echo, and unstable multi-party video that can arise during server-side transmission. The Agora SDK also provides real-time audio and video communication for applications on multiple platforms.

Agora has a lot of demo source code on GitHub for developers to study and practice with. It covers the web, iOS, and Android platforms, and real-time interactive scenarios such as live audio and video streaming, in-game voice chat, corporate meetings, AR, live quizzes, and mini programs.

Github address: https://github.com/AgoraIO-Community

We have listed 18 open source projects here, along with 3 services that help ensure the quality of real-time audio and video transmission. Space is limited, though, and there are many open source projects we have not covered in detail, for example Speex and FLAC from Xiph.org, as well as Xvid, libvpx, Lagarith, Daala, and Thor. Everyone is welcome to add more.


Origin blog.csdn.net/m0_73443478/article/details/134994327