Current open source WebRTC project technology selection

Foreword

WebRTC technology is now mature, and major companies are integrating WebRTC features into their products. Besides audio and video communication itself, a WebRTC system also needs signaling, and its core module is the stream forwarding module, that is, the streaming media server. Implementing a streaming media server yourself is difficult and costly in time: it involves the DTLS protocol, the ICE protocol, SRTP/SRTCP, and more, and merely understanding these protocols takes a long time, let alone implementing them. The quickest route is therefore to use an open source implementation. This article introduces and compares several open source WebRTC projects.

If you want to build an audio and video conferencing system quickly, an open source solution is usually the fastest way and saves the most labor and time. The following sections analyze the advantages and disadvantages of common open source SFU (stream forwarding server) solutions, the core module of a video conferencing system, as a reference for choosing a suitable one.

1. Mediasoup

Mediasoup is a relatively recent open source WebRTC streaming media server library. Its address is:  https://github.com/versatica/mediasoup  .

Mediasoup consists of an application layer and a data processing layer. The application layer is implemented in Node.js; the data processing layer is implemented in C++ and includes the DTLS, ICE, and SRTP/SRTCP protocol implementations, routing and forwarding, and so on.

 

Mediasoup calls each instance a Worker. A Worker contains multiple Routers, and each Router is equivalent to a room. Each room can have multiple users or participants, and each participant is represented in Mediasoup by a Transport. In other words, from the point of view of a room (Router), a Transport is equivalent to a user.

There are three types of Transport: WebRtcTransport, PlainRtpTransport, and PipeTransport.

  • WebRtcTransport connects WebRTC clients, such as browsers.
  • PlainRtpTransport connects traditional RTP clients; through it you can play media files or push streams with FFmpeg. (Newer mediasoup v3 releases name this transport PlainTransport.)
  • PipeTransport connects Routers to each other, that is, it carries the audio and video streams of one room to another room.

Each Transport can contain multiple Producers and Consumers.

  • A Producer represents a sharer of a media stream. There are two kinds: audio producers and video producers.
  • A Consumer represents a consumer of a media stream. There are likewise two kinds: audio consumers and video consumers.

The implementation logic of Mediasoup is very clear. It does not care what the upper-layer application does; it only cares about transmitting the underlying data, and it pursues that single goal as far as it can.
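The containment described above (Worker, Router, Transport, Producer/Consumer) can be pictured with a small data model. This is a plain Python sketch and not the mediasoup API, which is driven from Node.js; every class, method, and identifier below is an illustrative assumption that only mirrors the hierarchy.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Producer:
    """A media stream shared by a participant; kind is 'audio' or 'video'."""
    id: str
    kind: str


@dataclass
class Consumer:
    """A subscription to another participant's Producer."""
    id: str
    producer_id: str
    kind: str


@dataclass
class Transport:
    """Roughly one participant's connection to a room."""
    id: str
    producers: Dict[str, Producer] = field(default_factory=dict)
    consumers: Dict[str, Consumer] = field(default_factory=dict)


@dataclass
class Router:
    """Roughly one room: it forwards media between its transports."""
    id: str
    transports: Dict[str, Transport] = field(default_factory=dict)

    def produce(self, transport_id: str, producer: Producer) -> None:
        # Register a stream that this participant shares with the room.
        self.transports[transport_id].producers[producer.id] = producer

    def consume(self, transport_id: str, producer_id: str) -> Consumer:
        # Find the producer anywhere in the room, then attach a consumer
        # to the receiving participant's transport.
        for transport in self.transports.values():
            if producer_id in transport.producers:
                source = transport.producers[producer_id]
                consumer = Consumer(
                    id=f"{transport_id}:{producer_id}",
                    producer_id=source.id,
                    kind=source.kind,
                )
                self.transports[transport_id].consumers[consumer.id] = consumer
                return consumer
        raise KeyError(producer_id)


@dataclass
class Worker:
    """One media-processing instance hosting several routers (rooms)."""
    routers: Dict[str, Router] = field(default_factory=dict)


# Hypothetical usage: Alice shares video, Bob subscribes to it.
worker = Worker()
room = Router(id="room-1")
worker.routers[room.id] = room
room.transports["alice"] = Transport(id="alice")
room.transports["bob"] = Transport(id="bob")
room.produce("alice", Producer(id="alice-video", kind="video"))
consumer = room.consume("bob", "alice-video")
```

The point of the sketch is that the server only tracks who produces and who consumes; room policy, user management, and signaling stay in the application layer.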

The bottom layer of Mediasoup is written in C++ and uses libuv as its asynchronous I/O event library, which gives it a solid basis for high performance. It also supports almost all of WebRTC's optimizations for real-time transmission, so it is an excellent WebRTC SFU streaming server. Compared with Janus, it focuses more on the real-time behavior, efficiency, and simplicity of data transmission, whereas Janus does more than Mediasoup and has a more complex architecture and logic.

For companies with relatively strong development capabilities, doing secondary development on Mediasoup to fit their own business needs is a highly recommended approach.

On mobile, you need to implement the Android and iOS SDKs yourself.

2. Licode

Licode can serve as either an SFU-type or an MCU-type streaming media server; in practice it is generally used as an SFU.

Licode is not only a streaming media communication server but a complete system that includes a media communication layer, a business layer, user management, and other functions. The system also supports distributed deployment.

Licode is implemented in C++ and Node.js: the media communication part is written in C++, while signaling control, user management, and room management are written in Node.js. Its source address is:  https://github.com/lynckia/licode  . The following picture is the overall architecture of Licode:

 

From this picture, you can see that Licode is functionally divided into three parts, namely Nuve, ErizoController, and ErizoAgent, which communicate through message queues.

  • Nuve is a web service that manages users and rooms, generates tokens, and load-balances rooms. It uses MongoDB to store room and token information, but not user information.
  • ErizoController handles management and control: signaling and non-audio/video data are received through it. It communicates with Nuve through a message queue, which means Nuve can control ErizoController through that queue.
  • ErizoAgent transmits the audio and video streaming data and can be deployed in a distributed manner. ErizoAgent and ErizoController also communicate through a message queue: after ErizoController receives a signaling message, it forwards it to ErizoAgent through the queue, and in this way controls ErizoAgent.
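The control flow between the three components can be sketched with in-process queues. This is only an illustration of the message-passing pattern described above, not Licode code: the message fields, function names, and the use of Python's `queue.Queue` in place of a real message broker are all assumptions.

```python
import queue

# In-process queues stand in for the real message broker (assumption).
controller_queue = queue.Queue()
agent_queue = queue.Queue()


def nuve_create_room(room_id):
    """Nuve side: announce a room to the controller."""
    controller_queue.put({"type": "create_room", "room": room_id})


def client_publish(room_id, stream_id):
    """Client signaling arrives at ErizoController."""
    controller_queue.put({"type": "publish", "room": room_id, "stream": stream_id})


def run_controller(rooms):
    """ErizoController: handle signaling, forward media commands to the agent."""
    while not controller_queue.empty():
        msg = controller_queue.get()
        if msg["type"] == "create_room":
            rooms.add(msg["room"])
        elif msg["type"] == "publish" and msg["room"] in rooms:
            # Only signaling for known rooms is turned into an agent command.
            agent_queue.put(
                {"type": "add_publisher", "room": msg["room"], "stream": msg["stream"]}
            )


def run_agent(streams):
    """ErizoAgent: apply media commands received from the controller."""
    while not agent_queue.empty():
        msg = agent_queue.get()
        if msg["type"] == "add_publisher":
            streams.setdefault(msg["room"], []).append(msg["stream"])


rooms = set()
streams = {}
nuve_create_room("r1")
client_publish("r1", "s1")
client_publish("unknown-room", "s2")  # dropped: the room was never created
run_controller(rooms)
run_agent(streams)
```

Because the components only share queues, each one can run in its own process or on its own machine, which is what makes the distributed deployment of ErizoAgent possible.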

Licode is not just an SFU streaming media server. It also includes a streaming-media business management system, a signaling system, the streaming media server itself, and a client SDK, so it can be considered a relatively complete product.

If you use Licode as your streaming server, you basically do not need to do secondary development, which makes it very attractive to companies and individuals with no audio and video experience. At present, the Intel CS project is developed on the basis of Licode and has provided services for many companies.

The official website provides learning demos and documents.

Licode also has the following disadvantages:

  • It has about 2.4k GitHub stars, and issues and PRs are fairly active, but the community communicates through traditional question-and-answer channels, so real-time communication is relatively poor.
  • On Linux, only Ubuntu 14.04 is currently supported, and it is difficult to get it to compile on other versions.
  • Licode includes not only an SFU but also an MCU, so its code base is relatively heavy and takes a lot of time to learn and master.
  • The performance of Licode is average. If streaming server performance is your first priority, Licode is not an ideal SFU streaming server.
  • There do not appear to be official SDKs for Android and iOS. Third parties have implemented them, but those have not been updated for a long time, so if you need Android and iOS support you will probably have to build it yourself.

3. Janus-gateway

Janus is a well-known WebRTC streaming media server. It is a Linux-style service program written in C. It can be compiled and deployed on Linux and macOS, but it does not support Windows.

It is an open source project, and compiling and installing it from source is straightforward: just follow the instructions on GitHub. The address of the source code and build manual is:  https://github.com/meetecho/janus-gateway  .

Deploying Janus is also simple. The specific steps are detailed in the documentation at:

https://janus.conf.meetecho.com/docs/deploy.html

 

From the above architecture diagram, you can see that Janus is divided into two layers, namely the application layer and the transport layer.

The plug-in layer is also called the application layer. Each application is a plug-in that can be loaded or unloaded dynamically according to the user's needs. A plug-in architecture is a very good design: it is flexible, easy to extend, and fault tolerant, and it suits products with complex business logic. Its disadvantage is that it is complicated to implement and relatively costly.

The plugins supported by default in Janus include the following.

  • SIP: This plugin makes Janus a proxy for SIP users, allowing WebRTC endpoints to register with a SIP server (such as Asterisk) and send or receive audio and video streams to or from the SIP server.
  • TextRoom: This plugin implements a text chat room application using DataChannel.
  • Streaming: This plugin allows WebRTC endpoints to watch or listen to pre-recorded files or media generated by other tools.
  • VideoRoom: This plugin implements the SFU service for video conferencing; it is in effect an audio/video router.
  • VideoCall: This is a simple video call application that allows two WebRTC endpoints to communicate with each other. It is similar to the example on the WebRTC official website (  https://apprtc.appspot.com  ), except that here the video stream is relayed through Janus, whereas the example on the WebRTC official website uses a direct P2P connection.
  • RecordPlay: This plugin has two functions: it records media sent to the server over WebRTC, and it plays recordings back over WebRTC.
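The plug-in idea, where the core only routes messages and each application lives in a loadable unit, can be sketched as a small registry. This is a conceptual Python illustration, not how Janus is implemented (Janus is written in C); the class, handler names, and message shapes are all hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class PluginRegistry:
    """A core that knows nothing about applications except their names."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Handler] = {}

    def load(self, name: str, handler: Handler) -> None:
        self._plugins[name] = handler

    def unload(self, name: str) -> None:
        # Unloading an unknown plug-in is a no-op.
        self._plugins.pop(name, None)

    def dispatch(self, name: str, message: dict) -> dict:
        handler = self._plugins.get(name)
        if handler is None:
            return {"error": f"no such plugin: {name}"}
        return handler(message)


def text_room(message: dict) -> dict:
    """Hypothetical chat plug-in: echoes the text it receives."""
    return {"plugin": "textroom", "echo": message.get("text", "")}


def video_room(message: dict) -> dict:
    """Hypothetical conferencing plug-in: acknowledges a join request."""
    return {"plugin": "videoroom", "joined": message.get("room")}


registry = PluginRegistry()
registry.load("textroom", text_room)
registry.load("videoroom", video_room)

chat_reply = registry.dispatch("textroom", {"text": "hello"})
join_reply = registry.dispatch("videoroom", {"room": 1234})

registry.unload("textroom")
after_unload = registry.dispatch("textroom", {"text": "hello"})
```

A failure or removal of one plug-in leaves the others untouched, which is the fault-tolerance benefit mentioned above; the cost is the extra indirection the core has to maintain.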

The transport layer covers media data transmission and signaling transmission. The media transport layer mainly implements the streaming media protocols and related protocols used in WebRTC, such as DTLS, ICE, SDP, RTP, SRTP, and SCTP.

The signaling transport layer handles Janus signaling. The transports it supports include HTTP/HTTPS, WebSocket (plain and secure), Nanomsg, MQTT, Unix sockets, and RabbitMQ. Note that compile-time options control whether some of these are built, so not all of them are installed by default. All Janus signaling messages are in JSON format.
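To make the JSON signaling concrete, here is a minimal sketch that only builds request bodies and sends nothing. The `janus` and `transaction` fields and the `create`/`attach` request types follow the Janus API documentation as I understand it; the helper name and the session id value are placeholders, so check the documentation for your Janus version before relying on the exact fields.

```python
import json
import uuid


def janus_request(kind, **fields):
    """Build a Janus-style JSON request with a random transaction id."""
    message = {"janus": kind, "transaction": uuid.uuid4().hex}
    message.update(fields)
    return json.dumps(message)


# Create a session, then attach a plug-in handle to it.
# session_id=1234 is a placeholder; a real id comes back from "create".
create = janus_request("create")
attach = janus_request("attach", session_id=1234, plugin="janus.plugin.videoroom")

print(create)
print(attach)
```

The same JSON body can travel over any of the supported transports, which is why the choice of transport can be left to compile-time and deployment configuration.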

The overall architecture of Janus is plug-in based, which works very well: users can easily write their own applications on top of it according to their needs.

It also supports many features already, such as SIP, RTSP, audio and video file playback, and recording, so it has a great advantage when integrating with other systems.

In addition, its core is written in C and performs very well. The development and deployment manuals for Janus are also very complete, so it is a great open source project.

It has about 4.1k GitHub stars, and issues and PRs are handled relatively quickly.

The official SDK for Android and iOS is provided.

Disadvantages:

  • The architecture is complicated and not suitable for beginners. If a company adopts it, the labor and time costs will be relatively high.
  • Janus does not use an asynchronous I/O event mechanism such as epoll, which is arguably a major weakness.
  • Janus relies on the glib library. Many developers, particularly in China, have little experience with glib, so there is a certain learning cost.

4. Medooze

Medooze is a comprehensive streaming media server that not only supports the WebRTC protocol stack, but also supports many other protocols, such as RTP, RTMP, etc. Its source address is:  https://github.com/medooze/media-server

 

Broadly speaking, Medooze supports RTP/RTCP, SRTP/SRTCP, and other related protocols, so it can interconnect with WebRTC endpoints. Medooze can also ingest RTP streams, RTMP streams, and so on, so you can use GStreamer or FFmpeg to push a stream to Medooze, and other WebRTC endpoints in the same room can then see and hear the audio and video pushed by GStreamer or FFmpeg. Medooze also supports recording, which is the role of the Recorder module in the figure above: the audio and video streams in a room can be recorded for later playback.

The control logic layer of Medooze is implemented in Node.js. Medooze exposes a complete set of control APIs through Node.js, and through these APIs you can easily control its behavior.

Compared with Mediasoup, Medooze is similar at the core layer but offers more features, including recording, pushing RTMP streams, and playing FLV files, which Mediasoup does not provide.

Medooze also has some shortcomings. Although it is also a streaming media server written in C++ and uses an asynchronous I/O event mechanism, the API it uses is poll. poll scales considerably worse than epoll as the number of connections grows, which makes Medooze slightly slower than Mediasoup when receiving and sending audio and video packets.
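The difference matters because with poll the full descriptor set is handed to the kernel and scanned on every call, while epoll registers descriptors once and returns only the ready ones. The sketch below shows the register-once, react-to-ready pattern using Python's standard `selectors` module, which picks epoll on Linux and another backend elsewhere. It illustrates the event-loop style only and is unrelated to the actual Medooze or Mediasoup code; the `peer-b` label and payload are made up.

```python
import selectors
import socket

# DefaultSelector chooses the most efficient backend available
# (epoll on Linux, kqueue on BSD/macOS, otherwise poll or select).
sel = selectors.DefaultSelector()

# A connected local socket pair stands in for a media connection.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

# Register once; later select() calls do not resubmit the descriptor.
sel.register(b, selectors.EVENT_READ, data="peer-b")

a.send(b"rtp-packet")

received = []
for key, events in sel.select(timeout=1.0):
    # Only sockets that are actually ready are returned here.
    if events & selectors.EVENT_READ:
        received.append((key.data, key.fileobj.recv(2048)))

sel.unregister(b)
sel.close()
a.close()
b.close()
```

With a handful of sockets the two APIs behave about the same; the gap opens up when a server holds thousands of mostly idle connections.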

5. Jitsi

Jitsi's server side is built in Java, whereas the other projects above use C/C++ at the bottom layer, so its performance is generally not as good as theirs.

Main modules and implementation languages:

  • Jitsi Videobridge (software video bridge; implemented in Java)
  • Jitsi Jicofo (a component required for Jitsi conferences; implemented in Java)
  • Prosody (XMPP server; implemented in Lua)
  • Nginx (web server)
  • Jitsi Meet (the web application that end users interact with; implemented in JavaScript)

Advantages:

  • It has about 12.3k GitHub stars, and issues and PRs are handled quickly.
  • The documentation is complete.
  • Official Android and iOS SDKs are provided, and you can also build the SDK yourself; it uses React Native.
  • An official web SDK is provided, along with Electron packaging for the desktop, so client coverage is very complete.
  • The community communicates through forums and is very active.
  • The community provides distributed deployment solutions, but the documentation for them is sparse.
  • Every Monday the maintenance team holds a video conference on Jitsi to answer developers' questions. It is held in English, at what seems to be night time in China.
  • The community edition is updated and iterated quickly.

6. Kurento

Kurento, like Jitsi, has been maintained for many years and has stood the test of time. The difference is that it is written in C++. It has rich documentation and examples and is very friendly to developers.

7. pion/webrtc

pion/webrtc is a pure Go implementation of the WebRTC API with about 4.7k stars on GitHub. It currently has a small user base, so it is not recommended for production environments. It is useful for learning and as a reference, and it is worth following over the long term.

Summary

When choosing an SFU streaming media server, there is no best option, only the most suitable one. Each open source implementation has its own characteristics, and each can be used in real products. As a developer, however, you have your own technical background, and you need to choose according to both your own strengths and the characteristics of your project. The following describes how I evaluate and select among these open source projects.

Team: A team will naturally choose a language that everyone is familiar with as the project's development language, so the open source project you choose should be written in that language. For example, teams at Alibaba develop mostly in Java, so they will mostly choose open source projects written in Java, while developers of audio and video streaming services generally choose projects written in C/C++ for the sake of performance. If the team is short-staffed, try not to choose especially complicated open source technologies with little documentation.

Business fit: Fully consider the user volume and user groups of your business. If the volume is large and requires a distributed system, first find out whether the open source technology you are considering supports distributed deployment and how it is deployed that way. As for how much concurrency a single machine supports, it is best to test on a real server yourself, because official figures will differ somewhat from your own measurements. Project features matter as well: if, for example, the business needs recording and playback and the open source technology lacks that feature, you will have to build it yourself or choose another project.
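Before running a real load test, a rough estimate helps decide what to measure. The sketch below assumes the simplest SFU pattern, in which every participant publishes one stream and subscribes to every other participant's stream, so a room of n participants costs the server n inbound and n × (n − 1) outbound streams. The bitrate and link capacity in the example are made-up inputs, not measurements of any of the projects above.

```python
def sfu_room_load(participants, kbps_per_stream):
    """Estimate stream counts and bandwidth for one full-mesh-subscribed room."""
    if participants < 2:
        raise ValueError("a room needs at least two participants")
    inbound = participants                        # one published stream each
    outbound = participants * (participants - 1)  # everyone receives all others
    return {
        "inbound_streams": inbound,
        "outbound_streams": outbound,
        "inbound_kbps": inbound * kbps_per_stream,
        "outbound_kbps": outbound * kbps_per_stream,
    }


def max_rooms(egress_limit_kbps, participants, kbps_per_stream):
    """How many such rooms fit into a given outbound bandwidth budget."""
    per_room = sfu_room_load(participants, kbps_per_stream)["outbound_kbps"]
    return egress_limit_kbps // per_room


# Hypothetical inputs: 4-person rooms, 1000 kbps per stream, 1 Gbps egress.
load = sfu_room_load(4, 1000)
rooms_on_link = max_rooms(1_000_000, 4, 1000)
```

Outbound traffic grows quadratically with room size, so bandwidth is often exhausted before CPU; an estimate like this only bounds the network side, and the real single-machine limit still has to come from your own tests.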

Secondary development: Licode is a complete system that supports distributed cluster deployment, so it is relatively complex and takes longer to learn. It can be deployed directly in production, but it is not flexible enough for secondary development. Janus-gateway is a standalone service that supports a rich set of signaling transports, supports plug-in development, and is easy to extend, which makes it a good choice for developers with a Linux/C background. Medooze and Mediasoup are both streaming server libraries and are the natural choice for developers who need to integrate a streaming server into their own products.

Time cost: A company also has to consider the project schedule and cost. Using open source technology always involves some pitfalls, and a single one can hold you up for a long time, so prefer open source technologies with complete documentation and active communities.

Whichever open source technology you choose, do thorough research up front, and actually build and use it before making a decision. After choosing it, you need to understand its code in depth to pay down technical debt as you go; otherwise repaying that debt later will be very painful.


Origin blog.csdn.net/yinshipin007/article/details/132324743