WebRTC for the Web Front End (1): Basic Introduction

With the rapid growth of the Internet and the arrival of the 5G era, WebRTC has become a key tool for front-end interactive live streaming and real-time audio/video, and a field front-end developers should not miss. If you have only heard the name so far, this article is a good place to learn more.

What is WebRTC?

The full name of WebRTC is Web Real-Time Communication: real-time communication built into web browsers.

Around 2011, Google acquired GIPS (Global IP Solutions), a company that had developed many RTC components, such as codecs and echo-cancellation technology. Google open-sourced the technology developed by GIPS, hoping to make it an industry standard.

The acquisition cost a great deal of money, yet Google simply open-sourced the result. That is admirable, but for Google an open audio/video ecosystem clearly carries greater value. "Browser + WebRTC" is the answer Google gave: its vision is fast audio and video communication directly between browsers.

Simply put, today WebRTC is a free and open project that enables web browsers to implement real-time communication through a simple JavaScript API.


WebRTC architecture

Discussions of the WebRTC architecture usually start from the well-known official layer diagram. From top to bottom, the architecture is as follows:

Web API layer : provides standard JavaScript APIs for developers; front-end applications access and use WebRTC capabilities through this layer.

C++ API layer : for browser developers, making it easy for browser vendors to implement the Web API.

Audio engine (VoiceEngine) : a framework for the whole audio processing chain, covering everything from sound-card capture to the network transmission end.

  1. Codecs such as iSAC/iLBC/Opus.
  2. NetEQ: adaptive jitter buffering and packet-loss concealment for speech.
  3. Echo cancellation and noise reduction.

Video engine (VideoEngine) : the overall framework for video processing, covering the whole pipeline from camera capture, through network transmission, to video rendering.

  1. VP8 codec.
  2. jitter buffer: dynamic jitter buffer.
  3. Image enhancements: image quality processing, such as removing video noise.

Transport : the transport/session layer, handling session negotiation and NAT traversal.

  1. RTP (Real-time Transport Protocol).
  2. Network traversal via P2P transmission: STUN + TURN + ICE.

Hardware modules : audio/video hardware capture and network I/O.


Important classes and APIs of WebRTC

Network Stream API

1. MediaStream (media stream) and MediaStreamTrack (media track)

This class does not strictly belong to WebRTC itself, but it is involved when obtaining the local media stream and rendering the remote stream into a video tag. A media stream (MS) consists of two parts: MediaStreamTrack and MediaStream.

  • MediaStreamTrack: a media track representing a single type of data stream, either an audio track or a video track.
  • MediaStream: a complete audio/video stream that can contain zero or more MediaStreamTracks; its main job is to keep the contained tracks playing in sync.
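The relationship can be made concrete with a small snippet. The helper below is purely illustrative (summarizeStream is our own name, not a WebRTC API); it only assumes the stream exposes the standard getTracks() method:

```javascript
// Illustrative helper (not part of WebRTC): list the kinds of tracks
// a MediaStream-like object contains.
function summarizeStream(stream) {
  return stream.getTracks().map(function (track) {
    return track.kind; // "audio" or "video"
  });
}

// In the browser, `stream` would come from getUserMedia, e.g.:
// navigator.mediaDevices.getUserMedia({ audio: true, video: true })
//   .then(function (stream) {
//     console.log(summarizeStream(stream)); // e.g. ["audio", "video"]
//   });
```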

2. Constraints (media constraints)

Regarding MediaStream, there is another important concept: Constraints. Constraints specify the requirements the captured data must satisfy, and are set through parameters.

// Basic form
const constraint1 = {
    "audio": true,  // whether to capture audio
    "video": true   // whether to capture video
}

// Detailed form
const constraint2 = {
    "audio": {
        "sampleSize": 8,
        "echoCancellation": true // echo cancellation
    },
    "video": {  // video-related settings
        "width": {
            "min": 381, // minimum width of the video
            "max": 640
        },
        "height": {
            "min": 200, // minimum height
            "max": 480
        },
        "frameRate": {
            "min": 10, // minimum frame rate
            "max": 28
        }
    }
}

3. Obtaining the device's local audio and video

Local media streams are acquired with navigator.getUserMedia(), which provides access to the user's local camera/microphone media stream.

var video = document.querySelector('video');
navigator.getUserMedia({
    audio : true,
    video : true
    }, function (stream) {
            // got the local media stream
            video.src = window.URL.createObjectURL(stream);
    }, function (error) {
            console.log(error);
});

In the demo above, getUserMedia triggers the browser's permission prompt; only after the user grants access is the stream passed to the video tag for playback.

The first parameter of getUserMedia is the constraints object, and the second is a callback that receives the media stream. You can also use the Promise-based form:

navigator.mediaDevices.getUserMedia(constraints)
    .then(successCallback)
    .catch(errorCallback);
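A fuller sketch of the Promise form, assuming the page has a video element with id "local_video" (attachStream is our own helper name, not a WebRTC API):

```javascript
// Illustrative helper: attach a media stream to a video element.
// Modern browsers accept the stream directly via srcObject;
// URL.createObjectURL(stream) is the older, now-deprecated alternative.
function attachStream(videoEl, stream) {
  videoEl.srcObject = stream;
  return videoEl;
}

// Browser usage (sketch):
// navigator.mediaDevices
//   .getUserMedia({ audio: true, video: true })
//   .then(function (stream) {
//     attachStream(document.getElementById('local_video'), stream);
//   })
//   .catch(function (err) {
//     console.log('getUserMedia failed: ' + err.name);
//   });
```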

RTCPeerConnection

RTCPeerConnection is used to achieve NAT traversal between peers, establishing a channel that transmits audio/video data streams without going through a server.

This is abstract, so here is an imperfect but helpful metaphor: RTCPeerConnection establishes a link somewhat like a WebSocket connection, except that the link runs directly between browsers, and it is an advanced, powerful channel for transmitting audio and video data.

It is advanced and powerful because it is the core API of WebRTC's web layer: you do not need to worry about transmission delay and jitter, audio/video codecs, or audio/video synchronization. By using PeerConnection directly, you get these underlying capabilities that the browser has already encapsulated.

var pc = new RTCPeerConnection({
    "iceServers": [
        { "urls": "stun:stun.l.google.com:19302" }, // Google public test STUN server
        { "urls": "turn:[email protected]", "credential": "pass" } // configure a TURN server here if you have one
    ]
});
pc.setRemoteDescription(remote.offer);
pc.addIceCandidate(remote.candidate);
pc.addStream(local.stream);
pc.createAnswer(function (answer) {
    // generate the SDP answer describing this end of the connection and send it to the peer
    pc.setLocalDescription(answer);
    signalingChannel.send(answer.sdp);
});
pc.onicecandidate = function (evt) {
    // send each locally gathered ICE candidate to the peer
    if (evt.candidate) {
        signalingChannel.send(evt.candidate);
    }
}
pc.onaddstream = function (evt) {
    // received the remote stream; play it
    var remote_video = document.getElementById('remote_video');
    remote_video.src = window.URL.createObjectURL(evt.stream);
}

You may wonder: what is the iceServers configuration here? What is signalingChannel? What are offer and answer? What is a candidate?

We create an RTCPeerConnection with new RTCPeerConnection(). The code above only shows the RTCPeerConnection API and how to configure it; on its own it does not work.

To complete an RTCPeerConnection, you need to configure ICE servers (STUN or TURN), and the peers must exchange information before connecting. For that you need a signaling server, mainly to exchange SDP (Session Description Protocol) messages and ICE candidates. We introduce these in the following sections.

Peer-to-peer Data API

RTCDataChannel establishes peer-to-peer communication between browsers. Common communication methods include WebSocket and Ajax; although WebSocket is bidirectional, both WebSocket and Ajax are client-to-server communication: you must run a server to communicate.

Because RTCDataChannel uses RTCPeerConnection, it provides point-to-point communication without passing through a server, avoiding the server as a middleman.

var pc = new RTCPeerConnection();
var dc = pc.createDataChannel("my channel");

dc.onmessage = function (event) {
  console.log("received: " + event.data);
};

dc.onopen = function () {
  console.log("datachannel open");
};

dc.onclose = function () {
  console.log("datachannel close");
};
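Two details worth noting, sketched below: on the answering side the channel is not created explicitly but arrives via the peer connection's ondatachannel event, and RTCDataChannel.send() throws if called before the channel is open, so it helps to guard on readyState first (safeSend is our own helper name, not a WebRTC API):

```javascript
// Illustrative guard (not a WebRTC API): only send when the channel is open.
function safeSend(channel, data) {
  if (channel.readyState === 'open') {
    channel.send(data);
    return true;
  }
  return false;
}

// On the answering peer, the channel arrives through an event (sketch):
// pc.ondatachannel = function (event) {
//   var dc = event.channel;
//   dc.onopen = function () { safeSend(dc, 'hello from the remote peer'); };
//   dc.onmessage = function (e) { console.log('received: ' + e.data); };
// };
```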

Signaling

We said that WebRTC's RTCPeerConnection lets browsers communicate without a server.

But there is a problem: if two browsers do not go through a server before establishing a PeerConnection, how do they even learn of each other's existence? Furthermore, how do they discover each other's network location (IP/port, etc.)? Which codecs the other side supports? When the media stream starts, and when it ends?

Therefore, before establishing WebRTC's RTCPeerConnection, another channel must be set up to deliver this negotiation information. The information is called signaling , and the channel is called the signaling channel (Signaling Channel) .

The signaling exchanged by the two client browsers has the following functions:

  • Negotiate Media Capabilities and Settings
  • Identify and authenticate session participants (exchanging metadata in SDP objects: media types, codecs, bandwidth, and so on)
  • Control media sessions, indicate progress, change sessions, terminate sessions, etc.

It mainly involves exchanging SDP (offer/answer) session descriptions and ICE candidates.
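As a sketch, a signaling message in this scheme carries either an offer, an answer, or a candidate, and each side dispatches on which field is present (classifySignal is an illustrative helper of our own, not part of any spec):

```javascript
// Illustrative dispatcher for signaling messages exchanged over the channel.
function classifySignal(msg) {
  if (msg.offer) return 'offer';         // remote SDP offer: setRemoteDescription, then createAnswer
  if (msg.answer) return 'answer';       // remote SDP answer: setRemoteDescription
  if (msg.candidate) return 'candidate'; // remote ICE candidate: addIceCandidate
  return 'unknown';
}
```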

One thing to note here:

The WebRTC standard itself does not specify how signaling is exchanged; you implement the signaling service however suits your situation.

Generally, a WebSocket channel is used as the signaling channel; for example, a signaling service can be built on http://socket.io . There are also many open-source, stable, and mature signaling solutions in the industry to choose from.

The key to establishing a WebRTC connection: the ICE connection

After exchanging SDP, WebRTC starts the real connection for transmitting audio and video data. Establishing the connection is rather complicated, because WebRTC must ensure both efficient transmission and stable connectivity .

Browser clients can be in quite complicated network positions: on the same intranet segment, or in two entirely different places, possibly behind very complex NAT gateways. So a mechanism is needed to find the path with the best transmission quality, and WebRTC has that ability.

First, simply understand the following three concepts.

  • ICE Candidate : contains the protocol used for remote communication, the IP address and port, the candidate type, and so on.
  • STUN/TURN : STUN enables direct P2P connections; TURN provides relayed connections. Both have standard protocols. (Refer to the picture below.)
  • NAT traversal : NAT is network address translation. Since clients cannot each be assigned a public IP, internal IPs and ports are mapped to a public IP and port to communicate with the outside. NAT traversal means that clients behind layers of NAT gateways find each other and establish a connection.

The general principle and steps of ICE connection are as follows:

  1. Start the task of gathering ICE candidates.
  2. The local machine gathers host-type candidates (intranet IP and port).
  3. srflx-type candidates (the NAT-mapped public IP and port) are gathered through the STUN server.
  4. relay-type candidates (the relay server's IP and port) are gathered through the TURN server.
  5. NAT traversal is attempted, connecting in priority order: host type, then srflx type, then relay type.
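To make the preference in step 5 concrete, here is a simplified sketch. Real ICE computes numeric priorities per RFC 8445; this illustrative version (our own helper names) only uses the candidate `type` field that RTCIceCandidate exposes:

```javascript
// Simplified preference order: direct host candidates first, then
// server-reflexive (srflx), then relayed (relay) candidates.
var TYPE_PREFERENCE = { host: 0, srflx: 1, relay: 2 };

function typeRank(candidate) {
  var rank = TYPE_PREFERENCE[candidate.type];
  return rank === undefined ? 3 : rank; // unknown types sort last
}

function sortCandidatesByPreference(candidates) {
  return candidates.slice().sort(function (a, b) {
    return typeRank(a) - typeRank(b);
  });
}
```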

With this, WebRTC can find the connection path with the best transmission quality. Of course, reality is not that simple; the whole process involves many complex low-level details.

Demo code: the steps for using WebRTC

With the above understanding of the WebRTC APIs, signaling services, SDP negotiation, and ICE connectivity, we can illustrate the steps of using WebRTC with a single piece of code.

var signalingChannel = new SignalingChannel();
var pc = null;
var ice = {
    "iceServers": [
        { "urls": "stun:stun.l.google.com:19302" }, // Google public test STUN server
        { "urls": "turn:[email protected]", "credential": "pass" } // configure a TURN server here if you have one
    ]
};
signalingChannel.onmessage = function (msg) {
    if (msg.offer) { // handle a remote offer delivered over the signaling channel
        pc = new RTCPeerConnection(ice);
        pc.onicecandidate = function (evt) { // send locally gathered ICE candidates to the peer
            if (evt.candidate) {
                signalingChannel.send(evt.candidate);
            }
        };
        pc.onaddstream = function (evt) { // received the remote stream; play it
            var remote_video = document.getElementById('remote_video');
            remote_video.src = window.URL.createObjectURL(evt.stream);
        };
        pc.setRemoteDescription(msg.offer);
        navigator.getUserMedia({ "audio": true, "video": true }, gotStream, logError);
    } else if (msg.candidate) { // register remote ICE candidates to start connectivity checks
        pc.addIceCandidate(msg.candidate);
    }
}
function gotStream(stream) {
    pc.addStream(stream);
    var local_video = document.getElementById('local_video');
    local_video.src = window.URL.createObjectURL(stream);
    pc.createAnswer(function (answer) { // generate the SDP answer describing this end and send it to the peer
        pc.setLocalDescription(answer);
        signalingChannel.send(answer.sdp);
    });
}
function logError() { ... }
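The demo above only covers the answering side. For completeness, the offering side might look like the sketch below; buildOfferMessage and buildCandidateMessage are our own illustrative helpers that shape messages the way the demo's onmessage handler expects, and the browser calls are commented out since they require a real browser:

```javascript
// Illustrative message builders for the signaling exchange.
function buildOfferMessage(description) {
  return { offer: description };
}
function buildCandidateMessage(candidate) {
  return { candidate: candidate };
}

// Offering side (browser sketch):
// var pc = new RTCPeerConnection(ice);
// pc.onicecandidate = function (evt) {
//   if (evt.candidate) signalingChannel.send(buildCandidateMessage(evt.candidate));
// };
// navigator.mediaDevices.getUserMedia({ audio: true, video: true })
//   .then(function (stream) {
//     pc.addStream(stream);
//     return pc.createOffer();
//   })
//   .then(function (offer) {
//     return pc.setLocalDescription(offer).then(function () {
//       signalingChannel.send(buildOfferMessage(pc.localDescription));
//     });
//   });
```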

The current state of WebRTC

Standards

In the beginning, each browser vendor implemented its own API, with prefixed differences such as webkitRTCPeerConnection and mozRTCPeerConnection, which was of course miserable for front-end developers.

adapter.js exists to smooth out these differences and help us write WebRTC code according to the specification. See https://github.com/webrtcHacks/adapter.

Another key point about the standard: the WebRTC 1.0 Candidate Recommendation published by the W3C in 2018 ( https://www.w3.org/TR/webrtc ) made WebRTC a main driving force behind the explosion of video-communication applications. This is why the vast majority of communication vendors now implement the web/browser side with WebRTC.

Compatibility

Progress in the standard inevitably drives better compatibility and support. Around 2017 I did a pre-study for an HTML5 online claw-machine game, and found that many browsers, especially on mobile and iOS, were completely unusable, so I had to abandon the WebRTC solution.

Today browser support looks very good: apart from IE, which still lacks support, desktop browsers basically all support it, and on mobile iOS 11 and above supports it.

One key point: don't just check the browser matrix on caniuse; also check whether the customized in-app browsers on mobile support it. I don't have extensive compatibility test data here, but I can confirm that WebRTC is available in the latest iOS and Android versions of QQ and WeChat.
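For in-app browsers where you lack test data, runtime feature detection is safer than guessing from the user agent. A minimal sketch (webrtcSupport is our own name; it takes the global object as a parameter so it can be exercised outside a browser):

```javascript
// Illustrative feature detection for the main WebRTC entry points.
function webrtcSupport(globalObj) {
  var nav = globalObj.navigator;
  return {
    getUserMedia: !!(nav && nav.mediaDevices && nav.mediaDevices.getUserMedia),
    peerConnection: typeof globalObj.RTCPeerConnection === 'function',
    dataChannel: typeof globalObj.RTCDataChannel === 'function'
  };
}

// Browser usage (sketch):
// var support = webrtcSupport(window);
// if (!support.peerConnection) { /* fall back to a non-WebRTC solution */ }
```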

WebRTC learning strategy

The general learning strategy shown in the figure above starts from WebRTC's core APIs, implementing demos such as local audio/video capture and display. Next, it is worth trying to build a simple signaling service to achieve basic communication between browsers on an intranet. After that, dig into ICE connectivity and traversal, transmission principles, and the related protocols; finally, try to explore WebRTC's internal audio/video knowledge in depth.

The above is a relatively accessible and comprehensive WebRTC introduction for the web front end.

reference article

https://webrtc.org/architecture/
https://developer.mozilla.org/zh-CN/docs/Web/API/WebRTC_API

 

Original article: WebRTC for the Web Front End (1): Basic Introduction, published on Zhihu.


 


Origin blog.csdn.net/yinshipin007/article/details/132502829