Analysis of the working principle of WebRTC

Introduction

WebRTC is an open-source project, still under active development, that provides real-time, peer-to-peer communication between web applications.

WebRTC provides a simple JavaScript API that lets developers easily build web applications with real-time audio and video transmission; support for native mobile applications is also planned. The API hides the underlying implementation, so beyond knowing how to call it, it is worth understanding how WebRTC actually works underneath.
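
To give a feel for that API before diving into the internals, here is a minimal, hedged sketch of creating a peer connection in the browser and attaching local audio and video to it; error handling and everything connection-related is omitted here and covered in the rest of the article.

```javascript
// Minimal sketch of the browser-side WebRTC API (assumes a modern browser).
const pc = new RTCPeerConnection();

// Capture the local camera and microphone and feed each track into the connection.
navigator.mediaDevices.getUserMedia({ audio: true, video: true })
  .then((stream) => {
    stream.getTracks().forEach((track) => pc.addTrack(track, stream));
  })
  .catch((err) => console.error('Could not open camera/microphone:', err));
```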

This article is aimed at readers who are new to WebRTC, especially those who do not yet know how it works. To make it as accessible as possible, we will use plain terms and analogies to explain WebRTC's underlying working principles in detail.

Getting started

In order to establish a WebRTC connection between A and B, the following two steps need to be performed:

  1. Find each other's location.
  2. Notify the peer to set up a WebRTC connection.

Step 1: Find each other

Finding the other party in WebRTC can be compared to making a phone call. When you want to talk to someone by phone, you dial their number; when someone wants to call you, they do the same. The phone number identifies the user, and the telephone network then uses that identifier to locate them.

However, web applications cannot simply dial each other. There are countless browsers in the world, several can run on the same system at the same time, and a browser has no unique identifier the way a phone has a number. The system the browser runs on, however, does have a unique identifier: its IP address, which can be used to locate it.

However, the process is not that easy, because in most cases these systems sit behind a Network Address Translation (NAT) device. NAT is needed both for security and because of the limited supply of public IPv4 addresses. A NAT device assigns private IP addresses to the systems on a local network. A private IP address is valid and visible only inside that network; it cannot be used to accept communication from the outside world, because systems outside the network cannot see it.


Because of the NAT device, a peer does not know its own public IP address; it only sees the private address the NAT has assigned to it, so it cannot share a public address with another peer in order to accept connections. To make this more concrete: if you want someone to call you, you have to give them your phone number. With NAT, however, it is like staying in a hotel whose room numbers are hidden from the outside world; incoming calls are handled at reception and redirected to your room on request. Such an indirect form of connection is not what a peer-to-peer technology is after.

To overcome this, we use a protocol called ICE (Interactive Connectivity Establishment). ICE's job is to find the best path to connect two peers. ICE can handle direct connections, i.e. without NAT, as well as indirect connections, i.e. when NAT is present. The ICE framework provides us with ICE candidates: objects containing our own public IP address, port number, and other connection-related information.
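
In the browser API, these candidate objects surface through the RTCPeerConnection's icecandidate event. The sketch below only logs them; gathering does not actually start until a local description has been set, which is covered in Step 2.

```javascript
// Sketch: watching the ICE candidates the browser gathers for this peer.
// Each candidate describes one possible path (address, port, transport, type).
const pc = new RTCPeerConnection();

pc.onicecandidate = (event) => {
  if (event.candidate) {
    // A candidate line looks roughly like:
    // "candidate:... udp ... <ip> <port> typ host" (local address)
    // or "... typ srflx ..." (public address discovered via STUN).
    console.log('New ICE candidate:', event.candidate.candidate);
  } else {
    console.log('ICE candidate gathering finished.');
  }
};
```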

In the absence of NAT, ICE is straightforward, because the peer's public IP address is always available. In the presence of NAT, however, ICE relies on entities called Session Traversal Utilities for NAT (STUN) servers and/or Traversal Using Relays around NAT (TURN) servers.

A STUN server essentially allows a peer to discover its own public IP address. A peer that needs to know its public IP address sends a request to the STUN server, and the STUN server replies with that peer's public IP address. This public address can then be shared with other peers so they can find you. However, if the peer is behind a complex NAT or a firewall, even STUN cannot discover and report its address to the requesting peer. In that case, ICE relies on TURN to establish the connection. As the name suggests, TURN is a relay server: it acts as an intermediary for transmitting data, audio, and video when a direct connection between the two peers is not possible.
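
With the browser API, STUN and TURN servers are simply listed in the iceServers configuration when the RTCPeerConnection is created. The hostnames and credentials below are placeholders for whatever servers you actually run or rent, not real endpoints.

```javascript
// Sketch: telling the browser which STUN/TURN servers ICE may use.
// The hostnames and credentials are placeholders, not real servers.
const pc = new RTCPeerConnection({
  iceServers: [
    // STUN: used only to discover this peer's public IP address and port.
    { urls: 'stun:stun.example.org:3478' },
    // TURN: relays media when no direct path between the peers can be found.
    {
      urls: 'turn:turn.example.org:3478',
      username: 'webrtc-user',
      credential: 'secret'
    }
  ]
});
```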

The STUN server is involved only in discovering the public IP; once the WebRTC connection is established, all further communication flows directly between the peers over WebRTC. With TURN, however, the TURN server remains necessary even after the WebRTC connection is set up, since all traffic is relayed through it.

Because of STUN's limitations, however, we sometimes have to fall back on TURN: STUN alone succeeds only about 86% of the time.

Step 2: Notify the peer to set up the WebRTC connection

Now that we have our ICE candidates, the next step is to send them to the peer we wish to connect to. Along with the candidates, a session description is sent: session information, a time description, a media description, and so on. ICE candidates and the session description are bundled into an object and expressed using SDP (the Session Description Protocol). In some cases, the ICE candidates are not bundled into the same object as the session description but are sent separately; this is called Trickle ICE.
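
A hedged sketch of what this looks like on the calling side with the browser API: the send callback stands in for the signaling mechanism discussed below, and the STUN URL is just an example.

```javascript
// Sketch (caller side): create an SDP offer and trickle ICE candidates.
// `send` is a placeholder callback for whatever signaling transport you use.
async function startCall(send) {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.example.org:3478' }]  // placeholder server
  });

  // Trickle ICE: each candidate is forwarded as soon as it is discovered,
  // instead of being bundled into the offer itself.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      send({ type: 'candidate', candidate: event.candidate });
    }
  };

  const offer = await pc.createOffer();  // SDP: session, timing and media description
  await pc.setLocalDescription(offer);   // stores the local description, starts ICE gathering
  send({ type: 'offer', sdp: offer.sdp });

  return pc;
}
```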

When establishing a connection, we need to "send" this information to the other peer. But how do we transfer candidates and session descriptions when we only know our own IP address and not that of the receiving peer? And since the WebRTC connection has not been established yet, over what medium is this information transmitted?

The answer to all these questions lies in a concept known as the signaling mechanism. Before we can establish a WebRTC connection, we need some medium to transmit the above information between the peers and let them know how to locate and connect for the WebRTC connection. This is where the signaling mechanism comes in. As the name suggests, signaling mechanisms exchange connection signals (ICE candidates, session descriptions, etc.) between two peers intending to connect.

WebRTC does not define any standard for implementing such a signaling mechanism; it leaves the developer free to build one of their choice. The information can be exchanged by simply copy-pasting it between the peers, or over communication channels such as WebSockets, Socket.IO, Server-Sent Events, and so on. In short, the signaling mechanism is just a way of exchanging connection-related information between peers so that they can recognize each other and start communicating over WebRTC.
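
As one concrete (and entirely optional) choice, the sketch below uses a plain WebSocket as the signaling channel. The server URL and the JSON message shape are assumptions of this example, not anything WebRTC defines, and handleSignal is a placeholder filled in by the recap sketch at the end of the article.

```javascript
// Sketch: using a WebSocket as the signaling channel.
// The server URL and JSON message shape are arbitrary choices of this example.
const signaling = new WebSocket('wss://signaling.example.org');

// Send any piece of connection information (offer, answer or ICE candidate) to the peer.
function sendSignal(message) {
  signaling.send(JSON.stringify(message));
}

// Hand every incoming message to whoever is driving the RTCPeerConnection.
signaling.onmessage = (event) => {
  const message = JSON.parse(event.data);
  handleSignal(message);  // placeholder; see the recap sketch below
};
```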

Quick recap

Let's review the whole process for a better understanding.

Suppose peer A wants to establish a WebRTC connection with peer B. It needs to do the following:

  1. Peer A uses ICE to generate its ICE candidates. In most cases this requires a Session Traversal Utilities for NAT (STUN) server or a Traversal Using Relays around NAT (TURN) server.
  2. Peer A bundles its ICE candidates and session description into one object. This object is stored as the local description (the peer's own connection information) within peer A and sent to peer B via the signaling mechanism. This part is called the offer.
  3. Peer B receives the offer and stores it as its remote description (the connection information of the peer at the other end) for further use. Peer B generates its own ICE candidates and session description, stores them as its local description, and sends them to peer A through the signaling mechanism. This part is called the answer. (Note: as mentioned earlier, the ICE candidates in steps 2 and 3 may also be sent separately.)
  4. Peer A receives the answer from peer B and stores it as its remote description. At this point both peers have each other's connection information and can successfully start communicating over WebRTC, as sketched below.
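
Putting the pieces together, here is a hedged sketch of the whole exchange from the point of view of one peer, reusing the sendSignal helper from the WebSocket sketch above. Peer A starts by calling callPeer(); peer B simply waits for the offer to arrive.

```javascript
// Sketch of the full offer/answer exchange for a single peer.
// Assumes `sendSignal` from the signaling sketch above; the STUN URL is a placeholder.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.example.org:3478' }]
});

// Both peers trickle their ICE candidates to the other side.
pc.onicecandidate = (e) => {
  if (e.candidate) sendSignal({ type: 'candidate', candidate: e.candidate });
};

// Peer A: create the offer and store it as the local description.
async function callPeer() {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendSignal({ type: 'offer', sdp: offer.sdp });
}

// Both peers: react to whatever arrives over the signaling channel.
async function handleSignal(message) {
  if (message.type === 'offer') {            // Peer B: store remote description, answer back
    await pc.setRemoteDescription(message);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    sendSignal({ type: 'answer', sdp: answer.sdp });
  } else if (message.type === 'answer') {    // Peer A: store B's answer as remote description
    await pc.setRemoteDescription(message);
  } else if (message.type === 'candidate') { // Either peer: add the other's ICE candidate
    await pc.addIceCandidate(message.candidate);
  }
}
```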
