Web real-time communication technology WebRTC

Web Real-Time Communication (WebRTC) is an open source project, still under active development, whose primary purpose is to add real-time, peer-to-peer communication capabilities to web applications.

When WebRTC was first released, it targeted web applications running on Chrome. It is now possible to run WebRTC applications in almost all popular browsers and on Android, iOS, and desktop platforms.

WebRTC provides a simple JavaScript API for developers to easily build web applications with real-time audio, video, and data transfer capabilities. Recent developments in WebRTC have also enabled its incorporation into native applications. Because there's a lot going on behind the scenes of the API, it's important to understand what WebRTC is and how it works in order to get the most out of the technology.

What are the advantages of WebRTC?

If you need to create an application or platform for real-time communication, you need to consider many factors, such as:

  • Communication quality (latency, media quality, stability, etc.)
  • Access to device hardware (camera, microphone, etc.)
  • Network usage (bandwidth usage, network throttling, etc.)
  • Video and audio encoding/decoding
  • Security
  • UX improvements (noise reduction, echo cancellation, etc.)
  • Support for multiple platforms (Windows, Mac, Linux, Android, iOS, etc.)

If you use WebRTC, most of these factors are handled for you.

WebRTC lets application developers add real-time communication capabilities through simple APIs.

How to establish a connection

In order to establish a WebRTC connection, two steps are required:

  • Find the location of the peer.
  • Notify the peer to set up a WebRTC connection.

Step 1: Find peers

Think of it like making a phone call. When you need to talk to someone on the phone, you dial their phone number and are connected to that person. The same thing happens when someone calls you. In mobile communications, the phone number serves as the user's identification, and the telecommunication system uses this identification to locate the user.

However, web applications cannot "dial and call" each other. None of the millions of browsers in the world is assigned a unique ID (like a phone number). The systems on which these applications run, though, are generally assigned an IP address that can be used to "locate" peers.

The process is not as easy as it sounds, because most of these systems sit behind a Network Address Translation (NAT) device. NAT devices exist mainly because IPv4 does not offer enough public addresses for every device; they also hide internal hosts from the outside, which adds a measure of security. A NAT device assigns private IP addresses to systems on a local network. A private IP address is valid and visible only within the local network and cannot be used to accept communications from the outside world, because systems outside the network see only the NAT's public address, not the private address of a device behind it.
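To make the idea of a private address concrete, here is a minimal sketch that checks whether an IPv4 address falls in one of the private ranges reserved by RFC 1918. The function name `isPrivateIPv4` is an illustrative choice, not part of WebRTC.

```typescript
// Returns true when an IPv4 address lies in one of the RFC 1918 private ranges,
// which are the ranges a NAT device typically hands out on a local network.
function isPrivateIPv4(address: string): boolean {
  const parts = address.split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error(`not an IPv4 address: ${address}`);
  }
  const a = Number(parts[0]);
  const b = Number(parts[1]);
  return (
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  );
}
```

An address such as 192.168.1.5 is private and meaningless outside its own network, while 203.0.113.7 (a documentation address used here only as an example) is not in any private range.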

Because of the NAT device, the peer does not know its own public IP address; it only sees the private IP address assigned by the NAT. Therefore, it cannot share a public IP address with another peer in order to accept connections. In layman's terms, if you want someone to call you, you need to give them your phone number. Being behind a NAT is like staying in a hotel: the room's phone number is hidden from the outside world, and calls to the hotel are handled at reception and redirected to the room on request. This indirect form of connection does not suit a peer-to-peer technology.

To overcome this problem, a protocol called Interactive Connectivity Establishment (ICE) is used. ICE's job is to find the best path to connect two peers. ICE can set up direct connections, that is, without NAT, and indirect connections, that is, in the presence of NAT. The ICE framework produces "ICE candidates". An ICE candidate is nothing more than an object containing an IP address at which the peer may be reachable (ideally its public one), a port number, and other connection-related information.
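As an illustration of what an ICE candidate carries, the sketch below parses the textual candidate line that browsers expose into a plain object. The field order follows the candidate attribute grammar from the ICE specifications; the `ParsedCandidate` interface, the `parseCandidate` function, and the sample addresses are assumptions made for this example.

```typescript
// Shape of the information carried by one candidate line (names are illustrative).
interface ParsedCandidate {
  foundation: string;
  component: number;   // 1 = RTP, 2 = RTCP
  protocol: string;    // "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: string;        // "host", "srflx", "prflx", or "relay"
  relatedAddress?: string;
  relatedPort?: number;
}

// Parses "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...".
function parseCandidate(line: string): ParsedCandidate {
  const body = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const f = body.split(/\s+/);
  if (f.length < 8 || f[6] !== "typ") {
    throw new Error("malformed candidate line");
  }
  const parsed: ParsedCandidate = {
    foundation: f[0],
    component: Number(f[1]),
    protocol: f[2].toLowerCase(),
    priority: Number(f[3]),
    address: f[4],
    port: Number(f[5]),
    type: f[7],
  };
  // Optional key/value extensions follow; only raddr/rport are picked up here.
  for (let i = 8; i + 1 < f.length; i += 2) {
    if (f[i] === "raddr") parsed.relatedAddress = f[i + 1];
    if (f[i] === "rport") parsed.relatedPort = Number(f[i + 1]);
  }
  return parsed;
}

const sampleCandidate = parseCandidate(
  "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 46154 generation 0"
);
```

In this sample, the type `srflx` (server reflexive) marks an address learned from the outside, and `raddr`/`rport` record the private address it maps back to.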

ICE is very simple without NAT because the peer's public IP address is readily available. In the presence of NAT, however, ICE relies on Session Traversal Utilities for NAT (STUN) and/or Traversal Using Relays around NAT (TURN).

A STUN server basically allows a peer to find out its own public IP address. A peer that needs to know its public IP address sends a request to the STUN server, and the STUN server replies with the address from which the request arrived, which is the peer's public address. This public address can now be shared with other peers so they can find you. However, if the peer is behind a restrictive NAT (such as a symmetric NAT) and/or a firewall, the address discovered through STUN may not be usable for a direct connection. In this case, ICE relies on TURN to establish the connection. TURN is a relay server that acts as an intermediary for transferring data, audio, and video when a direct connection between two peers is not possible.
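For readers curious about what the STUN exchange looks like on the wire, here is a rough sketch based on the STUN message format: a 20-byte Binding Request header, and a decoder for the IPv4 XOR-MAPPED-ADDRESS attribute value that carries the public address in the reply. The browser does all of this internally; the function names are hypothetical, and the sketch skips sending packets, parsing the full response, and IPv6.

```typescript
// Fixed value present in every STUN message; also used to obfuscate the mapped address.
const MAGIC_COOKIE = 0x2112a442;

// Builds a STUN Binding Request with no attributes (header only, 20 bytes).
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, 0x0001);       // message type: Binding Request
  view.setUint16(2, 0);            // message length: no attributes follow
  view.setUint32(4, MAGIC_COOKIE); // magic cookie, big-endian
  message.set(transactionId, 8);   // 96-bit transaction ID
  return message;
}

// Decodes the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute.
function decodeXorMappedAddress(value: Uint8Array): { address: string; port: number } {
  if (value.length !== 8 || value[1] !== 0x01) {
    throw new Error("expected an IPv4 XOR-MAPPED-ADDRESS value");
  }
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  // The port is XORed with the top 16 bits of the cookie, the address with the whole cookie.
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const raw = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const address = [raw >>> 24, (raw >>> 16) & 0xff, (raw >>> 8) & 0xff, raw & 0xff].join(".");
  return { address, port };
}
```

The decoded address and port are what end up in a server-reflexive ICE candidate that the peer can share with others.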
 

The STUN server only participates in discovering the public IP address. Once a WebRTC connection is established, all further communication flows directly between the peers. With TURN, however, the TURN server remains in the path for the lifetime of the connection, because it relays all the media and data.

Relaying through a TURN server is not the preferred path, but it must be relied upon because of STUN's limitations: STUN alone has a success rate of only about 86%.
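In practice, an application tells WebRTC which STUN and TURN servers to use through a configuration object. The sketch below builds such an object as plain data; its shape (`iceServers`, `urls`, `username`, `credential`, `iceTransportPolicy`) mirrors the configuration accepted by the browser's `RTCPeerConnection` constructor, while the helper `buildIceConfig`, the local interfaces, and the `example.org` server URLs are placeholders assumed for illustration.

```typescript
// Local copies of the relevant shapes so the sketch does not depend on browser typings.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface IceConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

interface TurnServer {
  url: string;
  username: string;
  credential: string;
}

// Builds a configuration with a STUN server and, optionally, a TURN fallback.
// relayOnly forces every connection through TURN, which is mostly useful for testing.
function buildIceConfig(stunUrl: string, turn?: TurnServer, relayOnly: boolean = false): IceConfig {
  if (relayOnly && !turn) {
    throw new Error("relay-only mode needs a TURN server");
  }
  const iceServers: IceServer[] = [{ urls: stunUrl }];
  if (turn) {
    iceServers.push({ urls: turn.url, username: turn.username, credential: turn.credential });
  }
  return { iceServers, iceTransportPolicy: relayOnly ? "relay" : "all" };
}

// Hypothetical servers; replace with real ones before use.
const exampleIceConfig = buildIceConfig("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-secret",
});
```

In a browser, an object like `exampleIceConfig` would be passed as `new RTCPeerConnection(exampleIceConfig)`; ICE then tries direct and STUN-derived candidates first and falls back to the TURN relay only when needed.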

Step 2: Notify the peer to set up the WebRTC connection
 

Now that the ICE candidates have been obtained, the next step is to send them to the peer you wish to connect to. Along with the candidates, a session description is sent, containing session information, a time description, a media description, and so on. The ICE candidates and the session description are bundled into one object whose content is formatted according to the Session Description Protocol (SDP). In some cases the ICE candidates are not bundled into the same object as the session description but are sent separately; this is called Trickle ICE (a topic in its own right).
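To show how candidates sit inside a session description, the following sketch separates the `a=candidate:` lines from the rest of an SDP text, which is roughly the difference between the bundled form and what Trickle ICE sends separately. The SDP below is a heavily shortened, made-up example (a real one has many more lines), and `splitCandidates` is a hypothetical helper.

```typescript
// Shortened, invented SDP: session info (v=, o=, s=), time description (t=),
// media description (m=, c=), and two ICE candidates.
const exampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "c=IN IP4 0.0.0.0",
  "a=candidate:1 1 udp 2122260223 192.168.1.5 46154 typ host",
  "a=candidate:2 1 udp 1686052607 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 46154",
].join("\r\n") + "\r\n";

// Splits an SDP text into the description without candidates and the candidate lines.
function splitCandidates(sdp: string): { sdpWithoutCandidates: string; candidates: string[] } {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const isCandidate = (line: string): boolean => line.startsWith("a=candidate:");
  return {
    // SDP lines are terminated by CRLF.
    sdpWithoutCandidates: lines.filter((line) => !isCandidate(line)).join("\r\n") + "\r\n",
    // Drop the "a=" prefix so each entry starts with "candidate:".
    candidates: lines.filter(isCandidate).map((line) => line.slice(2)),
  };
}

const splitExample = splitCandidates(exampleSdp);
```

With Trickle ICE, something like `sdpWithoutCandidates` can be sent immediately, and each entry of `candidates` follows as soon as it is discovered.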

Information needs to be "sent" to other peers. But how are candidate and session descriptions transferred when only the sender's IP address is known and not the receiver's? And since the WebRTC connection hasn't been established yet, what medium is this information transported over?

The answer to all these questions lies in a concept called the signaling mechanism. Before a WebRTC connection can be established, some intermediary is needed to pass the above information between the peers and let them know how to locate and connect to each other for the WebRTC connection. This is where the signaling mechanism comes into play. Signaling mechanisms exchange connection signals (ICE candidates, session description, etc.) between two peers intending to connect.

WebRTC does not define any standard for the signaling mechanism; the developer is free to build one of their choice. Signaling can be as simple as copying and pasting the information between peers, or it can use a communication channel such as WebSockets, Socket.io, Server-Sent Events, etc. In short, a signaling mechanism is just a way to exchange connection-related information between peers so that they can identify each other and start communicating over WebRTC.
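Since the signaling mechanism is left to the developer, the sketch below shows about the smallest one imaginable: an in-memory hub that delivers messages from one named peer to another. Everything here (the `SignalMessage` shape, the `InMemorySignaling` class, the peer IDs) is invented for illustration; a real application would put a WebSocket or similar channel behind the same kind of interface.

```typescript
// Messages that signaling needs to carry: session descriptions and ICE candidates.
type SignalMessage =
  | { kind: "offer" | "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

type SignalHandler = (from: string, message: SignalMessage) => void;

// A toy signaling hub: peers register under an ID and receive messages addressed to it.
class InMemorySignaling {
  private handlers = new Map<string, SignalHandler>();

  join(peerId: string, onMessage: SignalHandler): void {
    this.handlers.set(peerId, onMessage);
  }

  // Returns false when the addressee has not joined.
  send(from: string, to: string, message: SignalMessage): boolean {
    const handler = this.handlers.get(to);
    if (!handler) {
      return false;
    }
    // The JSON round trip mimics serialization over a real transport.
    handler(from, JSON.parse(JSON.stringify(message)) as SignalMessage);
    return true;
  }
}
```

A peer would call `join` once with its own ID and then `send` its offer, answer, and candidates to the other side; the hub itself never needs to understand the contents.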

Suppose peer A wants to establish a WebRTC connection with peer B. The following steps are needed:

1. Peer A generates its ICE candidates using Interactive Connectivity Establishment (ICE). In most cases, this requires a Session Traversal Utilities for NAT (STUN) or Traversal Using Relays around NAT (TURN) server.

2. Peer A bundles its ICE candidates and session description into one object. This object is stored in peer A as the local description (the peer's own connection information) and transmitted to peer B via the signaling mechanism. This part is called the offer.

3. Peer B receives the offer and stores it as the remote description (the connection information of the peer at the other end) for further use. Peer B generates its own ICE candidates and session description, stores them as its local description, and sends them to peer A through the signaling mechanism. This part is called the answer. (Note: As mentioned earlier, the ICE candidates in steps 2 and 3 can also be sent separately.)

4. Peer A receives the answer from peer B and stores it as its remote description.

In this way, both parties can know the connection information of the other party, and can successfully start communication through WebRTC!
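The offer/answer exchange between peer A and peer B can be modeled with a small simulation. The method names (`createOffer`, `createAnswer`, `setLocalDescription`, `setRemoteDescription`) and the state names are borrowed from the browser's `RTCPeerConnection`, but `SimPeer` is a simplified, synchronous stand-in written for this article: the real API is promise-based, produces real SDP, and has more states.

```typescript
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";

interface Description {
  type: "offer" | "answer";
  sdp: string;
}

// Simplified model of one peer's offer/answer bookkeeping (not the browser API).
class SimPeer {
  name: string;
  signalingState: SignalingState = "stable";
  localDescription: Description | null = null;
  remoteDescription: Description | null = null;

  constructor(name: string) {
    this.name = name;
  }

  createOffer(): Description {
    return { type: "offer", sdp: `sdp-from-${this.name}` };
  }

  createAnswer(): Description {
    if (this.signalingState !== "have-remote-offer") {
      throw new Error("no remote offer to answer");
    }
    return { type: "answer", sdp: `sdp-from-${this.name}` };
  }

  setLocalDescription(d: Description): void {
    if (d.type === "offer" && this.signalingState === "stable") {
      this.signalingState = "have-local-offer";
    } else if (d.type === "answer" && this.signalingState === "have-remote-offer") {
      this.signalingState = "stable";
    } else {
      throw new Error(`cannot set local ${d.type} in state ${this.signalingState}`);
    }
    this.localDescription = d;
  }

  setRemoteDescription(d: Description): void {
    if (d.type === "offer" && this.signalingState === "stable") {
      this.signalingState = "have-remote-offer";
    } else if (d.type === "answer" && this.signalingState === "have-local-offer") {
      this.signalingState = "stable";
    } else {
      throw new Error(`cannot set remote ${d.type} in state ${this.signalingState}`);
    }
    this.remoteDescription = d;
  }
}

// Runs the exchange; the direct calls on the other peer stand in for signaling.
function negotiate(a: SimPeer, b: SimPeer): void {
  const offer = a.createOffer();
  a.setLocalDescription(offer);   // A stores its offer as its local description
  b.setRemoteDescription(offer);  // B receives the offer and stores it as remote
  const answer = b.createAnswer();
  b.setLocalDescription(answer);  // B stores its answer as its local description
  a.setRemoteDescription(answer); // A receives the answer and stores it as remote
}
```

After `negotiate` runs, both simulated peers are back in the `stable` state and each holds the other's description, which is the point at which a real connection can start carrying media and data.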

Author: Tianxing Wuji
Original  Web Real-time Communication Technology WebRTC

 


Origin blog.csdn.net/yinshipin007/article/details/132253635