Implementing remote control with WebRTC + Electron

Motivation for this article: the online education projects I worked on in the past relied on third-party solutions, but these used WebRTC under the hood, and I had long wanted to find time to study it. I recently came across a related implementation and coded it myself to understand the principles. The sections below cover the following features and how to implement them.

This article is only a rough outline; read it together with the source code to understand the implementation details.
The main steps:

  • Start the project and call createOffer on the control side to get the offer
  • Pass that offer into createAnswer on the puppet side and call it to get pc.localDescription (the answer). The desktop stream must be added to the connection inside this function
  • Pass the pc.localDescription obtained above into setRemote on the control side and call it, while listening for the stream being added
  • For the STUN process, see below

MediaStream API

  • Represents a stream of media content
  • A stream object can contain multiple tracks, including audio tracks and video tracks
  • Can be transmitted via WebRTC
  • Can be played with the <video> tag
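
To make the stream/track relationship concrete, here is a minimal sketch. `describeStream` is a hypothetical helper that is not part of the original source; it relies only on the standard `getTracks()` method and the `kind`, `label`, and `enabled` properties of each track.

```javascript
// Hypothetical helper: summarize the tracks carried by a MediaStream.
// Works with any object exposing the standard getTracks() method.
function describeStream(stream) {
    return stream.getTracks().map((track) => ({
        kind: track.kind,       // 'audio' or 'video'
        label: track.label,     // human-readable name of the source
        enabled: track.enabled  // false means the track renders silence or black frames
    }))
}
```

Called on the desktop stream captured in the next section, this would be expected to list only video tracks, because the capture there is requested with `audio: false`.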

How to capture desktop/window stream

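// Assumes an Electron version that exposes desktopCapturer to the renderer process
const { desktopCapturer } = require('electron')

// `peer` is an event emitter defined elsewhere in the source
async function getScreenStream(){
    const sources = await desktopCapturer.getSources({types: ['screen']})
    // Legacy callback-based API: (constraints, onSuccess, onError)
    navigator.webkitGetUserMedia({
        audio: false,
        video: {
            mandatory: {
                chromeMediaSource: 'desktop',
                chromeMediaSourceId: sources[0].id,
                maxWidth: window.screen.width,
                maxHeight: window.screen.height
            }
        }
    }, (stream) => {
        // On success, the captured stream is the first argument of this callback
        peer.emit('add-stream', stream)
    }, (err) => {
        // The error callback is required; omitting it raises an error
        console.log(err)
    })
}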

How to play a media stream object

const video = document.querySelector('video')
video.srcObject = stream
video.onloadedmetadata = function(e) {
    video.play()
}

Enable Desktop Streaming

The simplest transmission process

SDP
SDP (Session Description Protocol) describes multimedia sessions. The two parties mainly use it to negotiate the communication and to exchange basic information.

  • An SDP description consists of multiple lines, each of the form <type>=<value>
  • <type>: a single character representing a specific attribute, e.g. v for the version
  • <value>: structured text whose format depends on the attribute type, encoded in UTF-8
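
The line format above can be illustrated with a small parser. This is only a sketch: `parseSdpLines` is a hypothetical helper, and the sample SDP is hand-written for illustration rather than captured from a real session.

```javascript
// Split an SDP blob into { type, value } pairs, one per <type>=<value> line.
function parseSdpLines(sdp) {
    return sdp
        .split(/\r?\n/)
        .filter((line) => line.includes('='))
        .map((line) => {
            // Only the first '=' separates type from value
            const idx = line.indexOf('=')
            return { type: line.slice(0, idx), value: line.slice(idx + 1) }
        })
}

// Hand-written sample for illustration, not output from a real session.
const sampleSdp = [
    'v=0',
    'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
    's=-',
    't=0 0',
    'm=video 9 UDP/TLS/RTP/SAVPF 96',
    'a=rtpmap:96 VP8/90000'
].join('\r\n')

console.log(parseSdpLines(sampleSdp)[0]) // { type: 'v', value: '0' }
```

In a running app, the same function could be pointed at `pc.localDescription.sdp` to inspect what was actually negotiated.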

Hands-on coding

Step 1: Create an RTCPeerConnection, the object that encapsulates the P2P connection for us. We then call its createOffer method to create an offer, which is an SDP description. You can think of it as sending out an invitation, which we also set as our own LocalDescription. These three steps create the initial P2P connection.
Next, the offer generated by the control side must be delivered to the puppet side through some other channel; you could even send the SDP by WeChat or SMS. After receiving the offer, the puppet side also creates a PeerConnection object. Because the puppet side is the one sharing its screen, it captures the desktop stream (as described in the desktop capture section) and adds it to its PeerConnection. It then sets the control side as its remote end by passing the offer to setRemoteDescription, and calls createAnswer to confirm the invitation. The answer is itself an SDP, which the puppet side sets as its own LocalDescription. The answer SDP can likewise be delivered to the control side by any means, and the control side sets it as its remote description.
At this point both sides have set each other as remote, and the P2P connection is established. In short: the control side sends an invitation; the puppet side accepts it, adds its desktop stream to the P2P connection, and returns a confirmed description; finally the control side applies that description, and the P2P connection between the two can start.
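
The exchange described above can be sketched as three functions. The names createOffer, createAnswer and setRemote come from the step list at the top of the article, but the signatures used here (passing `pc`, `desktopStream` and an `onStream` callback as arguments) are assumptions for illustration, not the author's actual code. How the offer and answer travel between the two sides is left out.

```javascript
// Control side: create the offer and store it as the local description.
async function createOffer(pc) {
    // Ask to receive video so the offer contains a video section
    const offer = await pc.createOffer({ offerToReceiveVideo: true })
    await pc.setLocalDescription(offer)
    return pc.localDescription
}

// Puppet side: add the desktop stream, accept the offer, then answer it.
async function createAnswer(pc, offer, desktopStream) {
    desktopStream.getTracks().forEach((track) => pc.addTrack(track, desktopStream))
    await pc.setRemoteDescription(offer)
    const answer = await pc.createAnswer()
    await pc.setLocalDescription(answer)
    return pc.localDescription
}

// Control side: accept the answer; remote tracks arrive through ontrack.
async function setRemote(pc, answer, onStream) {
    pc.ontrack = (event) => onStream(event.streams[0])
    await pc.setRemoteDescription(answer)
}
```

Each side would call these with its own `new RTCPeerConnection()`. The handler is attached before `setRemoteDescription` so that no track event is missed.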
NAT (Network Address Translation)
Ordinarily, data exchanged between two clients passes through a server. For example, in Meituan Elephant, a message is first sent to the server and then forwarded to the recipient. A P2P connection, by contrast, is generally faster and more secure than relaying through a server.
So why relay through a server at all? There are many reasons. One is that end-to-end communication requires knowing the other party's public IP and port, which is not easy because today's networks are full of NAT. NAT stands for Network Address Translation. Why does it exist? Because IPv4 addresses have long been in short supply, for two main reasons:

  1. An IPv4 address is a 32-bit integer, so in theory there can be only about 4.3 billion addresses, fewer than the world's population;
  2. IP addresses are distributed unevenly across regions. The United States has many, while China has very few: only 0.06 addresses per capita. Asia, with 56% of the world's population, holds only 9% of the addresses.

NAT was introduced to solve this address shortage. Each device behind a NAT has its own LAN address, and all of them share the same public IP when connecting to the external network. The NAT is responsible for maintaining a mapping table between local IP:port pairs and public IP:port pairs.
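
The mapping table can be pictured with a toy model. This is only a sketch: `NatTable` is a hypothetical class, the addresses and ports are made up, and a real NAT also tracks protocols, timeouts and filtering rules.

```javascript
// Toy model of a NAT mapping table (illustration only).
class NatTable {
    constructor(publicIp, firstPort = 40000) {
        this.publicIp = publicIp
        this.nextPort = firstPort
        this.outbound = new Map() // 'localIp:localPort' -> public port
        this.inbound = new Map()  // public port -> 'localIp:localPort'
    }

    // Outgoing packet: reuse the existing mapping or allocate a new public port.
    translateOutbound(localIp, localPort) {
        const key = `${localIp}:${localPort}`
        if (!this.outbound.has(key)) {
            const port = this.nextPort++
            this.outbound.set(key, port)
            this.inbound.set(port, key)
        }
        return `${this.publicIp}:${this.outbound.get(key)}`
    }

    // Incoming packet: only ports with an existing mapping reach a LAN device.
    translateInbound(publicPort) {
        return this.inbound.get(publicPort) ?? null
    }
}

const nat = new NatTable('203.0.113.7')
console.log(nat.translateOutbound('192.168.1.10', 5000)) // 203.0.113.7:40000
console.log(nat.translateOutbound('192.168.1.11', 5000)) // 203.0.113.7:40001
console.log(nat.translateInbound(40001))                 // 192.168.1.11:5000
console.log(nat.translateInbound(40002))                 // null
```

The last call shows why a peer cannot simply be contacted from outside: until the device has sent something out, there is no mapping for incoming traffic to follow.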
How to get the real IP and port?
This is where NAT hole punching comes in. The basic method: ClientB first establishes a connection with a server, which creates a mapping between an internal and an external port in its NAT. The server then knows ClientB's public IP and port and passes them to ClientA, so ClientA can send through the hole in the NAT and communicate with ClientB directly. WebRTC has an integrated mechanism for this, the STUN service. When ClientA and ClientB want a P2P connection, ClientA first punches a hole with the help of the server and passes the result to ClientB, and ClientB does the same. With the server's help, ClientA and ClientB each learn the other's real public IP and port.
NAT traversal in WebRTC is a complete mechanism called ICE.
ICE (Interactive Connectivity Establishment)

  • Preferred: STUN (Session Traversal Utilities for NAT)
  • Fallback: TURN (Traversal Using Relays around NAT), traversal achieved through a relay server

The four types of NAT:

  1. Full Cone NAT
  2. Restricted Cone NAT
  3. Port Restricted Cone NAT
  4. Symmetric NAT

For video playback, a site change operation is required

The whole process of STUN

First, the control side sends an address query. The STUN service punches the hole and returns the result, so the control side learns its own public IP and port. This information must then be sent to the puppet side through some medium, just like the SDP exchange for PeerConnection; any channel works, such as email or WeChat. After the puppet side receives the IceEvent, it calls addIceCandidate and now knows the control side's public IP. Similarly, the puppet side obtains its own IP and port and sends them to the control side, which adds them as an ICE candidate as well. Only then is the P2P connection truly established.
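
A minimal sketch of this candidate exchange follows. Several things here are assumptions rather than the author's code: the public STUN server URL, the `setupIce` and `onRemoteCandidate` helper names, the message shape, and `sendToPeer`, a hypothetical function that delivers a message to the other side over whatever medium is used.

```javascript
// Assumption: a commonly used public STUN server; substitute your own.
const iceConfig = { iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] }

// Send every locally discovered candidate to the other side.
function setupIce(pc, sendToPeer) {
    pc.onicecandidate = (event) => {
        // event.candidate is null once candidate gathering has finished
        if (event.candidate) {
            sendToPeer({ type: 'candidate', candidate: event.candidate.toJSON() })
        }
    }
}

// Call this when a 'candidate' message arrives from the other side.
async function onRemoteCandidate(pc, message) {
    await pc.addIceCandidate(message.candidate)
}
```

Each side would create its connection with `new RTCPeerConnection(iceConfig)` and call `setupIce` before creating the offer or answer. Candidates that arrive before the remote description has been set should be queued, because `addIceCandidate` needs a remote description to apply them to.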
Signaling service: the server that relays messages between the WebRTC peers and so connects the two ends. Signaling can be carried in many ways; here the messages are forwarded over WebSocket.
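
The forwarding role can be sketched without committing to a particular WebSocket library. `SignalingRoom` is a hypothetical class, not the author's server code; it assumes only that each connection object has a `send(text)` method, which WebSocket connections provide.

```javascript
// Minimal signaling relay: forwards a message from one member of a room
// to every other member. Connections only need a send(text) method.
class SignalingRoom {
    constructor() {
        this.members = new Set()
    }

    join(connection) {
        this.members.add(connection)
    }

    leave(connection) {
        this.members.delete(connection)
    }

    // Relay an offer, answer, or ICE candidate to everyone except the sender.
    relay(sender, message) {
        const text = JSON.stringify(message)
        for (const member of this.members) {
            if (member !== sender) member.send(text)
        }
    }
}

// Stand-in connections for illustration; a real server would pass sockets.
const received = []
const control = { send: () => {} }
const puppet = { send: (text) => received.push(JSON.parse(text)) }

const room = new SignalingRoom()
room.join(control)
room.join(puppet)
room.relay(control, { type: 'offer', sdp: '...' })
console.log(received) // [ { type: 'offer', sdp: '...' } ]
```

The server never inspects the SDP or the candidates; it only passes them along, which is why any medium can play this role.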

Establishing data transmission with RTCDataChannel

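// Control side
const pc = new RTCPeerConnection()
// Note: `reliable` is a legacy option; the current spec uses `ordered` / `maxRetransmits` instead
const dc = pc.createDataChannel('robotchannel', {reliable: false})
// Channel opened successfully
dc.onopen = function() {
    console.log('opened')
    // Forward local 'robot' events (e.g. mouse) over the data channel
    peer.on('robot', (type, data) => {
        dc.send(JSON.stringify({type, data}))
    })
}
// Receive messages
dc.onmessage = function(event) {
    console.log('message', event)
}
dc.onerror = (e) => {console.log(e)}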

// Puppet side
const { ipcRenderer } = require('electron')

const pc = new window.RTCPeerConnection()
pc.ondatachannel = (e) => {
    console.log('data', e)
    e.channel.onmessage = (msg) => {
        // Parse the message once and reuse the result
        const {type, data} = JSON.parse(msg.data)
        console.log('onmessage', msg, {type, data})
        console.log('robot', type, data)
        if (type === 'mouse') {
            // Attach the local screen size to mouse events
            data.screen = {
                width: window.screen.width,
                height: window.screen.height
            }
        }
        // Hand the event over to the main process
        ipcRenderer.send('robot', type, data)
    }
}
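
The article does not show how the main process uses `data.screen`. One plausible use, sketched here purely as an assumption, is to scale a click position from the size of the video element on the control side to the real screen size on the puppet side. `toScreenPoint` and the `clientX`, `clientY` and `video` fields are hypothetical names, not taken from the source.

```javascript
// Assumption: the control side sends the click position together with the
// size of the <video> element it was clicked in (hypothetical field names).
function toScreenPoint(data) {
    const { clientX, clientY, video, screen } = data
    return {
        x: Math.round(clientX * screen.width / video.width),
        y: Math.round(clientY * screen.height / video.height)
    }
}

console.log(toScreenPoint({
    clientX: 320, clientY: 180,
    video: { width: 640, height: 360 },
    screen: { width: 1920, height: 1080 }
})) // { x: 960, y: 540 }
```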

See my git repository for the full source code.

The whole process can be summarized in 3 steps:

  1. Get the multimedia data
  2. Establish a P2P connection through signaling and transmit the multimedia data
  3. Transfer data

 

Original article: "Realize remote control based on webRTC+Electron" on Juejin (Nuggets)


Origin blog.csdn.net/yinshipin007/article/details/132382243