Technical Analysis of P2P Video Chat

The whole P2P video process requires each side to know the other's media types, streams, and candidates, so the following technologies are used:

  • Signaling server (Socket.IO)
  • State machine
  • ICE server
  • WebRTC framework
  • Media negotiation

Signaling server Socket.io

Put simply, the signaling server is a relay station for messages: A sends a message to the signaling server, and the signaling server forwards it to B.

Socket.IO is a library that enables low-latency, bidirectional, event-based communication between clients and servers.

It is built on top of the WebSocket protocol and provides additional guarantees such as fallback to HTTP long polling or automatic reconnection.

WebSocket is a communication protocol that provides a full-duplex, low-latency channel between a server and a browser.

There are several Socket.IO server implementations available:

  • JavaScript (documented on the Socket.IO site)
  • Java: https://github.com/mrniko/netty-socketio
  • Java: https://github.com/trinopoty/socket.io-server-java
  • Python: https://github.com/miguelgrinberg/python-socketio

Client implementations for most major languages:

  • JavaScript (can run in browser, Node.js or React Native)
  • Java: https://github.com/socketio/socket.io-client-java
  • C++: https://github.com/socketio/socket.io-client-cpp
  • Swift: https://github.com/socketio/socket.io-client-swift
  • Dart: https://github.com/rikulo/socket.io-client-dart
  • Python: https://github.com/miguelgrinberg/python-socketio
  • .Net: https://github.com/doghappy/socket.io-client-csharp
  • Golang: https://github.com/googollee/go-socket.io
  • Rust: https://github.com/1c3t3a/rust-socketio
  • Kotlin: https://github.com/icerockdev/moko-socket-io

Here's a basic example using plain WebSocket:

Server (based on ws)

import { WebSocketServer } from "ws";

const server = new WebSocketServer({ port: 3000 });

server.on("connection", (socket) => {
  // send a message to the client
  socket.send(JSON.stringify({
    type: "hello from server",
    content: [ 1, "2" ]
  }));

  // receive a message from the client
  socket.on("message", (data) => {
    const packet = JSON.parse(data);

    switch (packet.type) {
      case "hello from client":
        // ...
        break;
    }
  });
});

Client

const socket = new WebSocket("ws://localhost:3000");

socket.addEventListener("open", () => {
  // send a message to the server
  socket.send(JSON.stringify({
    type: "hello from client",
    content: [ 3, "4" ]
  }));
});

// receive a message from the server
socket.addEventListener("message", ({ data }) => {
  const packet = JSON.parse(data);

  switch (packet.type) {
    case "hello from server":
      // ...
      break;
  }
});

Here's the same example with Socket.IO:

Server

import { Server } from "socket.io";

const io = new Server(3000);

io.on("connection", (socket) => {
  // send a message to the client
  socket.emit("hello from server", 1, "2", { 3: Buffer.from([4]) });

  // receive a message from the client
  socket.on("hello from client", (...args) => {
    // ...
  });
});

Client

import { io } from "socket.io-client";

const socket = io("ws://localhost:3000");

// send a message to the server
socket.emit("hello from client", 5, "6", { 7: Uint8Array.from([8]) });

// receive a message from the server
socket.on("hello from server", (...args) => {
  // ...
});

These two examples look very similar, but Socket.IO provides additional features that hide the complexity of running a WebSocket-based application in production. These features are listed below.

But first, let's be clear about what Socket.IO is not.

What Socket.IO is not:

Socket.IO is not a WebSocket implementation.

Although Socket.IO does use WebSocket for transport where possible, it adds extra metadata to each packet. That's why WebSocket clients won't be able to successfully connect to Socket.IO servers, and Socket.IO clients won't be able to connect to plain WebSocket servers.

// WARNING: the client will NOT be able to connect!
const socket = io("ws://echo.websocket.org");

If you're looking for a plain WebSocket server, check out ws or µWebSockets.js.

There is also discussion about including a WebSocket server in the Node.js core. On the client side, you might be interested in robust-websocket.

Socket.IO is not intended to be used as a background service in mobile applications.

The Socket.IO library keeps an open TCP connection to the server, which can cause significant battery drain for users. Please use a dedicated messaging platform such as FCM for this use case.

Features

Here's what Socket.IO provides over plain WebSockets:

HTTP long-polling fallback

If a WebSocket connection cannot be established, the connection falls back to HTTP long polling.

This feature was the main reason people chose Socket.IO when the project was created more than ten years ago (!), because browser support for WebSockets was still in its infancy.

Even though most browsers now support WebSockets (over 97%), it is still a valuable feature, because we still receive reports from users who cannot establish a WebSocket connection because they are behind a misconfigured proxy.

Automatic reconnection

In some specific cases, the WebSocket connection between the server and the client may be broken without either party being aware of the broken state of the link.

That's why Socket.IO includes a heartbeat mechanism that periodically checks the status of the connection.

When the client finally disconnects, it automatically reconnects with an exponential backoff delay so as not to overwhelm the server.
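The exponential backoff idea can be sketched in a few lines. This is an illustrative calculator, not Socket.IO's internal implementation: the function name backoffDelay, its option names, and the default values (1000 ms base, 5000 ms cap, 0.5 jitter) are all assumptions chosen for the example.

```javascript
// Illustrative backoff calculator (hypothetical helper, not part of Socket.IO).
// attempt: number of reconnection attempts already made (0 for the first retry).
function backoffDelay(attempt, { base = 1000, max = 5000, jitter = 0.5, random = Math.random } = {}) {
  // Double the delay on every attempt, but never exceed the cap.
  const exponential = Math.min(base * 2 ** attempt, max);
  // Spread the delay by up to +/- jitter so clients do not all retry in lockstep.
  const deviation = exponential * jitter * (random() * 2 - 1);
  return Math.min(Math.round(exponential + deviation), max);
}

// With jitter disabled the delays are 1000, 2000, 4000, 5000 ms.
for (let attempt = 0; attempt < 4; attempt++) {
  console.log(attempt, backoffDelay(attempt, { jitter: 0 }));
}
```

Capping the delay keeps a long outage from pushing the retry interval arbitrarily high, while the random spread avoids a burst of simultaneous reconnections when the server comes back.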

Packet buffering

Packets are automatically buffered when a client disconnects and sent when it reconnects.
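A minimal sketch of the buffering idea, assuming a hypothetical BufferedSender class that wraps whatever function actually transmits a packet. It only illustrates the queue-then-flush behavior described above and is not how Socket.IO implements it internally.

```javascript
// Hypothetical helper: queues packets while offline and flushes them on reconnect.
class BufferedSender {
  constructor(transport) {
    this.transport = transport; // function that actually sends one packet
    this.connected = false;
    this.queue = [];
  }

  send(packet) {
    if (this.connected) {
      this.transport(packet);
    } else {
      this.queue.push(packet); // keep the packet until the link is back
    }
  }

  onConnect() {
    this.connected = true;
    // Flush in the order the packets were queued.
    for (const packet of this.queue.splice(0)) {
      this.transport(packet);
    }
  }

  onDisconnect() {
    this.connected = false;
  }
}

const delivered = [];
const sender = new BufferedSender((packet) => delivered.push(packet));
sender.send("first");  // buffered: not connected yet
sender.onConnect();    // flushes "first"
sender.send("second"); // sent immediately
console.log(delivered);
```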


Acknowledgements

Socket.IO provides a convenient way to send an event and receive a response:

Sender

socket.emit("hello", "world", (response) => {
  console.log(response); // "got it"
});

Receiver

socket.on("hello", (arg, callback) => {
  console.log(arg); // "world"
  callback("got it");
});

You can also add a timeout:

socket.timeout(5000).emit("hello", "world", (err, response) => {
  if (err) {
    // the other side did not acknowledge the event in the given delay
  } else {
    console.log(response); // "got it"
  }
});

Broadcasting

On the server side, you can send an event to all connected clients or to a subset of clients:

// to all connected clients
io.emit("hello");

// to all connected clients in the "news" room
io.to("news").emit("hello");

This also works when scaling to multiple nodes.

Multiplexing

Namespaces allow you to split your application's logic over a single shared connection. This can be useful, for example, if you want to create an "admin" channel that only authorized users can join.

io.on("connection", (socket) => {
  // regular users
});

io.of("/admin").on("connection", (socket) => {
  // admin users
});

FAQ

Is Socket.IO still needed today?

This is a fair question, since WebSocket is supported almost everywhere these days.

That said, we believe that if you use plain WebSocket in your application, you will eventually need to implement most of the features that are already included (and battle-tested) in Socket.IO, such as reconnection, acknowledgements, or broadcasting.

What is the overhead of the Socket.IO protocol?

socket.emit("hello", "world") will be sent as a single WebSocket frame containing 42["hello","world"], where:

  • 4 is the Engine.IO "message" packet type
  • 2 is the Socket.IO "message" packet type
  • ["hello","world"] is the JSON.stringify()-ed version of the arguments array

So each message adds only a few bytes, which can be reduced further by using a custom parser.

The browser bundle itself is 10.4 kB (minified and compressed).
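The framing described above can be reproduced with a small sketch. The helper names encodeEvent and decodeEvent are hypothetical, and the sketch only covers the simple case from the text: a plain event on the default namespace, with no acknowledgement id and no binary payload.

```javascript
// Hypothetical helpers that mimic the "42[...]" text frame for a simple event.
const ENGINE_IO_MESSAGE = "4";     // Engine.IO "message" packet type
const SOCKET_IO_PACKET_TYPE = "2"; // Socket.IO packet type used for emitted events

function encodeEvent(eventName, ...args) {
  return ENGINE_IO_MESSAGE + SOCKET_IO_PACKET_TYPE + JSON.stringify([eventName, ...args]);
}

function decodeEvent(frame) {
  if (!frame.startsWith(ENGINE_IO_MESSAGE + SOCKET_IO_PACKET_TYPE)) {
    throw new Error("not a simple event frame: " + frame);
  }
  const [eventName, ...args] = JSON.parse(frame.slice(2));
  return { eventName, args };
}

const frame = encodeEvent("hello", "world");
console.log(frame);              // 42["hello","world"]
console.log(decodeEvent(frame)); // { eventName: 'hello', args: [ 'world' ] }
```

Compared with sending JSON.stringify(["hello","world"]) over a raw WebSocket, the only extra bytes in this case are the two leading type digits.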

Getting started

Note: usage differs between Socket.IO versions; the latest usage is shown here.

Step 1: Use the Express framework to set up the server routes

const express = require("express"); // import the express module
const fs = require("fs");
const http = require("http");
const https = require("https");

const app = express();
app.get("/", (req, res) => {
  res.sendFile(__dirname + "/login.html");
});

// HTTP server. Plain http is shown here, but our goal is P2P video, so HTTPS
// must be enabled, which requires an SSL certificate. Major cloud providers
// issue them for free (Huawei Cloud and Alibaba Cloud are recommended).
const httpServer = http.createServer(app)
  .listen(8888, "0.0.0.0");

// HTTPS server
const opt = {
  key: fs.readFileSync("./cert/ssl_server.key"),
  cert: fs.readFileSync("./cert/ssl_server.pem")
}; // load the SSL certificate
const httpsServer = https.createServer(opt, app)
  .listen(8008, "0.0.0.0");
// The server is now set up; start it with: node xx.js

Step 2: Use the Socket.IO server

On the server, install it directly with npm install socket.io.

const { Server } = require("socket.io");

// Old versions:
// var io = socketIo.listen(httpsServer);
// New versions:
const io = new Server(httpsServer);

io.sockets.on("connection", (socket) => {
  // handle events here
});

Step 3: Use the Socket.IO client

The client can load the library from a CDN or from a downloaded copy of the script.

<script src="/socket.io/socket.io.js"></script>
<script>
  var url = "120.78.xxx.xx:8008";
  var socket = io(url);
</script>

The client is now connected to the server; the next step is the application-specific operations.

Step 4: Related operations (omitted)

P2P join room process

Suppose three clients, A, B, and C, are connected to the signaling server at the same time. First, A sends a join signal; the signaling server adds A to the room and replies to A with joined. Then B sends join; the server adds B to the room, replies to B with joined, and notifies A with otherjoin. Finally, C sends join, but the signaling server limits the number of people in a room: once the room already holds 2 people, it replies full to the newly requesting client, meaning the room is full and the client was not added. A and B are now in the same room and can communicate through it.
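The join flow can be sketched as a pure function that returns the signals the server would send. The names join and MAX_USERS and the shape of the returned objects are assumptions made for this illustration; the signal names joined, otherjoin, and full come from the description above.

```javascript
// Hypothetical model of the join logic: no sockets, only the decision of who gets which signal.
const MAX_USERS = 2; // a 1v1 room holds at most two clients

function join(rooms, room, clientId) {
  const members = rooms.get(room) ?? new Set();
  if (members.size >= MAX_USERS) {
    // The room is full: reject the newcomer and leave the room unchanged.
    return [{ to: clientId, signal: "full" }];
  }
  const others = [...members];
  members.add(clientId);
  rooms.set(room, members);
  return [
    { to: clientId, signal: "joined" },
    // Tell everyone who was already in the room that someone else arrived.
    ...others.map((id) => ({ to: id, signal: "otherjoin" })),
  ];
}

const rooms = new Map();
console.log(join(rooms, "demo", "A")); // A gets joined
console.log(join(rooms, "demo", "B")); // B gets joined, A gets otherjoin
console.log(join(rooms, "demo", "C")); // C gets full
```

Checking the size before adding the client avoids having to remove it again when the room turns out to be full.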

state machine

A state machine is used for state transitions and decisions.

Why have a state machine?

First, consider the states a client can be in with respect to a chat room:

  • Before joining or after leaving the room (init/leaved)
  • After joining the room (joined)
  • After the second participant joins (joined_conn)
  • After the second participant leaves (joined_unbind)

These states exist because the client must react differently as users enter and leave the room. A room always has at least one person, and that person is the initiator, so how does a client know whether it is the initiator? Its state tells it. Initially the user has not joined a room and is in the init/leaved state; after joining, the state becomes joined. When a second person joins, the first user's state becomes joined_conn. The second person never enters joined_conn, so this state identifies the first user, that is, the initiator (the initiator's role matters for media negotiation). Finally, when the other user leaves, the state becomes joined_unbind.
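The transitions described above can be written down as a lookup table. This is a sketch: the TRANSITIONS table and the helpers nextState and isInitiator are hypothetical, while the state names and event names (vjoined, vfull, votherjoined, votherleft, vleft) are the ones used in this article's code. Unlike the article's client code, which assigns the state directly in each handler, the table ignores events that are not listed for the current state.

```javascript
// Hypothetical transition table: TRANSITIONS[currentState][event] -> nextState.
const TRANSITIONS = {
  init:          { vjoined: "joined", vfull: "leaved" },
  leaved:        { vjoined: "joined", vfull: "leaved" },
  joined:        { votherjoined: "joined_conn", votherleft: "joined_unbind", vleft: "leaved" },
  joined_conn:   { votherleft: "joined_unbind", vleft: "leaved" },
  joined_unbind: { votherjoined: "joined_conn", vleft: "leaved" },
};

function nextState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  return next ?? state; // events not listed for this state leave it unchanged
}

// Only the client that saw the other side arrive becomes the initiator.
function isInitiator(state) {
  return state === "joined_conn";
}

let state = "init";
for (const event of ["vjoined", "votherjoined", "votherleft"]) {
  state = nextState(state, event);
  console.log(event, "->", state, "initiator:", isInitiator(state));
}
```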

ICE framework

First, let's look at how two clients can communicate peer to peer. There are three ways:

The first: the peers already know each other's host IP and connect directly.

The second: use a STUN server. A and B each query the STUN server to learn the public IP address and port that their NAT maps them to. Since both clients are already connected to the signaling server, they exchange these addresses through it, which makes NAT traversal possible.

The third: use a relay server (TURN server) that forwards the media between the two peers.

These three methods correspond to three kinds of candidates. Why three? The signaling server should stay out of the media path as much as possible and only carry simple signaling messages, so the media itself has to travel by one of these routes. Across the public Internet, audio and video will most likely flow either through addresses obtained via the STUN server or through the relay server.
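ICE ranks the candidate kinds so that the cheapest working route wins. As a sketch, the priority formula recommended by the ICE specification (RFC 8445) can be computed as follows; the function name candidatePriority is hypothetical, and the type preference values are the ones the RFC recommends (real implementations may choose different values).

```javascript
// Recommended type preferences from RFC 8445: direct routes rank above relayed ones.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  const typePreference = TYPE_PREFERENCE[type];
  if (typePreference === undefined) {
    throw new Error("unknown candidate type: " + type);
  }
  return 2 ** 24 * typePreference + 2 ** 8 * localPreference + (256 - componentId);
}

for (const type of ["host", "srflx", "relay"]) {
  console.log(type, candidatePriority(type));
}
```

With these values a host candidate outranks a server-reflexive (STUN) candidate, which in turn outranks a relay (TURN) candidate, so the relay is only used when the more direct routes fail.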

Host, STUN, and relay (TURN) handling are all integrated into a single ICE server project (coturn), so this is the only server we need to build.

Steps to build a STUN/TURN server:

First, install the dependencies:

Ubuntu:
apt-get install build-essential
apt-get install openssl libssl-dev
CentOS:
yum install libevent-devel openssl-devel

Download the 4.5.1.1 source code

wget https://github.com/coturn/coturn/archive/4.5.1.1.tar.gz
(If you cannot reach GitHub, look up how to work around it by editing your hosts file.)

Extract the archive

tar -zxvf 4.5.1.1.tar.gz

Go to the project directory

cd coturn-4.5.1.1

Build and install from source in three steps

./configure
make
make install

Copy the configuration file

cp examples/etc/turnserver.conf bin/turnserver.conf

Modify the configuration file

# Listening port; 3478 is the default and can be changed to any port you like
listening-port=3478
# IP address to listen on; on a cloud server this is the private (internal) IP
listening-ip=xxx.xxx.xxx.xxx
# External IP address; on a cloud server this is the public IP
external-ip=xxx.xxx.xxx.xxx
# User name and password allowed to access the server
user=demon:123

Start the service

turnserver -v -r <public IP>:<listening port> -a -o
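Once the server is running, the browser side needs a matching iceServers entry. The sketch below builds one from the values configured above; buildIceServers is a hypothetical helper, turn.example.com is a placeholder host, and the sketch assumes the same port serves both STUN and TURN requests.

```javascript
// Hypothetical helper that turns TURN server settings into an RTCPeerConnection config entry.
function buildIceServers({ host, port = 3478, username, credential }) {
  return [
    // STUN needs no credentials; it only reports the public address mapping.
    { urls: `stun:${host}:${port}` },
    // TURN relays media, so it is protected by the user configured on the server.
    { urls: `turn:${host}:${port}`, username, credential },
  ];
}

const pcConfig = {
  iceServers: buildIceServers({
    host: "turn.example.com", // placeholder; use your server's public IP or domain
    username: "demon",
    credential: "123",
  }),
};
console.log(JSON.stringify(pcConfig, null, 2));
// In the browser: const pc = new RTCPeerConnection(pcConfig);
```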

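Verify:

https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/

The page lists the candidates it gathers by type:

  • host: a local address
  • srflx: the address after NAT mapping (obtained via STUN)
  • relay: an address on the relay (TURN) server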
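The candidate type is visible in the candidate string itself, after the typ keyword. The sketch below parses the leading fields of such a line; parseCandidate is a hypothetical helper, and the sample line uses documentation-only addresses rather than real output.

```javascript
// Hypothetical parser for the leading fields of an ICE candidate line.
function parseCandidate(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length < 8 || !parts[0].startsWith("candidate:") || parts[6] !== "typ") {
    throw new Error("not an ICE candidate line: " + line);
  }
  return {
    foundation: parts[0].slice("candidate:".length),
    component: Number(parts[1]),
    protocol: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7], // host, srflx, prflx, or relay
  };
}

// Sample line with made-up values.
const sample = "candidate:842163049 1 udp 1677729535 203.0.113.5 46154 typ srflx raddr 10.0.0.2 rport 46154";
console.log(parseCandidate(sample));
```

If no relay line appears on the test page, the TURN part of the server (address, port, or credentials) is the first thing to check.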

WebRTC framework

How RTCPeerConnection works

pc = new RTCPeerConnection([config])

Capabilities of pc:

  • Media negotiation
  • Adding and stopping streams and tracks
  • Transport-related functions
  • Statistics-related functions

Media negotiation

Each client supports its own set of media formats, so the two sides must negotiate a common set. This media negotiation is one of the functions of pc.

The process is as follows:

A is the initiator (known from the state machine). A first creates an offer, which describes A's own media information, and passes it to setLocalDescription; candidate information is gathered through ICE. A then sends the offer to the signaling server, which forwards it to the other client in the room, B. On receiving the offer, B calls setRemoteDescription with it. B then creates an answer describing its own media information, calls setLocalDescription with it, and sends the answer to the signaling server, which forwards it to A. On receiving the answer, A calls setRemoteDescription with it. Both sides now know the media information and candidates supported by the other.

The code is as follows:

function mediaNegociate() {
    if (status === "joined_conn") { // joined_conn means I am the initiator of the connection
        if (pc) {
            var options = { // what to negotiate, e.g. audio, video...
                offerToReceiveVideo: true,
                offerToReceiveAudio: true
            };

            pc.createOffer(options)
                .then(getOffer)
                .catch(handleErr);
        }
    }
}

socked.on("vgetdata", (room, data) => {
    console.log("vgetdata:", data);
    if (!data) {
        return;
    }
    if (data.type === "candidata") { // candidate information sent by the other side
        // ...
    } else if (data.type === "offer") { // an offer carries type "offer" by default
        console.log("get offer");
        pc.setRemoteDescription(new RTCSessionDescription(data)); // apply the other side's media format

        // gather my own media format information and answer the other side
        pc.createAnswer()
            .then(getAnswer)
            .catch(handleErr);
    } else if (data.type === "answer") { // an answer carries type "answer" by default
        console.log("get answer");
        // I sent the offer and the other side replied with this answer. Both carry
        // media format information, and one client never sends both: the first to
        // join sends the offer, the second sends the answer.
        pc.setRemoteDescription(new RTCSessionDescription(data)); // apply the other side's media format
    }
});
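During this exchange each RTCPeerConnection moves through signaling states. The sketch below models only the subset used by the offer/answer flow above (stable, have-local-offer, have-remote-offer), leaving out provisional answers and rollback; the table and the function applyDescription are hypothetical helpers for reasoning about call order, not browser APIs.

```javascript
// Simplified model of signaling state changes during one offer/answer round.
const SIGNALING = {
  "stable":            { "setLocal:offer": "have-local-offer", "setRemote:offer": "have-remote-offer" },
  "have-local-offer":  { "setRemote:answer": "stable" },
  "have-remote-offer": { "setLocal:answer": "stable" },
};

function applyDescription(state, op, type) {
  const next = SIGNALING[state]?.[`${op}:${type}`];
  if (!next) {
    throw new Error(`cannot ${op} ${type} in state ${state}`);
  }
  return next;
}

// Initiator A: setLocalDescription(offer), later setRemoteDescription(answer).
let a = "stable";
a = applyDescription(a, "setLocal", "offer");
a = applyDescription(a, "setRemote", "answer");

// Responder B: setRemoteDescription(offer), then setLocalDescription(answer).
let b = "stable";
b = applyDescription(b, "setRemote", "offer");
b = applyDescription(b, "setLocal", "answer");

console.log(a, b); // both peers end in "stable"
```

The model makes the ordering constraint explicit: an answer can only be applied by a peer that currently holds an outstanding offer.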

P2P code

html code:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Video Chat</title>
    <link rel="icon" href="./2.jpg" type="image/x-icon">
</head>
<body>

    <div align="center">
        <table>
            <tr><td colspan="2">
                <h1 id="welcom">Welcome to the 1v1 chat room</h1>
                <input id = "room">
                <button id = "enterRoom">Enter room</button>
                <button id="leaveRoom" >Leave room</button>
            </td></tr>
            <tr>
                <td><label>Local video</label></td>
                <td><label>Remote video</label></td>
            </tr>
            <tr>
                <td><video id="localVideo" autoplay playsinline></video></td>
                <td><video id="remoteVideo" autoplay playsinline></video></td>
            </tr>
        </table>
    </div>

    <script src="js/socket.io.js"></script>
    <script src="js/videoRoom.js"></script>

</body>
</html>

Signaling server code:

"use strict";

var http = require("http");
var https = require("https");
var fs = require("fs");

// third-party modules installed via npm
var express = require("express");
var serveIndex = require("serve-index"); // directory listing
var sqlite3 = require("sqlite3");
var log4js = require("log4js");
var socketIo = require("socket.io");

var logger = log4js.getLogger();
logger.level = "info";

var app = express();
app.use(serveIndex("./zhangyangsong"));
app.use(express.static("./zhangyangsong"));

var opt = {
    key: fs.readFileSync("./cert/ssl_server.key"),
    cert: fs.readFileSync("./cert/ssl_server.pem")
};

// var httpServer = http.createServer(app)
//     .listen(8888, "0.0.0.0");
var httpsServer = https.createServer(opt, app)
    .listen(8008, "0.0.0.0");

var db = null;
var sql = "";
// var io = socketIo.listen(httpServer);
var io = socketIo.listen(httpsServer); // old-version API; newer versions use new Server(httpsServer)
io.sockets.on("connection", (socket) => {
    logger.info("connection:", socket.id);

    // handle messages for the 1v1 chat room
    socket.on("vjoin", (room, uname) => {
        logger.info("vjoin", room, uname);
        socket.join(room);

        var myRoom = io.sockets.adapter.rooms[room];
        var users = Object.keys(myRoom.sockets).length;
        logger.info(room + "user=" + users);
        if (users > 2) {
            // the room already had two people: undo the join and report "full"
            socket.leave(room);
            socket.emit("vfull", room);
        } else {
            socket.emit("vjoined", room);
            if (users > 1) {
                // tell the client already in the room that someone joined
                socket.to(room).emit("votherjoined", room, uname);
            }
        }
    });

    // relay signaling data (offer, answer, candidates) to the other client in the room
    socket.on("vdata", (room, data) => {
        logger.info("vdata", data);
        socket.to(room).emit("vgetdata", room, data);
    });

    socket.on("vleave", (room, uname) => {
        if (room === "") {
            logger.info("room is empty string");
            return;
        } else if (room === undefined) {
            logger.info("room is undefined");
            return;
        } else if (room === null) {
            logger.info("room is null");
            return;
        }

        var myRoom = io.sockets.adapter.rooms[room];
        // guard against a room that does not exist (for example, leaving before joining)
        var users = myRoom ? Object.keys(myRoom.sockets).length : 0;

        logger.info("vleave users=" + (users - 1));
        socket.leave(room);
        socket.emit("vleft", room);
        socket.to(room).emit("votherleft", room, uname);
    });
});

function handleErr(e) {
    logger.info(e);
}

Client-side JavaScript code:

"use strict";
// The whole P2P process needs to know both sides' media types, streams, and candidates
var hWelcom = document.querySelector("h1#welcom");

var url = location.href;
var uname = url.split("?")[1].split("=")[1]; // user name taken from the first query parameter

hWelcom.textContent = "Welcome to the 1v1 video chat room: " + uname;

var iptRoom = document.querySelector("input#room");
var btnEnterRoom = document.querySelector("button#enterRoom");
var btnLeaveRoom = document.querySelector("button#leaveRoom");

var videoLocal = document.querySelector("video#localVideo");
var videoRemote = document.querySelector("video#remoteVideo");

var localStream = null;
var remoteStream = null;

var socked = null;
var room = null;
var status = "init";
var pc = null;
var url = "120.78.130.50:8008"; // signaling server address (reuses the url variable)
function getMedia(stream) {
    localStream = stream;
    videoLocal.srcObject = stream;
}

function start() {
    var constraints = {
        video: true,
        audio: true
    };

    // open the camera and microphone
    navigator.mediaDevices.getUserMedia(constraints)
        .then(getMedia)
        .catch(handleErr);
    conn();
}

function conn() {
    socked = io.connect(url);
    // listen for signals from the server
    socked.on("vfull", (room) => {
        status = "leaved";
        alert("Room is full: " + room);
        console.log("vfull", status);
    });

    socked.on("vjoined", (room) => {
        // create the peer connection object
        alert("Joined room: " + room);
        createPeerConnection();

        status = "joined";
        console.log("vjoined:", status);
    });

    socked.on("votherjoined", (room, uname) => {
        // establish the video connection
        alert("Someone joined: " + uname);

        if (status === "joined_unbind") {
            createPeerConnection();
        }
        status = "joined_conn";

        // When the second person joins, start media negotiation, in which
        // both sides learn and apply each other's media formats
        mediaNegociate();

        console.log("votherjoined:", status);
    });

    socked.on("vgetdata", (room, data) => {
        console.log("vgetdata:", data);
        if (!data) {
            return;
        }
        if (data.type === "candidata") { // candidate information sent by the other side
            console.log("get other candidata");

            // candidate information
            var cddt = new RTCIceCandidate({
                sdpMLineIndex: data.label,
                candidate: data.candidate
            });
            pc.addIceCandidate(cddt); // add the candidate object to pc

        } else if (data.type === "offer") { // an offer carries type "offer" by default
            console.log("get offer");
            pc.setRemoteDescription(new RTCSessionDescription(data)); // apply the other side's media format

            // gather my own media format information and answer the other side
            pc.createAnswer()
                .then(getAnswer)
                .catch(handleErr);
        } else if (data.type === "answer") { // an answer carries type "answer" by default
            console.log("get answer");
            // I sent the offer and the other side replied with this answer. Both carry
            // media format information, and one client never sends both: the first to
            // join sends the offer, the second sends the answer.
            pc.setRemoteDescription(new RTCSessionDescription(data)); // apply the other side's media format
        }
    });

    socked.on("vleft", (room) => {
        status = "leaved";
        console.log("vleft:", status);
    });

    socked.on("votherleft", (room, uname) => {
        status = "joined_unbind";
        closePeerConnection();
        console.log("votherleft:", status);
    });
}

function getAnswer(desc) {
    pc.setLocalDescription(desc); // set my local media format information
    sendMessage(desc);
}

function mediaNegociate() {
    if (status === "joined_conn") { // joined_conn means I am the initiator of the connection
        if (pc) {
            var options = { // what to negotiate, e.g. audio, video...
                offerToReceiveVideo: true,
                offerToReceiveAudio: true
            };

            pc.createOffer(options)
                .then(getOffer)
                .catch(handleErr);
        }
    }
}

function getOffer(desc) { // desc holds the media format produced by createOffer
    pc.setLocalDescription(desc);
    sendMessage(desc); // send the media format I want to the other side
}

function createPeerConnection() {
    if (!pc) {
        var pcConfig = { // ICE servers
            "iceServers": [{
                "urls": "turn:120.78.130.xx:3478", // the TURN relay server
                "username": "zhangyangsong",
                "credential": "123"
            }]
        };

        // pc handles media negotiation, adding and stopping streams and tracks,
        // transport, and statistics
        pc = new RTCPeerConnection(pcConfig);

        pc.onicecandidate = (e) => { // fired when ICE has gathered a candidate
            if (e.candidate) { // check that the event actually carries a candidate
                console.log("CANDIDATE", e.candidate);
                sendMessage({ // send the candidate to the other side (relayed by the signaling server)
                    type: "candidata",
                    label: e.candidate.sdpMLineIndex, // candidate label
                    id: e.candidate.sdpMid, // candidate id
                    candidate: e.candidate.candidate // candidate data
                });
            }
        };

        // what to do when media arrives
        pc.ontrack = (e) => { // fired when a remote audio/video track is received
            // alert("connected")
            remoteStream = e.streams[0];
            videoRemote.srcObject = remoteStream; // show the remote media stream in the remote video element
        };
    }

    if (localStream) {
        localStream.getTracks().forEach((track) => {
            pc.addTrack(track, localStream); // add the local media tracks to pc
        });
    }
}

start();

function sendMessage(data) {
    if (socked) {
        socked.emit("vdata", room, data);
    }
}

function handleErr(e) {
    console.log(e);
}

function enterRoom() {
    room = iptRoom.value.trim();
    if (room === "") {
        alert("Please enter a room number");
        return;
    }
    socked.emit("vjoin", room, uname);
}

function leaveRoom() {
    socked.emit("vleave", room, uname);
    closePeerConnection();
}

function closePeerConnection() {
    console.log("close RTCPeerConnection");
    if (pc) {
        pc.close();
        pc = null;
    }
}

btnEnterRoom.onclick = enterRoom;
btnLeaveRoom.onclick = leaveRoom;

At this point, the whole WebRTC P2P chat room is complete. WebRTC offers many more features to build on, but the basic principles are the ones covered here.

Note: remember to open chrome://flags/, search for "platform", and enable it.


Origin blog.csdn.net/qq_58360406/article/details/129111151