Aizhi EdgerOS in depth: how to use Aizhi's video streaming modules to pull a video stream

1. ONVIF specifications and common video streaming protocols

① ONVIF specification

  • As the video surveillance industry chain has matured, a wide variety of network camera devices have appeared on the market, and all of them need communication protocols for data transmission. Early manufacturers used private protocols, but the industry now has a clear division of labor: some companies build cameras, some develop video servers, and some integrate and sell complete solutions. Private protocols therefore cause serious compatibility problems. Just as HTTP standardizes communication between browsers and servers, the interfaces of network cameras have also been standardized, and among these standards ONVIF is far ahead in the number of supporting manufacturers, the influence of those manufacturers, and market share.
  • The ONVIF specification describes the network video model, interfaces, data types and data interaction patterns, and reuses existing standards such as the WS series. Its goal is a network video framework protocol that makes network video products from different manufacturers (recording front ends, recording devices, and so on) fully interoperable. The interfaces defined in the device management and control sections of the specification are provided as Web Services, and the specification includes complete XML and WSDL definitions; every terminal device that supports ONVIF must provide the Web Services corresponding to its functions. Simple data interaction between server and client uses the SOAP protocol, while audio and video streaming is carried over RTP/RTSP.
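  • As a hedged illustration (plain Node.js, not an EdgerOS API): ONVIF device discovery works by multicasting a WS-Discovery SOAP Probe over UDP, roughly as in the sketch below. The envelope follows the WS-Discovery 2005/04 schema that ONVIF reuses; authentication and error handling are omitted:

var dgram = require('dgram');

// WS-Discovery Probe asking for ONVIF network video transmitters.
var probe =
  '<?xml version="1.0" encoding="UTF-8"?>' +
  '<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"' +
  ' xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"' +
  ' xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery"' +
  ' xmlns:dn="http://www.onvif.org/ver10/network/wsdl">' +
  '<e:Header>' +
  '<w:MessageID>uuid:00000000-0000-0000-0000-000000000001</w:MessageID>' +  // use a fresh UUID in practice
  '<w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>' +
  '<w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>' +
  '</e:Header>' +
  '<e:Body><d:Probe><d:Types>dn:NetworkVideoTransmitter</d:Types></d:Probe></e:Body>' +
  '</e:Envelope>';

var sock = dgram.createSocket('udp4');
sock.on('message', (msg, rinfo) => {
  // Each ONVIF device replies with a ProbeMatch containing its service address (XAddrs).
  console.log('ONVIF device found at', rinfo.address);
});
// 239.255.255.250:3702 is the WS-Discovery multicast address and port.
sock.send(Buffer.from(probe), 3702, '239.255.255.250');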


② Video streaming protocol


  • http-flv / ws-flv
    • The HTTP protocol has a Content-Length header that gives the length of the HTTP body. If the server's response carries this header, the client reads exactly that many bytes and then considers the transfer complete; if it does not, the client keeps reading data until the server closes the socket. When streaming FLV, the server cannot know the content size in advance, so it can use Transfer-Encoding: chunked to keep sending the data in chunks (see the minimal sketch after this list).
    • http-flv penetrates firewalls well: because it is carried over HTTP on port 80 it is rarely blocked, it can be scheduled and load-balanced flexibly through HTTP 302 redirects, and it supports encrypted transport over HTTPS. ws-flv is similar; the difference is that http-flv, being based on HTTP, can only transmit data in one direction, whereas ws-flv is based on WebSocket and can transmit data in both directions.
  • RTMP
    • RTMP stands for Real Time Messaging Protocol, a family of live-video protocols developed by Macromedia and now owned by Adobe. RTMP is a proprietary protocol designed to transmit streaming audio, video and data between a Flash player and a server over the Internet. Since early web pages needed the Flash player for live broadcasting, RTMP was the natural choice; Internet companies accumulated a lot of technology around the protocol, and domestic CDNs have optimized heavily for RTMP.
    • With the popularity of HTML5 and the end of Flash updates, RTMP should in theory be replaced by other protocols. However, thanks to libraries such as flv.js, FLV video can also be played on HTML5 pages, so RTMP remains the mainstream protocol for live video today.
  • HLS
    • HLS is short for HTTP Live Streaming, an HTTP-based streaming transport protocol proposed by Apple that supports both live and on-demand playback. It is mainly used on iOS to provide live and on-demand audio/video for devices such as the iPhone and iPad. HLS stores the live stream on the server as a sequence of continuous, short media segments (MPEG-TS format) while the client keeps downloading and playing these small files; because the server keeps generating new segments from the latest live data, the client only has to play the fetched files in order to achieve a live broadcast.
    • Both iOS and Android support this protocol natively, and configuration is simple: just use the video tag. This is live broadcasting implemented in an on-demand manner, and because the segment duration is very short the client can switch bitrates fairly quickly to adapt to different bandwidth conditions. Due to this design, however, its latency is higher than that of ordinary live streaming protocols, and storing and handling the large number of small files that HLS generates affects server response time and shortens disk life.
  • RTSP
    • RTSP (Real Time Streaming Protocol) is an application-layer protocol proposed jointly by RealNetworks and Netscape for efficiently transmitting streaming media over IP networks. Unlike HTTP-based streaming transport protocols, RTSP is bidirectional: both the client and the server can actively initiate requests.
    • RTSP provides controls such as pause and fast-forward for streaming media, but it does not transport the data itself; it acts as the remote control of the streaming media server, which can deliver the stream content over either TCP or UDP. Its syntax and operation are similar to HTTP/1.1, but it places no special emphasis on time synchronization, so it tolerates network delay well. It also supports multicast, which not only reduces network load on the server side but also makes multi-party video conferencing possible.


  • DASH
    • DASH (Dynamic Adaptive Streaming over HTTP) is an adaptive-bitrate streaming technique that delivers high-quality streaming media over the Internet through ordinary HTTP web servers.
    • Similar to Apple's HTTP Live Streaming (HLS), DASH splits the content into a series of small HTTP-based file segments, each containing a short length of playable content, while the total content may run for several hours (for example a movie or a live sports event). The content is encoded into several alternative bitrates, and during playback the client automatically chooses which alternative to download based on current network conditions, selecting the highest bitrate segment that can still be downloaded in time so as to avoid stalls or rebuffering. In this way the client adapts seamlessly to changing network conditions and provides a high-quality playback experience with fewer stalls and rebuffering events.
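  • To make the chunked-transfer idea behind http-flv concrete, here is a minimal plain Node.js sketch (not EdgerOS's WebMedia API); onFlvData is a hypothetical hook standing in for whatever produces the FLV tags:

var http = require('http');

http.createServer((req, res) => {
  if (req.url !== '/live.flv') {
    res.statusCode = 404;
    return res.end();
  }
  // No Content-Length is set, so the reply falls back to
  // Transfer-Encoding: chunked and the client keeps reading
  // until the connection is closed.
  res.writeHead(200, {
    'Content-Type': 'video/x-flv',
    'Connection': 'keep-alive'
  });
  // onFlvData is hypothetical: each FLV tag produced by the muxer is written as a chunk.
  onFlvData((chunk) => res.write(chunk));
  req.on('close', () => {
    // stop feeding this client
  });
}).listen(10002);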

2. EdgerOS’s support for streaming media

  • Video streaming is used in everyday scenarios such as face-recognition access control, unmanned garages and object recognition. EdgerOS provides developers with two modules, MediaDecoder and WebMedia, which make it simple to push a video stream to the front end. MediaDecoder is responsible for decoding the real-time audio/video stream pushed by the device and converting it into a target video in the specified format (width, height, frame rate, pixel format).
  • At the same time, WebMedia can be used to create a streaming media service. This service supports two transmission channels: the stream channel carries the video stream over the http-flv or ws-flv protocol, while the data channel uses the WS protocol so that client and server can send simple control commands directly to each other. The client receives the video stream through the stream channel and plays it live on a web page with an open-source library such as flv.js, as in the sketch below.
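  • As a hedged illustration of the client side, the following minimal flv.js snippet (standard flv.js API; the stream URL and element id are placeholders matching the example later in this article) plays an http-flv stream in an HTML5 video element:

<video id="liveVideo" controls muted></video>
<script src="./flv.min.js"></script>
<script>
  // Sketch: play the http-flv stream produced by WebMedia with flv.js.
  if (flvjs.isSupported()) {
    var flvPlayer = flvjs.createPlayer({
      type: 'flv',
      isLive: true,                                // live stream, not a static file
      url: 'http://192.168.128.1:10002/live.flv'   // stream channel URL (placeholder)
    });
    flvPlayer.attachMediaElement(document.getElementById('liveVideo'));
    flvPlayer.load();
    flvPlayer.play();
  }
</script>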


  • EdgerOS then uses other modules to process the target video stream produced by MediaDecoder so as to adapt to different scenarios such as face recognition. In general, the device's stream is pulled and decoded by MediaDecoder, and the decoded frames are either analyzed (for example for face recognition) or remuxed and pushed to the front end through WebMedia.


3. Stream pushing and stream pulling

  • Stream pushing: the process of pushing live content to the server, that is, transmitting the live video signal into the network. Pushing places relatively high demands on the network: if the network is unstable, the live broadcast quality will be poor and viewers will see stuttering, making for a bad viewing experience. Before pushing, the audio and video data must be encapsulated with a transport protocol and turned into stream data; commonly used streaming protocols include RTSP, RTMP and HLS. RTMP transmission latency is usually 1-3 seconds, and since mobile live broadcasting demands very low latency, RTMP has become the most commonly used push protocol for mobile live streaming. Finally the audio/video stream data is pushed into the network with a certain QoS algorithm and distributed through a CDN.
  • Stream pulling: the process of retrieving live content from the server using a specified address. In other words, the server holds stream data, and the process of that data being read according to different network protocols (such as RTMP, RTSP or HTTP) is called pulling. Watching videos and live broadcasts every day is itself a pulling process.


4. How to use Aizhi's video streaming modules to pull a stream?

① Import the modules

  • To pull a video stream, the WebMedia and MediaDecoder modules are needed. At the same time, a WebApp must be created to communicate with the front end and be bound to WebMedia (see the sketch after the imports):
var WebApp = require('webapp');
var MediaDecoder = require('mediadecoder');
var WebMedia = require('webmedia');
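  • For completeness, the WebApp itself can be created along the following lines; this is only a minimal sketch assuming the usual EdgerOS WebApp usage (createApp() / start()), and the routes or static file serving for the front-end page are up to your application:

// Minimal sketch (assumed EdgerOS WebApp usage): create and start the web application
// that the WebMedia server will be bound to via WebMedia.createServer(opts, app).
var app = WebApp.createApp();

// ... register routes / serve the front-end test page here ...

app.start();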

② Select the streaming protocol and bind it

  • WebMedia supports the http-flv and ws-flv protocols for transmitting the video stream. Choose the streaming protocol according to your actual needs, write it into the configuration, and bind the configuration to WebMedia when it starts.
  • http-flv:
var opts = {
  mode: 1,
  path: '/live.flv',          // URL path on which the stream is served
  mediaSource: {
    source: 'flv',            // media source format
  },
  streamChannel: {
    protocol: 'xhr',          // 'xhr' selects http-flv transport
  },
};
var server = WebMedia.createServer(opts, app);
  • ws-flv:

var opts = {
  mode: 1,
  path: '/live.flv',          // URL path on which the stream is served
  mediaSource: {
    source: 'flv',            // media source format
  },
  streamChannel: {
    protocol: 'ws',           // 'ws' selects ws-flv transport
    server: wsSer,            // an already created WebSocket server bound to this app
  },
};
var server = WebMedia.createServer(opts, app);

③ Check the streaming URL on the mobile terminal

  • After the WebMedia server has started successfully, start the media decoding module MediaDecoder and bind it to the pull URL; the URL can be found in the settings of the mobile push-streaming app.


④ Create video stream decoding module

  • Use the MediaDecoder interface to configure how the video stream is pulled and decoded. Once configured, listen for the remux and header events and push the received data into the channel that sends the video stream to the front end, completing the pull, parse, decode and push pipeline:

server.on('start', () => {
  // Open the RTSP pull URL over TCP.
  var netcam = new MediaDecoder().open('rtsp://192.168.128.102:8554/live.sdp',
                                       { proto: 'tcp' }, 5000);

  // Target video format of the decoded stream.
  netcam.destVideoFormat({
    width: 640, height: 360, fps: 1,
    pixelFormat: MediaDecoder.PIX_FMT_RGB24,
    noDrop: false, disable: false
  });
  // Keep the audio stream.
  netcam.destAudioFormat({ disable: false });
  // Remux the decoded stream to FLV (AAC audio) for the live server.
  netcam.remuxFormat({
    enable: true, enableAudio: true,
    audioFormat: 'aac', format: 'flv'
  });

  // 'header' delivers the FLV header, 'remux' delivers the remuxed FLV tags;
  // both are pushed into the WebMedia stream channel.
  netcam.on('remux', (frame) => {
    var buf = Buffer.from(frame.arrayBuffer);
    server.pushStream(buf);
  });
  netcam.on('header', (frame) => {
    var buf = Buffer.from(frame.arrayBuffer);
    server.pushStream(buf);
  });

  netcam.start();
});

⑤ Detailed explanation of the configuration interface of MediaDecoder

  • mediadecoder.destVideoFormat(fmt) sets the target video format of the media decoder:
    • fmt {Object} target video format;
    • return {Boolean} Returns true on success, false otherwise.
  • The target video format object contains the following members:
    • disable {Boolean} whether to disable video
    • width {Integer} video width
    • height {Integer} video height
    • pixelFormat {Integer} video pixel format
    • fps {Integer} video frame rate
  • mediadecoder.destAudioFormat(fmt) sets the target audio format of the media decoder:
    • fmt {Object} target audio format
    • return {Boolean} Returns true if successful, otherwise returns false
  • The target audio format object contains the following members:
    • disable {Boolean} whether to disable audio
    • channelLayout {String} audio channel layout
    • channels {Integer} number of audio channels
    • sampleRate {Integer} audio sampling rate
    • sampleFormat {Integer} audio frame format
  • mediadecoder.remuxFormat(fmt) sets the remux format of the media decoder. Remuxed data is mainly used for live broadcast servers and video recording:
    • fmt {Object} remux format
    • return {Boolean} Returns true if successful, otherwise returns false
  • The remux format object contains the following members:
    • enable {Boolean} whether to enable remux
    • enableAudio {Boolean} whether to enable audio
    • audioFormat {String} audio compression format
    • format {String} target media format, default is flv
  • Note that when you need to pull audio as well as video, the audioFormat parameter of MediaDecoder.remuxFormat() must be set to AAC ('aac'); this is currently the only supported audio compression format (see the combined sketch below).
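  • Putting the three interfaces together, a fuller configuration of the netcam decoder from step ④ might look like the sketch below; the member names are the ones listed above, while the concrete values (and the omitted channelLayout / sampleFormat constants, which depend on the platform) are assumptions:

// Sketch: configure the decoder with the members documented above (example values).
netcam.destVideoFormat({
  disable: false,                           // keep video
  width: 640, height: 360, fps: 25,         // target resolution and frame rate
  pixelFormat: MediaDecoder.PIX_FMT_RGB24   // pixel format constant from MediaDecoder
});

netcam.destAudioFormat({
  disable: false,                           // keep audio
  channels: 2,                              // number of audio channels (example)
  sampleRate: 44100                         // audio sampling rate (example)
});

netcam.remuxFormat({
  enable: true,                             // enable remuxing for the live server
  enableAudio: true,                        // include audio in the remuxed stream
  audioFormat: 'aac',                       // only AAC is currently supported
  format: 'flv'                             // target media format (default is flv)
});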

⑥ Test video streaming transmission

  • Write a simple front end to test whether the video stream is pushed successfully. The front end uses the NodePlayer tool to process and display the FLV stream.
  • NodePlayer is a player for http-flv / WebSocket-flv live streams. Its distinguishing feature is millisecond-level low-latency live playback in PC, Android and iOS browsers and WebViews. It can soft-decode H.264 / H.265 + AAC streams, render video with WebGL and play audio with WebAudio, and it can be opened inside WeChat official accounts and Moments shares.
  • NodePlayer download address: https://www.nodemedia.cn/doc/web/#/1?page_id=1
  • Front-end test code:
<html>
<body>
<div>
  <canvas id="video1" style="width:640px;height:480px;background-color: black;"></canvas>
</div>
<script type="text/javascript" src="./NodePlayer.min.js"></script>
<script>
var player;
// Create and use the player only after NodePlayer has finished loading.
NodePlayer.load(() => {
  player = new NodePlayer();
  player.setView("video1");

  player.on("start", () => {
    // fired when the connection succeeds and data starts to arrive
  });
  player.on("stop", () => {
    // fired on a local stop or when the remote end disconnects
  });
  player.on("error", (e) => {
    // fired on a connection error or an error during playback
  });
  player.on("videoInfo", (w, h) => {
    // fired when the video information has been parsed
    console.log("player on video info width=" + w + " height=" + h);
  });
  player.on("audioInfo", (r, c) => {
    // fired when the audio information has been parsed
    console.log("player on audio info samplerate=" + r + " channels=" + c);
  });
  player.on("stats", (stats) => {
    // stream statistics, reported once per second
    console.log("player on stats=", stats);
  });

  player.setVolume(1000);
  player.start("http://192.168.128.1:10002/live.flv");
});
</script>
</body>
</html>
