Build an RTMP streaming server with the Intel® DevKit in 10 minutes and enable OpenVINO™ AI-based video processing

Author: Zhou Zhaojing

Article guidance: Fang Liang, Guo Yejun

1. Purpose of this article

This article introduces how to use the Intel®-certified DevKit, the AIxBoard* development board, to quickly build an RTMP streaming media server and push video streams with FFmpeg*. Since FFmpeg supports OpenVINO™ as a deep learning backend, we can deploy AI models and perform AI processing on the video stream while pushing it. We will also make full use of the integrated graphics (iGPU) built into the CPU to accelerate both video encoding/decoding and AI inference.

2. Project Introduction

Figure 1: Intel® DevKit RTMP streaming server project flow chart

Project overview: FFmpeg reads a camera feed, a local video, or a network video stream and decodes it. After decoding, it can apply the video processing functions built into FFmpeg, including video editing, video stitching, and video watermarking, and it can use the OpenVINO™ toolkit as a backend to perform AI processing on the input video. Because FFmpeg works with both software and hardware codec libraries, you can choose either the CPU or the integrated GPU (iGPU) to accelerate encoding, decoding, and related functions. Finally, FFmpeg pushes the processed video to a locally built RTMP streaming server (Simple Realtime Server). Within a local area network, clients can pull and watch the stream directly via the server's IP address; on the public Internet, the stream is forwarded to a public video streaming website and distributed through public network nodes.

2.1 Introduction to the Intel®-certified DevKit: the AIxBoard development board

Figure 2: AIxBoard hardware specifications

Figure 3: Photo of the AIxBoard

The Intel®-certified DevKit, the AIxBoard* development board, is embedded hardware designed to support entry-level edge AI applications. It meets developers' needs for artificial intelligence learning, development, training, and other application scenarios.

Designed on the x86 platform, the development board supports Ubuntu Linux and the full Windows operating system, which makes software and hardware development very convenient and lets developers try every software capability available on the x86 platform. The board carries an Intel® Celeron® N5105 4-core, 4-thread processor with a turbo frequency of up to 2.9 GHz and built-in Intel® UHD Graphics. The integrated GPU runs at 450 MHz to 800 MHz, contains 24 execution units, and supports output up to 4K at 60 frames per second. It also supports Intel® Quick Sync Video technology for quickly converting video for portable multimedia players, as well as online sharing, video editing, and video production. The board ships with 64 GB of eMMC storage and LPDDR4x-2933 memory (4 GB/6 GB/8 GB), built-in Bluetooth and Wi-Fi modules, USB 3.0, HDMI video output, a 3.5 mm audio jack, and a 1000 Mbps Ethernet port. The board's interfaces are rich, and various sensor modules can be attached as extensions.

In addition, its interface is compatible with the Jetson Nano carrier board, and its GPIO header is compatible with the Raspberry Pi, which maximizes reuse of the Raspberry Pi and Jetson Nano ecosystems. Whether the task is camera-based object recognition, 3D printing, or real-time CNC interpolation control, it runs stably. It can serve as an edge computing engine for AI product validation and development, or as the domain control core for robot product development.

Because the Intel® DevKit uses the x86 architecture, it supports a complete Windows system without special optimization, giving you direct access to powerful software such as Visual Studio, OpenVINO™, and OpenCV, the most mature development ecosystem, and millions of open source projects to fuel your creativity. Whether you are a DIY enthusiast, an interaction designer, or a robotics expert, you can use the development board for creative development.

2.2 Real Time Messaging Protocol (RTMP)

2.2.1 Introduction to RTMP

RTMP stands for Real-Time Messaging Protocol. Simply put, a streaming protocol is a set of rules for transmitting multimedia between two communicating systems: it defines how a video file is broken into small data packets and the order in which they are transmitted over the Internet. RTMP is one of the more common streaming protocols. It was developed by Macromedia for streaming to Flash players. As Flash was phased out and HTTP-based protocols became the new standard for streaming to playback devices, RTMP's role among streaming protocols gradually narrowed. That does not prevent its use, however, and it still holds a great advantage in end-to-end live video streaming!

RTMP uses the dedicated port 1935 and runs over TCP, giving it low latency without buffering and a stable connection. If the network drops while a user is watching, playback can resume from the point of disconnection once the connection is restored. RTMP is also very flexible to integrate: it can combine text, video, and audio, and it supports MP3 and AAC audio streams as well as MP4, FLV, and F4V video streams.
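Since RTMP rides on TCP port 1935, a plain TCP probe is enough to check whether a server is reachable. A minimal sketch using bash's /dev/tcp pseudo-device (the host below is an assumption; substitute your server's address):

```shell
#!/usr/bin/env bash
# Probe whether an RTMP server accepts TCP connections on its default port 1935.
rtmp_port_open() {
  local host="$1" port="${2:-1935}"
  # bash's /dev/tcp/<host>/<port> opens a TCP connection; success means the port is open
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

if rtmp_port_open "127.0.0.1"; then
  echo "port 1935 is open"
else
  echo "port 1935 is closed - is the streaming server running?"
fi
```

This only tests TCP reachability, not that an RTMP handshake would succeed, but it quickly distinguishes "server not running" from "firewall blocking 1935".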

However, RTMP also has disadvantages: it does not support high-resolution video or newer compression formats such as VP9 and AV1; iOS, Android, most embedded players, and some browsers no longer accept RTMP live streams; and some networks block the RTMP port by default, requiring special firewall changes to let the traffic through.

Figure 4: RTMP-based video streaming framework

For video push/pull streaming over RTMP, we need a relay streaming server to distribute the RTMP stream. This both ensures the stability of the pushed stream and makes it easy for an administrator to monitor and manage the video streams.
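Concretely, both the publisher and the viewers address the relay server with a URL of the form rtmp://&lt;host&gt;/&lt;app&gt;/&lt;stream&gt;. A small sketch of how the address is composed (the host value is an illustrative assumption; live/livestream are the default app and stream names used in the SRS steps below):

```shell
# An RTMP address is the server host plus an application name and a stream key.
host="192.168.1.10"      # assumption: your streaming server's LAN IP
app="live"               # application name; "live" is SRS's conventional default
stream="livestream"      # stream key; viewers must use the same one to pull
rtmp_url="rtmp://${host}/${app}/${stream}"
echo "${rtmp_url}"       # rtmp://192.168.1.10/live/livestream
```

The publisher pushes to this URL and every viewer pulls from it; the relay server fans one inbound stream out to many clients.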

2.2.2 Simple Realtime Server (SRS)

SRS* is a simple and efficient real-time video streaming server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, and GB28181, among other protocols. It runs on Linux, Windows, and macOS, on chip architectures including x86_64, ARMv7, AArch64, Apple M1, RISC-V, LoongArch, and MIPS. You can use it to distribute video streams; it also supports HTTP callback events and can record streams to files. It can be deployed locally and is easy to operate. Open source address: https://github.com/ossrs/srs

By compiling and installing this open source streaming server, we save development cost, deploy the streaming server rapidly, and can start pushing video.

2.3  FFmpeg integrates the OpenVINO™ inference engine

2.3.1 Introduction to FFmpeg

FFmpeg is not only an audio/video encoding and decoding tool but also an audio/video development kit that provides developers with rich APIs for audio and video processing.

FFmpeg provides muxing and demuxing of many media formats, many audio and video codecs, streaming over multiple protocols, pixel format conversion, sample rate conversion, bit rate conversion, and more. The FFmpeg framework also offers a rich set of plugin modules, including plugins for muxing/demuxing and for encoding/decoding.

The basic building blocks of the FFmpeg framework are module libraries such as AVFormat, AVCodec, AVFilter, AVDevice, and AVUtil. The architecture diagram is as follows:

Figure 5: FFmpeg software architecture diagram
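On a machine where FFmpeg is installed, you can confirm which of these module libraries your build ships and at which versions (this assumes ffmpeg is on PATH; the exact version numbers will differ on your system):

```shell
# Each core library reports its version in the banner of `ffmpeg -version`.
ffmpeg -version | grep -E 'lib(avutil|avcodec|avformat|avdevice|avfilter)'
# Typical lines look like:
#   libavutil      56. 70.100 / 56. 70.100
#   libavcodec     58.134.100 / 58.134.100
```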

2.3.2  FFmpeg integrates the OpenVINO™ Toolkit

OpenVINO™ is a deep learning toolkit released by Intel that supports multiple model file formats, including TensorFlow*, Caffe*, ONNX*, MXNet*, Kaldi, and Torch*. It also supports a range of Intel hardware, including CPUs, GPUs, FPGAs, and the Movidius™ Neural Compute Stick. FFmpeg requires any library it calls to provide a C API, and OpenVINO™ added exactly such an interface in its 2020 release. Moreover, the OpenVINO™ backend supports more model formats than the TensorFlow backend and runs on more, and better-optimized, underlying hardware. So the FFmpeg community adopted the OpenVINO™ inference engine as a new deep learning backend.

Figure 6: AVFilter internal architecture diagram

In AVFilter, the OpenVINO™ inference engine is integrated as a backend of the DNN interface. FFmpeg currently has no filter for deep-learning-based image analysis, only a general deep-learning image processing filter, dnn_processing, so we use dnn_processing as the demonstration example:

dnn_processing=dnn_backend=openvino:model=<YOUR PATH OF espcn.xml>

Since FFmpeg's default build configuration does not include the OpenVINO™ backend support library, in this example developers need to recompile FFmpeg to link libopenvino into FFmpeg. Thanks also go to Mr. Guo Yejun for contributing the OpenVINO™ C-interface integration to FFmpeg, so that FFmpeg officially supports loading the libopenvino.so library and calling the OpenVINO™ engine for model inference. Open source address: https://github.com/mattcurf/ffmpeg_openvino

3. Project process

3.1  Operating system installation

The Linux OS installed in the official AIxBoard operating manual (Product Introduction - Intel Digital Development Kit) is Ubuntu 20.04. For ease of operation and demonstration, we use Ubuntu 20.04 with a graphical interface.

If the encoding/decoding load is too heavy, you can instead build the streaming server on a server edition of Linux without a graphical interface. If you are using a different Linux distribution or version, treat this procedure as a reference only.

3.2  Build RTMP streaming media server

Build steps:

1. Obtain the source code of the srs server.

git clone https://github.com/ossrs/srs

cd srs/trunk

2. Install dependencies and compile srs source code.

sudo apt install -y automake tclsh

./configure && make

3. Edit the srs configuration file.

Save the following content as a file such as conf/rtmp.conf (the srs conf folder already ships this file); modify it to suit your needs and specify it when starting the server.

# conf/rtmp.conf

listen              1935;

max_connections     1000;

vhost __defaultVhost__ {

}

4. Start the srs server

./objs/srs -c conf/rtmp.conf

5. Start the streaming encoder:

Use the FFmpeg command to push a video to the server:

for((;;)); do \

./objs/ffmpeg/bin/ffmpeg -re -i <your mp4/flv…> \

-vcodec copy -acodec copy \

-f flv -y rtmp://<YOUR SERVER IP>/live/livestream; \

sleep 1; \

done

6. Watch RTMP stream.

The RTMP stream address is: rtmp://<YOUR SERVER IP>/live/livestream

If the system does not have a VLC player, use the following command to install the VLC player:

sudo apt-get install vlc

Open the RTMP stream address in the VLC player to watch the video stream.
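As an alternative to VLC, ffplay (built alongside ffmpeg; it is enabled in the build in section 3.3) can also play the stream. The server address is the same placeholder as above:

```shell
# Play the RTMP stream with ffplay. -fflags nobuffer reduces startup latency
# at the cost of some smoothness. Replace <YOUR SERVER IP> with your server address.
ffplay -fflags nobuffer rtmp://<YOUR SERVER IP>/live/livestream
```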

3.3  Install the OpenVINO™ toolkit and compile FFmpeg

1. Install software dependencies

apt-get install -y -q --no-install-recommends \

      apt-utils \

      build-essential \

      ca-certificates \

      cmake \

      cpio \

      curl \

      git \

      gnupg-agent \

      libdrm-dev \

      libpciaccess-dev \

      libva-dev \

      libx11-dev \

      libsdl2-2.0 \

      libsdl2-dev \

      libx11-xcb-dev \

      libxcb-dri3-dev \

      libxcb-present-dev \

      lsb-release \

      nasm \

      pkg-config \

      software-properties-common \

      wget \

      xorg-dev \

      xutils-dev \

      clang \

      libfdk-aac-dev \

      libspeex-dev \

      libx264-dev \

      libx265-dev \

      libnuma-dev \

      libopencore-amrnb-dev \

      libopencore-amrwb-dev \

      yasm

2. Install OpenCL & VAAPI

   curl -L https://repositories.intel.com/graphics/intel-graphics.key | sudo apt-key add - && \

   apt-add-repository 'deb [arch=amd64] https://repositories.intel.com/graphics/ubuntu focal main' && \

   apt-get update && \

   sudo apt-get install -y -q --no-install-recommends \

     clinfo \

     intel-opencl-icd \

     intel-media-va-driver-non-free 

3. Install the OpenVINO™ toolkit

curl -L https://registrationcenter-download.intel.com/akdlm/irc_nas/18319/l_openvino_toolkit_p_2021.4.752.tgz | tar xzf -

# Extract the tgz archive and install OpenVINO

cd l_openvino_toolkit_p_2021.4.752

sudo ./install.sh

# Set environment variables

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/openvino_2021/inference_engine/lib/intel64:/opt/intel/openvino_2021/inference_engine/external/tbb/lib:/opt/intel/openvino_2021/deployment_tools/ngraph/lib
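Alternatively — assuming the default 2021.4 install layout shown above — the installer ships a setup script that exports these variables for you:

```shell
# Source OpenVINO's environment script instead of setting LD_LIBRARY_PATH by hand.
# The path assumes the default install prefix of the 2021.4 release.
source /opt/intel/openvino_2021/bin/setupvars.sh
```

Add the line to ~/.bashrc if you want the environment configured in every shell.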

4. Download FFmpeg, then compile and install it with OpenVINO enabled

git clone https://git.ffmpeg.org/ffmpeg.git
cd ffmpeg

./configure \

            --cpu=native \

            --extra-cflags=-I/opt/intel/openvino_2021/inference_engine/include/ \

            --extra-ldflags=-L/opt/intel/openvino_2021/inference_engine/lib/intel64 \

            --extra-libs=-lpthread \

            --disable-cuda-llvm \

            --prefix=/usr \

            --enable-static \

            --disable-shared \

            --enable-pic  \

            --disable-doc \

            --disable-manpages  \

            --enable-libopenvino \

            --enable-vaapi \

            --enable-libx264 \

            --enable-libx265 \

            --enable-ffplay \

            --enable-ffprobe \

            --enable-gpl \

            --enable-nonfree \

            --enable-libxcb && \

   make -j $(nproc) && \

   sudo make install

# Clean up the build files and add the VA driver variable

   rm -rf /build && \

   echo 'LIBVA_DRIVER_NAME=iHD' | sudo tee -a /etc/environment && \

   sudo ldconfig
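After make install finishes, it is worth a quick sanity check that the rebuilt FFmpeg really carries the OpenVINO™ backend (this assumes the newly installed binary is first on PATH):

```shell
# The configure flags are recorded inside the binary, and dnn_processing should
# appear in the filter list when DNN support is compiled in.
ffmpeg -hide_banner -buildconf | grep libopenvino
ffmpeg -hide_banner -filters | grep dnn_processing
```

If either grep prints nothing, the build picked up the wrong configure flags or a different ffmpeg binary is shadowing the new one.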

3.4  Run the streaming media server

  • Run the RTMP streaming media server

./objs/srs -c conf/rtmp.conf

  • Push the live feed from a USB camera with FFmpeg

for((;;)); do \

       ffmpeg -f video4linux2 -i "/dev/video0" -vcodec libx264 -preset:v ultrafast -tune:v zerolatency -f flv rtmp://<YOUR server IP>/live/livestream;\

       sleep 1; \

   done

Figure 7: Pushing the camera video stream
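If FFmpeg cannot open /dev/video0, first check how the camera enumerates. A sketch using the v4l-utils package (an assumption; install it with `sudo apt install v4l-utils` if missing — device paths vary by machine):

```shell
# List Video4Linux capture devices and the formats the first one offers.
v4l2-ctl --list-devices                      # map cameras to /dev/videoN nodes
v4l2-ctl -d /dev/video0 --list-formats-ext   # pixel formats and resolutions offered
```

Pass whichever /dev/videoN node corresponds to your camera to ffmpeg's -i option.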

  • Push a video after processing it with FFmpeg

for((;;)); do \

       ffmpeg -re -i <YOUR MP4 FILES> \

       -vcodec copy -acodec copy \

       -f flv -y rtmp://<YOUR SERVER IP>/live/livestream; \

       sleep 1; \

   done

Figure 8: Pushing an MP4 video

  • Multi-stream video stitching, taking 4-way stitching as an example:

for((;;)); do \

       ffmpeg -re -i <video1> -i <video2> \

       -i <video3> -i <video4> \

       -filter_complex "[0:v]pad=iw*2:ih*2[a];[a][1:v]overlay=w*1[b];[b][2:v]overlay=0:h[c];[c][3:v]overlay=w:h" \

       -f flv -ar 44100 -y rtmp://<YOUR SERVER IP>/live/livestream; \

       sleep 1; \

   done

Figure 9: Pushing and stitching four MP4 videos

  • Add a watermark to the video

for((;;)); do \

       ffmpeg -i <YOUR MP4 FILES> -i <YOUR LOGO> -filter_complex overlay \

       -f flv -ar 44100 -y rtmp://<YOUR SERVER IP>/live/livestream; \

       sleep 1; \

   done

Figure 10: Pushing a video with a watermark added

  • Process the video with AI

Using the OpenVINO™ toolkit as the backend, we apply AI super-resolution to the input video. Download the required IR model and test video. The AI model is the Efficient Sub-Pixel Convolutional Neural Network (ESPCN) video super-resolution model; for more about the model, see: https://arxiv.org/abs/1609.05158

wget https://raw.githubusercontent.com/guoyejun/dnn_processing/master/models/espcn.xml

wget https://raw.githubusercontent.com/guoyejun/dnn_processing/master/models/espcn.bin

wget https://raw.githubusercontent.com/guoyejun/dnn_processing/master/models/480p.mp4

The input video is a 480p MP4. We use VAAPI to place encoding and decoding on the integrated GPU and accelerate them there. First, the hardware-decoded frames must be transferred with hwdownload into system memory for processing. The dnn_processing filter then loads the ESPCN model and runs inference with OpenVINO™; the input and output options name the model's input and output layers, and the device option selects the device that runs model inference. Here we set it to GPU, meaning the integrated graphics performs the inference; you can also set it to CPU. Once the video filters have finished, hwupload moves the frames back to the GPU for encoding. Finally, the super-resolved video is pushed to the server:

for((;;)); do \

          ffmpeg -y -loglevel warning -hide_banner -stats -benchmark -hwaccel vaapi -hwaccel_output_format vaapi -i <YOUR PATH OF 480p.mp4> -vf hwdownload,format=yuv420p,dnn_processing=dnn_backend=openvino:model=<YOUR PATH OF espcn.xml>:input=x:output=espcn/prediction:options=device=GPU,format=nv12,hwupload -c:v h264_vaapi  \

       -f flv -ar 44100 -y rtmp://<YOUR SERVER IP>/live/livestream; \

       sleep 1; \

   done

Figure 11: The 480p video is pushed as 960p after super-resolution, with iGPU utilization and frames per second shown

As open source video processing software, FFmpeg is compatible with a wide variety of backend software tools. You can explore plugging in different video processing backends, software and hardware acceleration tools, and custom programs.

You can also use FFmpeg, the OpenVINO™ toolkit, and the AIxBoard to build more creative applications or inventions according to your needs and interests. If you already have a good idea or development plan, you can apply for a free AIxBoard for experiments by filling in a development plan description. For details, see: Happy 5th anniversary: get a free developer kit through Intel's "Meet the Developers" campaign

3.5  Live streaming over the public network

Pushing video over the public Internet, taking Douyu* as an example:

Live streaming is a very popular field today. Ordinary users typically broadcast with the streaming companion software provided by the live streaming site, for example:

Figure 12: The Douyu live streaming companion interface

With the RTMP server in place, we can push a camera feed, or videos stored on the server, out for live broadcast. With the support of FFmpeg and OpenVINO™, we can also edit the stream to be pushed and give it capabilities such as AI inference, then broadcast the processed video. Live streaming platforms generally provide an ingest entry point for Internet resources, which makes it easy for broadcasters to pull the stream directly.

Local area network:

Via the LAN IP address, the processed video stream can be pushed to clients on the internal network.

4. Summary

The Intel®-certified DevKit, the AIxBoard, uses the Intel® Celeron® N5105 as its processing core and achieves excellent computing performance at the same power envelope. In this example it acts as a small streaming media server capable of multi-stream encoding/decoding and real-time video transmission, and, with the help of the OpenVINO™ toolkit, of applying AI processing to the video before pushing it. This article used a super-resolution model as the example of bringing AI inference into such a streaming server, mainly to offer developers an idea: models for face detection, segmentation, or recognition can likewise be deployed on the streaming server so that the pushed video passes through AI processing, empowering an ordinary video streaming server with AI. And don't forget: if you have a good idea, you can apply for a free AIxBoard for experiments by submitting a development plan description. For details, see: Happy 5th anniversary: get a free developer kit through Intel's "Meet the Developers" campaign

RTMP streaming media servers are widely used in the market. Beyond point-to-point video transmission, such a server can play and process videos automatically, which carries real commercial value. Since the open source FFmpeg framework offers a high degree of freedom, any function FFmpeg can integrate can be easily deployed to the server. Come take part in the activity, apply for a board, and start experimenting.


Origin blog.csdn.net/gc5r8w07u/article/details/131700104