From Video Sources to Edge Intelligence: Technical Challenges and Practices for Realizing End-to-End AI Solutions

Introduction

With the rapid development of artificial intelligence technology, end-to-end AI solutions are increasingly used across many fields. These solutions cover the complete process from data collection to result output, combining AI algorithms with physical equipment, network communication and cloud services to give users a comprehensive, efficient intelligent experience.

The advantage of using edge computing is that it can push data processing and analysis closer to the source of the data, reducing data transmission and delay, and improving response speed and real-time performance. Especially in real-time application scenarios such as video surveillance, edge computing can effectively reduce the load on the cloud and provide faster data processing and response capabilities. In addition, edge computing also has the ability to run offline. Even when the network is unstable or there is no network connection, it can still perform local data processing and analysis to ensure the stability and reliability of the system.

The general solution flow chart is as follows:

A[Video source capture] --> B[Select a suitable camera device]
B --> C[Configure camera parameters (resolution, frame rate, etc.)]
C --> D[Video data capture]
D --> E[Video encoding (H.264, H.265, etc.)]
E --> F[Video transmission]
F --> G[Select a suitable transmission protocol]
G --> H[Assess and adjust network bandwidth]
H --> I[Select a suitable storage medium]
I --> J[Plan storage capacity]
J --> K[Edge computing and box devices]
K --> L[Select a suitable edge box device]
L --> M[Box hardware requirements and selection]
M --> N[Box operating system selection]
N --> O[Configure box dependencies and runtime environment]
O --> P[Install and configure the SDK]
P --> Q[Initialize the SDK]
Q --> R[Set SDK parameters]
R --> S[Call SDK interfaces]

Connecting the end-to-end AI solution

In addition to the familiar box devices, the NVR (Network Video Recorder) and the DVR (Digital Video Recorder), we also need to know what types of equipment appear in a video surveillance system:

  1. Camera: The camera is the most basic and core device in a video surveillance system. It captures video images and converts them into electronic signals for processing and transmission. Cameras come in many types for different application scenarios and requirements, such as fixed cameras, dome cameras, bullet cameras and infrared cameras, and are usually connected to other devices by wired or wireless means.
  2. PTZ camera (Pan-Tilt-Zoom camera): A PTZ camera can pan, tilt and zoom, so its direction, angle and focal length can be adjusted by remote control. These cameras are used for real-time monitoring and tracking of moving targets, and are widely used in security, traffic monitoring and other fields.
  3. High-definition camera: High-definition cameras have high resolution and high-quality image capture, providing clearer and more detailed video images. They suit scenes that demand high image quality, such as banks, shopping malls and public places.
  4. Network camera: A network camera, also known as an IP camera, transmits video signals over the network. It connects via Ethernet and streams video directly to an NVR, DVR or computer for storage, processing and monitoring. Network cameras offer flexible installation and configuration options for large surveillance systems and remote monitoring.
  5. Storage device: Storage devices hold the video data captured by cameras. In surveillance systems, NVRs and DVRs are the common storage devices: they receive and process video signals from cameras and store the data on hard drives or other media for later retrieval, playback and analysis.
  6. Switch: Switches handle connection and networking in the monitoring system. They establish the local area network (LAN) that connects cameras, storage devices, servers and other equipment, and usually support high bandwidth and fast data transmission to keep information flowing reliably.

Video source collection

In an end-to-end AI solution, video source acquisition is a key step in realizing intelligent video analysis. When collecting video sources, the following two aspects need to be considered: camera type and configuration parameters, as well as video encoding standards and parameter selection.

Camera type and configuration parameters

Choosing the right camera equipment is critical for video source capture. According to specific application requirements, different types of cameras can be selected, such as network cameras (IP cameras), analog cameras or high-definition cameras. Each camera type has its specific characteristics and applicable scenarios.

Configuration parameters include, but are not limited to, resolution, frame rate, exposure and contrast. These need to be set for the specific scenario and requirements: for example, higher resolution and frame rate can be chosen when detail must be captured, while in low-light environments exposure and contrast may need to be adjusted for a sharper image.

Video coding standards and parameter selection

Video coding standards determine how video data is compressed and transmitted. Commonly used video coding standards include H.264, H.265, and so on. Choosing an appropriate video coding standard can reduce the amount of data and improve transmission efficiency while ensuring video quality.

In addition to the encoding standard, appropriate encoding parameters need to be selected, including bit rate, GOP size and encoding quality. Choosing these parameters means balancing video quality against transmission efficiency: a higher bit rate and encoding quality yield better video but increase the amount of data, placing higher demands on bandwidth and storage.

Through reasonable selection of camera types and configuration parameters, as well as video coding standards and parameters, high-quality and high-efficiency video source collection can be achieved, providing a high-quality data basis for subsequent video analysis and processing.

Video transmission and storage

In an end-to-end AI solution, video transmission and storage are key links, which involve selecting appropriate network transmission protocols and storage media, as well as capacity planning.

Network transmission protocol
Selecting a suitable network transmission protocol is crucial to the stability and efficiency of video transmission. The following are some commonly used network transport protocols:

  1. RTSP (Real Time Streaming Protocol) :
    • Features: RTSP is an application layer protocol used to control multimedia data transmission. It can realize real-time streaming media transmission, and provide flexible control mechanism, including playback, pause, fast forward and other operations. RTSP is often used in the field of video surveillance, with lower delay and better real-time performance.
    • Applicable scenarios: Suitable for scenarios that require real-time monitoring and control, such as video conferences, live video broadcasts, and real-time monitoring systems.
  2. RTMP (Real Time Messaging Protocol) :
    • Features: RTMP is a protocol for audio, video and data transmission. It uses a real-time message communication mechanism in the transmission process, which can realize real-time data transmission and interaction. RTMP supports streaming and playback, suitable for real-time multimedia applications.
    • Applicable scenarios: suitable for real-time multimedia applications, such as online live broadcast, video on demand and real-time audio and video communication.
  3. SRT (Secure Reliable Transport) :
    • Features: SRT is an open source transmission protocol designed to provide safe and reliable streaming media transmission. It uses mechanisms such as forward error correction, retransmission and encryption to ensure the stability and security of transmission. SRT is suitable for video transmission in an unstable network environment, and can maintain smooth playback in the event of packet loss or bandwidth jitter.
    • Applicable scenarios: Suitable for scenarios with unstable network environment and high packet loss rate, such as remote monitoring, mobile applications and cloud video transmission.
  4. ONVIF (Open Network Video Interface Forum) standard protocol:
    • Features: ONVIF is a standard protocol created by an industry alliance organization, which aims to promote interoperability between network video equipment from different manufacturers. It defines a set of standard interfaces and protocols, so that devices from different manufacturers can be interconnected and interoperable, thus facilitating the integration and management of devices.
    • Applicable scenarios: ONVIF is widely used in video surveillance systems, and can realize linkage and integration of various devices, including network cameras, NVR (Network Video Recorder), VMS (Video Management Software), etc. By using the ONVIF protocol, functions such as device discovery, video stream transmission, event triggering, and device configuration can be realized.

Steps to use such a standard protocol:

  1. Install and configure a video capture device or camera that supports the standard protocol.
  2. On the receiving box device, use the corresponding SDK or library to parse and receive the protocol (this part is introduced below).
  3. Configure the protocol as needed, for example specifying the stream type, resolution, frame rate and other parameters.
  4. Transmit and receive video data by specifying the protocol's URL or address.

Bandwidth requirements

When choosing a network transport protocol, the required bandwidth also needs to be considered. Bandwidth requirements depend on video resolution, frame rate and encoding parameters: higher resolution, frame rate and encoding quality require more bandwidth to keep video transmission smooth and stable. Bandwidth should therefore be evaluated and planned before deployment.

Storage Media and Capacity Planning

Video storage is to save the collected video data in a suitable storage medium for subsequent analysis and playback. When choosing a storage medium, factors such as capacity, stability, and scalability need to be considered.

Common storage media include hard disks, solid-state drives (SSDs) and cloud storage. Hard disks offer large capacity at relatively low cost and suit storing large amounts of video data. SSDs have higher read/write speeds and better shock resistance, which suits applications with high real-time requirements. Cloud storage provides highly scalable capacity along with backup and disaster-recovery functions.

When planning storage capacity, the continuous generation and storage cycles of video data need to be considered. The specific capacity planning depends on the video acquisition frequency, resolution, frame rate, and storage cycle requirements. The following are general steps for storage capacity planning:

  1. Estimate the rate at which video data is generated: estimate the amount of data produced per second from the video's resolution, frame rate and encoding parameters.
  2. Calculate storage requirements: multiply the per-second data volume by the retention period. For example, to store a week of video, multiply the data per second by the number of seconds in a week.
  3. Consider data compression and optimization: use video coding to compress data and reduce storage requirements; different coding standards and parameters affect the resulting capacity.
  4. Select the appropriate storage medium: choose a medium that matches the storage requirements and actual conditions. Hard drives and SSDs are the usual choices and can be expanded and backed up as needed.
  5. Implement a storage management strategy: regularly clean up and archive video data that is no longer needed to free space, and tune storage management based on actual capacity usage.

Through appropriate network transmission and storage strategies, the stable transmission and reliable storage of video data can be ensured, providing a high-quality data basis for subsequent data analysis and processing.

Edge Computing and Box Devices

Edge computing is an important link when implementing end-to-end AI solutions. Edge computing utilizes edge devices (such as boxes) close to data sources for data processing and analysis to reduce data transmission and response time.

Box Hardware Requirements and Options

Choosing the right box hardware is critical to enabling efficient edge computing. Here are some key box hardware requirements and selection points:

  1. Processor and memory : Select a processor with sufficient computing power and memory capacity to support complex algorithms and model inference. Common processors include ARM, Intel Core, and AMD Ryzen, among others. The memory capacity should be determined according to actual needs.
  2. Storage device : Choose a storage device with sufficient capacity and high-speed read and write performance to store and access data. Hard disk and SSD are common choices, and the appropriate capacity and type can be selected according to needs.
  3. Network connection : Make sure the box has a stable network connection for communication and data transfer with other devices. Boxes that support connectivity methods such as Ethernet, Wi-Fi, and cellular networks are common choices.
  4. Temperature and environmental requirements: consider the temperature range, shock resistance and other requirements imposed by the box's deployment environment and application scenario.

When selecting box hardware, factors such as computing power, storage requirements, network connections, and environmental requirements need to be considered comprehensively to meet actual edge computing needs.

Box OS and dependencies

Selecting the appropriate operating system and installing the necessary dependencies are critical to the proper functioning of the box and the deployment of applications.

  1. Operating system : Common box operating systems include Linux (such as Ubuntu, CentOS, etc.) and Windows IoT, etc. Select the appropriate operating system according to the requirements of the application program and the compatibility of the hardware platform.
  2. Dependencies and Libraries : Depending on the needs of the application, install and configure the necessary dependencies and libraries. This might include image processing libraries, machine learning frameworks, network transport libraries, etc. Use package management tools (such as apt, yum, etc.) or install manually to satisfy the application's dependencies.

Ensure the stability and security of the box operating system, and install the required dependencies and libraries according to the needs of the application to ensure the smooth progress of edge computing.

In fact, the flow of data collection information from the device to the box can be simplified as shown in the figure below. The difference lies in whether there is a storage medium, what is the storage medium, and where is it located.

There may be more complex device combinations between the video capture device and the storage device, such as adding a switch in front of the NVR so that the switch can connect to a storage device.

If there is no storage device, the video capture device communicates directly with the edge box.

If there is a storage device, the storage device communicates with the box (a simpler process).

With a storage medium: the camera device stores the collected video data on a medium (such as a hard disk or SSD), and the box then accesses that medium to obtain the video data for subsequent processing. Common storage-medium approaches include:

  • Local storage : The camera device stores video data in a local storage medium, and the box accesses the storage medium on the device through the network or other methods to obtain video data. This method is often used in edge computing scenarios. The box can obtain data through network transmission or direct access to storage devices.
  • Central storage: The camera device stores video data on a central server or network storage device, and the box connects to that central storage over the network to obtain the data. This approach is often used to centrally manage and store video from a large number of cameras, for example with Alibaba Cloud OSS.

Without a storage medium: the camera device transmits the collected video data directly to the box, skipping the intermediate storage step. Common approaches without storage media include:

  • Real-time transmission protocol : The video camera device transmits the real-time video stream to the box through a real-time transmission protocol (such as RTSP, RTMP, etc.). The box can directly receive and process real-time video data. This method is suitable for scenarios that require high real-time performance, such as video surveillance and video conferencing.
  • Streaming media server : The video camera device transmits video data to the streaming media server, and the box obtains video data by accessing the streaming media server. The streaming media server is responsible for receiving, storing and distributing video data, and the box can access the server to obtain data through streaming media protocols (such as HTTP, HLS, etc.).

In the absence of storage media, communication between devices and edge boxes can be achieved in different ways, depending on the device type, protocol and communication interface. Here are some common ways devices communicate with boxes:

  1. Device drivers : For some standardized devices, the operating system usually provides corresponding device drivers. The edge box can use the appropriate device driver to communicate with the device and obtain the data sent by the device. The driver can provide a unified interface so that the device data can be recognized and processed by the box.
  2. Interface protocols and APIs : Certain devices provide specific interface protocols and APIs that allow other devices or applications to communicate with them. Edge boxes can use corresponding protocols and APIs to interact with devices. For example, the camera device may support the RTSP (Real-Time Streaming Protocol) protocol, through which the video stream data of the device can be obtained by the box.
  3. SDK : Some device manufacturers provide specific SDKs for communicating and controlling their devices. These SDKs usually contain the functions, methods, and protocols needed to interact with the device. By using the SDK, the edge box can use the API provided by the device manufacturer to realize the communication with the device. The specific steps to use the SDK may vary depending on the device manufacturer and SDK, and usually involve the following steps:
    • Download and install the SDK: download the SDK for your device and operating system from the device manufacturer's official website or another trusted source, and follow the installation instructions provided.
    • Import the SDK library and header files: in your project, import the library and header files provided by the SDK into your development environment. This usually needs to be set in the project configuration so that the SDK's functions can be compiled and linked correctly.
    • Initialize the device connection: use the functions or methods provided by the SDK to initialize the connection. This may involve specifying connection parameters such as the device's IP address, port number, username and password.
    • Call SDK functions: communicate with the device by calling the functions or methods the SDK provides, including sending commands, receiving data, subscribing to events and more. SDK documentation usually provides a detailed API reference and sample code.
    • Process device data: once the connection is established, use the SDK's functions to obtain the data sent by the device. Follow the SDK's documentation and examples to parse and process the data for subsequent analysis in the edge box.

SDK

The device network SDK is developed on top of the device's private network communication protocol. It is a companion module for network products such as embedded network video recorders, NVRs, video servers, network cameras, network dome cameras, decoders and alarm hosts, used for secondary development of software that remotely accesses and controls devices.

Using the device network SDK, you can use the provided functions and interfaces to realize communication, data transmission and control with the device.

Identify the device and its network SDK: first determine the device you are using and the associated network SDK. Depending on the device type and manufacturer, you can find the corresponding SDK and documentation.

Generally speaking, device SDK interfaces are written in C or C++, though a few platforms such as the Raspberry Pi provide everything from capture devices to storage to compute in Python.

This is because of performance and efficiency: C and C++ are compiled directly to machine code, so they have advantages in performance and resource utilization. For application scenarios that demand high computing performance and efficient resource use, C or C++ is the better fit. Python, in contrast, is an interpreted language that executes line by line at runtime, so it is relatively slow.

SDK development is usually divided into the C/S architecture (Client/Server) and the B/S architecture (Browser/Server).

  • C/S architecture: In the C/S architecture, a direct communication connection is established between the client and the server. SDK is usually used in the development of client applications, providing interfaces and functions for communicating with servers. The client application uses the SDK to perform data exchange, request processing, and other operations with the server. This architecture is suitable for applications that require data processing and interaction on the client side, such as desktop applications, mobile applications, etc.
  • B/S architecture: In the B/S architecture, the client communicates with the server through the browser without installing additional software or SDK on the client. The server provides Web services, and the client interacts with the server by accessing Web pages through a browser. In the B/S architecture, SDK is usually used for server-side development, providing an interface for managing server functions and data. This architecture is suitable for scenarios such as web applications and cloud services.

In addition to C/S architecture and B/S architecture, there are other architecture models that can be developed using SDK, such as distributed architecture, microservice architecture, etc. These architectural patterns divide the system into multiple components or services according to specific requirements and application scenarios, and the SDK can be used to develop the interfaces and functions of these components or services to achieve communication and collaboration between components or services.

Here, "server" may refer to the server side of a device that provides data, functions or request handling. For example, in a video surveillance system the server can be an NVR (Network Video Recorder) or a video streaming server, and the SDK is used to develop client applications that communicate with these servers.

The figure below shows the basic calling process of Hikvision's SDK.
The process in the dashed box is an optional part and will not affect the function usage of other processes and modules.

The basic calling process of Hikvision NVR mainly includes the following modules:

  1. Initialize the SDK: Call the NET_DVR_Init function to initialize the SDK, including operations such as memory pre-allocation.
  2. Set the connection timeout time: set the network connection timeout time in the SDK by calling the NET_DVR_SetConnectTime function.
  3. Set the callback function for receiving exception messages: Use the NET_DVR_SetDVRMessage or NET_DVR_SetExceptionCallBack_V30 function to set the callback function for receiving exception information from modules such as preview, alarm, playback, transparent channel, and voice intercom.
  4. Obtain the device IP address from the resolve server: call the NET_DVR_GetDVRIPByResolveSvr_EX function to obtain the device's IP address from the resolve server, querying by device name, DDNS domain name or serial number.
  5. User registration device: Call the NET_DVR_Login_V30 function to register the user, and the returned user ID will be used as the unique identifier for other functional operations.
  6. Preview module: By calling preview-related functions, the real-time code stream is obtained from the device, decoded and displayed, and the preview function is realized.
  7. Playback and download module: Remotely play back or download the video files of the device by time or file name, and support breakpoint resume function.
  8. Parameter configuration module: set and obtain various parameters of the device, including device parameters, network parameters, channel compression parameters, serial port parameters, alarm parameters, abnormal parameters, transaction information and user configuration, etc.
  9. Remote device maintenance module: realize maintenance functions such as shutting down the device, restarting the device, restoring default values, remote hard disk formatting, remote upgrade, and configuration file import/export.
  10. Voice intercom forwarding module: realize voice intercom and voice data acquisition with the device.
  11. Alarm module: process various alarm signals uploaded by the device.
  12. Transparent channel module: realize the control of the serial device by analyzing the IP datagram and sending it directly to the serial port.
  13. PTZ control module: realize the control of the basic operation of the PTZ, preset point, cruise, pattern scanning and transparent PTZ.

Demonstration

The following is the deployment process for the Hikvision NVR SDK, including downloading the SDK, importing it, configuring IP channel resources, and calling sample code for the playback and download modules. Due to space limitations, the examples below only show part of the process and its code snippets, and assume you already have a CentOS system and a C/C++ development environment.

  1. Download the SDK:
    a. Visit the Hikvision official website (https://www.hikvision.com/) and find the SDK download page.
    b. Select the appropriate SDK version for your needs and download it to your local computer.
  2. Import the SDK:
    a. Unzip the downloaded SDK into your working directory.
    b. Open a terminal and enter the directory where the SDK was unpacked.
  3. Create a new C/C++ project:
    a. Create a new project directory from the terminal:
mkdir my_nvr_project
cd my_nvr_project
  4. Configure the project:
    a. In the project directory, run the following in the terminal (path_to_sdk_directory is the directory from step 2):
cmake path_to_sdk_directory
make
  5. Write the code:
    a. Create a new source file (e.g. main.cpp) in the project directory and edit it.
    b. Write the code in main.cpp, including the necessary header files and function calls. The following sample shows how to use the Hikvision NVR SDK to configure IP channel resources and call the playback and download modules:
#include <iostream>
#include <cstring>
#include "HCNetSDK.h"

int main() {
    // Initialize the SDK
    NET_DVR_Init();

    // Set the connection timeout
    NET_DVR_SetConnectTime(2000, 1);

    // Log in to the device
    NET_DVR_USER_LOGIN_INFO loginInfo = {0};
    NET_DVR_DEVICEINFO_V40 deviceInfo = {0};
    loginInfo.bUseAsynLogin = 0;
    strncpy(loginInfo.sDeviceAddress, "192.168.1.100", NET_DVR_DEV_ADDRESS_MAX_LEN);
    strncpy(loginInfo.sUserName, "admin", NAME_LEN);
    strncpy(loginInfo.sPassword, "password", NAME_LEN);

    LONG userID = NET_DVR_Login_V40(&loginInfo, &deviceInfo);
    if (userID < 0) {
        std::cout << "Failed to login. Error code: " << NET_DVR_GetLastError() << std::endl;
        NET_DVR_Cleanup();
        return -1;
    }

    // Login succeeded; other operations can follow

    // Configure IP channel resources
    NET_DVR_IPPARACFG_V40 ipParaCfg = {0};
    DWORD dwReturned = 0;
    if (!NET_DVR_GetDVRConfig(userID, NET_DVR_GET_IPPARACFG_V40, 0, &ipParaCfg, sizeof(ipParaCfg), &dwReturned)) {
        std::cout << "Failed to get IP channel configuration. Error code: " << NET_DVR_GetLastError() << std::endl;
        NET_DVR_Logout(userID);
        NET_DVR_Cleanup();
        return -1;
    }

    // Modify the IP channel configuration here

    if (!NET_DVR_SetDVRConfig(userID, NET_DVR_SET_IPPARACFG_V40, 0, &ipParaCfg, sizeof(ipParaCfg))) {
        std::cout << "Failed to set IP channel configuration. Error code: " << NET_DVR_GetLastError() << std::endl;
        NET_DVR_Logout(userID);
        NET_DVR_Cleanup();
        return -1;
    }

    // Call the playback and download modules
    NET_DVR_FIND_DATA_V30 findData = {0};
    LONG findHandle = NET_DVR_FindFile_V30(userID, &findData);
    if (findHandle < 0) {
        std::cout << "Failed to find file for playback. Error code: " << NET_DVR_GetLastError() << std::endl;
        NET_DVR_Logout(userID);
        NET_DVR_Cleanup();
        return -1;
    }

    // Perform playback and download operations here

    if (!NET_DVR_StopFindFile(findHandle)) {
        std::cout << "Failed to stop file finding. Error code: " << NET_DVR_GetLastError() << std::endl;
    }

    // Log out of the device
    NET_DVR_Logout(userID);

    // Release SDK resources
    NET_DVR_Cleanup();

    return 0;
}
  6. Build and run the project:
    a. Build the project with the following command in a terminal:
    b. Run the project:
make
./my_nvr_project

The above code example demonstrates the process of using the Hikvision NVR SDK to initialize the SDK, set the connection timeout, log in to the device, configure the IP channel resources, call the playback and download modules, log out of the device, and release the SDK resources.

Note that the code and functionality of the actual project will vary based on requirements. You can call and implement corresponding functions according to the documents and functions provided by the SDK to meet your specific needs.

Origin blog.csdn.net/weixin_42010722/article/details/131574666