Network real-time video surveillance based on Socket, OpenCV and MFC



1 Summary

OpenCV is a well-known cross-platform computer vision library that is used in many fields, yet it is rarely applied to network video surveillance. This project combines OpenCV with sockets and uses the TCP protocol to transmit images in real time from a client to a server, with MFC providing the user interface.

This article first introduces the design and implementation of the client and the server, covering OpenCV image processing and Socket network connection. It then describes the MFC interface design of the project and the test results under local loopback and over a local area network. Finally, it analyzes the advantages and disadvantages of the project.

2 Overview

The project is divided into two parts, the client and the server; its overall framework is shown in the figure. The client captures images in real time and displays them in an MFC Picture Control, then compresses and encodes each image with OpenCV and sends the resulting data stream to the server through a socket. The server receives the data stream, decodes it, and displays the result.
The overall framework of this project
The following sections explain the client, the server, and the MFC interface design in turn.

3 Client design and implementation

The main function of the client is to obtain the image stream from the camera, compress and encode it, initiate a TCP connection, and send the data stream. The following subsections explain the implementation in terms of OpenCV image processing and Socket network communication.

3.1 OpenCV image processing

The flow chart of OpenCV image processing on the client side is shown in the figure.
Flow chart of client image processing part

3.1.1 cv::VideoCapture class

The cv::VideoCapture class can read image streams from cameras as well as from video files. In either case a VideoCapture object must be created first; its member function isOpened() then reports whether the source was opened successfully, returning true on success.

To read from a video file (.mpg or .avi format), the cv::VideoCapture object is constructed with the file name:

// Read from a video file
cv::VideoCapture capture(const string& filename);
cv::VideoCapture capture("path");

To read real-time video from a camera, the cv::VideoCapture object is constructed with a device identifier:

// Read from a camera
cv::VideoCapture capture(int device);

Here, device is the identifier (index) of the camera. If there is only one camera, device is 0. If the computer has several cameras, the indices increase from there (1, 2, and so on).

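The following code opens the default camera, reads the image stream from it frame by frame into a cv::Mat named image, and displays each frame:

#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    Mat image;
    VideoCapture capture;
    capture.open(0);                 // open the default camera once, before the loop
    while (capture.isOpened()) {
        capture >> image;            // read the next frame
        if (image.empty()) break;    // stop if no frame was delivered
        imshow("camera", image);     // imshow() needs a window name; "camera" is arbitrary
        waitKey(3);                  // give the window 3 ms to process events
    }
    return 0;
}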

3.1.2 Image Compression Coding

Many image transmission schemes send raw pixels, so the amount of data per image is proportional to the image resolution and the number of channels. Taking a resolution of 640×480 as an example, a single-channel grayscale image takes 307,200 bytes (640 × 480 × 1), and a three-channel RGB image takes 921,600 bytes (640 × 480 × 3). Sending more than 900,000 bytes for every image is a considerable waste of network resources, and pixel-based transmission of this kind easily causes the video to freeze. This project therefore uses two OpenCV functions, imencode() and imdecode(), to compress the pixels into an encoded byte stream before sending and to decode that stream after receiving.
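The byte counts above follow directly from width × height × channels for 8-bit pixels. As a minimal sketch, the two helper functions below (hypothetical names, not part of the project) reproduce that arithmetic; the frame rate passed to the second one is an assumption, since the source does not state one.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to send one uncompressed frame with 8 bits per channel.
std::size_t rawFrameBytes(std::size_t width, std::size_t height, std::size_t channels) {
    assert(channels >= 1);  // at least one channel (grayscale)
    return width * height * channels;
}

// Bytes per second if every frame is sent uncompressed.
// framesPerSecond is an assumed value; the source does not give a frame rate.
std::size_t rawStreamBytesPerSecond(std::size_t frameBytes, std::size_t framesPerSecond) {
    return frameBytes * framesPerSecond;
}
```

For example, rawFrameBytes(640, 480, 3) gives 921600, and at an assumed 25 frames per second the uncompressed stream would need about 23 MB per second.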

The definition format of imencode() is:

bool imencode(
    const String& ext,
    InputArray img,
    vector<uchar>& buf,
    const vector<int>& params = vector<int>());

The explanation of each parameter is shown in the table.

Parameter   Explanation
---------   -----------
ext         Extension that defines the output file format
img         Image to be encoded
buf         Output buffer
params      Encoding format parameters and compression rate

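In this project, each image is encoded in jpg format with a compression rate of 50%. For JPEG, OpenCV exposes this setting through the IMWRITE_JPEG_QUALITY entry of params (a value from 0 to 100), so the setting here presumably corresponds to a value of 50.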

3.2 Socket network communication

In terms of network communication, the client of this project adopts the custom ClientSocket class inherited from the CAsyncSocket class. The program flow chart is shown in the figure below.
The program flow chart of the client Socket network communication part
The user enters the destination IP address and port number in the MFC interface and clicks the "Connect" button. The ClientSocket object then calls Create() to create a socket and Connect() to initiate a TCP connection; once the connection succeeds, Send() sends the data stream to the server. When the user clicks the "Disconnect" button, Close() closes the socket and capture.release() releases the camera.
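Because TCP delivers a continuous byte stream without message boundaries, the receiver needs some way to tell where one encoded image ends and the next begins. The source does not say how this project marks those boundaries, so the following is only a minimal sketch of one common approach: a hypothetical helper, packFrame(), that prepends a 4-byte big-endian length to the encoded buffer before it is handed to Send().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the project): build one packet consisting of
// a 4-byte big-endian length header followed by the encoded image bytes.
std::vector<std::uint8_t> packFrame(const std::vector<std::uint8_t>& encoded) {
    assert(encoded.size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
    const std::uint32_t n = static_cast<std::uint32_t>(encoded.size());

    std::vector<std::uint8_t> packet;
    packet.reserve(4 + encoded.size());
    packet.push_back(static_cast<std::uint8_t>((n >> 24) & 0xFF));  // most significant byte first
    packet.push_back(static_cast<std::uint8_t>((n >> 16) & 0xFF));
    packet.push_back(static_cast<std::uint8_t>((n >> 8) & 0xFF));
    packet.push_back(static_cast<std::uint8_t>(n & 0xFF));
    packet.insert(packet.end(), encoded.begin(), encoded.end());    // payload follows the header
    return packet;
}
```

In this sketch the buffer filled by imencode() would be passed to packFrame(), and the returned packet would be what the client sends. A fixed-size header keeps the receiver simple, at the cost of 4 extra bytes per image.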

4 Design and implementation of server side

The server first establishes the TCP connection with the client, then receives the data stream from the client, and finally decodes and displays the real-time image. The following subsections introduce the server in terms of Socket network communication and OpenCV image processing.

4.1 Socket network communication

For server-side network communication, this project uses two custom classes derived from the CAsyncSocket class: ListenSocket, which listens for incoming connections, and ServerSocket, which serves the established connection.

The program flow chart of the Socket network communication part of the server is shown in the figure.
The program flow chart of the server-side Socket network communication part
Clicking "Local Loopback Test" sets the IP address to 127.0.0.1, and clicking "Get Local IP" automatically obtains and sets the IP address of the machine. To obtain the local IP, the project includes the winsock2.h header, calls gethostname() to get the local host name, and then passes that name to gethostbyname() to look up the IP address. Once set, the IP address is displayed in the MFC IP control.

After setting the IP address and port number, the user clicks "Open Monitoring", and the ListenSocket object calls Create() to create a listening socket. Listen() puts it into the listening state, where it waits for a connection request from the client. If there is no request, it keeps listening. If a request arrives, OnAccept() accepts it and the TCP connection is established. The ServerSocket object then calls Create() to create a service socket, and OnReceive() receives the data stream from the client. As long as the client stays connected, the server keeps receiving the real-time image data stream. If the client disconnects, OnClose() closes the connection.
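A single receive call on a TCP socket may deliver only part of an encoded image, or the end of one image together with the start of the next, so the bytes usually have to be reassembled before decoding. The source does not describe how the project handles this. The sketch below assumes a hypothetical convention in which the client prefixes every encoded image with a 4-byte big-endian length, and shows a small FrameAssembler class (an illustrative name) that collects received bytes and hands out complete images.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reassembly buffer for a length-prefixed stream:
// [4-byte big-endian length][payload][4-byte length][payload]...
class FrameAssembler {
public:
    // Append the bytes delivered by one receive call.
    void feed(const std::uint8_t* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        buffer_.insert(buffer_.end(), data, data + len);
    }

    // Move one complete payload into `frame`.
    // Returns false if the buffer does not yet hold a whole frame.
    bool nextFrame(std::vector<std::uint8_t>& frame) {
        if (buffer_.size() < 4) return false;               // header incomplete
        const std::size_t n = (static_cast<std::size_t>(buffer_[0]) << 24) |
                              (static_cast<std::size_t>(buffer_[1]) << 16) |
                              (static_cast<std::size_t>(buffer_[2]) << 8) |
                               static_cast<std::size_t>(buffer_[3]);
        if (buffer_.size() - 4 < n) return false;            // payload incomplete
        const auto first = buffer_.begin() + 4;
        const auto last = first + static_cast<std::ptrdiff_t>(n);
        frame.assign(first, last);                           // copy out the payload
        buffer_.erase(buffer_.begin(), last);                // drop header and payload
        return true;
    }

private:
    std::vector<std::uint8_t> buffer_;  // bytes received but not yet consumed
};
```

In this sketch, the receive handler would call feed() with whatever bytes arrived and then call nextFrame() in a loop, passing each complete payload on to imdecode().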

If the server disconnects actively, that is, the user clicks "Disconnect" while a TCP connection is established, the ListenSocket and ServerSocket objects call Close() to close their sockets and disconnect from the client.

4.2 OpenCV image processing

The image processing part on the server side is very simple. The received data stream is decoded by imdecode(), and then the decoded Mat type image is displayed in the window in real time.

imdecode() is the image decoding function in OpenCV and is used as the counterpart of imencode(). The compression function imencode() was introduced earlier; imdecode() is introduced here.
imdecode() has two overloads:

Mat imdecode(InputArray buf, int flags)
Mat imdecode(InputArray buf, int flags, Mat* dst)

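  • buf: memory buffer that holds the encoded (compressed) image data to be decoded;
  • flags: color mode of the decoded image; the commonly used CV_LOAD_IMAGE_COLOR yields a three-channel color image and CV_LOAD_IMAGE_GRAYSCALE a single-channel grayscale image (newer OpenCV versions name these IMREAD_COLOR and IMREAD_GRAYSCALE);
  • dst: optional output placeholder for the decoded matrix; if omitted, it is NULL.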

5 Design and analysis of MFC interface

5.1 Introduction to MFC

The user interface of this project uses MFC (Microsoft Foundation Classes), an application framework similar to VCL. It provides a set of common, reusable classes for developers, most of which are derived directly or indirectly from CObject.

An MFC application usually consists of several classes that the developer derives from MFC classes, plus a CWinApp class object (the application object). The MFC AppWizard can generate this framework automatically. In Windows applications, the main MFC include file is afxwin.h.

5.2 MFC interface of client and server

Client's MFC interface:
Client-side MFC interface

MFC interface on the server side:
Server-side MFC interface

5.3 Detailed analysis

To improve the practicality and robustness of the program, three details of the MFC interface design are described below.

First, images are displayed with OpenCV's imshow(), which by default opens a separate window; this is unsightly and inconvenient to use. This project therefore binds the OpenCV window to the MFC Picture control so that the image display is embedded in the main window. The relevant code block is:

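// Bind the MFC control IDC_PIC1 to the OpenCV window "src1"
CRect rect1;
CWnd* pWnd1 = GetDlgItem(IDC_PIC1);
pWnd1->GetClientRect(&rect1);                               // size of the Picture control
namedWindow("src1", WINDOW_AUTOSIZE);                       // create the OpenCV window
resizeWindow("src1", Size(rect1.Width(), rect1.Height()));  // match the control's size
HWND hWnd1 = (HWND)cvGetWindowHandle("src1");               // native handle of the OpenCV window
HWND hParent1 = ::GetParent(hWnd1);                         // its original top-level frame
::SetParent(hWnd1, GetDlgItem(IDC_PIC1)->m_hWnd);           // re-parent it into the Picture control
::ShowWindow(hParent1, SW_HIDE);                            // hide the now-empty frame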

Second, to reflect the current running status of the program in real time, a list control is added to the interface. For example, if the user clicks the "Connect" button while the IP address or port number is empty, the list shows "IP address is empty, please enter the target IP address"; when the camera is opened, it shows "The camera is successfully opened"; and when the server starts listening, it outputs the address and port number together with "waiting for user connection".
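The kind of check behind such prompts can be sketched in plain C++. The two functions below are hypothetical (the source does not show its validation code): parsePort() accepts only a decimal port between 1 and 65535, and checkConnectInput() returns the text to append to the list control, or an empty string when the input looks usable. Only the empty-IP message comes from the article; the other two messages are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: parse a decimal port number, returning -1 if invalid.
int parsePort(const std::string& text) {
    if (text.empty() || text.size() > 5) return -1;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return -1;     // digits only
        value = value * 10 + (c - '0');
    }
    assert(value >= 0 && value <= 99999);       // at most five decimal digits
    return (value >= 1 && value <= 65535) ? value : -1;
}

// Hypothetical helper: status message for the list control; empty means "input is fine".
std::string checkConnectInput(const std::string& ip, const std::string& port) {
    if (ip.empty()) {
        return "IP address is empty, please enter the target IP address";
    }
    if (port.empty()) {
        return "Port number is empty, please enter the target port number";  // wording assumed
    }
    if (parsePort(port) < 0) {
        return "Port number is invalid, please enter a value from 1 to 65535";  // wording assumed
    }
    return "";
}
```

In an MFC handler, the "Connect" button code would call checkConnectInput() first and only proceed to Create() and Connect() when the returned string is empty.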

Third, to improve robustness and prevent program errors, this project calls EnableWindow(FALSE/TRUE) on the control variables to disable or enable controls, which strictly limits what the user can input. Take the client interface as an example. In the initial state, the user can enter the IP address and port number and click the "Connect" button, while the "Disconnect" button is grayed out and cannot be clicked. If the IP address or port number is empty, the list shows "The IP address is empty, please enter the target IP address". After the TCP connection is initiated, the IP control, the port number control and the "Connect" button are disabled, and the "Disconnect" button becomes valid and can be clicked.
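The enable/disable rule can be written down as a small table from connection state to control state. The sketch below is a plain C++ model with hypothetical names (LinkState, ControlStates, controlsFor), assuming that once connected the input fields and the "Connect" button are grayed out and only "Disconnect" stays clickable; in the real dialog each boolean would be passed to EnableWindow() on the matching control variable.

```cpp
#include <cassert>

// Hypothetical model of the client's connection state.
enum class LinkState { Idle, Connected };

// Whether each client control should be enabled (true) or grayed out (false).
struct ControlStates {
    bool ipField;
    bool portField;
    bool connectButton;
    bool disconnectButton;
};

// Assumed rule: inputs and "Connect" are usable only while idle,
// "Disconnect" only while connected.
ControlStates controlsFor(LinkState state) {
    const bool connected = (state == LinkState::Connected);
    const ControlStates s{!connected, !connected, !connected, connected};
    assert(s.connectButton != s.disconnectButton);  // exactly one of the two buttons is active
    return s;
}
```

Keeping the rule in one function means that every state change can refresh all four controls together, which avoids leaving the dialog in a mixed state.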

6 Program testing

6.1 Local loopback test

On the server side, click "Local Loopback Test", which sets the IP address to "127.0.0.1". Once the TCP connection is established, real-time video transmission works normally and the basic functions of network monitoring are achieved. The test results at a certain moment are shown in the figure.
Test results at a certain moment in the local loopback

6.2 Test of two PCs under LAN

Connect two computers to the same Wi-Fi network to form a wireless local area network.

One computer runs the server-side program and the other runs the client-side program. Clicking "Get local IP" on the server side yields the IP address "192.168.0.103". After the TCP connection with the client PC is established, the test shows that real-time video transmission works normally, that is, the basic functions of network monitoring are also achieved over the LAN. The test results at a certain moment are:

Under the LAN, the client-side test results at a certain moment:
Client test results under LAN
Under the LAN, the server-side test results at a certain moment:
Server-side test results under LAN

7 Analysis of advantages and disadvantages

7.1 Advantages of this project

Overall, the biggest innovation of this project lies in the realization of the combination of OpenCV image processing and Socket network communication. Therefore, the advantages of this project basically come from the advantages of OpenCV.

First, in the preprocessing before image transmission, the image encoding and decoding functions of OpenCV replace the traditional pixel-based image transmission, which greatly reduces the amount of transmitted data, frees network resources, and makes the real-time video smoother.

Second, since OpenCV has a rich library of image analysis, image processing and machine vision functions, this project provides a very convenient OpenCV interface for secondary development toward intelligent image transmission. For example, after the client image is collected, the OpenCV face recognition functions could be applied and combined with a personnel information database to quickly develop a network video surveillance system that registers the information of people passing by in real time, suitable for clocking in at work or for the entrances and exits of a venue.

7.2 Room for improvement

Due to the limited programming ability of the author, there are still many defects in this project that need to be improved.

First, when transmitting video, the receiving end (the server) may occasionally show a blurred or garbled picture. The author attributes this to the project's use of the TCP network transmission protocol.
The TCP protocol provides connection-oriented, reliable transmission and is usually used in scenarios such as file transfer, text transfer, and e-mail, which demand high accuracy but have low real-time requirements. The UDP protocol provides connectionless, unreliable transmission and is usually used where the data volume is large but accuracy requirements are lower, such as watching videos on websites and multiplayer online games.

Through the comparison of TCP and UDP, we found that UDP protocol is more suitable for network video surveillance.

Second, the project failed to achieve two-way video transmission. If two-way video and audio transmission is realized, and video and audio synchronization and multi-threading technology are added, network video calls can be realized.

Origin blog.csdn.net/ZBC010/article/details/127142195