Before a traditional algorithm or a deep learning model can process images, it first has to acquire them; this is what stream pulling means. There are two common ways to pull a stream: one is to grab it directly with OpenCV, the other is to use FFmpeg. Both methods are introduced below.
1. Pulling the stream directly with OpenCV
OpenCV pulls streams mainly through the VideoCapture class, which provides a complete solution for reading video stream data. Its main functions are as follows:
VideoCapture has three constructors:
- a constructor without any parameters
- a constructor taking a video file path or stream address
- a constructor taking a device (camera) index
The isOpened() function is mainly used to determine whether the stream address was opened successfully.
read() reads one frame of video data.
The release() function releases the class object.
For details, see: https://docs.opencv.org/4.0.0/d8/dfe/classcv_1_1VideoCapture.html
1.1 Pulling the stream in Python
The main process is divided into the following steps:
- Instantiate the VideoCapture class with the stream address
- Check whether the stream address was opened successfully
- Loop over each frame of the stream and process it
- Release the instantiated object
import os
import cv2

def video2img(video_path, save_path):
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        return
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    total_count = cap.get(cv2.CAP_PROP_FRAME_COUNT)
    count = 0
    img_idx = 0
    while True:
        success, frame = cap.read()
        if not success:
            break
        try:
            count += 1
            if count % fps == 0:  # keep roughly one frame per second
                img_idx += 1
                name = os.path.basename(save_path)
                img_path = os.path.join(save_path, '{}_video_{}.jpg'.format(name, img_idx))
                cv2.imwrite(img_path, frame)
                print('finished saving image {}'.format(img_idx))
        except Exception as e:
            print('encountered an error: {}'.format(e))
            continue
    cap.release()
    cv2.destroyAllWindows()
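The `count % fps == 0` test above keeps roughly one frame per second of video. The sampling rule can be isolated in plain Python as a quick sanity check (the function name here is only for illustration):

```python
def sampled_frame_indices(total_frames, fps):
    # Frame numbers kept by the `count % fps == 0` rule:
    # one frame per second of video.
    return [i for i in range(1, total_frames + 1) if i % fps == 0]

# A 100-frame clip at 25 fps yields 4 saved frames.
print(sampled_frame_indices(100, 25))  # [25, 50, 75, 100]
```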
1.2 Pulling the stream with OpenCV in C++
C++ uses OpenCV to pull streams in essentially the same way as Python (whose OpenCV bindings are implemented in C++ under the hood), so the implementation looks like this:
std::string video_path = "rtsp://admin:[email protected]/Streaming/Channels/11000";
cv::VideoCapture cap;
cap.open(video_path);
if (!cap.isOpened()) {
    std::cout << "failed to open the stream" << std::endl;
}
cv::Mat frame;
cv::namedWindow("demo", cv::WINDOW_NORMAL);
while (cap.read(frame))
{
    if (frame.empty()) {
        break;
    }
    int w = frame.size().width;
    int h = frame.size().height;
    printf("h=%i, w=%i\n", h, w);
    // frame.data / frame.step expose the raw buffer; wrapping them in a new
    // cv::Mat (no copy) shows how the pixels could be handed to other code
    unsigned char* buffer = frame.data;
    size_t stride = frame.step;
    cv::Mat img = cv::Mat(h, w, CV_8UC3, (void*)buffer, stride);
    cv::imshow("demo", img);
    cv::waitKey(1); // waitKey(0) would block on every frame
}
cap.release();
cv::destroyAllWindows();
2. FFmpeg stream pulling (C++ implementation)
- Download the FFmpeg package (from the official FFmpeg download page); the blogger used version 5.1.2.
- Configure Visual Studio 2022:
  - Add the include path of the downloaded FFmpeg package under C/C++ -> Additional Include Directories.
  - Add the lib file path of the FFmpeg package under Linker -> Additional Library Directories.
  - Under Linker -> Input -> Additional Dependencies, add the required lib libraries: avcodec.lib avdevice.lib avfilter.lib avformat.lib avutil.lib swresample.lib swscale.lib
- If you do not want to add the FFmpeg bin directory to the PATH environment variable permanently, you can set it temporarily: put Path=D:\ffmpeg\bin;%PATH% in Debug -> Environment.
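The same temporary-PATH trick can also be done from code before loading anything that depends on the FFmpeg DLLs. A minimal sketch, assuming FFmpeg lives at D:\ffmpeg\bin (the path and function name are illustrative):

```python
import os

def prepend_to_path(directory):
    # Prepend a directory to PATH for the current process only,
    # mirroring Path=D:\ffmpeg\bin;%PATH% in Debug -> Environment.
    current = os.environ.get("PATH", "")
    if directory not in current.split(os.pathsep):
        os.environ["PATH"] = directory + os.pathsep + current
    return os.environ["PATH"]

prepend_to_path(r"D:\ffmpeg\bin")  # assumed install location
```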
The blogger encapsulated the stream-pulling code into a class; the main code is as follows.
The ReadFfmpeg.h header is as follows:
#ifndef __FFMPEG_DECODE_H__
#define __FFMPEG_DECODE_H__

// OpenCV
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

extern "C"
{
#include <libavutil/avutil.h>
#include <libavutil/imgutils.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavdevice/avdevice.h>
};

struct VideoFrameDecode {
    void* buffer; // pointer to the frame buffer (RGB format only)
    int pitch;    // width in bytes of one image row
};

class ReadFfmpeg
{
public:
    ReadFfmpeg(char* rtsppath);
    ~ReadFfmpeg();
    void processOneFrame(cv::Mat &img);

private:
    AVFormatContext* formatContext = nullptr;
    int ret = -1;
    int videoStreamIndex = -1;
    AVCodecParameters* codecParameters = nullptr;
    const AVCodec* codec = nullptr;
    AVCodecContext* codecContext = nullptr;
    AVPacket packet;
    AVFrame* pFrameRGB = nullptr;
    uint8_t* buffer = nullptr;
    SwsContext* sws_ctx = nullptr;
};
#endif
The implementation in ReadFfmpeg.cpp is as follows:
#include "ReadFfmpeg.h"
#include <iostream>
#include <chrono>
#include <thread>

using namespace std;

ReadFfmpeg::ReadFfmpeg(char* rtsppath)
{
    avformat_network_init();
    AVDictionary* formatOptions = nullptr;
    av_dict_set_int(&formatOptions, "buffer_size", 2 << 20, 0);
    // UDP is the default transport and tends to produce "max delay reached.
    // need to consume packet" errors, so use TCP instead
    av_dict_set(&formatOptions, "rtsp_transport", "tcp", 0);
    av_dict_set_int(&formatOptions, "timeout", 5000000, 0);

    formatContext = avformat_alloc_context();
    ret = avformat_open_input(&formatContext, rtsppath, nullptr, &formatOptions);
    if (ret != 0) {
        std::cerr << "Failed to open RTSP stream." << std::endl;
    }
    ret = avformat_find_stream_info(formatContext, nullptr);
    if (ret < 0) {
        std::cerr << "Failed to find stream info." << std::endl;
    }
    for (unsigned int i = 0; i < formatContext->nb_streams; ++i) {
        if (formatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            videoStreamIndex = i;
            break;
        }
    }
    if (videoStreamIndex == -1) {
        std::cerr << "Failed to find video stream." << std::endl;
    }
    codecParameters = formatContext->streams[videoStreamIndex]->codecpar;
    codec = avcodec_find_decoder(codecParameters->codec_id);
    if (codec == nullptr) {
        std::cerr << "Failed to find video decoder." << std::endl;
    }
    codecContext = avcodec_alloc_context3(codec);
    if (avcodec_parameters_to_context(codecContext, codecParameters) < 0) {
        std::cerr << "Failed to fill codec context." << std::endl;
    }
    ret = avcodec_open2(codecContext, codec, nullptr);
    if (ret < 0) {
        std::cerr << "Failed to open codec." << std::endl;
    }
    pFrameRGB = av_frame_alloc();
    buffer = (uint8_t*)av_malloc(av_image_get_buffer_size(AV_PIX_FMT_RGB24, codecContext->width, codecContext->height, 1));
    av_image_fill_arrays(pFrameRGB->data, pFrameRGB->linesize, buffer, AV_PIX_FMT_RGB24, codecContext->width, codecContext->height, 1);
    sws_ctx = sws_getContext(codecContext->width, codecContext->height, codecContext->pix_fmt,
        codecContext->width, codecContext->height, AV_PIX_FMT_RGB24,
        SWS_BILINEAR, nullptr, nullptr, nullptr);
}
ReadFfmpeg::~ReadFfmpeg()
{
    av_free(buffer);
    av_frame_free(&pFrameRGB);
    sws_freeContext(sws_ctx);
    avcodec_free_context(&codecContext);
    // codecParameters belongs to formatContext and must not be freed separately
    avformat_close_input(&formatContext);
    avformat_network_deinit();
}
void ReadFfmpeg::processOneFrame(cv::Mat& img)
{
    if (img.empty())
    {
        img = cv::Mat(codecContext->height, codecContext->width, CV_8UC3);
    }
    int ret = av_read_frame(formatContext, &packet);
    if (ret >= 0) {
        if (packet.stream_index == videoStreamIndex) {
            avcodec_send_packet(codecContext, &packet);
            AVFrame* avFrame = av_frame_alloc();
            int res = avcodec_receive_frame(codecContext, avFrame);
            if (res == 0) {
                // Convert the decoded frame to RGB
                sws_scale(sws_ctx, avFrame->data, avFrame->linesize, 0, codecContext->height, pFrameRGB->data, pFrameRGB->linesize);
                // Copy into img instead of re-pointing img.data, so cv::Mat
                // keeps ownership of its own buffer
                cv::Mat(codecContext->height, codecContext->width, CV_8UC3,
                    pFrameRGB->data[0], pFrameRGB->linesize[0]).copyTo(img);
            }
            av_frame_free(&avFrame);
        }
        av_packet_unref(&packet);
    }
}
void test() {
    char* filename = (char*)"rtsp://admin:[email protected]:10000/Streaming/Channels/10000";
    ReadFfmpeg* fmpeg = new ReadFfmpeg(filename);
    cv::Mat img;
    int nFrame = 0;
    cv::namedWindow("RTSP Stream", cv::WINDOW_NORMAL);
    auto start = std::chrono::system_clock::now();
    for (;;)
    {
        nFrame++;
        fmpeg->processOneFrame(img);
        if (nFrame % 100 == 0) {
            nFrame = 0;
            auto end = std::chrono::system_clock::now();
            auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
            std::cout << "the fps is: " << static_cast<float>(100 / (duration.count() / 1000.0)) << std::endl;
            start = end;
        }
        // Display the frame; press Esc to quit
        cv::imshow("RTSP Stream", img);
        if (cv::waitKey(1) == 27) {
            break;
        }
    }
    delete fmpeg;
}
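The fps measurement in test() simply times a batch of 100 frames and divides. The same idea in a small self-contained Python sketch, where a dummy per-frame sleep stands in for processOneFrame():

```python
import time

def measure_fps(process_frame, n_frames=50):
    # Time a batch of frames and divide, as the C++ test() does every 100 frames.
    start = time.monotonic()
    for _ in range(n_frames):
        process_frame()
    elapsed = time.monotonic() - start
    return n_frames / elapsed

fps = measure_fps(lambda: time.sleep(0.001))
print(round(fps))  # bounded above by 1000, since each "frame" takes >= 1 ms
```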
The above is a very simple stream-pulling demo and is only suitable for illustrating how to read a stream. To achieve real-time pulling and processing, you need multithreading: one thread reads frames from the stream and puts them into a queue, while another thread takes frames out of the queue and processes them, with both running concurrently.
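That reader/processor split can be sketched with Python's standard threading and queue modules; here the "frames" are simulated integers standing in for decoded images, and doubling them stands in for the processing step:

```python
import queue
import threading

def reader(frame_queue, n_frames=10):
    # Reader thread: in real code this would wrap cap.read() or
    # processOneFrame() and push each decoded frame.
    for i in range(n_frames):
        frame_queue.put(i)
    frame_queue.put(None)  # sentinel: stream finished

def processor(frame_queue, results):
    # Processing thread: pop frames off the queue and run the algorithm.
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        results.append(frame * 2)  # stand-in for detection/processing

frame_queue = queue.Queue(maxsize=25)  # bounded, so the reader cannot outrun processing
results = []
threads = [threading.Thread(target=reader, args=(frame_queue,)),
           threading.Thread(target=processor, args=(frame_queue, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

A bounded queue is the important design choice: if processing is slower than decoding, put() blocks the reader instead of letting frames pile up in memory.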
Appendix
In fact, OpenCV itself can also pull streams through FFmpeg, but OpenCV must be compiled with FFmpeg support enabled for this to work.