Real-time object detection with deep learning, OpenCV, and Python

Running deep learning object detection on real-time video streams with OpenCV and Python is straightforward: we just need to combine the right pieces of code, grab the live video, and then apply the original object detector to each frame.

This article is divided into two parts. In the first part, we will learn how to extend the original deep learning object detection project, broadening its scope from single images to real-time video streams and video files. This will be accomplished with the VideoStream class.

Deep Learning Object Detection Tutorial: http://www.pyimagesearch.com/2017/09/11/object-detection-with-deep-learning-and-opencv/

VideoStream class Tutorial: http://www.pyimagesearch.com/2016/01/04/unifying-picamera-and-cv2-videocapture-into-a-single-class-with-opencv/

Now we will apply the deep learning object detection code to video streams, while measuring the processing speed in frames per second (FPS).

Object detection in video with deep learning and OpenCV

To build a real-time deep learning object detector with OpenCV, we need to efficiently access the camera/video stream and apply object detection to each frame.
First, open a new file, name it real_time_object_detection.py, and add the following code:

We begin by importing packages on lines 2-8. Before that, you need imutils and OpenCV 3.3. For your system setup, installing OpenCV with the default settings is enough (and make sure you follow all of the Python virtual environment commands).
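
As a rough sketch, the imports described above might look like the following (the line numbers cited in the prose refer to the original tutorial's script, not to these snippets):

# import the necessary packages
from imutils.video import VideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2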

Note: Make sure you download and install OpenCV 3.3 (or later) and the matching OpenCV-contrib version, which ensures you have the deep neural network (dnn) module.

Next, we parse the command-line arguments:

Compared with the previous object detection project, we no longer need an image argument, since here we are processing video streams and videos. Everything else stays the same except for the following arguments (a sketch follows the list):

--prototxt: path to the Caffe prototxt file.

--model: path to the pre-trained Caffe model.

--confidence: minimum probability threshold for filtering weak detections; the default value is 20%.
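
A sketch of the argument parsing might look like this (the flag names and the 0.2 default follow the descriptions above):

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to the Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to the Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())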

Then we initialize the class list and the color set:

On lines 22-26, we initialize the CLASSES labels and the corresponding random COLORS. For more information about these classes (as well as how the network was trained), please refer to: http://www.pyimagesearch.com/2017/09/11/object-detection-with-deep-learning-and-opencv/
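
The MobileNet SSD used in the linked tutorial was trained on the 20 PASCAL VOC classes plus a background class, so the initialization could look roughly like this:

# initialize the list of class labels the network was trained to detect,
# then generate a random bounding box color for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
    "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
    "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
    "sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))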

Now we load our model and set up our video stream:

We load our serialized model, providing references to our prototxt and model files (line 30); as you can see, this is very simple in OpenCV 3.3.

Next, we initialize the video stream (the source can be a video file or a camera). First we start the VideoStream (line 35), then we wait for the camera to warm up (line 36), and finally we start the frames-per-second counter (line 37). The VideoStream and FPS classes are part of the imutils package.
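
A sketch of the model loading and stream initialization described above, assuming the default webcam (src=0) as the source:

# load the serialized Caffe model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream, let the camera sensor warm up,
# and start the FPS counter
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
fps = FPS().start()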

Now let's loop over each frame (if you are concerned about speed, you can skip some frames):

First we read a frame from the video stream (line 43), then resize it (line 44). Since we will need the width and height later, we grab them on line 47. The frame is then converted to a blob for the dnn module (line 48).

Now we set the blob as the input to the neural network (line 52) and pass the input through the net (line 53), which gives us the detections.
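
Putting the last two paragraphs together, the frame-grabbing and forward-pass portion of the loop might look like this (the 300x300 input size and the scale/mean values are the ones commonly used with this MobileNet SSD; treat them as assumptions):

# loop over the frames from the video stream
while True:
    # grab the frame and resize it to a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # grab the frame dimensions and convert the frame to a blob
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
        0.007843, (300, 300), 127.5)

    # pass the blob through the network to obtain the detections
    net.setInput(blob)
    detections = net.forward()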

At this point, we have detected objects in the input frame. Now it is time to look at the confidence values and decide whether to draw a bounding box and label around each object:

We first loop over the detections, remembering that multiple objects can be detected in a single image. We also check the confidence (i.e. probability) of each detection. If the confidence is high enough (above the threshold), we display the prediction as colored text and a bounding box drawn on the image. Let's go through it line by line:

Looping over the detections, we first extract the confidence value (line 59).

If the confidence is above the minimum threshold (line 63), we extract the class label index (line 67) and compute the coordinates of the detected object (line 68).

Then we extract the (x, y) coordinates of the bounding box (line 69), which we will use for drawing the rectangle and the text.

We build a text label containing the CLASS name and the confidence (lines 72, 73).

We also draw a colored rectangle around the object, using the class color and the (x, y) coordinates extracted earlier (lines 74, 75).

In general, we want the label to appear above the rectangle, but if there is no room, we display it slightly below the top of the rectangle (line 76).

Finally, we place the colored text on the frame using the y value we just computed (lines 77, 78).
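
The per-detection filtering and drawing described above could be sketched as follows (still inside the while loop; variable names are illustrative):

    # loop over the detections
    for i in np.arange(0, detections.shape[2]):
        # extract the confidence (i.e., probability) of the prediction
        confidence = detections[0, 0, i, 2]

        # filter out weak detections below the minimum confidence
        if confidence > args["confidence"]:
            # extract the class index and the bounding box coordinates,
            # scaled back to the size of the frame
            idx = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            # build the label, then draw the box and the text on the frame
            label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                COLORS[idx], 2)
            y = startY - 15 if startY - 15 > 15 else startY + 15
            cv2.putText(frame, label, (startX, y),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)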

The remaining steps of the frame capture loop are: (1) display the frame; (2) check for the quit key; (3) update the FPS counter:

This code block is simple and clear: we first display the frame (line 81), then capture a key press (line 82) and check whether the "q" key (for "quit") was pressed. If it was, we exit the frame capture loop (lines 85, 86). Finally, we update the FPS counter (line 89).
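
A sketch of these remaining loop steps:

    # show the output frame and grab any key press
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the "q" key was pressed, break from the loop
    if key == ord("q"):
        break

    # update the FPS counter
    fps.update()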

Once we exit the loop (via the "q" key or the end of the video stream), we handle the following:

When we break out of the loop, we stop the FPS counter (line 92) and print the frames-per-second information to the terminal (lines 93 and 94).

We close the window (line 97) and then stop the video stream (line 98).
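
The cleanup might look like this:

# stop the timer and display the FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()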

If you have made it this far, you are ready to try it with your own webcam. Let's move on to the next section.

Real-time deep learning object detection results

To see the real-time deep learning object detector in action, make sure you use the sample code and pre-trained convolutional neural network from the "Downloads" section. (Please open the original link, go to the "Downloads" section, and enter your e-mail address to get the required code and other materials.)
Open a terminal and execute the following command:
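
The command will look something like this (the prototxt and caffemodel file names below are assumptions based on the MobileNet SSD files distributed with the linked tutorial; adjust the paths to wherever you saved yours):

$ python real_time_object_detection.py \
    --prototxt MobileNetSSD_deploy.prototxt.txt \
    --model MobileNetSSD_deploy.caffemodel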

If OpenCV can access your camera, you will see the output video frames with the detected objects. My object detection results on a sample video are as follows:

Figure 1: Real-time object detection on a movie using deep learning, OpenCV, and Python.

Note that the deep learning object detector can detect not only the person, but also the sofa and the chair the person is sitting next to, all in real time!

Summary

In today's blog post, we learned how to perform real-time object detection using deep learning + OpenCV + video streams. We accomplished this by combining two tutorials:

  1. Object detection with deep learning and OpenCV ( http://www.pyimagesearch.com/2017/09/11/object-detection-with-deep-learning-and-opencv/ )

  2. Efficient, threaded video streams with OpenCV ( http://www.pyimagesearch.com/2016/01/04/unifying-picamera-and-cv2-videocapture-into-a-single-class-with-opencv/ )

The end result is a deep learning object detector that can process video at 6-8 FPS (this of course also depends on the speed of your system).

You can further improve the speed in the following ways:

  1. Skip frames (see the sketch after this list).

  2. Use different MobileNet variants (faster, but with lower accuracy).

  3. Use a quantized variant of SqueezeNet (I have not tested this yet, but I expect it to be faster because of its smaller network footprint).
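
As an illustration of the first idea, frame skipping can be as simple as only running the forward pass on every Nth frame and reusing the last detections in between (a hypothetical sketch, not part of the original code; SKIP_FRAMES is an assumed value):

SKIP_FRAMES = 2          # run the detector on every 2nd frame only
frame_count = 0
detections = None

while True:
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # only run the expensive forward pass every SKIP_FRAMES frames
    if frame_count % SKIP_FRAMES == 0:
        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
            0.007843, (300, 300), 127.5)
        net.setInput(blob)
        detections = net.forward()
    frame_count += 1

    # ...draw the (possibly stale) detections and show the frame as before...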

Author: Jiqizhixin

Source: https://www.jiqizhixin.com/articles/2017-09-21-3

