Object Detection with YOLO: How to Extract Images of People

YOLO (You Only Look Once) is a popular open-source neural network model for object detection. In this post, we will explain how to use YOLO to detect the people in an image and extract their images.

First, we need to install the dependencies. To do this, we will use the pip package manager and install the following libraries (for the example below, only numpy and opencv-python are strictly required):

pip install numpy
pip install opencv-python
pip install tensorflow
pip install keras

Next, we download the pre-trained YOLO weights and configuration files from the official website. These files can be found at https://pjreddie.com/darknet/yolo/.
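If you prefer to fetch these files from a script, here is a minimal sketch. The URLs below are the commonly published locations of the YOLOv3 weights and config (an assumption; verify them against the official page), and the destination paths should match whatever cv2.dnn.readNet is given later.

import os
import urllib.request

# Commonly published locations of the YOLOv3 files (verify against the official page)
files = {
    "./yolov3.weights": "https://pjreddie.com/media/files/yolov3.weights",
    "./darknet/cfg/yolov3.cfg": "https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg",
}

for path, url in files.items():
    # Create the destination directory if it does not exist yet
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    print(f"Downloading {path} ...")
    urllib.request.urlretrieve(url, path)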

Once we have our weights and configuration files, we can use them to perform object detection on our images.

Here is an example of how to use YOLO to detect people in an image:

import cv2
import numpy as np

# Load YOLO model
net = cv2.dnn.readNet("./yolov3.weights", "./darknet/cfg/yolov3.cfg")

# Define input image
image = cv2.imread("image.jpg")

# Get image dimensions
(height, width) = image.shape[:2]

# Define the neural network input
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

# Perform forward propagation through the output layers
output_layer_names = net.getUnconnectedOutLayersNames()
output_layers = net.forward(output_layer_names)

# Initialize list of detected people
people = []

# Loop over the output layers
for output in output_layers:
    # Loop over the detections
    for detection in output:
        # Extract the class ID and confidence of the current detection
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]

        # Only keep detections with a high confidence
        if class_id == 0 and confidence > 0.5:
            # Object detected
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            # Rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)

            # Add the detection to the list of people
            people.append((x, y, w, h))

# Draw bounding boxes around the people
for (x, y, w, h) in people:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Show the image
cv2.imshow("Image", image)
cv2.waitKey(0)

In this example, we first load the YOLO model using the cv2.dnn.readNet method. Then, we define the input image and get its dimensions.

Next, we prepare the network input by creating a blob from the image: the pixel values are scaled to [0, 1], the image is resized to 416×416, and the channels are swapped from BGR to RGB.

Finally, we perform a forward pass, loop over the output layers to collect the detections, and compute bounding box coordinates for those classified as a person (class ID 0 in the COCO label set, since our interest lies in person detection).

We can then use the cv2.rectangle method to draw a bounding box around each person on the original image.

After running this code, you should see the detected people with bounding boxes drawn around them.

YOLO Object Detection | Source (https://unsplash.com/photos/kY8m5uDIW7Y)
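Since the goal of this post is to extract images of people, each detection can also be cropped out of the original photo and saved as its own file. A minimal sketch (the output file names are just an illustration; run it before drawing the rectangles, or on a clean copy of the image, so the green boxes do not end up in the crops):

# Crop each detected person and save it as a separate image
for i, (x, y, w, h) in enumerate(people):
    # Clamp the box so the crop stays inside the image
    x1, y1 = max(x, 0), max(y, 0)
    x2, y2 = min(x + w, width), min(y + h, height)
    crop = image[y1:y2, x1:x2]
    cv2.imwrite(f"person_{i}.jpg", crop)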

If you want to remove overlapping duplicate boxes, you can use non-maximum suppression (NMS), for example:

cv2.dnn.NMSBoxes(boxes, confidences, score_threshold=0.5, nms_threshold=0.4)
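cv2.dnn.NMSBoxes needs the boxes together with their confidences, so the detection loop above also has to record the confidence of each kept detection. A minimal sketch, assuming the loop fills a boxes list and a confidences list instead of appending to people directly:

# Collected inside the detection loop, instead of people.append((x, y, w, h)):
#     boxes.append([x, y, w, h])
#     confidences.append(float(confidence))

# Non-maximum suppression returns the indices of the boxes to keep
indices = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold=0.5, nms_threshold=0.4)
people = [tuple(boxes[i]) for i in np.array(indices).flatten()]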

Additionally, a label can be added to each detected rectangle:

cv2.putText(image, label, (x, y + 30), font, 2, color, 3)
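In the call above, label, font, and color are not defined; since only the person class is kept in this example, they can simply be set once. A minimal sketch:

# Values chosen for illustration; only the "person" class is kept here
label = "person"
font = cv2.FONT_HERSHEY_SIMPLEX
color = (0, 255, 0)  # same green as the bounding boxes

for (x, y, w, h) in people:
    cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
    cv2.putText(image, label, (x, y + 30), font, 2, color, 3)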

The output is:

YOLO object detection (labeled objects) | Source (https://unsplash.com/photos/kY8m5uDIW7Y)

Now you can easily count the number of people in an image.
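For example, the count is simply the length of the list of detections that survive filtering:

print(f"Number of people detected: {len(people)}")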

☆ END ☆
