YOLOv5 Vehicle Ranging in Practice: Estimating Vehicle Distance with Object Detection

Today we look at using YOLOv5 object detection for vehicle ranging. I believe most of you already know YOLOv5, a fast and accurate object detection algorithm, so let's discuss how to use it to estimate the distance to vehicles. This practice is divided into the following steps:

  1. Install required libraries and tools
  2. Data preparation
  3. Model training
  4. Distance estimation
  5. Visualize the results
  6. Optimization

1. Install required libraries and tools

First, make sure that YOLOv5's dependencies are installed. We use Python as the development language and need libraries such as PyTorch, torchvision, and OpenCV. They can be installed with the following command:

pip install torch torchvision opencv-python

Next, we need to clone the official GitHub repository of YOLOv5 and enter the project directory:

git clone https://github.com/ultralytics/yolov5.git
cd yolov5
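
Inside the repository, you can also install the dependency versions pinned by the project, since YOLOv5 ships a requirements.txt:

pip install -r requirements.txt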

2. Data preparation

In this exercise, we use a dataset containing vehicle images and their corresponding labels. To train YOLOv5, the dataset must be converted into the format YOLOv5 expects: the label information for each image goes into a separate txt file.

The dataset should be organized according to the following structure:

dataset/
├── images/
│   ├── train/
│   └── val/
└── labels/
    ├── train/
    └── val/

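Each label txt file contains one line per object in the normalized YOLO format class x_center y_center width height, with all values scaled to [0, 1]. As a minimal sketch (assuming your original annotations are pixel-coordinate corner boxes), the conversion could look like this:

def to_yolo_format(class_id, x1, y1, x2, y2, img_w, img_h):
    # Convert a pixel-coordinate corner box to a normalized YOLO label line
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a vehicle occupying pixels (100, 200) to (400, 380) in a 1280x720 image
print(to_yolo_format(0, 100, 200, 400, 380, 1280, 720))
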
Make sure the dataset is ready before moving on to the next step.

3. Model training

First, we need to configure YOLOv5 for our task. Find the model configuration file models/yolov5s.yaml in the project directory and adjust the number of classes (nc) to match your dataset; training parameters such as the number of epochs are passed on the command line in the next step. You also need a dataset.yaml that describes the training data.
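
A minimal dataset.yaml matching the directory layout from step 2 might look like this (the single vehicle class is an assumption; adapt it to your data):

# dataset.yaml
train: dataset/images/train
val: dataset/images/val

nc: 1               # number of classes
names: ['vehicle']  # class names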

Next, we start training the YOLOv5 model. Run the following command in the terminal:

python train.py --img 640 --batch 16 --epochs 100 --data dataset.yaml --cfg yolov5s.yaml --weights yolov5s.pt --name yolov5s_vehicle

After training completes, the best model weights are saved to runs/train/yolov5s_vehicle/weights/best.pt.

4. Distance estimation

We will use the following pinhole-camera formula to estimate the distance:

Distance = (actual width of the known object × focal length) / pixel width of the object in the image
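
For example, if a vehicle with an actual width of 1.8 m appears 300 px wide in the image and the focal length is 700 px, the estimated distance is (1.8 × 700) / 300 = 4.2 m.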

In practical applications, we first need to determine the camera's focal length using an object at a known distance. Since the focal length is fixed, once obtained it can be used to calculate the distance of other vehicles from the camera.
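
Rearranging the formula gives: focal length = (pixel width × known distance) / actual width. A minimal one-time calibration sketch (all measurement values below are placeholders):

# One-time focal length calibration from a reference measurement.
# All numbers are placeholder values; measure them for your own setup.
ref_pixel_width = 300   # measured pixel width of the reference vehicle in the image
ref_distance = 4.2      # measured real distance from camera to vehicle, in meters
ref_actual_width = 1.8  # measured real width of the vehicle, in meters

focal_length = (ref_pixel_width * ref_distance) / ref_actual_width
print(f"Calibrated focal length: {focal_length:.1f} px")  # 700.0 px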

First, we need to load the trained YOLOv5 model:

import torch
from pathlib import Path

model = torch.hub.load('ultralytics/yolov5', 'custom', path=Path('runs/train/yolov5s_vehicle/weights/best.pt'))
model.conf = 0.25  # confidence threshold

Next, we define a function for computing distances:

import cv2

def estimate_distance(image_path, focal_length, known_width):
    image = cv2.imread(image_path)
    results = model(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # YOLOv5 detection (the hub model expects RGB input)

    for *box, conf, cls in results.xyxy[0]:
        x1, y1, x2, y2 = box
        width = x2 - x1  # bounding-box width in pixels
        distance = (known_width * focal_length) / width
        print(f"Estimated distance: {distance:.2f} m")

Note that the values of focal_length and known_width must be obtained in advance. If you know an object's actual width, its pixel width in the image, and its measured distance from the camera, you can compute the camera's focal length exactly as in the calibration sketch above.

5. Visualize the results

We can visualize the results using OpenCV. To do this, we extend the estimate_distance function:

def estimate_distance_viz(image_path, focal_length, known_width):
    image = cv2.imread(image_path)
    results = model(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # YOLOv5 detection (BGR to RGB)

    for *box, conf, cls in results.xyxy[0]:
        x1, y1, x2, y2 = box
        width = x2 - x1
        distance = (known_width * focal_length) / width

        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
        # OpenCV's Hershey fonts cannot render CJK text, so the label uses "m" for meters
        cv2.putText(image, f"{distance:.2f} m", (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    cv2.imshow("Distance Estimation", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Now we can use the estimate_distance_viz function to calculate the vehicle distance and visualize the result.

image_path = 'test.jpg'
focal_length = 700  # example value, adjust according to your calibration
known_width = 1.8   # example value: the actual vehicle width, adjust as needed

estimate_distance_viz(image_path, focal_length, known_width)

6. Optimization

In practical applications, we can optimize the algorithm to improve performance and accuracy. Here are some suggestions:

  1. Camera calibration: In the current implementation, we use the focal length directly as a parameter. For more accurate results, camera calibration can recover the full camera intrinsics, including the distortion coefficients, which helps improve the accuracy of distance estimates. Calibration can be performed with OpenCV's cv2.calibrateCamera() function; see the sketch after this list.

  2. Multi-frame fusion: To improve the stability of the ranging, the detection results of several consecutive frames can be fused by taking the average (or a weighted average) of the estimated vehicle distance. This reduces error and improves stability.

  3. Adaptive threshold adjustment: Detection quality varies across scenes and lighting conditions. You can adaptively adjust the confidence threshold based on image characteristics such as brightness and contrast to improve detection accuracy.

  4. Smoothing of results: Distance estimates can contain noise or abrupt jumps. A filtering algorithm (such as a Kalman filter or a moving-average filter) can be applied to smooth the results and improve estimation stability.
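
As a sketch of the first suggestion, here is a standard chessboard calibration using OpenCV's cv2.calibrateCamera() (the board size and the calib/*.jpg image paths are assumptions):

import glob

import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the calibration chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob('calib/*.jpg'):  # hypothetical calibration images
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
# camera_matrix[0, 0] and camera_matrix[1, 1] are the focal lengths in pixels;
# dist_coeffs can be used with cv2.undistort() before running detection.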

To implement these suggestions, we need to modify the visualization function accordingly. Here is a modified example showing how to apply multi-frame fusion to the distance estimation:

import cv2
import numpy as np

def visualize_distance_multiframe(image_paths, focal_length, known_width, num_frames=5):
    # Keep a sliding window of the most recent detections; this simple version
    # assumes a single vehicle of interest per frame.
    distances = []

    for image_path in image_paths:
        image = cv2.imread(image_path)
        results = model(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # YOLOv5 detection (BGR to RGB)

        for *box, conf, cls in results.xyxy[0]:
            x1, y1, x2, y2 = box
            width = x2 - x1
            distance = (known_width * focal_length) / width
            distances.append(distance)

            if len(distances) > num_frames:
                distances.pop(0)

            avg_distance = np.mean(distances)

            cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
            cv2.putText(image, f"{avg_distance:.2f} m", (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        cv2.imshow("Distance Estimation", image)
        cv2.waitKey(0)

    cv2.destroyAllWindows()

We can now call the visualize_distance_multiframe function with a sequence of consecutive images to compute vehicle distances and visualize the results. This makes the distance estimation more stable.

image_paths = ['test1.jpg', 'test2.jpg', 'test3.jpg', 'test4.jpg', 'test5.jpg']
focal_length = 700  # example value, adjust according to your calibration
known_width = 1.8   # example value: the actual vehicle width, adjust as needed

visualize_distance_multiframe(image_paths, focal_length, known_width)

With multi-frame fusion in place, the algorithm can better cope with lighting changes, occlusion, and similar issues. At the same time, this method reduces the impact of single-frame noise on the distance estimate.

It should be noted that in practical applications, the algorithm may need different degrees of optimization depending on the specific scenario and requirements. For example, when processing a real-time video stream, consider applying optical flow between consecutive frames to track detected vehicles and further improve the stability of the ranging. In addition, different known widths can be set for different vehicle types to improve estimation accuracy, as sketched below.
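
As a small illustration of that last suggestion, the fixed known_width inside the detection loop can be replaced with a per-class lookup (the class names and average widths below are assumptions; adapt them to your dataset):

# Assumed average real-world widths per class, in meters
CLASS_WIDTHS = {'car': 1.8, 'truck': 2.5, 'bus': 2.5}

for *box, conf, cls in results.xyxy[0]:
    name = results.names[int(cls)]             # class name of this detection
    known_width = CLASS_WIDTHS.get(name, 1.8)  # fall back to a car-sized default
    x1, y1, x2, y2 = box
    distance = (known_width * focal_length) / (x2 - x1)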

With that, we have completed the practice of using YOLOv5 for vehicle ranging. I hope this blog was helpful! If you have any questions or suggestions, feel free to leave a message in the comments. See you next time!
