[Object Detection] YOLOv8: Quick Start Guide

YOLOv8 Overview

YOLOv8 is the new generation of the YOLO series, released this year by the team behind YOLOv5 (Ultralytics). Its performance and speed compared with previous generations are shown in the figure below:

[Figure: performance and speed comparison of YOLOv8 with previous YOLO versions]
Unlike previous versions, the repository is not named YOLOv8 but after the company, ultralytics, because this release is intended to be a general-purpose library that is easy to call and deploy.

Repository: https://github.com/ultralytics/ultralytics
Official tutorial: https://docs.ultralytics.com/modes/train/

The official tutorial offers two ways to use the library: the Python API and the command line (CLI). This article covers only the former.

Install

Since YOLOv8 is packaged as a general-purpose library, there is no need to clone the whole repository as with YOLOv5; installing it with pip is enough.

pip install ultralytics

Note that the library depends on torch and torchvision, which come in separate GPU and CPU builds. After installation, check whether torch can actually use the GPU.

Check method:

import torch

print(torch.cuda.is_available())  # True means torch can see a CUDA GPU
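
If this prints False on a machine that does have a CUDA GPU, the CPU build of torch was probably installed. One common fix is to reinstall torch/torchvision from the CUDA wheel index that matches your driver (the cu118 index below is only an example; pick the right one from pytorch.org):

pip install --upgrade torch torchvision --index-url https://download.pytorch.org/whl/cu118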

Model training

Load model

There are three ways to load a model; the second is usually sufficient.

from ultralytics import YOLO

# Load a model
model = YOLO('models/yolov8n.yaml')  # build a new model from YAML
model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)
model = YOLO('yolov8n.yaml').load('yolov8n.pt')  # build from YAML and transfer weights

Model training

Let's use the coco128 dataset as an example. Training requires only one line of code:

results = model.train(data='coco128.yaml', epochs=2, imgsz=640, workers=0, batch=2)

After running, the program automatically downloads the dataset and the model weights. The train interface has many optional parameters, most of which are consistent with those in YOLOv5.

For the specific parameters and their meanings, refer to the Arguments section of the documentation.

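As an illustration, several of the more commonly used arguments can be combined in a single call. This is only a sketch; argument names such as device, patience and name are assumed to follow the Arguments table:

# a sketch combining a few common training arguments (names assumed from the Arguments table)
results = model.train(
    data='coco128.yaml',   # dataset config file
    epochs=2,              # number of training epochs
    imgsz=640,             # input image size
    batch=2,               # batch size
    workers=0,             # dataloader worker processes
    device=0,              # GPU index, or 'cpu'
    patience=50,           # early-stopping patience (epochs without improvement)
    name='coco128_demo',   # run name under the runs directory
)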

Model validation

Model validation is also straightforward, requiring just a few lines of code:

metrics = model.val(data='coco128.yaml', imgsz=640, workers=0, batch=2)  # no arguments needed, dataset and settings remembered
metrics.box.map    # mAP50-95
metrics.box.map50  # mAP50
metrics.box.map75  # mAP75
metrics.box.maps   # a list containing mAP50-95 for each category

The behavior is consistent with val.py in YOLOv5, and the validation results are generated under the runs/val path.
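
As a small sketch, the per-class values in metrics.box.maps can be paired with the class names (assuming model.names maps class indices to names):

# print mAP50-95 per class (sketch; assumes model.names is an index->name dict)
for idx, m in enumerate(metrics.box.maps):
    print(model.names[idx], round(float(m), 3))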

Model inference

YOLOv8 currently supports the following inference tasks: object detection, object detection + segmentation, object detection + pose estimation, and object tracking.
Note: all tasks are built on detection, and the official documentation does not provide separate training procedures for the other tasks.

Object Detection

# Load a model
model = YOLO('yolov8n.pt')  # load a pretrained model
# Run inference on 'bus.jpg' with arguments
model.predict('data/bus.jpg', save=True, imgsz=320, conf=0.5)

The image path here can be changed to point at your own data.
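
If you want to work with the detections in code rather than just save the rendered image, predict also returns a list of result objects. A minimal sketch, assuming the boxes/names attributes of the returned results:

# capture and inspect the predictions (sketch; attribute names assumed from the results API)
results = model.predict('data/bus.jpg', imgsz=320, conf=0.5)
for r in results:
    for box in r.boxes:
        cls_id = int(box.cls)                   # predicted class index
        score = float(box.conf)                 # confidence score
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # bounding-box corners in pixels
        print(r.names[cls_id], round(score, 2), (x1, y1, x2, y2))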

Example result:
[Figure: detection result on bus.jpg]

Object detection + segmentation

# Load a pretrained YOLOv8n-seg Segment model
model = YOLO('yolov8n-seg.pt')

# Run inference on an image
results = model('data/bus.jpg', save=True)

The interface is exactly the same; only the loaded model differs.

Result:

[Figure: segmentation result on bus.jpg]
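
The segmentation results additionally carry per-instance masks. A minimal sketch of reading them, assuming the masks attribute is populated:

# inspect the predicted masks (sketch; assumes results[0].masks is populated)
r = results[0]
if r.masks is not None:
    print(len(r.masks), 'instances segmented')   # number of segmented objects
    print(r.masks.data.shape)                    # (num_instances, mask_height, mask_width)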

Object detection + pose estimation

# Load a pretrained YOLOv8n-pose Pose model
model = YOLO('yolov8n-pose.pt')

# Run inference on an image
results = model('data/bus.jpg', save=True)  # results list

[Figure: pose estimation result on bus.jpg]
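
The pose results expose the detected keypoints. A minimal sketch, assuming the keypoints attribute is populated:

# inspect the predicted keypoints (sketch; assumes results[0].keypoints is populated)
r = results[0]
if r.keypoints is not None:
    print(r.keypoints.xy.shape)   # (num_persons, num_keypoints, 2) pixel coordinates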

Object Tracking

# load a pretrained model
model = YOLO('yolov8n.pt')
results = model.track(source="data/malasong.mp4", save=True)

The tracking interface track takes a video as input. The following is one frame from the output video.

[Figure: one frame of the tracking result]
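
track returns one result per frame, and each detection additionally carries a track id. A small sketch of reading them, assuming the boxes.id attribute (which can be None when no track is assigned):

# read the per-frame track ids (sketch; assumes the boxes.id attribute)
for frame_result in results:
    if frame_result.boxes.id is not None:
        ids = frame_result.boxes.id.int().tolist()   # track id of each detected object
        print(ids)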

Summary

Building on YOLOv5, YOLOv8 adds many tricks such as an anchor-free head, and its performance improves over the previous generation. The multi-task interface is unified, which makes application and deployment easier. For researchers, however, it further deepens the black-box nature of the network and is less convenient for secondary development.
