Train mode is used to train YOLOv8 models on custom datasets. In this mode, the model is trained using the specified dataset and hyperparameters. The training process involves optimizing the model's parameters so that it can accurately predict the class and location of objects in images.
Note: YOLOv8 datasets such as COCO, VOC, ImageNet and many others are downloaded automatically on first use, for example with yolo train data=coco.yaml
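from ultralytics import YOLO

# Build a new model from a YAML configuration file
model = YOLO('yolov8n.yaml')
# Train with an officially provided dataset configuration file; datasets such as COCO, VOC, ImageNet and many others are downloaded automatically on first use
results = model.train(data='coco128.yaml', epochs=3)
# No dataset configuration file is given; training relies on the information stored in the pretrained weights file
model = YOLO('yolov8n.pt')
model.train(epochs=5)
# Resume the last interrupted training run
model = YOLO('last.pt')
model.train(resume=True)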
The training settings of the YOLOv8 model refer to the various hyperparameters and configurations used to train the model on the dataset. These settings affect the performance, speed, and accuracy of the model. Some common training settings for YOLOv8 include batch size, learning rate, momentum, and weight decay. Other factors that can affect the training process include the choice of optimizer, the choice of loss function, and the size and composition of the training set. It is important to carefully tune and experiment with these settings to achieve the best performance for a given task.
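As a rough illustration of how such settings can be supplied, the sketch below collects a few of them in a dictionary and forwards them to `model.train` as keyword arguments. The keys and default values come from the parameter table in this section; the helper names `build_train_settings` and `run_training` are hypothetical and exist only for this example, and the training call assumes the `ultralytics` package is installed.

```python
def build_train_settings(**overrides):
    """Return a dict of training settings, starting from a few documented defaults."""
    settings = {
        "data": "coco128.yaml",   # dataset configuration file
        "epochs": 100,            # number of epochs to train for
        "batch": 16,              # images per batch
        "imgsz": 640,             # input image size
        "optimizer": "SGD",       # optimizer choice
        "lr0": 0.01,              # initial learning rate
        "momentum": 0.937,        # SGD momentum / Adam beta1
        "weight_decay": 0.0005,   # optimizer weight decay
    }
    # Values passed by the caller replace the defaults above.
    settings.update(overrides)
    return settings


def run_training(weights="yolov8n.pt", **overrides):
    """Hypothetical wrapper: load a model and train it with the merged settings."""
    # Imported lazily so the settings helper can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_settings(**overrides))
```

For example, `run_training(epochs=3, optimizer="AdamW", lr0=0.001)` would start a short run with a different optimizer and a lower initial learning rate. Keeping the settings in one dictionary makes it easier to log or compare the configuration of different experiments.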
The relevant parameters are as follows:
Key | Value | Description |
---|---|---|
model | None | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
data | None | path to data file, i.e. coco128.yaml |
epochs | 100 | number of epochs to train for |
patience | 50 | epochs to wait for no observable improvement for early stopping of training |
batch | 16 | number of images per batch (-1 for AutoBatch) |
imgsz | 640 | size of input images as integer or w,h |
save | True | save train checkpoints and predict results |
save_period | -1 | Save checkpoint every x epochs (disabled if < 1) |
cache | False | True/ram, disk or False. Use cache for data loading |
device | None | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
workers | 8 | number of worker threads for data loading (per RANK if DDP) |
project | None | project name |
name | None | experiment name |
exist_ok | False | whether to overwrite existing experiment |
pretrained | False | whether to use a pretrained model |
optimizer | 'SGD' | optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp'] |
verbose | False | whether to print verbose output |
seed | 0 | random seed for reproducibility |
deterministic | True | whether to enable deterministic mode |
single_cls | False | train multi-class data as single-class |
rect | False | rectangular training with each batch collated for minimum padding |
cos_lr | False | use cosine learning rate scheduler |
close_mosaic | 0 | (int) disable mosaic augmentation for final epochs |
resume | False | resume training from last checkpoint |
amp | True | Automatic Mixed Precision (AMP) training, choices=[True, False] |
lr0 | 0.01 | initial learning rate (i.e. SGD=1E-2, Adam=1E-3) |
lrf | 0.01 | final learning rate (lr0 * lrf) |
momentum | 0.937 | SGD momentum/Adam beta1 |
weight_decay | 0.0005 | optimizer weight decay 5e-4 |
warmup_epochs | 3.0 | warmup epochs (fractions ok) |
warmup_momentum | 0.8 | warmup initial momentum |
warmup_bias_lr | 0.1 | warmup initial bias lr |
box | 7.5 | box loss gain |
cls | 0.5 | cls loss gain (scale with pixels) |
dfl | 1.5 | dfl loss gain |
pose | 12.0 | pose loss gain (pose-only) |
kobj | 2.0 | keypoint obj loss gain (pose-only) |
label_smoothing | 0.0 | label smoothing (fraction) |
nbs | 64 | nominal batch size |
overlap_mask | True | masks should overlap during training (segment train only) |
mask_ratio | 4 | mask downsample ratio (segment train only) |
dropout | 0.0 | use dropout regularization (classify train only) |
val | True | validate/test during training |
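
To make two of the entries above more concrete, here is a minimal sketch of how `lr0`, `lrf`, and `cos_lr` could combine into a per-epoch learning rate, and how `patience` could drive early stopping. The table only states that the final learning rate is `lr0 * lrf` and that `cos_lr` selects a cosine scheduler; the linear decay used when `cos_lr` is off, the exact cosine shape, and the names `lr_factor` and `EarlyStopper` are assumptions made for this illustration, not a description of the library's internals. Warmup is ignored here.

```python
import math


def lr_factor(epoch, epochs, lrf=0.01, cos_lr=False):
    """Multiplier applied to lr0 at a given epoch: 1.0 at the start, lrf at the end."""
    if cos_lr:
        # Assumed cosine decay from 1.0 down to lrf.
        return lrf + (1.0 - lrf) * (1.0 + math.cos(math.pi * epoch / epochs)) / 2.0
    # Assumed linear decay from 1.0 down to lrf.
    return (1.0 - epoch / epochs) * (1.0 - lrf) + lrf


class EarlyStopper:
    """Signal a stop once `patience` epochs pass without a new best fitness."""

    def __init__(self, patience=50):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def should_stop(self, epoch, fitness):
        # Record the epoch whenever the fitness strictly improves.
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        # Stop when the best epoch is at least `patience` epochs behind.
        return (epoch - self.best_epoch) >= self.patience


def final_learning_rate(lr0=0.01, lrf=0.01, epochs=100, cos_lr=False):
    """Learning rate reached at the last step of the schedule (lr0 * lrf)."""
    return lr0 * lr_factor(epochs, epochs, lrf=lrf, cos_lr=cos_lr)
```

With the defaults from the table (`lr0=0.01`, `lrf=0.01`), both schedule shapes start at 0.01 and end near 0.0001; they differ only in how quickly the rate falls in between.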