Chain of assumptions in ML
- Fit training set well on cost function
- Fit dev set well on cost function
- Fit test set well on cost function
- Performs well in real world
YOLO training
https://github.com/AlexeyAB/darknet#when-should-i-stop-training
From my own experience, you'll need around 300-500 images per class to get a good result. You can always try to enlarge your dataset with data augmentation (YOLOv2 already does this internally, btw).
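A quick way to see how a dataset compares to that rule of thumb is to count images per class from the label files. The sketch below assumes darknet-style labels (one .txt per image, one line per object: class id followed by four normalized box values); the function names and the default minimum of 300 are illustrative choices, not part of darknet.

```python
from collections import Counter
from pathlib import Path


def images_per_class(label_dir):
    """Count how many label files contain at least one box of each class id."""
    counts = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        class_ids = set()
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if len(parts) == 5:  # skip blank or malformed lines
                class_ids.add(int(parts[0]))
        # each image contributes at most 1 to every class it contains
        counts.update(class_ids)
    return counts


def underrepresented(counts, minimum=300):
    """Return class ids seen in fewer than `minimum` images.

    Classes that never appear in any label file are not listed,
    because this sketch does not read the class-names file.
    """
    return sorted(c for c, n in counts.items() if n < minimum)
```

Empty .txt files (negative samples) are simply skipped by the count, so they do not distort the per-class numbers.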
(https://github.com/AlexeyAB/darknet#how-to-improve-object-detection)
How to improve object detection:
- Before training:
  - set flag random=1 in your .cfg-file - it will increase precision by training Yolo for different resolutions
  - increase network resolution in your .cfg-file (height=608, width=608 or any value multiple of 32) - it will increase precision
  - recalculate anchors for your dataset for width and height from cfg-file: darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416 then set the same 9 anchors in each of 3 [yolo]-layers in your cfg-file
  - check that every object in your dataset is labeled - no object in your dataset should be left without a label. Most training issues come from wrong labels in the dataset (labels produced by some conversion script, marked with a third-party tool, ...). Always check your dataset by using: https://github.com/AlexeyAB/Yolo_mark
  - it is desirable that your training dataset include images with objects at different scales, rotations, lightings, from different sides, and on different backgrounds
  - it is desirable that your training dataset include images with non-labeled objects that you do not want to detect - negative samples without bounding boxes (empty .txt files)
  - for training with a large number of objects in each image, add the parameter max=200 or a higher value in the last layer [region] in your cfg-file
  - for training for small objects - set layers = -1, 11 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L720 and set stride=4 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L717
  - General rule - you should keep the relative size of objects in the Training and Testing datasets roughly the same:
    train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
    train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
  - to speed up training (at the cost of detection accuracy) do Fine-Tuning instead of Transfer-Learning: set param stopbackward=1 in one of the penultimate convolutional layers before the 1-st [yolo]-layer, for example here: https://github.com/AlexeyAB/darknet/blob/0039fd26786ab5f71d5af725fc18b3f521e7acfd/cfg/yolov3.cfg#L598
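The general rule on relative object size in the list above can be checked with a few lines of arithmetic. This is a minimal sketch: it assumes the image is stretched to the network input size, so an object's size at the network input is network_dim * obj_dim / image_dim. The function names and the 25% tolerance are hypothetical, since the rule only says "roughly the same".

```python
def effective_size(network_dim, obj_dim, image_dim):
    """Object size in pixels once the image is resized to the network input."""
    return network_dim * obj_dim / image_dim


def sizes_match(train, detect, tolerance=0.25):
    """Compare effective object sizes for training and detection.

    `train` and `detect` are (network_dim, obj_dim, image_dim) tuples for
    one axis (width or height). `tolerance` is an arbitrary illustrative
    threshold, not a value taken from darknet.
    """
    t = effective_size(*train)
    d = effective_size(*detect)
    return abs(t - d) <= tolerance * max(t, d)


# Example: 100 px wide objects in 1920 px wide training images at 416 input,
# versus 50 px wide objects in 1280 px wide images at 608 input.
print(sizes_match((416, 100, 1920), (608, 50, 1280)))
```

Run the check once for widths and once for heights; if either fails, changing the detection resolution or re-cropping the images is the usual way to bring the two sides closer.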
- After training - for detection:
  - Increase network resolution by setting in your .cfg-file (height=608 and width=608) or (height=832 and width=832) or (any value multiple of 32) - this increases the precision and makes it possible to detect small objects
    - you do not need to train the network again, just use the .weights-file already trained for 416x416 resolution
    - if an Out of memory error occurs, then in the .cfg-file you should increase subdivisions=16, 32 or 64
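The two constraints above (resolution must be a multiple of 32, and subdivisions trades memory for more passes) can be sketched as small helpers. This assumes, as in darknet cfg files, that batch is split into `subdivisions` chunks so that batch / subdivisions images are processed at once; the function names are hypothetical.

```python
def round_to_multiple_of_32(value):
    """Round a requested resolution to the nearest multiple of 32 (minimum 32).

    Exact ties round up.
    """
    return max(32, (value + 16) // 32 * 32)


def images_per_pass(batch, subdivisions):
    """Images held in memory at once when batch is split into subdivisions."""
    return batch // subdivisions


# A requested size of 600 is not a multiple of 32, so pick the nearest valid one.
width = height = round_to_multiple_of_32(600)

# With batch=64, raising subdivisions shrinks the number of images per pass,
# which is why it helps with Out of memory errors.
for subdivisions in (16, 32, 64):
    print(width, height, subdivisions, images_per_pass(64, subdivisions))
```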