YOLO training

Chain of assumptions in ML

  • Fit training set well on cost function
  • Fit dev set well on cost function
  • Fit test set well on cost function
  • Perform well in the real world


YOLO training

https://github.com/AlexeyAB/darknet#when-should-i-stop-training


From my own experience, you'll need around 300-500 images per class to get good results. You can always try to enlarge your dataset with data augmentation (YOLOv2 already does this internally, btw).
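
As a rough way to check a dataset against the 300-500 images per class figure above, here is a minimal sketch that counts how many label files mention each class. It assumes darknet-style YOLO labels (one .txt file per image, each non-empty line starting with an integer class id). The function names are illustrative and not part of darknet.

```python
from collections import Counter
from pathlib import Path


def classes_in_label(text):
    """Return the set of class ids found in one YOLO label file's text."""
    class_ids = set()
    for line in text.splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            class_ids.add(int(parts[0]))  # first field is the class id
    return class_ids


def images_per_class(label_dir):
    """Count, for each class id, how many label files contain it at least once."""
    counts = Counter()
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        counts.update(classes_in_label(label_file.read_text()))
    return counts
```

Calling images_per_class on a label directory returns a Counter mapping class id to image count. Empty label files (negative samples) contribute nothing, and any non-label .txt file in the same directory would need to be moved out first, since its lines would not start with an integer.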


(https://github.com/AlexeyAB/darknet#how-to-improve-object-detection)

How to improve object detection:

  1. Before training:
  • set flag random=1 in your .cfg-file - it will increase precision by training Yolo at different resolutions

  • increase network resolution in your .cfg-file (height=608 and width=608, or any other multiple of 32) - it will increase precision

  • recalculate anchors for your dataset, using the width and height from your cfg-file: darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416 then set the same 9 anchors in each of the 3 [yolo]-layers in your cfg-file

  • check that every object in your dataset is labeled - no object in your dataset should be left without a label. Most training issues come from wrong labels in the dataset (labels produced by a conversion script, marked with a third-party tool, ...). Always check your dataset by using: https://github.com/AlexeyAB/Yolo_mark

  • it is desirable that your training dataset includes images with objects at different scales, rotations and lighting conditions, seen from different sides and on different backgrounds

  • it is desirable that your training dataset includes images with non-labeled objects that you do not want to detect - negative samples without bounding boxes (empty .txt files)

  • for training with a large number of objects in each image, add the parameter max=200 or a higher value in the last [region] layer of your cfg-file

  • for training on small objects - set layers = -1, 11 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L720 and set stride=4 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L717

  • General rule - you should keep the relative size of objects in the training and testing datasets roughly the same:

    • train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
    • train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
  • to speed up training (at the cost of detection accuracy) do Fine-Tuning instead of Transfer-Learning: set the parameter stopbackward=1 in one of the penultimate convolutional layers before the 1st [yolo]-layer, for example here: https://github.com/AlexeyAB/darknet/blob/0039fd26786ab5f71d5af725fc18b3f521e7acfd/cfg/yolov3.cfg#L598
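
The general rule on relative object size in the list above is simple arithmetic, so it can be checked with a few lines of Python. This is only a sketch: the function names and the 25% tolerance are assumptions made for illustration, and the rule itself only asks for the two values to be "roughly the same".

```python
def relative_size(network_dim, obj_dim, image_dim):
    """Object size in pixels as the network sees it after the image is resized."""
    return network_dim * obj_dim / image_dim


def sizes_match(train_size, detect_size, tolerance=0.25):
    """True if the two relative sizes differ by at most `tolerance` (a fraction).

    The default tolerance is an arbitrary illustrative choice.
    """
    return abs(train_size - detect_size) <= tolerance * max(train_size, detect_size)


# Training: 416-wide network, objects about 100 px wide in 1000 px wide images.
train = relative_size(416, 100, 1000)
# Detection: 608-wide network, objects about 50 px wide in 730 px wide images.
detect = relative_size(608, 50, 730)
print(train, detect, sizes_match(train, detect))
```

The same two functions apply to heights by passing the height values instead of the widths.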

  2. After training - for detection:
  • Increase the network resolution by setting (height=608 and width=608) or (height=832 and width=832) or (any multiple of 32) in your .cfg-file - this increases precision and makes it possible to detect small objects

    • you do not need to train the network again, just use the .weights-file already trained for 416x416 resolution
    • if an Out of memory error occurs, increase subdivisions in your .cfg-file to 16, 32 or 64
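
The resolution and subdivisions changes above amount to editing a few key=value lines in the .cfg-file. Here is a minimal sketch, assuming the cfg is plain text with one key=value per line and that width, height and subdivisions first appear in the [net] section (as in the yolov3.cfg linked earlier). The helper names are hypothetical and not part of darknet.

```python
import re


def round_up_to_multiple_of_32(size):
    """Return the smallest multiple of 32 that is >= size."""
    return ((size + 31) // 32) * 32


def set_cfg_value(cfg_text, key, value):
    """Replace the first `key=...` line in the cfg text with `key=value`."""
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", flags=re.MULTILINE)
    new_text, replaced = pattern.subn(f"{key}={value}", cfg_text, count=1)
    if replaced == 0:
        raise KeyError(f"{key} not found in cfg")
    return new_text


# A shortened, made-up [net] section used only for this example.
cfg = "[net]\nbatch=64\nsubdivisions=8\nwidth=416\nheight=416\n"

size = round_up_to_multiple_of_32(600)  # snaps 600 up to a valid resolution
cfg = set_cfg_value(cfg, "width", size)
cfg = set_cfg_value(cfg, "height", size)
cfg = set_cfg_value(cfg, "subdivisions", 16)  # raise this if memory runs out
print(cfg)
```

Only the first matching line is replaced, so keys that are repeated in later layers are left alone.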


Reposted from blog.csdn.net/honk2012/article/details/80430120