Supplementary notes on understanding, training, and using the Keras-YOLOv3 model

This blog post is not a tutorial; it is a summary of the key parts of my own implementation process and the places where I stumbled.

For the detailed procedure, I recommend the instructional videos by a Bilibili uploader, which explain how YOLO works under Keras. The episodes are short and clear, and the uploader also published the accompanying code (I used the uploader's modified code in my own run, for which I am grateful). The explanation of the underlying principle, and of the darknet53 network structure in particular, is very clear: "Build your own YOLOv3 object detection platform with Keras (YOLOv3 source code explained)".
I am also linking the author's keras-yolo GitHub repository.

If you are able to read the source code, I strongly recommend getting at least a rough understanding of it, so that you know what each piece of code is doing. Then you will not be limited to the details that online materials happen to mention, such as the paths for training weights, images, labels, and so on, all of which you can change yourself.

Three relevant data sets

These data set names come from computer vision challenges. In the field of object recognition, computer recognition accuracy has already surpassed that of humans on some benchmarks.

COCO data set: 80 annotated categories
VOC data set: 20 annotated categories; the amount of data is relatively smaller
ImageNet data set: a huge amount of data, running to millions of images across tens of thousands of categories

From darknet to keras

darknet and Keras are two different frameworks. darknet depends on essentially no external libraries and is written mainly in C; Keras sits on top of TensorFlow and is written in Python. The most obvious difference in use is that the weight files (also called models) differ: the pre-trained model YOLO provides was trained on the COCO data set under the darknet framework and carries the .weights suffix, while the model Keras can read has the .h5 suffix. This is why keras-yolo3 includes convert.py, whose job is exactly this transformation (the repository's README invokes it as `python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5`). Also, because I am still a beginner, the three words model, weights, and network appear somewhat interchangeably in this post.

The places where YOLO can be tuned

When using (detection): change which model is loaded by modifying its path; change the path of the image to be identified.
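As a concrete sketch, the detection-time settings can be collected as below. The key names mirror the `_defaults` dictionary in keras-yolo3's yolo.py, but treat this as an illustration, not the repo's exact code; all paths are placeholders for your own files.

```python
# Hypothetical sketch of detection-time settings, modeled on the
# "_defaults" dict in keras-yolo3's yolo.py. Every path is a placeholder.
detection_config = {
    "model_path": "logs/000/trained_weights_final.h5",  # which model to load
    "anchors_path": "model_data/yolo_anchors.txt",      # prior (anchor) boxes
    "classes_path": "model_data/voc_classes.txt",       # class names, one per line
    "score": 0.3,   # confidence threshold for keeping a detection
    "iou": 0.45,    # IoU threshold for non-max suppression
}

# The image to identify (placeholder path).
image_path = "path/to/your/test_image.jpg"
```

Pointing `model_path` at your newly trained .h5 file is all it takes to switch from the pre-trained model to your own.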

Training (in train.py): the path of the label file (.txt) written in a format YOLO can read; the path of the prior boxes (anchors); where the newly trained model is stored; the position of the pre-trained model; the ratio of the training set to the validation set; and, for the later stage when all layers are trained, batch_size (the number of samples per batch, chosen according to your computing performance) and epochs (how many times training is repeated over the whole set).
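One small, testable piece of the above is how the training/validation ratio is applied to the annotation lines. The sketch below follows the hold-out logic in train.py, but the function name is mine:

```python
def split_annotations(lines, val_split=0.1):
    """Split YOLO-format annotation lines into train/validation sets.

    Mirrors how train.py holds out the last `val_split` fraction of the
    (shuffled) lines for validation. val_split=0.1 is the usual default;
    shuffle `lines` beforehand in real use.
    """
    num_val = int(len(lines) * val_split)
    num_train = len(lines) - num_val
    return lines[:num_train], lines[num_train:]
```

With 20 annotation lines and the default ratio, this yields 18 training lines and 2 validation lines; batch_size and epochs then govern how those subsets are fed to training.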

A bit more detail on the pre-trained model: compared with a blank, untrained network, a network that has already been trained for pattern recognition can find and extract feature patterns faster and more effectively, even for classes that never appeared in its earlier training. yolo_weights.h5, which we fine-tune directly, was likely itself trained on top of darknet53 (a network pre-trained on the ImageNet data set).

Advanced option: change the specific sizes of the prior boxes (anchors; file path ./model_data/yolo_anchors.txt). kmeans.py clusters the boxes in the supplied pictures and divides them into three scales of anchor blocks, for identifying large, medium, and small objects.
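The clustering idea behind kmeans.py can be sketched in a few lines. This is a simplified, pure-Python illustration, not the repo's actual implementation: box sizes are clustered on (width, height) using 1 − IoU as the distance, and the resulting cluster centers become the anchors.

```python
import random

def iou_wh(a, b):
    """IoU of two (w, h) box sizes, as if both boxes shared a corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k, seed=0, max_iter=100):
    """Cluster (w, h) sizes with k-means under the 1 - IoU distance.

    A simplified sketch of what kmeans.py does to produce
    yolo_anchors.txt (which holds 9 anchors: 3 scales x 3 boxes each).
    """
    clusters = random.Random(seed).sample(boxes, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the cluster with the highest IoU
            # (equivalently, the smallest 1 - IoU distance).
            best = max(range(k), key=lambda j: iou_wh(b, clusters[j]))
            groups[best].append(b)
        new = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g))
            if g else clusters[j]
            for j, g in enumerate(groups)
        ]
        if new == clusters:
            break
        clusters = new
    return sorted(clusters)
```

On real data you would feed in the (w, h) of every labeled box with k = 9, then split the 9 resulting anchors into small, medium, and large groups.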

Configuring the data set

I tested on the VOC data set, so my data follows the VOC format. Of course, inventing an entirely new format of your own is not impossible, but it means modifying a lot of code, perhaps even rewriting it, before YOLO can successfully run through your format. Since a ready-made data set format already exists, we may as well treat it as the standard and actively conform to it.
Regardless of the year, a complete VOC data set contains five folders: the labels used to identify the targets in each image, the original pictures, the image-name lists that map pictures to training sets, and two further folders holding semantic segmentation information.

The series of conversion steps before training

First, the original VOC data set needs processing to build the "name" lists of the training pictures. This step uses voc2yolo.py, written by the Bilibili uploader mentioned above, which generates a train.txt under the VOC data set's ./ImageSets/Main. Next comes keras-yolo3's voc_annotation.py, which follows the mapping relationships from the previous step to turn the VOC label format into the label format YOLO can read; this step generates a year-prefixed _train.txt file, together with three other .txt files, split according to the configured ratios. Before training, the model_data folder also needs a .txt file containing only the class names (that is the most common and general practice; in fact the 20 classes of the VOC data set I used are already preset in the model_data folder).
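For reference, each line of the generated YOLO-readable file has the form `image_path xmin,ymin,xmax,ymax,class_id [xmin,ymin,...]`, which is the format keras-yolo3's voc_annotation.py emits. A small parser, written purely for illustration, makes the structure explicit:

```python
def parse_yolo_annotation(line):
    """Parse one line of the YOLO-readable label file.

    Format (as produced by voc_annotation.py):
        image_path xmin,ymin,xmax,ymax,class_id [xmin,ymin,xmax,ymax,class_id ...]
    Returns the image path and a list of (xmin, ymin, xmax, ymax, class_id)
    tuples, one per ground-truth box.
    """
    parts = line.strip().split()
    image_path = parts[0]
    boxes = [tuple(int(v) for v in box.split(",")) for box in parts[1:]]
    return image_path, boxes
```

Knowing this format is what lets you sidestep VOC entirely if you wish: any script that writes lines like these produces labels YOLO can train on.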

Written at the end

Because of the use of the uploader's code described above (the rewrite of train.py in particular is almost complete), if you find all this foggy, I suggest using the uploader's code directly. The most noticeable change is that the training/validation split of the data set is performed inside train.py, rather than during the step that converts the VOC label format into the YOLO-readable label format.
The uploader's code is also used when building the mapping between the VOC data set's labels and pictures, that is, the step of preparing and organizing the training "name" lists.

As a complete beginner, there is a lot here I do not yet understand; seniors, please bear with me. I look forward to your criticism and corrections.


Origin: blog.csdn.net/ConTroleR99/article/details/104659914