[AI in Practice] Teach you how to train your own object detection model (SSD)

Object detection is an important application of AI. An object detection model can detect objects such as people, animals, cars, and airplanes in an image, and can even outline each object. Just like the picture below. Pretty cool, isn't it?

Before training your own object detection model, it is recommended to first understand how object detection models work (see the article: Classic Object Detection Models Explained: R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN); this will make the training process much clearer.

This article explains how to train an object detection model based on the SSD algorithm on your own data, using the AI basic environment we built earlier (see the article: AI Basic Environment Construction). SSD, short for Single Shot MultiBox Detector, is an object detection algorithm proposed by Wei Liu et al. at ECCV 2016, and is one of the mainstream detection frameworks today.

In this case, the task is to detect pandas in images. Cute, aren't they?

1. Install the annotation tool
First comes data annotation: we must tell the machine what objects are in each image and where they are located. Only with this information can a model be trained.
(1) Annotation file format
The popular annotation file formats today are VOC_2007 and VOC_2012. Both derive from the Pascal VOC standard dataset, one of the important benchmarks for measuring image classification and recognition ability. This article uses the VOC_2007 format, which stores each annotation as an xml file, like this:
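(The sample below is a minimal sketch of a VOC_2007 annotation; the file name and all coordinate values are illustrative.)

<annotation>
    <folder>VOC2007</folder>
    <filename>panda_001.jpg</filename>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>panda</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>48</xmin>
            <ymin>60</ymin>
            <xmax>420</xmax>
            <ymax>330</ymax>
        </bndbox>
    </object>
</annotation>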

The key fields are:
filename: the file name of the image
name: the label of the annotated object
xmin, ymin, xmax, ymax: the coordinates of the upper-left and lower-right corners of the object's bounding box

(2) Install the annotation tool
If the images to be annotated are numerous, working out every object's coordinates by hand and writing the xml files yourself would be hopelessly inefficient.
Fortunately, a generous developer has open-sourced the annotation tool labelImg, which lets you draw bounding boxes through a visual interface and automatically generates xml files in VOC format. The tool is written in Python, so it runs cross-platform on Windows and Linux. Truly a labor of love. The installation steps are as follows:
a. Download the source code
Visit the labelImg GitHub page (https://github.com/tzutalin/labelImg). You can clone it with git or download it directly as a zip archive. In this case, we download the zip file.
b. Install and compile
Unzip the labelImg zip file to get the labelImg-master folder.
The labelImg interface is written in PyQt. Since our base environment uses a recent Anaconda release, PyQt5 is already included in the python3 environment; we only need to install lxml, then enter the labelImg-master directory and compile, as follows:

# Activate the virtual environment
source activate tensorflow
# PyQt5 already ships with Anaconda in the python3 environment.
# In a python2 environment, install PyQt4 instead:
# conda install -c anaconda pyqt=4.11.4
# Install lxml
conda install lxml
# Compile
make qt5py3
# Launch the annotation tool
python3 labelImg.py

If everything worked, the labelImg annotation tool opens with an interface like this:

2. Labeling the data
After successfully installing the labeling tool, it is time to start labeling the data.
(1) Create folders
Following the VOC dataset conventions, create the following folders:
Annotations: stores the annotation xml files
ImageSets/Main: stores the file-name lists for the training, validation, and test sets
JPEGImages: stores the original images
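For example, the folder skeleton can be created in one go from the shell (the top-level folder name panda_voc2007 is an illustrative choice that matches the path used in the format-conversion step later):

# Create the VOC-style folder skeleton
mkdir -p panda_voc2007/Annotations
mkdir -p panda_voc2007/ImageSets/Main
mkdir -p panda_voc2007/JPEGImages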

(2) Labeling data
Put your collection of panda images into the JPEGImages folder (a quick image search will turn up plenty of lovely panda photos). Note that the images must be in jpg format.
Open the labelImg annotation tool, click the "Open Dir" button in the left toolbar, and select the JPEGImages folder you just filled with panda photos. The first photo loads automatically in the main window.

Click the "Create RectBox" button on the left toolbar, and then click to draw a rectangular box on the main interface to circle the panda. After delineating, a dialog box will pop up for entering the name of the labeled object, enter panda as the name of the panda.

Then click the "Save" button in the left toolbar and choose the Annotations folder you created earlier as the save directory. An xml file in VOC_2007 format is generated automatically, which completes the annotation of one panda photo.

Next, click "Next Image" on the left toolbar to enter the next image, follow the above steps, draw a frame, enter a name, save, and so on, until all photos are marked and saved.

(3) Split into training, validation, and test sets
After annotating all the panda photos, the dataset needs to be split into training, validation, and test sets.
Download the split script from GitHub (https://github.com/EddyGao/make_VOC2007/blob/master/make_main_txt.py), then run:

python make_main_txt.py

It splits the dataset into training, validation, and test sets according to the ratios configured in the script, and writes the corresponding file-name lists into ImageSets/Main.
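For reference, here is a minimal sketch of what such a split script does, assuming it is run from inside the VOC folder; the ratios and file layout follow the VOC convention described above, but the actual values in make_main_txt.py may differ:

import os
import random

random.seed(0)

# Collect sample IDs from the annotation files
ids = [f[:-4] for f in os.listdir('Annotations') if f.endswith('.xml')]
random.shuffle(ids)

# Assumed ratios: 80% trainval (75% train / 25% val within it), 20% test
n_trainval = int(len(ids) * 0.8)
n_train = int(n_trainval * 0.75)
splits = {
    'trainval': ids[:n_trainval],
    'train': ids[:n_train],
    'val': ids[n_train:n_trainval],
    'test': ids[n_trainval:],
}

os.makedirs('ImageSets/Main', exist_ok=True)
for name, sample_ids in splits.items():
    with open(os.path.join('ImageSets/Main', name + '.txt'), 'w') as f:
        f.write('\n'.join(sample_ids))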

3. Configure SSD
(1) Download the SSD code
Since this case is based on tensorflow, download a tensorflow implementation of SSD from GitHub (https://github.com/balancap/SSD-Tensorflow). Download it as a zip file, then unzip it to get the SSD-Tensorflow-master folder.
(2) Convert the file format
Next, convert the VOC_2007-format data into tfrecord format. tfrecord is tensorflow's binary data file format; it stores image data and labels in a unified way, so data can be copied, moved, read, and stored faster inside tensorflow.
SSD-Tensorflow-master provides a script for this conversion. The conversion commands are as follows:

DATASET_DIR=./panda_voc2007/
OUTPUT_DIR=./panda_tfrecord/
python SSD-Tensorflow-master/tf_convert_data.py \
    --dataset_name=pascalvoc \
    --dataset_dir=${DATASET_DIR} \
    --output_name=voc_2007_train \
    --output_dir=${OUTPUT_DIR}
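As a quick sanity check, you can count how many records were written. This is only a hedged sketch using the TF 1.x API, and the voc_2007_train_*.tfrecord file pattern is an assumption about the converter's output naming:

import glob
import tensorflow as tf

# Count the records in the generated tfrecord files (TF 1.x API)
total = 0
for path in glob.glob('./panda_tfrecord/voc_2007_train_*.tfrecord'):
    total += sum(1 for _ in tf.python_io.tf_record_iterator(path))
print('total records:', total)  # should match the number of converted images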

(3) Modify the object categories
Since we are detecting a custom object, the category definitions in SSD-Tensorflow-master must be modified. Open the SSD-Tensorflow-master/datasets/pascalvoc_common.py file and edit VOC_LABELS: delete all the irrelevant categories, and add panda's name, ID, and category, as follows:

VOC_LABELS = {
    'none': (0, 'Background'),
    'panda': (1, 'Animal'),
}

4. Download the pre-trained model
SSD-Tensorflow provides pre-trained models based on the VGG network (for details of the VGG model, see the article: Classic CNN Models Explained: VGG).
These pre-trained model files are hosted on drive.google.com, so a direct download may fail in some regions; you will have to find your own way to reach it. Download the SSD-300 VGG-based pre-trained model to get the file VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt.zip, then unzip it.
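For example, unzipping into a model_pre_train folder lines up with the CHECKPOINT_PATH used by the training command below (the folder name is simply a convention adopted in this article):

# Unzip the pre-trained checkpoint into the folder used in the training step
unzip VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt.zip -d ./model_pre_train/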

5. Training the model
At last, the annotation files and the SSD code are both ready, and we can start training.
Before training, one parameter needs to be changed. Open SSD-Tensorflow-master/train_ssd_network.py and find the DATA_FORMAT parameter: for CPU training set it to NHWC, for GPU training set it to NCHW, as follows:

DATA_FORMAT = 'NCHW'  # gpu
# DATA_FORMAT = 'NHWC'    # cpu

Now we can finally start training. Open a terminal and activate the conda virtual environment:

source activate tensorflow

Then execute the following command to start training

# Fine-tune from the pre-trained vgg_ssd_300 model
DATASET_DIR=./panda_tfrecord
TRAIN_DIR=./panda_model
CHECKPOINT_PATH=./model_pre_train/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt
python3 SSD-Tensorflow-master/train_ssd_network.py \
    --train_dir=${TRAIN_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=pascalvoc_2007 \
    --dataset_split_name=train \
    --model_name=ssd_300_vgg \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --save_summaries_secs=60 \
    --save_interval_secs=600 \
    --weight_decay=0.0005 \
    --optimizer=adam \
    --learning_rate=0.0001 \
    --batch_size=16

Set batch_size according to your machine's capability: the larger the value, the more images are processed per batch and the more memory and compute are required. On a modest machine it can be lowered to 8 or even 4 (owners of high-end hardware may ignore this).
The learning rate learning_rate can also be tuned to the situation. A smaller learning rate generally converges more precisely but lengthens training; a larger one shortens training but can reduce accuracy.

Because we start from the pre-trained model, SSD freezes some of the VGG parameters during training, so training completes in a relatively short time.

6. Use the model
The SSD model has been trained; now let's put it to use, which is also very simple.
SSD-Tensorflow-master ships with a notebook under notebooks that runs the model directly through jupyter.
First install jupyter:

conda install jupyter

Then start jupyter-notebook as follows:

jupyter-notebook SSD-Tensorflow-master/notebooks/ssd_notebook.ipynb

Once the notebook is open, set the path and name of the trained model in the "SSD 300 Model" code cell.

Then, in the final code cell, set the path of the images you want to test.
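For reference, the two edits look roughly like this. The paths and the checkpoint step number are illustrative, and the snippet relies on the notebook's own imports (os, mpimg) and on its helpers process_image and visualization.plt_bboxes:

# In the "SSD 300 Model" cell: point the checkpoint at our trained model
ckpt_filename = '../panda_model/model.ckpt-XXXX'  # illustrative path and step

# In the final cell: point the image path at the photos to test
path = '../panda_test/'
image_names = sorted(os.listdir(path))
img = mpimg.imread(path + image_names[0])
rclasses, rscores, rbboxes = process_image(img)
visualization.plt_bboxes(img, rclasses, rscores, rbboxes)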

Then open the "Cell" menu and click the "Run All" submenu to execute all the cells in order and display the results.

After execution, the cute panda is circled in the output image.

Through the steps above, we have used our own data to train an object detection model. Whenever a new detection need arises, simply collect and annotate a relevant image set, then train on it, and you will have a customized object detection model. Very convenient. I hope this case study is helpful.

 

Follow my official account "Big Data and Artificial Intelligence Lab" (BigdataAILab), then reply with the keyword "code" to get the complete source code.

 
