I. Overview:
1. Download the SSD model code from GitHub
2. Prepare your own VOC-format dataset
3. Build the PyTorch environment
4. Run the model in PyCharm with the PyTorch framework
5. Problems encountered and their solutions
6. Check the model's prediction results and evaluation metrics
II. Computer configuration used in this article:
Hardware: Lenovo Legion R9000P 2021 (R7 5800H/16GB/512GB/RTX 3070)
Software: Windows 10, Anaconda3, Python 3.8.0, PyCharm
PyTorch configuration: PyTorch 1.7.1 and torchvision 0.8.2
III. Detailed steps:
1. Download the SSD model code from GitHub
Link: https://github.com/bubbliiiiing/ssd-pytorch
First, go to GitHub and download the corresponding repository. After downloading, unzip it and open the folder in your IDE.
Note that the project must be opened with the correct root directory (the directory where the files are stored); otherwise the relative paths in the code will not resolve and it will fail to run.
2. Prepare your own VOC dataset
This article uses the VOC format for training. Before training, you need to prepare your own dataset. If you don't have one, you can download the VOC12+07 dataset via the GitHub link and try it out.
VOC12+07 dataset link:
① Before training, put the annotation (XML) files into Annotation under the VOC2007 folder inside the VOCdevkit folder.
② Before training, put the image files into JPEGImages under the VOC2007 folder inside the VOCdevkit folder.
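Under these conventions, the dataset layout looks like the sketch below (the file names are only illustrative):

```
VOCdevkit/
└── VOC2007/
    ├── Annotation/     # XML label files, e.g. 000001.xml
    └── JPEGImages/     # image files, e.g. 000001.jpg
```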
Next comes processing the dataset:
After arranging the dataset, the next step is to process it in order to generate the 2007_train.txt and 2007_val.txt files used for training. This is done with voc_annotation.py in the root directory.
There are a few parameters in voc_annotation.py that need to be set.
They are annotation_mode, classes_path, trainval_percent, train_percent, and VOCdevkit_path. For a first training run, only classes_path needs to be modified; it should point to the newly created cls_classes.txt in the model_data folder (whose content is the class labels we want to train on).
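As a sketch of what cls_classes.txt looks like: it simply lists one class name per line. The snippet below writes such a file, using two hypothetical class labels (replace them with the classes of your own dataset):

```python
from pathlib import Path

# Hypothetical class labels -- replace with the classes in your own dataset.
classes = ["cat", "dog"]

path = Path("model_data") / "cls_classes.txt"
path.parent.mkdir(exist_ok=True)
# One class name per line, matching what voc_annotation.py reads via classes_path.
path.write_text("\n".join(classes) + "\n")

print(path.read_text())
```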
3. Build the PyTorch environment
The PyTorch configuration I use here is PyTorch 1.7.1 with torchvision 0.8.2. For the specific setup steps, see the following blog, link:
4. Use the PyTorch framework to run the model in PyCharm
Once the environment is configured, you can open PyCharm and start training the model!
Training step 1: Import your own dataset
Training step 2: Modify the model parameters
Training step 3: Debug
Training step 4: Start training
Training step 5: After training completes, evaluate the model's detection accuracy
For the detailed steps, see my other blog,
or follow Bubbliiiiing's tutorials on Bilibili~
5. Problems encountered and solutions
This section covers the Debug step mentioned above!
Problem 1: Training fails before the first epoch starts, or after several epochs (at which point loading the training data becomes very slow), with the following error: RuntimeError: CUDA out of memory. Tried to allocate 2.14 GiB (GPU 0; 8.00 GiB total capacity; 279.45 MiB already allocated; 5.81 GiB free; 338.00 MiB reserved in total by PyTorch)
Note: there is enough system memory; it is the GPU memory that is being exhausted.
Solution: reduce the size of the input image as follows, and use a smaller Batch_Size.
① In utils_u.py, change img_min_side to 224
② In dataloader.py, change input_shape to [224, 224]
③ In train.py, change input_shape to [224, 224] and reduce batch_size; I set it to 1 here.
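The general pattern behind this fix, catching the out-of-memory error and retrying with a smaller batch, can be sketched in plain Python. Here train_one_epoch is a hypothetical stand-in for the real training call, and the "memory limit" is simulated:

```python
def train_one_epoch(batch_size, max_fit=4):
    """Stand-in for the real training step: pretend any batch larger
    than `max_fit` exhausts GPU memory, as CUDA would."""
    if batch_size > max_fit:
        raise RuntimeError("CUDA out of memory")
    return f"trained with batch_size={batch_size}"

def train_with_backoff(batch_size):
    # Halve the batch size until training fits in memory, down to 1.
    while batch_size >= 1:
        try:
            return train_one_epoch(batch_size)
        except RuntimeError as err:
            if "out of memory" not in str(err) or batch_size == 1:
                raise
            batch_size //= 2
    raise RuntimeError("could not fit even batch_size=1")

print(train_with_backoff(16))  # falls back 16 -> 8 -> 4, then succeeds
```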
Run again, problem solved and ready to train!
Problem 2: After training for a while, the following error occurred:
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Solution:
In the train.py file, add the following code:
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
Training then ran through without further errors; problem solved!
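One detail worth noting (a common pitfall with this workaround, not something stated in the original error message): the environment variable should be set before importing the libraries that load the OpenMP runtime, so the two lines belong at the very top of train.py:

```python
import os

# Must run before importing numpy/torch, which load the OpenMP runtime.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Only now import the heavy libraries, e.g.:
# import torch

print(os.environ["KMP_DUPLICATE_LIB_OK"])
```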
6. Check the prediction effect of the model and check the detection indicators
Testing the model's predictions:
Modify the predict.py file
How to view the metrics:
① Modify the frcnn.py file so that it points to the newly generated .pth file, and change classes_path to point to cls_classes.txt in the model_data folder.
② In get_map.py, set classes_path to cls_classes.txt in the model_data folder.
③ Run get_map.py. You can then view the model's metrics, such as its training accuracy, in the result folder that is generated.
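Under the hood, detection metrics such as mAP are built on the intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal sketch, with boxes given as [xmin, ymin, xmax, ymax] (the function name and box format here are illustrative, not from the repository):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as [xmin, ymin, xmax, ymax]."""
    # Coordinates of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # 25 / 175 ≈ 0.1429
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold (0.5 for the usual VOC mAP), and get_map.py aggregates these matches per class.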