Notes: Problems encountered while reproducing the SSD algorithm

1. Undefined symbol: _ZN6caffe26detail36_typeMetaDataInstance_preallocated_7E
There are many variants of the "xxxx has an undefined symbol" error. Uninstalling torch and torchvision and then reinstalling them fixes it. This is also what the original author of the code suggested on GitHub; I had already solved the problem this way before his reply arrived.
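One plausible explanation is that the installed torch and torchvision binaries did not match each other, which a clean reinstall would correct. The following is a minimal sketch for listing the installed versions before and after reinstalling; the helper names `installed_version` and `report_versions` are hypothetical and not part of the SSD repository.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("torch", "torchvision")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so the pair can be compared at a glance.
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reads package metadata, so it still works when `import torch` itself fails with the undefined-symbol error.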
2. I followed the example given by the author:
python demo.py --config-file configs/vgg_ssd300_voc0712.yaml --images_dir demo --ckpt https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth
After the pre-trained model was downloaded, the output indicated that the file was incomplete. According to posts online, the model file may be corrupted during FTP transfer. After repeated unsuccessful attempts, I found the model elsewhere on the Internet, downloaded it, uploaded it to the server, and the demo then ran successfully.
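Since the suspected cause is corruption during transfer, one way to check is to compare a checksum of the file on the local machine with a checksum of the copy on the server. This is a minimal sketch using only the standard library; the function names are hypothetical, and no reference checksum for the model file is assumed to be published.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large .pth file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """Return True when both files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Run `sha256_of_file` once on each machine and compare the two strings; if they differ, the upload changed the file and should be repeated.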
3. During training, specifying GPU No. 1 on node 01 caused a segmentation fault. After switching to node 02, training ran successfully. This suggests that the GPU is not specified in the code itself; on node 01 it should have been GPU No. 0.
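The notes do not say how the GPU was selected. One common way, shown here only as a sketch under that assumption, is the CUDA environment variable `CUDA_VISIBLE_DEVICES`, which restricts the devices a process can see. The helper name `env_for_gpu` is hypothetical.

```python
import os


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one GPU.

    The copy can be passed as `env=` to subprocess.run when launching
    a training script, leaving the current process untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


if __name__ == "__main__":
    # Show the value that a launched training process would receive.
    print(env_for_gpu(0)["CUDA_VISIBLE_DEVICES"])
```

Note that inside a process started this way, the single visible GPU is always addressed as device 0, whatever its physical index on the node.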
4. The project is relatively large and has many files, so it is often unclear how to modify the configuration to fit your own dataset. A good starting point is the training example given by the author: follow it to find out which files it actually uses.

Origin blog.csdn.net/qq_41872271/article/details/104697043