I. Introduction
There are two networks called CenterNet. The one covered in this article is from the paper: Objects as Points
Its GitHub repository is: xingyizhou/CenterNet
For the Win10 version, see: [Target detection] Win10+CUDA10.0+CUDNN7.5 builds CenterNet environment
The following two blog posts also helped me a lot:
Training CenterNet network on the SeaShips dataset
(absolutely detailed) CenterNet training its own data (pytorch0.4.1)
The computer environment is:
Ubuntu 16.04
GPU RTX2070 Advanced OC 8G
GPU driver 418.87.00
gcc version 5.4.0
CUDA 10.0.130
CUDNN 7.6.0
python 3.6.9
pytorch 1.1.0
torchvision 0.3.0
II. About the CUDA version
Before I changed the CUDA version, the visualization code and the test code ran without problems:
Visualization:
python demo.py ctdet --demo /home/vincent/Code/CenterNet/images/ --load_model /home/vincent/Code/CenterNet/models/ctdet_coco_dla_2x.pth
Test:
python test.py ctdet --exp_id coco_dla --keep_res --load_model ../models/ctdet_coco_dla_2x.pth
But when I run the training code, I get an error:
Training:
python main.py ctdet --exp_id coco_dla --batch_size 5 --lr 1.25e-4 --gpus 0 --num_workers 0
The last line of the error output:
RuntimeError: cuda runtime error (11): invalid argument at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp:663
For the complete error output, click here
I could not find the cause for a long time. In the end I followed the issue Error for run demo.py #356 and changed the CUDA version, after which everything worked, so I currently use CUDA 10.0.
III. Environment installation
1. Install CUDA and CUDNN
1.1 Download
CUDA download URL: CUDA Toolkit 10.0 Archive
Both files (the base installer and the update package) need to be downloaded.
CUDNN download URL: cuDNN Archive
Select Download cuDNN v7.6.0 (May 20, 2019), for CUDA 10.0, then download cuDNN Library for Linux.
1.2 Install CUDA
In a terminal, change to the directory containing the CUDA files.
First install the base package:
sudo sh cuda_10.0.130_410.48_linux.run
During installation you will be asked several questions about optional components and install locations. At the second prompt, decline to install the NVIDIA driver (the system already has one); for the rest, select yes or press Enter.
Then install the update package:
sudo sh cuda_10.0.130.1_linux.run
Verify that CUDA was installed successfully. Run in the terminal:
nvcc -V
Under normal circumstances, it will display:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
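If you would rather check the CUDA version from a script than by eye, the output above can be parsed. This is only a minimal sketch in Python: the function name parse_nvcc_release is made up for illustration, and the regular expression assumes the "release X.Y, VX.Y.Z" line looks like the one shown above.

```python
import re

def parse_nvcc_release(nvcc_output):
    """Return (release, full_version) from `nvcc -V` text, or None if not found."""
    # Matches a line such as "Cuda compilation tools, release 10.0, V10.0.130"
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+\.\d+\.\d+)", nvcc_output)
    if match is None:
        return None
    return match.group(1), match.group(2)

# The sample text is the output shown above.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130"""

print(parse_nvcc_release(sample))  # ('10.0', '10.0.130')
```

In practice the text could come from subprocess.run(["nvcc", "-V"], ...) instead of a hard-coded string.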
1.3 Install CUDNN
In a terminal, change to the directory containing the cuDNN file:
tar xvf cudnn-10.0-linux-x64-v7.6.0.64.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
Verify that cuDNN was installed successfully. Run in the terminal:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
Under normal circumstances:
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0
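The same check can be done from Python by reading the three #define lines out of cudnn.h. The sketch below is an assumption-laden helper (parse_cudnn_version is a hypothetical name) that only relies on the header containing lines in the form shown above.

```python
import re

def parse_cudnn_version(header_text):
    """Return 'MAJOR.MINOR.PATCHLEVEL' from cudnn.h text, or None if a field is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7"
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)

# Sample taken from the grep output above.
sample_header = """#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0"""

print(parse_cudnn_version(sample_header))  # 7.6.0
```

To check a real installation, pass in the contents of /usr/local/cuda/include/cudnn.h (the path used in the copy commands above).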
2. Download the source code
git clone https://github.com/xingyizhou/CenterNet.git
3. Create a virtual environment with Conda
conda create --name CenterNet python=3.6
Select yes and wait for the base packages to be installed. After the installation is complete, switch to the CenterNet virtual environment:
conda activate CenterNet
4. Install other python libraries
4.1 Install requirements.txt
cd CenterNet
pip install -r requirements.txt
If the download is too slow, you can append -i followed by a domestic mirror URL to the end of the command, for example:
pip install -r requirements.txt -i https://pypi.douban.com/simple/
(1) Ali Cloud: http://mirrors.aliyun.com/pypi/simple/
(2) Douban: https://pypi.douban.com/simple/
(3) Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
(4) University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
Also install Pillow:
pip install pillow -i https://pypi.douban.com/simple/
4.2 Install PyTorch and torchvision
Method 1: Command line installation
conda install pytorch=1.1 torchvision
If the command-line installation goes well, it installs pytorch 1.1.0 and torchvision 0.3.0.
If it is very slow, you can switch conda to a domestic mirror as follows:
gedit ~/.condarc
Then replace the contents with the following and save:
channels:
- https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
- https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
- defaults
show_channel_urls: true
Then re-run the conda install command.
Method 2: whl installation
If the download fails or is painfully slow, install manually from whl files: whl download address
Search for and download torch-1.1.0-cp36-cp36m-linux_x86_64.whl
and torchvision-0.3.0-cp36-cp36m-linux_x86_64.whl
In a terminal, change to the download directory, then run:
pip install torch-1.1.0-cp36-cp36m-linux_x86_64.whl
pip install torchvision-0.3.0-cp36-cp36m-linux_x86_64.whl
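A whl file only installs on the Python version named in its file name (cp36 means CPython 3.6), so it can be worth checking the name before running pip. The following is a rough sketch: wheel_tags and matches_interpreter are hypothetical helper names, and the code only handles a single CPython tag such as cp36, not compound tags like py2.py3.

```python
import sys

def wheel_tags(filename):
    """Split a wheel file name into (name, version, python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    fields = filename[:-len(".whl")].split("-")
    if len(fields) == 6:
        # An optional build tag sits between the version and the python tag; drop it.
        del fields[2]
    if len(fields) != 5:
        raise ValueError("unexpected wheel name: " + filename)
    return tuple(fields)

def matches_interpreter(filename, version_info=sys.version_info):
    """True if the wheel's CPython tag (e.g. 'cp36') matches the given interpreter version."""
    python_tag = wheel_tags(filename)[2]
    return python_tag == "cp{}{}".format(version_info[0], version_info[1])

print(wheel_tags("torch-1.1.0-cp36-cp36m-linux_x86_64.whl"))
print(matches_interpreter("torch-1.1.0-cp36-cp36m-linux_x86_64.whl"))
```

Run inside the CenterNet environment created with python=3.6, the second line should report a match for both whl files named above.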
Verify that the installation was successful:
After installation, run conda list in a terminal and check that pytorch and torchvision are listed.
Verify that PyTorch can use CUDA and cuDNN by running the following code in Python:
import torch
a = torch.tensor(1.)
print(a.cuda())
from torch.backends import cudnn
print(cudnn.is_available())
print(cudnn.is_acceptable(a.cuda()))
If it is normal, it should return:
tensor(1., device='cuda:0')
True
True
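If the check above fails, it helps to see a bit more detail than a bare exception. The sketch below collects the same information without crashing when PyTorch or a GPU is missing. The function name cuda_report is my own; the torch calls it uses (torch.version.cuda, torch.cuda.is_available, torch.cuda.get_device_name, torch.backends.cudnn.version) are standard PyTorch attributes.

```python
def cuda_report():
    """Collect basic CUDA/cuDNN facts from PyTorch without raising if something is missing."""
    report = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds)
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
        report["cudnn_version"] = torch.backends.cudnn.version()
    return report

for key, value in cuda_report().items():
    print(key, "=", value)
```

On the setup described in this article, cuda_build would be expected to read 10.0; a different value suggests the installed PyTorch package was built for another CUDA version.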
4.3 Install COCOAPI
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
python setup.py install --user
5. Compile
Remember to change the path to your own!
cd ~/Code/CenterNet/src/lib/external
python setup.py build_ext --inplace
If no error is reported, the build succeeded.
6. Install DCNv2
Remember to change the path to your own!
cd ~/Code/CenterNet/src/lib/models/networks
rm -r DCNv2
git clone https://github.com/CharlesShang/DCNv2.git
cd DCNv2
python setup.py build develop
A lot of output will be printed. As long as no error is reported, the output normally ends with:
Processing dependencies for DCNv2==0.1
Finished processing dependencies for DCNv2==0.1
If you get an error at import _ext as _backend
reporting undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs
it is most likely because you are using Python 3.7; switch to Python 3.6. The full error output looks like this:
Fix size testing.
training chunk_sizes: [4]
The output will be saved to /home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/../../exp/ctdet/default/rgb
################## Dataset about rgb ##################
Traceback (most recent call last):
  File "main.py", line 12, in <module>
    from models.model import create_model, load_model, save_model
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/model.py", line 12, in <module>
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/pose_dla_dcn.py", line 16, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/DCNv2/dcn_v2.py", line 13, in <module>
    import _ext as _backend
ImportError: /home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/DCNv2/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs
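The name of the compiled file in the last line of this traceback already shows which Python it was built for: cpython-37m means CPython 3.7. As a small sketch (extension_python_version is a hypothetical helper, and it only reads the file name, it does not inspect the binary), the tag can be compared against the interpreter you are running:

```python
import re
import sys

def extension_python_version(so_name):
    """Return (major, minor) from a name like '_ext.cpython-37m-x86_64-linux-gnu.so', or None."""
    match = re.search(r"\.cpython-(\d)(\d+)m?-", so_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

so_name = "_ext.cpython-37m-x86_64-linux-gnu.so"
built_for = extension_python_version(so_name)
print("extension built for Python", built_for)  # (3, 7)
print("running Python", tuple(sys.version_info[:2]))
```

If the tag says 3.7, the extension was compiled under Python 3.7, which is the situation described above; rebuilding DCNv2 inside the Python 3.6 environment produces a cpython-36m file instead.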
7. Download pretrained models
Link: https://pan.baidu.com/s/1QOmIwy8lXJBuLv5hH5j3ag
Extraction code: vwk4
(The files above come from the blog post "(absolutely detailed) CenterNet training its own data (pytorch0.4.1)")
(More official trained models are here)
After downloading, put ctdet_coco_dla_2x.pth into the models folder.
After completing this step, you can run the visualization code. In a terminal, enter the src folder and run:
python demo.py ctdet --demo /home/vincent/Code/CenterNet/images/ --load_model /home/vincent/Code/CenterNet/models/ctdet_coco_dla_2x.pth
Note that the file paths must be changed to your own!
Normally, an image window will pop up. Press any key to switch to the next image, and press Esc to exit.
(If you close the window with the mouse by clicking the cross in the upper left corner, the program keeps running and does not actually stop.)
8. Download the data set
The official data preparation process is here: Dataset preparation
Because I only need 2D object detection, I only need to download the COCO dataset.
- Download image data sets: 2017 Train, 2017 Val, 2017 Test
- Download the annotation file: 2017 train/val and test image info
- Arrange the files as follows:
${CenterNet_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- instances_train2017.json
        |   |-- instances_val2017.json
        |   |-- person_keypoints_train2017.json
        |   |-- person_keypoints_val2017.json
        |   |-- image_info_test-dev2017.json
        |---|-- train2017
        |---|-- val2017
        `---|-- test2017
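Before starting a long training run, it can save time to confirm that the files really are where the layout above expects them. This is a minimal sketch: missing_coco_paths and EXPECTED_COCO_PATHS are names I made up, the list is copied from the layout above, and the root path in the example is the one used in this article (change it to your own).

```python
import os

# Paths relative to the CenterNet root, taken from the layout above.
EXPECTED_COCO_PATHS = [
    "data/coco/annotations/instances_train2017.json",
    "data/coco/annotations/instances_val2017.json",
    "data/coco/annotations/person_keypoints_train2017.json",
    "data/coco/annotations/person_keypoints_val2017.json",
    "data/coco/annotations/image_info_test-dev2017.json",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/coco/test2017",
]

def missing_coco_paths(centernet_root):
    """Return the expected files/folders that do not exist under centernet_root."""
    return [path for path in EXPECTED_COCO_PATHS
            if not os.path.exists(os.path.join(centernet_root, path))]

# Assumed location of the repository; adjust to your own checkout.
root = os.path.expanduser("~/Code/CenterNet")
for path in missing_coco_paths(root):
    print("missing:", path)
```

An empty result means every file and folder in the layout was found; the check does not look inside the image folders or validate the JSON files.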
9. Training code
Enter the src folder and run the training code:
python main.py ctdet --exp_id coco_dla --batch_size 5 --lr 1.25e-4 --gpus 0 --num_workers 0
At this point, the file dla34-ba72cf86.pth will start downloading, but very slowly. Press Ctrl+C to terminate it and download the file manually.
9.1 Find the file storage directory
In the error message you can see where the file was going to be downloaded. Note this path; for example, mine is /home/vincent/.torch/models/
(you can also run locate .torch in the terminal, and it will show where the folder is)
9.2 Download the file that failed to download
Link: https://pan.baidu.com/s/1I1oW_l2Xe2-LV1gIjViPTg
Extraction code: 2pt0
(The file above comes from the blog post "(absolutely detailed) CenterNet training its own data (pytorch0.4.1)")
After downloading, put it in the directory of step 9.1.
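Copying the file by hand works fine; if you prefer to script it, something like the following would do. It is only a sketch: install_checkpoint is a hypothetical helper, and both paths in the example call are assumptions (the download folder is a guess, and the cache folder should be whatever you found in step 9.1).

```python
import os
import shutil

def install_checkpoint(downloaded_file, cache_dir):
    """Copy a manually downloaded weight file into the cache folder and return the new path."""
    os.makedirs(cache_dir, exist_ok=True)  # create the folder if it does not exist yet
    target = os.path.join(cache_dir, os.path.basename(downloaded_file))
    shutil.copy(downloaded_file, target)
    return target

# Example call (both paths are assumptions; use the folder found in step 9.1):
# install_checkpoint(os.path.expanduser("~/Downloads/dla34-ba72cf86.pth"),
#                    os.path.expanduser("~/.torch/models"))
```

The file name is kept unchanged on purpose, since the training code looks for dla34-ba72cf86.pth by that exact name.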
Now run the training code again. If everything is normal, training starts and prints its progress.
10. Test the code
Enter the src folder and run the test code:
python test.py ctdet --exp_id coco_dla --keep_res --load_model ../models/ctdet_coco_dla_2x.pth
The mAP will then be calculated and displayed.
11. Remarks
- Remember to change the paths in the commands above to your own;
- If you restart the terminal, remember to run conda activate CenterNet to enter the virtual environment;
- If the --num_workers 0 parameter is not added to the training command, the error EOFError: Ran out of input will generally be reported. If you still get the error after adding it (it may occur during validation), change num_workers=1 on line 48 of main.py to num_workers=0;
- If you find that the system lacks some software during installation (for example, it reports that gcc is missing, which causes the compilation to fail), search Baidu for the relevant tutorial and install it.
At this point, the CenterNet environment under Ubuntu16.04+RTX2070+CUDA10+pytorch1.1 has been built.