[Object detection] Building a CenterNet environment on Ubuntu16.04+RTX2070+CUDA10.0+pytorch1.1

I. Introduction

There are two different CenterNets. The one covered in this article corresponds to the paper: Objects as Points
The corresponding GitHub repository is: xingyizhou/CenterNet

For the Win10 version, see: [Target detection] Win10+CUDA10.0+CUDNN7.5 builds CenterNet environment

The following two blogs have also helped me a lot:
Training CenterNet network on the SeaShips dataset
(absolutely detailed) CenterNet training its own data (pytorch0.4.1)

The computer environment is:

Ubuntu 16.04
GPU RTX2070 Advanced OC 8G
GPU driver 418.87.00
gcc version 5.4.0
CUDA 10.0.130
CUDNN 7.6.0
python 3.6.9
pytorch 1.1.0
torchvision 0.3.0
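
As a rough way to compare an existing setup with the versions listed above, the following minimal sketch may help. The helper names (installed_version, version_mismatches) are made up for this illustration, and it assumes each package exposes a __version__ attribute.

```python
import importlib

# Versions listed in this post; adjust them if your setup differs.
EXPECTED = {"torch": "1.1.0", "torchvision": "0.3.0"}


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        return None
    return getattr(module, "__version__", None)


def version_mismatches(expected, actual):
    """Return {name: (wanted, found)} for every package whose version differs.

    Only the leading part is compared, so a local build suffix such as
    "1.1.0+cu100" would still count as matching "1.1.0".
    """
    mismatches = {}
    for name, wanted in expected.items():
        found = actual.get(name)
        if found is None or not found.startswith(wanted):
            mismatches[name] = (wanted, found)
    return mismatches


actual = {name: installed_version(name) for name in EXPECTED}
for name, (wanted, found) in version_mismatches(EXPECTED, actual).items():
    print(f"{name}: expected {wanted}, found {found}")
```

Nothing is printed when both packages match; a package that is not installed is reported with "found None".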

II. About the CUDA version

Before I changed the CUDA version, the visualization code and the test code ran without problems:
Visualization:

python demo.py ctdet --demo /home/vincent/Code/CenterNet/images/ --load_model /home/vincent/Code/CenterNet/models/ctdet_coco_dla_2x.pth

test:

python test.py ctdet --exp_id coco_dla --keep_res --load_model ../models/ctdet_coco_dla_2x.pth

But when I ran the training code, I got an error:
Training:

python main.py ctdet --exp_id coco_dla --batch_size 5  --lr 1.25e-4 --gpus 0 --num_workers 0

The last line of the error:
RuntimeError: cuda runtime error (11): invalid argument at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp:663
For the complete error content, click here.
I could not find the cause for a long time. In the end I followed Error for run demo.py #356 and changed the CUDA version, after which it worked, so I currently use CUDA 10.0.

III. Environment installation

1. Install CUDA and CUDNN
1.1 Download

CUDA download URL: CUDA Toolkit 10.0 Archive
Both files need to be downloaded: the basic package (cuda_10.0.130_410.48_linux.run) and the update package (cuda_10.0.130.1_linux.run).
CUDNN download URL: cuDNN Archive
Select Download cuDNN v7.6.0 (May 20, 2019), for CUDA 10.0, then download cuDNN Library for Linux.

1.2 Install CUDA

In a terminal, change to the directory containing the downloaded CUDA files.
First install the basic package:

sudo sh cuda_10.0.130_410.48_linux.run

During installation you will be asked several questions about optional components and install locations. At the second one, choose not to install the NVIDIA driver (the system already has one); for the rest, select yes or press Enter.
Then install the update package:

sudo sh cuda_10.0.130.1_linux.run

Verify that CUDA is installed successfully.
Run nvcc -V in the terminal. Under normal circumstances, it will display:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
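
If you want to check the release number from a script rather than by eye, here is a minimal sketch that parses the output shown above. The function name parse_nvcc_version is hypothetical, and the regular expression assumes the "release X.Y, VX.Y.Z" wording of this sample.

```python
import re

# Output copied from the nvcc -V run shown in this post.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130"""


def parse_nvcc_version(text):
    """Return (release, full_version) from nvcc -V output, or None."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+\.\d+\.\d+)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_nvcc_version(SAMPLE))  # ('10.0', '10.0.130')
```

To use it on a live system, the text could be captured with subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE, universal_newlines=True).stdout and passed to the same function.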

1.3 Install CUDNN

In a terminal, change to the directory containing the downloaded CUDNN file:

tar xvf cudnn-10.0-linux-x64-v7.6.0.64.tgz 
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/ 
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

To verify that CUDNN is installed successfully, run in the terminal:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

Under normal circumstances:

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0
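
The same check can be sketched in Python by reading the three defines from the header text. The function name parse_cudnn_version is made up for this example, and the sketch assumes the version macros sit in cudnn.h as in the output above.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) read from cudnn.h text, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7" with a plain integer value.
        pattern = r"^#define\s+" + key + r"\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


SAMPLE = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 6\n#define CUDNN_PATCHLEVEL 0\n"
print(parse_cudnn_version(SAMPLE))  # (7, 6, 0)
```

On a real installation the text would come from reading /usr/local/cuda/include/cudnn.h, the path used in the commands above.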

2. Download the source code

git clone https://github.com/xingyizhou/CenterNet.git

3. Create a virtual environment with Conda

conda create --name CenterNet python=3.6

Confirm with yes and wait for the basic files to be installed. After the installation is complete, switch to the CenterNet virtual environment:

conda activate CenterNet

4. Install other python libraries
4.1 Install requirements.txt

cd CenterNet
pip install -r requirements.txt

If the download speed is too slow, you can add -i followed by the address of a mirror in China at the end of the command, for example:

pip install -r requirements.txt -i https://pypi.douban.com/simple/

(1) Ali Cloud: http://mirrors.aliyun.com/pypi/simple/
(2) Douban: https://pypi.douban.com/simple/
(3) Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
(4) University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
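
If you switch mirrors often, the -i option can be filled in by a small helper. This is only an illustrative sketch: the short keys in MIRRORS and the function pip_install_command are names invented here, while the URLs are the four mirrors listed above.

```python
import sys

# Mirror URLs from the list above; the short keys are made up for this sketch.
MIRRORS = {
    "aliyun": "http://mirrors.aliyun.com/pypi/simple/",
    "douban": "https://pypi.douban.com/simple/",
    "tsinghua": "https://pypi.tuna.tsinghua.edu.cn/simple/",
    "ustc": "https://pypi.mirrors.ustc.edu.cn/simple/",
}


def pip_install_command(args, mirror=None):
    """Build a pip install command, appending -i <url> when a mirror is named."""
    command = [sys.executable, "-m", "pip", "install"] + list(args)
    if mirror is not None:
        command += ["-i", MIRRORS[mirror]]
    return command


print(" ".join(pip_install_command(["-r", "requirements.txt"], mirror="douban")))
```

The returned list can be passed to subprocess.run if you prefer to launch pip from Python instead of typing the command.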

Also install the other required packages:

pip install pillow -i https://pypi.douban.com/simple/

4.2 Install pytorch and torchvision

Method 1: Command line installation

conda install pytorch=1.1 torchvision

If the installation goes well with the command line, pytorch is 1.1.0 and torchvision is 0.3.0.

If the download is very slow, you can switch conda to a mirror in China. The steps are as follows:

gedit ~/.condarc

Then replace the content with the following and save it.

channels:
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
  - https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  - defaults
show_channel_urls: true

Then re-run the conda install command.

Method 2: whl installation
If the download fails or is extremely slow, install manually from whl files: whl download address
Search for and download torch-1.1.0-cp36-cp36m-linux_x86_64.whl and torchvision-0.3.0-cp36-cp36m-linux_x86_64.whl.
In a terminal, enter the download directory, then run:

pip install torch-1.1.0-cp36-cp36m-linux_x86_64.whl
pip install torchvision-0.3.0-cp36-cp36m-linux_x86_64.whl
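
The whl filenames encode which Python they were built for (cp36 here), so it can be worth checking them against your interpreter before installing. The sketch below assumes the common five-part filename form without an optional build tag; both function names are hypothetical.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into (name, version, python_tag, abi_tag, platform).

    Assumes the common five-part form without an optional build tag.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename: " + filename)
    return tuple(parts)


def matches_running_python(filename):
    """True if the wheel's python tag (e.g. cp36) matches this interpreter.

    Only handles CPython tags of the form cpXY.
    """
    python_tag = parse_wheel_filename(filename)[2]
    return python_tag == "cp{}{}".format(*sys.version_info[:2])


print(parse_wheel_filename("torch-1.1.0-cp36-cp36m-linux_x86_64.whl"))
print(matches_running_python("torch-1.1.0-cp36-cp36m-linux_x86_64.whl"))
```

Inside the python 3.6 environment created earlier, the second line should report a match for both files.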

Verify that the installation was successful:
After installation, enter conda list in a terminal and check that pytorch and torchvision are listed.

Verify that pytorch can use CUDA and CUDNN:
Run the following code in python:

import torch
a = torch.tensor(1.)
print(a.cuda())
from torch.backends import cudnn
print(cudnn.is_available())
print(cudnn.is_acceptable(a.cuda()))

If everything is normal, it should print:

tensor(1., device='cuda:0')
True
True

4.3 Install COCOAPI

git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
python setup.py install --user

5. Compile

Be sure to change the path to your own!

cd ~/Code/CenterNet/src/lib/external
python setup.py build_ext --inplace

If no error is reported, the build succeeded.

6. Install DCNv2

Be sure to change the path to your own!

cd ~/Code/CenterNet/src/lib/models/networks
rm -r DCNv2
git clone https://github.com/CharlesShang/DCNv2.git
cd DCNv2
python setup.py build develop

A lot of output will be printed. As long as no error is reported, the last lines will normally be:

Processing dependencies for DCNv2==0.1
Finished processing dependencies for DCNv2==0.1

If you get an error at import _ext as _backend similar to the one below, reporting undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs, you are most likely using python3.7; just change to python3.6.

Fix size testing.
training chunk_sizes: [4]
The output will be saved to  /home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/../../exp/ctdet/default/rgb
################## Dataset about rgb ##################
Traceback (most recent call last):
  File "main.py", line 12, in <module>
    from models.model import create_model, load_model, save_model
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/model.py", line 12, in <module>
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/pose_dla_dcn.py", line 16, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/DCNv2/dcn_v2.py", line 13, in <module>
    import _ext as _backend
ImportError: /home/vincent/Code/CenterNet-cuda10-multi-spectral/src/lib/models/networks/DCNv2/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs
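
The filename of the compiled extension in the traceback (cpython-37m) shows which Python it was built with. The following sketch extracts that tag so you can confirm you are in the python 3.6 environment before rebuilding. It does not diagnose the undefined symbol itself, and the function name extension_python_tag is made up for this example.

```python
import re
import sys


def extension_python_tag(filename):
    """Return the CPython tag in a compiled extension name, e.g. '37' for
    '_ext.cpython-37m-x86_64-linux-gnu.so', or None if there is none."""
    match = re.search(r"\.cpython-(\d+)m?-", filename)
    return match.group(1) if match else None


running = "{}{}".format(*sys.version_info[:2])
built_for = extension_python_tag("_ext.cpython-37m-x86_64-linux-gnu.so")
print("extension built for python tag:", built_for)
print("running python tag:", running)
```

If the running tag is not 36, activate the CenterNet environment and rebuild DCNv2 there.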

7. Download models trained by others

Link: https://pan.baidu.com/s/1QOmIwy8lXJBuLv5hH5j3ag
Extraction code: vwk4
(The source of the above file is the blog post: (absolutely detailed) CenterNet training its own data (pytorch0.4.1))
(More officially trained models are here)

After downloading, put ctdet_coco_dla_2x.pth into the models folder.
After completing this step, you can run the visualization code. In a terminal, enter the src folder and run:

python demo.py ctdet --demo /home/vincent/Code/CenterNet/images/ --load_model /home/vincent/Code/CenterNet/models/ctdet_coco_dla_2x.pth

Note that the file paths must be changed to your own!
Normally, an image window will pop up. Press any key to switch to the next picture, and press Esc to exit.
(If you close the window by clicking the cross in the upper left corner with the mouse, the program keeps running and does not really stop.)

8. Download the data set

The official data preparation process is here: Dataset preparation
Because I only need 2D object detection, I only need to download the COCO data set.

  1. Download image data sets: 2017 Train, 2017 Val, 2017 Test
  2. Download the annotation file: 2017 train/val and test image info
  3. Arrange the files in the following layout:
${CenterNet_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- instances_train2017.json
        |   |-- instances_val2017.json
        |   |-- person_keypoints_train2017.json
        |   |-- person_keypoints_val2017.json
        |   |-- image_info_test-dev2017.json
        | 
        |---|-- train2017
        |---|-- val2017
        `---|-- test2017
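
Before starting training, you may want to confirm that the files ended up where the layout above expects them. This is a minimal sketch: the relative paths are taken from the layout, while the function name missing_paths and the "." placeholder root are assumptions made for illustration.

```python
import os

# Relative paths taken from the layout shown above.
EXPECTED_PATHS = [
    "data/coco/annotations/instances_train2017.json",
    "data/coco/annotations/instances_val2017.json",
    "data/coco/annotations/person_keypoints_train2017.json",
    "data/coco/annotations/person_keypoints_val2017.json",
    "data/coco/annotations/image_info_test-dev2017.json",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/coco/test2017",
]


def missing_paths(centernet_root, expected=EXPECTED_PATHS):
    """Return the expected files/folders that do not exist under the root."""
    return [
        rel for rel in expected
        if not os.path.exists(os.path.join(centernet_root, *rel.split("/")))
    ]


# "." is a placeholder; pass your own CenterNet checkout directory.
for rel in missing_paths("."):
    print("missing:", rel)
```

An empty result means every expected file and folder was found under the given root.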

9. Training code

Enter the src folder and run the training code:

python main.py ctdet --exp_id coco_dla --batch_size 5  --lr 1.25e-4 --gpus 0 --num_workers 0

At this point the file dla34-ba72cf86.pth starts downloading, but the speed is very slow. Press ctrl+c to terminate it and download the file manually.

9.1 Find the file storage directory

The error message shows where the file would be downloaded; make a note of that directory. For example, my file location is /home/vincent/.torch/models/
(you can also enter locate .torch in the terminal, and it will show where the folder is)

9.2 Download files that were not successfully downloaded just now

Link: https://pan.baidu.com/s/1I1oW_l2Xe2-LV1gIjViPTg
Extraction code: 2pt0
(The source of the above file is the blog post: (absolutely detailed) CenterNet training its own data (pytorch0.4.1))

After downloading, put it in the directory of step 9.1.
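
To confirm that the file is where the training code will look for it, a check along these lines can be used. It is only a sketch: find_weight is a hypothetical helper, ~/.torch/models is the directory reported in this post, and the second directory is an assumption about where some later PyTorch versions cache downloads.

```python
import os

WEIGHT_FILE = "dla34-ba72cf86.pth"

# The first directory is the one reported in this post; the second is an
# assumption about where some later PyTorch versions cache downloads.
CANDIDATE_DIRS = [
    os.path.expanduser("~/.torch/models"),
    os.path.expanduser("~/.cache/torch/checkpoints"),
]


def find_weight(filename=WEIGHT_FILE, directories=CANDIDATE_DIRS):
    """Return the first existing path to the weight file, or None."""
    for directory in directories:
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None


print(find_weight() or "weights not found; copy the file into the folder from step 9.1")
```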

At this point, run the training code again. If everything is normal, training will start.

10. Test the code

Enter the src folder and run the test code:

python test.py ctdet --exp_id coco_dla --keep_res --load_model ../models/ctdet_coco_dla_2x.pth

The mAP will then be calculated and displayed.

11. Remarks
  1. Remember to change the paths in the commands above to your own when running them;
  2. If you restart the terminal, remember to run conda activate CenterNet to enter the virtual environment;
  3. If --num_workers 0 is not added to the training command, the error EOFError: Ran out of input will generally be reported.
    If you still get an error after adding it (it may occur during validation), change num_workers=1 on line 48 of main.py to num_workers=0;
  4. If you find that the system lacks some software during installation (for example, a message that gcc is missing, which causes the compilation to fail), search for a tutorial on Baidu and install it.

At this point, the CenterNet environment on Ubuntu16.04+RTX2070+CUDA10+pytorch1.1 is complete.

Origin blog.csdn.net/weixin_38705903/article/details/102598339