Deep Learning Development Environment Setup, Part 2: Configuring cuDNN 7.1.3.16 + TensorRT 4.0.1.6 on Ubuntu 16.04 + CUDA 9.0.176

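1. Installing cuDNN

cuDNN download page: https://developer.nvidia.com/rdp/cudnn-download

Follow the official cuDNN installation guide and install from the prebuilt Debian packages, in this order: the runtime library (libcudnn7), the developer files (libcudnn7-dev), and the documentation and samples (libcudnn7-doc):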

sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo dpkg -i libcudnn7_7.1.3.16-1+cuda9.0_amd64.deb
Selecting previously unselected package libcudnn7.
(Reading database ... 189338 files and directories currently installed.)
Preparing to unpack libcudnn7_7.1.3.16-1+cuda9.0_amd64.deb ...
Unpacking libcudnn7 (7.1.3.16-1+cuda9.0) ...
Setting up libcudnn7 (7.1.3.16-1+cuda9.0) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo dpkg -i libcudnn7-dev_7.1.3.16-1+cuda9.0_amd64.deb
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 189345 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.1.3.16-1+cuda9.0_amd64.deb ...
Unpacking libcudnn7-dev (7.1.3.16-1+cuda9.0) ...
Setting up libcudnn7-dev (7.1.3.16-1+cuda9.0) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo dpkg -i libcudnn7-doc_7.1.3.16-1+cuda9.0_amd64.deb
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 189351 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.1.3.16-1+cuda9.0_amd64.deb ...
Unpacking libcudnn7-doc (7.1.3.16-1+cuda9.0) ...
Setting up libcudnn7-doc (7.1.3.16-1+cuda9.0) ...
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$
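
As a quick check before building anything, the installed header can be inspected for its version macros. The sketch below is only an illustration: the function name cudnn_header_version is hypothetical, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN 7 headers do. The default path /usr/include/cudnn.h is the one reported by update-alternatives in the log above.

```shell
#!/usr/bin/env bash
# Print the cuDNN version recorded in a cudnn.h-style header as MAJOR.MINOR.PATCH.
# Assumption: the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL.
# cudnn_header_version is a hypothetical helper name, not part of cuDNN.
cudnn_header_version() {
    header="${1:-/usr/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            # Fail if any of the three macros was not found.
            if (major == "" || minor == "" || patch == "") exit 1
            print major "." minor "." patch
        }
    ' "$header"
}
```

Calling cudnn_header_version with no argument reads /usr/include/cudnn.h; for the packages installed above it should report the same 7.1.3 that the sample program prints later.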

Build and run the mnistCUDNN sample bundled with cuDNN to verify the installation:

sinc-lab@sinclab-desktop:~$ cp -r /usr/src/cudnn_samples_v7/ /home/sinc-lab/LYH/
sinc-lab@sinclab-desktop:~$ cd  ~/LYH/cudnn_samples_v7/mnistCUDNN
sinc-lab@sinclab-desktop:~/LYH/cudnn_samples_v7/mnistCUDNN$ make -j16
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -IFreeImage/include  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o  -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
sinc-lab@sinclab-desktop:~/LYH/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7103 , CUDNN_VERSION from cudnn.h : 7103 (7.1.3)
Host compiler version : GCC 5.4.0
There are 4 CUDA capable devices on your machine :
device 0 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=1
device 2 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=2
device 3 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=3
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.029696 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.034816 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.142336 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.178400 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.234400 time requiring 207360 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.037888 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.078848 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.089088 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.142176 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.153568 time requiring 203008 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!
sinc-lab@sinclab-desktop:~/LYH/cudnn_samples_v7/mnistCUDNN$

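Both the single-precision and half-precision runs report "Test passed!", so the cuDNN installation works.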
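
The first line of the sample output prints the version as a single integer (7103). A minimal sketch of how that integer maps to 7.1.3, assuming the cuDNN 7 encoding MAJOR*1000 + MINOR*100 + PATCH; the function name decode_cudnn_version is hypothetical:

```shell
#!/usr/bin/env bash
# Decode the integer printed by cudnnGetVersion() into MAJOR.MINOR.PATCH.
# Assumption: the value is encoded as MAJOR*1000 + MINOR*100 + PATCH (cuDNN 7).
decode_cudnn_version() {
    v="$1"
    echo "$(( v / 1000 )).$(( (v % 1000) / 100 )).$(( v % 100 ))"
}

decode_cudnn_version 7103   # prints 7.1.3
```

If the value from cudnnGetVersion() and the value from cudnn.h differ, the runtime library and the headers most likely come from different cuDNN installs.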

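2. Installing TensorRT

Download page: https://developer.nvidia.com/nvidia-tensorrt-download

Follow NVIDIA's official TensorRT installation guide and use the Debian package method. After registering the local repository with dpkg, run apt-get update before apt-get install tensorrt; otherwise apt reports "Unable to locate package tensorrt", as the first attempt in the log below shows: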

sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612_1-1_amd64.deb
Selecting previously unselected package nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612.
(Reading database ... 189403 files and directories currently installed.)
Preparing to unpack nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612_1-1_amd64.deb ...
Unpacking nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612 (1-1) ...
Setting up nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612 (1-1) ...
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo apt-get install tensorrt
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package tensorrt
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo apt-get update
Get:1 file:/var/cuda-repo-9-0-local  InRelease
Ign:1 file:/var/cuda-repo-9-0-local  InRelease
Get:2 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  InRelease
Ign:2 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  InRelease
Get:3 file:/var/cuda-repo-9-0-local  Release [574 B]
Hit:4 http://mirrors.hust.edu.cn/ubuntu xenial InRelease
Hit:5 http://mirrors.hust.edu.cn/ubuntu xenial-security InRelease
Get:3 file:/var/cuda-repo-9-0-local  Release [574 B]
Hit:6 http://mirrors.hust.edu.cn/ubuntu xenial-updates InRelease
Hit:7 http://mirrors.hust.edu.cn/ubuntu xenial-proposed InRelease
Hit:8 http://mirrors.hust.edu.cn/ubuntu xenial-backports InRelease
Get:9 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  Release [574 B]
Get:9 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  Release [574 B]
Get:10 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  Release.gpg [819 B]
Get:10 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  Release.gpg [819 B]
Get:12 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  Packages [3,618 B]
Reading package lists... Done
sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo apt-get install tensorrt
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libnvinfer-dev libnvinfer-samples libnvinfer4
The following NEW packages will be installed:
  libnvinfer-dev libnvinfer-samples libnvinfer4 tensorrt
0 upgraded, 4 newly installed, 0 to remove and 26 not upgraded.
Need to get 0 B/346 MB of archives.
After this operation, 858 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  libnvinfer4 4.1.2-1+cuda9.0 [36.1 MB]
Get:2 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  libnvinfer-dev 4.1.2-1+cuda9.0 [37.4 MB]
Get:3 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  libnvinfer-samples 4.1.2-1+cuda9.0 [271 MB]
Get:4 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  tensorrt 4.0.1.6-1+cuda9.0 [1,509 kB]
Selecting previously unselected package libnvinfer4.
(Reading database ... 189426 files and directories currently installed.)
Preparing to unpack .../libnvinfer4_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking libnvinfer4 (4.1.2-1+cuda9.0) ...
Selecting previously unselected package libnvinfer-dev.
Preparing to unpack .../libnvinfer-dev_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking libnvinfer-dev (4.1.2-1+cuda9.0) ...
Selecting previously unselected package libnvinfer-samples.
Preparing to unpack .../libnvinfer-samples_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking libnvinfer-samples (4.1.2-1+cuda9.0) ...
Selecting previously unselected package tensorrt.
Preparing to unpack .../tensorrt_4.0.1.6-1+cuda9.0_amd64.deb ...
Unpacking tensorrt (4.0.1.6-1+cuda9.0) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
Setting up libnvinfer4 (4.1.2-1+cuda9.0) ...
Setting up libnvinfer-dev (4.1.2-1+cuda9.0) ...
Setting up libnvinfer-samples (4.1.2-1+cuda9.0) ...
Setting up tensorrt (4.0.1.6-1+cuda9.0) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
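
To confirm which TensorRT packages ended up installed, the output of dpkg -l can be filtered. This is only a sketch: list_trt_packages is a hypothetical helper, and the name pattern (nvinfer or tensorrt) is an assumption based on the package names in the log above.

```shell
#!/usr/bin/env bash
# Read `dpkg -l` output on stdin and print "name version" for every package
# in the installed state ("ii") whose name mentions nvinfer or tensorrt.
# list_trt_packages is a hypothetical helper name.
list_trt_packages() {
    awk '$1 == "ii" && $2 ~ /nvinfer|tensorrt/ { print $2, $3 }'
}

# Usage on the target machine:
#   dpkg -l | list_trt_packages
```

Note that, as in the log above, the libnvinfer packages carry version 4.1.2 while the tensorrt meta-package carries 4.0.1.6, so the two numbers are not expected to match.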

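Install the Python interface packages for inference. Installing python-libnvinfer-doc pulls in python-libnvinfer and python-libnvinfer-dev as dependencies: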

sinc-lab@sinclab-desktop:/media/vslyu/home/sinc-lab/Downloads$ sudo apt-get install python-libnvinfer-doc swig
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  python-libnvinfer python-libnvinfer-dev swig3.0
Suggested packages:
  swig-doc swig-examples swig3.0-examples swig3.0-doc
The following NEW packages will be installed:
  python-libnvinfer python-libnvinfer-dev python-libnvinfer-doc swig swig3.0
0 upgraded, 5 newly installed, 0 to remove and 26 not upgraded.
Need to get 1,001 kB/4,100 kB of archives.
After this operation, 17.2 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  python-libnvinfer 4.1.2-1+cuda9.0 [1,036 kB]
Get:2 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  python-libnvinfer-dev 4.1.2-1+cuda9.0 [1,122 B]
Get:3 file:/var/nv-tensorrt-repo-cuda9.0-ga-trt4.0.1.6-20180612  python-libnvinfer-doc 4.1.2-1+cuda9.0 [2,062 kB]
Get:4 http://mirrors.hust.edu.cn/ubuntu xenial/universe amd64 swig3.0 amd64 3.0.8-0ubuntu3 [995 kB]
Get:5 http://mirrors.hust.edu.cn/ubuntu xenial/universe amd64 swig amd64 3.0.8-0ubuntu3 [6,278 B]
Fetched 1,001 kB in 0s (4,652 kB/s)
Selecting previously unselected package python-libnvinfer.
(Reading database ... 191219 files and directories currently installed.)
Preparing to unpack .../python-libnvinfer_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking python-libnvinfer (4.1.2-1+cuda9.0) ...
Selecting previously unselected package python-libnvinfer-dev.
Preparing to unpack .../python-libnvinfer-dev_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking python-libnvinfer-dev (4.1.2-1+cuda9.0) ...
Selecting previously unselected package python-libnvinfer-doc.
Preparing to unpack .../python-libnvinfer-doc_4.1.2-1+cuda9.0_amd64.deb ...
Unpacking python-libnvinfer-doc (4.1.2-1+cuda9.0) ...
Selecting previously unselected package swig3.0.
Preparing to unpack .../swig3.0_3.0.8-0ubuntu3_amd64.deb ...
Unpacking swig3.0 (3.0.8-0ubuntu3) ...
Selecting previously unselected package swig.
Preparing to unpack .../swig_3.0.8-0ubuntu3_amd64.deb ...
Unpacking swig (3.0.8-0ubuntu3) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up python-libnvinfer (4.1.2-1+cuda9.0) ...
Setting up python-libnvinfer-dev (4.1.2-1+cuda9.0) ...
Setting up python-libnvinfer-doc (4.1.2-1+cuda9.0) ...
Setting up swig3.0 (3.0.8-0ubuntu3) ...
Setting up swig (3.0.8-0ubuntu3) ...
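
A simple way to check the Python bindings is to try importing them. The sketch below assumes the module installed by python-libnvinfer is named tensorrt and that the matching interpreter is the system Python 2 (python); both are assumptions, and python_can_import is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return success if the given Python interpreter can import the given module.
# python_can_import is a hypothetical helper; the module name "tensorrt" in
# the usage example is an assumption about what python-libnvinfer installs.
python_can_import() {
    interpreter="$1"
    module="$2"
    "$interpreter" -c "import $module" >/dev/null 2>&1
}

# Usage on the target machine:
#   if python_can_import python tensorrt; then
#       echo "tensorrt is importable"
#   else
#       echo "tensorrt is NOT importable"
#   fi
```

Taking the interpreter as an argument makes it easy to run the same check against another Python on the machine.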

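Verify that TensorRT installed successfully:

Method 1: see section 4.1, "Debian Installation", of the TensorRT installation guide; the steps are not repeated here.

Method 2 (recommended): build and run the bundled samples: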

$ cp -r /usr/src/tensorrt/ ~/LYH
$ cd ~/LYH/tensorrt/samples 
$ make -j16
$ cd ../bin
$ ./sample_int8 mnist

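A successful run produces output similar to the following. The FP16 run reports "Engine could not be created at this precision", most likely because the GPUs used here (compute capability 6.1) lack fast FP16 support; the FP32 and INT8 results are the ones to check: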

sinc-lab@sinclab-desktop:~/LYH/tensorrt/bin$ ./sample_int8 mnist

FP32 run:400 batches of size 100 starting at 100
........................................
Top1: 0.9904, Top5: 1
Processing 40000 images averaged 0.0019213 ms/image and 0.19213 ms/batch.

FP16 run:400 batches of size 100 starting at 100
Engine could not be created at this precision

INT8 run:400 batches of size 100 starting at 100
........................................
Top1: 0.9908, Top5: 1
Processing 40000 images averaged 0.00145806 ms/image and 0.145806 ms/batch.
sinc-lab@sinclab-desktop:~/LYH/tensorrt/bin$
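
To compare the precisions at a glance, the per-batch timings can be pulled out of the sample_int8 output. This is a sketch that assumes the output format shown above (a "<precision> run:" line followed later by a "ms/batch" line); trt_batch_times is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read sample_int8 output on stdin and print "<precision> <ms per batch>"
# for every run that reports a timing. Runs that could not be built
# (such as FP16 in the log above) print nothing.
# trt_batch_times is a hypothetical helper name.
trt_batch_times() {
    awk '
        / run:/      { precision = $1 }           # e.g. "FP32 run:400 batches ..."
        /ms\/batch/  { print precision, $(NF-1) } # value just before "ms/batch."
    '
}

# Usage on the target machine:
#   ./sample_int8 mnist | trt_batch_times
```

For the log above this yields 0.19213 ms/batch for FP32 and 0.145806 ms/batch for INT8, which puts INT8 at roughly 1.3 times the FP32 throughput on this machine.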

Reposted from blog.csdn.net/vslyu/article/details/82631578