Ubuntu 16.04 + GTX 1080Ti + CUDA 8.0 + cuDNN + TensorFlow 1.0: the road to installing a deep learning server

0. Installation background


  • System: Ubuntu 16.04
  • Kernel: 4.4.0-140-generic
  • GPU: GTX 1080Ti
  • NVIDIA driver version: 384.111
  • Compute platform: CUDA 8.0
  • Deep learning library (cuDNN): cuDNN 5.1
  • TensorFlow: 1.0.1

1. Install the NVIDIA graphics driver


Download the NVIDIA graphics driver

Go to the NVIDIA official website, find the driver that matches your graphics card, and download it. The driver used here is NVIDIA 384.111.
When downloading, check that the graphics driver matches the Ubuntu kernel version. The correspondence table from the NVIDIA official website is as follows:

Distribution          Kernel*   GCC     GLIBC
Ubuntu 18.10          4.18.0    8.2.0   2.28
Ubuntu 18.04.1 (**)   4.15.0    7.3.0   2.27
Ubuntu 16.04.5 (**)   4.4       5.4.0   2.23
Ubuntu 14.04.5 (**)   3.13      4.8.4   2.19
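
To use the table, you first need the kernel series the machine is actually running. The following is a minimal sketch, not part of the original procedure: kernel_series and EXPECTED_SERIES are hypothetical names, and 4.4 is taken from the Ubuntu 16.04 row.

```shell
#!/bin/sh
# Print the major.minor series of a kernel release string,
# e.g. "4.4.0-140-generic" -> "4.4".
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# EXPECTED_SERIES is a hypothetical variable; 4.4 is the Ubuntu 16.04 row.
EXPECTED_SERIES="${EXPECTED_SERIES:-4.4}"
running="$(kernel_series "$(uname -r)")"

if [ "$running" = "$EXPECTED_SERIES" ]; then
    echo "kernel series $running matches"
else
    echo "kernel series $running differs from expected $EXPECTED_SERIES"
fi
```

If the two series differ, the driver's kernel module build is the step most likely to fail.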

Remove any previously installed NVIDIA graphics driver

$ sudo apt-get remove --purge 'nvidia*'

Disable the built-in nouveau graphics driver

$ sudo gedit /etc/modprobe.d/blacklist.conf

Add the following line at the end of the file: blacklist nouveau
Then run:

$ sudo update-initramfs -u

Then reboot the machine. Run the following command to confirm that nouveau has been disabled. If nothing is printed, the module is no longer loaded.

$ lsmod | grep nouveau
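
The blacklist entry can also be added without opening an editor. The sketch below is only one possible approach: ensure_line is a hypothetical helper, and it appends the line only when it is not already present, so running it twice does not duplicate the entry.

```shell
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Intended use, from a root shell because the file is root-owned:
#   ensure_line /etc/modprobe.d/blacklist.conf "blacklist nouveau"
```

The update-initramfs and reboot steps are still required afterwards.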

Install the NVIDIA graphics driver

Press Ctrl + Alt + F1 to enter the tty1 console, then enter the commands:

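$ sudo service lightdm stop     # stop the desktop (display manager) service
$ cd Downloads/                 # change to the directory containing the downloaded driver

# Install the graphics driver. Option notes:
#   -no-x-check         skip the check for a running X server
#   -no-nouveau-check   skip the check for the built-in nouveau driver
#   -no-opengl-files    do not install the OpenGL files; otherwise a login loop can occur
$ sudo ./NVIDIA-Linux-x86_64-384.111.run -no-x-check -no-nouveau-check -no-opengl-files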

Next, the installer's interface appears. You must first accept the license agreement; for the remaining options the defaults are fine.
To check whether the installation succeeded:

$ nvidia-smi

If the installation is successful, it should look like Figure 1:

Problems encountered

1. Error while building the kernel module
The error:
ERROR: An error occurred while performing the step: "Building kernel modules". See /var/log/nvidia-installer.log for details.
This happens because the Ubuntu kernel version does not match the NVIDIA driver. My machine was running kernel 4.15 when I installed the 384 driver, and the build failed every time. After I downgraded to kernel 4.4 and set it as the system's default kernel, the installation succeeded.
2. Login loop
The driver must be installed without its OpenGL files, as in the install command used earlier (-no-opengl-files).
Another cause is the same as in the previous problem: the Ubuntu kernel does not match the NVIDIA driver version. After installing the driver, it is recommended to turn off automatic kernel updates. Otherwise the kernel is updated automatically while the NVIDIA driver is not, and Ubuntu ends up in a login loop.
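
One common way to stop automatic kernel updates is to put the kernel meta-packages on hold with apt-mark. The sketch below is an assumption about how that could look, not the procedure used on this machine: the package list applies to a stock Ubuntu install, and missing_holds is a hypothetical helper that reports which of those packages are not yet held.

```shell
# Meta-packages that pull in new kernels on a stock Ubuntu install
# (an assumption; HWE installs use differently named meta-packages).
KERNEL_META="linux-generic linux-image-generic linux-headers-generic"

# Read a list of held packages on stdin (one per line, as printed by
# `apt-mark showhold`) and print the kernel meta-packages not yet held.
missing_holds() {
    held="$(cat)"
    for pkg in $KERNEL_META; do
        printf '%s\n' "$held" | grep -qxF "$pkg" || printf '%s\n' "$pkg"
    done
}

# Typical use on the real machine:
#   apt-mark showhold | missing_holds
#   sudo apt-mark hold linux-generic linux-image-generic linux-headers-generic
```

A hold can be lifted later with apt-mark unhold when you are ready to upgrade the kernel and reinstall the driver together.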

2. Install CUDA


Version correspondence

Version                          Python version   Compiler (GCC)   Minimum CUDA version   Minimum cuDNN version
tensorflow_gpu-1.13.0            2.7, 3.3~3.6     4.8              10.0                   7.4
tensorflow_gpu-1.5.0 ~ 1.12.0    2.7, 3.3~3.6     4.8              9                      7
tensorflow_gpu-1.3.0 ~ 1.3.0     2.7, 3.3~3.6     4.8              8                      6
tensorflow_gpu-1.0.0 ~ 1.2.0     2.7, 3.3~3.6     4.8              8                      5.1
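
The table can be read as a simple lookup from the TensorFlow version to the CUDA and cuDNN versions. As an illustration only, here is a sketch that encodes exactly the four rows above; tf_requirements is a hypothetical helper, and versions missing from the table are reported as such rather than guessed.

```shell
# Map a tensorflow_gpu version to the CUDA / cuDNN versions from the table.
tf_requirements() {
    case "$1" in
        1.13.*)                 echo "CUDA 10.0, cuDNN 7.4" ;;
        1.[5-9].*|1.1[0-2].*)   echo "CUDA 9, cuDNN 7" ;;
        1.3.*)                  echo "CUDA 8, cuDNN 6" ;;
        1.[0-2].*)              echo "CUDA 8, cuDNN 5.1" ;;
        *)                      echo "not in table"; return 1 ;;
    esac
}

tf_requirements 1.0.1   # the version installed in this article
```

For TensorFlow 1.0.1 this gives CUDA 8 and cuDNN 5.1, which is the combination installed in the following sections.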

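Download CUDA Toolkit 8.0

Go to the official website to download CUDA Toolkit 8.0 (or search Google for "cuda 8.0" to get there directly). Choose the version that matches your machine's configuration and select the runfile installer type, as shown in Figure 2.

After the download completes, run: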

$ sudo sh cuda_8.0.61_375.26_linux.run 

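The installer then starts. The large block of text shown at the beginning is the End User License Agreement; you can press Ctrl+C to skip it, then choose accept at the agreement prompt. Note: at "Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?" choose no, because the NVIDIA driver is already installed.
The specific choices are as follows: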

Logging to /tmp/cuda_install_32359.log
Using more to view the EULA.
End User License Agreement
--------------------------


Preface
-------

The following contains specific license terms and conditions
for four separate NVIDIA products. By accepting this
agreement, you agree to comply with all the terms and
conditions applicable to the specific product(s) included
herein.


NVIDIA CUDA Toolkit


Description

The NVIDIA CUDA Toolkit provides command-line and graphical
tools for building, debugging and optimizing the performance
of applications accelerated by NVIDIA GPUs, runtime and math
libraries, and documentation including programming guides,
--More--(0%)

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]: 

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /home/ai]: 

Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...

Missing recommended library: libXmu.so

Installing the CUDA Samples in /home/ai ...

Copying samples to /home/kinny/NVIDIA_CUDA-8.0_Samples now...

Finished copying samples.

Configure the environment

Add the following at the end of ~/.bashrc:

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

After adding these lines, be sure to reload the file; otherwise the installation may succeed while the GPU still cannot be used.

$ source ~/.bashrc
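
The ${PATH:+:${PATH}} form in the export lines above is easy to misread. The sketch below isolates it in a hypothetical helper, prepend_path, to show what it does: the colon and the old value are added only when the old value is non-empty, so an unset variable does not produce a trailing colon (an empty entry, which is treated as the current directory).

```shell
# Prepend a directory to a colon-separated search path.
# ${2:+:$2} expands to ":<old value>" only when the old value is non-empty.
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /usr/local/cuda-8.0/bin ""          # /usr/local/cuda-8.0/bin
prepend_path /usr/local/cuda-8.0/bin /usr/bin    # /usr/local/cuda-8.0/bin:/usr/bin
```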

3. Install cuDNN


Go to the official website and download cuDNN 5.1, choosing the package that matches your machine's configuration and CUDA 8.0.
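
No commands are listed for this step. As a rough sketch, assuming the cuDNN download is a .tgz archive that unpacks into cuda/include and cuda/lib64, installation amounts to copying the header and libraries into the CUDA tree. The helper install_cudnn and the archive name in the usage comment are assumptions, not taken from the original setup.

```shell
# Copy the cuDNN header and libraries from a .tgz archive into a CUDA tree.
# $1: path to the cuDNN archive (assumed layout: cuda/include, cuda/lib64)
# $2: CUDA installation prefix, e.g. /usr/local/cuda-8.0
install_cudnn() {
    archive="$1"
    prefix="$2"
    workdir="$(mktemp -d)" || return 1
    tar -xzf "$archive" -C "$workdir" || return 1
    mkdir -p "$prefix/include" "$prefix/lib64"
    cp "$workdir"/cuda/include/cudnn.h "$prefix/include/" || return 1
    # -P keeps the libcudnn.so symlinks as symlinks instead of copying targets.
    cp -P "$workdir"/cuda/lib64/libcudnn* "$prefix/lib64/" || return 1
    # Make the files readable by all users.
    chmod a+r "$prefix/include/cudnn.h" "$prefix"/lib64/libcudnn*
    rm -rf "$workdir"
}

# Hypothetical invocation, from a root shell because /usr/local is root-owned:
#   install_cudnn cudnn-8.0-linux-x64-v5.1.tgz /usr/local/cuda-8.0
```

Because LD_LIBRARY_PATH already includes /usr/local/cuda-8.0/lib64, no further path configuration should be needed under these assumptions.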

4. Install TensorFlow


The version installed here is TensorFlow 1.0.1 with GPU support (tensorflow_gpu 1.0.1), downloaded as a wheel file.

$ sudo apt-get install python-pip python-dev

Install TensorFlow with pip

$ sudo pip install --upgrade tensorflow_gpu-1.0.1-cp27-none-linux_x86_64.whl

Test the installation
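
A quick check is to import TensorFlow and print its version. This is a minimal sketch, assuming the python command is the Python 2.7 interpreter that pip installed the wheel into; tf_version is a hypothetical helper. It confirms only that the package imports, not that the GPU is used.

```shell
# Print the installed TensorFlow version, or fail if it cannot be imported.
# stderr is left visible so that any import error can be read.
tf_version() {
    python -c 'import tensorflow as tf; print(tf.__version__)'
}

if version="$(tf_version)"; then
    echo "TensorFlow $version is importable"
else
    echo "TensorFlow could not be imported"
fi
```

If the import fails with a missing CUDA or cuDNN library, the LD_LIBRARY_PATH setting and the cuDNN files are the first things to recheck.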

Problems encountered

1.

Origin www.cnblogs.com/Jessey-Ge/p/10961107.html