Ubuntu 17.04 + CUDA 8.0 + cuDNN 6 + TensorFlow 1.2

The company bought a server for machine learning, and the long road to installation began.

I read up on it and found that TensorFlow's GPU mode runs best on Ubuntu, so I asked my colleagues in the systems department to install 64-bit Ubuntu. When the machine arrived it had Ubuntu 14.04 on it, which seemed a little old, so I upgraded to 16.04. Then I installed the NVIDIA driver. The command

 lspci | grep -i nvidia

did find the card, and I installed CUDA, but it kept reporting an error.

At first I thought it was a system problem, so I upgraded to 17.04, but the same error was reported. Later I asked Dell's staff, who said a cable might not be plugged in properly, leaving the card without enough power. Under their remote guidance a colleague connected the missing cable and, sure enough, the error went away. With the hardware sorted out, the road to configuring the environment was open.

 

Everything starts from the beginning.

1. Install the NVIDIA driver

Go to the official website, http://www.nvidia.cn/Download/index.aspx?lang=cn , and find the driver that matches your card.

You can install it by following the installation instructions on the official website.

(

sudo add-apt-repository ppa:graphics-drivers/ppa

sudo apt-get update
sudo apt-get install nvidia-367
sudo apt-get install mesa-common-dev

sudo apt-get install freeglut3-dev

I first installed 375 with commands like the ones above, but that version later turned out to be wrong. Installing by the official website's method worked without problems.)

2. Install CUDA

To uninstall CUDA, see http://blog.csdn.net/u012436149/article/details/53163346

At first, on the principle of always using the latest version, I installed CUDA 9.0. Later, when I installed TensorFlow, I found that this CUDA version was too new, so just install 8.0. For downloads, see the official website https://developer.nvidia.com/cuda-downloads . (9.0 was on the release-candidate download page, https://developer.nvidia.com/cuda-release-candidate-download .) For 9.0 I used the runfile installer and everything installed normally, but the version turned out not to be supported. When I then installed 8.0, the runfile kept reporting errors, so I switched to the deb installer. I followed the steps on the official website and everything worked.



 

If you use the runfile installer, there is one detail to watch. When asked "Install NVIDIA Accelerated Graphics Driver for Linux-xx", answer n. A suitable NVIDIA driver was already installed in step 1, so answering yes leaves you with two drivers (and this one is an older version), and the CUDA samples then fail with error 38. If the failure is caused by a version problem instead, the error returned is 35.

 After installation, go to /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery, run make to build the sample, then run ./deviceQuery to check that CUDA can see the card.
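
The deviceQuery check can also be scripted. The following is a minimal Python sketch, not part of the original setup: it runs the compiled sample and looks for the "Result = PASS" line that deviceQuery prints on success. The binary path is an assumption based on the directory named above, and the function names are made up for illustration.

```python
import subprocess

# Assumed location of the compiled sample (directory from the text above).
DEVICE_QUERY = "/usr/local/cuda-8.0/samples/1_Utilities/deviceQuery/deviceQuery"


def device_query_passed(output):
    """Return True if deviceQuery's output contains its success marker."""
    return "Result = PASS" in output


def run_device_query(path=DEVICE_QUERY):
    """Run the deviceQuery binary and return (passed, combined output)."""
    proc = subprocess.run(
        [path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return device_query_passed(proc.stdout), proc.stdout


if __name__ == "__main__":
    try:
        ok, text = run_device_query()
    except OSError as exc:
        # The binary is missing or not built yet (run make first).
        print("could not run deviceQuery:", exc)
    else:
        print(text)
        print("deviceQuery passed" if ok else "deviceQuery failed")
```

If the script reports a failure, the printed output should contain the numeric error code (for example the 38 or 35 mentioned above).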

 


 3. Install cuDNN

Download it from the official website: https://developer.nvidia.com/cudnn

I am currently using 6.0 (as of 2017.9.4). TensorFlow may well support 7.0 in a few months; after all, I was still using 5.1 in July.



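 I first installed 5.1, and then got an error when installing TensorFlow: ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory. It needs version 6, so I reinstalled with 6. Running the following commands is enough.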

tar -zxf cudnn-8.0-linux-x64-v6.0.tgz   # the archive unpacks into a cuda/ directory

sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include

sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64
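
To confirm which cuDNN version actually ended up in the CUDA directory, one option is to read the version macros from the copied header. This is a small sketch under the assumption that cudnn.h sits at the destination used above and defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (as the cuDNN 6 header does). The helper name is made up for illustration.

```python
import re

# Assumed header location, matching the cp destination used above.
CUDNN_HEADER = "/usr/local/cuda-8.0/include/cudnn.h"


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    try:
        with open(CUDNN_HEADER) as fh:
            print("cuDNN version:", parse_cudnn_version(fh.read()))
    except OSError as exc:
        print("could not read header:", exc)
```

A major version of 6 here corresponds to the libcudnn.so.6 that the ImportError above was asking for.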

4. Install tensorflow-gpu

(1) Install the dependencies (I used Python 3)

sudo apt-get install python-pip python-dev # for Python 2.7
sudo apt-get install python3-pip python3-dev # for Python 3.n

(2) Install tensorflow-gpu

pip install tensorflow-gpu # Python 2.7; GPU support
pip3 install tensorflow-gpu # Python 3.n; GPU support

This step took me a long time because the download kept timing out. After many attempts it finally finished.
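
If the download keeps timing out, one thing that may help is giving pip a longer timeout and more retries. The sketch below only builds and prints such a command unless it is called with --run. `--timeout` and `--retries` are standard pip options, but the values (100 seconds, 10 retries) and the helper name are arbitrary assumptions, and whether this helps depends on why the connection was timing out in the first place.

```python
import subprocess
import sys


def build_pip_command(package, timeout_seconds=100, retries=10):
    """Build a pip install command with a longer timeout and more retries."""
    return [
        sys.executable, "-m", "pip", "install",
        "--timeout", str(timeout_seconds),
        "--retries", str(retries),
        package,
    ]


if __name__ == "__main__":
    cmd = build_pip_command("tensorflow-gpu")
    print(" ".join(cmd))
    # Only run the install when explicitly asked to, e.g. "python3 script.py --run".
    if "--run" in sys.argv[1:]:
        subprocess.check_call(cmd)
```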

 

Test

$ python3

>>> import tensorflow

>>> import tensorflow as tf

>>> hello = tf.constant('haha,tensorflow')

>>> sess = tf.Session()

>>> print(sess.run(hello))



 OK!!!!

I referred to the following pages:

http://www.linuxidc.com/Linux/2017-01/139319.htm

http://blog.csdn.net/dream_an/article/details/74992346

 

1. Additional common problems

http://blog.csdn.net/hjimce/article/details/51999566

 

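Follow-up, 2018.1.30

1. After upgrading to tensorflow-gpu 1.5, I got the error ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

This version of TensorFlow requires libcublas.so.9.0, which means CUDA 9.0 has to be installed, along with the matching version of cuDNN.

(At first I installed the latest version, CUDA 9.1. After a long wait it finished installing, only to turn out to be the wrong version. Heh.)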
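
Both ImportErrors in this post name the exact library file TensorFlow wants, and the version suffix tells you which CUDA or cuDNN release to install. As a rough illustration (the helper names are made up, and the regular expression only covers messages shaped like the two quoted in this post), the following sketch pulls the library name and version out of such a message and checks whether the dynamic loader can currently find it.

```python
import ctypes
import re

# Matches names such as libcudnn.so.6 or libcublas.so.9.0
_MISSING_LIB = re.compile(r"lib[A-Za-z_]+\.so\.(?P<version>\d+(?:\.\d+)*)")


def missing_library(error_message):
    """Return (soname, version) from a 'cannot open shared object file' error, or None."""
    match = _MISSING_LIB.search(error_message)
    if match is None:
        return None
    return match.group(0), match.group("version")


def can_load(soname):
    """Return True if the dynamic loader can find and load the given library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    messages = (
        "ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory",
        "ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory",
    )
    for msg in messages:
        soname, version = missing_library(msg)
        print(soname, "version", version, "loadable:", can_load(soname))
```

If can_load still returns False after installing the right version, the library directory (for example /usr/local/cuda-8.0/lib64) may not be on the loader's search path.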
