Ubuntu 18.04: configuring the NVIDIA driver and tensorflow-gpu 1.15.0 (summary)

Install graphics driver

1. Disable secure boot

This step is very important. If Secure Boot is left enabled, the driver installation reports an error.

First, enter the BIOS using the key for your computer (F12 or F10 are common). There I
changed the Secure Boot option to Disabled.
I used a Thor computer. After I changed this setting and restarted, it reverted to Enabled. Other computers may behave the same way. In that case you need to switch to custom mode: change the option in the row below it to Customized, and Secure Boot then automatically becomes Disabled.

2. Disable nouveau

Edit the file blacklist.conf

sudo vim /etc/modprobe.d/blacklist.conf

Insert the following two lines at the end of the file

blacklist nouveau
options nouveau modeset=0
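
If you would rather script this edit than open vim, the following is a minimal sketch of appending the two lines only when they are missing, so that running it twice does not duplicate them. The helper name `ensure_lines` is hypothetical and not part of any tool; writing to /etc/modprobe.d/blacklist.conf still requires root.

```python
from pathlib import Path


def ensure_lines(path, lines):
    """Append each line to the file at `path` unless it is already present.

    Returns the list of lines that were actually appended.
    """
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [line for line in lines if line.strip() not in present]
    if missing:
        with target.open("a") as handle:
            # Start on a fresh line if the file does not end with a newline.
            if text and not text.endswith("\n"):
                handle.write("\n")
            for line in missing:
                handle.write(line + "\n")
    return missing


# Example call (run as root), using the path and lines from this step:
# ensure_lines("/etc/modprobe.d/blacklist.conf",
#              ["blacklist nouveau", "options nouveau modeset=0"])
```

The function only compares whole lines, so a commented-out copy such as `# blacklist nouveau` does not count as present.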

Rebuild the initramfs so that the change takes effect at boot

sudo update-initramfs -u

Restart the system

Verify that nouveau is disabled

lsmod | grep nouveau

If nothing is printed, nouveau has been disabled and you can go on to install the NVIDIA graphics driver.
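
The same check can be done from a script. This is a sketch under assumptions, not part of the original procedure: the helper names `loaded_modules` and `is_module_loaded` are made up for illustration. It only looks at the first column of the `lsmod` output, so it is slightly stricter than `grep`, which matches the name anywhere on a line.

```python
import subprocess


def loaded_modules(lsmod_output):
    """Return the set of module names found in the first column of lsmod output."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def is_module_loaded(name, lsmod_output=None):
    """Check whether a kernel module is loaded; runs `lsmod` if no text is given."""
    if lsmod_output is None:
        result = subprocess.run(
            ["lsmod"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        lsmod_output = result.stdout
    return name in loaded_modules(lsmod_output)


# Example call after rebooting:
# print(is_module_loaded("nouveau"))
```

A result of `False` for `"nouveau"` corresponds to the empty `grep` output described above.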

3. Find the graphics card model of your own computer on NVIDIA's official website and download the corresponding driver. Website: http://www.nvidia.cn

Copy the downloaded run file to the home directory

4. Switch to a text console in Ubuntu

On my machine this is Ctrl + Alt + F3; the key combination differs between computers.

First switch to the root user:

su root

Stop the graphical interface. The installer reports errors if you skip this.

service lightdm stop 

Then uninstall the original driver:

apt-get remove 'nvidia-*'

Give execute permission to the driver run file:

chmod a+x [NVIDIA run file]

Install:

./[NVIDIA run file] -no-x-check -no-nouveau-check -no-opengl-files

-no-x-check: do not check whether an X server is running during installation
-no-nouveau-check: do not check whether nouveau is in use during installation
-no-opengl-files: install only the driver files, not the OpenGL files, to
avoid the login loop problem.

Options during installation:

  1. Continue installation
  2. Install without signing

For all other prompts, choose OK or Yes.

Load the NVIDIA kernel module:

modprobe nvidia

Check whether the driver is installed successfully:

nvidia-smi
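
For a scripted check, `nvidia-smi` can print selected fields as CSV through its `--query-gpu` and `--format` options. The sketch below wraps that call; the function names and the choice of fields are my own assumptions for illustration, and the parser assumes the GPU name itself contains no comma.

```python
import subprocess

# Fields requested from nvidia-smi; the choice here is only an example.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_rows(csv_text):
    """Turn `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        values = [value.strip() for value in line.split(",")]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows; raises if the command fails."""
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_gpu_rows(result.stdout)


# Example call once the driver is installed:
# for gpu in query_gpus():
#     print(gpu["name"], gpu["driver_version"])
```

If the driver is not loaded, `nvidia-smi` exits with a non-zero status and `query_gpus` raises `subprocess.CalledProcessError`, which is a usable signal that the installation did not succeed.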


Install tensorflow-gpu 1.15.0 with conda

I chose this version because it is the final 1.x release and remains compatible with much of the 2.0.0 API.
Installing through conda also pulls in matching cuda and cudnn packages automatically.

conda install tensorflow-gpu=1.15.0

Errors and fixes

First, the download may fail because of slow network speed. In that case, configure conda to use the Tsinghua mirror by running the following commands:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes

Second, I got the following error:

Verifying transaction: failed

RemoveError: 'setuptools' is a dependency of conda and cannot be removed from
conda's operating environment.

I first tried:

conda install -c anaconda setuptools

But the same error was still reported.

My guess was that conda itself needed to be updated:

conda update --force conda

This resolved the problem.

Verify the GPU

import tensorflow as tf
a = tf.test.is_built_with_cuda()  # check whether TensorFlow was built with CUDA support
b = tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
)                                  # check whether a GPU is available
print(a)
print(b)

The output is:
True
True
This means that CUDA and the GPU are both available.

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

The output is as follows:

2020-04-13 22:44:58.936998: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-13 22:44:58.968713: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2799925000 Hz
2020-04-13 22:44:58.969389: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab2112f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:58.969426: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-04-13 22:44:58.972287: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-13 22:44:59.320078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320520: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab1df0a10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:59.320539: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
2020-04-13 22:44:59.320701: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2020-04-13 22:44:59.357052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.361052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-04-13 22:44:59.400897: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-04-13 22:44:59.445225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-04-13 22:44:59.446472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-04-13 22:44:59.497395: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-04-13 22:44:59.528163: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-13 22:44:59.528302: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528658: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-04-13 22:44:59.528901: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.529559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-13 22:44:59.529571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-04-13 22:44:59.529576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-04-13 22:44:59.529651: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.529887: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.530106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2020-04-13 22:44:59.530773: I tensorflow/core/common_runtime/direct_session.cc:359] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
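
The lines that matter in this log are the "Device mapping" entries. If you capture the log to a file and want to check it automatically, the following sketch extracts those entries with a regular expression. The function names are hypothetical, and the pattern is based only on the line format shown in the output above.

```python
import re

# Matches lines such as:
# /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: ...
MAPPING_RE = re.compile(
    r"^(?P<path>/job:\S+/device:(?P<type>\w+):(?P<index>\d+)) -> (?P<desc>.+)$"
)


def parse_device_mapping(log_text):
    """Return {device path: info} for every device mapping line in the log."""
    devices = {}
    for line in log_text.splitlines():
        match = MAPPING_RE.match(line.strip())
        if match:
            # Keyed by path, so the repeated mapping block collapses into one entry.
            devices[match.group("path")] = {
                "type": match.group("type"),
                "index": int(match.group("index")),
                "description": match.group("desc"),
            }
    return devices


def has_physical_gpu(log_text):
    """True if the log maps at least one device of type GPU (not XLA_GPU)."""
    return any(
        info["type"] == "GPU" for info in parse_device_mapping(log_text).values()
    )
```

Applied to the log above, the parser finds three devices (XLA_CPU, XLA_GPU and GPU), and the GPU entry carries the GeForce GTX 1050 Ti description.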

Origin blog.csdn.net/Maestro_T/article/details/105500425