Reinstalling the NVIDIA driver, CUDA, and cuDNN on Ubuntu, and the problems encountered

Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/xiang_freedom/article/details/91408096

Following other tutorials, the installation is usually trouble-free. A few days ago, however, I upgraded the Ubuntu kernel, and afterwards TensorFlow failed with this error:

ImportError: libnvidia-fatbinaryloader.so.384.130: cannot open shared object file: No such file or directory

This looks like an NVIDIA driver problem, probably caused by the kernel upgrade, so I decided to reinstall the driver.
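Before reinstalling, it can help to confirm that the driver really is out of step with the new kernel. The following is only a rough sketch, not part of the original troubleshooting: the function name driver_report is made up for illustration, and the script assumes the standard uname, dkms, and lsmod tools, skipping any that are not installed.

```shell
#!/usr/bin/env bash
# Report the pieces that have to agree after a kernel upgrade.
# driver_report is a hypothetical helper name used only for this sketch.
driver_report() {
    echo "running kernel: $(uname -r)"

    # DKMS lists which kernels each registered module has been built for.
    if command -v dkms >/dev/null 2>&1; then
        dkms status 2>/dev/null | grep -i nvidia || echo "dkms: no nvidia module registered"
    else
        echo "dkms: not installed"
    fi

    # Is an nvidia kernel module actually loaded right now?
    if command -v lsmod >/dev/null 2>&1 && lsmod 2>/dev/null | grep '^nvidia' >/dev/null; then
        echo "nvidia module: loaded"
    else
        echo "nvidia module: not loaded"
    fi
}

driver_report
```

If the running kernel is missing from the DKMS output, or no nvidia module is loaded, the driver was most likely not rebuilt for the upgraded kernel.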

Reinstalling the driver

First I tried the .run installer, but the installation failed:

Failed to run `/sbin/dkms build -m nvidia -v 384.90 -k …

Then I used apt-get:

sudo apt-get install nvidia-384

The installation succeeded, but after rebooting I was stuck in a login loop, and nvidia-smi was missing:

nvidia-smi command not found

Suspecting that the driver version was too old, I installed 430 instead:

sudo apt-get install nvidia-430

After rebooting, this error appeared:

nvidia driver the system is running in low-graphics mode

With no other option, I went back to 384, this time adding a few extra packages. The login loop was gone and nvidia-smi worked normally:

sudo apt-get install nvidia-384 nvidia-settings nvidia-prime

The problems probably occurred because my computer has dual graphics cards.


However, running the TensorFlow program again still produced an error:

import error:libnppi.so.7.5: cannot open shared object file: No such file or directory

I checked /usr/local/cuda/lib64 and this .so file was not there, so I decided to reinstall CUDA.
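The check described above can be sketched as a small shell helper. This is only an illustration: check_lib is a made-up name, and /usr/local/cuda/lib64 is assumed to be the CUDA library directory, as in this article.

```shell
#!/usr/bin/env bash
# List files matching a library name in a directory, or say that none exist.
# check_lib is a hypothetical helper name used only for this sketch.
check_lib() {
    local dir=$1 name=$2
    if ls "$dir"/"$name"* >/dev/null 2>&1; then
        ls -l "$dir"/"$name"*
    else
        echo "no $name* in $dir"
    fi
}

check_lib /usr/local/cuda/lib64 libnppi.so
```

If the directory holds only other versions of the library, or none at all, the installed CUDA version does not match what TensorFlow was built against.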

Reinstalling CUDA

I first installed CUDA 8.0, but an error was still reported:

ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory

CUDA's lib directory does contain libcudart.so.8.0, so why is 9.0 being requested? I searched the whole filesystem for this file:

find / -name "libcudart.so.9.*"

Result:

/home/xxlyu/.cache/bazel/_bazel_xxlyu/8fa709ebf344796d539e4e2cfed28084/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/cc/ops/candidate_sampling_ops_gen_cc.runfiles/org_tensorflow/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcudart.so.9.0
/home/xxlyu/.cache/bazel/_bazel_xxlyu/8fa709ebf344796d539e4e2cfed28084/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/cc/ops/candidate_sampling_ops_gen_cc.runfiles/org_tensorflow/external/local_config_cuda/cuda/cuda/lib/libcudart.so.9.0
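To make the version mismatch explicit, the version named in the error message can be compared with the libcudart files actually installed. This is a rough sketch rather than anything from the original post: requested_version is a made-up helper, and it assumes CUDA toolkits live under /usr/local/cuda*, which is the default location but may differ on other machines.

```shell
#!/usr/bin/env bash
# Extract the version TensorFlow asked for, e.g. "9.0" from "libcudart.so.9.0".
# requested_version is a hypothetical helper name used only for this sketch.
requested_version() {
    printf '%s\n' "$1" | sed -n 's/.*libcudart\.so\.\([0-9][0-9.]*\).*/\1/p'
}

err='ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory'
want=$(requested_version "$err")
echo "TensorFlow wants libcudart $want"

# List the libcudart versions actually present (assumed install prefix: /usr/local/cuda*).
ls /usr/local/cuda*/lib64/libcudart.so.* 2>/dev/null || echo "no libcudart found under /usr/local/cuda*"
```

If the requested version does not appear in the listing, TensorFlow was built against a CUDA version other than the one currently installed.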

Because I had been using CUDA 9.0 before, I suspect TensorFlow's build cache was never cleared. Rather than deal with that, I went back to the original 9.0. After reinstalling, the same error appeared:

Error: libcudart.so.9.0: cannot open shared object file: No such file or directory

The .so file exists and the environment variable is set, so the cause is that the dynamic linker cache had not been updated:

sudo ldconfig /usr/local/cuda/lib64

Reference: https://blog.csdn.net/mumoDM/article/details/79502848

Later, after installing cuDNN, an error about a missing libcudnn.so was fixed the same way.
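To confirm that the fix took effect, one can ask the dynamic linker cache whether it now lists the libraries. This is a minimal sketch: in_linker_cache is a made-up helper name, and it relies only on ldconfig -p, which prints the libraries currently in the cache.

```shell
#!/usr/bin/env bash
# Succeed if the dynamic linker cache lists the given library name.
# in_linker_cache is a hypothetical helper name used only for this sketch.
in_linker_cache() {
    command -v ldconfig >/dev/null 2>&1 || return 2   # ldconfig is not on PATH
    ldconfig -p 2>/dev/null | grep -F "$1" >/dev/null
}

for lib in libcudart.so.9.0 libcudnn.so; do
    if in_linker_cache "$lib"; then
        echo "$lib: found in linker cache"
    else
        echo "$lib: NOT in linker cache (try: sudo ldconfig /usr/local/cuda/lib64)"
    fi
done
```

On some systems ldconfig lives in /sbin, which may not be on a regular user's PATH; in that case the helper reports the library as not found.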
