[Solution] NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.

Problem analysis

When I ran nvidia-smi, it printed the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. 
Make sure that the latest NVIDIA driver is installed and running.

Most of the advice I found online says to reinstall CUDA or upgrade the Linux headers. Both are a hassle, so I wanted to see whether there was another way.

Cause analysis: nvidia-smi depends on the NVIDIA driver, so when the driver is not running properly, nvidia-smi reports this error.
My first thought was to reinstall a matching version of the driver, but nothing had been changed on the machine recently, so that approach did not seem right.
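One quick way to confirm that the driver really is not running is to check whether any NVIDIA kernel module is loaded. The sketch below is only an illustration and was not part of my original debugging: it assumes a Linux system where /proc/modules lists the loaded modules, and the function name loaded_nvidia_modules is made up for this example.

```python
from pathlib import Path
from typing import List


def loaded_nvidia_modules(proc_modules_text: str) -> List[str]:
    """Return the names of loaded kernel modules that start with 'nvidia'.

    Each line of /proc/modules begins with the module name, followed by
    its size, use count, dependents, state, and load address.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


if __name__ == "__main__":
    # Assumption: /proc/modules exists (Linux only); otherwise treat as empty.
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else ""
    modules = loaded_nvidia_modules(text)
    if modules:
        print("Loaded NVIDIA kernel modules:", ", ".join(modules))
    else:
        print("No NVIDIA kernel module is loaded, so nvidia-smi cannot reach the driver.")
```

If no module shows up, the driver is installed but not loaded (or not installed at all), which is consistent with the error message above.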

Then I noticed that libstdc++, a package I had needed earlier, had been upgraded. I downgraded it to the previous version, rebooted, and nvidia-smi worked again!
Summary: if this error appears suddenly, first find out what changed, then undo that change. Rolling back is much easier than reinstalling CUDA and the like.
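To find out which packages changed recently, the package manager's log is a good place to look. The following is a minimal sketch that assumes a Debian or Ubuntu system, where dpkg records upgrades in /var/log/dpkg.log as lines of the form "date time upgrade package:arch old-version new-version". The post above does not depend on a particular distribution, so treat the log path and the helper name upgraded_packages as assumptions for illustration.

```python
from pathlib import Path
from typing import List, Tuple


def upgraded_packages(dpkg_log_text: str, name_filter: str = "") -> List[Tuple[str, str, str, str]]:
    """Return (timestamp, package, old_version, new_version) for every
    'upgrade' entry whose package name contains name_filter."""
    results = []
    for line in dpkg_log_text.splitlines():
        fields = line.split()
        # Expected layout: date, time, action, package:arch, old version, new version
        if len(fields) >= 6 and fields[2] == "upgrade" and name_filter in fields[3]:
            timestamp = f"{fields[0]} {fields[1]}"
            results.append((timestamp, fields[3], fields[4], fields[5]))
    return results


if __name__ == "__main__":
    # Assumption: Debian/Ubuntu log location; other distributions differ.
    log_path = Path("/var/log/dpkg.log")
    text = log_path.read_text() if log_path.exists() else ""
    for timestamp, package, old, new in upgraded_packages(text, "libstdc++"):
        print(f"{timestamp}: {package} upgraded from {old} to {new}")
```

The old version shown in the log is the one to downgrade to; after that, reboot and run nvidia-smi again.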

Origin blog.csdn.net/feifei3211/article/details/112795525