How to fix "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver"
2023.3.17 update
This problem suddenly appeared again today. At first I assumed the kernel had been updated automatically and planned to downgrade it, but when I looked for the older kernels I found they had been removed automatically. I had also already disabled automatic kernel updates, and the command
dpkg --get-selections | grep linux-image
showed the kernel packages as hold, so the kernel had not been updated.
Not seeing the real cause, I kept trying to download an older kernel, which did not help. In the end I happened to open the Ubuntu application NVIDIA X Server Settings, selected NVIDIA On-Demand, and restarted the system. The option that had been selected before was Intel (Power Saving Mode), which was the real trap.
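The same check can be done from a terminal. This is a minimal sketch, assuming the nvidia-prime package (which provides prime-select) is installed; prime_hint is a helper name made up for this example, and its messages are my own wording rather than anything printed by the driver.

```shell
# Maps a PRIME profile name to a short hint. The wording is this example's own.
prime_hint() {
    case "$1" in
        intel) echo "intel: NVIDIA GPU is switched off, so nvidia-smi cannot reach the driver" ;;
        nvidia|on-demand) echo "$1: NVIDIA GPU is available" ;;
        *) echo "unrecognized profile: $1" ;;
    esac
}

# prime-select ships with nvidia-prime; skip quietly if it is missing.
if command -v prime-select >/dev/null 2>&1; then
    prime_hint "$(prime-select query)"
fi
```

If the hint reports intel, running sudo prime-select on-demand and rebooting should be the command-line counterpart of the switch made in NVIDIA X Server Settings, provided the installed nvidia-prime version offers the on-demand profile.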
1. Problem description
Running nvidia-smi in a terminal fails with the error
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
Make sure that the latest NVIDIA driver is installed and running.
Everything worked before. When this error appears suddenly, the usual cause is an automatic kernel update that leaves the running kernel without a matching graphics card driver. The fix described here is to go back to the previous kernel version.
- Print the current kernel version with the command below and note it down; it is needed again in section 3
uname -r
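Instead of remembering the version, it can be saved to a file. This is a small sketch; the file name kernel-before.txt is an arbitrary choice for this example, and the lsmod check is an extra step not in the original procedure that shows whether any nvidia kernel module is loaded for the running kernel.

```shell
# Save the running kernel version so it can be compared after the reboot.
# kernel-before.txt is a hypothetical file name chosen for this example.
uname -r | tee "${HOME:-.}/kernel-before.txt"

# Show loaded nvidia modules, or say so when there are none.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | grep -i '^nvidia' || echo "no nvidia module loaded for $(uname -r)"
fi
```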
2. Switch to the original kernel
(1) If Ubuntu has a graphical interface, you can switch kernels as follows
- Restart the machine, enter the GRUB boot menu, and select Advanced options for Ubuntu
- This opens its submenu, as shown in the figure below
- Select an older kernel version and boot into the system. Once it is up, run
uname -r
in a terminal to confirm that the switch worked. If it did, run nvidia-smi and check that it prints the graphics card information.
(2) If Ubuntu has no graphical interface (for example a remotely administered server), you can switch kernels as follows
- First, check your grub version:
grub-install --version
Note whether the version printed after (GRUB) is 2.00 or later, or earlier than 2.00
- List the installed kernels with their full menu entries
grep 'menuentry' /boot/grub/grub.cfg
- Find the kernel you want to switch back to
For example, I want to switch to 5.8.0-50, so I look for the matching entry, which is
menuentry 'Ubuntu,Linux 5.8.0-50-generic' --class ubuntu
--class gnu-linux --class gnu --class os $menuentry_id_option
'gnulinux-5.8.0-50-generic-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341' {
Make sure to pick the entry that is not marked (recovery mode).
- Copy the string in single quotes that follows menuentry in the output above
In my case that is Ubuntu,Linux 5.8.0-50-generic
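- Edit the GRUB configuration
Type in a terminal
sudo nano /etc/default/grub
Find the line near the top
GRUB_DEFAULT=0
and change it to the string you just copied
GRUB_DEFAULT="Ubuntu,Linux 5.8.0-50-generic"
(note the double quotes, and no spaces around =, because this file is read as shell variable assignments)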
- Update grub settings
Type in terminal
sudo update-grub
If you see the following warning
Please don't use old title 'Ubuntu,Linux 5.8.0-50-generic' for GRUB_DEFAULT,
use 'Advanced options for Ubuntu>Ubuntu,Linux 5.8.0-50-generic'
(for versions before 2.00) or
'gnulinux-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341>gnulinux-5.8.0-50-generic-
advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341' (for 2.00 or later)
This warning means GRUB_DEFAULT has to be edited a second time. Which string to use depends on the GRUB version checked earlier: for 2.00 or later, copy the string in the third pair of single quotes in the warning; for earlier versions, copy the string in the second pair.
For example, my GRUB version is later than 2.00, so the previous
GRUB_DEFAULT="Ubuntu,Linux 5.8.0-50-generic"
becomes (all on one line)
GRUB_DEFAULT="gnulinux-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341>gnulinux-5.8.0-50-generic-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341"
For a GRUB version earlier than 2.00 it becomes
GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu,Linux 5.8.0-50-generic"
Do not skip this second edit of GRUB_DEFAULT, and check carefully whether you are copying the string in the second or the third pair of single quotes.
- Run the update again in the terminal
sudo update-grub
This time no warning should appear
- Restart
sudo reboot
While GRUB is booting, the cursor should now rest on an entry such as Advanced options for Ubuntu by default. Do not move it; let GRUB boot the selected entry on its own.
- Check for success
uname -r
If it now prints the kernel version you wanted, continue. Otherwise check whether you forgot to run sudo update-grub or made a mistake when editing /etc/default/grub.
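The lookup and the final check in method (2) can also be scripted. This is a minimal sketch, assuming the usual Ubuntu grub.cfg layout in which the kernel entries sit inside one Advanced options submenu and every entry carries an id after $menuentry_id_option. grub_default_for and kernel_is are helper names invented for this example, not GRUB tools. The kernel is matched by a plain substring of the entry title, so a short version string can match more than one entry; only the first non-recovery match is printed, in the form used for GRUB 2.00 or later.

```shell
# Prints "submenu-id>entry-id" for the first non-recovery menuentry whose
# title contains the version in $1. $2: grub.cfg path (default below).
grub_default_for() {
    awk -F"'" -v ver="$1" '
        /^[ \t]*submenu /   { sub_id = $4 }
        /^[ \t]*menuentry / {
            if (index($2, ver) > 0 && index($2, "recovery mode") == 0) {
                if (sub_id != "") { print sub_id ">" $4 } else { print $4 }
                exit
            }
        }
    ' "${2:-/boot/grub/grub.cfg}"
}

# Succeeds when the running kernel equals the version given as $1.
kernel_is() {
    [ "$(uname -r)" = "$1" ]
}

# 5.8.0-50-generic is the example version from the text; substitute your own.
target="5.8.0-50-generic"
if [ -r /boot/grub/grub.cfg ]; then
    echo "GRUB_DEFAULT=\"$(grub_default_for "$target")\""
fi

# After the reboot:
if kernel_is "$target"; then
    echo "now running $target; try nvidia-smi"
else
    echo "running $(uname -r), not $target: re-check GRUB_DEFAULT and sudo update-grub"
fi
```

The printed GRUB_DEFAULT line is only a suggestion to paste into /etc/default/grub; the script does not modify any file.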
3. Remove the updated kernel
- View all currently installed kernels
dpkg --get-selections | grep linux-image
output
linux-image-5.10.0-1023-oem install
linux-image-5.4.0-42-generic install
linux-image-5.8.0-50-generic install
linux-image-generic-hwe-20.04 install
Find the newly installed kernel (the version noted at the beginning) and remove it
sudo apt-get remove linux-image-5.10.0-1023-oem
sudo dpkg -P linux-image-5.10.0-1023-oem
Finally, if you switched kernels with method (2), do not forget to set GRUB_DEFAULT back to 0 in /etc/default/grub and run sudo update-grub again (this is not needed after method (1)).
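To see which kernel images are candidates for removal without reading the list by eye, the dpkg output can be filtered. This is a sketch under the assumption that the kernel to keep is the one currently running; other_kernel_images is a helper name made up for this example. It only prints package names and removes nothing, and it skips packages whose state is not install (for example held ones).

```shell
# stdin: output of `dpkg --get-selections`; $1: kernel version to keep.
# Prints installed, versioned linux-image packages that do not match $1.
other_kernel_images() {
    awk -v keep="$1" '
        $1 ~ /^linux-image-[0-9]/ && $2 == "install" && index($1, keep) == 0 { print $1 }
    '
}

# List candidates relative to the running kernel (prints nothing if there are none).
if command -v dpkg >/dev/null 2>&1; then
    dpkg --get-selections | other_kernel_images "$(uname -r)"
fi
```

Each name printed can then be passed to the apt-get remove and dpkg -P commands shown above, after checking that it really is the kernel that caused the problem.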
4. Disable automatic kernel updates
- Edit the configuration files from the command line
(1) Run:
sudo gedit /etc/apt/apt.conf.d/10periodic
To disable automatic updates, set the file contents as follows:
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "0";
To turn automatic updates back on, set the file contents as follows:
APT::Periodic::Update-Package-Lists "2";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "1";
Save and exit.
(2) Run:
sudo gedit /etc/apt/apt.conf.d/20auto-upgrades
To disable automatic updates, set the file contents as follows:
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "0";
To turn automatic updates back on, set the file contents as follows:
APT::Periodic::Update-Package-Lists "2";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "1";
Save and exit.
- Open "Software & Updates"
Change the settings on the Updates tab so that updates are no longer checked for or installed automatically
- By default, Ubuntu updates the kernel automatically. To avoid errors, or being unable to boot after a restart, you can additionally hold the kernel packages so that the current kernel stays in use.
sudo apt-mark hold linux-image-generic linux-headers-generic
To allow kernel updates again:
sudo apt-mark unhold linux-image-generic linux-headers-generic
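The settings in this section can be prepared and verified from a script as well. This is a minimal sketch: write_periodic_off and not_held are helper names invented for this example, and the preview path under /tmp is arbitrary. The first helper writes the "disabled" settings from the text to a scratch file rather than to /etc directly; the second reports which of the named packages are missing from the output of apt-mark showhold.

```shell
# Writes the "automatic updates off" settings from the text to the file in $1.
write_periodic_off() {
    cat > "$1" <<'EOF'
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "0";
EOF
}

# stdin: output of `apt-mark showhold`; arguments: package names to check.
# Prints every named package that is NOT on hold.
not_held() {
    held=$(cat)
    for pkg in "$@"; do
        printf '%s\n' "$held" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}

# Preview the file first (hypothetical scratch path), then inspect it.
write_periodic_off /tmp/20auto-upgrades.preview
cat /tmp/20auto-upgrades.preview

# Report kernel metapackages that are not held (no output means all are held).
if command -v apt-mark >/dev/null 2>&1; then
    apt-mark showhold | not_held linux-image-generic linux-headers-generic
fi
```

Once the preview looks right it can be copied into place with sudo cp, for example over /etc/apt/apt.conf.d/20auto-upgrades. On systems that track a different metapackage, such as the linux-image-generic-hwe-20.04 that appears in the package list in section 3, that name probably needs to be held and checked too.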