NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver: solution


2023.3.17 update

This problem suddenly came back today. At first I assumed the kernel had been updated automatically again, so I wanted to downgrade it. When I looked for the old kernel version, I found it had been removed automatically. I had also already disabled automatic kernel updates, and running dpkg --get-selections | grep linux-image to print the kernel packages showed them marked hold, so the kernel had not been updated.

I did not spot the real cause at first, so I kept trying to install the old kernel version to fix it, but that did not work. Finally, I happened to open the Ubuntu application NVIDIA X Server Settings.

There, select NVIDIA On-Demand and then restart the system (the option that had been selected was Intel (Power Saving Mode), what a trap...)
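
The same setting can also be inspected from a terminal. This is only a minimal sketch, assuming the nvidia-prime package (which provides prime-select) is installed. The helper name prime_advice is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: turn a PRIME profile name into a short hint.
# The three names are the ones `prime-select query` prints.
prime_advice() {
    case $1 in
        intel)     echo "NVIDIA driver is not loaded in this profile; switch to on-demand or nvidia" ;;
        on-demand) echo "NVIDIA GPU is available on demand" ;;
        nvidia)    echo "NVIDIA GPU is always used" ;;
        *)         echo "unknown profile: $1" ;;
    esac
}

# Query the real profile only when prime-select exists on this machine.
if command -v prime-select >/dev/null 2>&1; then
    prime_advice "$(prime-select query)"
fi
```

If the reported profile is intel, running sudo prime-select on-demand and rebooting should have the same effect as choosing NVIDIA On-Demand in the settings window.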


1. Problem description

Running nvidia-smi in the terminal reports an error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. 
Make sure that the latest NVIDIA driver is installed and running.

Everything worked before, and then this error appeared out of nowhere. That is usually caused by an automatic kernel update, which leaves the running kernel version mismatched with the installed graphics card driver. The best solution is to go back to the previous kernel version.

  • Run uname -r in the terminal to print the current kernel version, and note it down (it is needed again later)
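
To confirm the mismatch before changing anything, one can check whether a driver module exists on disk for the running kernel. The sketch below assumes the driver module file is named nvidia.ko (possibly with a compression suffix) somewhere under /lib/modules/<release>, which is where DKMS-built modules usually end up. The helper name has_nvidia_module is hypothetical.

```shell
# Hypothetical helper: succeed if an nvidia kernel module file exists for the
# given kernel release. The modules root defaults to /lib/modules.
has_nvidia_module() {
    release=$1
    root=${2:-/lib/modules}
    [ -n "$(find "$root/$release" -name 'nvidia.ko*' 2>/dev/null | head -n 1)" ]
}

# Compare the running kernel against the driver modules on disk.
if has_nvidia_module "$(uname -r)"; then
    echo "driver module found for $(uname -r)"
else
    echo "no driver module for $(uname -r)"
fi
```

If no module is found for the running kernel but one exists for an older release, that matches the kernel-update scenario described above.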
2. Switch to the original kernel

(1) If Ubuntu has a graphical interface, you can switch the kernel as follows:

  • Restart the machine, enter the GRUB boot menu, and select Advanced options for Ubuntu

  • Selecting Advanced options for Ubuntu opens a submenu that lists the installed kernels

  • Select a lower kernel version to boot, then run uname -r in the terminal to confirm that the switch succeeded. If it did, run nvidia-smi to check whether the graphics card information is printed

(2) If Ubuntu has no graphical interface (for example, a remotely managed server), you can switch the kernel as follows:

  • First, check your grub version:
grub-install --version

Note whether the version printed after (GRUB) is 2.00 or later, or earlier than 2.00
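
The version check can be scripted if preferred. A minimal sketch, assuming the output has the form "grub-install (GRUB) 2.04-1ubuntu26.13"; the function name grub_generation is made up for illustration.

```shell
# Hypothetical helper: read `grub-install --version` style text on stdin and
# report which side of 2.00 the version falls on.
grub_generation() {
    major=$(sed -n 's/.*(GRUB) \([0-9][0-9]*\)\..*/\1/p' | head -n 1)
    [ -n "$major" ] || return 1
    if [ "$major" -ge 2 ]; then
        echo "2.00 or later"
    else
        echo "before 2.00"
    fi
}
```

It would be used as grub-install --version | grub_generation.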

  • List the existing kernel entries (full names)
grep 'menuentry' /boot/grub/grub.cfg
  • Find the kernel you want to switch back to
For example, I want to switch to 5.8.0-50 here, so I look for the matching entry, which is
menuentry 'Ubuntu,Linux 5.8.0-50-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.8.0-50-generic-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341' {

Note that this is the entry without (recovery mode).

  • Copy the string in single quotes right after menuentry in the line above

In my case that is Ubuntu,Linux 5.8.0-50-generic
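
The raw grep output is long, so it can help to print only the entry titles and leave out the recovery entries. This is a sketch under the assumption that grub.cfg has one menuentry per line with the title in single quotes, as in the example above. The function name list_kernel_entries is hypothetical.

```shell
# Hypothetical helper: print the titles of all non-recovery menu entries.
# The config path defaults to /boot/grub/grub.cfg.
list_kernel_entries() {
    cfg=${1:-/boot/grub/grub.cfg}
    grep "^[[:space:]]*menuentry " "$cfg" |
        grep -v "(recovery mode)" |
        sed "s/^[[:space:]]*menuentry '\([^']*\)'.*/\1/"
}
```

Calling list_kernel_entries with no argument reads the real grub.cfg and prints one title per line, ready to be copied.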

  • Modify the GRUB configuration

In the terminal, run

sudo nano /etc/default/grub

Change the first setting,

GRUB_DEFAULT=0

to the string you just copied:

GRUB_DEFAULT="Ubuntu,Linux 5.8.0-50-generic"

(note the double quotes, and no spaces around the = sign)
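
On a server it may be more convenient to make this edit without opening an editor. The following is only a sketch: set_grub_default is a made-up helper, it assumes the file contains a line starting with GRUB_DEFAULT=, and it assumes the new value contains no | or & characters (true for the strings used in this article).

```shell
# Hypothetical helper: rewrite the GRUB_DEFAULT line of a grub defaults file.
# A copy of the previous file is kept next to it with a .bak suffix.
set_grub_default() {
    value=$1
    file=${2:-/etc/default/grub}
    cp "$file" "$file.bak" || return 1
    # Read from the backup and write the edited result to the real file.
    sed "s|^GRUB_DEFAULT=.*|GRUB_DEFAULT=\"$value\"|" "$file.bak" > "$file"
}
```

Writing to /etc/default/grub needs root, so the function would have to run in a root shell. Passing 0 as the value restores the original setting.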

  • Update the GRUB settings

In the terminal, run

sudo update-grub

If you see the following warning

Please don't use old title 'Ubuntu,Linux 5.8.0-50-generic' for GRUB_DEFAULT,
use 'Advanced options for Ubuntu>Ubuntu,Linux 5.8.0-50-generic' (for versions before 2.00) or
'gnulinux-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341>gnulinux-5.8.0-50-generic-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341' (for 2.00 or later)

then GRUB_DEFAULT has to be edited once more. Go by the GRUB version you noted earlier: if it is 2.00 or later, copy the string in the third pair of single quotes above; otherwise, copy the string in the second pair of single quotes.

For example, my GRUB version is later than 2.00, so the previous

GRUB_DEFAULT="Ubuntu,Linux 5.8.0-50-generic"

changes to

GRUB_DEFAULT="gnulinux-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341>gnulinux-5.8.0-50-generic-advanced-237310b8-5d8a-4e13-bcbd-37ef97be8341"

Otherwise, change it to

GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu,Linux 5.8.0-50-generic"

Be sure to modify GRUB_DEFAULT again! Look carefully at which string is inside the second pair of single quotes and which is inside the third.
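
For GRUB 2.00 or later, the long value can also be assembled directly from grub.cfg instead of being copied out of the warning. This is a sketch under assumptions: the ids follow the gnulinux-advanced-... and gnulinux-<release>-advanced-... patterns shown above, and the first submenu in the file is the Advanced options one. The function name grub_default_for is hypothetical.

```shell
# Hypothetical helper: print "<submenu id>><entry id>" for the normal
# (non-recovery) entry of one kernel release.
# Usage: grub_default_for 5.8.0-50-generic [path/to/grub.cfg]
grub_default_for() {
    release=$1
    cfg=${2:-/boot/grub/grub.cfg}
    # Id of the first submenu (assumed to be "Advanced options for Ubuntu").
    submenu_id=$(grep "^[[:space:]]*submenu " "$cfg" | head -n 1 |
        sed -n "s/.*menuentry_id_option '\([^']*\)'.*/\1/p")
    # Id of the entry for that release; recovery ids contain "-recovery-"
    # instead of "-advanced-", so they do not match.
    entry_id=$(grep "^[[:space:]]*menuentry " "$cfg" |
        sed -n "s/.*menuentry_id_option '\(gnulinux-$release-advanced-[^']*\)'.*/\1/p" |
        head -n 1)
    [ -n "$submenu_id" ] && [ -n "$entry_id" ] || return 1
    printf '%s>%s\n' "$submenu_id" "$entry_id"
}
```

The printed line is the value that goes between the double quotes of GRUB_DEFAULT. It is still worth comparing it with the string in the update-grub warning before rebooting.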

  • Run the update command in the terminal again
sudo update-grub

You should no longer see any warning prompts

  • Restart
sudo reboot

Note that on the GRUB screen during boot, the cursor should by default rest on an entry such as Advanced options for Ubuntu. Do not move the cursor; let GRUB boot the selection automatically

  • Check for success
uname -r

If it now shows the kernel version you wanted, continue. Otherwise, check whether you forgot sudo update-grub or made a mistake when editing /etc/default/grub

3. Remove the updated kernel
  • View all currently installed kernels
dpkg --get-selections | grep linux-image

Output:

linux-image-5.10.0-1023-oem             install
linux-image-5.4.0-42-generic            install
linux-image-5.8.0-50-generic            install
linux-image-generic-hwe-20.04           install

Find the package name of the updated kernel (the kernel version you noted at the beginning) and remove it:

sudo apt-get remove linux-image-5.10.0-1023-oem
sudo dpkg -P linux-image-5.10.0-1023-oem
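
Before removing anything, it can be useful to list which versioned kernel images are installed besides the one that should stay. A minimal sketch that only filters the dpkg --get-selections output and deletes nothing; the function name other_kernel_images is made up for illustration.

```shell
# Hypothetical helper: read `dpkg --get-selections` output on stdin and print
# the versioned linux-image packages other than the release to keep.
# Metapackages such as linux-image-generic-hwe-20.04 are skipped because the
# pattern requires a digit right after "linux-image-".
other_kernel_images() {
    awk -v keep="linux-image-$1" \
        '$1 ~ /^linux-image-[0-9]/ && $1 != keep { print $1 }'
}
```

For example, dpkg --get-selections | other_kernel_images "$(uname -r)" prints the candidates, and each one can then be removed by hand with the commands above.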

Finally, don't forget to set GRUB_DEFAULT in /etc/default/grub back to 0 and run sudo update-grub (this is needed only if you switched kernels with the second method; the first method leaves the file unchanged)

4. Disable automatic kernel updates
  • Edit the configuration files from the command line

(1) Run:

sudo gedit /etc/apt/apt.conf.d/10periodic

To disable automatic updates, set the file's contents as follows:

APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "0";

To turn automatic updates back on, set the file's contents as follows:

APT::Periodic::Update-Package-Lists "2";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "1";

Save and exit.

(2) Run:

sudo gedit /etc/apt/apt.conf.d/20auto-upgrades

To disable automatic updates, set the file's contents as follows:

APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "0";

To turn automatic updates back on, set the file's contents as follows:

APT::Periodic::Update-Package-Lists "2";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "0";
APT::Periodic::Unattended-Upgrade "1";

Save and exit.
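
After saving both files, the values can be read back to make sure the edit took effect. This is a sketch that assumes each setting sits on its own line in the form shown above. The function name periodic_value is hypothetical.

```shell
# Hypothetical helper: print the value of one APT::Periodic key in one file.
# If the key appears more than once, the last occurrence is printed.
# Usage: periodic_value Unattended-Upgrade /etc/apt/apt.conf.d/20auto-upgrades
periodic_value() {
    key=$1
    file=$2
    sed -n "s/^APT::Periodic::$key \"\([^\"]*\)\";.*/\1/p" "$file" | tail -n 1
}
```

Since 20auto-upgrades is read after 10periodic, its values are the ones that win when both files set the same key. The merged result that apt actually uses can be viewed with apt-config dump | grep Periodic.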

  • Open "Software & Updates"

On the Updates tab, turn off automatic updates.

  • By default, Ubuntu updates the kernel automatically. To avoid hitting errors or being unable to enter the system after a restart, we can additionally hold the kernel packages so that the current kernel stays in use.
sudo apt-mark hold linux-image-generic linux-headers-generic 

If you want to re-enable kernel updates later:

sudo apt-mark unhold linux-image-generic linux-headers-generic
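
To check that the hold is in place, the package selections can be filtered for kernel packages marked hold. A minimal sketch; the function name kernel_holds is made up for illustration.

```shell
# Hypothetical helper: read `dpkg --get-selections` output on stdin and print
# the linux-image and linux-headers packages whose state is "hold".
kernel_holds() {
    awk '$1 ~ /^linux-(image|headers)/ && $2 == "hold" { print $1 }'
}
```

It would be used as dpkg --get-selections | kernel_holds. The command apt-mark showhold lists all held packages as well. Note that the dpkg output earlier in this article shows linux-image-generic-hwe-20.04 rather than linux-image-generic, so on such a system the installed metapackage name is probably the one to hold; that is an assumption to verify against your own package list.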

Origin blog.csdn.net/weixin_48319333/article/details/127904278