Installing the NVIDIA graphics driver, CUDA, and Anaconda on CentOS 8

Reference: CentOS nvidia+cuda+cudnn installation

Install NVIDIA graphics driver

1. Check whether an NVIDIA GPU is present (hardware level):

lspci | grep -i nvidia

2. Install GCC, the kernel components, DKMS, etc.

sudo yum install gcc
sudo yum install gcc-c++
sudo yum -y install kernel-devel
sudo yum -y install kernel-headers
sudo yum -y install epel-release
sudo yum -y install dkms

Install all of these to avoid problems later.

On CentOS 8, make sure the running kernel version matches the kernel-devel and kernel-headers versions.

View all kernel versions in the system:

rpm -qa|grep kernel

View the Linux release version:

cat /etc/os-release

View the running kernel version:

uname -a
# Linux skylake 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Mon Mar 1 17:16:16 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

View information about the kernel components:

yum info kernel-devel kernel-headers
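
The version comparison described in this step can be scripted. The following is a minimal sketch, assuming that rpm -q kernel-devel prints one package name per line in the form kernel-devel-<release>, where <release> is the string printed by uname -r. The function name has_matching_devel is hypothetical and not part of the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).
# Reads package names on stdin and succeeds if one of them is exactly
# kernel-devel-<release> for the kernel release given as $1.
has_matching_devel() {
    local release="$1"
    grep -qxF "kernel-devel-${release}"
}

# Demonstration with the release string from the uname -a output above.
release="4.18.0-240.15.1.el8_3.x86_64"
if printf 'kernel-devel-%s\n' "$release" | has_matching_devel "$release"; then
    echo "kernel-devel matches $release"
else
    echo "kernel-devel does NOT match $release"
fi

# On a live system (assumes rpm is installed):
#   rpm -q kernel-devel | has_matching_devel "$(uname -r)"
```

A fixed-string, whole-line match (grep -xF) is used so that the dots in the version are not treated as regular-expression wildcards.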

3. Download the driver that matches your graphics card

Download the driver from the NVIDIA official website:
https://www.nvidia.cn/Download/index.aspx?lang=cn
Download the file with the .run suffix, here NVIDIA-Linux-x86_64-440.118.02.run.

I saved this file locally and then uploaded it to the server over SSH. In VS Code, you can drag the file into the target folder, and the upload finishes after a short wait.

4. Grant execute permission

chmod a+x NVIDIA-Linux-x86_64-440.118.02.run

5. Disable nouveau

nouveau appears to be disabled by default on CentOS 8, so I did not change anything: running lsmod | grep nouveau produced no output. If it is not disabled on your system, follow the steps below.

# Open the configuration file:
vim /usr/lib/modprobe.d/dist-blacklist.conf
# Add or modify these two lines:
blacklist nouveau
options nouveau modeset=0
# Check whether nouveau is disabled; no output means success
lsmod | grep nouveau
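
The manual edit can also be made from a script in a way that is safe to repeat. This is a sketch only: the function name blacklist_nouveau is hypothetical, and the demonstration writes to a temporary file rather than the real configuration file. Rebuilding the initramfs and rebooting afterwards is commonly needed before such a change takes effect, but that step is not in the original post and is an assumption here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).
# Appends the two nouveau lines to the file given as $1, skipping any
# line that is already present, so it is safe to run more than once.
blacklist_nouveau() {
    local conf="$1" line
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}

# Demonstration on a temporary file so nothing on the system is changed.
tmp_conf="$(mktemp)"
blacklist_nouveau "$tmp_conf"
blacklist_nouveau "$tmp_conf"   # second call adds nothing
cat "$tmp_conf"
rm -f "$tmp_conf"

# Real use (as root):
#   blacklist_nouveau /usr/lib/modprobe.d/dist-blacklist.conf
```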

6. Install the driver

During installation, a DKMS prompt may appear and offer to handle the kernel module for you. Answer yes to every prompt. The warnings can be ignored, and the installation will complete afterwards.

sudo ./NVIDIA-Linux-x86_64-440.118.02.run

or

sudo ./NVIDIA-Linux-x86_64-440.118.02.run --kernel-source-path=/usr/src/kernels/4.18.0-240.15.1.el8_3.x86_64
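
The kernel source path in the second command follows the pattern /usr/src/kernels/<release>, so it can be derived from uname -r instead of being typed by hand. A minimal sketch, assuming kernel-devel for the running kernel is installed under that directory; the function name kernel_source_path is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).
# Prints the kernel source directory for the kernel release given as $1.
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "$1"
}

# Show the path for the running kernel and warn if it is missing.
src="$(kernel_source_path "$(uname -r)")"
echo "kernel source path: $src"
[ -d "$src" ] || echo "warning: $src does not exist; is kernel-devel installed for this kernel?"

# Pass it to the installer:
#   sudo ./NVIDIA-Linux-x86_64-440.118.02.run --kernel-source-path="$src"
```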

If the installation above succeeded, skip the rest of this step.

ERROR: Unable to find the kernel source tree for the currently running kernel

This error was reported when I installed the driver.
The following post did not solve my problem and is listed for reference only: https://blog.csdn.net/chris_pei/article/details/79203033

My solution:
(1) First align the kernel, kernel-devel, and kernel-headers versions. Reference: https://blog.csdn.net/KnYoboy/article/details/104147009

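# Check the version of the running kernel
uname -r
# Check the versions of all kernel components
rpm -qa | grep kernel  # two kernel versions were found; the default kernel version did not match the component versions
# Remove the kernel that is not needed
yum remove kernel-<unneeded-version>  # placeholder: use the version found in the previous step
# Check the default boot kernel
grubby --default-kernel  # it was already the new kernel
# Reboot
sudo reboot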

After these steps and a reboot, another error occurred.

Installing dkms (it had not been installed before) solved the problem.

sudo yum -y install dkms

7. Check whether the driver was installed successfully.

If the graphics card information is printed, the driver was installed successfully.

nvidia-smi


Install CUDA

1. Download the corresponding CUDA version

Official website: https://developer.nvidia.com/zh-cn/cuda-downloads

Note that the official website offers the latest version by default. To download an earlier version, use the link to previous releases on that page.

If the server has a slow connection or is not connected to the public network, download the file locally and then upload it to the server.

Opening the copied URL starts the download, which is a few GB.

After the download is complete, drag the file into a folder on the server in VS Code. The upload is slow and takes a while.

2. Grant execute permission

chmod a+x cuda_10.2.89_440.33.01_linux.run 

3. Install CUDA

sudo sh cuda_10.2.89_440.33.01_linux.run

After a short wait, the license agreement appears. Type accept and press Enter, then choose the components to install. Because I had already installed the driver, I pressed Enter on the Driver item to deselect it and left the others unchanged. Move down to Install and press Enter to start the installation.

After that, you will be told that an update is needed to match the currently installed driver. Press Enter to select update, and the installation will run to completion.

If you have any problems, you can take a look at the solutions below.

Error during installation

The log showed that the driver had already been installed and was being installed again, which caused a conflict between the two driver versions.

Solution: if you have already installed the graphics driver, do not select the driver for installation. Note that you have to press Enter on the Driver option to deselect it.
If the installer finishes without reporting an error, the CUDA installation succeeded.

4. Add CUDA to the environment variables

Open ~/.bashrc in vi for editing.

vi ~/.bashrc

Add the following content:

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
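
Because ~/.bashrc may be sourced more than once, the two export lines can add the same directory repeatedly, and the second line leaves a trailing colon when LD_LIBRARY_PATH was previously empty. The sketch below shows one way to avoid both. The function name prepend_once is hypothetical and not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).
# Prints the colon-separated list $2 with directory $1 prepended,
# unless $1 is already an entry; an empty list yields just $1.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

export PATH="$(prepend_once /usr/local/cuda/bin "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")"
```

With this form, running source ~/.bashrc several times leaves PATH and LD_LIBRARY_PATH unchanged after the first run.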

After editing, leave insert mode, then save and quit from last-line mode.

:wq

Execute in the current shell to make the environment variable take effect.

source ~/.bashrc

5. Check if the installation is successful

nvcc -V
nvidia-smi
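
To check the toolkit version from a script rather than by eye, the release number can be extracted from the nvcc -V output. A minimal sketch, assuming the output contains a line of the form "Cuda compilation tools, release 10.2, V10.2.89"; the function name cuda_release is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).
# Reads `nvcc -V` output on stdin and prints the release number (e.g. 10.2).
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Demonstration with a sample line in the assumed format.
echo 'Cuda compilation tools, release 10.2, V10.2.89' | cuda_release

# On a live system:
#   nvcc -V | cuda_release
```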

If both commands print version information, the installation succeeded. In my case, CUDA 10.2 was installed.
For further verification, build and run the deviceQuery sample:

cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery

If the output ends with Result = PASS and all graphics cards are detected, the graphics driver and CUDA have been installed successfully.


Install Anaconda

Reference: https://blog.csdn.net/qq_44486439/article/details/107744449

1. Download the installation file

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2020.02-Linux-x86_64.sh

2. Install Anaconda

bash Anaconda3-2020.02-Linux-x86_64.sh

Press Enter through the license text, then answer yes. The important prompt is the installation path: press Enter to accept the default, ~/anaconda3.
The installer then reports that the installation succeeded.

3. Add conda to the environment variables

sudo vim /etc/profile

Add at the end:

export ANACONDA_PATH=~/anaconda3
export PATH=$PATH:$ANACONDA_PATH/bin

Then force the write and quit.

:wq!

Execute in the current shell to take effect.

source /etc/profile

4. Check whether it is successful

which anaconda
conda --version
conda info -e
python



Corrections and feedback are welcome!

Origin blog.csdn.net/weixin_41650348/article/details/115110021