Installing the NVIDIA graphics driver, CUDA, and Anaconda on CentOS 8
Reference: CentOS nvidia+cuda+cudnn installation
Install NVIDIA graphics driver
1. Check whether NVIDIA GPU is installed (hardware level):
lspci | grep -i nvidia
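A plain grep for "nvidia" also matches the audio function that many cards expose. The sketch below counts only display devices. The function name count_nvidia_gpus is hypothetical, and the device class strings it filters on are an assumption about typical lspci output.

```shell
# Count NVIDIA display devices in `lspci` output read on stdin.
# Audio functions on the same card are ignored.
count_nvidia_gpus() {
    grep -Ei 'vga compatible controller|3d controller' | grep -ci 'nvidia' || true
}

if command -v lspci >/dev/null 2>&1; then
    echo "NVIDIA GPUs found: $(lspci | count_nvidia_gpus)"
else
    echo "lspci not available (on CentOS it comes from the pciutils package)"
fi
```

A count of 0 means no NVIDIA display device was detected at the hardware level.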
2. Install GCC, kernel components, DKMS, etc.
sudo yum install gcc
sudo yum install gcc-c++
sudo yum -y install kernel-devel
sudo yum -y install kernel-headers
sudo yum -y install epel-release
sudo yum -y install dkms
Install all of these to be safe.
On CentOS 8, make sure the running kernel version matches the kernel-devel and kernel-headers versions.
View all kernel versions in the system:
rpm -qa|grep kernel
View the Linux release version:
cat /etc/os-release
View the running kernel version:
uname -a
# Linux skylake 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Mon Mar 1 17:16:16 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
View information about the kernel component
yum info kernel-devel kernel-headers
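The version comparison described above can be sketched as a small check. The function name has_matching_devel is hypothetical, and the sketch assumes that rpm prints package names in the form kernel-devel-<release>, where <release> is exactly what uname -r reports.

```shell
# Return 0 if a kernel-devel package matching the given kernel release is listed.
# $1: kernel release, as printed by `uname -r`
# $2: newline-separated package names, as printed by `rpm -q kernel-devel`
has_matching_devel() {
    release="$1"
    packages="$2"
    printf '%s\n' "$packages" | grep -qxF -- "kernel-devel-$release"
}

# Only run the real check where rpm exists.
if command -v rpm >/dev/null 2>&1; then
    if has_matching_devel "$(uname -r)" "$(rpm -q kernel-devel)"; then
        echo "kernel-devel matches running kernel $(uname -r)"
    else
        echo "no kernel-devel package for running kernel $(uname -r)"
    fi
fi
```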
3. Download the driver that matches your graphics card
Download the driver from the official NVIDIA website.
https://www.nvidia.cn/Download/index.aspx?lang=cn
Download the file with the .run suffix, in my case NVIDIA-Linux-x86_64-440.118.02.run.
I saved the file locally and then uploaded it to the server over SSH. I use VS Code, so I just dragged the file into the target folder; the upload takes a while.
4. Grant execute permission
chmod a+x NVIDIA-Linux-x86_64-440.118.02.run
5. Disable nouveau
nouveau seems to be disabled by default on CentOS 8, so I did not change anything: running lsmod | grep nouveau directly produced no output. If it is not disabled on your system, follow the steps below.
# Open the configuration file:
vim /usr/lib/modprobe.d/dist-blacklist.conf
# Add or modify these two lines
blacklist nouveau
options nouveau modeset=0
Check whether nouveau is disabled; no output means it is.
lsmod | grep nouveau
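If you prefer not to edit the file by hand, the two lines can be appended in a way that is safe to repeat. This is a minimal sketch: add_line_once is a hypothetical helper, and the demonstration writes to a scratch file instead of the real configuration file.

```shell
# Append a line to a file only if that exact line is not already present.
# $1: file path, $2: line to add
add_line_once() {
    file="$1"
    line="$2"
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demonstrate on a scratch file rather than the real configuration file.
conf="$(mktemp)"
add_line_once "$conf" "blacklist nouveau"
add_line_once "$conf" "options nouveau modeset=0"
add_line_once "$conf" "blacklist nouveau"   # second call changes nothing
cat "$conf"
rm -f "$conf"
```

To apply it for real, pass /usr/lib/modprobe.d/dist-blacklist.conf as the first argument and run the script as root.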
6. Install the driver
During installation, a prompt about DKMS may appear and offer to set it up for you. Answer yes to every prompt. The warnings can be ignored; the installation still completes.
sudo ./NVIDIA-Linux-x86_64-440.118.02.run
or
sudo ./NVIDIA-Linux-x86_64-440.118.02.run --kernel-source-path=/usr/src/kernels/4.18.0-240.15.1.el8_3.x86_64
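The second command hard-codes my kernel version. A small sketch, assuming the kernel-devel sources live under /usr/src/kernels/<release> as in the path above, derives the path from the running kernel instead. The helper name kernel_source_path is hypothetical.

```shell
# Build the kernel source path for a given kernel release.
# $1: kernel release, as printed by `uname -r`
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "$1"
}

src="$(kernel_source_path "$(uname -r)")"
if [ -d "$src" ]; then
    echo "kernel sources found: $src"
    # Then run the installer against that path:
    # sudo ./NVIDIA-Linux-x86_64-440.118.02.run --kernel-source-path="$src"
else
    echo "no kernel sources at $src; check that kernel-devel matches the running kernel"
fi
```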
If the installation above succeeded, skip the rest of this step.
ERROR: Unable to find the kernel source tree for the currently running kernel
This error was reported while installing the driver.
The following article did not solve my problem and is listed for reference only: https://blog.csdn.net/chris_pei/article/details/79203033
My solution:
(1) First align the kernel, kernel-devel, and kernel-headers versions. Reference: https://blog.csdn.net/KnYoboy/article/details/104147009
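# Check the version of the running kernel
uname -r
# Check the versions of all kernel components
rpm -qa|grep kernel   # found two kernel versions; the default kernel did not match the component versions
# Remove the kernel you do not need
yum remove kernel-<unneeded-version>   # version as listed in the previous step
# Check the default boot kernel
grubby --default-kernel   # it was already the new kernel
# Reboot
sudo reboot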
After these steps and a reboot, the same error occurred again.
Installing dkms, which I had not installed before, solved the problem:
sudo yum -y install dkms
7. Check whether the driver was installed successfully
If the graphics card information is printed, the driver was installed successfully.
nvidia-smi
Install CUDA
1. Download the corresponding CUDA version
Official website: https://developer.nvidia.com/zh-cn/cuda-downloads
Note that the official website offers the latest version by default. If you want an earlier version, look for it in the archive of previous releases.
If the server has a slow connection or no access to the public network, save the file locally and then upload it to the server.
Opening the copied URL starts the download; the file is a few GB.
When the download is complete, drag the file into a folder on the server in VS Code. The upload is slow and takes a while.
2. Grant execute permission
chmod a+x cuda_10.2.89_440.33.01_linux.run
3. Install CUDA
sudo sh cuda_10.2.89_440.33.01_linux.run
After a short wait the license agreement appears; type accept and press Enter. You are then asked what to install. I had already installed the driver, so I pressed Enter on the Driver item to deselect it and left the other items unchanged. Move down to Install and press Enter to start the installation.
You are then told that an update is needed to match the currently installed driver; press Enter to select update, and the installation runs through to the end.
If you run into problems, see the solutions below.
Error during installation
The log showed that the driver had already been installed and was being installed a second time, so the two driver versions conflicted.
Solution: if you installed the graphics driver beforehand, do not select the driver here. Note that you have to press Enter on the Driver option to deselect it.
If the installer finishes and prints its summary without errors, the CUDA installation went through without problems.
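Before deselecting the Driver item, it can help to confirm that the driver already installed is at least as new as the one shipped in the CUDA run file. The sketch below assumes that the second version number in the file name cuda_10.2.89_440.33.01_linux.run is the bundled driver version, and that sort -V from GNU coreutils is available. The helper name version_ge is hypothetical.

```shell
# Return 0 if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

installed="440.118.02"   # driver installed earlier in this article
bundled="440.33.01"      # assumed from the name cuda_10.2.89_440.33.01_linux.run
if version_ge "$installed" "$bundled"; then
    echo "installed driver $installed is new enough; deselect the Driver item"
else
    echo "installed driver $installed is older than bundled $bundled"
fi
```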
4. Add CUDA to the environment variables
Open ~/.bashrc in vi for editing.
vi ~/.bashrc
Add the following content:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
After making the changes, leave insert mode, then save and quit from last-line mode.
:wq
Run this in the current shell so the environment variables take effect.
source ~/.bashrc
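Sourcing ~/.bashrc repeatedly adds the same directories again each time, and when LD_LIBRARY_PATH starts out empty the export above leaves a trailing colon, which the dynamic loader treats as the current directory. A minimal sketch of a variant that avoids both follows; prepend_dir is a hypothetical helper, not part of CUDA.

```shell
# Print a colon-separated search path with directory $2 prepended,
# unless $2 is already an entry. $1 is the current value (may be empty).
prepend_dir() {
    current="$1"
    dir="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "$dir${current:+:$current}" ;;
    esac
}

PATH="$(prepend_dir "$PATH" /usr/local/cuda/bin)"
LD_LIBRARY_PATH="$(prepend_dir "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)"
export PATH LD_LIBRARY_PATH
```

These lines can go into ~/.bashrc in place of the two export lines.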
5. Check if the installation is successful
nvcc -V
nvidia-smi
If both commands print version information, the installation succeeded. In my case, the installed version is CUDA 10.2.
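To check the version in a script rather than by eye, the release number can be pulled out of the nvcc output. This sketch assumes the output contains a line of the form "Cuda compilation tools, release 10.2, V10.2.89". The helper name nvcc_release is hypothetical.

```shell
# Extract the release number from `nvcc -V` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    release="$(nvcc -V | nvcc_release)"
    echo "nvcc reports CUDA release: $release"
else
    echo "nvcc not found on PATH; check the exports in ~/.bashrc"
fi
```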
Further verification.
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
Result = PASS
If all graphics cards are listed and detected and the result is PASS, the graphics driver and CUDA have been installed successfully.
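The PASS check can also be scripted, for example when setting up several servers. This is a sketch that relies only on the "Result = PASS" line shown above. The helper name device_query_passed is hypothetical.

```shell
# Return 0 if deviceQuery output read on stdin contains the PASS line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Run from the deviceQuery sample directory after `make`.
if [ -x ./deviceQuery ]; then
    if ./deviceQuery | device_query_passed; then
        echo "deviceQuery passed"
    else
        echo "deviceQuery did not report PASS"
    fi
fi
```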
Install Anaconda
Reference: https://blog.csdn.net/qq_44486439/article/details/107744449
1. Download the installation file
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2020.02-Linux-x86_64.sh
2. Install Anaconda
bash Anaconda3-2020.02-Linux-x86_64.sh
Press Enter through the prompts, then answer yes. The next prompt is the important one: it asks for the installation path. Press Enter to accept the default, ~/anaconda3.
The installation is now complete.
3. Add conda to the environment variables
sudo vim /etc/profile
Add at the end:
export ANACONDA_PATH=~/anaconda3
export PATH=$PATH:$ANACONDA_PATH/bin
Then save and exit, forcing the write.
:wq!
Run this in the current shell so the change takes effect.
source /etc/profile
4. Check whether the installation succeeded
which anaconda
conda --version
conda info -e
python
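The checks above can be wrapped in one loop that reports every tool missing from PATH. This is a minimal sketch; check_tools is a hypothetical helper, and the tool names passed to it are the ones used in this step.

```shell
# Report whether each named command is on PATH.
# Returns 0 only if all of them are found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools conda python || echo "some tools are missing; try: source /etc/profile"
```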
Corrections and feedback are welcome!