I do deep learning research, so on a new machine running Ubuntu 20.04 LTS I installed the Nvidia graphics driver, CUDA, and cuDNN, and also set up switching between CUDA versions.
The installation completed successfully; this post is a record of the steps.
1. Install Nvidia graphics driver
Step 1: Update the software list and install dependencies
- Before installing the Nvidia graphics driver, update the software list and install the necessary dependencies.
sudo apt-get update # refresh the package list
sudo apt-get install g++ # install the g++ compiler
sudo apt-get install gcc # install the gcc compiler
sudo apt-get install make # install the GNU Make build tool
sudo apt-get install initramfs-tools # install initramfs-tools
Step 2: Check the GPU model and download the corresponding driver
# lspci -n / -nn: -n shows the numeric vendor and device IDs; -nn shows both the names and the numeric IDs.
lspci -nn | grep VGA
# Example output on my machine:
# 2d:00.0 VGA compatible controller: NVIDIA Corporation Device 2204 (rev a1)
- Look up the device ID (2204) to find the graphics card model
http://pci-ids.ucw.cz/read/PC/10de/2204
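The lookup above needs the vendor and device IDs. With the -nn option, lspci normally prints them as a bracketed pair such as [10de:2204], so they can be extracted with a small filter. This is only a sketch: extract_pci_ids is a name made up for this post, and the bracketed format is an assumption about your lspci output.

```shell
# Sketch: pull "vendor:device" PCI IDs out of `lspci -nn` output read on stdin.
# extract_pci_ids is a hypothetical helper name, not a standard tool.
extract_pci_ids() {
    # Matches bracketed pairs such as [10de:2204].
    # Class codes such as [0300] contain no colon, so they are skipped.
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
}
```

A possible use is `lspci -nn | grep VGA | extract_pci_ids`; the part after the colon is the device ID that goes at the end of the lookup URL.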
- Once you know the graphics card model, download the matching driver from the official website
https://www.nvidia.cn/Download/index.aspx?lang=cn
- Select the product type and model that match your graphics card
- I installed Ubuntu 20.04 LTS, so I chose Linux 64-bit as the operating system
- You can also run the command
arch
to query the system architecture
- Select Production Branch as the download type, and Chinese as the language
- Click to download
Step 3: Disable the nouveau universal driver
- My machine runs the server edition of Ubuntu, which has no graphical interface. If you installed Ubuntu with a graphical interface, first switch to a text console with
Ctrl + Alt + F1
(or another key from F1 to F6) to avoid errors during installation.
- My machine had no existing driver, so there was nothing to remove. If an Nvidia driver is already installed, remove it completely beforehand with
sudo apt-get remove --purge 'nvidia*'
otherwise the installation will report an error.
- Edit the blacklist.conf file
sudo vim /etc/modprobe.d/blacklist.conf
- At the bottom of blacklist.conf, add the following lines
blacklist nouveau
options nouveau modeset=0
- Update the initramfs so that the change takes effect
sudo update-initramfs -u
- A restart is required afterwards
sudo reboot
- After restarting, run the following command; if it prints nothing, nouveau has been disabled successfully
lsmod | grep nouveau
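The blacklist edit and the lsmod check can also be scripted. The sketch below is one way to do it, not part of the original procedure: both function names are made up for this post, and the target file is passed as an argument so the helper can be tried on a scratch file first.

```shell
# Sketch: append the two nouveau blacklist lines to a modprobe config file,
# skipping any line that is already there. The function name is hypothetical.
add_nouveau_blacklist() {
    local target="$1"
    touch "$target"
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        grep -qxF "$line" "$target" || echo "$line" >> "$target"
    done
}

# Sketch: read `lsmod` output on stdin and succeed if the nouveau module is loaded.
nouveau_is_loaded() {
    grep -q '^nouveau[[:space:]]'
}
```

In a root shell, `add_nouveau_blacklist /etc/modprobe.d/blacklist.conf` would replace the manual edit; after `update-initramfs -u` and a reboot, `lsmod | nouveau_is_loaded && echo "nouveau is still loaded"` gives an explicit message instead of relying on empty output.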
Step 4: Install Nvidia Graphics Driver
Make the driver .run file executable, then install it.
# make the installer executable
sudo chmod +x NVIDIA-Linux-x86_64-525.105.17.run
# install the driver
sudo ./NVIDIA-Linux-x86_64-525.105.17.run --no-x-check --no-nouveau-check --no-opengl-files
# --no-x-check: do not check whether an X server is running
# --no-nouveau-check: do not check whether the nouveau driver is present
# --no-opengl-files: do not install the OpenGL files
- The following prompts appear during installation (the order may vary); choose as indicated.
1. Install NVIDIA's 32-bit compatibility libraries? Select No.
2. The distribution-provided pre-install script failed! Are you sure you want to continue? Select Continue installation.
3. Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. Select No.
4. Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up. Select Yes.
After the installation is complete, restart and check the graphics card information.
# restart the machine
sudo reboot
# show the graphics card information
nvidia-smi
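Before moving on to CUDA, it can help to compare the installed driver version with the minimum that the CUDA release notes list for your toolkit. The sketch below assumes GNU sort (for the -V version ordering, available on Ubuntu); version_ge is a name made up for this post, and the required version is something you have to look up yourself.

```shell
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V; version_ge is a hypothetical helper name.
version_ge() {
    # After a version sort, the smaller version comes first,
    # so $1 >= $2 exactly when the first line equals $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}
```

For example, with `installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)` and `required` set to the value from the release notes, `version_ge "$installed" "$required" || echo "driver too old"` flags a mismatch early.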
2. Install CUDA
Step 1: Download the CUDA installation package
- Check the GCC, CUDA, and cuDNN versions that TensorFlow lists for your TensorFlow release, and make sure the versions you install are at least the recommended ones. https://tensorflow.google.cn/install/source#linux
- Choose the CUDA and cuDNN versions from that table, or install the versions that Nvidia currently recommends.
- Download CUDA from the Nvidia official website
https://developer.nvidia.com/cuda-downloads
- For other versions, use the archive link below
https://developer.nvidia.com/cuda-toolkit-archive
- Follow the instructions on the page to download and install
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
sudo sh cuda_12.1.0_530.30.02_linux.run
- The wget command downloads the installer directly. Alternatively, paste the .run URL into a browser to download it. (Note: make sure the CUDA version in the URL is the one you want.)
https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
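Since the CUDA version appears several times in the URL, a small helper can reduce copy-and-paste mistakes. This is a sketch only: cuda_runfile_url is a name made up for this post, and the URL pattern is inferred from the single link above, so check the archive page for the exact file name of other releases.

```shell
# Sketch: build the runfile URL from a CUDA version and the driver version
# embedded in the file name. The pattern is an assumption inferred from one URL.
cuda_runfile_url() {
    local cuda_version="$1" driver_version="$2"
    printf 'https://developer.download.nvidia.com/compute/cuda/%s/local_installers/cuda_%s_%s_linux.run\n' \
        "$cuda_version" "$cuda_version" "$driver_version"
}
```

With it, the download above becomes `wget "$(cuda_runfile_url 12.1.0 530.30.02)"`.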
Step 2: Choices during the CUDA installation
Do you accept the above EULA? (accept / decline / quit):
- To accept the End User License Agreement, enter
accept
- In the component list, press Enter to toggle an item: X means selected, no X means not selected. The driver is already installed, so deselect the driver. Then press the down key to move to the option below and press Enter to confirm
install
Step 3: Configure the CUDA environment
vim ~/.bashrc
- At the bottom of the .bashrc file, add the following lines
- (Note: the paths contain the CUDA version, so change them to match the version you installed.)
export CUDA_HOME=/usr/local/cuda-11.2
export PATH=$PATH:/usr/local/cuda-11.2/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2/lib64
- Reload the environment
source ~/.bashrc
- Test whether CUDA is installed successfully
nvcc -V
# or: nvcc --version
If the command prints the CUDA compiler version information, the installation succeeded.
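To check the version in a script rather than by eye, the release number can be extracted from the nvcc output. The sketch below assumes the output contains a line of the form "Cuda compilation tools, release 11.2, V11.2.152"; nvcc_release is a name made up for this post.

```shell
# Sketch: read `nvcc --version` output on stdin and print the release number.
# Assumes a line like "Cuda compilation tools, release 11.2, V11.2.152".
nvcc_release() {
    # Print only the part between "release " and the following comma
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}
```

For example, `[ "$(nvcc --version | nvcc_release)" = "11.2" ] || echo "unexpected CUDA version"` reports when the shell is picking up a different toolkit than the one configured in .bashrc.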
3. Install cuDNN
Step 1: Download the cuDNN package
- Download the cuDNN package that matches the CUDA version required by TensorFlow. (You may need to log in to your Nvidia account to download; log in or register by following the website's instructions.)
https://developer.nvidia.com/rdp/cudnn-archive
- For example, here select
Download cuDNN v8.2.0 (April 23rd, 2021), for CUDA 11.x
- Click
cuDNN Library for Linux (x86_64)
to download the compressed package
- Put the compressed package in a directory of your choice, then extract it (use the file name of the archive you actually downloaded)
tar -xzvf cudnn-11.3-linux-x64-v8.2.1.32.tgz
- After extraction, copy the cuDNN files into the CUDA installation directory.
sudo cp cuda/include/cudnn*.h /usr/local/cuda-11.2/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-11.2/lib64
sudo chmod a+r /usr/local/cuda-11.2/include/cudnn*.h /usr/local/cuda-11.2/lib64/libcudnn*
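To confirm which cuDNN version ended up in the CUDA directory, the version macros in the copied header can be read back. This is a sketch assuming cuDNN 8, which keeps CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL in cudnn_version.h; cudnn_version is a name made up for this post.

```shell
# Sketch: print "major.minor.patch" from a cuDNN version header.
# Assumes cuDNN 8 style macros in cudnn_version.h; the function name is hypothetical.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}
```

For the paths used above, the call would be `cudnn_version /usr/local/cuda-11.2/include/cudnn_version.h`.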
4. CUDA version switching
- Some libraries used later in the project require a different CUDA version. Nothing needs to be reinstalled: as long as that version is installed under /usr/local, just change the CUDA paths in the environment. For example, if CUDA 11.1 is required, edit .bashrc
vim ~/.bashrc
Comment out the original cuda-11.2 settings and add the new cuda-11.1 settings
# cuda-11.2
# export CUDA_HOME=/usr/local/cuda-11.2
# export PATH=$PATH:/usr/local/cuda-11.2/bin
# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2/lib64
# cuda-11.1
export CUDA_HOME=/usr/local/cuda-11.1
export PATH=$PATH:/usr/local/cuda-11.1/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib64
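Editing .bashrc every time works, but the switch can also be wrapped in a shell function. The sketch below is an alternative, not the method used above: use_cuda and CUDA_ROOT_PREFIX are names made up for this post, and the function assumes each version lives in a cuda-<version> directory under /usr/local. It prepends the paths so that the selected version is found before any other CUDA entry already on PATH.

```shell
# Sketch: point the current shell at one installed CUDA version.
# use_cuda and CUDA_ROOT_PREFIX are hypothetical names; the default prefix
# /usr/local matches the install locations used in this post.
use_cuda() {
    local version="$1"
    local root="${CUDA_ROOT_PREFIX:-/usr/local}/cuda-$version"
    if [ ! -d "$root" ]; then
        echo "CUDA $version not found at $root" >&2
        return 1
    fi
    export CUDA_HOME="$root"
    # Prepend so this version's nvcc and libraries are found first
    export PATH="$root/bin:$PATH"
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

If the function is defined in .bashrc, running `use_cuda 11.1` switches the current shell, and `nvcc -V` can be used to confirm the result.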