CUDA requires an NVIDIA graphics card or compute card (referred to below as an N card). AMD and Intel graphics cards will not work (though there are comparable standards for them).
Even a low-end display card, such as the GT710, can be used.
Ubuntu is recommended, because CUDA is developed on this platform. Of course, the following operations also work on other Linux systems;
they have been performed on Ubuntu Server 22.04, Debian 12, and Debian 11. If you have not installed a Linux system yet, you can refer to the
Ubuntu server installation diagram
and Debian installation diagram.
Notice! Install the cuda version that matches your needs! The main steps for installing different versions are similar:
install the N card (hardware), install the cuda dependencies (mainly the C compiler), install the N card driver, install nvcc, install cuda. In addition, you may also need PyTorch
and TensorFlow. Choose versions according to your own needs before installing; some components also have operating system requirements. To avoid repeated work, first map out the versions of all required components, and then install them one by one.
Official documentation is always the best: cuda official installation documentation
1. Check the hardware and software environment and delete Nouveau
Do not skip this step; check the environment to confirm that it meets the basic requirements
1. Make sure the system recognizes the N card
lspci | grep -i nvidia
Information similar to the following is displayed (the pictures below are rtx3090 24G and rtx4090 24G respectively):
2. Check the gcc compiler
gcc --version
If gcc is installed, its version is displayed, similar to the following information
If not, it is recommended to install build-essential, a bundle of C/C++ build packages, once and for all
apt-get install build-essential
3. Make sure the supporting packages (the kernel headers for the running kernel) are installed
apt-get install linux-headers-$(uname -r)
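The three checks above can be combined into one preflight script. This is only a sketch: the helper names are made up for illustration, and the test for /lib/modules/$(uname -r)/build is my assumption about where Debian and Ubuntu link the installed kernel headers.

```shell
#!/usr/bin/env bash
# Preflight sketch. The helpers read command output on stdin so they can be
# tried on sample text without real hardware. All names are hypothetical.

# Succeeds if the lspci listing on stdin mentions an NVIDIA device.
has_nvidia_device() {
    grep -qi 'nvidia'
}

# Prints the first x.y.z version token from `gcc --version` output on stdin.
gcc_version_from() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Reports the state of the current machine, skipping tools that are missing.
preflight_report() {
    if command -v lspci >/dev/null 2>&1; then
        if lspci | has_nvidia_device; then
            echo "GPU: NVIDIA device found"
        else
            echo "GPU: no NVIDIA device listed by lspci"
        fi
    else
        echo "GPU: lspci not available"
    fi
    if command -v gcc >/dev/null 2>&1; then
        echo "gcc: $(gcc --version | gcc_version_from)"
    else
        echo "gcc: not installed (apt-get install build-essential)"
    fi
    # Assumption: the headers package provides this build directory.
    if [ -d "/lib/modules/$(uname -r)/build" ]; then
        echo "kernel headers: present for $(uname -r)"
    else
        echo "kernel headers: missing for $(uname -r)"
    fi
}

preflight_report
```

Any line that reports a missing item points back to the matching step above.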
4. Delete Nouveau
(This step is not always necessary. Depending on your situation, disable Nouveau only if the installer prompts you to.)
Linux installs the open source driver for the N card, Nouveau, by default.
Check whether Nouveau is running
lsmod | grep nouveau
If a lot of information is printed, the driver is still loaded. To disable it,
create a new file. The file name does not have to be this one; other names will work.
vi /etc/modprobe.d/nouveau.conf
The content is as follows
blacklist rivafb
blacklist vga16fb
blacklist nouveau
blacklist nvidiafb
blacklist rivatv
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
Apply the change to the kernel (rebuild the initramfs)
update-initramfs -u
After it completes, restart the computer, and then check again.
No output means Nouveau is disabled. If the computer has not been restarted, the module will still be listed.
lsmod | grep nouveau
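The check can also be scripted. The sketch below matches the module name exactly, so that modules which merely list nouveau as a dependency are not counted. The function names are hypothetical, and the file path is the example name used above.

```shell
#!/usr/bin/env bash
# Sketch: report whether Nouveau is loaded and whether it is blacklisted.

# Succeeds if the lsmod listing on stdin has a module named exactly "nouveau".
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Succeeds if the modprobe config on stdin has an uncommented
# "blacklist nouveau" line.
nouveau_blacklisted() {
    grep -qE '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$'
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded; reboot after update-initramfs -u"
    else
        echo "nouveau is not loaded"
    fi
fi

# Assumption: the blacklist lives in the example file name used above.
if [ -r /etc/modprobe.d/nouveau.conf ]; then
    if nouveau_blacklisted < /etc/modprobe.d/nouveau.conf; then
        echo "blacklist entry found"
    fi
fi
```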
2. Use cuda Toolkit to install
This method is recommended because it installs the whole bundle at once: N card driver + cuda + nvcc
Note: With this method you do not need to install the driver first, and you do not need to work out driver compatibility yourself.
The driver version that a cuda release requires is a minimum version; that is, you can use the latest driver with an earlier cuda version.
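If a driver is already installed, the "minimum version" rule can be checked with a version comparison. This is a sketch under assumptions: version_ge is a hypothetical helper that relies on GNU sort -V, and 530.30.02 is simply the driver version embedded in the 12.1.0 installer file name used later in this article, taken here as the minimum.

```shell
#!/usr/bin/env bash
# Sketch: compare the installed driver against an assumed minimum.

# Succeeds if dotted version $1 is greater than or equal to $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: driver version taken from the cuda 12.1.0 installer file name.
required_driver="530.30.02"

if command -v nvidia-smi >/dev/null 2>&1; then
    installed_driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed_driver" "$required_driver"; then
        echo "driver $installed_driver satisfies minimum $required_driver"
    else
        echo "driver $installed_driver is older than $required_driver"
    fi
fi
```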
Official address: cuda toolkit
Reminder again: select the version according to your needs, for example whichever version your PyTorch or TensorFlow release requires. The installation method is the same for different versions.
When using the link above, don't click the Versioned... link at the end; that is a detailed document in English, which looks troublesome.
Once you make your selections, the installation commands appear below them. Just copy and use them.
Version 12.1 is installed here. You can choose the version you need at the official address above; the methods are similar.
The page also shows the installation method for other systems; copy the commands and run them one by one.
The following is a step-by-step operation according to the commands:
(1) For Ubuntu systems, refer to this
1. First create the download directory and switch to it
mkdir /usr/local/my_cuda && cd /usr/local/my_cuda
2. Installation operation
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
Move the pin file into the apt preferences directory
mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
Download the installation package
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
Install the local repository package
dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
Install the repository key
cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
Update the package index
apt-get update
Install cuda. This step takes a long time, so wait patiently
apt-get -y install cuda
Restart the computer after the installation is complete; otherwise various problems may occur
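If apt-get update or the install step complains about the repository, it can help to confirm that the mv and cp steps above left their files in place. The sketch below only checks for the two paths used in those commands. cuda_repo_ready is a hypothetical helper, and its optional argument (an alternative root directory) exists only so the check can be tried in a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch: verify the pin file and keyring from the Ubuntu steps above.

# Succeeds if both files exist under the given root (default: the real /).
cuda_repo_ready() {
    local root="${1:-}"
    [ -f "$root/etc/apt/preferences.d/cuda-repository-pin-600" ] || return 1
    # ls exits non-zero when the glob matches nothing.
    ls "$root"/usr/share/keyrings/cuda-*-keyring.gpg >/dev/null 2>&1
}

if cuda_repo_ready; then
    echo "pin file and keyring are in place"
else
    echo "pin file or keyring is missing; repeat the mv and cp steps"
fi
```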
(2) For Debian systems, refer to this
Choose the local installation method
1. Enter the operating directory
cd /usr/local
2. Download the key and install it into the system
wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
add-apt-repository contrib
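If the above command reports an error such as command not found, install the package that provides it with the following command, and then run add-apt-repository contrib again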
apt-get install software-properties-common
3. Installation
This takes a long time; wait patiently
apt-get update
apt-get -y install cuda
Restart the computer after the installation is complete; otherwise various problems may occur
3. Test
The effective cuda version is the one reported by nvcc. If the N card driver is newer, nvidia-smi shows a newer cuda version (the highest version that the driver supports), but the toolkit actually being called is the one nvcc reports
1. Test nvcc (cuda compiler)
nvcc -V
The normal output looks like the following (if there is an error, see Section 4, which has a solution):
2. Test nvidia-smi
nvidia-smi
If either step has a problem, see Section 4.
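To see the two version numbers side by side, the output of both commands can be parsed. This is a sketch, assuming the usual "release X.Y" line in nvcc -V and the "CUDA Version: X.Y" field in the nvidia-smi banner; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the toolkit version (nvcc) next to the driver's cuda version.

# Prints the toolkit release (e.g. 12.1) from `nvcc -V` output on stdin.
nvcc_release_from() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Prints the "CUDA Version" field from the nvidia-smi banner on stdin.
smi_cuda_version_from() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v nvcc >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
    toolkit="$(nvcc -V | nvcc_release_from)"
    driver_max="$(nvidia-smi | smi_cuda_version_from)"
    echo "toolkit in use (nvcc): $toolkit"
    echo "highest version the driver supports (nvidia-smi): $driver_max"
fi
```

The two numbers may differ; a higher nvidia-smi value only means that the driver is newer than the toolkit.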
4. Problem handling
1. nvcc is not found
Find nvcc
find / -name "nvcc"
The output shows the directory that contains nvcc, for example a directory under /usr/local/cuda-12.1. Then edit ~/.bashrc
vi ~/.bashrc
Append the following at the end of the file (if you are not installing version 12.1, change the version in the directory)
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=$PATH:/usr/local/cuda-12.1/bin
After saving, reload the environment variables
source ~/.bashrc
Run the command again (note that V is capitalized)
nvcc -V
Seeing the following version information is a welcome sight.
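For other versions, the two export lines can be generated from the path that find reports instead of being typed by hand. This is a sketch with hypothetical helper names. It only prints the lines for review and does not edit ~/.bashrc, and its LD_LIBRARY_PATH line appends to any existing value rather than overwriting it.

```shell
#!/usr/bin/env bash
# Sketch: derive the export lines from the location of nvcc.

# Prints the install root for a path to nvcc,
# e.g. /usr/local/cuda-12.1/bin/nvcc -> /usr/local/cuda-12.1
cuda_home_from_nvcc() {
    dirname "$(dirname "$1")"
}

# Prints the two export lines for a cuda install rooted at $1.
cuda_export_lines() {
    local cuda_home="$1"
    printf 'export LD_LIBRARY_PATH=%s/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}\n' "$cuda_home"
    printf 'export PATH=$PATH:%s/bin\n' "$cuda_home"
}

# Preview only. The path below is the 12.1 example from this section.
cuda_export_lines "$(cuda_home_from_nvcc /usr/local/cuda-12.1/bin/nvcc)"
```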
2. nvidia-smi error
Rebooting is said to solve 80% of problems
For example, if an error like the one in the figure below appears, just restart. Everything has already been installed above, and many problems are solved by restarting.
If the hardware cannot be found, restart as well (it is also possible that the graphics card is not seated properly!)
Run nvidia-smi again to see information similar to the following (the upper left corner shows the N card driver version, the upper right corner the cuda version)
The effective cuda version is the one reported by nvcc
5. cuda uninstallation
If you need to change to a different version, it is recommended to let multiple versions coexist, which is not covered here. To uninstall completely, follow the steps below
If you lack permissions, prefix the commands with sudo; I installed as root here
1. Remove cuda
apt-get remove cuda
2. Automatically remove packages that are no longer needed
apt autoremove
3. Remove the remaining cuda packages
apt autoremove 'cuda*'
4. Delete the downloaded installation package (optional)
rm /usr/local/my_cuda/cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
5. Find the remaining related packages
dpkg -l | grep cuda
Packages similar to those shown below may remain. Purge them manually, otherwise installing other versions will fail.
Put the package names found above into the command below.
dpkg -P cuda-repo-ubuntu2204-12-1-local cuda-toolkit-12-1-config-common cuda-toolkit-12-config-common cuda-toolkit-config-common cuda-visual-tools-12-1
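The package names for dpkg -P can be extracted from the dpkg -l listing instead of being copied by hand. This is a sketch: leftover_cuda_packages is a hypothetical helper, and it assumes the standard dpkg -l layout, with a two-letter status in column 1 and the package name in column 2. Review the printed list before purging anything.

```shell
#!/usr/bin/env bash
# Sketch: list package names that still contain "cuda".

# Reads `dpkg -l` output on stdin and prints matching package names.
# Header lines are skipped because their first column is not a
# two-letter status such as ii or rc.
leftover_cuda_packages() {
    awk '$1 ~ /^[a-z][a-zA-Z]$/ && $2 ~ /cuda/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | leftover_cuda_packages
fi
```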
6. Supplementary instructions
1. Upgrade graphics card
If you change the graphics card, you usually don't need to reinstall the driver and cuda. If it doesn't work, just reinstall them.
2. Limit power consumption (use with caution)
On some graphics cards, limiting power consumption effectively reduces temperature with little performance loss.
The following is for reference only; normally, leave these settings alone
Enable persistence mode
nvidia-smi -pm 1
Limit the power consumption of card 0 to 200 W
nvidia-smi -pl 200 -i 0
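Before setting a limit, it is worth checking that the value falls inside the range the card accepts. The sketch below reads the range through the nvidia-smi query fields power.min_limit and power.max_limit. The helper names are hypothetical, and 200 is just the example value from above.

```shell
#!/usr/bin/env bash
# Sketch: check a wanted power limit against the card's allowed range.

# Prints field $1 of a "min, max" CSV line on stdin as whole watts.
csv_watts() {
    awk -F', *' -v n="$1" '{ printf "%d\n", $n }'
}

# Succeeds if integer watt value $1 lies within [$2, $3].
power_limit_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

want=200   # watts, the example value used with -pl above

if command -v nvidia-smi >/dev/null 2>&1; then
    limits="$(nvidia-smi -i 0 --query-gpu=power.min_limit,power.max_limit --format=csv,noheader,nounits)"
    min="$(printf '%s\n' "$limits" | csv_watts 1)"
    max="$(printf '%s\n' "$limits" | csv_watts 2)"
    if [ -n "$min" ] && [ -n "$max" ] && power_limit_in_range "$want" "$min" "$max"; then
        echo "$want W is within the allowed range ($min to $max W)"
    else
        echo "$want W is outside the reported range (min: $min, max: $max)"
    fi
fi
```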
3. Install an older cuda version
Although each cuda version has a driver version requirement, that requirement is only a minimum driver version.
For example, the initial driver version for the rtx4090 is 522.25, while the driver bundled with cuda 11.8 is 522.06, so cuda 11.8 cannot be installed directly with its default settings. If you need this version of cuda,
install the N card driver first, and then run the cuda toolkit 11.8 installer; at that point the program skips the driver by default. The cuda versions displayed by nvcc -V and nvidia-smi can differ because the two work on different principles. cuda runs through nvcc, so nvcc is the one to trust. This matters especially under Windows: it doesn't matter if you accidentally upgrade the N card driver, because the actual cuda version will not change.