Notes on installing the NVIDIA RTX 3060 driver and CUDA on CentOS 7.9, and upgrading gcc offline (4.8 -> 8.3) to compile a framework

1. Change the yum source

This step is mainly needed when working from a terminal only. On a machine with a desktop session you can connect to the Internet directly and install or update software from there.

# Back up the original repo files
mv /etc/yum.repos.d/ /etc/yum.repos.d.bak/
mkdir /etc/yum.repos.d
vim /etc/yum.repos.d/xxx.repo
# In the editor: point the mirrorlist at a reachable mirror (the http URLs), set each enabled entry to 0 or 1 as needed, then save and quit
yum clean all
yum makecache
yum update -y
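
For reference, here is a minimal sketch of what the hand-edited repo file could look like, written by a small helper instead of vim. The function name write_repo, the file name local-base.repo, and the mirror.example.com URL are all placeholders of mine, not values from the original setup; substitute a mirror that is reachable from your machine.

```shell
# Sketch only: write a minimal yum repo file into a directory.
# Usage: write_repo DIR BASEURL
# The repo id "local-base" and the file name are hypothetical.
write_repo() {
    repo_dir=$1
    base_url=$2
    mkdir -p "$repo_dir"
    # $base_url is expanded once here; a literal $basearch inside it is kept for yum.
    cat > "$repo_dir/local-base.repo" <<EOF
[local-base]
name=CentOS-7 - Base (custom mirror)
baseurl=$base_url
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
EOF
}

# Example call (placeholder URL, single quotes keep $basearch literal):
# write_repo /etc/yum.repos.d 'http://mirror.example.com/centos/7/os/$basearch/'
```

After writing the file, the yum clean all and yum makecache steps apply as before.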

Some commands for checking the CentOS system version:

# Show the CentOS release details
cat /etc/centos-release
# Alternatively
lsb_release -a
# If lsb_release is not available, install it first
yum install -y redhat-lsb
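
If a script needs the version number rather than the full release string, something like the following sketch may help. The function name centos_version is my own, and it assumes the release file has the usual "... release X.Y.Z (Core)" layout.

```shell
# Sketch: print only the version number from a centos-release style file.
# Usage: centos_version [FILE]   (defaults to /etc/centos-release)
centos_version() {
    # Keep the digits-and-dots token that follows the word "release".
    sed -n 's/^.*release \([0-9][0-9.]*\).*$/\1/p' "${1:-/etc/centos-release}"
}

# Example:
# centos_version
```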

2. Install the NVIDIA graphics card driver offline

Choose the driver version to match the graphics card: newer, more advanced cards require a newer driver, and an older driver may not be able to drive them at all.

yum install kernel-devel kernel-headers -y
# Stop the X server
sudo service lightdm stop
# or
sudo stop lightdm
# or
sudo init 3
# On CentOS with a monitor attached, the command I have used before is the following (it differs from Ubuntu)
systemctl stop gdm.service
# Start it again once the driver is installed
systemctl start gdm.service

Before installation:
Check the graphics card information:

lshw -numeric -C display

Install the prerequisites:

yum clean all
yum groupinstall "Development tools"
yum install kernel-devel epel-release
yum install dkms
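
One thing worth checking at this point: the NVIDIA installer builds a kernel module, so it needs kernel sources that match the running kernel, and on CentOS the kernel-devel package puts them under /usr/src/kernels/<version>. If yum update pulled in a newer kernel and the machine has not been rebooted yet, the two can differ. The helper below is a sketch of that check; the name has_kernel_sources is mine.

```shell
# Sketch: succeed if kernel sources exist for a given kernel version.
# Usage: has_kernel_sources [KERNEL_VERSION] [SOURCE_ROOT]
# Defaults: the running kernel (uname -r) and /usr/src/kernels.
has_kernel_sources() {
    kver=${1:-$(uname -r)}
    src_root=${2:-/usr/src/kernels}
    [ -d "$src_root/$kver" ]
}

# Example:
# has_kernel_sources || echo "kernel-devel does not match the running kernel; reboot or install the matching package"
```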

The nouveau driver must be disabled, which requires updating the GRUB configuration:

vim /etc/default/grub
# After opening the file, edit it as follows (note nouveau.modeset=0 at the end of GRUB_CMDLINE_LINUX)
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet nouveau.modeset=0"
GRUB_DISABLE_RECOVERY="true"
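
To double-check the edit before regenerating the GRUB config, a quick test along these lines could be used. It is only a sketch, and the function name grub_disables_nouveau is my own.

```shell
# Sketch: succeed if GRUB_CMDLINE_LINUX in the given file contains nouveau.modeset=0.
# Usage: grub_disables_nouveau [FILE]   (defaults to /etc/default/grub)
grub_disables_nouveau() {
    grep -Eq '^GRUB_CMDLINE_LINUX=.*nouveau\.modeset=0' "${1:-/etc/default/grub}"
}

# Example:
# grub_disables_nouveau && echo "nouveau.modeset=0 is set"
```

After the reboot later on, lsmod | grep nouveau should print nothing if the module is no longer loaded.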

Then regenerate the GRUB config with the command that matches how the machine boots, BIOS or EFI:

BIOS:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
EFI:
$ sudo grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
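
If you are not sure which boot mode the machine uses, the presence of the /sys/firmware/efi directory is a common indicator of an EFI boot. The sketch below picks between the two output paths given above on that basis; the function name grub_cfg_path and its optional root argument (used only for testing) are my own additions.

```shell
# Sketch: print the grub.cfg path to pass to grub2-mkconfig -o.
# Usage: grub_cfg_path [ROOT]   (ROOT defaults to empty, i.e. the real /sys)
grub_cfg_path() {
    sys_root=${1:-}
    if [ -d "$sys_root/sys/firmware/efi" ]; then
        echo /boot/efi/EFI/centos/grub.cfg   # EFI boot
    else
        echo /boot/grub2/grub.cfg            # legacy BIOS boot
    fi
}

# Example:
# sudo grub2-mkconfig -o "$(grub_cfg_path)"
```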

Install:

# The NVIDIA driver must be installed while the Xorg server is stopped. Switch to text mode with:
systemctl isolate multi-user.target
# Then run the offline driver installer you downloaded:
bash NVIDIA-Linux-x86_64-*.run

The installer prompts, with the answers I chose:

# Substitute the file name of the installer you downloaded; 390.87 is just an example
sudo bash NVIDIA-Linux-x86_64-390.87.run
# 1. Accept License
# 2. Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. -> YES
# 3. Install NVIDIA’s 32-bit compatibility libraries? -> YES
# 4. The distribution-provided pre-install script failed! Are you sure you want to continue? -> CONTINUE INSTALLATION
# 5. An incomplete installation of libglvnd was found. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries. -> Install and overwrite existing files
# 6. Would you like to run the nvidia-xconfig utility? -> YES

# The NVIDIA driver is now installed. Rebooting is important:
sudo reboot

The installation is now essentially complete. To check that it worked, run:

nvidia-smi

If the installation succeeded, the command prints the driver and graphics card information.
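
For scripting, nvidia-smi can also print selected fields as CSV with its --query-gpu and --format options. The sketch below extracts the driver version of the first GPU from that output. The function name driver_version_from_csv is my own, and it assumes the fields are separated by a comma and a space.

```shell
# Sketch: read "name, driver_version" CSV lines on stdin and print
# the driver version from the first line.
driver_version_from_csv() {
    awk -F', ' 'NR == 1 { print $2 }'
}

# Example:
# nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | driver_version_from_csv
```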

3. Install CUDA and cuDNN

Many guides already cover the installation itself, so I will not go into detail. The key point is that the CUDA paths must be added to the environment variables, otherwise CUDA cannot be found. The main problems I ran into were: ① the nvcc -V command failed; ② during compilation I hit the following error:

/bin/sh: /usr/local/cuda-11.1/bin/nvcc: No such file or directory

This can be solved by the following methods:

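cd /usr/local/cuda/bin
vim ~/.bashrc

# Add the following two lines to the end of the file
# (on x86_64 the CUDA libraries live in lib64; keep any existing LD_LIBRARY_PATH entries)
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/local/cuda/bin
# After saving and quitting, run:

source ~/.bashrc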
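
If you would rather not edit ~/.bashrc by hand, or you re-run the setup several times, a small helper can add each export line only when it is not already present. This is a sketch; the name append_once is mine.

```shell
# Sketch: append LINE to FILE unless an identical line is already there.
# Usage: append_once LINE FILE
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -F: fixed string, -x: whole-line match, -q: quiet
    grep -Fqx -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (single quotes so $PATH is expanded when .bashrc is sourced, not now):
# append_once 'export PATH=$PATH:/usr/local/cuda/bin' ~/.bashrc
```

Running it twice leaves a single copy of the line, so PATH does not grow with every re-run.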

I suggest rebooting the machine after the steps above before moving on to the next step.

4. Download gcc and its dependencies offline and compile (4.8 -> 8.3)

All of the work so far, and this step too, is for compiling my framework. Compiling DCNv2 for the framework failed because the gcc version was too low (the system ships with 4.8). I checked and found that I had previously compiled it with gcc 5.4, so I chose to upgrade gcc. The process is fairly long.
Download the gcc installation package and dependency packages:
gcc: http://mirror.linux-ia64.org/gnu/gcc/releases/
m4: http://ftp.gnu.org/gnu/m4/
gmp + mpfr + mpc + isl: http://mirror.linux-ia64.org/gnu/gcc/infrastructure/
The most convenient way to build gcc is to link the dependencies into its source tree with symbolic links.
Put the downloaded offline packages in the following location:

/usr/local/src/
cd /usr/local/src/
tar -zxvf gcc-8.3.0.tar.gz
cd gcc-8.3.0/

# Inside the extracted gcc directory, copy in the downloaded dependency archives and extract them
tar -xf gmp-6.1.0.tar.bz2
tar -xf mpfr-3.1.4.tar.bz2
tar -xf mpc-1.0.3.tar.gz
tar -xf isl-0.18.tar.bz2

# After extracting, create symbolic links to the dependencies
ln -sf gmp-6.1.0 gmp
ln -sf mpfr-3.1.4 mpfr
ln -sf mpc-1.0.3 mpc
ln -sf isl-0.18 isl

# Create a build directory under the current path and enter it
mkdir build && cd build

# Then configure and compile; this takes a long time, about an hour
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
make && make install
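
Once the build and install finish, it is worth confirming that the gcc found on PATH is really new enough for the framework. The sketch below compares dotted version strings using sort -V from GNU coreutils; the function name version_ge is my own, and the 5.4.0 threshold in the example is taken from the gcc version I had used before.

```shell
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort).
version_ge() {
    # After a version sort, the smaller version comes first;
    # $1 >= $2 exactly when that first entry equals $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example:
# version_ge "$(gcc -dumpversion)" 5.4.0 || echo "gcc on PATH is still too old"
```

If the old version is still reported, the shell may be picking up the system gcc ahead of the newly installed one.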

During the build I ran into an automake version error: the installed version was too old and had to be upgraded. So after the gcc build failed, I fixed the automake problem by upgrading to automake 1.15, then built again.

CentOS 7 WARNING: 'aclocal-1.15' is missing on your system

automake download link: https://ftp.gnu.org/gnu/automake/automake-1.15.tar.gz
Download it to /usr/local/src:

cd /usr/local/src
# Extract
tar -xzf automake-1.15.tar.gz

cd automake-1.15
# Set the install prefix (it must match the PATH entry added below)
./configure --prefix=/usr/local/automake
# Compile
make

make install

# Add it to the environment variables
vim /etc/profile
# In the file, add the install path:
export PATH=$PATH:/usr/local/automake/bin
# Reload the profile so the change takes effect
source /etc/profile
# Verify the installation
aclocal --version
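
Since the original warning named aclocal-1.15 specifically, it can be useful to check that the versioned binaries are actually reachable on PATH before re-running the gcc build. The helper below is a sketch; the name missing_tools is my own.

```shell
# Sketch: print the name of every given tool that is not found on PATH.
# Usage: missing_tools TOOL...
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: prints nothing when both are installed and on PATH
# missing_tools aclocal-1.15 automake-1.15
```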

Once gcc has compiled successfully, the framework can be compiled. This is the setup under which I completed my framework build.

5. Uninstall commands and CUDA / PyTorch download links

To uninstall the graphics card driver:

# Go to the directory that contains the offline .run installer
./NVIDIA-Linux-x86_64-515.0 --uninstall

Command to uninstall CUDA:

cd /usr/local/cuda-11.3/bin/
sudo ./cuda-uninstaller

Some download links:
CUDA Toolkit: https://developer.nvidia.com/cuda-toolkit-archive

cuDNN: https://developer.nvidia.com/rdp/cudnn-download

Version compatibility between PyTorch and CUDA, and between torch and torchvision: https://pytorch.org/get-started/previous-versions/

Origin blog.csdn.net/qq_44442727/article/details/128236868