Ubuntu 18.04: The Painful Journey of Upgrading from a 1080 Ti to a 3090

1. Uninstall the existing GPU driver

sudo apt-get purge 'nvidia*'
or, equivalently,
sudo apt-get remove --purge 'nvidia*'
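
After purging, it can help to confirm that no NVIDIA packages are still installed. This check is not part of the original steps; it is a minimal sketch, and `installed_nvidia_pkgs` is a hypothetical helper name. It filters `dpkg -l` output for packages in the installed (`ii`) state whose name contains "nvidia".

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# packages that are still installed (state "ii") and contain "nvidia".
installed_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Only run the real check where dpkg is available.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | installed_nvidia_pkgs
fi
```

If the purge worked, the command should print nothing.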

2. List the available drivers: ubuntu-drivers devices

(base) jack@JACK429:~$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00002204sv00001B4Csd00001454bc03sc00i00
vendor   : NVIDIA Corporation
manual_install: True
driver   : nvidia-driver-460-server - distro non-free recommended
driver   : nvidia-driver-460 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin
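
To see at a glance which package is flagged as recommended in the listing above, the output can be filtered. This is a sketch under the assumption that the output keeps the `driver : <package> - ... recommended` layout shown here; `recommended_driver` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name on the line marked "recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3 }'
}

# Only run against the real tool where it is installed.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    ubuntu-drivers devices | recommended_driver
fi
```

A specific package from the list can also be installed by name with apt-get instead of relying on the automatic choice.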

3. Install the driver automatically

sudo ubuntu-drivers autoinstall

4. Check that the installation succeeded: nvidia-smi

(base) jack@JACK429:~$ nvidia-smi
Sat Mar  6 06:28:29 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:01:00.0  On |                  N/A |
| 81%   68C    P2   307W / 350W |  23591MiB / 24265MiB |     62%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       941      G   /usr/lib/xorg/Xorg                351MiB |
|    0   N/A  N/A      2014      G   ...mviewer/tv_bin/TeamViewer        4MiB |
|    0   N/A  N/A      2373      G   /usr/bin/compiz                    55MiB |
|    0   N/A  N/A      2635      G   fcitx-qimpanel                     11MiB |
|    0   N/A  N/A     18121      G   ...gAAAAAAAAA --shared-files       98MiB |
|    0   N/A  N/A     22951      C   ...nda3/envs/Jack/bin/python    23063MiB |
+-----------------------------------------------------------------------------+
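
The banner line of this output carries two versions: the installed driver version and the "CUDA Version", which is the highest CUDA version that driver supports (it does not by itself show that a CUDA toolkit is installed). A minimal sketch for pulling both out, assuming the banner format shown above; `smi_versions` is a hypothetical helper name.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the driver
# version and the maximum CUDA version the driver supports.
smi_versions() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*CUDA Version: \([0-9.]*\).*/driver=\1 cuda_max=\2/p'
}

# Only run against the real tool where it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi | smi_versions
fi
```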

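5. Install CUDA and cuDNN (the "CUDA Version: 11.2" that nvidia-smi shows in step 4 is the highest CUDA version the driver supports, not proof that the toolkit is installed, so install the CUDA toolkit manually)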

https://developer.nvidia.com/cuda-toolkit-archive or https://developer.nvidia.com/cuda-11.2.0-download-archive
https://developer.nvidia.com/cudnn (registration required). Download and install these three packages: libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb, libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb, and libcudnn8-samples_8.1.1.33-1+cuda11.2_amd64.deb
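
Assuming the three .deb files have been downloaded into the current directory, they would typically be installed with dpkg in dependency order: the runtime library first, then the development package, then the samples. The sketch below only prints the commands so they can be reviewed before running; `cudnn_install_cmds` is a hypothetical helper name.

```shell
# Hypothetical helper: print the dpkg commands for the three cuDNN packages
# in dependency order (runtime, dev, samples). Nothing is installed here.
cudnn_install_cmds() {
    for pkg in \
        libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb \
        libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb \
        libcudnn8-samples_8.1.1.33-1+cuda11.2_amd64.deb
    do
        echo "sudo dpkg -i $pkg"
    done
}

cudnn_install_cmds
```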

6. Check that the installation succeeded: nvcc --version

(base) jack@JACK429:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
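
The toolkit release reported by nvcc should generally not be newer than the CUDA version the driver supports. A minimal sketch for extracting the release number, assuming the "release X.Y," wording shown above; `nvcc_release` is a hypothetical helper name.

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# toolkit release number (for example 11.2).
nvcc_release() {
    sed -n 's/.*release \([0-9.]*\),.*/\1/p'
}

# Only run against the real compiler where it is installed.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | nvcc_release
fi
```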

At this point, half the work is done.

7. I normally use conda to create virtual environments. Because the 3090 requires CUDA >= 11 and cuDNN >= 8, create a new environment with the following conda command:

conda create -n Jack python=3.8 cudatoolkit=11.0  cudnn=8.0

8. Fix the error that cudatoolkit 11.0 and cudnn 8.0 cannot be found

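Add new mirror channels to conda by editing ~/.condarc (sudo is not needed, because the file lives in your own home directory):

vim ~/.condarc

Edit the contents as follows: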

show_channel_urls: true
channels:
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
  - https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
  - defaults
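
For a scripted setup, the same file can be written without opening an editor. This is a sketch, not part of the original steps: `write_condarc` is a hypothetical helper that takes the target path as its argument and keeps a `.bak` copy of any existing file before overwriting it.

```shell
# Hypothetical helper: write the channel configuration shown above to the
# path given as $1, backing up any existing file to "$1.bak" first.
write_condarc() {
    target="$1"
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi
    cat > "$target" <<'EOF'
show_channel_urls: true
channels:
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
  - https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
  - defaults
EOF
}

# Usage (overwrites the real config, so it is left commented out):
# write_condarc "$HOME/.condarc"
```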

9. Run conda create -n Jack python=3.8 cudatoolkit=11.0 cudnn=8.0 again; this time the installation succeeds.

conda create -n Jack python=3.8 cudatoolkit=11.0  cudnn=8.0

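10. Install tensorflow-gpu 2.5.0. This version is a pre-release, so installing 2.5.0 directly fails because no matching version can be found. Inside the conda virtual environment you created, install it with the following command: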

pip install tf-nightly-gpu==2.5.0.dev20210305

tf-nightly-gpu home page: https://libraries.io/pypi/tf-nightly-gpu
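
Before launching a real project, it is worth confirming that TensorFlow can actually see the GPU. The sketch below is an addition to the original steps and assumes it is run inside the activated conda environment, with `python` pointing at that environment's interpreter; `check_tf_gpu` is a hypothetical helper name. It uses `tf.config.list_physical_devices`, which is available in TensorFlow 2.x.

```shell
# Hypothetical helper: print the TensorFlow version and the GPUs it detects.
# Returns a non-zero status if TensorFlow is missing or no GPU is visible.
check_tf_gpu() {
    python -c '
import tensorflow as tf
print("TensorFlow", tf.__version__)
gpus = tf.config.list_physical_devices("GPU")
print("GPUs:", gpus)
raise SystemExit(0 if gpus else 1)
'
}

# Usage, after activating the environment created above:
# conda activate Jack
# check_tf_gpu
```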

At this point, the project should run successfully.


Reposted from blog.csdn.net/deephacking/article/details/114432191