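1. Remove the existing graphics driver (the pattern is quoted so the shell passes it to apt-get unexpanded)
sudo apt-get purge 'nvidia*'
or, equivalently
sudo apt-get remove --purge 'nvidia*'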
2. List the available drivers: ubuntu-drivers devices
(base) jack@JACK429:~$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00002204sv00001B4Csd00001454bc03sc00i00
vendor : NVIDIA Corporation
manual_install: True
driver : nvidia-driver-460-server - distro non-free recommended
driver : nvidia-driver-460 - third-party free
driver : xserver-xorg-video-nouveau - distro free builtin
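The package worth noting in this listing is the one tagged "recommended". A minimal sketch for pulling that name out of the output, assuming the "driver : <package> - ..." layout shown above; the function name recommended_driver is made up for illustration:

```shell
# Hypothetical helper: reads `ubuntu-drivers devices` output on stdin and
# prints the package name from the first driver line tagged "recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage (not run here): ubuntu-drivers devices | recommended_driver
```

For the listing above this prints nvidia-driver-460-server, which could also be installed explicitly with sudo apt-get install nvidia-driver-460-server.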
3. Install the driver automatically (a reboot is usually needed afterwards before the new driver loads)
sudo ubuntu-drivers autoinstall
4. Check that the driver installed correctly: nvidia-smi
(base) jack@JACK429:~$ nvidia-smi
Sat Mar 6 06:28:29 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04 Driver Version: 460.27.04 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce RTX 3090 Off | 00000000:01:00.0 On | N/A |
| 81% 68C P2 307W / 350W | 23591MiB / 24265MiB | 62% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 941 G /usr/lib/xorg/Xorg 351MiB |
| 0 N/A N/A 2014 G ...mviewer/tv_bin/TeamViewer 4MiB |
| 0 N/A N/A 2373 G /usr/bin/compiz 55MiB |
| 0 N/A N/A 2635 G fcitx-qimpanel 11MiB |
| 0 N/A N/A 18121 G ...gAAAAAAAAA --shared-files 98MiB |
| 0 N/A N/A 22951 C ...nda3/envs/Jack/bin/python 23063MiB |
+-----------------------------------------------------------------------------+
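The two values that matter most in this output are the driver version and the CUDA version in the banner line. A small sketch for extracting them, assuming the banner keeps the "Driver Version: ... CUDA Version: ..." wording shown above; smi_versions is a hypothetical name:

```shell
# Hypothetical helper: reads nvidia-smi output on stdin and prints the driver
# version and the highest CUDA version that driver supports.
smi_versions() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/driver=\1 cuda=\2/p'
}

# Usage (not run here): nvidia-smi | smi_versions
```

For the output above this prints driver=460.27.04 cuda=11.2. The driver version alone is also available in machine-readable form with nvidia-smi --query-gpu=name,driver_version --format=csv,noheader.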
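5. Install CUDA and cuDNN. The "CUDA Version: 11.2" shown by nvidia-smi in step 4 is only the highest CUDA version the driver supports; it does not mean the CUDA toolkit itself is installed, so install the toolkit separately.
CUDA toolkit: https://developer.nvidia.com/cuda-toolkit-archive or https://developer.nvidia.com/cuda-11.2.0-download-archive
cuDNN: https://developer.nvidia.com/cudnn (registration required). Download these three packages and install them in this order, since the dev package depends on the runtime package:
libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb
libcudnn8-samples_8.1.1.33-1+cuda11.2_amd64.deb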
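The cuDNN .deb files can be installed with dpkg. The sketch below assumes all three files were downloaded into one directory; the function name install_cudnn_debs, its directory argument, and the RUN variable are illustrative, not part of any NVIDIA tooling. By default it only prints the commands so they can be reviewed first:

```shell
# Hypothetical helper: installs the three cuDNN packages in dependency order
# (runtime, then dev, then samples). By default it only prints the commands;
# set RUN=sudo to actually execute them.
install_cudnn_debs() (
    dir=${1:-.}
    suffix='8.1.1.33-1+cuda11.2_amd64.deb'
    for pkg in libcudnn8 libcudnn8-dev libcudnn8-samples; do
        ${RUN:-echo} dpkg -i "$dir/${pkg}_${suffix}" || exit 1
    done
)

# Dry run:      install_cudnn_debs ~/Downloads
# Real install: RUN=sudo install_cudnn_debs ~/Downloads
```

The body runs in a subshell (parentheses instead of braces) so the helper's variables do not leak into the calling shell.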
6. Check that the CUDA toolkit installed correctly: nvcc --version
(base) jack@JACK429:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
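If nvcc is reported as not found even though the toolkit is installed, the usual cause is that the toolkit's bin directory is not on PATH. A sketch for locating it, assuming the default install location /usr/local/cuda (adjust it, or set CUDA_HOME, if you installed elsewhere); find_nvcc is a hypothetical name:

```shell
# Hypothetical helper: prints the path of nvcc if it can be located.
# Looks on PATH first, then in the assumed default install location.
find_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        command -v nvcc
    elif [ -x "${CUDA_HOME:-/usr/local/cuda}/bin/nvcc" ]; then
        echo "${CUDA_HOME:-/usr/local/cuda}/bin/nvcc"
    else
        echo "nvcc not found; is the CUDA toolkit installed?" >&2
        return 1
    fi
}
```

If nvcc is found only in the fallback location, adding export PATH=/usr/local/cuda/bin:$PATH to ~/.bashrc makes it available in new shells.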
At this point the job is half done.
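7. I normally use conda to create virtual environments. Because the RTX 3090 requires CUDA >= 11 and cuDNN >= 8, create the new environment with the following conda command: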
conda create -n Jack python=3.8 cudatoolkit=11.0 cudnn=8.0
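The version requirement can be checked mechanically before creating the environment. A minimal sketch, assuming a sort that supports -V (version sort, as in GNU coreutils); version_ge is a made-up helper name:

```shell
# Hypothetical helper: succeeds when version $1 >= version $2.
# Sorts the two versions and checks that the smaller one is $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage: version_ge 11.2 11 && echo "CUDA is new enough for the RTX 3090"
```

With the versions from this guide, version_ge 11.2 11 and version_ge 8.1.1 8 both succeed, while an older toolkit such as 10.2 would fail the check.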
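8. Fix the error that cudatoolkit 11.0 and cudnn 8.0 cannot be found
Add mirror channels to conda by editing ~/.condarc. sudo is not needed, because the file lives in your home directory, and editing it with sudo can leave it owned by root:
vim ~/.condarc
Set its contents to: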
show_channel_urls: true
channels:
- https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
- https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
- https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
- defaults
9. Run the same conda create command again; this time the installation succeeds.
conda create -n Jack python=3.8 cudatoolkit=11.0 cudnn=8.0
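10. Install tensorflow-gpu 2.5.0. At the time of writing, 2.5.0 had not been officially released, so a plain install could not find that version. Inside the conda environment you created (conda activate Jack), install the nightly build instead:
pip install tf-nightly-gpu==2.5.0.dev20210305
tf-nightly-gpu page: https://libraries.io/pypi/tf-nightly-gpu
With that, the project should run successfully.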
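A quick way to confirm that TensorFlow actually sees the GPU is to list the physical devices from inside the environment. This is a sketch: check_tf_gpu and the PYTHON override are illustrative names, while tf.config.list_physical_devices is part of the TensorFlow 2.x API:

```shell
# Hypothetical helper: prints the TensorFlow version and the GPUs it can see.
# Run it after `conda activate Jack`; PYTHON can point at another interpreter.
check_tf_gpu() {
    "${PYTHON:-python}" -c '
import tensorflow as tf
print("TensorFlow", tf.__version__)
print("GPUs:", tf.config.list_physical_devices("GPU"))
'
}
```

An empty list after "GPUs:" means TensorFlow fell back to the CPU, which usually points to a CUDA or cuDNN library that could not be loaded.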