"No package available" error when installing nvidia-container-toolkit on CentOS

First, a walk through the installation procedure.

1. Set up the docker-ce repository:

sudo yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo

2. Install the containerd.io package:

sudo yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.4.3-3.1.el7.x86_64.rpm

3. Install the docker-ce package:

sudo yum install docker-ce -y

Use the following command to make sure the Docker service is enabled and running:

sudo systemctl --now enable docker

Finally, test your Docker installation by running the hello-world container:

sudo docker run --rm hello-world

Output like the following means Docker is working:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

4. Set up the nvidia-container-toolkit repository and GPG key:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
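The URL fetched above depends entirely on what `$distribution` expands to, so it can help to print the URL before writing anything to `/etc/yum.repos.d`. The following is a minimal sketch; the function name `repo_url` is hypothetical and not part of any NVIDIA tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the repo URL that step 4 would fetch.
# Accepts an os-release file path so it can be tried on a copy;
# defaults to the system file /etc/os-release.
repo_url() {
    local os_release="${1:-/etc/os-release}"
    local distribution
    # Source the file inside the command substitution (a subshell),
    # so ID and VERSION_ID do not leak into the calling shell.
    distribution=$(. "$os_release"; echo "$ID$VERSION_ID")
    echo "https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo"
}

# Print the URL for the current machine.
repo_url
```

On CentOS 7 the printed URL is expected to end in `centos7/libnvidia-container.repo`. If the distribution part is empty or looks wrong, the repo file written by step 4 will not be usable.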

Add the experimental branch to the repository list:

sudo yum-config-manager --enable libnvidia-container-experimental

5. Refresh the package list, then install the nvidia-container-toolkit package:

sudo yum clean expire-cache
sudo yum install -y nvidia-container-toolkit

Configure the Docker daemon to recognize the NVIDIA container runtime:

sudo nvidia-ctk runtime configure --runtime=docker

After configuring the runtime, restart the Docker daemon to complete the installation:

sudo systemctl restart docker
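To confirm that the configure step took effect, one option is to look for the runtime entry in the Docker daemon configuration. This sketch assumes `nvidia-ctk` wrote its changes to `/etc/docker/daemon.json` (the usual location); the helper name `has_nvidia_runtime` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a Docker daemon config file
# contains an "nvidia" entry (a simple text match, not a JSON parse).
has_nvidia_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    [ -f "$config" ] && grep -q '"nvidia"' "$config"
}

if has_nvidia_runtime; then
    echo "nvidia runtime is registered in /etc/docker/daemon.json"
else
    echo "nvidia runtime not found; re-run: sudo nvidia-ctk runtime configure --runtime=docker"
fi
```

After the restart, `sudo docker info | grep -i runtimes` should also list `nvidia` among the available runtimes.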

At this point, you can test the setup by running a basic CUDA container:

sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.1-base-centos7 nvidia-smi

This should produce console output similar to the following:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti      Off| 00000000:01:00.0 Off |                  N/A |
| 20%   38C    P0               57W / 250W|      0MiB / 11264MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Cause of the error

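Do not run the installation inside a virtual environment.

Step 4 uses the curl command to set up the repository address.
The curl inside a virtual environment is not the same binary as the one in the system environment,
so the repository address ends up set incorrectly
and yum cannot find the nvidia-container-toolkit package.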
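A quick way to check for this situation is sketched below. It assumes the virtual environment is something like conda or a Python venv, which export `CONDA_PREFIX` or `VIRTUAL_ENV` and may put their own curl first on `PATH`; the helper name `looks_like_repo_file` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: does a file look like a yum .repo file?
# A usable one has at least one [section] header and a baseurl= line.
looks_like_repo_file() {
    local file="$1"
    grep -qE '^\[[^]]+\]' "$file" && grep -qE '^[[:space:]]*baseurl[[:space:]]*=' "$file"
}

# List every curl on PATH; the first entry is the one the shell runs.
type -a curl || echo "curl not found on PATH"

# Assumption: conda and Python venvs export these variables when active.
echo "CONDA_PREFIX=${CONDA_PREFIX:-<unset>} VIRTUAL_ENV=${VIRTUAL_ENV:-<unset>}"

# Inspect the file that step 4 wrote.
repo=/etc/yum.repos.d/nvidia-container-toolkit.repo
if [ -f "$repo" ] && looks_like_repo_file "$repo"; then
    echo "$repo looks like a valid repo file"
else
    echo "$repo is missing or malformed; re-run step 4 outside the virtual environment"
fi
```

If the first curl listed is not the system one (typically `/usr/bin/curl`), deactivate the virtual environment, repeat step 4, and then run step 5 again.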

Reposted from blog.csdn.net/weixin_46398647/article/details/130492565