Quick Installation and Configuration of the NVIDIA Container Toolkit for Docker on CentOS

To install and configure the NVIDIA Container Toolkit correctly on CentOS, follow the steps below. If steps 1 and 2 are already done, skip straight to step 3, which installs the NVIDIA Container Toolkit.

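1. Install the NVIDIA GPU driver:

Download the driver that matches your GPU model and CentOS version from the official NVIDIA website and install it by following the installation guide. Make sure the driver is correctly installed and configured before continuing.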

You can also refer to my earlier posts.
Online installation:
https://blog.csdn.net/holyvslin/article/details/132299184
Installation from a downloaded package:
https://blog.csdn.net/holyvslin/article/details/132143104
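
Before moving on, it can help to confirm that the driver actually responds. The sketch below assumes the driver installed the usual nvidia-smi utility; check_nvidia_driver is a hypothetical helper name used only for illustration, not part of any NVIDIA tool.

```shell
# Report whether the NVIDIA driver appears usable on this host.
# check_nvidia_driver is a hypothetical helper name.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 1
    fi
    # nvidia-smi exits non-zero when it cannot talk to the driver.
    if nvidia-smi >/dev/null 2>&1; then
        echo "driver OK"
    else
        echo "nvidia-smi is present but cannot reach the driver"
        return 1
    fi
}
```

Calling check_nvidia_driver before step 2 gives an early warning, since a container cannot use a GPU that the host driver does not expose.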

2. Install Docker CE:

2.1 Remove old versions of Docker (if present):

sudo yum remove -y docker docker-common docker-selinux docker-engine

2.2 Install the required packages:

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

2.3 Add the Docker CE repository:

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

2.4 Install Docker CE:

sudo yum install -y docker-ce

2.5 Start the Docker service:

sudo systemctl start docker

2.6 Enable Docker to start at boot:

sudo systemctl enable docker
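
Steps 2.1 to 2.6 can be collected into one function, as in the sketch below. The DRY_RUN variable and the run helper are hypothetical additions for illustration; with the default DRY_RUN=1 the script only prints the commands, so it is safe to try before committing to the install.

```shell
# Sketch: steps 2.1-2.6 as a single function.
# DRY_RUN and run() are hypothetical helpers, not part of yum or Docker.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_docker_ce() {
    # 2.1: remove old Docker packages, if any
    run sudo yum remove -y docker docker-common docker-selinux docker-engine
    # 2.2: prerequisites
    run sudo yum install -y yum-utils device-mapper-persistent-data lvm2 || return 1
    # 2.3: Docker CE repository
    run sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo || return 1
    # 2.4: Docker CE itself
    run sudo yum install -y docker-ce || return 1
    # 2.5 and 2.6: start now and enable at boot
    run sudo systemctl start docker || return 1
    run sudo systemctl enable docker
}

install_docker_ce
```

Running the script with DRY_RUN=0 executes the same commands for real; stopping at the first failed step avoids enabling a service that was never installed.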

3. Install the NVIDIA Container Toolkit:

3.1 Add the NVIDIA Container Toolkit repository:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo

Output of the commands above:

[xxx]# distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
[xxx]# curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/stable/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[libnvidia-container-experimental]
name=libnvidia-container-experimental
baseurl=https://nvidia.github.io/libnvidia-container/experimental/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=0
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[nvidia-container-runtime]
name=nvidia-container-runtime
baseurl=https://nvidia.github.io/nvidia-container-runtime/stable/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[nvidia-container-runtime-experimental]
name=nvidia-container-runtime-experimental
baseurl=https://nvidia.github.io/nvidia-container-runtime/experimental/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=0
gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[nvidia-docker]
name=nvidia-docker
baseurl=https://nvidia.github.io/nvidia-docker/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-docker/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
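
The distribution variable in step 3.1 is built from two fields of /etc/os-release, which is why the repository URLs above contain centos7. The sketch below shows the same expansion against an arbitrary file; distro_tag is a hypothetical helper name, and the file argument defaults to /etc/os-release.

```shell
# Print the "$ID$VERSION_ID" tag for a given os-release file.
# distro_tag is a hypothetical helper; with no argument it reads /etc/os-release.
distro_tag() {
    # Source the file in a subshell so ID and VERSION_ID do not leak into the caller.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

Running distro_tag on the target machine before the curl command shows which repository path will be requested, which is useful when the download returns an error page instead of a .repo file.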

3.2 Install the NVIDIA Container Toolkit:

sudo yum install -y nvidia-docker2

Installation output:

[ xxx ]# yum install -y nvidia-docker2
Loaded plugins: fastestmirror, langpacks, nvidia
Loading mirror speeds from cached hostfile
epel/x86_64/metalink                                                                                                                         |  14 kB  00:00:00

base                                                                                                                                         | 3.6 kB  00:00:00
centos-sclo-rh                                                                                                                               | 3.0 kB  00:00:00
centos-sclo-sclo                                                                                                                             | 3.0 kB  00:00:00
cuda-rhel7-x86_64                                                                                                                            | 3.0 kB  00:00:00
docker-ce-stable                                                                                                                             | 3.5 kB  00:00:00
epel                                                                                                                                         | 4.7 kB  00:00:00
extras                                                                                                                                       | 2.9 kB  00:00:00
libnvidia-container/x86_64/signature                                                                                                         |  833 B  00:00:00
Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <[email protected]>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/libnvidia-container/gpgkey
libnvidia-container/x86_64/signature                                                                                                         | 2.1 kB  00:00:00 !!!
nvidia-container-runtime/x86_64/signature                                                                                                    |  833 B  00:00:00
Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <[email protected]>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-container-runtime/gpgkey
nvidia-container-runtime/x86_64/signature                                                                                                    | 2.1 kB  00:00:00 !!!
nvidia-docker/x86_64/signature                                                                                                               |  833 B  00:00:00
Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <[email protected]>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-docker/gpgkey
nvidia-docker/x86_64/signature                                                                                                               | 2.1 kB  00:00:00 !!!
updates                                                                                                                                      | 2.9 kB  00:00:00
(1/6): nvidia-docker/x86_64/primary                                                                                                          | 8.0 kB  00:00:01
(2/6): epel/x86_64/updateinfo                                                                                                                | 1.0 MB  00:00:01
(3/6): nvidia-container-runtime/x86_64/primary                                                                                               |  11 kB  00:00:01
(4/6): libnvidia-container/x86_64/primary                                                                                                    |  35 kB  00:00:01
(5/6): epel/x86_64/primary_db                                                                                                                | 7.0 MB  00:00:04
(6/6): updates/7/x86_64/primary_db                                                                                                           |  22 MB  00:00:10
libnvidia-container                                                                                                                                         231/231
nvidia-container-runtime                                                                                                                                      71/71
nvidia-docker                                                                                                                                                 54/54
Resolving Dependencies
--> Running transaction check
---> Package nvidia-docker2.noarch 0:2.13.0-1 will be installed
--> Processing Dependency: nvidia-container-toolkit >= 1.13.0-1 for package: nvidia-docker2-2.13.0-1.noarch
--> Running transaction check
---> Package nvidia-container-toolkit.x86_64 0:1.13.5-1 will be installed
--> Processing Dependency: nvidia-container-toolkit-base = 1.13.5-1 for package: nvidia-container-toolkit-1.13.5-1.x86_64
--> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.13.5-1.x86_64
--> Processing Dependency: libnvidia-container-tools >= 1.13.5-1 for package: nvidia-container-toolkit-1.13.5-1.x86_64
--> Running transaction check
---> Package libnvidia-container-tools.x86_64 0:1.13.5-1 will be installed
--> Processing Dependency: libnvidia-container1(x86-64) >= 1.13.5-1 for package: libnvidia-container-tools-1.13.5-1.x86_64
--> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for package: libnvidia-container-tools-1.13.5-1.x86_64
--> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.13.5-1.x86_64
---> Package nvidia-container-toolkit-base.x86_64 0:1.13.5-1 will be installed
--> Running transaction check
---> Package libnvidia-container1.x86_64 0:1.13.5-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

====================================================================================================================================================================
 Package                                             Arch                         Version                           Repository                                 Size
====================================================================================================================================================================
Installing:
 nvidia-docker2                                      noarch                       2.13.0-1                          libnvidia-container                       8.7 k
Installing for dependencies:
 libnvidia-container-tools                           x86_64                       1.13.5-1                          libnvidia-container                        52 k
 libnvidia-container1                                x86_64                       1.13.5-1                          libnvidia-container                       1.0 M
 nvidia-container-toolkit                            x86_64                       1.13.5-1                          libnvidia-container                       909 k
 nvidia-container-toolkit-base                       x86_64                       1.13.5-1                          libnvidia-container                       3.1 M

Transaction Summary
====================================================================================================================================================================
Install  1 Package (+4 Dependent packages)

Total download size: 5.1 M
Installed size: 15 M
Downloading packages:
(1/5): libnvidia-container-tools-1.13.5-1.x86_64.rpm                                                                                         |  52 kB  00:00:01
(2/5): libnvidia-container1-1.13.5-1.x86_64.rpm                                                                                              | 1.0 MB  00:00:01
(3/5): nvidia-container-toolkit-1.13.5-1.x86_64.rpm                                                                                          | 909 kB  00:00:01
(4/5): nvidia-docker2-2.13.0-1.noarch.rpm                                                                                                    | 8.7 kB  00:00:00
(5/5): nvidia-container-toolkit-base-1.13.5-1.x86_64.rpm                                                                                     | 3.1 MB  00:00:02
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                               1.1 MB/s | 5.1 MB  00:00:04
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : libnvidia-container1-1.13.5-1.x86_64                                                                                                             1/5
  Installing : libnvidia-container-tools-1.13.5-1.x86_64                                                                                                        2/5
  Installing : nvidia-container-toolkit-base-1.13.5-1.x86_64                                                                                                    3/5
  Installing : nvidia-container-toolkit-1.13.5-1.x86_64                                                                                                         4/5
  Installing : nvidia-docker2-2.13.0-1.noarch                                                                                                                   5/5
warning: /etc/docker/daemon.json saved as /etc/docker/daemon.json.rpmorig
  Verifying  : nvidia-container-toolkit-base-1.13.5-1.x86_64                                                                                                    1/5
  Verifying  : libnvidia-container-tools-1.13.5-1.x86_64                                                                                                        2/5
  Verifying  : nvidia-docker2-2.13.0-1.noarch                                                                                                                   3/5
  Verifying  : libnvidia-container1-1.13.5-1.x86_64                                                                                                             4/5
  Verifying  : nvidia-container-toolkit-1.13.5-1.x86_64                                                                                                         5/5

Installed:
  nvidia-docker2.noarch 0:2.13.0-1

Dependency Installed:
  libnvidia-container-tools.x86_64 0:1.13.5-1                libnvidia-container1.x86_64 0:1.13.5-1            nvidia-container-toolkit.x86_64 0:1.13.5-1
  nvidia-container-toolkit-base.x86_64 0:1.13.5-1

Complete!
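
The transaction above installs nvidia-docker2 plus four dependencies. A quick way to confirm that all five are present is to query rpm for each one, as in the sketch below; missing_packages is a hypothetical helper name.

```shell
# Print each named package that rpm does not report as installed.
# missing_packages is a hypothetical helper name.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

For example, `missing_packages nvidia-docker2 nvidia-container-toolkit nvidia-container-toolkit-base libnvidia-container-tools libnvidia-container1` should print nothing after a successful install.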

4. Configure Docker:

4.1 Create or edit the Docker configuration file /etc/docker/daemon.json:

sudo nano /etc/docker/daemon.json

4.2 Add the following content to the file:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

4.3 Save and close the file.
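
A syntax error in daemon.json stops Docker from starting, so it is worth checking the file before restarting the service. Note also that the installation output in step 3.2 shows an existing daemon.json being saved as /etc/docker/daemon.json.rpmorig, so any earlier settings have to be merged back by hand. The sketch below assumes a Python interpreter is available for its json.tool module; validate_json is a hypothetical helper name.

```shell
# Return 0 when the file parses as JSON, non-zero otherwise.
# validate_json is a hypothetical helper; it assumes python3 or python is on PATH.
validate_json() {
    py=$(command -v python3 || command -v python) || {
        echo "no Python interpreter found, cannot validate" >&2
        return 2
    }
    # json.tool exits non-zero when the input is not valid JSON.
    "$py" -m json.tool "$1" >/dev/null 2>&1
}
```

A possible use is `validate_json /etc/docker/daemon.json && sudo systemctl restart docker`, which restarts the service only when the file parses.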

5. Restart the Docker service:

sudo systemctl restart docker

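After completing these steps, the NVIDIA Container Toolkit is installed and configured on your CentOS system. You can now run GPU-enabled Docker containers, and they will have access to the host's GPU resources.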
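
To check the result, one option is to ask Docker which runtime it uses by default and then run nvidia-smi inside a throwaway container. The sketch below is only one way to do this; both function names are hypothetical, and the ubuntu image is an assumption (any small image should do, since the toolkit is expected to expose the host's nvidia-smi inside the container).

```shell
# Succeed only when Docker reports "nvidia" as its default runtime.
# default_runtime_is_nvidia is a hypothetical helper name.
default_runtime_is_nvidia() {
    [ "$(docker info --format '{{.DefaultRuntime}}' 2>/dev/null)" = "nvidia" ]
}

# Smoke test: run nvidia-smi in a container that is removed afterwards.
# The "ubuntu" image is an assumption chosen for illustration.
gpu_smoke_test() {
    docker run --rm --gpus all ubuntu nvidia-smi
}
```

If default_runtime_is_nvidia fails after the restart, the daemon.json from step 4 was probably not picked up. Depending on how Docker is set up, both commands may need to be run with sudo.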

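Note that the steps above apply to CentOS 7 and later. If you are using a different CentOS version, refer to the installation and configuration guide for that version in the official NVIDIA Container Toolkit documentation.

6. Official NVIDIA Container Toolkit documentation: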

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/index.html

Reposted from blog.csdn.net/holyvslin/article/details/132314959