Installing the NVIDIA GPU Operator in Kubernetes

1. Introduction

Kubernetes provides access to special hardware resources, such as NVIDIA GPUs, network cards, InfiniBand adapters, and other devices, through the device plug-in framework. However, configuring and managing nodes with these hardware resources requires multiple software components, such as drivers, container runtimes, and other libraries, which are difficult and error-prone to assemble. The relevant architecture of the GPU Operator is as follows:


As you can see from the architecture, the NVIDIA GPU Operator uses the Operator framework in Kubernetes to automatically manage all the NVIDIA software components required to provision GPUs. These components include the NVIDIA driver (to enable CUDA), the Kubernetes device plug-in for GPUs, the NVIDIA Container Toolkit, automatic node labeling using GFD (GPU Feature Discovery), DCGM-based monitoring, and more. NVIDIA's official GPU Operator allows for easy installation and configuration, providing great convenience for containerized applications that use GPUs.
The latest version is currently 23.9.0, and the supported platforms are as follows:

Deployment Options
  • Bare Metal
  • Virtual machines with GPU Passthrough
  • Virtual machines with NVIDIA vGPU based products

Hypervisors
  • VMware vSphere 7 and 8
  • Red Hat Enterprise Linux KVM
  • Red Hat Virtualization (RHV)

Operating System | Kubernetes | Red Hat OpenShift | VMware vSphere with Tanzu | Rancher Kubernetes Engine 2 | HPE Ezmeral Runtime Enterprise | Canonical MicroK8s
Ubuntu 20.04 LTS | 1.25–1.28 | - | 7.0 U3c, 8.0 U2 | 1.25–1.28 | - | -
Ubuntu 22.04 LTS | 1.25–1.28 | - | - | - | - | 1.26
CentOS 7 | 1.25–1.28 | - | - | - | - | -
Red Hat Core OS | - | 4.9–4.14 | - | - | - | -
Red Hat Enterprise Linux 8.4, 8.6, 8.7, 8.8 | 1.25–1.28 | - | - | 1.25–1.28 | - | -
Red Hat Enterprise Linux 8.4, 8.5 | - | - | - | - | 5.5 | -

GPU Operator has been officially verified on the following combinations:

Operating System | Containerd 1.4 - 1.7 | CRI-O
Ubuntu 20.04 LTS | Yes | Yes
Ubuntu 22.04 LTS | Yes | Yes
CentOS 7 | Yes | No
Red Hat Core OS (RHCOS) | No | Yes
Red Hat Enterprise Linux 8 | Yes | Yes

2. Kubernetes installation

According to the GPU Operator support list, the minimum Kubernetes version is 1.25. Regarding OS support, Ubuntu and RHEL appear to be better supported. In addition, Dockershim has been removed since Kubernetes 1.24, so if Docker Engine is chosen as the container runtime, cri-dockerd is additionally required. Here we choose the following platforms for the installation and configuration.

Hardware platform:

  • Inspur NF5468M5, Intel® Xeon® Gold 6240R CPU @ 2.40GHz
  • NVIDIA A30 (GA100GL) GPU

Software platform:

  • VMware ESXi, 8.0.1, 21495797
  • Ubuntu 22.04 LTS
  • Docker Engine 24.0.7
  • cri-dockerd 0.3.7

This test uses two Kubernetes nodes in total: one management (control-plane) node and one worker node.

Hostname   IP Address
k8sm       172.16.81.103
k8s01      172.16.81.104

3. OS configuration

Disable swap, load relevant kernel modules, and set relevant kernel parameters.

# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load the required kernel modules
sudo tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Set the required kernel parameters
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
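
To confirm these settings took effect, the loaded modules and sysctl values can be checked (an optional sanity check):

# Verify the kernel modules are loaded
lsmod | grep -e overlay -e br_netfilter
# Verify the sysctl values are applied
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward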

4. Docker Engine and cri-dockerd installation

# Add Docker's official GPG key:
sudo apt update
sudo apt install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
# Install Docker Engine and its plugins:
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
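
Once the packages are installed, an optional sanity check confirms that Docker Engine itself works:

# Show client/server versions and run a throwaway test container
sudo docker version
sudo docker run --rm hello-world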

cri-dockerd can be installed directly by downloading the deb package from GitHub:

wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.7/cri-dockerd_0.3.7.3-0.ubuntu-jammy_amd64.deb
sudo dpkg -i cri-dockerd_0.3.7.3-0.ubuntu-jammy_amd64.deb

Note: cri-dockerd pulls the pause image from Google's registry by default, so you need to change cri-dockerd's default pause image URL; otherwise the initial Kubernetes installation will fail because the pause image cannot be pulled.

After the cri-dockerd installation is complete, edit the cri-docker.service file and change the following line:

ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://

Change to the following:

ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.9

Reload the systemd daemon

sudo systemctl daemon-reload

Start the docker and cri-docker services and set them to start automatically at boot

sudo systemctl start docker
sudo systemctl start cri-docker
sudo systemctl enable docker cri-docker
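
An optional check confirms that both services are active and that the cri-dockerd socket (used later by kubeadm) exists:

systemctl status docker cri-docker --no-pager
ls -l /var/run/cri-dockerd.sock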

5. Install kubeadm

We install kubeadm from the Alibaba Cloud mirror. First, add the installation source:

sudo wget https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg -O /etc/apt/keyrings/kubernetes.gpg
echo \
"deb [arch=amd64 signed-by=/etc/apt/keyrings/kubernetes.gpg] https://mirrors.aliyun.com/kubernetes/apt/ \
kubernetes-xenial main" | \
sudo tee /etc/apt/sources.list.d/kubernetes.list > /dev/null
sudo apt update

Note: Please follow the instructions above to configure the kubeadm installation source. Do not follow the instructions on the Alibaba Cloud Kubernetes repository page to configure the source.

Install kubeadm, kubelet, and kubectl. We do not install the latest Kubernetes version here; instead we install version 1.26.9-00.

sudo apt install -y kubelet=1.26.9-00 kubeadm=1.26.9-00 kubectl=1.26.9-00
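
Optionally, hold these packages at the installed version so that a routine apt upgrade does not move them unexpectedly:

sudo apt-mark hold kubelet kubeadm kubectl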

Initialize the Kubernetes management node. Since there are two container runtimes on the system (containerd was installed alongside Docker Engine), you need to specify the CRI socket when initializing Kubernetes. Do not specify the Docker Engine socket here; use the cri-dockerd socket. At the same time, we specify that the Kubernetes images should be pulled from the Alibaba Cloud mirror.

sudo kubeadm init --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.16.81.103 --cri-socket unix:///var/run/cri-dockerd.sock
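
After kubeadm init completes, it prints follow-up instructions; the usual step to make kubectl usable for the current user on the management node looks like this:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config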

Add the worker node
After the Kubernetes management node is installed, follow the prompts from kubeadm init to join the worker node to the cluster:

sudo kubeadm join 172.16.81.103:6443 --token fh1lte.m80w04ebcmd1ryg4 --discovery-token-ca-cert-hash sha256:5757a76b34ac07a236ad01f8601d4f4f41c82e257a48ddf14620e7b950088793 --cri-socket unix:///var/run/cri-dockerd.sock

Note: Add the --cri-socket parameter

Install the CNI plug-in
Here we use the Antrea CNI plug-in; download the relevant YAML file and apply it directly to Kubernetes:

kubectl apply -f https://github.com/antrea-io/antrea/releases/download/v1.14.1/antrea.yml
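
After the CNI is applied, the nodes should become Ready within a short time; an optional check:

kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide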

6. GPU Operator installation

This time, GPU passthrough is used to present the GPU to the Kubernetes worker node. With passthrough, there is no need to install the NVIDIA driver on ESXi. In addition, there are two options for the GPU driver when installing the GPU Operator: install the driver in the OS, or let the operator run the driver in a container. Both methods work. Here we choose to run the driver in a container, so there is no need to install the NVIDIA driver in the OS.

  1. Configure GPU passthrough

  1. Assign the pass-through GPU to the relevant VM
    First ensure that the VM is in EFI mode

  2. Modify the advanced parameter configuration of the VM

Add the following 2 parameters:

pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "64"


The value of the pciPassthru.64bitMMIOSizeGB parameter can be found on NVIDIA's website:
https://docs.nvidia.com/ai-enterprise/latest/release-notes/index.html#tesla-p40-largememory-vms
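
Once the VM boots with the passthrough device, the GPU should be visible inside the guest OS; an optional check is to look for the NVIDIA device on the PCI bus:

lspci -nn | grep -i nvidia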

  2. Install GPU Operator
    The GPU Operator is installed via a Helm chart, so first install Helm and add the relevant Helm repository:
# Install helm
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
&& chmod 700 get_helm.sh \
&& ./get_helm.sh
# Add the helm repository
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
&& helm repo update
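
An optional check that Helm works and that the chart is visible in the newly added repository:

helm version
helm search repo nvidia/gpu-operator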

By default, the GPU Operator deploys its components on all worker nodes in the cluster that have GPUs. Nodes with GPUs are identified by the label feature.node.kubernetes.io/pci-10de.present=true. We first apply this label to the worker node that has a GPU.

Note: 0x10de in pci-10de is NVIDIA's PCI vendor ID. It can also be seen in the passthrough configuration interface.

kubectl label nodes k8s01 feature.node.kubernetes.io/pci-10de.present=true
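
The label can be confirmed by listing the nodes that carry it (an optional check):

kubectl get nodes -l feature.node.kubernetes.io/pci-10de.present=true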

Install GPU Operator via helm

helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator

If the NVIDIA driver is already installed in the OS, you can choose not to deploy the containerized driver during the helm installation by using the following parameter:

helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator --set driver.enabled=false
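
Helm's standard --version flag can also be used to pin the chart version explicitly (shown below with 23.9.0, the release mentioned earlier; the exact chart version string should be confirmed with helm search repo first):

helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator --version=v23.9.0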

You can monitor the installation process with kubectl. After the installation completes, if everything went well, the following pods will be present:

$ kubectl get pods -n gpu-operator
NAME READY STATUS RESTARTS AGE
gpu-feature-discovery-crrsq 1/1 Running 0 60s
gpu-operator-7fb75556c7-x8spj 1/1 Running 0 5m13s
gpu-operator-node-feature-discovery-master-58d884d5cc-w7q7b 1/1 Running 0 5m13s
gpu-operator-node-feature-discovery-worker-6rht2 1/1 Running 0 5m13s
gpu-operator-node-feature-discovery-worker-9r8js 1/1 Running 0 5m13s
nvidia-container-toolkit-daemonset-lhgqf 1/1 Running 0 4m53s
nvidia-cuda-validator-rhvbb 0/1 Completed 0 54s
nvidia-dcgm-5jqzg 1/1 Running 0 60s
nvidia-dcgm-exporter-h964h 1/1 Running 0 60s
nvidia-device-plugin-daemonset-d9ntc 1/1 Running 0 60s
nvidia-device-plugin-validator-cm2fd 0/1 Completed 0 48s
nvidia-driver-daemonset-5xj6g 1/1 Running 0 4m53s
nvidia-mig-manager-89z9b 1/1 Running 0 4m53s
nvidia-operator-validator-bwx99 1/1 Running 0 58s

If there are any problems, you can check the pod's logs through kubectl logs.
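
For example, to inspect the driver daemonset (its name matches the pod listing above) when something looks wrong:

kubectl logs -n gpu-operator daemonset/nvidia-driver-daemonset --tail=50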

  3. Verify GPU Operator

Create cuda-vectoradd.yaml with the following content:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vectoradd
      image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
      resources:
        limits:
          nvidia.com/gpu: 1

Run the pod and check its logs. If the log output is similar to the following, the test is successful.

kubectl apply -f cuda-vectoradd.yaml
kubectl logs pod/cuda-vectoradd
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

GPU allocation can be viewed with the following command:

$ kubectl describe node k8s01
..........
..........
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 400m (5%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
nvidia.com/gpu 1 1
Events: <none>

Here, under nvidia.com/gpu, you can see that 1 GPU on this node has been allocated.
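
A shorter, optional way to read the GPU resources advertised by the node is to print its allocatable map, which includes nvidia.com/gpu:

kubectl get node k8s01 -o jsonpath='{.status.allocatable}{"\n"}'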

Origin: blog.csdn.net/xixihahalelehehe/article/details/134734318