K8s cluster GPU support (passthrough and vGPU)

1. GPU pass-through method

1-1. Old versions (before v1.8): implementation based on nvidia-docker (rarely needed now; included only for background)

Prerequisites: 1. NVIDIA driver; 2. CUDA; 3. nvidia-docker

K8s can use GPUs by specifying parameters when starting a pod:

(1) alpha.kubernetes.io/nvidia-gpu specifies the number of NVIDIA GPUs to request.

(2) For a GPU container to run, the NVIDIA driver and CUDA library files must be made available inside the container via hostPath volumes. It is enough to mount the single hostPath /var/lib/nvidia-docker/volumes/nvidia_driver/384.98; there is no need to mount separate bin and lib directories.

resources:
  limits:
    alpha.kubernetes.io/nvidia-gpu: 1
volumes:
  - name: bin
    hostPath:
      path: /usr/lib/nvidia-375/bin
  - name: lib
    hostPath:
      path: /usr/lib/nvidia-375
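
For completeness, here is a minimal sketch of what a full pod spec for this legacy scheme could look like. The pod name, image tag, and mount point are illustrative assumptions (old Kubernetes examples typically mounted the driver volume at /usr/local/nvidia), and the driver path must match the actual version on the host:

kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: legacy-gpu-pod                       # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:8.0-runtime           # assumed tag; must match the host driver/CUDA
    command: ["/usr/local/nvidia/bin/nvidia-smi"]
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
    volumeMounts:
    - name: nvidia-driver
      mountPath: /usr/local/nvidia           # assumed mount point inside the container
  volumes:
  - name: nvidia-driver
    hostPath:
      path: /var/lib/nvidia-docker/volumes/nvidia_driver/384.98
EOF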

K8s versions: v1.8, v1.9, v1.10, v1.11 and v1.12

The legacy alpha.kubernetes.io/nvidia-gpu resource is retired in version 1.11, and the GPU-specific scheduling and deployment code is removed entirely from the main codebase.

Instead, two built-in Kubernetes mechanisms, Extended Resources and Device Plugins, together with the Device Plugin implemented by the device vendor, handle everything from cluster-level scheduling of the device onto a worker node down to the actual binding of the device to the container. Since version 1.10, Device Plugins are enabled by default, so you only need to configure limits; there is no need to specify volumes:

resources:
  limits:
    nvidia.com/gpu: 1

1-2. Implementation based on Device Plugins (the important part)

According to the official instructions, nvidia-docker2 must be installed before installing the NVIDIA GPU device plugin. However, Docker 19.03+ does not require nvidia-docker2, so I tried it and found that it is indeed unnecessary: you only need to edit /etc/docker/daemon.json and change the default runtime to nvidia. (If you are running a Docker version below 19.03, follow the official tutorial and install nvidia-docker2.)
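
As a quick sanity check of this (once the nvidia-container-toolkit from step 1 below is installed), you can verify that Docker 19.03+ alone can see the GPUs before involving Kubernetes; the image tag is an assumption and should match your driver:

# Docker 19.03+ with nvidia-container-toolkit: no nvidia-docker2 needed
docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi
# If the host GPUs are listed, the container runtime layer is working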

Before installing K8s, disable swap with swapoff -a, otherwise the installation will fail.
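
A minimal sketch for disabling swap both now and after a reboot (the sed pattern is one common approach; double-check /etc/fstab on your system):

swapoff -a                              # disable swap for the current boot
sed -i '/ swap / s/^/#/' /etc/fstab     # comment out swap entries so it stays off after reboot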

0. Prerequisites: install the NVIDIA driver and a matching CUDA version, and make sure nouveau is disabled (nouveau is disabled if lsmod | grep nouveau produces no output).
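
If lsmod does show nouveau, one common way to disable it (illustrative; the exact initramfs rebuild command depends on the distribution) is:

cat <<EOF > /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
dracut --force        # CentOS; on Ubuntu use: update-initramfs -u
reboot                # nouveau should no longer appear in lsmod afterwards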

1. Install nvidia-container-runtime (CentOS version)

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
yum clean all
yum makecache
sudo yum install -y nvidia-container-toolkit nvidia-container-runtime

(Ubuntu version: set the distribution variable in the same way, then add the GPG key and apt source:)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
apt-get update
apt-get install -y nvidia-container-toolkit nvidia-container-runtime

If apt reports 'dpkg was interrupted, you must manually run sudo dpkg --configure -a to correct the problem': delete the partial state with rm /var/lib/dpkg/updates/* and rebuild the index with sudo apt-get update.
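
After the packages are installed (on either distribution), you can sanity-check the runtime before wiring it into Docker; nvidia-container-cli is shipped with the toolkit:

which nvidia-container-runtime     # should resolve to /usr/bin/nvidia-container-runtime
nvidia-container-cli info          # should print the driver/CUDA version and the detected GPUs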

PS: Docker versions below 19.03 need nvidia-docker2 installed:

# Add the nvidia-docker2 repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
# Install nvidia-docker2 and reload the Docker daemon configuration
yum install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
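
With nvidia-docker2 in place, a quick test (the CUDA image tag is an assumption; pick one compatible with the host driver) is to run a container with the nvidia runtime explicitly:

docker run --rm --runtime=nvidia nvidia/cuda:10.0-base nvidia-smi
# nvidia-docker2 also installs a wrapper, so this is equivalent:
# nvidia-docker run --rm nvidia/cuda:10.0-base nvidia-smi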

2. Configure /etc/docker/daemon.json (the path is the same on CentOS and Ubuntu; create the file manually if it does not exist)

# /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "registry-mirrors": ["https://3lz3ongc.mirror.aliyuncs.com"],
    "live-restore": true
}

systemctl daemon-reload

systemctl restart docker

3. Configure kubelet parameters and enable DevicePlugins (Kubernetes versions before 1.10 disable DevicePlugins by default and need a kubelet flag to enable them; 1.10 and later enable them by default, so no extra kubelet parameter is needed).

The kubelet configuration is in the file /etc/sysconfig/kubelet

Add configuration:

KUBELET_EXTRA_ARGS=--fail-swap-on=false --cadvisor-port=4194 --feature-gates=DevicePlugins=true

Restart the service:

systemctl daemon-reload

systemctl restart kubelet

If the kubectl command reports 'The connection to the server 172.20.231.234:6443 was refused', check whether kubelet is still running with systemctl status kubelet and inspect the kubelet logs with journalctl -u kubelet.

4. Create a DaemonSet (a DaemonSet runs one pod on every node)

kubectl create -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      # This annotation is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      # This toleration is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      - key: CriticalAddonsOnly
        operator: Exists
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      containers:
      - image: nvidia/k8s-device-plugin:1.0.0-beta6
        name: nvidia-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
EOF
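
Once the DaemonSet is created, confirm that a device-plugin pod is running on every GPU node and check its log:

kubectl get pods -n kube-system -l name=nvidia-device-plugin-ds -o wide
kubectl logs -n kube-system -l name=nvidia-device-plugin-ds
# the log should show the plugin registering with the kubelet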

5. Check the number of GPU cards on each GPU node

kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"

You can also view resources with kubectl describe node. GPU resources on a node behave like CPU and memory: once allocated to a pod, other pods cannot use them, and if there are not enough free GPUs the pod will not start.

If a node cannot obtain GPU resources, you need to troubleshoot from the following aspects:

5.1 Check the Device Plugin logs (that is, the logs of the DaemonSet created above; its pods must be running on all GPU nodes).

5.2 Check whether NVIDIA's runtime is configured as Docker's default runtime (runC replacement).

"default-runtime": "nvidia", "runtimes": { "nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []   } } in  /etc/docker/daemon.json Configuration is used to replace docker’s default runC  

After configuring, use docker info and you will see that the corresponding changes have taken effect.

Server Version: 18.09.6
Storage Driver: overlay
Cgroup Driver: cgroupfs
Runtimes: nvidia runc
Default Runtime: nvidia
Docker root dir: /data05

5.3 Check whether the Nvidia driver is installed successfully

6. Test K8s GPU scheduling (PS: the nvidia/cuda image has CUDA built in, so pick a tag whose CUDA version matches the host driver):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
  - image: nvidia/cuda:10.0-base
    name: cuda
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1

kubectl create -f gpu-pod.yaml
kubectl logs gpu-pod   # check whether the nvidia-smi output appears

PS: Note that the K8s GPU solution is also based on nvidia-docker functionality, which is built into Docker 19.03. After nvidia.com/gpu: 1 allocates a GPU, the pod by default uses the host's NVIDIA driver and CUDA version for GPU scheduling, so there is no need to install CUDA inside the pod. GPU resources on a node behave like CPU and memory: once allocated to a pod, other pods cannot use them, and if GPU resources are insufficient the pod will not start.

The above is the NVIDIA GPU Device Plugin: the scheduling scheme contributed by NVIDIA and the most commonly used one.

In addition, there is the GPU Share Device Plugin: a GPU-sharing scheduling solution contributed by the Alibaba Cloud container service team, intended for users who need to share a single GPU across pods.
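
For reference, a rough sketch of a pod under that GPU-sharing scheme. The resource name aliyun.com/gpu-mem and its unit (GiB by default) come from Alibaba's gpushare-scheduler-extender project, and the pod/image names here are just placeholders, so verify against that project's documentation before relying on it:

kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-pod               # hypothetical example name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:10.0-base
    command: ["nvidia-smi"]
    resources:
      limits:
        aliyun.com/gpu-mem: 3       # request 3 GiB of GPU memory instead of a whole card
EOF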

Error: CUDA error: out of memory. Cause: insufficient GPU memory. A GPU can be used by many processes, each consuming part of the available memory.
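
When this error appears, nvidia-smi can show which processes are holding GPU memory:

nvidia-smi       # the process table at the bottom shows per-process GPU memory usage
# machine-readable view (field names per nvidia-smi --help-query-compute-apps):
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv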

Creating a gpuArray or calling gpuDevice() for the first time in MATLAB is a little slow, but subsequent calls are fast, because many GPU libraries have to be loaded on first use.

2. Orion vGPU

The physical GPU is not passed directly into the container, and the solution does not depend on nvidia-docker.

PS: The vGPU virtualization technology requires newer, more expensive GPU cards, so for budget reasons this solution has not been tested.

Configure the pod parameters to enable the GPU

The YAML for the user's pod should contain the following:

resources:
  limits:
    virtaitech.com/gpu: 1
env:
  - name: ORION_GMEM
    value: "4096"

This specifies that the pod uses 1 Orion vGPU and that each vGPU has 4096 MB of video memory.
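
Putting the two fragments together, a minimal (and, per the note above, untested) pod spec would look roughly like this; the pod name is a placeholder, and the image would normally have to be one that contains the Orion client runtime, which is not shown here:

kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: orion-vgpu-pod              # hypothetical example name
spec:
  restartPolicy: Never
  containers:
  - name: orion-client
    image: my-orion-client-image    # placeholder; must include the Orion client runtime
    command: ["sleep", "infinity"]
    resources:
      limits:
        virtaitech.com/gpu: 1       # one Orion vGPU
    env:
    - name: ORION_GMEM
      value: "4096"                 # 4096 MB of video memory for this vGPU
EOF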

Reference: https://gitee.com/teacherandchang/orion
