Choosing the right container runtime for Kubernetes

[Foreword] As backend infrastructure, Kubernetes has clear advantages, offering automated deployment, service scaling, fault self-healing, load balancing, and more. Our current system's backend relies heavily on Kubernetes. Different systems have different requirements for data security and operating efficiency, so choosing the appropriate container runtime becomes a key consideration.

CRI and OCI

The origin of CRI

[Figure: Kubelet → CRI shim → container runtime call flow]

CRI (Container Runtime Interface) is a container runtime interface specification proposed by Kubernetes.

In the Kubernetes architecture, the Kubelet component is responsible for interacting with the container runtime; the call flow is shown in the figure above. The CRI shim is a gRPC server that implements the CRI interface and bridges the Kubelet and the container runtime. The container runtime is the tool that isolates an independent execution environment for user processes. Concretely, the Kubelet calls the CRI shim's interface, and the CRI shim responds to the request by invoking the underlying container runtime tool to run the container. The Kubelet, the CRI shim, and the container runtime are all deployed on a Kubernetes worker node; the first two run as independent daemon processes, while the container runtime is usually not a daemon but a command-line tool.

Kubernetes had no CRI interface before v1.5. At that time, the Kubelet source code integrated code for only two container runtimes (Docker and rkt), which could not meet the needs of all users; in some business scenarios, users have higher requirements for the security isolation of containers and hoped Kubernetes could support more kinds of container runtimes. Therefore, Kubernetes introduced the CRI interface in v1.5: any container runtime that implements the CRI specification can be plugged into the Kubernetes platform and provide container services to users.

The CRI interface brings two benefits. First, it decouples Kubernetes from the container runtime: an update of the container runtime no longer requires recompiling and releasing the Kubelet source. Second, it lets container runtimes iterate at their own pace, while preserving the code quality and stability of the Kubernetes platform.

CRI interface definition

The CRI interface is divided into two parts: the runtime service, RuntimeService, which manages the lifecycle of pods and containers, and the image service, ImageService, which manages the lifecycle of images.
[Figure: CRI interface definition: RuntimeService and ImageService]
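To make the two services concrete, here is a hedged sketch of exercising them with crictl, the standard CRI debugging CLI; it assumes containerd is listening on its default socket (adjust the endpoint for CRI-O or other shims):

crictl --runtime-endpoint unix:///run/containerd/containerd.sock pods    # RuntimeService: list pod sandboxes
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps      # RuntimeService: list containers
crictl --runtime-endpoint unix:///run/containerd/containerd.sock images  # ImageService: list images
crictl --runtime-endpoint unix:///run/containerd/containerd.sock pull nginx:1.25  # ImageService: pull an image

Every crictl subcommand maps onto one of the two gRPC services, which makes it a convenient way to probe any CRI implementation directly.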

What is OCI

The OCI specification (Open Container Initiative) contains two parts: a container runtime standard (runtime-spec) and a container image standard (image-spec). The specific content is defined as follows:
[Figure: definitions of the OCI runtime-spec and image-spec]
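To see the runtime-spec half in practice, here is a minimal sketch using runc directly; it assumes runc is installed and that you can populate a root filesystem:

mkdir -p mycontainer/rootfs && cd mycontainer   # an OCI "bundle" is a rootfs plus a config.json
runc spec                                       # generates a default config.json following the runtime-spec
runc run mycontainer                            # starts a container from the bundle (rootfs must be populated first)

The generated config.json is the runtime-spec document itself: it declares the process to run, the mounts, namespaces, and cgroups for the container.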

Kubelet CRI architecture

After the introduction of CRI in Kubernetes, the architecture of Kubelet is as shown below:
[Figure: Kubelet architecture after the introduction of CRI]

Each container engine only needs to implement a CRI shim that handles CRI requests, and it can then be plugged into the Kubelet.

To be precise, the "container runtime" we have been discussing consists of two layers: the high-level container runtime, i.e. the CRI shim that manages containers (such as containerd or CRI-O), and the low-level container runtime, i.e. the command-line tool that actually creates them (such as runc, runv, or kata).
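For instance, wiring a Kubelet to containerd looks roughly like the following sketch; the flag names apply to v1.x kubelets prior to the dockershim removal, and the socket path assumes a default containerd install:

kubelet --container-runtime=remote \
        --container-runtime-endpoint=unix:///run/containerd/containerd.sock   # remaining kubelet flags omitted

With CRI-O, the endpoint would instead be its own socket (conventionally unix:///var/run/crio/crio.sock).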

Current CRI Landscape

Currently, the mainstream projects that implement CRI include docker, containerd, CRI-O, Frakti, and pouch. The figure below compares how each of them connects the Kubelet to the runtime:

[Figure: how docker, containerd, CRI-O, Frakti, and pouch each connect the Kubelet to the runtime]

PS: since the rkt container engine is not yet fully compatible with the OCI specification, it is not included in the figure.

Current OCI Landscape

The following table is a list of container runtime projects that are compatible with the OCI specification:
[Table: container runtime projects compatible with the OCI specification]

Kubernetes multi-runtime

Why support multiple runtimes? Consider an open cloud platform that provides container services to external users. Two kinds of containers run on the platform: containers used for cloud platform management (trusted), and business containers deployed by users (untrusted). In this scenario, we want to run trusted containers with runc (weaker isolation but good performance) and untrusted containers with runv (strong isolation and better security). Kubernetes provides a solution for exactly this requirement.

In the past, multiple container runtimes were usually supported through annotations; cri-o, frakti, and others worked this way. But this approach is inelegant, and it makes scheduling containers based on the container runtime impossible. Kubernetes therefore added a new API object, RuntimeClass, in v1.12 to support multiple container runtimes.

A RuntimeClass represents one runtime. Before using it, you need to enable the RuntimeClass feature gate (for example, --feature-gates=RuntimeClass=true on the apiservers and kubelets) and create the RuntimeClass CRD:

kubectl apply -f https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/runtimeclass/runtimeclass_crd.yaml

Then you can define a RuntimeClass object:

apiVersion: node.k8s.io/v1alpha1  # RuntimeClass is defined in the node.k8s.io API group
kind: RuntimeClass
metadata:
  name: myclass  # The name the RuntimeClass will be referenced by
  # RuntimeClass is a non-namespaced resource
spec:
  runtimeHandler: myconfiguration  # The name of the corresponding CRI configuration

Then, you can define which RuntimeClass to use in the Pod:

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  runtimeClassName: myclass
  # ...
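For reference, RuntimeClass has since graduated: from Kubernetes v1.20 it is served from node.k8s.io/v1, needs no feature gate or CRD, and the handler field moved to the top level. A minimal sketch of the stable form, reusing the names from the example above:

apiVersion: node.k8s.io/v1     # stable API since Kubernetes v1.20
kind: RuntimeClass
metadata:
  name: myclass                # referenced by pods via runtimeClassName
handler: myconfiguration       # top level in v1; there is no spec wrapper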

Choose the right container runtime

In a production environment, we do not need Docker's image building, container networking, file mounting, swarm, and other capabilities; deploying containerd + runc is enough to run pods on worker nodes. So in production we can skip installing Docker and instead install only a CRI shim component and a runtime tool. How, then, should we choose among the various CRI shims and OCI tools?
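A rough sketch of such a Docker-less node setup, assuming an Ubuntu host (package names and paths vary by distribution):

apt-get install -y containerd                              # high-level runtime; bundles runc and a built-in CRI plugin
containerd config default > /etc/containerd/config.toml    # write out the default configuration
systemctl restart containerd                               # the Kubelet can now target its CRI socket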

First, compare how containerd and CRI-O call runc. The runc code is built into containerd and invoked through function calls, while CRI-O executes the runc binary as an external command; the former is an in-process call, so containerd has the advantage in performance. Second, compare runc and runv, two completely different container technologies: a container process created by runc runs directly on the host kernel, while a runv container runs inside a virtual machine created by a hypervisor. The latter consumes more resources and starts more slowly, and when a runv container accesses the underlying hardware (such as the CPU) there is an extra virtual hardware layer in between, so its computing efficiency is lower than a runc container's. Finally, compare runv and kata: both run containers on a dedicated kernel and provide isolation of network, I/O, and memory. Kata Containers is the merger of two existing open source projects, Intel Clear Containers and Hyper runV; it is a lightweight virtual-machine container designed to unify the security advantages of virtual machines (VMs) with the speed and manageability of containers, that is, security plus high performance.

*Therefore, choose a container runtime based on your own business characteristics and usage scenarios. When there is no strong demand for user isolation, prefer containerd + runc, which performs better and is lighter weight; for business scenarios with high isolation requirements, containerd + Kata is recommended.*
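As a concrete illustration of that pairing, here is a hedged sketch of registering both low-level runtimes side by side in containerd's CRI plugin configuration; it assumes containerd 1.4+ with the Kata shim v2 installed, and the section names differ in older releases:

# /etc/containerd/config.toml (fragment)
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "runc"              # trusted workloads stay on runc
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
    runtime_type = "io.containerd.kata.v2"   # selected via a RuntimeClass whose handler is "kata"

A RuntimeClass with handler: kata then routes selected pods to Kata Containers, while everything else keeps running on runc.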

Problems with Kata

Kata does not support host networking. In a Kubernetes cluster, etcd, nodelocaldns, kube-apiserver, kube-scheduler, metrics-server, node-exporter, kube-proxy, calico, kube-controller-manager, and so on (that is, the static pods and DaemonSets) all use the host network. Therefore, when installing and deploying, runc is still used as the default runtime, and kata-runtime is used as an optional runtime for specific workloads.

Reference: https://cloud.tencent.com/developer/article/1730700
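The conflict is easy to see in the manifests: those components set hostNetwork: true in their pod specs, which Kata cannot honor. A minimal illustrative fragment (the name and image are placeholders, not a full manifest):

apiVersion: v1
kind: Pod
metadata:
  name: node-exporter-sketch   # placeholder name for illustration
spec:
  hostNetwork: true            # shares the node's network namespace; unsupported under Kata
  containers:
  - name: exporter
    image: prom/node-exporter  # image chosen only for illustration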

Answer: the Kubernetes documentation spells out the conditions under which the host network is warranted, and in most cases we do not need it. So when the cluster is initialized, runc is used as the default container runtime command tool, and the kata runtime is specified manually as the runtime tool when starting untrusted application containers. The official guidance:

  • Do not specify a hostPort for a Pod unless it is absolutely necessary. When you bind a Pod to a hostPort, it limits the number of places the Pod can be scheduled, because each <hostIP, hostPort, protocol> combination must be unique. If you don't specify hostIP and protocol explicitly, Kubernetes will use 0.0.0.0 as the default hostIP and TCP as the default protocol.

If you only need access to the port for debugging, you can use the apiserver proxy or kubectl port-forward.

If you explicitly need to expose a Pod's port on the node, consider using a NodePort Service before resorting to hostPort (see the sketch after the reference below).

  • Avoid using hostNetwork, for the same reasons as hostPort.

Reference: https://kubernetes.io/zh/docs/concepts/configuration/overview/
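Following that guidance, here is a hedged sketch of exposing a pod's port through a NodePort Service instead of hostPort; the names and ports (myapp, 8080) are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: myapp-nodeport
spec:
  type: NodePort
  selector:
    app: myapp            # matches pods labeled app=myapp
  ports:
  - port: 80              # the Service's in-cluster port
    targetPort: 8080      # the container port
    nodePort: 30080       # optional; must fall in the default 30000-32767 range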

Summary

In a production environment, you can use a containerd + runc/kata dual-runtime configuration: runc is the default tool for system applications, and kata is manually specified for untrusted third-party applications.

The next article will introduce how to integrate Kata into a Kubernetes cluster.

Origin: blog.csdn.net/qq_26356861/article/details/125764564