An in-depth look at cloud native: how to quickly enable cgroup v2 support in Kubernetes

1. What are the advantages of cgroup v2?

  • There are two versions of cgroups in Linux: cgroup v1 and cgroup v2. cgroup v2 is the new generation of the cgroup API. Kubernetes support for cgroup v2 has been stable since v1.25.
  • cgroup v2 provides a unified control system with enhanced resource management capabilities. It improves on cgroup v1 in several ways, such as:
    • A single unified hierarchical design in the API;
    • Safer subtree delegation to containers;
    • Newer features, such as Pressure Stall Information (PSI);
    • Enhanced resource allocation management and isolation across multiple resources;
    • Unified accounting for different types of memory allocation (network memory, kernel memory, etc.);
    • Accounting for non-immediate resource changes, such as page cache writebacks.
  • Some Kubernetes features rely specifically on cgroup v2 for better resource management and isolation. For example, the MemoryQoS feature improves memory QoS and depends on cgroup v2 primitives.

2. Prerequisites for using cgroup v2

  • cgroup v2 has the following requirements:
    • The operating system distribution enables cgroup v2, for example:
      • Ubuntu (starting with 21.10, 22.04+ recommended)
      • Debian GNU/Linux (starting with Debian 11 Bullseye)
      • Fedora (starting with 31)
      • RHEL and RHEL-like distributions (starting with 9)
    • The Linux kernel version is 5.8 or later.
    • The container runtime supports cgroup v2, for example:
      • containerd v1.4 and later
      • cri-o v1.20 and later
    • The kubelet and the container runtime are configured to use the systemd cgroup driver.
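
The kernel requirement above can be checked with a small script. The following is a minimal sketch in POSIX shell; kernel_at_least is a hypothetical helper name introduced here for illustration, and it assumes the release string has the usual major.minor form reported by uname -r.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Hypothetical helper (not part of any tool): succeeds if the kernel
# release string RELEASE is at least MAJOR.MINOR.
# The body runs in a subshell so its variables do not leak to the caller.
kernel_at_least() (
    release=$1; want_major=$2; want_minor=$3
    major=${release%%.*}                # text before the first dot
    case $release in
        *.*) rest=${release#*.}         # text after the first dot
             minor=${rest%%[!0-9]*} ;;  # leading digits of that remainder
        *)   minor=0 ;;                 # no minor component present
    esac
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
)

# Usage on a node:
#   kernel_at_least "$(uname -r)" 5 8 && echo "kernel is new enough for cgroup v2"
```

The check compares only the major and minor numbers, which is enough for the 5.8 threshold listed above.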

3. Use cgroup v2

① Check and enable cgroup v2 on the Linux node

  • This walkthrough uses Debian 11 Bullseye with containerd v1.4 as an example. Debian 11 Bullseye enables cgroup v2 by default, which can be verified with the following command:
stat -fc %T /sys/fs/cgroup/
  • For cgroup v2, the output is cgroup2fs.
  • For cgroup v1, the output is tmpfs.
  • If it is not enabled, add systemd.unified_cgroup_hierarchy=1 to GRUB_CMDLINE_LINUX in /etc/default/grub, run sudo update-grub, and reboot the node.
  • On a Raspberry Pi, a standard Raspberry Pi OS installation does not enable cgroups (in particular the memory controller) by default. They can be enabled by appending cgroup_memory=1 cgroup_enable=memory systemd.unified_cgroup_hierarchy=1 to /boot/cmdline.txt; the change takes effect after a reboot.
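
The stat check above can be wrapped in a small script for use in node bootstrap or health checks. This is only a sketch: cgroup_version and cgroup_version_from_fstype are hypothetical helper names, and the mapping covers just the two outputs described above, reporting anything else as unknown.

```shell
#!/bin/sh
# Map a filesystem type name, as printed by `stat -fc %T`, to a cgroup version.
# Hypothetical helper name, introduced for illustration.
cgroup_version_from_fstype() {
    case $1 in
        cgroup2fs) echo v2 ;;       # unified hierarchy
        tmpfs)     echo v1 ;;       # legacy hierarchy mounted under a tmpfs
        *)         echo unknown ;;  # anything else, including stat failures
    esac
}

# Report the cgroup version of a mount point (default: /sys/fs/cgroup/).
cgroup_version() (
    mount_point=${1:-/sys/fs/cgroup/}
    fstype=$(stat -fc %T "$mount_point" 2>/dev/null)
    cgroup_version_from_fstype "$fstype"
)

# Usage on a node:
#   cgroup_version
```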

② kubelet uses systemd cgroup driver

  • kubeadm accepts a KubeletConfiguration structure when running kubeadm init. KubeletConfiguration contains the cgroupDriver field, which controls the kubelet's cgroup driver.
  • Since v1.22, if the user does not set the cgroupDriver field in the KubeletConfiguration, kubeadm init defaults it to systemd. Here is a minimal example where this field is configured explicitly:
# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
  • Such a configuration file can be passed to the kubeadm command:
kubeadm init --config kubeadm-config.yaml
  • kubeadm uses the same KubeletConfiguration for all nodes in the cluster. The KubeletConfiguration is stored in a ConfigMap object in the kube-system namespace.
  • Running subcommands such as init, join, and upgrade causes kubeadm to write the KubeletConfiguration to the file /var/lib/kubelet/config.yaml and pass it to the kubelet on the local node.
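
To confirm which driver the kubelet on a node was actually given, the file written by kubeadm can be inspected. The sketch below assumes cgroupDriver appears as an unquoted top-level key, as in the example above; kubelet_cgroup_driver is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of the top-level cgroupDriver key from a kubelet config file.
# Hypothetical helper; defaults to the path kubeadm writes to.
# Prints nothing if the key is absent.
kubelet_cgroup_driver() (
    config=${1:-/var/lib/kubelet/config.yaml}
    sed -n 's/^cgroupDriver:[[:space:]]*//p' "$config"
)

# Usage on a node:
#   [ "$(kubelet_cgroup_driver)" = "systemd" ] && echo "kubelet uses the systemd driver"
```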

③ containerd uses systemd cgroup driver

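  • Edit /etc/containerd/config.toml and set SystemdCgroup = true in the runc options table, then restart containerd for the change to take effect. The table name below uses the version 1 config format; in a version 2 config file the same table is named [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]: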
[plugins.cri.containerd.runtimes.runc.options]
    SystemdCgroup = true
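
The setting can be verified with a quick check. This is a rough sketch that only looks for the SystemdCgroup = true line anywhere in the file and does not parse TOML tables; containerd_uses_systemd_cgroup is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the given containerd config file contains "SystemdCgroup = true".
# Hypothetical helper; a plain text match, not a TOML parser.
containerd_uses_systemd_cgroup() {
    grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' \
        "${1:-/etc/containerd/config.toml}"
}

# Usage on a node:
#   containerd_uses_systemd_cgroup || echo "SystemdCgroup is not enabled"
```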

4. Upgrade monitoring components to support cgroup v2 monitoring

  • cgroup v2 uses a different API than cgroup v1, so any application that accesses the cgroup file system directly must be updated to a version that supports cgroup v2. For example:
    • Some third-party monitoring and security agents may depend on the cgroup file system. Update these agents to versions that support cgroup v2.
    • If you run cAdvisor as a standalone DaemonSet to monitor Pods and containers, update it to v0.43.0 or later.
    • If you run Java applications, use JDK 11.0.16 or later, or JDK 15 or later, for full support of cgroup v2.
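
To illustrate why such tools need updating, here is a sketch of how a script might locate the memory limit file under either version. It assumes the view from inside a container, where /sys/fs/cgroup is the container's own cgroup: on v2 the limit is in memory.max, while on v1 it is in memory/memory.limit_in_bytes. memory_limit_file is a hypothetical helper name, and the presence of cgroup.controllers is used as the v2 marker.

```shell
#!/bin/sh
# Print the path of the memory limit file under a cgroup root directory.
# Hypothetical helper; the cgroup.controllers file exists only on cgroup v2.
memory_limit_file() (
    root=${1:-/sys/fs/cgroup}
    if [ -f "$root/cgroup.controllers" ]; then
        echo "$root/memory.max"                     # cgroup v2 layout
    else
        echo "$root/memory/memory.limit_in_bytes"   # cgroup v1 layout
    fi
)

# Usage inside a container:
#   cat "$(memory_limit_file)"
```

An agent that hard-codes the v1 path finds no such file on a cgroup v2 node, which is the kind of failure the updates above address.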

5. Summary

  • Kubernetes support for cgroup v2 has been stable since v1.25. Compared with cgroup v1, cgroup v2 has the following advantages:
    • A single unified hierarchical design in the API;
    • Safer subtree delegation to containers;
    • Newer features, such as Pressure Stall Information (PSI);
    • Enhanced resource allocation management and isolation across multiple resources;
    • Unified accounting for different types of memory allocation (network memory, kernel memory, etc.);
    • Accounting for non-immediate resource changes, such as page cache writebacks.
  • When running Kubernetes v1.25 or later, it is recommended to use a Linux distribution and a container runtime that support cgroup v2, with cgroup v2 enabled on the nodes.

Origin blog.csdn.net/Forever_wj/article/details/134966308