Kubernetes study notes - compute resource management (2): Pod QoS classes & setting default requests and limits for pods in a namespace (2023-02-18)

1. Pod QoS classes

A node may not be able to provide as many resources as the sum of the resource limits specified by all of its pods. For example, suppose pod A is using 90% of the node's memory and pod B suddenly needs more memory than is left. The node cannot satisfy both, so Kubernetes must decide which container to kill, and it needs priorities to make that decision correctly.

Kubernetes divides pods into three QoS classes:

  • BestEffort (lowest priority)

  • Burstable

  • Guaranteed (highest priority)

The BestEffort class

BestEffort is the lowest-priority QoS class.

It is assigned to pods whose containers set no requests and limits at all. Containers running at this level have no resource guarantees: in the worst case they get almost no CPU time, and they are the first to be killed when memory needs to be freed for other pods. However, because a BestEffort pod has no memory limits, its containers can use as much memory as they want while enough memory is available.
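As a minimal sketch (the pod and container names are illustrative), a pod becomes BestEffort simply by omitting the `resources` section from every container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-demo   # illustrative name
spec:
  containers:
  - name: main
    image: nginx:alpine
    # no resources: section at all -> QoS class is BestEffort
```

After creation, `kubectl describe pod besteffort-demo` reports `QoS Class: BestEffort`.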

The Guaranteed class

This class is assigned to pods whose resource requests and limits are all equal. For a pod to be Guaranteed, several conditions must hold:

  • Requests and limits must be set for both CPU and memory

  • They must be set for every container in the pod

  • They must be equal (the requests and limits of each resource must match for every container)

Because a container's resource requests default to its limits when not explicitly set, setting only the limits for every resource of every container in a pod is enough to make the pod's QoS class Guaranteed. The containers of such a pod can use exactly the amount of resources they requested, but no more (because their requests and limits are equal).
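A minimal sketch of a Guaranteed pod (names are illustrative); specifying only limits is sufficient because the requests default to the same values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo   # illustrative name
spec:
  containers:
  - name: main
    image: nginx:alpine
    resources:
      limits:             # requests default to these same values
        cpu: 200m
        memory: 100Mi
```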

The Burstable class

Burstable sits between BestEffort and Guaranteed. All other pods belong to this class, including: single-container pods whose requests and limits differ, pods with at least one container that defines requests but no limits, and pods where one container's requests equal its limits but another container specifies no requests or limits at all. Burstable pods are guaranteed the amount of resources they requested, and can use additional resources (up to their limits) when available.
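For instance (a sketch with illustrative names), a pod whose requests are lower than its limits falls into the Burstable class:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-demo   # illustrative name
spec:
  containers:
  - name: main
    image: nginx:alpine
    resources:
      requests:
        cpu: 100m
        memory: 50Mi
      limits:             # limits higher than requests -> Burstable
        cpu: 500m
        memory: 200Mi
```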

The QoS class of a single container

| CPU requests vs limits | Memory requests vs limits | Container's QoS class |
|------------------------|---------------------------|-----------------------|
| not set                | not set                   | BestEffort            |
| not set                | requests < limits         | Burstable             |
| not set                | requests = limits         | Burstable             |
| requests < limits      | not set                   | Burstable             |
| requests < limits      | requests < limits         | Burstable             |
| requests < limits      | requests = limits         | Burstable             |
| requests = limits      | requests = limits         | Guaranteed            |

QoS class for multi-container pods

| QoS class of container 1 | QoS class of container 2 | Pod's QoS class |
|--------------------------|--------------------------|-----------------|
| BestEffort               | BestEffort               | BestEffort      |
| BestEffort               | Burstable                | Burstable       |
| BestEffort               | Guaranteed               | Burstable       |
| Burstable                | Burstable                | Burstable       |
| Burstable                | Guaranteed               | Burstable       |
| Guaranteed               | Guaranteed               | Guaranteed      |

Which process is killed when memory runs low

In an overcommitted system, BestEffort pods are killed first, then Burstable pods, and Guaranteed pods last. Guaranteed pods are killed only when system processes themselves need memory.

How containers with the same QoS class are handled

Every running process has an OutOfMemory (OOM) score. The system chooses which process to kill by comparing the OOM scores of all running processes: when memory needs to be freed, the process with the highest score is killed.

The OOM score is calculated from two inputs: the percentage of available memory the process consumes, and a fixed OOM score adjustment based on the pod's QoS class and the container's memory request. For two single-container pods of the Burstable class, the system kills the pod whose actual memory usage is a higher percentage of its memory request. For example, a pod using 90% of its requested memory is killed before one using only 70% of its request, even if the second pod uses more memory in absolute terms.

This shows that we should pay attention not only to the relationship between requests and limits, but also to the relationship between requests and the expected actual memory consumption.

2. Setting default requests and limits for pods in a namespace

The previous section covered setting resource requests and limits for individual containers.

You can avoid having to configure every container by creating a LimitRange resource. A LimitRange not only lets users specify (per namespace) the minimum and maximum amount of each resource a container can be configured with, it also supports setting default values for containers that do not explicitly specify resource requests.

The LimitRange resource

LimitRange resources are used by the LimitRanger admission control plugin. When the API server receives a POST request carrying a pod definition, the LimitRanger plugin validates the pod spec against the configured limits. If validation fails, the request is rejected outright. A widespread use of LimitRange objects is therefore to prevent users from creating pods that request more resources than any single node can provide; without such a LimitRange, the API server would happily accept the pod creation request, but the pod could never be scheduled.

The limits configured in a LimitRange resource apply to each individual pod, container, or other kind of object in the same namespace. They do not limit the total amount of resources available to all pods in the namespace combined; that total is specified through a ResourceQuota object.
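As a sketch of the distinction (the name and quantities here are illustrative), a ResourceQuota caps the namespace-wide totals rather than any individual object:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cpu-and-mem       # illustrative name
spec:
  hard:
    requests.cpu: "10"    # sum of all pods' CPU requests in the namespace
    requests.memory: 20Gi
    limits.cpu: "20"      # sum of all pods' CPU limits
    limits.memory: 40Gi
```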

Creating a LimitRange object

YAML file: limits.yaml

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: example
spec:
  limits:
  - type: Pod
    min:
      cpu: 50m
      memory: 5Mi
    max:
      cpu: 1
      memory: 1Gi
  - type: Container
    defaultRequest:         # applied when a container specifies no requests
      cpu: 100m
      memory: 10Mi
    default:                # applied when a container specifies no limits
      cpu: 200m
      memory: 100Mi
    min:
      cpu: 50m
      memory: 5Mi
    max:
      cpu: 1
      memory: 1Gi
    maxLimitRequestRatio:   # limits may be at most this many times requests
      cpu: 4
      memory: 10
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi
    max:
      storage: 10Gi
```

The validation (and defaulting) rules configured in a LimitRange object are enforced when the API server receives a new pod or PVC creation request. If the limits are modified later, existing pods and PVCs are not re-validated; the new limits apply only to pods and PVCs created afterwards.

Enforcing the limits

If you create a pod whose CPU request exceeds what the LimitRange allows, the creation request is rejected.
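For example (a sketch assuming the LimitRange above, whose container CPU maximum is 1), a pod requesting 2 CPUs would be rejected by the API server at creation time:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: too-big   # illustrative name
spec:
  containers:
  - name: main
    image: nginx:alpine
    resources:
      requests:
        cpu: 2    # exceeds the LimitRange max of 1 CPU -> request rejected
```

The rejection comes from the LimitRanger admission plugin, with an error message stating which maximum was exceeded.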


Origin blog.csdn.net/wwxsoft/article/details/129100327