k8s — Cluster Architecture

1. Nodes

1.1 Management

1.1.1 Node name uniqueness
1.1.2 Self-registration of Nodes
1.1.3 Manual Node administration

1.2 Node status

A Node’s status contains the following information:

  • Addresses
  • Conditions
  • Capacity and Allocatable
  • Info

You can use kubectl to view a Node’s status and other details:

kubectl describe node <insert-node-name-here>
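
If you only need specific status fields, a jsonpath query works too (the node name is a placeholder, as above):

kubectl get node <insert-node-name-here> -o jsonpath='{.status.capacity}'
kubectl get node <insert-node-name-here> -o jsonpath='{.status.conditions[*].type}'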

1.3 Node heartbeats

For nodes there are two forms of heartbeats:

  • Updates to the .status of a Node.
  • Lease objects within the kube-node-lease namespace. Each Node has an associated Lease object.
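
To see the heartbeat Leases themselves, you can list the objects in the kube-node-lease namespace (one Lease per node, named after the node):

kubectl get leases -n kube-node-lease
kubectl describe lease <insert-node-name-here> -n kube-node-lease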

1.4 Node controller

1.4.1 Rate limits on eviction

1.5 Resource capacity tracking

1.6 Node topology

1.7 Graceful node shutdown

1.7.1 Pod Priority based graceful node shutdown

1.8 Non-graceful node shutdown handling

1.9 Swap memory management

2. Communication between Nodes and the Control Plane

2.1 Node to Control Plane

Kubernetes has a “hub-and-spoke” API pattern. All API usage from nodes (or the pods they run) terminates at the API server.
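
As a sketch of this pattern, a pod can reach the API server through the in-cluster kubernetes.default.svc endpoint using its service account credentials (assuming the default service account token is mounted at the standard path):

TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  -H "Authorization: Bearer ${TOKEN}" \
  https://kubernetes.default.svc/api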

2.2 Control plane to node

There are two primary communication paths from the control plane (the API server) to the nodes. The first is from the API server to the kubelet process which runs on each node in the cluster. The second is from the API server to any node, pod, or service through the API server’s proxy functionality.

2.2.1 API server to kubelet

The connections from the API server to the kubelet are used for:

  • Fetching logs for pods.
  • Attaching (usually through kubectl) to running pods.
  • Providing the kubelet’s port-forwarding functionality.
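
Each of these paths is exercised by everyday kubectl commands, which go through the API server and on to the kubelet (the pod name below is a placeholder):

kubectl logs <pod-name>
kubectl exec -it <pod-name> -- /bin/sh
kubectl port-forward <pod-name> 8080:80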

2.2.2 API server to nodes, pods, and services

2.2.3 SSH tunnels (deprecated)

SSH tunnels are currently deprecated, so you shouldn’t opt to use them unless you know what you are doing. The Konnectivity service is a replacement for this communication channel.

2.2.4 Konnectivity service

The Konnectivity service consists of two parts: the Konnectivity server in the control plane network and the Konnectivity agents in the nodes network.

3. Controllers

In robotics and automation, a control loop is a non-terminating loop that regulates the state of a system.

In Kubernetes, controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.

3.1 Controller pattern

3.1.1 Control via API server

The Job controller is an example of a Kubernetes built-in controller. Built-in controllers manage state by interacting with the cluster API server.

Job is a Kubernetes resource that runs a Pod, or perhaps several Pods, to carry out a task and then stop.
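
As a quick illustration (the job name and image here are arbitrary choices, not anything mandated by Kubernetes), you can create a one-shot Job from the command line and watch the Job controller drive it to completion:

kubectl create job hello --image=busybox -- echo "hello from a Job"
kubectl get jobs hello
kubectl get pods -l job-name=hello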

3.1.2 Direct control

In contrast with Job, some controllers need to make changes to things outside of your cluster.

For example, if you use a control loop to make sure there are enough Nodes in your cluster, then that controller needs something outside the current cluster to set up new Nodes when needed.

3.2 Desired versus current state

3.3 Design

3.4 Ways of running controllers

Kubernetes comes with a set of built-in controllers that run inside the kube-controller-manager.
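
On a kubeadm-provisioned cluster, for example, the kube-controller-manager typically runs as a static pod in the kube-system namespace, so you can inspect it with (the label selector assumes kubeadm's conventions):

kubectl get pods -n kube-system -l component=kube-controller-manager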

4. Leases

Distributed systems often have a need for leases, which provide a mechanism to lock shared resources and coordinate activity between members of a set. In Kubernetes, the lease concept is represented by Lease objects in the coordination.k8s.io API Group, which are used for system-critical capabilities such as node heartbeats and component-level leader election.
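
For example, on a typical cluster the control plane components that use leader election hold Lease objects in the kube-system namespace, alongside the per-node heartbeat Leases in kube-node-lease:

kubectl get leases -n kube-system
kubectl describe lease kube-controller-manager -n kube-system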

5. Cloud Controller Manager

The cloud-controller-manager is a Kubernetes control plane component that embeds cloud-specific control logic. The cloud controller manager lets you link your cluster into your cloud provider’s API, and separates out the components that interact with that cloud platform from components that only interact with your cluster.

5.1 Cloud controller manager functions

5.1.1 Node controller
5.1.2 Route controller
5.1.3 Service controller

5.2 Authorization

5.2.1 Node controller
5.2.2 Route controller
5.2.3 Service controller
5.2.4 Others

6. About cgroup v2

6.1 What is cgroup v2?

cgroup v2 offers several improvements over cgroup v1, such as the following:

  • Single unified hierarchy design in API
  • Safer sub-tree delegation to containers
  • Newer features like Pressure Stall Information
  • Enhanced resource allocation management and isolation across multiple resources
  • Unified accounting for different types of memory allocations (network memory, kernel memory, etc.)
  • Accounting for non-immediate resource changes such as page cache write backs

6.2 Using cgroup v2

6.2.1 Requirements

cgroup v2 has the following requirements:

  • OS distribution enables cgroup v2
  • Linux Kernel version is 5.8 or later
  • Container runtime supports cgroup v2. For example:
    • containerd v1.4 and later
    • cri-o v1.20 and later
  • The kubelet and the container runtime are configured to use the systemd cgroup driver
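
A quick way to identify which cgroup version a node is using (run on the node itself) is the filesystem check from the Kubernetes docs:

stat -fc %T /sys/fs/cgroup/

The output is cgroup2fs for cgroup v2 and tmpfs for cgroup v1.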

7. Container Runtime Interface (CRI)

The Container Runtime Interface (CRI) is the main protocol for communication between the kubelet and the container runtime.
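
If crictl is installed on a node, it speaks to the container runtime over the same CRI endpoint, which makes it a handy way to confirm that the kubelet-to-runtime channel works (the socket path below is an assumption and varies by runtime):

crictl --runtime-endpoint unix:///run/containerd/containerd.sock version
crictl ps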

8. Garbage Collection

Garbage collection is a collective term for the various mechanisms Kubernetes uses to clean up cluster resources.

8.1 Cascading deletion

When you delete an object, you can control whether Kubernetes deletes the object’s dependents automatically, in a process called cascading deletion. There are two types of cascading deletion, as follows:

  • Foreground cascading deletion
  • Background cascading deletion

8.1.1 Foreground cascading deletion

In foreground cascading deletion, the owner object first enters a deletion in progress state and is removed only after its blocking dependents have been deleted; the only dependents that block owner deletion are those that have the ownerReference.blockOwnerDeletion=true field.
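
A minimal sketch using kubectl (the deployment name is a placeholder):

kubectl delete deployment <deployment-name> --cascade=foreground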

8.1.2 Background cascading deletion

In background cascading deletion, the Kubernetes API server deletes the owner object immediately and the controller cleans up the dependent objects in the background. By default, Kubernetes uses background cascading deletion unless you manually use foreground deletion or choose to orphan the dependent objects.
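
Background deletion is what a plain kubectl delete does; you can also request it explicitly, or orphan the dependents instead (again, the deployment name is a placeholder):

kubectl delete deployment <deployment-name> --cascade=background
kubectl delete deployment <deployment-name> --cascade=orphan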

8.1.3 Orphaned dependents

8.2 Garbage collection of unused containers and images

8.2.1 Container image lifecycle

Kubernetes manages the lifecycle of all images through its image manager, which is part of the kubelet, with the cooperation of cAdvisor. The kubelet considers the following disk usage limits when making garbage collection decisions:

  • HighThresholdPercent
  • LowThresholdPercent

Disk usage above the configured HighThresholdPercent value triggers garbage collection, which deletes images in order based on the last time they were used, starting with the oldest first. The kubelet deletes images until disk usage reaches the LowThresholdPercent value.

8.2.2 Container garbage collection

The kubelet garbage collects unused containers based on the following variables, which you can define:

  • MinAge: the minimum age at which the kubelet can garbage collect a container. Disable by setting to 0.
  • MaxPerPodContainer: the maximum number of dead containers each Pod can have. Disable by setting to less than 0.
  • MaxContainers: the maximum number of dead containers the cluster can have. Disable by setting to less than 0.

9. Mixed Version Proxy
