A detailed explanation of Kubernetes (K8s) architecture principles

Table of contents

1. Overview of k8s

1. What is k8s?

2. Characteristics

3. Main functions

3. Cluster Architecture and Components

1. Master components

(1) Kube-apiserver

(2)Kube-controller-manager

(3)Kube-scheduler

2. Configuration storage center (etcd)

3. Node components

(1) Kubelet

(2) Kube-proxy

(3) Container engine (docker or rkt)

4. Workflow of k8s

5. K8s resource objects

1. Pod

2. Pod controller

3. Label

4. Label selector

5. Service

6. Ingress

7. Name

8. Namespace

6. Request access process


1. Overview of k8s

1. What is k8s?

        The full name of K8S is Kubernetes. An open source system for automatically deploying, scaling and managing "containerized applications".

        It can be understood that K8S is a cluster responsible for automatic operation and maintenance management of multiple containerized programs (such as Docker), and is an extremely rich container orchestration framework tool.

        K8S is Google's open-source container cluster management system. Based on container technologies such as Docker, it provides containerized applications with a complete set of functions such as deployment and operation, resource scheduling, service discovery, and dynamic scaling, greatly improving the convenience of managing large-scale container clusters.

        K8S is modeled on Google's Borg system (a large-scale container orchestration tool used internally at Google); it was rewritten in Go following Borg's ideas and then donated to the CNCF as an open-source project.

        The Cloud Native Computing Foundation (CNCF) was established in December 2015 and is affiliated with the Linux Foundation. The first project incubated by the CNCF was Kubernetes. With the widespread adoption of containers, Kubernetes has become the de facto standard for container orchestration.

Official website: https://kubernetes.io

GitHub:https://github.com/kubernetes/kubernetes 

2. Characteristics

  • Elastic scaling: Scale application instances up or down via commands, the UI, or automatically based on CPU usage, ensuring high availability at peak load and reclaiming resources during off-peak periods so that services run at minimum cost.
  • Self-healing: When a node fails, restart, replace, and redeploy its containers to maintain the expected number of replicas; kill containers that fail health checks, and send no client requests to containers until they are ready, so that online services are not interrupted.
  • Service discovery and load balancing: K8S provides a unified access entry (an internal IP address and a DNS name) for a group of containers and load-balances across all associated containers, so users do not need to track container IPs.
  • Automated releases (rolling updates by default) and rollbacks: K8S updates an application with a rolling strategy, replacing Pods one at a time rather than deleting them all at once; if a problem occurs during the update, the changes are rolled back so the upgrade does not affect the business.
  • Centralized configuration and secret management: Manage confidential data and application configuration without exposing sensitive data in images, improving the security of sensitive data; commonly used configuration can also be stored in K8S for applications to use.
  • Storage orchestration: Mount external storage systems, whether local storage, public cloud (e.g. AWS), or network storage (e.g. NFS, GlusterFS, Ceph), as part of the cluster's resources, greatly improving the flexibility of storage use.
  • Batch processing: Provide one-off and scheduled tasks to cover batch data processing and analysis scenarios.
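As a sketch of the elastic-scaling characteristic above, a HorizontalPodAutoscaler can scale a workload based on CPU usage. The Deployment name `web` and all thresholds below are illustrative assumptions, not from the original text:

```yaml
# Hypothetical example: autoscale a Deployment named "web" (name assumed)
# between 2 and 10 replicas, targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

When load rises above the target, K8S adds replicas up to the maximum; when load falls, replicas are reclaimed down to the minimum.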

3. Main functions

  • Orchestrate containers across hosts.
  • Make full use of hardware resources to maximize the needs of enterprise applications.
  • Control and automate application deployment and upgrades.
  • Mount and add storage for stateful applications.
  • Scale up or down containerized applications and their resources online.
  • Declarative container management ensures that the deployed applications work the way we deploy them.
  • Realize application status inspection and self-healing through automatic layout, automatic restart, automatic replication, and automatic scaling.
  • Provide service discovery and load balancing for multiple containers, so that users do not need to consider container IP issues.

3. Cluster Architecture and Components

        K8S belongs to the master-slave device model (Master-Slave architecture), that is, the Master node is responsible for the scheduling, management and operation and maintenance of the cluster, and the Slave node is the computing workload node in the cluster.

        The master node is generally called the Master node; it runs the apiserver, controller-manager, and scheduler, and uses etcd for cluster storage. The slave nodes are called Worker Nodes; each Node runs kubelet, kube-proxy, and a container engine (such as docker), and is assigned workloads by the Master.

        The Master component can run on any computer in the cluster, but it is recommended that the Master node occupy a separate server. Because the Master is the brain of the entire cluster, if the node where the Master is located is down or unavailable, all control commands will be invalid. In addition to the Master, other machines in the K8S cluster are called Worker Node nodes. When a Node goes down, the workload on it will be automatically transferred to other nodes by the Master.

1. Master components

(1) Kube-apiserver

        Used to expose the Kubernetes API; every resource request or invocation goes through the interface provided by kube-apiserver. The interface service is provided via a RESTful HTTP API. All create, delete, update, query, and watch operations on object resources are handled by the API Server and then persisted to etcd.

        It can be understood that the API Server is the request entry for all K8S services. The API Server is responsible for receiving all K8S requests (from the UI interface or CLI command line tool), and then notifies other components to work according to the user's specific request. It can be said that the API Server is the brain of the K8S cluster architecture.

(2)Kube-controller-manager

        The operation management controller is the background thread for processing routine tasks in the K8S cluster, and is the automatic control center for all resource objects in the K8S cluster.

        In a K8S cluster, a resource corresponds to a controller, and the Controller manager is responsible for managing these controllers.

        The Controller Manager consists of a series of controllers that monitor the state of the entire cluster through the API Server and ensure that the cluster stays in the expected working state. For example, when a Node goes down unexpectedly, the Controller Manager detects it in time and starts an automatic repair process.

These controllers mainly include:

  • Node Controller: Responsible for discovering and responding when a node fails.
  • Replication Controller: Responsible for ensuring that the number of Pod copies associated with an RC (resource object Replication Controller) in the cluster always maintains the preset value. It can be understood as ensuring that there are only N Pod instances in the cluster, and N is the number of Pod copies defined in RC.
  • Endpoints Controller: Populates Endpoints objects (i.e. connects Services and Pods) and monitors changes to Services and their corresponding Pod replicas. An endpoint is an access point exposed by a service; to access a service you must know its endpoints.
  • Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.
  • ResourceQuota Controller (Resource Quota Controller): Ensure that the specified resource object does not over-occupy system physical resources at any time.
  • Namespace Controller: Manages the life cycle of a namespace.
  • Service Controller: An interface controller between the K8S cluster and an external cloud platform.

(3)Kube-scheduler

        It is the process responsible for resource scheduling, and selects a suitable Node node for the newly created Pod according to the scheduling algorithm.

        It can be understood as the scheduler of all Node nodes of K8S. When the user wants to deploy the service, the Scheduler will select the most suitable Node node to deploy the Pod according to the scheduling algorithm.

Scheduling Algorithm:

  • Pre-selection strategies (predicates)
  • Preference strategies (priorities)

        When the API Server receives a request to create a batch of Pods, it has the Controller Manager create the Pods from the preset template, and the Controller Manager then asks the Scheduler, via the API Server, to pick the most suitable Node for each new Pod. Suppose running a Pod requires 2 CPUs and 4 GiB of memory: in the pre-selection stage, the Scheduler filters out Nodes that do not satisfy the predicates. Each Node reports its remaining resources to the API Server, which stores them in etcd, so the Scheduler can compare every Node's remaining resources against the Pod's requirements; a Node with insufficient resources, or one that fails any other predicate, does not pass pre-selection. In the preference stage, the Nodes that passed pre-selection are ranked by the priority strategies, and the highest-scoring Node is chosen; for example, a Node with more free resources and less load ranks higher.
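The "2 CPUs and 4 GiB" requirement from the walkthrough above is declared in the Pod spec as resource requests, which is what the predicate stage checks against each Node's free capacity. The Pod name and image below are assumptions for illustration:

```yaml
# Hypothetical Pod requesting 2 CPUs and 4 GiB of memory. The scheduler's
# pre-selection (predicate) stage filters out Nodes whose remaining
# resources cannot satisfy these requests; the preference (priority)
# stage then ranks the Nodes that passed.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app        # name assumed
spec:
  containers:
    - name: app
      image: nginx:1.25  # image assumed
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
```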

2. Configuration storage center (etcd)

        etcd is a distributed key-value storage system and a storage service of K8S. It stores key configurations and user configurations of K8S. Only the API Server in K8S has read and write permissions, and other components must pass the interface of the API Server to read and write data.

        etcd is an open-source project started by the CoreOS team in June 2013. Its goal is to build a highly available distributed key-value database. Internally, etcd uses the Raft protocol as its consensus algorithm, and it is written in Go.

As a service discovery system, etcd has the following characteristics:

  • Simple: easy to install and configure, and provides an HTTP API for interaction
  • Secure: supports SSL/TLS certificate verification
  • Fast: a single instance supports 2k+ read operations per second
  • Reliable: uses the Raft algorithm to achieve availability and consistency of distributed data

        etcd currently uses port 2379 to provide HTTP API services by default, and port 2380 to communicate with peers (these two ports have been officially reserved for etcd by IANA). That is to say, etcd uses port 2379 to provide external communication for clients by default, and port 2380 for internal communication between servers.

        In production, etcd is generally recommended to be deployed as a cluster. Because of etcd's leader election mechanism, an odd number of members (at least 3) is required.

3. Node components

(1) Kubelet

        The monitor of the Node and the communicator with the Master. The kubelet is the Master's agent on each Node: it regularly reports the status of the services running on its Node to the API Server and accepts instructions from the Master to take corrective action.

        The kubelet obtains the desired state of the Pods on its own node from the Master (which containers to run, how many replicas, how to configure networking and storage, etc.) and interacts directly with the container engine to manage the container lifecycle. If the actual state of the Pods on its node differs from the desired state, it calls the container platform's interface (e.g. the docker interface) to reach that state.

        It is also responsible for managing the cleaning of images and containers, ensuring that the images on the nodes will not occupy the disk space, and the exited containers will not occupy too many resources.

        That is, in a Kubernetes cluster, a kubelet service process is started on each Node. This process is used to handle tasks sent by the Master to the node, and manage Pods and containers in Pods. Each kubelet process will register the information of the node itself on the API Server, regularly report the usage of node resources to the Master, and monitor the container and node resources through cAdvisor.

(2) Kube-proxy

        The Pod network agent is implemented on each Node node, which is the carrier of Kubernetes Service resources and is responsible for maintaining network rules and four-layer load balancing. Responsible for writing rules to iptables and ipvs to implement service mapping access.

        Kube-Proxy itself does not directly provide a network for Pods. The Pod's network is provided by Kubelet. Kube-Proxy actually maintains a virtual Pod cluster network.

        Kube-proxy watches the API Server for updates to Kubernetes Services and Endpoints, and maintains the corresponding forwarding rules.

        The load balancing of microservices in the K8S cluster is implemented by Kube-proxy. Kube-proxy is a load balancer inside the K8S cluster. It is a distributed proxy server that runs a Kube-proxy component on each node of K8S.

(3) Container engine (docker or rkt)

        The container engine runs containers and is responsible for creating and managing local containers. When Kubernetes schedules a Pod to a node, the kubelet on that node instructs docker to start the specific containers. The kubelet then continuously collects container information through docker and reports it to the master node. Docker pulls images and starts and stops containers as usual; the only difference is that this is controlled by an automated system rather than by an administrator operating each node manually.

4. Workflow of k8s

  1. A user sends a request to create a Pod to the apiserver on the master node through a client;
  2. The apiserver first writes the request information into etcd, then asks the controller-manager to create the Pods according to the preset resource template;
  3. The controller-manager then asks the scheduler, via the apiserver, to select the most suitable Node for the newly created Pods;
  4. The scheduler selects the most suitable Node using the pre-selection and preference strategies of the scheduling algorithm;
  5. The apiserver then asks the kubelet on the chosen Node to create and manage the Pods;
  6. The kubelet interacts directly with the container engine to manage the containers' lifecycle;
  7. The user creates a Service resource; kube-proxy writes the related network rules to realize service discovery and load balancing for the Pods.

5. K8s resource objects

        Kubernetes contains many types of resource objects: Pod, Label, Service, Replication Controller, etc.

        All resource objects can be added, deleted, modified, and checked through the kubectl tool provided by Kubernetes, and stored in etcd for persistent storage.

        Kubernetes is actually a highly automated resource control system. By tracking and comparing the difference between the expected state of resources stored in etcd storage and the actual resource state in the current environment, advanced functions such as automatic control and automatic error correction are realized.

1. Pod

        Pod is the smallest/simplest basic unit created or deployed by Kubernetes, and a Pod represents a process running on the cluster. A Pod can be understood as a pea pod, and each container in the same Pod is a pea.

        A Pod consists of one or more containers. The containers in the Pod share network, storage and computing resources and run on the same Docker host.

        Multiple containers can be run in a Pod, also called sidecar mode (SideCar). In a production environment, a Pod is generally composed of a single container or multiple containers with strong associations and complementarities.

        Containers in the same Pod can access each other through localhost, and can mount all data volumes in the Pod; however, containers between different Pods cannot use localhost to access, nor can they mount data volumes in other Pods.
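The sidecar pattern described above can be sketched as a Pod with two containers sharing a volume and the same network namespace. All names, images, and paths below are illustrative assumptions:

```yaml
# Sketch of the sidecar (SideCar) pattern: a web container writes logs
# to a shared emptyDir volume, and a second container in the same Pod
# reads them. The two containers can also reach each other via localhost.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-sidecar   # name assumed
spec:
  volumes:
    - name: logs
      emptyDir: {}             # volume shared by both containers
  containers:
    - name: web
      image: nginx:1.25        # image assumed
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-collector
      image: busybox:1.36      # image assumed
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs
```

A container in a *different* Pod could not mount the `logs` volume or use localhost to reach these containers, matching the restriction above.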

2. Pod controller

        Pod controller is a template for Pod startup, which is used to ensure that Pods started in K8S should always run according to user expectations (number of copies, life cycle, health status check, etc.).

There are many Pod controllers in K8S; the commonly used ones are as follows:

  • Deployment: stateless application deployment. A Deployment manages and controls Pods and ReplicaSets, keeping them in the state the user expects.
  • ReplicaSet: ensures the expected number of Pod replicas. A ReplicaSet manages and controls Pods, but is itself controlled by a Deployment.
  • DaemonSet: ensures that every node runs one copy of a given type of Pod; usually used for system-level background tasks.
  • StatefulSet: stateful application deployment.
  • Job: a one-off task; the Pods managed by a Job exit automatically after the task completes successfully.
  • CronJob: periodic scheduled tasks.
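A minimal Deployment sketch ties the list above together: the Deployment manages a ReplicaSet, which in turn keeps the requested number of Pods running. The name, label, and image are assumptions for illustration:

```yaml
# Hypothetical Deployment keeping 3 replicas of an nginx Pod running.
# If a Pod dies, the underlying ReplicaSet recreates it to restore
# the expected replica count.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy   # name assumed
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx       # must match the Pod template's labels
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25   # image assumed
```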

3. Label

        Tags are a characteristic management method of K8S, which facilitates the classification and management of resource objects. Label can be attached to various resource objects, such as Node, Pod, Service, RC, etc., for associating objects, querying and filtering.

        A Label is a key-value pair, where the key and value are specified by the user.

        A resource object can define any number of Labels, and the same Label can also be added to any number of resource objects, and can also be dynamically added or deleted after the object is created.

        The multi-dimensional resource group management function can be realized by binding one or more different Labels to the specified resource object.

Similar to Label, there is also Annotation.
The difference is that a valid label value must be 63 characters or less, must be empty or begin and end with an alphanumeric character ([a-z0-9A-Z]), and may contain dashes (-), underscores (_), dots (.), and alphanumerics in between. Annotation values have no length limit.
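For illustration, a Pod carrying both user-chosen labels (which must follow the character rules above) and an annotation (whose value is unrestricted in length). All keys and values here are assumptions:

```yaml
# Hypothetical Pod metadata: three labels for grouping/filtering,
# plus one annotation for free-form descriptive data.
apiVersion: v1
kind: Pod
metadata:
  name: labeled-pod          # name assumed
  labels:
    app: web                 # labels are key-value pairs chosen by the user
    tier: frontend
    env: production
  annotations:
    example.com/change-cause: "rolled out by the 2024-05 release pipeline"  # value assumed
spec:
  containers:
    - name: web
      image: nginx:1.25      # image assumed
```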

4. Label selector

        Defining a Label for a resource object is equivalent to giving it a label; then you can query and filter resource objects with certain Labels through the label selector (Label selector).

There are currently two types of label selectors:

  • Equality-based (equal, not equal)
  • Set-based (in, not in, exists)
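Both selector styles can be sketched as the `selector` fragment of a hypothetical Deployment spec; the keys and values are assumptions for illustration:

```yaml
# Fragment of a workload spec showing the two label selector styles.
# matchLabels is equality-based; matchExpressions is set-based.
selector:
  matchLabels:
    app: web               # equality-based: app == web
  matchExpressions:
    - key: tier
      operator: In         # set-based: tier is in {frontend, cache}
      values: [frontend, cache]
    - key: env
      operator: Exists     # set-based: the "env" label key exists
```

On the command line, the same idea appears as e.g. `kubectl get pods -l 'app=web,tier in (frontend)'`.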

5. Service

        In a K8S cluster, although each Pod is assigned its own IP address, Pods have a life cycle (they can be created at any time and, once destroyed, are not resurrected), and they may be replaced at any time as the business changes; a Pod's IP address therefore disappears when the Pod is destroyed. Service is the core concept used to solve this problem.

        Service in K8S does not mean "service" as we often say, but more like a gateway layer, which can be regarded as a group of external access interfaces and traffic balancers of Pods that provide the same service.

        Which Pods a Service applies to is defined by a label selector. In a K8S cluster, a Service can be regarded as the external access interface of a group of Pods that provide the same service; what a client accesses is the Service object. Each Service has a fixed virtual IP (also called the Cluster IP) that is automatically and dynamically bound to the backend Pods. All network requests go to the Service's virtual IP, and the Service forwards them to a backend Pod.

        In addition to providing a stable access point, a Service also acts as a load balancer, automatically distributing request traffic across all backend Pods, and it can scale horizontally in a way that is transparent to clients.
The key to the Service function is kube-proxy. kube-proxy runs on every node and watches the API Server for changes to Service objects; it implements network forwarding through one of three traffic scheduling modes: userspace (abandoned), iptables (being phased out), and ipvs (recommended, best performance).

        Service is the core of K8S services. It hides service details and exposes a unified service interface to the outside, truly achieving "microservices". For example, if service A has 3 replicas, i.e. 3 Pods, users only need to care about the Service's entry point and never which Pod they are actually hitting.
The advantages are obvious: external users do not need to notice IP changes caused by Pods crashing unexpectedly and K8S restarting them, nor IP changes caused by Pods being replaced during upgrades or service changes.
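The "service A with 3 replicas" example above can be sketched as a ClusterIP Service; the names, label, and ports are assumptions for illustration:

```yaml
# Hypothetical Service fronting 3 replicas of "service A": the label
# selector binds it to every Pod labeled app: service-a, and the fixed
# ClusterIP load-balances requests across them.
apiVersion: v1
kind: Service
metadata:
  name: service-a      # name assumed
spec:
  type: ClusterIP
  selector:
    app: service-a     # label assumed; must match the Pods' labels
  ports:
    - port: 80         # port exposed on the ClusterIP
      targetPort: 8080 # port the backend Pods actually listen on
```

Clients inside the cluster address `service-a:80` (or its DNS name); which of the 3 Pods answers is decided by kube-proxy's forwarding rules.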

6. Ingress

        Service is mainly responsible for the network topology inside the K8S cluster, so how can the outside of the cluster access the inside of the cluster? Ingress is needed at this time. Ingress is the access layer of the entire K8S cluster and is responsible for communication inside and outside the cluster.

        Ingress works at layer 7 of the OSI reference model within a K8S cluster. It is the externally exposed interface, and the typical access methods are HTTP/HTTPS.

        A Service can only schedule traffic at layer 4, expressed as ip+port. An Ingress can schedule business traffic across different business domains and different URL access paths.

        For example: client requests http://www.abc.com:port ---> Ingress ---> Service ---> Pod
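The example request path above corresponds to an Ingress rule like the following. The host reuses `www.abc.com` from the text; the Service name and port are assumptions, and an ingress controller (e.g. nginx) must be installed in the cluster for the rule to take effect:

```yaml
# Hypothetical Ingress: routes http://www.abc.com/ to the backend
# Service "service-a", which in turn load-balances to its Pods.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: abc-ingress        # name assumed
spec:
  rules:
    - host: www.abc.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-a   # Service name assumed
                port:
                  number: 80
```

Additional `host` or `path` entries would route other domains or URL paths to different Services, which is exactly the layer-7 scheduling the text describes.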

7. Name

        Since K8S uses "resources" to define each logical concept (function), each "resource" should have its own "name".

        A "resource" is described by configuration information such as its API version (apiVersion), kind, metadata, specification (spec), and status.

        "Name" is usually defined in the "metadata" information of "resource". Must be unique within the same namespace.

8. Namespace

        With the increase of projects, personnel, and cluster scale, a method that can logically isolate various "resources" in K8S is needed, which is Namespace.

        Namespace was born to divide a single K8S cluster into several virtual cluster groups whose resources cannot be shared.
Resources in different Namespaces may share the same name, but resources of the same kind within one Namespace must have unique names.

        Reasonable use of K8S Namespace can enable cluster administrators to better classify, manage and browse services delivered to K8S.

        The default namespaces in K8S include: default, kube-system, kube-public, etc. To query specific "resources" in K8S, you need to bring the corresponding Namespace.
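A Namespace itself is just another resource; the name `dev` below is an assumption for illustration:

```yaml
# Hypothetical Namespace; resources created in it are logically
# isolated from resources in default, kube-system, etc.
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```

Queries then carry the Namespace, e.g. `kubectl get pods -n dev`; two Deployments may both be named `web` as long as they live in different Namespaces.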

6. Request access process

  1. A Service resource associates the Pods that carry the same label through its label selector;
  2. Each Service has a fixed ClusterIP that can be accessed from inside the k8s cluster;
  3. A Service forwards requests sent to its ClusterIP to its associated backend Pods via layer-4 load balancing;
  4. Ingress can serve as the gateway interface exposed by k8s, receiving request traffic from outside the cluster;
  5. Ingress supports layer-7 proxy forwarding and can forward request traffic to different Services according to domain name or URL access path.

Origin blog.csdn.net/weixin_58544496/article/details/128205060