Kubernetes Architecture Study Notes

Kubernetes is Google's open-source container cluster management system. It provides mechanisms for application deployment, maintenance, and scaling, making it easy to manage containerized applications running across multiple machines. It is a solution for building distributed systems on Docker. All resources in Kubernetes can be defined in YAML or JSON.


Kubernetes is an open-source platform for automatically deploying, scaling, and operating application containers across clusters of hosts, providing a container-centric infrastructure.

With Kubernetes, you can respond quickly and efficiently to customer needs:
• Deploy applications quickly and predictably.
• Scale applications on the fly.
• Seamlessly roll out new features.
• Optimize hardware usage by consuming only the resources you need.

Our goal is to build an ecosystem of components and tools to ease the burden of running applications in public and private clouds.

Kubernetes is:

Kubernetes is Google's open-source container cluster management system. Built on top of Docker, it can manage containers across multiple Docker hosts at scale.

The main functions are as follows:

1) Abstract multiple Docker hosts into a single resource pool and manage containers in cluster mode, providing functions such as task scheduling, resource management, elastic scaling, and rolling upgrades.

2) Use an orchestration file (YAML) to quickly build a container cluster, provide load balancing, and solve the problem of associating and communicating between containers directly.

3) Automatically manage and repair containers. For example, if a cluster is created with ten containers and one shuts down abnormally, Kubernetes will try to restart it or schedule a replacement, always guaranteeing that exactly ten containers are running and killing any excess.

Kubernetes role composition:

1) Pod

Pod is the smallest unit of operation in Kubernetes. A Pod can be composed of one or more containers;

all containers in the same Pod run on the same host and share the same volumes, network, and namespaces;

2) ReplicationController (RC)

An RC is used to manage Pods; one RC can manage one or more Pods. After an RC is created, the system creates the number of Pods specified by the replica count. While running, if the number of Pods falls below that count, the RC restarts stopped Pods or schedules replacements; if it rises above, the RC kills the redundant ones. The number of running Pods can also be scaled dynamically.

An RC associates with its Pods through labels. During a rolling upgrade, the RC replaces the Pods to be updated one by one.

3) Service

A Service defines an abstract resource representing a logical collection of Pods whose containers provide the same function. The collection is assembled according to the defined labels and selectors. When a Service is created, it is allocated a cluster IP; this IP, together with the defined port, provides a unified access interface to the collection and achieves load balancing.

4) Label

Label is a key/value pair used to distinguish Pods, Services, and RCs;

Pods, Services, and RCs can each carry multiple labels, but each label key can have only one value per object;

labels are mainly used to forward Service requests to the set of backend Pods that provide the service.

Kubernetes components:

1) kubectl

the client command-line tool. It formats received commands and sends them to kube-apiserver, serving as the operation entry point of the whole system.

2) kube-apiserver

serves as the control entry point of the whole system, exposing its interfaces as REST API services.

3) kube-controller-manager

runs the background control loops of the entire system, including tracking node status, maintaining the number of Pods, and managing the association between Pods and Services.

4) kube-scheduler

is responsible for node resource management; it accepts Pod-creation tasks from kube-apiserver and assigns each Pod to a node.

5) etcd

is responsible for service discovery and configuration sharing between nodes.

6) kube-proxy

runs on each compute node and acts as the network proxy for Pods. It periodically obtains Service information from etcd and creates the corresponding forwarding policies.

7) kubelet

runs on each compute node. Acting as an agent, it accepts the Pods assigned to its node, manages their containers, periodically obtains container status, and reports it back to kube-apiserver.



1.1 What is Kubernetes
1. First, it is a brand-new leading solution for distributed architecture based on container technology;
2. Second, Kubernetes is an open development platform;
3. Finally, Kubernetes is a complete distributed system support platform.

1.2 Why use Kubernetes

There are many reasons to use Kubernetes; the most fundamental is that IT has always been an industry driven by new technologies.
The benefits of using Kubernetes:
1. First, the most direct benefit is that we can develop complex systems "lightly";
2. Second, using Kubernetes means fully embracing the microservice architecture;
3. Then, our system can be "relocated" to the public cloud as a whole, anytime and anywhere;
4. Finally, the Kubernetes architecture has superb horizontal scaling capabilities.

1.3 Basic concepts and terms of Kubernetes

In Kubernetes, concepts such as Node, Pod, Replication Controller, and Service can be regarded as resource objects, which are operated through the Kubectl tool or API call provided by Kubernetes and stored in etcd.

1.3.1 Node

A Node is a worker host in the Kubernetes cluster, as opposed to the Master; in earlier versions it was called a Minion. A Node can be a physical host or a virtual machine (VM). The Kubelet service, which starts and manages Pods, runs on each Node so the Node can be managed by the Master. The services running on a Node include Kubelet, kube-proxy, and the Docker daemon.

The Node information includes:
1. Node address: the IP address of the host, or the Node ID.
2. Node running state: one of Pending, Running, or Terminated.
3. Node Condition: describes the condition of a Node in the Running state. Currently there is only one condition, Ready, which indicates that the Node is healthy and can receive instructions from the Master to create Pods.
4. Node system capacity: describes the system resources available on the Node, including the amount of CPU, the amount of memory, and the maximum number of schedulable Pods.
5. Others: other information about the Node, including the kernel version, the Kubernetes version, the Docker version, the operating system name, etc.

1. Node management

Nodes are usually physical machines, virtual machines, or resources provided by cloud service providers, and are not created by Kubernetes. When we say Kubernetes creates a Node, we only mean that Kubernetes creates a Node object inside the system. After creation, Kubernetes performs a series of health checks on the Node, including whether it is reachable, whether its services started correctly, and whether Pods can be created on it. If the checks fail, the Node is marked Not Ready in the cluster.

2. Managing Nodes with the Node Controller

The Node Controller is the component of the Kubernetes Master that manages Node objects. Its two main functions are cluster-wide Node information synchronization and individual Node lifecycle management.
The synchronization period for Node information can be set through the kube-controller-manager startup parameter --node-sync-period.

3. Node self-registration

When Kubelet's --register-node parameter is set to true (the default value is true), Kubelet will register itself with the apiserver. This is also the Node management method recommended by Kubernetes.

The Kubelet startup parameters used for self-registration are as follows:
1. --apiservers=: the apiserver address;
2. --kubeconfig=: the directory of the credentials/certificates required to log in to the apiserver;
3. --cloud_provider=: the cloud service provider address, used to fetch the Node's own metadata;
4. --register-node=: set to true to register automatically with the apiserver.

4. Manually manage Node

Kubernetes cluster administrators can also create and modify Node objects manually. To do this, first set the --register-node parameter in the Kubelet startup parameters to false, so that the Kubelet on the Node will not register itself with the apiserver.

Additionally, Kubernetes provides a way to join or isolate certain Nodes at runtime. For details, please refer to Chapter 4.

1.3.2 Pod

Pod is the most basic operation unit of Kubernetes. It contains one or more closely related containers, analogous to peas in a pod. A Pod can be regarded as the "logical host" of the application layer in a containerized environment. The container applications in a Pod are usually tightly coupled. Pods are created, started, and destroyed on Nodes.

Why does Kubernetes wrap containers in a Pod layer? One important reason is that communication between Docker containers is limited by the Docker network mechanism: in the Docker world, one container needs a link to access a service (port) provided by another container, and linking large numbers of containers would be a lot of work. The Pod concept combines multiple containers into one virtual "host", so they can communicate with each other simply through localhost.

Application containers in a Pod share the same set of resources, as follows:
1. PID namespace: different applications in a Pod can see each other's process IDs;
2. Network namespace: multiple containers in a Pod share the same IP address and port range;
3. IPC namespace: multiple containers in a Pod can communicate using System V IPC or POSIX message queues;
4. UTS namespace: multiple containers in a Pod share one hostname;
5. Volumes (shared storage volumes): individual containers in a Pod can access Volumes defined at the Pod level.
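As a hedged illustration of the shared network namespace, the Pod sketch below (the Pod name and images are illustrative, not taken from the original text) runs two containers in one Pod; "frontend" can reach "cache" simply via localhost:6379:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache        # hypothetical Pod name
spec:
  containers:
  - name: frontend
    image: nginx              # illustrative image
    ports:
    - containerPort: 80
  - name: cache
    image: redis              # reachable from "frontend" at localhost:6379
    ports:
    - containerPort: 6379
```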

1. Definition of a Pod

A Pod is defined through a configuration file in YAML or JSON format. The configuration file below defines a Pod named redis-slave, whose kind is Pod. The spec mainly contains the definition of containers, and multiple containers can be defined.


apiVersion: v1
kind: Pod
metadata:
  name: redis-slave
  labels:
    name: redis-slave
spec:
  containers:
  - name: slave
    image: kubeguide/guestbook-redis-slave
    env:
    - name: GET_HOSTS_FROM
      value: env
    ports:
    - containerPort: 6379

The lifecycle of a Pod is managed through the Replication Controller. A Pod is defined by a template, then assigned to a Node to run, and ends after the containers it contains finish running. During this process, the Pod is in one of the following four states:
1. Pending: the Pod definition has been submitted to the Master correctly, but the container images it contains have not yet been created. It usually takes some time for the Master to schedule the Pod, and then for the Node to download the images;
2. Running: the Pod has been assigned to a Node, and all the container images it contains have been created and started successfully;
3. Succeeded: all containers in the Pod have terminated successfully and will not be restarted; this is a final state of the Pod;
4. Failed: all containers in the Pod have terminated, but at least one ended in a failed state; this is also a final state of the Pod.
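The four states above can be summarized with a toy classifier (an illustrative sketch under simplified assumptions, not how Kubernetes actually computes a Pod's phase): each container is modelled by a "created" flag and an exit code that stays None while the container runs.

```python
# Toy classifier for the four Pod states described above (illustrative only).
# Each container dict has: "created" (bool) and "exit_code" (None while running).

def pod_phase(scheduled, containers):
    # Not yet scheduled, or some container image not yet created -> Pending
    if not scheduled or any(not c["created"] for c in containers):
        return "Pending"
    # At least one container still running -> Running
    if any(c["exit_code"] is None for c in containers):
        return "Running"
    # Every container exited with 0 -> Succeeded (a final state)
    if all(c["exit_code"] == 0 for c in containers):
        return "Succeeded"
    # All exited, at least one non-zero -> Failed (also a final state)
    return "Failed"

print(pod_phase(True, [{"created": True, "exit_code": None}]))  # Running
```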

Kubernetes has designed a unique set of network configurations for Pods, including: assigning each Pod an IP address, using the Pod name as the host name for communication between containers, etc. The design principles of the Kubernetes network will be explained in detail in Chapter 2.

Also, running multiple instances of the same application within a pod in Kubernetes is not recommended.

1.3.3 Label

Label is a core concept in the Kubernetes system. Labels are attached to various objects in the form of key/value pairs, such as Pod, Service, RC, Node, etc. Label defines the identifiable properties of these objects, which are used to manage and select them. Labels can be attached to objects when they are created, or they can be managed through the API after the object is created.

After Labels are defined for an object, other objects can use a Label Selector to define which objects they act on.

The definition of a Label Selector consists of multiple comma-separated conditions. Labels themselves are attached to an object as key/value pairs, for example:


"labels": {
  "key1": "value1",
  "key2": "value2"
}

There are currently two kinds of Label Selector: equality-based and set-based. When selecting, multiple Label conditions can be combined.

An equality-based Label Selector uses equality-style expressions for selection:
1. name = redis-slave: selects all objects whose Label contains key="name" with value="redis-slave";
2. env != production: selects all objects whose Label contains key="env" with a value not equal to "production".

A set-based Label Selector uses set-operation expressions for selection:
1. name in (redis-master, redis-slave): selects all objects whose Label contains key="name" with value "redis-master" or "redis-slave";
2. name not in (php-frontend): selects all objects whose Label contains key="name" with a value not equal to "php-frontend".

When some objects need to select other objects, multiple Label Selector conditions can be combined, separated by commas (","). Equality-based and set-based Label Selectors can be combined arbitrarily. For example:


name=redis-slave,env!=production
name not in (php-frontend),env!=production
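The two selector styles can be sketched as simple predicate functions (a minimal illustration, not the actual Kubernetes implementation; function names are made up):

```python
# A minimal sketch of evaluating selector terms against an object's labels.

def matches_equality(labels, key, op, value):
    """Evaluate one equality-based term: '=' or '!='."""
    if op == "=":
        return labels.get(key) == value
    if op == "!=":
        return labels.get(key) != value
    raise ValueError("unsupported operator: " + op)

def matches_set(labels, key, op, values):
    """Evaluate one set-based term: 'in' or 'notin'."""
    if op == "in":
        return labels.get(key) in values
    if op == "notin":
        return labels.get(key) not in values
    raise ValueError("unsupported operator: " + op)

pod_labels = {"name": "redis-slave", "env": "staging"}

# name=redis-slave,env!=production -- the comma means every term must hold
selected = (matches_equality(pod_labels, "name", "=", "redis-slave")
            and matches_equality(pod_labels, "env", "!=", "production"))
print(selected)  # True
```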

1.3.4 Replication Controller (RC)

Replication Controller is a core concept in the Kubernetes system, used to define the number of Pod replicas. In the Master, the Controller Manager process uses the RC definition to create, monitor, start, and stop Pods.

According to the Replication Controller definition, Kubernetes can ensure that the specified number of Pod "replicas" (Replica) is running at any time. If too many Pod replicas are running, the system stops some of them; if too few, the system starts more. In short, through the RC definition, Kubernetes always keeps the number of replicas the user expects running in the cluster.

At the same time, Kubernetes monitors and manages all running Pods. When necessary (for example, when a Pod stops running), it submits the Pod restart command to a program on the Node (such as Kubelet or Docker) for completion.

It can be said that through the Replication Controller, Kubernetes achieves high availability for application clusters and greatly reduces the manual operation and maintenance work that system administrators must perform in traditional IT environments (such as host monitoring scripts, application monitoring scripts, and fault recovery scripts).
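The "always converge to the desired replica count" behaviour described above can be sketched as a toy reconciliation loop (illustrative only, not Kubernetes source code; all names are made up):

```python
# A toy reconciliation step: stop the excess Pods, start the missing ones.

def reconcile(running_pods, desired_replicas, make_pod):
    """Return a new pod list converged to the desired replica count."""
    pods = list(running_pods)
    if len(pods) > desired_replicas:
        pods = pods[:desired_replicas]        # kill the redundant replicas
    while len(pods) < desired_replicas:
        pods.append(make_pod(len(pods)))      # start replacements
    return pods

pods = ["redis-slave-a", "redis-slave-b", "redis-slave-c"]
pods = reconcile(pods, 2, lambda i: f"redis-slave-new-{i}")
print(pods)  # ['redis-slave-a', 'redis-slave-b'] -- excess stopped

pods = reconcile(["redis-slave-a"], 2, lambda i: f"redis-slave-new-{i}")
print(pods)  # ['redis-slave-a', 'redis-slave-new-1'] -- missing replica started
```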

The Replication Controller is also defined using a configuration file in YAML or JSON format. Taking redis-slave as an example, the Pod's properties are defined through spec.template in the configuration file (this part is consistent with the Pod definition), and spec.replicas=2 sets the number of Pod replicas.


apiVersion: v1
kind: ReplicationController
metadata:
  name: redis-slave
  labels:
    name: redis-slave
spec:
  replicas: 2
  selector:
    name: redis-slave
  template:
    metadata:
      labels:
        name: redis-slave
    spec:
      containers:
      - name: slave
        image: kubeguide/guestbook-redis-slave
        env:
        - name: GET_HOSTS_FROM
          value: env
        ports:
        - containerPort: 6379

Usually there is more than one Node in a Kubernetes cluster. Assuming a cluster has 3 Nodes, according to the RC definition the system will likely create the two Pods on two of these Nodes.

1.3.5 Service

In the world of Kubernetes, although each Pod is assigned a separate IP address, this IP address disappears when the Pod is destroyed. This raises the question: if a group of Pods forms a cluster to provide a service, how do you access them?

Kubernetes' Service (service) is the core concept used to solve this problem.

A Service can be regarded as the external access interface of a group of Pods that provide the same service. Which Pods the Service acts on is defined by the Label Selector.

1. Definition of a Service

A Service is also defined using a configuration file in YAML or JSON format. Take the definition of the redis-slave service as an example:


apiVersion: v1
kind: Service
metadata:
  name: redis-slave
  labels:
    name: redis-slave
spec:
  ports:
  - port: 6379
  selector:
    name: redis-slave

Through this definition, Kubernetes will create a service named "redis-slave" that listens on port 6379. The spec.selector definition indicates that the Service will include all Pods labeled "name=redis-slave".

After a Pod starts normally, the system creates an Endpoint object corresponding to the Pod according to the Service definition, establishing the correspondence between the Service and its backend Pods. As Pods are created and destroyed, the Endpoint object is updated accordingly. The Endpoint object mainly consists of the Pod IP addresses and the ports the containers listen on.
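The correspondence between a Service and its backend Pods can be sketched as follows (an illustrative toy, not the real Endpoint controller; the Pod records and addresses are made up):

```python
# Toy computation of a Service's endpoint list from its selector and the
# currently running Pods: Pods whose labels match contribute (ip, port) pairs.

def endpoints_for(service_selector, pods):
    """Collect (ip, port) pairs of Pods whose labels match the selector."""
    return [(p["ip"], p["port"]) for p in pods
            if all(p["labels"].get(k) == v for k, v in service_selector.items())]

pods = [
    {"labels": {"name": "redis-slave"},  "ip": "10.1.0.5", "port": 6379},
    {"labels": {"name": "redis-master"}, "ip": "10.1.0.6", "port": 6379},
]
print(endpoints_for({"name": "redis-slave"}, pods))  # [('10.1.0.5', 6379)]
```

When a Pod is destroyed or created, re-running the same computation over the new Pod list yields the updated endpoint set.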

2. The Pod IP address and the Service Cluster IP address

The IP address of a Pod is assigned by the Docker daemon from the address segment of the docker0 bridge, whereas the Cluster IP address of a Service is a virtual IP address in the Kubernetes system, dynamically assigned by the system. The Cluster IP address of a Service is relatively stable compared with Pod IP addresses: it is assigned when the Service is created and does not change until the Service is destroyed. Pods have a short lifecycle in a Kubernetes cluster; they may be destroyed and recreated by a ReplicationController, and a newly created Pod is assigned a new IP address.

3. External access to Service

Since the IP assigned to a Service object comes from the Cluster IP Range pool and can only be accessed inside the cluster, other Pods can reach the Service without any problem. But if the Service is a front-end service meant to serve clients outside the cluster, we need to provide it with a public IP.

Kubernetes supports two types of service definitions for external services: NodePort and LoadBalancer.

1. NodePort

When defining the Service, specify spec.type=NodePort and a value for spec.ports.nodePort; the system then opens a real port number on the host of every Node in the Kubernetes cluster. Clients that can reach the Nodes can access the internal Service through this port number.

Take the definition of the php-frontend service as an example: with nodePort=30001, port 30001 will be opened on every Node in the cluster and forwarded to the Service's port 80.


apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    name: frontend
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30001
  selector:
    name: frontend

2. LoadBalancer

If the cloud service provider supports an external load balancer, the Service can be defined with spec.type=LoadBalancer, and the IP address of the load balancer needs to be specified. Using this type also requires specifying the Service's nodePort and clusterIP. For example:


{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "my-service"
  },
  "spec": {
    "type": "LoadBalancer",
    "clusterIP": "10.0.171.239",
    "selector": {
      "app": "MyApp"
    },
    "ports": [
      {
        "protocol": "TCP",
        "port": 80,
        "targetPort": 9376,
        "nodePort": 30061
      }
    ]
  },
  "status": {
    "loadBalancer": {
      "ingress": [
        {
          "ip": "146.148.47.155"
        }
      ]
    }
  }
}

In this example, the 146.148.47.155 set by status.loadBalancer.ingress.ip is the IP address of the load balancer provided by the cloud service provider.

After that, access requests to the Service are forwarded to the backend Pods through the LoadBalancer; how the load is distributed depends on the LoadBalancer implementation mechanism provided by the cloud service.

1.3.6 Volume

Volume is a shared directory in a Pod that can be accessed by multiple containers. Kubernetes' Volume concept is similar to Docker's Volume, but not identical. A Volume in Kubernetes has the same lifecycle as a Pod, but is not related to the lifecycle of a container. When the container is terminated or restarted, the data in the Volume will not be lost. In addition, Kubernetes supports multiple types of Volumes, and a Pod can use any number of Volumes at the same time.
Kubernetes provides a very rich Volume type, which will be explained one by one below.
1.EmptyDir: An EmptyDir Volume is created when a Pod is assigned to a Node. As can be seen from its name, its initial content is empty. All containers in the same Pod can read and write the same files in EmptyDir. When the Pod is removed from the Node, the data in EmptyDir is also permanently deleted.
2.hostPath: Mount the file or directory on the host on the Pod.
3.gcePersistentDisk: Using this type of Volume means using files on Persistent Disk (PD) on Google Compute Engine (Google Compute Engine, GCE). Unlike EmptyDir, the content on the PD will be permanently saved. When the Pod is deleted, the PD is only unmounted (Unmount), but will not be deleted. Note that you need to create a persistent disk (PD) before using gcePersistentDisk.
4. awsElasticBlockStore: similar to GCE, this type of Volume uses the EBS Volume provided by Amazon Web Services (AWS), which can be mounted into a Pod. Note that an EBS Volume must be created before awsElasticBlockStore can be used.
5.nfs: Use the shared directory provided by NFS (Network File System) to mount into the Pod. A running NFS system is required in the system.
6.iscsi: Use the directory on the iSCSI storage device to mount into the Pod.
7.glusterfs: Use the directory of the open source GlusterFS network file system to mount into the Pod.
8.rbd: Use the Linux block device shared storage (Rados Block Device) to mount to the Pod.
9. gitRepo: mounts an empty directory and clones a git repository into it for the Pod to use.
10. secret: a secret volume is used to provide sensitive information to Pods. Secrets defined in Kubernetes can be mounted directly as files for Pods to access. Secret volumes are backed by tmpfs (an in-memory file system), so this type of volume is never persisted to disk.
11. persistentVolumeClaim: requests the needed space from a PV (PersistentVolume); a PV is usually network storage, such as GCEPersistentDisk, AWSElasticBlockStore, NFS, or iSCSI.
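As a hedged illustration of the EmptyDir type above, the Pod sketch below (the name, images, and commands are illustrative) shares one scratch directory between two containers; when the Pod leaves the Node, the directory's contents are permanently deleted:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo      # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox              # illustrative image
    command: ["sh", "-c", "echo hello > /data/msg && sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "sleep 3600"]   # can read /data/msg written by "writer"
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}
```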

1.3.7 Namespace

Namespace is another very important concept in the Kubernetes system. By "assigning" objects inside the system to different Namespaces, different projects, groups, or user groups are logically partitioned, so that different groups can be managed separately while sharing the resources of the entire cluster.
After the Kubernetes cluster is started, a namespace named "default" will be created, which can be viewed through Kubectl.
Using Namespace to organize various objects of Kubernetes can realize grouping of users, that is, "multi-tenant" management. Different tenants can also independently set and manage resource quotas, making the resource configuration of the entire cluster very flexible and convenient.
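A Namespace is itself a resource object and can be defined in YAML like the other objects in this chapter. The sketch below (the namespace name is illustrative) creates one:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: development   # hypothetical namespace name
```

Other objects can then be placed in it by setting metadata.namespace: development in their definitions; kubectl get namespaces lists all namespaces, including "default".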

1.3.8 Annotation

Similar to Label, Annotation is also defined in the form of key/value pairs. But whereas Label has strict naming rules, defines the metadata of Kubernetes objects, and is used by Label Selectors, Annotation is user-defined "additional" information attached so that external tools can find it.
The information recorded by Annotations includes:
1. Build information, release information, and Docker image information, such as timestamps, release IDs, PR numbers, image hashes, and Docker registry addresses;
2. Address information of resource repositories such as logging, monitoring, and analytics libraries;
3. Debugging tool information, such as tool names and version numbers;
4. Team contact information, such as phone numbers, persons in charge, and URLs.
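The kinds of information listed above might be attached like this (a sketch; all annotation keys and values are made-up examples, since annotation keys are free-form):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: annotated-pod                         # hypothetical example
  annotations:
    build/timestamp: "2015-06-01T10:00:00Z"   # made-up values
    build/image-hash: "sha256:0123abcd"
    team/contact: "oncall@example.com"
spec:
  containers:
  - name: app
    image: nginx
```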

1.3.9 Summary

The above components are the core components of the Kubernetes system, and together they constitute the framework and computing model of the Kubernetes system. By combining them flexibly, users can quickly and easily configure, create, and manage container clusters.
In addition to the above core components, there are many configurable resource objects in the Kubernetes system, such as LimitRange and ResourceQuota. In addition, for some objects Binding, Event, etc. used inside the system, please refer to the API documentation of Kubernetes.

1.4 Kubernetes overall architecture

A Kubernetes cluster consists of two types of nodes: Master and Node. The Master runs four components: etcd, API Server, Controller Manager, and Scheduler; the last three constitute the master control center of Kubernetes, responsible for managing and scheduling all resources in the cluster. Each Node runs Kubelet, Proxy, and the Docker daemon, which manage the lifecycle of the Pods on that node and implement the service proxy function. In addition, the kubectl command-line tool, which provides Kubernetes' cluster management toolset, can be run on any node.

