Cloud native technology open class study notes: shared storage principles, health checks, monitoring and logging

7. Shared storage principles

1. Introduction to Volumes

1) Pod Volumes


First look at the usage scenarios of Pod Volumes:

  • Scenario 1: if a container in a pod exits abnormally at runtime and is pulled up again by the kubelet, how can we ensure that the important data produced by the previous container is not lost?
  • Scenario 2: if multiple containers in the same pod want to share data, how should that be done?

The above two scenarios can actually be solved with the help of Volumes. Next, let's first look at the common types of Pod Volumes:

  • Local storage: emptyDir and hostPath are commonly used
  • Network storage: there are currently two implementation approaches. The first is in-tree, where the driver code lives in the K8s code repository; as K8s supports more and more storage types, this places a heavy maintenance and development burden on K8s itself. The second is out-of-tree, which decouples the drivers from K8s: different storage drivers are implemented against abstract interfaces (e.g., CSI) and stripped out of the K8s code repository
  • Projected Volumes: these mount configuration information such as secret/configmap into the container as volumes, so that programs in the container can read the configuration data through the POSIX file interface
  • PV and PVC

2) PV


Now that Pod Volumes exist, why introduce PV? The life cycle of a volume declared in a pod is the same as that of the pod itself, which causes problems in several common scenarios:

  • Scenario 1: pod rebuilding and destruction. For example, during an image upgrade a pod managed by a Deployment creates new pods and deletes the old ones. How can data be reused between the old and new pods?
  • Scenario 2: when a host goes down, its pods must be migrated. Pods managed by a StatefulSet need volume-migration semantics for this, which obviously cannot be expressed with Pod Volumes
  • Scenario 3: how do you declare data shared among multiple pods? If multiple containers in the same pod want to share data, Pod Volumes can solve it; but when multiple pods want to share data, Pod Volumes can hardly express that semantics
  • Scenario 4: how do you extend a data volume's functionality, such as snapshots and resizing?

In the above scenarios it is hard to accurately express reuse/sharing semantics through Pod Volumes, and hard to extend them. Therefore K8s introduces the concept of Persistent Volumes, which separates storage from computing: storage resources and computing resources are managed by different components, decoupling the life cycle of pods and volumes. When a pod is deleted, the PV it used still exists and can be reused by a newly created pod

3) PVC


When users use a PV, they actually use a PVC. Why design PVC alongside PV? Mainly to simplify the way K8s users consume storage and to separate responsibilities. When using storage, a user usually only needs to declare the required size and access mode

What is the access mode? It answers: can the storage I want be shared by multiple nodes, or can it only be accessed exclusively by a single node (note: node level, not pod level)? Is the access read-only or read-write? Users only need to care about these things; the implementation details of the storage are not their concern

Through the concepts of PVC and PV, user requirements are decoupled from implementation details: users only declare their storage needs through a PVC, while PVs are operated and managed uniformly by cluster administrators and storage teams. This simplifies the way users consume storage

Since PVs are managed and controlled by the cluster administrator, let's look at how PV objects are produced.

4) Static Volume Provisioning


Static provisioning: the cluster administrator plans in advance how storage will be used in this cluster and pre-allocates some storage, i.e., creates some PVs in advance. Users then submit their own storage requirements (PVCs), and internal K8s components bind each PVC to a matching PV; when a pod uses the storage, the corresponding PV is found through the PVC and used.

What are the shortcomings of static provisioning? The cluster administrator must pre-allocate, and pre-allocation can hardly predict users' real needs. A simple example: if a user needs 20G but the administrator only has 80G and 100G PVs, the real need cannot be met, and resources are wasted.

5) Dynamic Volume Provisioning


Dynamic provisioning: the cluster administrator does not pre-allocate PVs. Instead, he writes a template file that carries the parameters needed to create a certain type of storage (block storage, file storage, etc.). Users do not care about these parameters; they concern the storage implementation itself. Users only submit their own storage requirements, i.e., a PVC file, and specify in the PVC which storage template (StorageClass) to use

The control components in the K8s cluster combine the information from the PVC and the StorageClass to dynamically generate the storage (PV) the user needs; after the PVC and PV are bound, the pod can use the PV. By configuring the required storage templates through StorageClass and dynamically creating PV objects according to user needs, distribution becomes on-demand, which frees the cluster administrator from operational work without making things harder for users.

2. Use case interpretation

1) Using Pod Volumes


In the volumes field of the pod yaml file, declare the name and type of each volume. Two volumes are declared here: one uses emptyDir, the other uses hostPath, and both are local volumes

How does a container use these volumes? Through the volumeMounts field: the name specified there says which volume to use, and mountPath is the mount path inside the container.

What about subPath? Note that both containers here specify the same volume, cache-volume. When multiple containers share the same volume, subPath can be used to isolate their data: it creates subdirectories in the volume, so the data container 1 writes to its cache path actually lands in subdirectory cache1, while what container 2 writes ends up in subdirectory cache2 of the same volume.

There is also a readOnly field, which means a read-only mount: nothing can be written below the mount point.

In addition, emptyDir and hostPath are both local storage. What are the subtle differences between them? An emptyDir is a directory created temporarily when the pod is created; it is deleted when the pod is deleted, and its data is emptied. A hostPath, as the name implies, is a path on the host machine; after the pod is deleted the directory still exists and its data is not lost. That is the subtle difference between the two
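A minimal sketch of such a pod, putting the fields above together (the image and the hostPath path are illustrative assumptions; the volume and subdirectory names follow the course example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
  - name: container-1
    image: nginx              # placeholder image
    volumeMounts:
    - name: cache-volume      # which declared volume to use
      mountPath: /cache       # mount path inside the container
      subPath: cache1         # container 1's data lands in subdirectory cache1
  - name: container-2
    image: nginx
    volumeMounts:
    - name: cache-volume
      mountPath: /cache
      subPath: cache2         # isolated from container 1's data
    - name: hostpath-volume
      mountPath: /data
      readOnly: true          # read-only mount: no writes below the mount point
  volumes:
  - name: cache-volume
    emptyDir: {}              # created with the pod, deleted with the pod
  - name: hostpath-volume
    hostPath:
      path: /tmp/data         # a path on the host; survives pod deletion
```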

2) Using static PVs


A static PV is created by the administrator first. Here Alibaba Cloud NAS (file storage) is used as the example. The administrator first creates the NAS storage in the Alibaba Cloud file storage console, then fills the NAS storage information into a PV object. Once this PV object is pre-created, users can declare their storage requirements through a PVC and then create pods, mounting the storage under a mount point in a container through the fields explained above


The PV corresponding to the Alibaba Cloud NAS file storage just created has two important fields: capacity, the size of the created storage, and accessModes, the access mode of the created storage

Then there is ReclaimPolicy (the PV reclaim policy): after this storage has been used, and its user pod and PVC have been deleted, should the PV be deleted or retained?
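A sketch of such a statically created PV. NAS is accessed over the NFS protocol, so a plain nfs volume source is used here for illustration; the server address and path are placeholders:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nas-pv
spec:
  capacity:
    storage: 5Gi                         # size of the pre-created storage
  accessModes:
  - ReadWriteMany                        # file storage typically supports multi-node read-write
  persistentVolumeReclaimPolicy: Retain  # keep the PV after its PVC is deleted
  nfs:                                   # NAS is NFS-compatible; values below are placeholders
    server: xxx.cn-hangzhou.nas.aliyuncs.com
    path: /
```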

Next, how does a user use the PV object? The user first creates a PVC object, in which only the storage requirements are specified; the implementation details of the storage itself are not the user's concern. What are the requirements? The first is the required size, resources.requests.storage; the second is the access mode, declared here as ReadWriteMany, i.e., multi-node read-write access, a typical capability of file storage


The PVC's size and access mode match the statically created PV above, so when the user submits the PVC, the relevant components in the K8s cluster bind the PVC and the PV together. Later, when the user submits the pod yaml, a PVC can be declared in the volumes section, and claimName in that declaration states which PVC to use. The mounting is otherwise the same as before: once the yaml is submitted, the PV bound to the PVC is found and that storage is used. This is the whole flow from static provisioning to use by a pod
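A sketch of the matching PVC and the pod that consumes it (names are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nas-pvc
spec:
  accessModes:
  - ReadWriteMany            # must be satisfiable by the PV's access modes
  resources:
    requests:
      storage: 5Gi           # required size, matched against PV capacity
---
apiVersion: v1
kind: Pod
metadata:
  name: static-pv-demo
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: nas-volume
      mountPath: /data
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc     # which PVC (and hence which bound PV) to use
```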

3) Using dynamic PVs


The template file is called StorageClass. In a StorageClass we need to fill in some important information. The first is provisioner, which says which storage plugin should be used to create the PV and the corresponding storage

The parameters are the detailed parameters needed when creating the storage, such as regionId, zoneId, and fsType here; users do not need to care about them. reclaimPolicy concerns the dynamically created PV: when the user is done and the pod and PVC are deleted, what should happen to the PV? Delete here means that when the user's pod and PVC are deleted, the PV is deleted as well
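A sketch of such a StorageClass, assuming the Alibaba Cloud disk CSI plugin as the provisioner; the parameter values are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-disk
provisioner: diskplugin.csi.alibabacloud.com  # which storage plugin creates the PV (assumed name)
parameters:                                   # storage-specific details the user never sees
  regionId: cn-hangzhou
  zoneId: cn-hangzhou-b
  fsType: ext4
  type: cloud_ssd
reclaimPolicy: Delete                         # delete the PV when its pod and PVC are deleted
```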

Next, after the cluster administrator submits the StorageClass, i.e., the template for creating PVs, how does a user use it? First, write a PVC file


The size and access mode in the PVC file are unchanged. A new field, storageClassName, is added: it specifies the name of the template file used to dynamically create the PV. Here storageClassName is set to the csi-disk declared above

After the PVC is submitted, the relevant components in the K8s cluster dynamically generate the PV according to the PVC and the corresponding StorageClass and bind it to the PVC. When the user then submits his own yaml, the usage and subsequent flow are the same as in the static case: the dynamically created PV is found through the PVC and mounted into the corresponding container for use.
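A sketch of the PVC for dynamic provisioning (the 20Gi size echoes the earlier example; the access mode is single-node read-write, typical for block storage):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: disk-pvc
spec:
  accessModes:
  - ReadWriteOnce            # block storage is typically single-node read-write
  resources:
    requests:
      storage: 20Gi
  storageClassName: csi-disk # which StorageClass template generates the PV
```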

4) Important fields of the PV spec


Capacity: the size of the storage object

AccessModes: how this PV can be used. There are three modes:

  • ReadWriteOnce: read-write access on a single node
  • ReadOnlyMany: read-only access on multiple nodes, a common way to share data
  • ReadWriteMany: read-write access on multiple nodes

When a user submits a PVC, the two most important fields are Capacity and AccessModes. After the PVC is submitted, how do the relevant components in the K8s cluster find a suitable PV? First, through an AccessModes index built over the PVs, they find the list of PVs that can satisfy the AccessModes required by the user's PVC; then they filter further by the PVC's Capacity, StorageClassName, and Label Selector. If multiple PVs still qualify, the PV with the smallest size and the shortest AccessModes list is chosen, i.e., the minimum-fit principle

ReclaimPolicy: after the PVC using this PV has been deleted, how should the PV be handled? There are two common options:

  • The first is Delete: after the PVC is deleted, the PV is deleted as well
  • The second is Retain: the PV is kept, and the administrator handles it manually afterwards

StorageClassName: the field that must be specified for dynamic provisioning; it names the template file used to generate the PV

NodeAffinity: restricts which nodes can mount and use the created PV. NodeAffinity declares these restrictions on nodes, which in turn restrict the scheduling of pods that use the PV: such a pod must be scheduled to a node that can access the PV before the PV can be used
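A sketch of nodeAffinity on a PV, using a local volume restricted to one zone (the path and zone value are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  local:
    path: /mnt/disks/ssd1    # placeholder path on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - cn-hangzhou-b    # pods using this PV must be scheduled into this zone
```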

5) PV state transitions


First, after the PV object is created, it is briefly in the Pending state; once the PV is actually created, it enters the Available state

Available means the PV can be used. After the user submits a PVC, the relevant K8s components bind it, i.e., find the matching PV; at this point the PV and PVC are combined, and both are in the Bound state. When the user finishes with the PVC and deletes it, the PV enters the Released state. Whether it should then be deleted or kept depends on the ReclaimPolicy

One point needs special explanation: once a PV is in the Released state, it has no way to return directly to Available, which means it cannot be bound by a new PVC

If we want to reuse a released PV, what can we usually do? The first way: create a new PV object and fill the relevant field information of the previously released PV into it; this new PV can then be combined with a new PVC. The second: when deleting the pod, do not delete the PVC object; the PVC bound to the PV still exists, and the next pod can reuse it directly through the PVC. This is how StatefulSet in K8s implements migration of pods with storage
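A sketch of the second approach: a StatefulSet's volumeClaimTemplates create one PVC per replica, and since the PVCs (and the PVs bound to them) outlive the pods, a rebuilt or migrated pod re-attaches to the same storage (names and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:        # one PVC per replica; PVCs survive pod deletion
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: csi-disk
      resources:
        requests:
          storage: 20Gi
```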

8. Observability: Is your application healthy?

1. Source of demand

In a distributed environment like K8s, applications run as many pods across many nodes, and failures are the norm rather than the exception. The platform therefore needs a way to know whether an application is still alive and whether it is ready to serve, so that it can restart failed containers and route traffic only to healthy pods.

2. Liveness and Readiness

1) First look at Liveness and Readiness

The Readiness probe, the readiness check, determines whether a pod is in the ready state. When a pod is ready, it can provide service externally, i.e., traffic from the access layer can reach it; when it is not ready, the access layer removes the pod's traffic

Take an example of Readiness: while the probe judges the pod to be in a failed state, traffic from the access layer does not reach the pod; only when the pod's state changes from failed to success can it really carry traffic

The Liveness probe is similar: it is a liveness check, used to determine whether a pod is alive


When a pod is not alive, an upper-layer mechanism judges whether the pod needs to be restarted. If the restart policy configured at the upper layer is Always, the pod is simply pulled up again

2) How to use


Liveness and Readiness probes support three different detection methods:

  • httpGet: judged by sending an HTTP GET request; a status code between 200 and 399 indicates that the application is healthy
  • exec: judges whether the service is normal by executing a command in the container; a command exit code of 0 indicates that the container is healthy
  • tcpSocket: performs a TCP health check against the container's IP and port; if the TCP connection can be established normally, the container is considered healthy

In terms of detection results, there are mainly three types:

  • The first is Success: the container passed the health check, i.e., the Liveness or Readiness probe is in a normal state
  • The second is Failure: the container failed the health check, and corresponding handling follows. For Readiness, the handling happens at the Service layer: a pod that fails Readiness is removed from the Service's endpoints; for Liveness, the pod is restarted or deleted
  • The third is Unknown: the check mechanism did not run to completion, perhaps because of a timeout or a script that did not return in time. In that case the Readiness or Liveness probe does nothing and waits for the next check cycle

3) Pod Probe Spec


1) exec

A Liveness probe can be configured with an exec diagnostic: a command field specifies a command (here reading a specific file) that determines the probe's status. When the command returns 0, the pod is considered to be in a healthy state
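A minimal sketch of such an exec Liveness probe, as it appears under a container spec (the file path and timings are illustrative):

```yaml
livenessProbe:
  exec:
    command:                  # run inside the container
    - cat
    - /tmp/healthy            # probe succeeds while this file exists (exit code 0)
  initialDelaySeconds: 5
  periodSeconds: 5
```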

2) httpGet

httpGet has three fields: path, port, and headers. Sometimes the health judgment has to go through a header-based mechanism, in which case headers must be configured; under normal circumstances it is enough to pass the health path and the port
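A sketch of an httpGet probe (the /healthz path, port, and header are assumptions about the application):

```yaml
readinessProbe:
  httpGet:
    path: /healthz            # health endpoint exposed by the application
    port: 8080
    httpHeaders:              # optional: only when health is judged via headers
    - name: X-Custom-Header
      value: Awesome
  periodSeconds: 10
```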

3) tcpSocket

tcpSocket only needs a detection port; in this example port 8080 is used. When a TCP connection to port 8080 can be established normally, the tcpSocket probe considers the container healthy.
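A sketch of a tcpSocket probe:

```yaml
livenessProbe:
  tcpSocket:
    port: 8080                # healthy if a TCP connection can be established
  initialDelaySeconds: 15
  periodSeconds: 20
```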

4) In addition, the following five parameters are global, i.e., shared by all three detection methods (see the sketch after this list)

  • The first parameter is initialDelaySeconds, which indicates how long the check is delayed after the pod starts. For example, a Java application may take a long time to start, so for a while the check cannot pass; since this time is predictable, initialDelaySeconds should be set accordingly

  • The second is periodSeconds, the interval between checks; the default is 10 seconds

  • The third is timeoutSeconds, the check timeout; if a check does not succeed within this period, it is considered failed

  • The fourth is successThreshold, the number of consecutive successes required before a previously failed probe is considered successful again; the default is 1, meaning a single successful check after a failure marks the probe as normal again

  • The last parameter is failureThreshold, the number of consecutive failures required before the probe is considered failing; the default is 3, meaning that after three consecutive failed checks from a healthy state, the pod is judged to be in a failed state
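A sketch combining the five parameters on an httpGet Liveness probe (all values are illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30   # wait for a slow-starting (e.g. Java) application
  periodSeconds: 10         # check every 10 seconds (the default)
  timeoutSeconds: 1         # a check taking longer than 1s counts as failed
  successThreshold: 1       # one success flips a failed probe back to healthy
  failureThreshold: 3       # three consecutive failures mark the pod as failing
```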

4) Summary of Liveness and Readiness

To summarize: Liveness determines whether a container is alive; a pod that fails its Liveness check is restarted (depending on its restart policy) or deleted. Readiness determines whether a pod can serve traffic; a pod that fails its Readiness check is only removed at the Service layer and is not restarted.

3. Problem diagnosis


Consider the life cycle of a Pod. It starts in the Pending state, may then switch to Running, may switch to Unknown, or may even switch to Failed. After Running for some time, it can switch to Succeeded or Failed; and from Unknown it may, as state is recovered, return to Running, Succeeded, or Failed

Overall, K8s state changes are based on a state-machine-like mechanism, and the transitions between states are recorded on the corresponding K8s objects in fields such as Status and Conditions


9. Observability: monitoring and logging

1. Monitoring

1) Monitoring types

This part of the course distinguishes the common types of monitoring in K8s, roughly: resource monitoring (CPU, memory, network and other resource metrics), performance monitoring (APM), security monitoring, and event monitoring.

2) The monitoring evolution of Kubernetes

In the early days, i.e., K8s versions before 1.10, everyone used a component called Heapster for monitoring collection. Heapster's design principle is actually relatively simple


First, each node's kubelet has a cadvisor packaged inside it; cadvisor is the component responsible for data collection. After cadvisor collects the data, the kubelet wraps it and exposes it through corresponding APIs. In the early days there were actually three different APIs:

  • The first is the summary interface
  • The second is the kubelet interface
  • The third is the Prometheus interface

All three interfaces take their data from cadvisor; only the data format differs. Heapster supports two of these collection interfaces, summary and kubelet. Heapster periodically pulls data from every node, aggregates it in its own memory, and then exposes a service for upper-layer consumers to use. Common consumers in K8s include the dashboard and the HPA controller, which call this service to obtain monitoring data for display and for elastic scaling


Internally, Heapster is divided into several parts. The first is the core part; above it, an API is exposed through standard HTTP or HTTPS. In the middle is the source part, corresponding to the different collection interfaces; the processor part performs data conversion and aggregation; finally, the sink part is responsible for shipping data out. This was the architecture of the early Heapster. Later, K8s standardized the monitoring interfaces and gradually trimmed Heapster down into metrics-server


The current 0.3.1 version of metrics-server has a very simple structure: a core layer, a source layer in the middle, a simple API layer, plus an additional API Registration layer. The function of this layer is to register the corresponding data interfaces with the K8s API server. From then on, consumers no longer need to access metrics-server through its own API layer; they access the API through the API server, which forwards to the API Registration layer and then to metrics-server. What the real data consumer perceives is therefore not metrics-server itself but a concrete implementation of this API behind the API server, and that implementation is metrics-server. This is the biggest change in metrics-server
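A sketch of what this registration looks like: metrics-server registers the metrics.k8s.io API group with the API server through an APIService object, and consumers then query the API server instead of metrics-server directly:

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io            # the metrics API group served by metrics-server
  version: v1beta1
  service:
    name: metrics-server           # requests for this group are proxied to this service
    namespace: kube-system
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```

A consumer can then fetch metrics through the API server, for example with kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes.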


2. Logging


As for where logs are collected from, the following three types need to be supported (a sketch of the second approach follows the list):

  • The first is host files. This scenario is common: the container writes its log files to the host through something like a volume, the logs are rotated by the host's log rotation policy, and then collected by the agent on the host

  • The second is log files inside the container. How is this common case handled? A typical way is to use a sidecar streaming container that forwards the logs to stdout, writes stdout to a corresponding log file, rotates that file locally, and then has an external agent collect it

  • The third is writing directly to stdout, which is a relatively common strategy. Either an agent collects it directly to the remote end, or the application writes directly to the remote end through a standard API, such as SLS's
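A minimal sketch of the sidecar approach from the second case (image, paths, and commands are illustrative): the application writes to a file in a shared emptyDir, and a streaming sidecar tails it to stdout, where a node-level agent can pick it up:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-log-demo
spec:
  containers:
  - name: app                 # writes its log to a file inside the container
    image: busybox
    command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 1; done"]
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  - name: log-streamer        # sidecar: streams the log file to stdout
    image: busybox
    command: ["sh", "-c", "tail -n+1 -f /var/log/app.log"]
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  volumes:
  - name: log-volume
    emptyDir: {}              # shared between app and sidecar; lives with the pod
```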


Course address: https://edu.aliyun.com/roadmap/cloudnative?spm=5176.11399608.aliyun-edu-index-014.4.dc2c4679O3eIId#suit
