Liveness probe (Liveness), readiness probe (Readiness), startup probe (Startup)
The biggest difference between a readiness probe and a liveness probe:
A liveness probe kills a container that fails its check and starts a new container in its place, so the pod keeps working normally. A readiness probe, when its check fails, removes the pod from the service instead of restarting the container. This ensures that only ready pods back the service, that clients interact only with healthy pods, and that clients never notice there is a problem with the system.
1. Liveness Probe
A liveness probe determines whether the container is alive (in a running state). If the LivenessProbe detects that the container is unhealthy, the kubelet kills the container and handles it according to the container's restart policy. If a container does not define a LivenessProbe, the kubelet treats the probe result as always "Success".
Probe mechanisms
Kubernetes supports the following three probe mechanisms.
- HTTP GET: send an HTTP GET request to the container. If the probe receives a 2xx or 3xx response, the container is healthy.
- TCP Socket: attempt to establish a TCP connection to the specified container port. If the connection is established successfully, the container is healthy.
- Exec: execute a command inside the container and check the command's exit status code. If the status code is 0, the container is healthy.
HTTP GET (the httpGet probe is recommended for production environments)
The HTTP GET method is the most common. It sends an HTTP GET request to the container; if the probe receives a 2xx or 3xx response, the container is healthy. It is defined as follows:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-nginx-http
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-rs
    matchExpressions:
      - {key: env, operator: In, values: [dev]}
  template:
    metadata:
      labels:
        app: nginx-rs
        env: dev
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        ports:
        - containerPort: 80
        livenessProbe:            # define a liveness probe for the container
          httpGet:                # probe type: httpGet
            # host: 10.244.0.49   # target host; defaults to the pod IP, usually omitted
            path: /               # request path
            port: 80              # port
          initialDelaySeconds: 15 # initial delay of 15s before probing starts
          periodSeconds: 5        # probe interval
          timeoutSeconds: 2       # timeout
          failureThreshold: 9     # consecutive failures before the probe is considered failed
Liveness probe properties
initialDelaySeconds: how many seconds to wait after the container starts before probing begins;
periodSeconds: how often the probe runs, i.e. the interval in seconds between probes; the default is 10 seconds, the minimum is 1;
timeoutSeconds: the probe timeout, default 1 second, minimum 1 second; the container must respond within this time, otherwise the probe attempt counts as a failure;
successThreshold: the minimum number of consecutive successes before the probe is considered successful; the default is 1, the minimum is 1, and for liveness it must be 1;
failureThreshold: the number of consecutive failures before the probe is considered failed; the default is 3; after 3 consecutive failures, k8s decides what to do with the container based on the pod's restart policy;
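As an illustration of how these properties combine (the arithmetic below is our own, applied to the values from the manifest above), the worst-case time before the kubelet restarts an unhealthy container is roughly initialDelaySeconds + failureThreshold × periodSeconds:

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 15  # probing starts 15s after the container starts
  periodSeconds: 5         # one probe every 5s
  failureThreshold: 9      # 9 consecutive failures are required
  # worst case: 15 + 9 * 5 = 60s before the container is restarted
```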
Note: when defining a liveness probe, be sure to set the initialDelaySeconds property, the initial delay. If it is not set, probing starts as soon as the container starts, and the application may not be ready yet; the probe then fails, k8s kills the container according to the pod's restart policy and recreates it, producing puzzling restart loops.
In a production environment, be sure to define a liveness probe.
exec probe
The exec probe executes a shell command inside the container and judges success or failure by whether the command's exit status code is 0:
[root@master ~]# cat rs_nginx.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-nginx-http
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-rs
    matchExpressions:
      - {key: env, operator: In, values: [dev]}
  template:
    metadata:
      labels:
        app: nginx-rs
        env: dev
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        ports:
        - containerPort: 80
        livenessProbe:             # define the probe
          exec:                    # probe type: exec
            command:               # runs `cat /usr/share/nginx/html/index.html`; the probe
            - cat                  # result is judged by whether the command's exit
            - /usr/share/nginx/html/index.html  # status code is 0
          initialDelaySeconds: 8   # initial delay: probing starts 8s after the container starts
          periodSeconds: 3         # probe interval: one probe every 3s
          timeoutSeconds: 2        # timeout: the container must respond within 2s, otherwise the attempt fails
          failureThreshold: 1      # consecutive failures: a single failure counts as failed, and k8s
                                   # decides whether to restart the container per the pod restart policy
# Now delete /usr/share/nginx/html/index.html inside the container and see whether the probe takes effect
[root@master ~]# kubectl exec rs-nginx-http-2rng2 -it -- bash   # enter the container
root@rs-nginx-http-2rng2:/# rm -rf /usr/share/nginx/html/index.html   # delete /usr/share/nginx/html/index.html
root@rs-nginx-http-2rng2:/# exit   # leave the container
[root@master ~]# kubectl get pods   # list pods
NAME READY STATUS RESTARTS AGE
pod/rs-nginx-http-2rng2 1/1 Running 1 (11s ago) 74s   # this pod's container has restarted once
pod/rs-nginx-http-bgr4m 1/1 Running 0 74s
pod/rs-nginx-http-cv9bv 1/1 Running 0 74s
[root@master ~]# kubectl describe pod/rs-nginx-http-2rng2   # the pod details show the probe failed and the container was restarted
................
Restart Count: 1
Liveness: exec [cat /usr/share/nginx/html/index.html] delay=8s timeout=2s period=3s #success=1 #failure=1
................................
Normal Pulled 9m32s (x2 over 10m) kubelet Container image "nginx:1.7.9" already present on machine
Normal Created 9m32s (x2 over 10m) kubelet Created container nginx-container
Warning Unhealthy 9m32s kubelet Liveness probe failed: cat: /usr/share/nginx/html/index.html: No such file or directory
Normal Killing 9m32s kubelet Container nginx-container failed liveness probe, will be restarted
Normal Started 9m31s (x2 over 10m) kubelet Started container nginx-container
[root@master ~]#
tcpSocket probe
The tcpSocket probe attempts a TCP socket connection to the container's IP address and port, and checks whether the connection can be established successfully:
[root@master ~]# cat rs_nginx.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-nginx-http
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-rs
    matchExpressions:
      - {key: env, operator: In, values: [dev]}
  template:
    metadata:
      labels:
        app: nginx-rs
        env: dev
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        ports:
        - containerPort: 80
        livenessProbe:            # define a liveness probe
          tcpSocket:              # probe type: tcpSocket
            # host: 10.244.0.49   # target host; defaults to the pod IP, usually omitted
            port: 80              # probe whether a TCP connection to port 80 can be established
          initialDelaySeconds: 8
          periodSeconds: 3
          timeoutSeconds: 2
          failureThreshold: 1
2. Readiness Probe
A readiness probe determines whether the container has started up (is in a ready state) and can receive requests. If the ReadinessProbe fails, the pod's state is updated and the Endpoint Controller removes the Endpoint of the pod containing the container from the Service's Endpoint list.
When a pod starts, it is immediately added to the service's endpoint IP list and begins receiving client connections. If the business process in the pod's container has not finished initializing at that point, those client connections fail. To solve this problem, Kubernetes provides readiness probes.
A readiness probe defined on a container in a pod checks the container periodically. If the check fails, the pod is not ready and cannot accept client connections, so it is removed from the endpoint list and the service stops routing requests to it. The probe keeps checking, and once the container is ready, the pod is added back to the endpoint list.
Be sure to define a readiness probe
If no readiness probe is defined, a newly created pod is added to the service's endpoint list immediately. If the business program in the container takes a long time before it can handle incoming connections, and the service forwards a client connection to the pod during that window, the client sees errors such as "connection refused". It is therefore necessary to define a readiness probe for every container, even a simple httpGet probe.
In production environments, developers are usually asked to provide a health-check endpoint for the container; Spring Boot already ships with a native health-check endpoint. Once the health-check endpoint is in place, define the probe with the httpGet method.
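As an illustrative sketch of such a probe: the path below assumes Spring Boot Actuator's default /actuator/health endpoint, and port 8080 is a hypothetical application port.

```yaml
readinessProbe:
  httpGet:
    path: /actuator/health   # Spring Boot Actuator's default health endpoint
    port: 8080               # hypothetical application port
  initialDelaySeconds: 10
  periodSeconds: 5
```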
Three types of readiness probes
Like liveness probes, readiness probes come in three types: exec, httpGet and tcpSocket.
exec: executes a command inside the container and checks the command's exit status code; a status code of 0 means the container is ready;
httpGet: sends an HTTP GET request to the container's IP, port and path, and judges readiness by whether the response status code falls in the 200-399 range;
tcpSocket: initiates a TCP socket connection to the container's IP address and port; if the connection can be established, the container is considered ready.
Properties of the readiness probe
The readiness probe has the following additional properties:
initialDelaySeconds: delay in seconds, i.e. how long after the container starts before probing begins; if unset, probing starts as soon as the container starts;
periodSeconds: how often the probe runs (seconds); default 10, minimum 1;
timeoutSeconds: timeout; the probe must get a response within this time, otherwise the attempt counts as a failure; default 1 second, minimum 1;
failureThreshold: the number of consecutive failed attempts before the probe is considered failed; default 3, minimum 1;
successThreshold: the number of consecutive successful attempts before the probe is considered successful; default 1, minimum 1;
Create an exec readiness probe
The exec readiness probe executes a command in the container and checks the command's exit status code; a status code of 0 means the container is ready. Next, use a deployment to deploy
3 pods with exec readiness probes (the service already exists), as follows:
[root@master ~]# cat deplyment_nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        readinessProbe:            # define an exec probe
          initialDelaySeconds: 5   # probing starts 5s after the container starts
          periodSeconds: 10        # probe interval: one probe every 10s
          timeoutSeconds: 2        # timeout: no response within 2s counts as a failed attempt
          failureThreshold: 3      # 3 consecutive failures mean the probe has failed
          successThreshold: 1      # 1 success means the probe has succeeded
          exec:                    # probe type: exec
            command:               # the command run is `ls /var/ready`
            - ls
            - /var/ready
        ports:
        - name: http
          containerPort: 80
[root@master ~]# kubectl get pods   # list pods; none of them is ready
NAME READY STATUS RESTARTS AGE
deployment-nginx-68bb45dd46-5rj7c 0/1 Running 0 3m1s
deployment-nginx-68bb45dd46-78ld2 0/1 Running 0 3m1s
deployment-nginx-68bb45dd46-gnnhn 0/1 Running 0 3m1
[root@master ~]# kubectl describe pods deployment-nginx-68bb45dd46-5rj7c |tail -n 3   # the pod details show the readiness probe failed
Normal Created 5m7s kubelet Created container nginx-container
Normal Started 5m7s kubelet Started container nginx-container
Warning Unhealthy 2m18s (x22 over 5m6s) kubelet Readiness probe failed: ls: cannot access /var/ready: No such file or directory
[root@master ~]#
[root@master ~]# kubectl get ep svc-rc-nginx-nodeport   # the service's endpoint list is empty: no pod IPs are available
NAME ENDPOINTS AGE
svc-rc-nginx-nodeport 6h32m
# Now manually create a /var/ready file in one pod so that it becomes ready and the service can add it
[root@master ~]# kubectl exec deployment-nginx-68bb45dd46-5rj7c -- touch /var/ready   # create /var/ready in the pod
[root@master ~]# kubectl get pods,ep -o wide   # the pod is now in the ready state
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/deployment-nginx-68bb45dd46-5rj7c 1/1 Running 0 8m52s 10.244.1.77 node1 <none>
pod/deployment-nginx-68bb45dd46-78ld2 0/1 Running 0 8m52s 10.244.1.76 node1 <none>
pod/deployment-nginx-68bb45dd46-gnnhn 0/1 Running 0 8m52s 10.244.2.38 node2 <none>
NAME ENDPOINTS AGE   # the endpoint list now contains this pod's IP and port
endpoints/svc-rc-nginx-nodeport 10.244.1.77:80 6h36m
[root@master ~]#
Create an httpGet readiness probe
The httpGet readiness probe sends an HTTP GET request to the container and judges whether the container is ready by the HTTP status code of the response.
The following uses a deployment to deploy three pods with an httpGet readiness probe, as shown below:
..........
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        readinessProbe:           # define the probe
          httpGet:                # probe type: httpGet
            # host: 10.244.0.49   # target host; defaults to the pod IP, usually omitted
            path: /read
            port: 80
...........
Create a tcpSocket readiness probe
The tcpSocket readiness probe opens a TCP connection to the specified port of the container; if the connection is established, the container is considered ready.
Next, use a deployment to deploy pods with a tcpSocket readiness probe, as follows:
..........
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        readinessProbe:           # define the probe
          tcpSocket:              # probe type: tcpSocket
            # host: 10.244.0.49   # target host; defaults to the pod IP, usually omitted
            port: 80
...........
3. Startup Probe (StartupProbe)
A startup probe indicates whether the application in the container has started. If a startup probe is provided, all other probes are disabled until it succeeds. If the startup probe fails, the kubelet kills the container, and the container is restarted according to its restart policy. If the container does not provide a startup probe, the default status is Success.
startupProbe is a startup probe added in version 1.16. It determines whether the container's process has started, and it exists to handle applications with long or slow startup. When a startupProbe is configured, the other probes are disabled until the startupProbe succeeds; once it succeeds, it exits and does not probe again. If the startupProbe fails, the container is restarted according to the pod's restart policy.
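A minimal sketch of a startup probe (the values below are illustrative, not from the source): the slow-starting application gets up to failureThreshold × periodSeconds = 30 × 10 = 300 seconds to come up before the kubelet restarts the container, and the liveness probe stays disabled during that window.

```yaml
      containers:
      - image: nginx:1.7.9
        name: nginx-container
        startupProbe:            # other probes are disabled until this succeeds
          httpGet:
            path: /
            port: 80
          periodSeconds: 10      # probe every 10s
          failureThreshold: 30   # allow up to 30 * 10 = 300s for the app to start
        livenessProbe:           # takes over once the startup probe has succeeded
          httpGet:
            path: /
            port: 80
          periodSeconds: 5
          failureThreshold: 3
```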