Kubernetes pitfalls guide: processes in a Pod do not receive SIGTERM during a Deployment update

Overview

After we containerized our applications, every release restarted the service, and callers of that service saw a large number of errors until the restart completed. The errors were failed RPC calls. Our RPC framework supports graceful shutdown: when the process receives SIGTERM, a JVM ShutdownHook that we register removes the RPC service from the registry. In other words, on SIGTERM the application deregisters itself before it exits, which prevents callers from hitting an instance that is going away. So why did the problem appear after containerization?

Troubleshooting

The application starts normally.

Processes as seen from inside the container:

# yum install psmisc
# pstree -p
bash(1)───java(22)─┬─{java}(23)
                   ├─{java}(24)
                   ├─{java}(25)
                   ├─{java}(26)
                   ├─{java}(27)
                   ├─{java}(28)
                   ├─{java}(29)
                   ├─{java}(30)
                    ...
# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:50 ?        00:00:00 /bin/bash run.sh start
root        22     1 15 09:50 ?        00:01:20 /app/3rd/jdk/default/bin/java -Xmx512m -Xms512m ...
root        49     0  0 09:51 pts/0    00:00:00 bash
root       263    49  0 09:59 pts/0    00:00:00 ps -ef

Inside the container, running kill 22 against the java child process works as expected: the application's shutdown hooks run and clean up properly.

In production, however, when a rolling update deleted the pod, its logs showed that the application process did not handle SIGTERM when the pod was terminated.

Problem analysis

The Kubernetes documentation describes how pods are terminated:
https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods

Our container is started through the run.sh script. As the ps output above shows, the java process (PID 22) is a child of the run.sh process, which is PID 1. Kubernetes sends SIGTERM to the main process of the container, that is, to PID 1, so when the pod is deleted the java process does not necessarily receive SIGTERM. That is why our shutdown hook did not take effect.
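The behaviour can be reproduced without Kubernetes. The following is a minimal sketch, not the article's run.sh: a wrapper shell starts a child in the foreground without exec, and SIGTERM is sent to the wrapper only. The trap in the child is a stand-in for the JVM shutdown hook, and all names are made up for the demo. One difference is worth noting: outside a container the wrapper is simply killed by the signal, whereas a shell running as PID 1 with no handler would typically ignore it. In both cases the child never sees SIGTERM.

```sh
#!/bin/sh
# Hypothetical demo: the wrapper shell plays the role of run.sh,
# the inner shell plays the role of the java process.
log=$(mktemp)

# The child installs a TERM handler (stand-in for the JVM shutdown hook).
child='trap "echo child: got SIGTERM" TERM; sleep 3; echo child: never signalled'

# The trailing ":" keeps the wrapper from exec-ing its last command on its own.
sh -c "sh -c '$child'; :" >"$log" 2>&1 &
wrapper_pid=$!

sleep 1
kill -TERM "$wrapper_pid"   # stand-in for the SIGTERM sent to the container's main process
sleep 3                     # give the orphaned child time to finish

result=$(cat "$log")
echo "$result"
rm -f "$log"
```

The handler line never appears in the output, because the shell does not pass the signal on to the process it started.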

Solution

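Now that the problem is located, the fix follows naturally: when run.sh launches the java process, it hands its process context over to java, so that java takes over and becomes PID 1 in the container.
We were inspired by this article: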
https://yeasy.gitbooks.io/docker_practice/content/image/dockerfile/entrypoint.html
Adding the exec command in front of the java invocation in run.sh is all it takes.
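The effect of exec can be checked with plain shell commands. The sketch below is an illustration under assumptions: the real run.sh is not shown in this article, so the launch line in the comment (including JAVA_OPTS and app.jar) is hypothetical. The runnable part compares PIDs: without exec the launched command gets a new PID, with exec it keeps the PID of the shell that launched it.

```sh
#!/bin/sh
# Hypothetical shape of the change in run.sh (names are assumptions):
#   before:  java $JAVA_OPTS -jar app.jar
#   after:   exec java $JAVA_OPTS -jar app.jar

# Each command prints the launcher's PID, then the PID of the command it starts.
# The trailing ":" stops the shell from exec-ing its last command implicitly.
without_exec=$(sh -c 'echo "$$"; sh -c "echo \$\$"; :')
with_exec=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

compare_pids() {
    # $1 holds two PIDs, one per line; split them into $1 and $2.
    set -- $1
    if [ "$1" = "$2" ]; then echo same; else echo different; fi
}

no_exec_result=$(compare_pids "$without_exec")
exec_result=$(compare_pids "$with_exec")

echo "without exec: $no_exec_result"
echo "with exec:    $exec_result"
```

Applied to the container, this means that java replaces the shell as PID 1, so ps -ef inside the rebuilt container should list java with PID 1 rather than as a child of run.sh.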

Then rebuild the image, release it, restart the service, and check the logs left behind by the pod from before the restart.

Problem solved!

Origin yq.aliyun.com/articles/705984