Interviewer: When a Dubbo service restarts and goes offline, will consumers keep calling it, and what should you do about it?

This article was first published on the WeChat public account 【看点代码再上班】 ("Watch the code and go to work"). Follow the account to read the latest articles promptly.

Original article: mp.weixin.qq.com/s?__biz=MzI…

Hello everyone, I'm tin, and this is my 20th original article. Have you ever considered this question: in a distributed system, if consumer traffic keeps hitting a service node while that node restarts, all of those calls fail. To online users, isn't that a system failure?

So let's put the question concretely: when a Dubbo service restarts and goes offline, will consumers continue to call that service node?

The answer is no, provided the service shuts down gracefully.

Today, let's walk through the Dubbo source code together. The discussion moves through four layers: the operating system, the JVM, the Spring container, and finally Dubbo itself.

1. Graceful shutdown of the operating system

Graceful shutdown means that when the application is stopped, a series of "operations" are performed to ensure that the application is shut down normally. These operations include rejecting new requests and connections, closing service registrations, waiting for existing requests to complete, closing thread pools, closing connections, and releasing resources.

Graceful shutdown avoids problems such as abandoned tasks, data loss, and application exceptions that an abrupt shutdown can cause. In essence, graceful shutdown is extra work that a process performs just before it exits.
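The steps listed above can be sketched for a single thread pool. This is only a minimal illustration using the JDK's ExecutorService. The class name GracefulPoolShutdown and the helper shutdownGracefully are hypothetical names made up for this example, and the 100 ms task is a stand-in for real request handling.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulPoolShutdown {

    /**
     * Hypothetical helper: reject new tasks, wait for running ones, and
     * force a stop only if the grace period expires.
     * Returns true if every task finished within the grace period.
     */
    static boolean shutdownGracefully(ExecutorService pool, long graceMillis) {
        pool.shutdown(); // step 1: no new tasks are accepted from here on
        try {
            // step 2: give already submitted tasks time to complete
            if (pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // step 3: interrupt whatever is still running
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(100); // simulated request handling
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                completed.incrementAndGet();
            });
        }
        boolean clean = shutdownGracefully(pool, 5_000);
        System.out.println("clean=" + clean + ", completed=" + completed.get());
        try {
            pool.submit(() -> { });
        } catch (RejectedExecutionException e) {
            System.out.println("new task rejected after shutdown");
        }
    }
}
```

The forced shutdownNow() call plays the role of a last resort: it is reached only when the in-flight work does not finish within the grace period.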

The operating system itself has a notion of graceful shutdown.

To stop a process we typically use kill -9 or kill -15. However, kill -9 is generally not recommended.

That is because kill -9 is very heavy-handed: it sends the process a SIGKILL signal, and the kernel terminates the process immediately. SIGKILL cannot be caught, blocked, or ignored.

kill -9 is clearly not graceful. If the process still has unfinished tasks when SIGKILL arrives, those tasks are abandoned midway, which can leave data in an unpredictable state.

kill -15 is different: it sends the process a SIGTERM signal.

When a process receives SIGTERM, the process itself decides how to handle it. It can reject new external tasks and finish existing ones before stopping.

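In Docker, docker stop is equivalent to kill -15: it sends SIGTERM to the process inside the container, and then sends SIGKILL after 10 seconds (the timeout can be set with a parameter).

docker kill, on the other hand, is like kill -9: it sends SIGKILL directly.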
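The same "SIGTERM first, SIGKILL after a timeout" pattern can be sketched in Java with the JDK's Process API. This is a rough illustration, not how Docker is implemented. The names StopWithGracePeriod, stop, and demo are hypothetical, and the example assumes a Unix-like system with a sleep command on the PATH. The JDK leaves it implementation-dependent whether Process.destroy() terminates normally; on typical Unix JVMs it sends SIGTERM, while destroyForcibly() sends SIGKILL.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

class StopWithGracePeriod {

    /**
     * Hypothetical helper: ask the process to exit, then force-kill it if it
     * is still alive after the grace period.
     * Returns true if no forced kill was needed.
     */
    static boolean stop(Process process, long graceMillis) {
        process.destroy(); // polite request (SIGTERM on typical Unix JVMs)
        try {
            if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            process.destroyForcibly(); // the child cannot handle this one
            process.waitFor();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            process.destroyForcibly();
        }
        return false;
    }

    /** Starts a long-running child ("sleep 30", a Unix-only assumption) and stops it. */
    static boolean demo() {
        try {
            Process child = new ProcessBuilder("sleep", "30").start();
            return stop(child, 10_000) && !child.isAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stopped within grace period: " + demo());
    }
}
```

The 10-second grace period mirrors the docker stop default mentioned above; a child that handles SIGTERM with its own cleanup would use that window to finish its work.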

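2. Graceful shutdown of the JVM

A running Java application is an independent process, so shutting it down means shutting down the JVM.

JVM shutdown falls into three kinds: normal shutdown, forced shutdown, and abnormal shutdown.

Normal shutdown is the key to stopping a Java program gracefully. During a normal shutdown the JVM can carry out a series of cleanup tasks, such as letting thread-pool and connection-pool tasks complete and releasing resources.

The JVM also provides a hook mechanism so that developers can handle specific operations themselves.

This mechanism is the shutdown hook provided by the JDK: the method Runtime.addShutdownHook(Thread hook) in java.lang registers a hook that runs when the JVM shuts down.

For example, let's write a hook that closes the 【看点代码再上班】 library: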

package com.tin.example.shutdown.hook;

/**
 * title: JVMShutdownHookTest
 * <p>
 * description: registers a JVM shutdown hook and then loops until the process is killed.
 *
 * @author tin (WeChat public account 【看点代码再上班】) on 2022/4/5 12:27 PM
 */
public class JVMShutdownHookTest {
    public static void main(String[] args) throws Exception {
        Runtime.getRuntime().addShutdownHook(new Thread(JVMShutdownHookTest::doSomething));

        while (true) {
            System.out.println("i am running...");
            Thread.sleep(500);
        }
    }

    /**
     * Work to do before the JVM stops.
     */
    private static void doSomething() {
        System.out.println("Closing the 【看点代码再上班】 library.");
    }
}

Run the following commands:

ericli@EricdeAir IntelliJIdea2020.1 % jps
5458 Jps
5394 KotlinCompileDaemon
2986 QuorumPeerMain
5451 Launcher
5452 JVMShutdownHookTest
3023 ZooKeeperMain
ericli@EricdeAir IntelliJIdea2020.1 % kill 5452

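As the console output of the local test shows, when I run kill pid (which is in fact kill -15, since SIGTERM is the default signal), the Java program first executes the shutdown hook logic and only then exits.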

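3. Graceful shutdown of the Spring container

Spring's graceful shutdown also relies on the JVM shutdown hook.

First, a shutdown hook is registered when the application context is loaded (Spring Boot does this by default; with plain Spring you call registerShutdownHook() on the context). Before the process exits, this hook destroys the beans, tears down the container, and so on.

In addition to destroying beans, Spring also publishes a ContextClosedEvent. Many third-party frameworks built on the Spring container listen for this event to perform further graceful-shutdown work.

For example, we can listen for Spring's events: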

@Override
public void onApplicationContextEvent(ApplicationContextEvent event) {
    if (event instanceof ContextClosedEvent) {
        // custom work the application must do before the container closes
        onContextClosedEvent((ContextClosedEvent) event);
    }
}

4. Graceful shutdown of Dubbo

Dubbo's graceful shutdown has been optimized several times between the early 2.5 release and the current 2.7.

The final version also builds on Spring's ApplicationListener mechanism and listens for the Spring container's close event.

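The class org.apache.dubbo.config.spring.extension.SpringExtensionFactory contains the code that registers the shutdown hook.

Interestingly, there is an issue behind this code: the statement marked ② in the listing unregisters a shutdown hook, while the statement marked ① registers one.

Why register one hook and unregister another?

The Dubbo developers found that a Dubbo process running inside a Spring container triggers the shutdown logic twice.

Once in org.apache.dubbo.config.bootstrap.DubboBootstrap, and once in org.apache.dubbo.config.spring.context.DubboBootstrapApplicationListener, which shuts down the Dubbo service by listening for Spring's close event.

What is the consequence?

Answer: under high concurrency, some task threads cannot obtain the resources they need and throw exceptions.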

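The issue explains this clearly:

Dubbo has not finished shutting down, but the database connection pool has already been closed. Under high concurrency, many Dubbo threads that are still running cannot obtain a database connection and throw exceptions.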

If there is an exception, please attach the exception trace:

msg:DUBBO service exception remoteHost:xxx.xx.xxx.xxx service:com.xxx.service.xxxService method:xxx message:nested exception is org.apache.ibatis.exceptions.PersistenceException:

Error querying database. Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: HikariDataSource HikariDataSource (HikariPool-1) has been closed.

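This handling is the result of several rounds of optimization in Dubbo. The following article describes it in great detail and is well worth reading: www.cnkirito.moe/dubbo-grace…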
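The "register one hook, unregister the other" idea discussed above can be modelled with plain JDK calls. This is a simplified sketch, not Dubbo's actual code: the class SingleShutdownPath, the method keepOnly, and both hook threads are hypothetical stand-ins. It relies only on Runtime.addShutdownHook and Runtime.removeShutdownHook. The JDK documents that registered hooks are started in an unspecified order and run concurrently, which is why two independent hooks that touch the same resources can interfere with each other.

```java
class SingleShutdownPath {

    /**
     * Hypothetical helper: registers the hook that should drive shutdown and
     * unregisters the redundant one, leaving a single ordered shutdown path.
     * Returns true if the redundant hook had been registered and was removed.
     */
    static boolean keepOnly(Thread hookToKeep, Thread hookToDrop) {
        Runtime runtime = Runtime.getRuntime();
        runtime.addShutdownHook(hookToKeep);          // register
        return runtime.removeShutdownHook(hookToDrop); // unregister
    }

    public static void main(String[] args) {
        // Stand-in for a framework's own JVM hook (an assumption for this sketch).
        Thread frameworkHook = new Thread(
                () -> System.out.println("framework hook: destroy everything"));
        // Stand-in for the container's hook, which also notifies the framework.
        Thread containerHook = new Thread(
                () -> System.out.println("container hook: close event, then destroy beans"));

        Runtime.getRuntime().addShutdownHook(frameworkHook);
        boolean removed = keepOnly(containerHook, frameworkHook);
        System.out.println("framework hook removed: " + removed);
        // When the JVM exits, only the container hook runs.
    }
}
```

With a single hook left, the framework's cleanup can be sequenced inside the container's close procedure instead of racing against it.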

Now let's look at what Dubbo's ShutdownHook actually does.

org.apache.dubbo.config.DubboShutdownHook#doDestroy

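  1. Registry data destruction: removes this node's provider addresses and subscription data from the registry.

  2. Protocol data destruction: destroys all protocols, including every exported and referenced service, and releases all resources the protocols hold, such as connections and ports.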

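While Dubbo destroys its registry data, the Spring container also has to be closed; that part is handled by Spring's ShutdownHook.

org.springframework.context.support.AbstractApplicationContext#doClose

Publishing the close event lets the application layer run its own special handling logic, such as Dubbo deregistering from the registry.

Before the Spring container is closed, Dubbo has already removed its service registration from the registry and stopped accepting new requests. It also finishes the local tasks that are still in flight, and only then are the beans destroyed and the process shut down.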
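The ordering described above can be pictured with a toy model. This is neither Spring's nor Dubbo's code: MiniContainer, onClose, and registerBean are hypothetical names, and the log strings merely label the steps. The sketch only shows why publishing the close event before destroying beans lets the framework finish its work while resources such as the connection pool still exist.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderSketch {

    /** Hypothetical, heavily simplified container: close listeners first, beans last. */
    static final class MiniContainer {
        private final List<Runnable> closeListeners = new ArrayList<>();
        private final List<Runnable> beanDestroyers = new ArrayList<>();

        void onClose(Runnable listener) { closeListeners.add(listener); }

        void registerBean(Runnable destroyer) { beanDestroyers.add(destroyer); }

        void close() {
            closeListeners.forEach(Runnable::run); // 1. publish the close event
            beanDestroyers.forEach(Runnable::run); // 2. only then destroy the beans
        }
    }

    /** Runs one close cycle and returns the steps in the order they happened. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniContainer container = new MiniContainer();
        container.registerBean(() -> log.add("destroy bean: connection pool"));
        container.onClose(() -> {
            log.add("unregister service from registry");
            log.add("finish in-flight requests");
        });
        container.close();
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Even though the bean is registered before the listener, the listener's steps are logged first, because close() always notifies listeners before it destroys beans.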

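5. Conclusion

I'm tin, an ordinary engineer working hard to get better. My experience and knowledge are limited, so if you find anything wrong in this article, you are very welcome to contact me and point it out. I will review it carefully and make corrections.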

If you've read this far, please share, like, and follow before you go. Writing consistently is not easy, and your positive feedback is my strongest motivation to keep going. Thank you!


Origin juejin.im/post/7084799494478364680