How to add custom mechanisms to Spring Boot's graceful shutdown

Personal creation convention: I declare that all the articles I write are my own work. Any reference to another article is marked; if I have missed one, criticism is welcome. If you find this article plagiarized online, please report it and submit an issue to this GitHub repository. Thank you for your support~

Spring Boot has provided a graceful shutdown mechanism since version 2.3.x, and we have deployed it in production to improve the user experience. Nowadays almost everyone relies on mechanisms such as eventual consistency and transactions to keep the business correct even when a process is not shut down gracefully. However, an abrupt shutdown still causes short-term data inconsistency, which hurts the user experience. We therefore introduce graceful shutdown to make sure that in-flight requests are processed before the beans in the ApplicationContext start to be destroyed.

Graceful shutdown problems

The shutdown process of the ApplicationContext can be roughly divided into the following steps (corresponding to the doClose method of AbstractApplicationContext in the source code):

  1. Unregister the current ApplicationContext from LiveBeansView (currently this only involves unregistering from JMX)
  2. Publish the ContextClosedEvent and process all listeners of this event synchronously
  3. Process all beans that implement the Lifecycle interface, resolve their shutdown order, and call their stop method
  4. Destroy all beans in the ApplicationContext
  5. Close the BeanFactory

Put simply, graceful shutdown is a Lifecycle that adds graceful shutdown logic to the third step above. It consists of the following two steps:

  1. Cut off the external traffic entry: let Spring Boot's web container reject all newly received requests and stop processing new ones, for example by returning 503 directly.
  2. Wait for the thread pool behind the Dispatcher to finish processing all requests: for a synchronous Servlet application, this is the thread pool that processes Servlet requests. For an asynchronous, reactive WebFlux application, this is the Reactor thread pool that serves web requests, and the wait lasts until the events published by all current Publishers have been processed.

First, the external traffic entry is cut off so that no new requests arrive. Once the thread pool has processed all in-flight requests, the normal business logic has completed, and the other components can start to shut down.

However, we must first make sure that the graceful shutdown logic runs first among all Lifecycles, which is the safest order. This ensures that all requests have been processed before any other Lifecycle is stopped. What goes wrong otherwise? Suppose one Lifecycle is a load balancer whose stop method shuts the load balancer down. If this Lifecycle is stopped before the graceful-shutdown Lifecycle, some requests may still be in progress after the load balancer has stopped. If those requests need the load balancer to call other microservices, they fail.

Another problem is that the default graceful shutdown is not comprehensive enough, and sometimes we need to add more shutdown logic on top of it. For example, your project may not only have the web container's thread pool for processing requests, but also thread pools of your own. These pools may be used in complex ways: one submits to another, they submit to each other, and so on. After the web container's thread pool has processed all requests, we need to wait for these thread pools to finish all their tasks before shutting down. Another example is MQ consumers: on graceful shutdown, a consumer should stop consuming new messages and wait for all messages currently being processed to finish. These problems are illustrated in the figure below: [image]

Source code analysis of the access points - Spring Boot + Undertow & synchronous Servlet environment

Starting from the source code, let's analyze how a custom mechanism can be hooked in when Undertow is used as the web container in Spring Boot with a synchronous Servlet environment. First, add the Spring Boot dependencies and configure graceful shutdown:

pom.xml

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
        <!-- Do not use the default Tomcat container -->
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-tomcat</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- Use the Undertow container -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-undertow</artifactId>
</dependency>

application.yml

server:
  # Set the shutdown mode to graceful
  shutdown: graceful
  
management:
  endpoint:
    health:
      show-details: always
    # Let actuator expose the /actuator/shutdown endpoint for shutting down (graceful shutdown is enabled here, so this is in fact a graceful shutdown)
    shutdown:
      enabled: true
  endpoints:
    jmx:
      exposure:
        exclude: '*'
    web:
      exposure:
        include: '*'

With the shutdown mode set to graceful, Spring Boot adds a graceful shutdown handler when it creates the Undertow-based WebServer at startup. Refer to the source code:

UndertowWebServerFactoryDelegate

static List<HttpHandlerFactory> createHttpHandlerFactories(Compression compression, boolean useForwardHeaders,
			String serverHeader, Shutdown shutdown, HttpHandlerFactory... initialHttpHandlerFactories) {
	List<HttpHandlerFactory> factories = new ArrayList<>(Arrays.asList(initialHttpHandlerFactories));
	if (compression != null && compression.getEnabled()) {
		factories.add(new CompressionHttpHandlerFactory(compression));
	}
	if (useForwardHeaders) {
		factories.add(Handlers::proxyPeerAddress);
	}
	if (StringUtils.hasText(serverHeader)) {
		factories.add((next) -> Handlers.header(next, "Server", serverHeader));
	}
	// If graceful shutdown is specified, add gracefulShutdown
	if (shutdown == Shutdown.GRACEFUL) {
		factories.add(Handlers::gracefulShutdown);
	}
	return factories;
}

The added handler is Undertow's GracefulShutdownHandler. GracefulShutdownHandler is an HttpHandler, and this interface is very simple:

public interface HttpHandler {
    void handleRequest(HttpServerExchange exchange) throws Exception;
}

Every HTTP request received passes through the handleRequest method of each HttpHandler. The idea behind GracefulShutdownHandler is also very simple. Since every request passes through its handleRequest method, it atomically increments a counter when a request arrives and atomically decrements it when the request completes. Note that this means after the response has been returned, not when the method returns: the request may be asynchronous, so the decrement is done in a callback. If the counter is zero, no requests are being processed. The source code is:

GracefulShutdownHandler:

@Override
public void handleRequest(HttpServerExchange exchange) throws Exception {
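    // Atomically increment the request counter; the returned snapshot is a number that also contains the shutdown state bit
    long snapshot = stateUpdater.updateAndGet(this, incrementActive);
    // Use the state bit to check whether shutdown is in progress
    if (isShutdown(snapshot)) {
        // If shutting down, atomically decrement the request count right away
        decrementRequests();
        // Set the response status code to 503
        exchange.setStatusCode(StatusCodes.SERVICE_UNAVAILABLE);
        // Mark the request as complete
        exchange.endExchange();
        // Return directly without going on to the other HttpHandlers
        return;
    }
    // Add a listener for request completion; it is called when the request completes and the response is returned, and atomically decrements the counter
    exchange.addExchangeCompleteListener(listener);
    // Continue to the next HttpHandler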
    next.handleRequest(exchange);
}

So when is this shutdown invoked? Earlier we mentioned the third step of the ApplicationContext shutdown process: process all beans that implement the Lifecycle interface, resolve their shutdown order, and call their stop method. Graceful shutdown is invoked here. When a Spring Boot + Undertow & synchronous Servlet application starts, a graceful-shutdown Lifecycle is created at the step where the WebServer is created, corresponding to the source code:

ServletWebServerApplicationContext

private void createWebServer() {
	WebServer webServer = this.webServer;
	ServletContext servletContext = getServletContext();
	if (webServer == null && servletContext == null) {
		StartupStep createWebServer = this.getApplicationStartup().start("spring.boot.webserver.create");
		ServletWebServerFactory factory = getWebServerFactory();
		createWebServer.tag("factory", factory.getClass().toString());
		this.webServer = factory.getWebServer(getSelfInitializer());
		createWebServer.end();
		// Right here: create a WebServerGracefulShutdownLifecycle and register it in the BeanFactory of the current ApplicationContext
		getBeanFactory().registerSingleton("webServerGracefulShutdown",
				new WebServerGracefulShutdownLifecycle(this.webServer));
		getBeanFactory().registerSingleton("webServerStartStop",
				new WebServerStartStopLifecycle(this, this.webServer));
	}
	else if (servletContext != null) {
		try {
			getSelfInitializer().onStartup(servletContext);
		}
		catch (ServletException ex) {
			throw new ApplicationContextException("Cannot initialize servlet context", ex);
		}
	}
	initPropertySources();
}

As mentioned earlier, the third step of the ApplicationContext shutdown process calls the stop method of all Lifecycles. Here is the stop method of WebServerGracefulShutdownLifecycle:

WebServerGracefulShutdownLifecycle

@Override
public void stop(Runnable callback) {
	this.running = false;
	this.webServer.shutDownGracefully((result) -> callback.run());
}

Since we are using Undertow, the webServer here is an UndertowWebServer. Take a look at its shutDownGracefully implementation:

UndertowWebServer


// This GracefulShutdownHandler is the one added at startup, as described above
private volatile GracefulShutdownHandler gracefulShutdown;

@Override
public void shutDownGracefully(GracefulShutdownCallback callback) {
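	// If the GracefulShutdownHandler is not null, graceful shutdown is enabled (server.shutdown=graceful)
	if (this.gracefulShutdown == null) {
		// It is null, so graceful shutdown is not enabled; do not wait for anything
		callback.shutdownComplete(GracefulShutdownResult.IMMEDIATE);
		return;
	}
	// Graceful shutdown is enabled; wait for the requests to be processed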
	logger.info("Commencing graceful shutdown. Waiting for active requests to complete");
	this.gracefulShutdownCallback.set(callback);
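	// Call shutdown on the GracefulShutdownHandler to shut down gracefully
	this.gracefulShutdown.shutdown();
	// Call addShutdownListener on the GracefulShutdownHandler to register the action to run after shutdown; here it calls notifyGracefulCallback,
	// which in turn calls the callback passed as the method argument (the external callback)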
	this.gracefulShutdown.addShutdownListener((success) -> notifyGracefulCallback(success));
}

private void notifyGracefulCallback(boolean success) {
	GracefulShutdownCallback callback = this.gracefulShutdownCallback.getAndSet(null);
	if (callback != null) {
		if (success) {
			logger.info("Graceful shutdown complete");
			callback.shutdownComplete(GracefulShutdownResult.IDLE);
		}
		else {
			logger.info("Graceful shutdown aborted with one or more requests still active");
			callback.shutdownComplete(GracefulShutdownResult.REQUESTS_ACTIVE);
		}
	}
}

Look at the shutdown method of GracefulShutdownHandler and the addShutdownListener method:

GracefulShutdownHandler:

public void shutdown() {
    // Set the shutdown state bit and atomically add 1
    stateUpdater.updateAndGet(this, incrementActiveAndShutdown);
    // Then atomically decrement the request count right away
    decrementRequests();
}

private void decrementRequests() {
    long snapshot = stateUpdater.updateAndGet(this, decrementActive);
    // Shutdown has completed when the activeCount portion is zero, and shutdown is set.
    // If the snapshot equals the shutdown state bit mask exactly, all other bits are 0, so the number of requests still being processed is 0
    if (snapshot == SHUTDOWN_MASK) {
        // Call shutdownComplete
        shutdownComplete();
    }
}

private void shutdownComplete() {
    synchronized (lock) {
        lock.notifyAll();
        // Call the shutdown method of each ShutdownListener
        for (ShutdownListener listener : shutdownListeners) {
            listener.shutdown(true);
        }
        shutdownListeners.clear();
    }
}

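/**
 * This method does more than its name suggests. First, a ShutdownListener cannot be added unless shutdown is in progress.
 * If there are no requests left, the shutdown method of the given shutdownListener is called directly.
 * If there are still requests, it is added to shutdownListeners, and shutdownComplete later iterates over shutdownListeners and calls shutdown.
 * The lock mainly protects access to shutdownListeners by addShutdownListener and shutdownComplete.
 * wait/notify on the lock mainly serves the awaitShutdown mechanism, which is not covered here.
 */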
public void addShutdownListener(final ShutdownListener shutdownListener) {
        synchronized (lock) {
            if (!isShutdown(stateUpdater.get(this))) {
                throw UndertowMessages.MESSAGES.handlerNotShutdown();
            }
            long count = activeCount(stateUpdater.get(this));
            if (count == 0) {
                shutdownListener.shutdown(true);
            } else {
                shutdownListeners.add(shutdownListener);
            }
        }
    }
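
The counting scheme used by GracefulShutdownHandler can be modelled in isolation with plain JDK classes. The following is only a minimal sketch of the idea, not Undertow's code: the class InFlightTracker and its method names are hypothetical, and a single AtomicLong stands in for the field updater used in the real handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified, hypothetical model of the counting idea; not Undertow's class. */
final class InFlightTracker {
    // The highest bit marks "shutting down"; the remaining bits hold the in-flight count.
    private static final long SHUTDOWN_MASK = 1L << 63;

    private final AtomicLong state = new AtomicLong();
    private final Object lock = new Object();
    private final List<Runnable> idleListeners = new ArrayList<>();

    /** Counts a new request in; returns false if it must be rejected (e.g. with a 503). */
    boolean tryEnter() {
        long snapshot = state.incrementAndGet();
        if ((snapshot & SHUTDOWN_MASK) != 0) {
            exit(); // undo the increment; this may also complete the shutdown
            return false;
        }
        return true;
    }

    /** Counts a request out once its response has been fully written. */
    void exit() {
        if (state.decrementAndGet() == SHUTDOWN_MASK) {
            fireIdleListeners(); // shutdown bit is set and the count is zero
        }
    }

    /** Starts rejecting new requests and runs onIdle once nothing is in flight. */
    void shutdown(Runnable onIdle) {
        synchronized (lock) {
            idleListeners.add(onIdle);
        }
        // Set the shutdown bit and add 1 in one atomic step, then give that 1 back,
        // so the "count reached zero" check also runs when nothing was in flight.
        state.updateAndGet(s -> (s | SHUTDOWN_MASK) + 1);
        exit();
    }

    private void fireIdleListeners() {
        List<Runnable> toRun;
        synchronized (lock) {
            toRun = new ArrayList<>(idleListeners);
            idleListeners.clear(); // each listener runs at most once
        }
        toRun.forEach(Runnable::run);
    }
}
```

Packing the shutdown flag and the counter into one value lets "is shutdown requested and is the count zero" be decided by a single atomic read, which is why the real handler compares the snapshot with SHUTDOWN_MASK.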

This is the underlying principle of graceful shutdown. What we have not yet analyzed is the third step of the ApplicationContext shutdown process: the order in which the graceful-shutdown Lifecycle and the other Lifecycle beans are stopped. Let's clarify that here. First, look at the entry point for stopping Lifecycle beans:

DefaultLifecycleProcessor

private void stopBeans() {
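	// Read all Lifecycle beans. The result is a LinkedHashMap, so iteration order equals insertion order.
	// Insertion order is the order in which the BeanFactory returns the Lifecycle beans. It depends on bean loading order and is hard to control; it may change when upgrading to another version.
	Map<String, Lifecycle> lifecycleBeans = getLifecycleBeans();
	// Group the Lifecycles by their phase value.
	// If a bean implements the Phased interface, its phase is the value returned by getPhase.
	// If it does not implement Phased, its phase is treated as 0.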
	Map<Integer, LifecycleGroup> phases = new HashMap<>();
	lifecycleBeans.forEach((beanName, bean) -> {
		int shutdownPhase = getPhase(bean);
		LifecycleGroup group = phases.get(shutdownPhase);
		if (group == null) {
			group = new LifecycleGroup(shutdownPhase, this.timeoutPerShutdownPhase, lifecycleBeans, false);
			phases.put(shutdownPhase, group);
		}
		group.add(beanName, bean);
	});
	// If not empty, there are Lifecycles to stop; start stopping them
	if (!phases.isEmpty()) {
		// Sort by phase value in descending order
		List<Integer> keys = new ArrayList<>(phases.keySet());
		keys.sort(Collections.reverseOrder());
		// Stop the groups one by one
		for (Integer key : keys) {
			phases.get(key).stop();
		}
	}
}

To sum up:

  1. Get all beans that implement the Lifecycle interface from the BeanFactory of the current ApplicationContext.
  2. Read the Phase value of each bean: if the bean implements the Phased interface, take the value returned by getPhase; otherwise it is 0.
  3. Group the beans by Phase value.
  4. Traverse the groups in descending order of Phase value and stop each group in turn.
  5. We will not go through the code for stopping each group in detail. It is enough to know that, when stopping a Lifecycle bean, Spring also checks whether other Lifecycle beans depend on it; if so, the dependent Lifecycle beans are stopped first.
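
The grouping and ordering in steps 2 to 4 can be sketched with plain JDK collections. This is only an illustration under simplifying assumptions: PhaseOrderSketch is a hypothetical class, beans are represented by their names and phase values, and the dependency handling of step 5 is left out. Within one phase the sketch simply follows the iteration order of the input map, standing in for the hard-to-control bean loading order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of "group by phase, stop the highest phase first". */
final class PhaseOrderSketch {

    /** Returns the bean names in the order in which they would be stopped. */
    static List<String> stopOrder(Map<String, Integer> phaseByBean) {
        // A TreeMap with a reversed comparator iterates from the largest phase down.
        Map<Integer, List<String>> groups = new TreeMap<>(Comparator.reverseOrder());
        phaseByBean.forEach((name, phase) ->
                groups.computeIfAbsent(phase, p -> new ArrayList<>()).add(name));
        List<String> order = new ArrayList<>();
        groups.values().forEach(order::addAll);
        return order;
    }

    public static void main(String[] args) {
        Map<String, Integer> beans = new LinkedHashMap<>();
        beans.put("plainLifecycle", 0);                          // does not implement Phased
        beans.put("bizThreadPools", Integer.MAX_VALUE - 1);      // hypothetical custom bean
        beans.put("webServerGracefulShutdown", Integer.MAX_VALUE);
        System.out.println(stopOrder(beans));
    }
}
```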

Let's take a look at WebServerGracefulShutdownLifecycle:

class WebServerGracefulShutdownLifecycle implements SmartLifecycle {
    ....
}

SmartLifecycle extends the Phased interface and provides a default implementation:

public interface SmartLifecycle extends Lifecycle, Phased {
    int DEFAULT_PHASE = Integer.MAX_VALUE;
    @Override
	default int getPhase() {
		return DEFAULT_PHASE;
	}
}

As you can see, any class that implements SmartLifecycle gets the maximum value as its default Phase. Therefore the Phase of the graceful-shutdown Lifecycle, WebServerGracefulShutdownLifecycle, is the maximum value, which means it belongs to the group that is stopped first.

Summary of the access points - Spring Boot + Undertow & synchronous Servlet environment

1. Access point 1 - Add a bean that implements the SmartLifecycle interface, with a Phase smaller than that of WebServerGracefulShutdownLifecycle

From the previous analysis we already know that the Phase of WebServerGracefulShutdownLifecycle is the maximum value, so it belongs to the group that is stopped first. What we want is to add our own graceful shutdown logic after it but before the beans are destroyed (the fourth step of the ApplicationContext shutdown mentioned above). Once bean destruction has started, some beans can no longer be used, for example the beans used for microservice calls; any task that still needs to make such a call at that point throws an exception. The first idea that comes to mind is to add a Lifecycle whose Phase is the maximum value minus one and to hook our graceful shutdown logic into it, for example:

@Log4j2
@Component
public class BizThreadPoolShutdownLifecycle implements SmartLifecycle {
    private volatile boolean running = false;
    
    @Override
    public int getPhase() {
        // Right after the group that contains WebServerGracefulShutdownLifecycle
        return SmartLifecycle.DEFAULT_PHASE - 1;
    }

    @Override
    public void start() {
        this.running = true;
    }

    @Override
    public void stop() {
        // Write your graceful shutdown logic here
        this.running = false;
    }

    @Override
    public boolean isRunning() {
        return running;
    }
}

This approach has better compatibility: upgrading the underlying framework dependencies requires basically no changes. The problem is that some framework may bring its own Lifecycle bean. Its Phase may be correct in that it is smaller than that of WebServerGracefulShutdownLifecycle, yet equal to SmartLifecycle.DEFAULT_PHASE - 1, the same as our custom Lifecycle. If that Lifecycle must not stop until our graceful shutdown has finished, but bean loading order happens to place its stop before that of our custom Lifecycle, problems arise. The probability of this happening is low, though.
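
As one possible body for the stop method above, the following JDK-only sketch drains a single business thread pool with a bounded wait. It is an illustration, not part of the original implementation: BizPoolDrainer is a hypothetical helper, and it assumes the pool no longer needs to accept tasks once shutdown begins. Pools that submit tasks to each other cannot simply be shut down one by one like this.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: stop accepting tasks, then wait a bounded time for the rest. */
final class BizPoolDrainer {

    /** Returns true if the pool terminated within the timeout. */
    static boolean drain(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // no new tasks are accepted; queued tasks still run
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // give up waiting and interrupt running tasks
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("last task finished"));
        System.out.println("drained: " + drain(pool, 5, TimeUnit.SECONDS));
    }
}
```

Keeping the wait bounded matters here because a stop method that blocks indefinitely would hold up the rest of the ApplicationContext shutdown.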

2. Access point 2 - Add a ShutdownListener implementation to the List<ShutdownListener> shutdownListeners of Undertow's GracefulShutdownHandler via reflection

This implementation obviously ties us to Undertow as the container, and it may not survive upgrades well. But it lets us run our graceful shutdown logic immediately after the graceful shutdown of the HTTP thread pool, so we need not worry that a newly introduced dependency will disturb the order of our custom graceful shutdown. Judge for yourself whether this is better or worse than the first approach. A simple implementation is:

@Log4j2
@Component
// Only load when the Undertow class is on the classpath
@ConditionalOnClass(name = "io.undertow.Undertow")
public class ThreadPoolFactoryGracefulShutDownHandler implements ApplicationListener<ApplicationEvent> {

    // Handle for accessing the gracefulShutdown field of UndertowWebServer
    private static VarHandle undertowGracefulShutdown;
    // Handle for accessing the shutdownListeners field of GracefulShutdownHandler
    private static VarHandle undertowShutdownListeners;

    static {
        try {
            undertowGracefulShutdown = MethodHandles
                    .privateLookupIn(UndertowWebServer.class, MethodHandles.lookup())
                    .findVarHandle(UndertowWebServer.class, "gracefulShutdown",
                            GracefulShutdownHandler.class);
            undertowShutdownListeners = MethodHandles
                    .privateLookupIn(GracefulShutdownHandler.class, MethodHandles.lookup())
                    .findVarHandle(GracefulShutdownHandler.class, "shutdownListeners",
                            List.class);
        } catch (Exception e) {
            log.warn("ThreadPoolFactoryGracefulShutDownHandler undertow not found, ignore fetch var handles");
        }
    }

    @Override
    public void onApplicationEvent(ApplicationEvent event) {
        // Only handle WebServerInitializedEvent, which is published after the WebServer has been created and initialized
        if (event instanceof WebServerInitializedEvent) {
            WebServer webServer = ((WebServerInitializedEvent) event).getWebServer();
            // Check whether the current web container is Undertow
            if (webServer instanceof UndertowWebServer) {
                GracefulShutdownHandler gracefulShutdownHandler = (GracefulShutdownHandler) undertowGracefulShutdown.getVolatile(webServer);
                // If graceful shutdown is enabled, gracefulShutdownHandler is not null
                if (gracefulShutdownHandler != null) {
                    var shutdownListeners = (List<GracefulShutdownHandler.ShutdownListener>) undertowShutdownListeners.getVolatile(gracefulShutdownHandler);
                    shutdownListeners.add(shutdownSuccessful -> {
                        if (shutdownSuccessful) {
                            // Add your graceful shutdown logic here
                        } else {
                            log.info("ThreadPoolFactoryGracefulShutDownHandler-onApplicationEvent shutdown failed");
                        }
                    });
                }
            }
        }
    }
}
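
The private-field access used above can be tried out in isolation. The sketch below applies the same MethodHandles.privateLookupIn and findVarHandle calls to a stand-in class; LibraryHandler and PrivateFieldAccess are hypothetical names, not Undertow or Spring Boot classes. Unlike the listing above, the sketch fails fast if the handle cannot be created, which is a design choice of this example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a library class that keeps its listeners private. */
final class LibraryHandler {
    private final List<Runnable> listeners = new ArrayList<>();

    void fire() {
        listeners.forEach(Runnable::run);
    }
}

final class PrivateFieldAccess {
    private static final VarHandle LISTENERS;

    static {
        try {
            LISTENERS = MethodHandles
                    .privateLookupIn(LibraryHandler.class, MethodHandles.lookup())
                    .findVarHandle(LibraryHandler.class, "listeners", List.class);
        } catch (ReflectiveOperationException e) {
            // Fail fast instead of leaving a null handle behind.
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Adds a listener to the private list of the given handler. */
    @SuppressWarnings("unchecked")
    static void addListener(LibraryHandler handler, Runnable listener) {
        List<Runnable> list = (List<Runnable>) LISTENERS.get(handler);
        list.add(listener);
    }

    public static void main(String[] args) {
        LibraryHandler handler = new LibraryHandler();
        addListener(handler, () -> System.out.println("custom shutdown logic"));
        handler.fire();
    }
}
```

Because the field is looked up by name, renaming it in a new library version breaks the lookup, which is the upgrade risk of this access point.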

How to achieve graceful shutdown of extra thread pools

Now that we know where to hook in, how do we shut down the custom thread pools in the project? First, we need to get hold of all the thread pools to be checked. How to do this differs between environments and is relatively simple, so it is not covered here. We assume that all thread pools have been obtained and that they are of only the following two implementations (the two thread pools in the JDK; the scheduled-task thread pool ScheduledThreadPoolExecutor is ignored):

  • java.util.concurrent.ThreadPoolExecutor: The most commonly used thread pool
  • java.util.concurrent.ForkJoinPool: Thread pool in the form of ForkJoin

How do we determine that one of these two kinds of thread pool has no tasks left to execute? Reference code:

public static boolean isCompleted(ExecutorService executorService) {
    if (executorService instanceof ThreadPoolExecutor) {
        ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) executorService;
        // For ThreadPoolExecutor, check that there are no active threads
        return threadPoolExecutor.getActiveCount() == 0;
    } else if (executorService instanceof ForkJoinPool) {
        // ForkJoinPool is more complex: check that there are no active threads, no running threads, no tasks in the queues, and no submissions waiting to be executed
        ForkJoinPool forkJoinPool = (ForkJoinPool) executorService;
        return forkJoinPool.getActiveThreadCount() == 0
                && forkJoinPool.getRunningThreadCount() == 0
                && forkJoinPool.getQueuedTaskCount() == 0
                && forkJoinPool.getQueuedSubmissionCount() == 0;
    }
    return true;
}

How do we determine that all thread pools have no tasks left? Real applications may be quite free-form: thread pool A may submit tasks to thread pool B, B may submit tasks to C, and C may submit tasks to A and B. So even if isCompleted returns true for every pool in one pass, there is no guarantee that all pools have really finished. For example, suppose we check A, B, and C in turn. Just before C is checked, C submits tasks to A and B and finishes its own work. The check finds C completed, but A and B, which were checked earlier, now have unfinished tasks. My solution is: shuffle the thread pools, traverse them, and check whether each one is completed. If all of them are completed, increment a counter by 1; as soon as any is not completed, reset the counter to zero. Keep looping, sleeping for 1 second whenever a check fails, until the counter reaches 3, that is, until three consecutive checks of all thread pools in random order find no tasks:

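// getAllThreadPools() is a placeholder: obtaining all thread pools is environment-specific
List<ExecutorService> executorServices = getAllThreadPools();
for (int i = 0; i < 3; ) {
    // Only when three consecutive checks of all thread pools, each in random order, find them completed do we consider them really completed
    Collections.shuffle(executorServices);
    if (executorServices.stream().allMatch(ThreadPoolFactory::isCompleted)) {
        i++;
        log.info("all thread pools are completed, i: {}", i);
    } else {
        // The three successes must be consecutive, so reset the counter
        i = 0;
        log.info("not all thread pools are completed, wait for 1s");
        try {
            TimeUnit.SECONDS.sleep(1);
        } catch (InterruptedException ignored) {
        }
    }
}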
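
A variant of this loop with an overall deadline could look like the following. It is a sketch under stated assumptions, not the author's implementation: PoolQuiescence, isIdle, and awaitQuiescence are hypothetical names; the ThreadPoolExecutor check additionally looks at the work queue; the ForkJoinPool check uses isQuiescent and hasQueuedSubmissions instead of the four counters; and the sketch pauses between every round and gives up when the deadline passes or the thread is interrupted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: wait until all pools look idle several rounds in a row. */
final class PoolQuiescence {

    static boolean isIdle(ExecutorService pool) {
        if (pool instanceof ThreadPoolExecutor) {
            ThreadPoolExecutor tpe = (ThreadPoolExecutor) pool;
            // No worker is running a task and nothing is waiting in the queue.
            return tpe.getActiveCount() == 0 && tpe.getQueue().isEmpty();
        }
        if (pool instanceof ForkJoinPool) {
            ForkJoinPool fjp = (ForkJoinPool) pool;
            return fjp.isQuiescent() && !fjp.hasQueuedSubmissions();
        }
        return true; // other implementations are not checked in this sketch
    }

    /** Returns true after requiredRounds consecutive idle rounds, false on timeout or interrupt. */
    static boolean awaitQuiescence(List<? extends ExecutorService> pools, int requiredRounds,
                                   long pollMillis, long timeoutMillis) {
        List<ExecutorService> shuffled = new ArrayList<>(pools);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        int consecutive = 0;
        while (true) {
            Collections.shuffle(shuffled); // check in a different random order each round
            if (shuffled.stream().allMatch(PoolQuiescence::isIdle)) {
                if (++consecutive >= requiredRounds) {
                    return true;
                }
            } else {
                consecutive = 0; // the idle rounds must be consecutive
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A call such as PoolQuiescence.awaitQuiescence(pools, 3, 1000, 30000) would mirror the three rounds and one-second pause described above, with an assumed 30-second upper bound so that shutdown cannot hang forever.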

How is it handled in RocketMQ-spring-starter

The official RocketMQ Spring Boot starter: https://github.com/apache/rocketmq-spring

It uses the first access point described above: the consumer container is implemented as a SmartLifecycle (its Phase is the maximum value, so it belongs to the group that is stopped first), and the shutdown logic is added in it:

DefaultRocketMQListenerContainer

@Override
public int getPhase() {
    // Returning Integer.MAX_VALUE only suggests that
    // we will be the first bean to shutdown and last bean to start
    return Integer.MAX_VALUE;
}
@Override
public void stop(Runnable callback) {
    stop();
    callback.run();
}
@Override
public void stop() {
    if (this.isRunning()) {
        if (Objects.nonNull(consumer)) {
            // Shut down the consumer
            consumer.shutdown();
        }
        setRunning(false);
    }
}
