Dubbo graceful online and offline explained in detail

Dubbo graceful online and offline

What is graceful online and offline in Dubbo?

First, what does "graceful" mean?

For a person, it describes behavior that is poised, natural, and elegant.

And what does it mean to a programmer?

Graceful code has the following characteristics:

Good naming: method names follow conventions and convey their meaning at a glance.

Clear structure: the code is clean, long methods are split into several small ones, and comments are used where appropriate.

Reasonable algorithms: performance is good.

And what does graceful mean for an application?

Let's first look at some ungraceful situations:

  • On startup, the service begins serving external requests before it is ready, so many calls fail.
  • On startup, the application's health is not checked, so a faulty instance serves external requests and calls fail.
  • On shutdown, the service is stopped immediately even though some threads are still executing.
  • On shutdown, callers are not notified, so many requests keep arriving.
  • On shutdown, the corresponding monitoring is not turned off, so alarms fire after the application has stopped.

The opposite of each of these situations is the graceful behavior.

Why should Dubbo go online and offline gracefully?

Why go online gracefully (delayed exposure)?

  • Resources need time to get ready. For example, if the service has to initialize cached data, the data must be read from the database and then processed. If that takes 3 seconds, the exposure delay must be longer than 3 seconds.

  • Smooth ramp-up. A service that has just started shows response-time jitter: right after startup, the JVM runs Java code in interpreted mode. Only after a method or loop has executed a certain number of times is it compiled to machine code, which makes it faster. As the program keeps running, the JVM's further optimizations gradually take effect.

Why go offline gracefully?

  • To make sure normal business is not affected. Taking the service on one node offline must not disrupt normal business execution.

Graceful online and offline principles

Graceful online principle

Graceful online is also known as delayed exposure.

In practice it mainly comes down to adding the following configuration (if no delay is needed, you can leave it out):

<dubbo:service delay="delay before exposure, in milliseconds" />

With this configuration, Dubbo delays the service's exposure logic by the configured X milliseconds (as the source code further down shows, the countdown starts in afterPropertiesSet()).

<dubbo:service delay="-1" />

With the configuration above, the service is exposed only after the Spring container has finished initializing.

The delay gives the application time to warm up its code, initialize caches, wait for dependent resources to become available, and so on.
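To make the purpose of the delay concrete, here is a minimal sketch of a readiness gate in plain Java. It is not Dubbo code: the class WarmUpGate and its methods are hypothetical names, and the map stands in for data that a real service would load from a database. The point is only that lookups are rejected until warm-up has finished, which is the state a delayed exposure is meant to wait for.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration, not part of Dubbo.
final class WarmUpGate {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicBoolean ready = new AtomicBoolean(false);

    /** Fills the cache (stand-in for a slow database read), then opens the gate. */
    void warmUp(Map<String, String> source) {
        cache.putAll(source);
        ready.set(true);
    }

    boolean isReady() {
        return ready.get();
    }

    /** Rejects calls that arrive before warm-up has finished. */
    String lookup(String key) {
        if (!ready.get()) {
            throw new IllegalStateException("service is not ready yet");
        }
        return cache.get(key);
    }

    public static void main(String[] args) {
        WarmUpGate gate = new WarmUpGate();
        System.out.println("ready before warm-up: " + gate.isReady());
        gate.warmUp(Map.of("user:1", "Alice"));
        System.out.println("lookup after warm-up: " + gate.lookup("user:1"));
    }
}
```

If the warm-up takes about 3 seconds, a delay somewhat larger than 3000 ms keeps callers from ever seeing the "not ready" state.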

Graceful shutdown principle

Let's look at the principles given in the official Dubbo documentation:

Service provider

  • When stopping, the provider is first marked as not accepting new requests. New requests that still arrive get an error right away, so the client can retry on another machine.
  • Then it checks whether threads in the thread pool are still running. If so, it waits for them to finish; if the timeout expires first, it forces the shutdown.

Service consumer

  • When stopping, the consumer no longer initiates new calls; every new call fails with an error on the client side.
  • Then it checks whether any requests are still waiting for a response. If so, it waits for the responses to return; if the timeout expires first, it forces the shutdown.
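The provider-side rule (stop accepting new work, wait for running work, force shutdown on timeout) can be sketched with the JDK's ExecutorService. This is only an illustration of the principle under the assumption that requests run as tasks in a thread pool; GracefulExecutorShutdown and shutdownGracefully are hypothetical names, not Dubbo's implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of "wait, then force" using only the JDK.
final class GracefulExecutorShutdown {

    /**
     * Stops accepting new tasks, waits up to timeoutMillis for running tasks,
     * then forces shutdown. Returns true if all tasks finished in time.
     */
    static boolean shutdownGracefully(ExecutorService executor, long timeoutMillis) {
        executor.shutdown(); // new submissions are rejected from here on
        try {
            if (executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status
        }
        executor.shutdownNow(); // timed out or interrupted: interrupt the remaining workers
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < 4; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(50); // simulated request handling
                } catch (InterruptedException e) {
                    return;
                }
                completed.incrementAndGet();
            });
        }
        boolean clean = shutdownGracefully(pool, 2000);
        System.out.println("clean=" + clean + ", completed=" + completed.get());
    }
}
```

The timeout is the trade-off knob: too short and in-flight work is cut off, too long and the process lingers during deployment.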

Source code walkthrough: Dubbo delayed exposure

[Figure: rough flow chart of delayed exposure]

Entry point

The following code is located in ServiceBean:

// Event callback triggered once the Spring container has finished initializing
@Override
public void onApplicationEvent(ApplicationEvent event) {
    if (ContextRefreshedEvent.class.getName().equals(event.getClass().getName())) {
        if (isDelay() && ! isExported() && ! isUnexported()) {
            if (logger.isInfoEnabled()) {
                logger.info("The service ready on spring started. service: " + getInterface());
            }
            export();
        }
    }
}

// Note: this returns false when a delay is configured and true when it is not (or is -1)
private boolean isDelay() {
    // delay configured on the service
    Integer delay = getDelay();
    // delay configured on the provider
    ProviderConfig provider = getProvider();
    // fall back to the provider's delay when the service has none
    if (delay == null && provider != null) {
        delay = provider.getDelay();
    }
    // supportedApplicationListener can be treated as true here, so only the second part matters
    return supportedApplicationListener && (delay == null || delay.intValue() == -1);
}

@Override
public void afterPropertiesSet() throws Exception {
    //...
    if (!isDelay()) {
        export();
    }
}

When a ServiceBean is initialized, afterPropertiesSet() runs; when the Spring container finishes initializing, onApplicationEvent() runs. Only one of the two calls export(), and isDelay() decides which.
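The decision made by isDelay() can be restated as a small, self-contained sketch. It assumes supportedApplicationListener is true, as the comment in the source suggests, and uses the hypothetical names ExportTiming, Hook and exportingHook purely for illustration.

```java
// Hypothetical restatement of the isDelay() decision, not Dubbo code.
final class ExportTiming {

    enum Hook { AFTER_PROPERTIES_SET, CONTEXT_REFRESHED }

    /** Mirrors the isDelay() condition: null or -1 means "wait for the context refresh". */
    static boolean waitsForContextRefresh(Integer delay) {
        return delay == null || delay == -1;
    }

    /** Returns the callback in which export() would be invoked for a given delay. */
    static Hook exportingHook(Integer delay) {
        return waitsForContextRefresh(delay) ? Hook.CONTEXT_REFRESHED : Hook.AFTER_PROPERTIES_SET;
    }

    public static void main(String[] args) {
        for (Integer delay : new Integer[] {null, -1, 0, 5000}) {
            System.out.println("delay=" + delay + " -> " + exportingHook(delay));
        }
    }
}
```

Despite its name, isDelay() returning true does not mean a timed delay: it means exposure is deferred to the container-refresh event.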

Implementation

Next, go into ServiceConfig and look at export():

public synchronized void export() {
    if (provider != null) {
        if (export == null) {
            export = provider.getExport();
        }
        // if the service has no delay configured, use the provider's
        if (delay == null) {
            delay = provider.getDelay();
        }
    }
    if (export != null && ! export.booleanValue()) {
        return;
    }
    // if a delay is configured, wait first and then expose; doExport() performs the actual exposure
    if (delay != null && delay > 0) {
        Thread thread = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(delay);
                } catch (Throwable e) {
                }
                doExport();
            }
        });
        thread.setDaemon(true);
        thread.setName("DelayExportServiceThread");
        thread.start();
    } else {
        doExport();
    }
}

The code above handles the delay logic first and then runs the actual exposure logic.

If delay is configured with a positive value, export() is called in afterPropertiesSet(); it starts a new thread that sleeps for the delay and then calls doExport() (the method that actually exposes the service).

If delay is not configured (or is -1), export() is called in onApplicationEvent(), and doExport() is triggered immediately.
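The branch in export() can be sketched outside Dubbo as follows. Note the swapped-in technique: instead of a dedicated thread that calls Thread.sleep, this sketch uses the JDK's CompletableFuture.delayedExecutor (Java 9 or later), which also runs the task on a daemon thread. DelayedExportSketch and its export method are hypothetical names, and the Runnable stands in for doExport().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of "delay, then export"; not Dubbo's implementation.
final class DelayedExportSketch {

    /**
     * Runs doExport immediately when no positive delay is given,
     * otherwise after delayMillis on a JDK-managed daemon thread.
     * The returned future completes once the export has run.
     */
    static CompletableFuture<Void> export(Integer delayMillis, Runnable doExport) {
        if (delayMillis != null && delayMillis > 0) {
            Executor delayed = CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS);
            return CompletableFuture.runAsync(doExport, delayed);
        }
        doExport.run();
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        AtomicBoolean exported = new AtomicBoolean(false);
        CompletableFuture<Void> done = export(200, () -> exported.set(true));
        System.out.println("exported right after the call: " + exported.get());
        done.join(); // wait for the delayed export to run
        System.out.println("exported after waiting: " + exported.get());
    }
}
```

Returning a future is a design choice of the sketch: it lets a caller or a test wait for the exposure, something the fire-and-forget thread in the original code does not offer.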

Source code walkthrough: Dubbo graceful shutdown

[Figure: rough flow chart of graceful shutdown]

Shutdown entry point

The entry code is located in AbstractConfig:

static {
    Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
        public void run() {
            if (logger.isInfoEnabled()) {
                logger.info("Run shutdown hook now.");
            }
            ProtocolConfig.destroyAll();
        }
    }, "DubboShutdownHook"));
}

A shutdown hook is registered here; when the JVM shuts down, it triggers ProtocolConfig.destroyAll().
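For readers unfamiliar with JVM shutdown hooks, the following stand-alone sketch shows the mechanism with only the JDK. ShutdownHookSketch and newHook are hypothetical names; the Runnable stands in for the destroy logic. Hooks run on a normal exit or on a termination signal such as SIGTERM, but not when the process is killed forcibly (for example with kill -9).

```java
// Hypothetical illustration of registering a JVM shutdown hook.
final class ShutdownHookSketch {

    /** Wraps the cleanup logic in an unstarted thread, as addShutdownHook requires. */
    static Thread newHook(Runnable destroyAll) {
        return new Thread(destroyAll, "SketchShutdownHook");
    }

    public static void main(String[] args) {
        Thread hook = newHook(() -> System.out.println("Run shutdown hook now."));
        Runtime.getRuntime().addShutdownHook(hook);
        System.out.println("main finished; the JVM runs the hook while exiting");
    }
}
```

A registered hook can be withdrawn again with Runtime.getRuntime().removeShutdownHook(hook), which returns true if the hook was registered.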

Next, go into ProtocolConfig:

public static void destroyAll() {
    // deregister from the registries
    AbstractRegistryFactory.destroyAll();
    // wait for consumers to receive the notification that this node's services are offline
    try {
        Thread.sleep(ConfigUtils.getServerShutdownTimeout());
    } catch (InterruptedException e) {
        logger.warn("Interrupted unexpectedly when waiting for registry notification during shutdown process!");
    }
    ExtensionLoader<Protocol> loader = ExtensionLoader.getExtensionLoader(Protocol.class);
    for (String protocolName : loader.getLoadedExtensions()) {
        try {
            Protocol protocol = loader.getLoadedExtension(protocolName);
            if (protocol != null) {
                // destroy the protocol
                protocol.destroy();
            }
        } catch (Throwable t) {
            logger.warn(t.getMessage(), t);
        }
    }
}

As the code above shows, the graceful shutdown process has two main parts:

  • Deregistering from the registry
  • Destroying the protocols

Registry deregistration

The code is located in AbstractRegistryFactory:

public static void destroyAll() {
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("Close all registries " + getRegistries());
    }
    // lock the registry shutdown process
    LOCK.lock();
    try {
        for (Registry registry : getRegistries()) {
            try {
                registry.destroy();
            } catch (Throwable e) {
                LOGGER.error(e.getMessage(), e);
            }
        }
        REGISTRIES.clear();
    } finally {
        // release the lock
        LOCK.unlock();
    }
}

registry.destroy() mainly removes this node's service provider addresses from the registry.

Once the provider addresses have been removed from the registry, consumers that subscribe to the service learn that this node's service is offline and stop calling it.

Back in ProtocolConfig, right after the registry deregistration, there is this piece of code:

try {
    Thread.sleep(ConfigUtils.getServerShutdownTimeout());
} catch (InterruptedException e) {
    logger.warn("Interrupted unexpectedly when waiting for registry notification during shutdown process!");
}

The Thread.sleep() here waits for consumers to receive the registry's notification that this node's service is offline. The default wait is 10 seconds, which is an empirical value. Why an empirical value?

  • If the wait is too short, consumers may not yet have received the offline notification from the registry and will keep sending requests to this node.
  • If the wait is too long, it only adds pointless waiting time.

Factors that influence the right value:

  • Cluster size.
  • Choice of registry. Taking Nacos and Zookeeper as examples, Nacos pushes addresses far faster than Zookeeper for the same number of service instances.
  • Network conditions.
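Because the right wait depends on the factors above, it makes sense for the value to be configurable with a sensible default. The sketch below shows one way to resolve such a value from a system property, falling back to 10 seconds. The property name sketch.shutdown.wait.millis and the class ShutdownWaitSketch are made up for this illustration; they are not Dubbo's actual configuration key or code.

```java
// Hypothetical illustration of a configurable shutdown wait with a default.
final class ShutdownWaitSketch {

    // Assumed property name, for illustration only.
    static final String WAIT_KEY = "sketch.shutdown.wait.millis";
    static final long DEFAULT_WAIT_MILLIS = 10_000L;

    /** Reads the wait from a system property; falls back to the default on missing or invalid input. */
    static long resolveWaitMillis() {
        String raw = System.getProperty(WAIT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_WAIT_MILLIS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? value : DEFAULT_WAIT_MILLIS;
        } catch (NumberFormatException e) {
            return DEFAULT_WAIT_MILLIS;
        }
    }

    public static void main(String[] args) {
        System.out.println("wait (ms): " + resolveWaitMillis());
    }
}
```

With this sketch, a small cluster with a fast-pushing registry could be started with -Dsketch.shutdown.wait.millis=3000 to shorten the wait.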

Protocol destruction

ExtensionLoader<Protocol> loader = ExtensionLoader.getExtensionLoader(Protocol.class);
for (String protocolName : loader.getLoadedExtensions()) {
    try {
        Protocol protocol = loader.getLoadedExtension(protocolName);
        if (protocol != null) {
            // destroy the protocol
            protocol.destroy();
        }
    } catch (Throwable t) {
        logger.warn(t.getMessage(), t);
    }
}

The loader mainly returns two protocols here:

  • DubboProtocol: handles remote requests (the remote service exposure protocol); this is the focus of this article
  • InjvmProtocol: handles in-process requests (the local service exposure protocol)

public interface Protocol {
    <T> Exporter<T> export(Invoker<T> invoker) throws RpcException;
    <T> Invoker<T> refer(Class<T> type, URL url) throws RpcException;
    void destroy();
}

In other words, Protocol defines Dubbo's three life cycle methods: export, refer, and destroy.

The protocol.destroy() implementation discussed here is located in DubboProtocol:

public void destroy() {
    // close the servers
    for (String key : new ArrayList<String>(serverMap.keySet())) {
        ExchangeServer server = serverMap.remove(key);
        if (server != null) {
            try {
                if (logger.isInfoEnabled()) {
                    logger.info("Close dubbo server: " + server.getLocalAddress());
                }
                server.close(getServerShutdownTimeout());
            } catch (Throwable t) {
                logger.warn(t.getMessage(), t);
            }
        }
    }

    // close the clients
    for (String key : new ArrayList<String>(referenceClientMap.keySet())) {
        ExchangeClient client = referenceClientMap.remove(key);
        if (client != null) {
            try {
                if (logger.isInfoEnabled()) {
                    logger.info("Close dubbo connect: " + client.getLocalAddress() + "-->" + client.getRemoteAddress());
                }
                client.close();
            } catch (Throwable t) {
                logger.warn(t.getMessage(), t);
            }
        }
    }

    for (String key : new ArrayList<String>(ghostClientMap.keySet())) {
        ExchangeClient client = ghostClientMap.remove(key);
        if (client != null) {
            try {
                if (logger.isInfoEnabled()) {
                    logger.info("Close dubbo connect: " + client.getLocalAddress() + "-->" + client.getRemoteAddress());
                }
                client.close();
            } catch (Throwable t) {
                logger.warn(t.getMessage(), t);
            }
        }
    }
    stubServiceMethodsMap.clear();
    super.destroy();
}

The code above has two main parts: closing the servers and closing the clients. The servers are closed first so that the node stops receiving new requests, which reduces the chance of consumers calling the service again.

Closing the server:

public void close(final int timeout) {
    if (timeout > 0) {
        final long max = (long) timeout;
        final long start = System.currentTimeMillis();
        if (getUrl().getParameter(Constants.CHANNEL_SEND_READONLYEVENT_KEY, false)) {
            sendChannelReadOnlyEvent();
        }
        // while requests are still being processed, wait for them to finish or for the timeout to expire
        while (HeaderExchangeServer.this.isRunning()
                && System.currentTimeMillis() - start < max) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                logger.warn(e.getMessage(), e);
            }
        }
    }
    // stop heartbeat detection
    doClose();
    // close the underlying transport server (NettyServer)
    server.close(timeout);
}

private void doClose() {
    if (!closed.compareAndSet(false, true)) {
        return;
    }
    // essentially sets heartbeatTimer to null
    stopHeartbeatTimer();
    try {
        scheduled.shutdown();
    } catch (Throwable t) {
        logger.warn(t.getMessage(), t);
    }
}

// close(int) of the underlying server, reached through server.close(timeout) above
public void close(int timeout) {
    // The thread pool is shut down first: wait, as far as possible, for its threads to finish,
    // then shut the pool down, and finally close the Netty server.
    ExecutorUtil.gracefulShutdown(executor, timeout);
    close();
}

Closing the server involves three main steps:

  • If requests are still being processed, wait for them to finish or for the timeout to expire.
  • Stop heartbeat detection.
  • Close the NettyServer.
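The first step, waiting until nothing is running or the timeout expires, is a polling loop with a deadline. A minimal stand-alone sketch follows; DrainWithDeadline and awaitIdle are hypothetical names, and the BooleanSupplier stands in for a check such as isRunning().

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of "poll until idle or timed out".
final class DrainWithDeadline {

    /**
     * Polls isRunning every pollMillis until it reports false or timeoutMillis has elapsed.
     * Returns true if the work drained in time, false on timeout or interruption.
     */
    static boolean awaitIdle(BooleanSupplier isRunning, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (isRunning.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while work was still running
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger inFlight = new AtomicInteger(3);
        // Each poll pretends that one in-flight request has completed.
        boolean drained = awaitIdle(() -> inFlight.getAndDecrement() > 0, 1000, 10);
        System.out.println("drained=" + drained);
    }
}
```

The sketch measures elapsed time with System.nanoTime(), which is unaffected by wall-clock adjustments; the quoted source uses System.currentTimeMillis().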

Closing the client works in much the same way. The difference is that if some requests have not yet received a response, the client waits for the responses to return or for the timeout to expire. The rest is the same.

Dubbo graceful shutdown under Spring

Shortcoming

The process above is not well suited to graceful shutdown inside a Spring container, and it has a defect:

The Spring container also uses a shutdown hook for its own graceful shutdown, and that hook runs concurrently with the Dubbo 2.5 graceful shutdown. As a result, the Spring beans that Dubbo's shutdown logic relies on may no longer be available, and the graceful shutdown fails.

Solution

To solve this problem, Dubbo started refactoring this part of the logic in 2.6.x and kept iterating; the 2.7.x version contains the final logic.

The new version adds ShutdownHookListener, which implements Spring's ApplicationListener interface so that it can listen for Spring events. ShutdownHookListener only reacts to the Spring close event: when Spring starts shutting down, the listener's internal logic is triggered.

public class SpringExtensionFactory implements ExtensionFactory {
    private static final Logger logger = LoggerFactory.getLogger(SpringExtensionFactory.class);

    private static final Set<ApplicationContext> CONTEXTS = new ConcurrentHashSet<ApplicationContext>();
    private static final ApplicationListener SHUTDOWN_HOOK_LISTENER = new ShutdownHookListener();

    public static void addApplicationContext(ApplicationContext context) {
        CONTEXTS.add(context);
        if (context instanceof ConfigurableApplicationContext) {
            // register Spring's shutdown hook
            ((ConfigurableApplicationContext) context).registerShutdownHook();
            // unregister the shutdown hook registered by AbstractConfig
            DubboShutdownHook.getDubboShutdownHook().unregister();
        }
        BeanFactoryUtils.addApplicationListener(context, SHUTDOWN_HOOK_LISTENER);
    }

    // implements ApplicationListener; this listener reacts to the container close event
    private static class ShutdownHookListener implements ApplicationListener {
        @Override
        public void onApplicationEvent(ApplicationEvent event) {
            if (event instanceof ContextClosedEvent) {
                DubboShutdownHook shutdownHook = DubboShutdownHook.getDubboShutdownHook();
                shutdownHook.doDestroy();
            }
        }
    }
}

public abstract class AbstractConfig implements Serializable {
    static {
        Runtime.getRuntime().addShutdownHook(DubboShutdownHook.getDubboShutdownHook());
    }
}

When the Spring container starts initializing, the SpringExtensionFactory logic is triggered. Its addApplicationContext method unregisters the shutdown hook registered by AbstractConfig and adds SHUTDOWN_HOOK_LISTENER. This avoids the problem described above, in which Spring's and Dubbo's shutdown hooks run concurrently.
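The idea of keeping a single shutdown path can be sketched without Spring or Dubbo. In this illustration, a JVM hook is registered first, then withdrawn when a container takes over, and the destroy logic is guarded so that it runs at most once no matter who triggers it. All names (SingleShutdownPath, handOverToContainer, onContainerClosed) are hypothetical, and the guard with AtomicBoolean is an assumption of the sketch, modeled on the compareAndSet check seen in doClose() earlier.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of one shutdown path with an at-most-once guard.
final class SingleShutdownPath {
    private final AtomicBoolean destroyed = new AtomicBoolean(false);
    private final AtomicInteger runs = new AtomicInteger();
    private final Thread jvmHook = new Thread(this::destroy, "SketchJvmHook");

    /** Default path: clean up when the JVM exits. */
    void registerJvmHook() {
        Runtime.getRuntime().addShutdownHook(jvmHook);
    }

    /** A container takes over: withdraw the JVM hook. Returns true if it was registered. */
    boolean handOverToContainer() {
        return Runtime.getRuntime().removeShutdownHook(jvmHook);
    }

    /** Stand-in for the container's close-event callback. */
    void onContainerClosed() {
        destroy();
    }

    /** Runs the cleanup at most once, even if triggered from several places. */
    void destroy() {
        if (!destroyed.compareAndSet(false, true)) {
            return;
        }
        runs.incrementAndGet(); // real cleanup would go here
    }

    int destroyRuns() {
        return runs.get();
    }

    public static void main(String[] args) {
        SingleShutdownPath path = new SingleShutdownPath();
        path.registerJvmHook();
        System.out.println("hook withdrawn: " + path.handOverToContainer());
        path.onContainerClosed();
        path.onContainerClosed(); // a second trigger is ignored
        System.out.println("destroy ran " + path.destroyRuns() + " time(s)");
    }
}
```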

Summary

This article walked through Dubbo's graceful online and offline processes using flow charts and source code. The flow charts summarize each process and break it into parts to make it easier to understand.

References

  • "One article to talk about Dubbo gracefully shut down" https://www.cnkirito.moe/dubbo-gracefully-shutdown/

  • "Dubbo's Evolution of Graceful Shutdown" https://juejin.cn/post/6844903986277908488

  • "In-depth analysis of dubbo delay exposure" https://www.jianshu.com/p/0ce318f98e74

Origin blog.csdn.net/weixin_43589025/article/details/118485400