Refreshing the Nacos service list in real time with Spring Cloud and OpenFeign (Ribbon load balancer solution)

Foreword

  • When Spring Cloud is used with OpenFeign and Ribbon as the load balancer, a common problem shows up when pulling the service list from the Nacos registry. Right after a downstream service starts or restarts, it can be unreachable for a while, because Ribbon pulls the Nacos service addresses with a timer thread that runs every 30 s by default. In other words, callers may not see the new instance for up to 30 s after it comes online, which is painful. Whether in offline testing or in production, we want the service list as soon as possible. Here are two solutions.

Error message

If the service list cannot be obtained, the call fails with com.netflix.client.ClientException: Load balancer does not have available server for client: <service name>.

Solution 1 (subscribe to Nacos service status changes, actively update the local service list)

Core code implementation

import com.alibaba.cloud.nacos.NacosDiscoveryProperties;
import com.alibaba.nacos.api.exception.NacosException;
import com.alibaba.nacos.api.naming.NamingService;
import com.netflix.loadbalancer.ILoadBalancer;
import com.netflix.loadbalancer.ZoneAwareLoadBalancer;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.openfeign.ribbon.CachingSpringLoadBalancerFactory;
import org.springframework.cloud.openfeign.ribbon.FeignLoadBalancer;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.util.Arrays;
import java.util.List;

/**
 * @author kerwin
 */
@Slf4j
@Component
public class NacosServerStatusListener {

    // Used to refresh the service addresses held by Ribbon when the application calls services through Feign.
    // This bean is registered by FeignRibbonClientAutoConfiguration.
    @Autowired
    private CachingSpringLoadBalancerFactory cachingSpringLoadBalancerFactory;
    // Nacos registry configuration; the NamingService we need can also be obtained from it.
    @Autowired
    private NacosDiscoveryProperties nacosDiscoveryProperties;
    // Names of the services to subscribe to; these could be kept in the Nacos config center.
    private List<String> serviceList = Arrays.asList("shopping-customer", "shopping-merchant", "shopping-order");

    @PostConstruct
    public void init() {
        try {
            // Obtain the NamingService
            NamingService namingService = nacosDiscoveryProperties.namingServiceInstance();
            // Subscribe to each service and refresh Ribbon whenever its status changes
            serviceList.forEach(service -> {
                try {
                    namingService.subscribe(service, event -> {
                        // Create the Feign load balancer; if it already exists, the cached one is returned
                        FeignLoadBalancer feignLoadBalancer = cachingSpringLoadBalancerFactory.create(service);
                        if (feignLoadBalancer != null) {
                            log.info("Refreshing Ribbon service instances: {}", service);
                            ILoadBalancer iLoadBalancer = feignLoadBalancer.getLoadBalancer();
                            // Check the type before casting so a different load balancer does not cause a ClassCastException
                            if (iLoadBalancer instanceof ZoneAwareLoadBalancer) {
                                ((ZoneAwareLoadBalancer<?>) iLoadBalancer).updateListOfServers();
                                log.info("Ribbon service instances refreshed: {}", service);
                            }
                        }
                    });
                } catch (NacosException e) {
                    // Passing the exception as the last argument logs the stack trace as well
                    log.error("Failed to subscribe to Nacos service, error: {}", e.getErrMsg(), e);
                }
            });
        } catch (Exception e) {
            log.error("Failed to obtain Nacos service information, error: {}", e.getMessage(), e);
        }
    }
}

Principle analysis

  • 1. Subscribing to Nacos service status changes
    • NacosDiscoveryProperties is injected from the IoC container, and the NamingService is obtained from it. Calling the subscribe method of NamingService subscribes to one service. It takes two parameters: the first is the service name and the second is a function object (the listener). Suppose we subscribe to the order service, order-service: whenever an instance goes online or offline, or more instances of the order service are started, the listener is notified, and inside it we can manually update Ribbon's service list.
  • 2. Which services to subscribe to, and where GatewayProperties fits
    • In principle, every downstream service that is called has to be subscribed by name, because the subscribe method of NamingService only watches a single service. The code above keeps the names in a fixed list, and they can also be specified through a configuration file. On the Gateway side, a convenient alternative is to inject the GatewayProperties object directly. It holds the parsed Gateway route configuration, and getRoutes returns all routes. Since the route id is usually set to the service name, using it directly as the subscription list is a good choice.
  • 3. What CachingSpringLoadBalancerFactory does
    • The factory class CachingSpringLoadBalancerFactory creates, caches, and returns load balancers. It is registered in the IoC container by FeignRibbonClientAutoConfiguration, so it can be injected directly. For a client declared with @FeignClient(value = "service name"), calling CachingSpringLoadBalancerFactory.create("service name") returns the corresponding FeignLoadBalancer, and feignLoadBalancer.getLoadBalancer() returns the Ribbon load balancer that does the real work. Here that load balancer is a ZoneAwareLoadBalancer, which has an updateListOfServers() method for refreshing the service list. Getting the load balancer of the changed service and calling that method is what updates the service list in real time.
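
The three points above can be tried out without Spring, Nacos, or Ribbon on the classpath. The following is only a minimal sketch under that assumption: ToyRegistry, ToyLoadBalancer, and ToyLoadBalancerFactory are hypothetical stand-ins for NamingService, ZoneAwareLoadBalancer, and CachingSpringLoadBalancerFactory, and they model only the subscribe, cache, and refresh steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a registry that supports per-service subscriptions. */
class ToyRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, List<Runnable>> listeners = new ConcurrentHashMap<>();

    /** Registers a listener that is notified only about changes of the given service. */
    void subscribe(String service, Runnable listener) {
        listeners.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(listener);
    }

    List<String> getInstances(String service) {
        return instances.getOrDefault(service, Collections.emptyList());
    }

    /** Replaces the instance list of one service and notifies its subscribers. */
    void publish(String service, List<String> addresses) {
        instances.put(service, Collections.unmodifiableList(new ArrayList<>(addresses)));
        listeners.getOrDefault(service, Collections.emptyList()).forEach(Runnable::run);
    }
}

/** Hypothetical stand-in for a per-service load balancer that caches its server list. */
class ToyLoadBalancer {
    private final String service;
    private final ToyRegistry registry;
    private volatile List<String> servers = Collections.emptyList();

    ToyLoadBalancer(String service, ToyRegistry registry) {
        this.service = service;
        this.registry = registry;
    }

    /** Re-reads the instances of this load balancer's own service from the registry. */
    void updateListOfServers() {
        servers = registry.getInstances(service);
    }

    List<String> getServers() {
        return servers;
    }
}

/** Hypothetical stand-in for a caching factory: one load balancer per service name. */
class ToyLoadBalancerFactory {
    private final ToyRegistry registry;
    private final Map<String, ToyLoadBalancer> cache = new ConcurrentHashMap<>();

    ToyLoadBalancerFactory(ToyRegistry registry) {
        this.registry = registry;
    }

    /** Creates the load balancer on first use and returns the cached one afterwards. */
    ToyLoadBalancer create(String service) {
        return cache.computeIfAbsent(service, s -> new ToyLoadBalancer(s, registry));
    }
}

class EventDrivenRefreshDemo {
    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        ToyLoadBalancerFactory factory = new ToyLoadBalancerFactory(registry);
        String service = "shopping-order";

        // On every change event, refresh the load balancer that belongs to that service.
        registry.subscribe(service, () -> factory.create(service).updateListOfServers());

        System.out.println("before: " + factory.create(service).getServers());
        registry.publish(service, Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080"));
        System.out.println("after:  " + factory.create(service).getServers());
    }
}
```

In this model the cached list changes as soon as the registry publishes an event, with no timer involved, which is the effect Solution 1 aims for.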

Solution 2 (shorten the interval at which Ribbon pulls the Nacos service list)

If the first solution seems like too much trouble and you do not need that level of real-time behavior, the second solution is a good choice.

  • As mentioned above, you can change how often Ribbon pulls the service list. By default it is pulled once every 30 s; adjust the value to suit your situation.
ribbon:
  ServerListRefreshInterval: 15000 # interval between service list refreshes, default 30000 ms
  • The Nacos configuration matters as well, because the Nacos heartbeat and offline mechanism also affects how up to date the service list is. The heartbeat and offline timing can be configured as follows:
spring:
  cloud:
    nacos:
      discovery:
        metadata:
          preserved.heart.beat.interval: 5000 # interval at which this instance reports heartbeats from the client, default 5000 (unit: ms)
          preserved.heart.beat.timeout: 5000 # time without heartbeats before this instance goes from healthy to unhealthy, default 15000 (unit: ms)
          preserved.ip.delete.timeout: 20000 # time without heartbeats before Nacos removes this instance, default 30000 (unit: ms)
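
To see how these settings add up, here is a rough worst-case estimate. It is only a sketch under simplifying assumptions that are not taken from the original article: the consumer drops an instance once Nacos marks it unhealthy, and the granularity of Nacos's own health checks and any push delay are ignored. StalenessEstimate and its method names are hypothetical.

```java
/** Rough staleness estimates; a simplified model, not a description of Nacos or Ribbon internals. */
class StalenessEstimate {

    /** Worst case for a new instance: it registers right after a Ribbon refresh has just run. */
    static long newInstanceVisibleWithinMs(long ribbonRefreshIntervalMs) {
        return ribbonRefreshIntervalMs;
    }

    /** Worst case for a crashed instance: the heartbeat timeout elapses, then one full refresh interval. */
    static long crashedInstanceDroppedWithinMs(long heartBeatTimeoutMs, long ribbonRefreshIntervalMs) {
        return heartBeatTimeoutMs + ribbonRefreshIntervalMs;
    }

    public static void main(String[] args) {
        // Values from the configuration in this section: 15000 ms refresh, 5000 ms heartbeat timeout.
        System.out.println(newInstanceVisibleWithinMs(15_000));            // prints 15000
        System.out.println(crashedInstanceDroppedWithinMs(5_000, 15_000)); // prints 20000
    }
}
```

Under this model, lowering only the Ribbon interval helps new instances show up sooner, while a crashed instance is still bounded by the heartbeat timeout as well.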

A major pitfall

You may have seen a blog post that recommends extending the PollingServerListUpdater class, overriding its start method to keep a reference to the UpdateAction object in a member variable, registering the subclass in the IoC container, and then calling UpdateAction.doUpdate() from your own Nacos listener to update the service list.

If you do this, you may as well start looking for a new job ^^. As soon as more than one downstream service is involved, there will be a big problem. It only works if there is a single downstream service.

Incorrect code example

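import com.netflix.loadbalancer.PollingServerListUpdater;
import org.springframework.stereotype.Component;

// Do NOT use this: a single shared updater bean breaks as soon as there are several downstream services.
@Component
public class MyPollingServerListUpdater extends PollingServerListUpdater {

    // Overwritten on every start() call, so only the last caller's action is kept
    private UpdateAction updateAction;

    @Override
    public synchronized void start(UpdateAction updateAction) {
        this.updateAction = updateAction;
        super.start(updateAction);
    }

    public UpdateAction getUpdateAction() {
        return updateAction;
    }
}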

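import com.alibaba.cloud.nacos.NacosDiscoveryProperties;
import com.alibaba.nacos.api.exception.NacosException;
import com.alibaba.nacos.api.naming.NamingService;
import com.netflix.loadbalancer.ServerListUpdater;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.util.Arrays;
import java.util.List;

// Do NOT use this: it relies on the shared MyPollingServerListUpdater above.
@Slf4j
@Component
public class NacosServerStatusListener {

    @Autowired
    private MyPollingServerListUpdater listUpdater;
    // Nacos registry configuration; the NamingService we need can also be obtained from it.
    @Autowired
    private NacosDiscoveryProperties nacosDiscoveryProperties;
    // Names of the services to subscribe to; these could be kept in the Nacos config center.
    private List<String> serviceList = Arrays.asList("shopping-customer", "shopping-merchant", "shopping-order");

    @PostConstruct
    public void init() {
        try {
            // Obtain the NamingService
            NamingService namingService = nacosDiscoveryProperties.namingServiceInstance();
            // Subscribe to each service and refresh Ribbon whenever its status changes
            serviceList.forEach(service -> {
                try {
                    namingService.subscribe(service, event -> {
                        // The same UpdateAction is returned no matter which service changed
                        ServerListUpdater.UpdateAction updateAction = this.listUpdater.getUpdateAction();
                        if (updateAction != null) {
                            log.info("Ribbon refreshing service: {}", service);
                            updateAction.doUpdate();
                        }
                    });
                } catch (NacosException e) {
                    log.error("Failed to subscribe to Nacos service, error: {}", e.getErrMsg(), e);
                }
            });
        } catch (Exception e) {
            log.error("Failed to obtain Nacos service information, error: {}", e.getMessage(), e);
        }
    }
}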

Why it is wrong

This approach has two problems. I will not walk through the source code here and will only explain the causes.

First, each downstream service initializes its own load balancer, and each load balancer normally manages its own PollingServerListUpdater object. The start method of that object starts a timer thread that pulls the service list of the corresponding service only, not of all services. For example, if your route id is order and the uri is lb://order, this timer thread pulls only the service list of the order service. Here, however, we instantiate one PollingServerListUpdater ourselves and register it in the IoC container, so every load balancer created afterwards uses that same object. The catch is that the start method checks whether it has already been called and, if so, does not create another timer thread. If two services share this PollingServerListUpdater object, only the first caller gets a timer thread and the later callers do not. For every load balancer except the first one created, the service list is no longer pulled periodically.

The second problem is that UpdateAction is stored in a member variable. Because all load balancers call the start method of the same PollingServerListUpdater object, each later call replaces the stored UpdateAction, and whichever load balancer calls last owns it. As a result, only the last load balancer to be loaded has its service list updated in real time.
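
Both problems can be reproduced in a small model. This is only a sketch: ToyPollingUpdater and SharedToyUpdater are hypothetical stand-ins for PollingServerListUpdater and the subclass above, Runnable stands in for UpdateAction, and the extra service argument exists only so the demo can show which caller ends up with which behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Simplified model of an updater whose start() only takes effect on the first call. */
class ToyPollingUpdater {
    private final AtomicBoolean active = new AtomicBoolean(false);
    /** Services for which a periodic pull was actually scheduled. */
    final List<String> scheduled = new ArrayList<>();

    synchronized void start(String service, Runnable action) {
        if (active.compareAndSet(false, true)) {
            scheduled.add(service); // stands in for creating the timer thread
        }
        // Otherwise the updater is already active and the call is silently a no-op.
    }
}

/** Model of the shared subclass: it keeps the most recent action in a field. */
class SharedToyUpdater extends ToyPollingUpdater {
    private Runnable updateAction;

    @Override
    synchronized void start(String service, Runnable action) {
        this.updateAction = action; // replaced by every caller
        super.start(service, action);
    }

    Runnable getUpdateAction() {
        return updateAction;
    }
}

class SharedUpdaterPitfallDemo {
    public static void main(String[] args) {
        SharedToyUpdater shared = new SharedToyUpdater();
        List<String> refreshed = new ArrayList<>();

        // Two load balancers start the same shared updater, one after the other.
        shared.start("shopping-customer", () -> refreshed.add("shopping-customer"));
        shared.start("shopping-order", () -> refreshed.add("shopping-order"));

        // A change event arrives for shopping-customer, but the stored action belongs to shopping-order.
        shared.getUpdateAction().run();

        System.out.println("periodic pull scheduled for: " + shared.scheduled); // [shopping-customer]
        System.out.println("refreshed by the event: " + refreshed);             // [shopping-order]
    }
}
```

In the model, the first service keeps its periodic pull but never gets an event-driven refresh, and the last service gets event-driven refreshes but no periodic pull, which matches the two problems described above.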



Origin blog.csdn.net/weixin_44606481/article/details/129949514