Grayscale release in a Spring Cloud microservice project with Nacos and a load balancer: a full-link grayscale design with demo source code

Concept

Grayscale release, also called canary release, is a release method that transitions smoothly between the old version and the new one. A/B testing is one form of grayscale release: some users keep using A while others start using B, and if users have no objection to B, the scope is gradually expanded until all users have been migrated to B. Grayscale release helps keep the overall system stable, because problems can be found and fixed during the initial grayscale phase, which limits their impact. What is usually called a canary deployment is also a form of grayscale release.
On the server side, finer control is possible in practice. For example, give the first 10 updated servers a lower weight to limit the number of requests sent to them, then gradually raise the weight and the request volume. This smooth-transition control is called "traffic splitting".
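
As a rough illustration of weight-based traffic splitting, here is a minimal sketch in plain Java. It is not part of the demo project: the class name WeightedPicker and its methods are hypothetical, and real registries and load balancers implement weighting in their own way.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper: picks a server with probability proportional to its weight. */
final class WeightedPicker {
    private final Map<String, Integer> weights = new LinkedHashMap<>();
    private final Random random;

    WeightedPicker(Random random) {
        this.random = random;
    }

    /** Registers or updates a server; a weight of 0 takes it out of rotation. */
    void setWeight(String server, int weight) {
        if (weight < 0) {
            throw new IllegalArgumentException("weight must be >= 0");
        }
        weights.put(server, weight);
    }

    /** Returns a server chosen in proportion to its weight, or null if all weights are 0. */
    String pick() {
        int total = 0;
        for (int w : weights.values()) {
            total += w;
        }
        if (total == 0) {
            return null;
        }
        int point = random.nextInt(total); // uniform in [0, total)
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            point -= e.getValue();
            if (point < 0) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("unreachable: weights sum to total");
    }
}
```

Starting the newly updated servers at a low weight and raising it step by step sends them a growing share of requests, which is the smooth transition described above.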

Component versions

We have been running this approach in our project for two and a half years, so the versions we use are not very new. The demo uses the same versions, since they are the ones I am familiar with. Readers on newer versions can adapt it themselves: the implementation idea is the same, but the source code of these frameworks may have changed.

  • spring-boot: 2.3.12.RELEASE
  • spring-cloud-dependencies: Hoxton.SR12
  • spring-cloud-alibaba-dependencies: 2.2.9.RELEASE

[Figure: Spring Cloud version compatibility chart]

Core components

  • Registry: Nacos
  • Gateway: Spring Cloud Gateway
  • Load balancer: Ribbon (an implementation based on Spring Cloud LoadBalancer would be similar)
  • RPC calls between services: OpenFeign

Grayscale release code implementation

There are many technical solutions for grayscale release in a Spring Cloud project. The key is service discovery: how to send grayscale traffic only to grayscale services. Here Nacos serves as both the registry and the configuration center. The core idea is to set a version value in the Nacos metadata and, when calling a downstream service, use that value to decide which version to call. Some steps are omitted here; the source code address is given at the end of the article for anyone who needs it.

Code structure

This is a demo project, so the structure is kept as simple as possible.

  • spring-cloud-gray-example // parent project
    • kerwin-common // common module
    • kerwin-gateway // microservice gateway
    • kerwin-order // order module
      • order-app // order business service
    • kerwin-starter // custom Spring Boot starter module
      • spring-cloud-starter-kerwin-gray // grayscale release starter (the core code is here)
    • kerwin-user // user module
      • user-app // user business service
      • user-client // user client (Feign and DTO)

Structure of the core package spring-cloud-starter-kerwin-gray

[Figure: package structure of spring-cloud-starter-kerwin-gray]

Entry point: grayscale release design in Spring Cloud Gateway (the basic information classes are described later)

When a request enters the gateway, the gateway decides whether it should go to the grayscale version. This is implemented with Spring Cloud Gateway filters. For the call to the downstream service, a custom Ribbon load-balancing rule checks the grayscale status at call time.

Request grayscale flag holder (also used by the business services)

A ThreadLocal records the grayscale flag of each request thread; the flag is stored in the ThreadLocal when the request arrives.

public class GrayFlagRequestHolder {
    /**
     * Marks whether the grayscale version should be used.
     * See {@link com.kerwin.gray.enums.GrayStatusEnum} for details.
     */
    private static final ThreadLocal<GrayStatusEnum> grayFlag = new ThreadLocal<>();

    public static void setGrayTag(final GrayStatusEnum tag) {
        grayFlag.set(tag);
    }

    public static GrayStatusEnum getGrayTag() {
        return grayFlag.get();
    }

    public static void remove() {
        grayFlag.remove();
    }
}
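
Why the holder needs a remove() method can be shown with a small stand-alone sketch. It uses only the JDK and is not taken from the demo project; the class ThreadLocalCleanupSketch and its String flag are hypothetical stand-ins for GrayFlagRequestHolder. Server threads are pooled, so a value left in a ThreadLocal by one request is still visible to the next request handled by the same thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanupSketch {
    private static final ThreadLocal<String> FLAG = new ThreadLocal<>();

    /** Runs two "requests" on the same pooled thread and returns what the second one sees. */
    static String flagSeenBySecondRequest(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First request: sets the flag and, optionally, clears it in finally.
            Runnable first = () -> {
                try {
                    FLAG.set("GRAY");
                } finally {
                    if (cleanUp) {
                        FLAG.remove();
                    }
                }
            };
            pool.submit(first).get();
            // Second request: never sets a flag, only reads it.
            Callable<String> second = FLAG::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without cleanup the second request inherits the stale "GRAY" flag from the first; with cleanup it sees null. This is the reason the gateway filters and interceptors described in this article call remove() at the end of each request.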

Pre-filter

The pre-filter decides whether the request should use the grayscale version and stores the resulting GrayStatusEnum value in GrayFlagRequestHolder. The load balancer later reads this value to determine which version of the service to call. The filter also implements the Ordered interface, which the gateway uses to sort its filters. Its order is set to Ordered.HIGHEST_PRECEDENCE (Integer.MIN_VALUE) so that it runs first.

public class GrayGatewayBeginFilter implements GlobalFilter, Ordered {
    @Autowired
    private GrayGatewayProperties grayGatewayProperties;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        GrayStatusEnum grayStatusEnum = GrayStatusEnum.ALL;
        // Only inspect the request when the grayscale switch is on
        if (grayGatewayProperties.getEnabled()) {
            grayStatusEnum = GrayStatusEnum.PROD;
            // Decide whether the grayscale version should be called
            if (checkGray(exchange.getRequest())) {
                grayStatusEnum = GrayStatusEnum.GRAY;
            }
        }
        GrayFlagRequestHolder.setGrayTag(grayStatusEnum);
        // Pass the decision on to downstream services in the unified grayscale header
        ServerHttpRequest newRequest = exchange.getRequest().mutate()
                .header(GrayConstant.GRAY_HEADER, grayStatusEnum.getVal())
                .build();
        ServerWebExchange newExchange = exchange.mutate()
                .request(newRequest)
                .build();
        return chain.filter(newExchange);
    }

    /**
     * Checks whether the grayscale version should be used.
     */
    private boolean checkGray(ServerHttpRequest request) {
        return checkGrayHeadKey(request) || checkGrayIPList(request)
                || checkGrayCityList(request) || checkGrayUserNoList(request);
    }

    /**
     * Checks the custom grayscale request header.
     */
    private boolean checkGrayHeadKey(ServerHttpRequest request) {
        HttpHeaders headers = request.getHeaders();
        if (headers.containsKey(grayGatewayProperties.getGrayHeadKey())) {
            List<String> grayValues = headers.get(grayGatewayProperties.getGrayHeadKey());
            if (!Objects.isNull(grayValues)
                    && grayValues.size() > 0
                    && grayGatewayProperties.getGrayHeadValue().equals(grayValues.get(0))) {
                return true;
            }
        }
        return false;
    }

    /**
     * Checks the client IP against the configured grayscale IP list.
     */
    private boolean checkGrayIPList(ServerHttpRequest request) {
        List<String> grayIPList = grayGatewayProperties.getGrayIPList();
        if (CollectionUtils.isEmpty(grayIPList)) {
            return false;
        }
        String realIP = resolveClientIp(request);
        return realIP != null && grayIPList.contains(realIP);
    }

    /**
     * Checks the client's city against the configured grayscale city list.
     */
    private boolean checkGrayCityList(ServerHttpRequest request) {
        List<String> grayCityList = grayGatewayProperties.getGrayCityList();
        if (CollectionUtils.isEmpty(grayCityList)) {
            return false;
        }
        String realIP = resolveClientIp(request);
        // Resolve the city name from the IP.
        // The lookup is too long to show here; ip2region.xdb can be used to implement it.
        // For the demo the city name is hard-coded as "本地" ("local"), so realIP is unused.
        String cityName = "本地";
        return cityName != null && grayCityList.contains(cityName);
    }

    /**
     * Checks the user number against the configured grayscale user list.
     * (Our system does not resolve the user number at the gateway; implement this if you need it.)
     */
    private boolean checkGrayUserNoList(ServerHttpRequest request) {
        List<String> grayUserNoList = grayGatewayProperties.getGrayUserNoList();
        if (CollectionUtils.isEmpty(grayUserNoList)) {
            return false;
        }
        return false;
    }

    /**
     * Resolves the client IP: prefers the X-Real-IP header, falls back to the remote address.
     */
    private String resolveClientIp(ServerHttpRequest request) {
        String realIP = request.getHeaders().getFirst("X-Real-IP");
        if ((realIP == null || realIP.isEmpty()) && request.getRemoteAddress() != null) {
            realIP = request.getRemoteAddress().getAddress().getHostAddress();
        }
        return realIP;
    }

    @Override
    public int getOrder() {
        // Filter execution order: the smaller the value, the earlier the filter runs
        return Ordered.HIGHEST_PRECEDENCE;
    }
}
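
The decision made by the pre-filter can also be written without any gateway types, which makes it easy to unit test. The following is only a sketch under that assumption: GrayDecisionSketch, its Status enum, and the decide method are hypothetical names, and only the header and IP checks are modeled.

```java
import java.util.List;

final class GrayDecisionSketch {
    enum Status { ALL, PROD, GRAY }

    /**
     * Mirrors the pre-filter's decision: switch off -> ALL,
     * header or IP match -> GRAY, otherwise PROD.
     */
    static Status decide(boolean enabled,
                         String headerValue,
                         String expectedHeaderValue,
                         String clientIp,
                         List<String> grayIps) {
        if (!enabled) {
            return Status.ALL;
        }
        boolean headerMatch = expectedHeaderValue != null
                && expectedHeaderValue.equals(headerValue);
        boolean ipMatch = clientIp != null && grayIps.contains(clientIp);
        return (headerMatch || ipMatch) ? Status.GRAY : Status.PROD;
    }
}
```

For example, decide(true, "gray-996", "gray-996", "10.0.0.1", ips) yields GRAY because the header matches, while the same call with enabled set to false yields ALL.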

Post-filter

The post-filter clears the ThreadLocal in GrayFlagRequestHolder before the response is returned, to avoid memory leaks.

public class GrayGatewayAfterFilter implements GlobalFilter, Ordered {
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        // The current thread's ThreadLocal must be removed once the request has been processed
        GrayFlagRequestHolder.remove();
        return chain.filter(exchange);
    }

    @Override
    public int getOrder() {
        // Filter execution order: the smaller the value, the earlier the filter runs
        return Ordered.LOWEST_PRECEDENCE;
    }
}

Global exception handler

The global exception handler clears the ThreadLocal in GrayFlagRequestHolder when an exception occurs, to avoid memory leaks: if the call to a downstream business service throws, the request never reaches the post-filter.

public class GrayGatewayExceptionHandler implements WebExceptionHandler, Ordered {
    @Override
    public Mono<Void> handle(ServerWebExchange exchange, Throwable ex) {
        // The current thread's ThreadLocal must be removed once the request has been processed
        GrayFlagRequestHolder.remove();
        ServerHttpResponse response = exchange.getResponse();
        if (ex instanceof ResponseStatusException) {
            // Handle ResponseStatusException
            ResponseStatusException responseStatusException = (ResponseStatusException) ex;
            response.setStatusCode(responseStatusException.getStatus());
            // Response headers etc. can be set here if needed
            return response.setComplete();
        } else {
            // Handle all other exceptions
            response.setStatusCode(HttpStatus.INTERNAL_SERVER_ERROR);
            // Response headers etc. can be set here if needed
            return response.setComplete();
        }
    }

    @Override
    public int getOrder() {
        // Handler execution order: the smaller the value, the earlier the handler runs
        return Ordered.HIGHEST_PRECEDENCE;
    }
}

Custom Ribbon load-balancing rule (also used by the business services)

  • Abstract class for grayscale Ribbon load-balancing rules: it provides two methods for obtaining the server list, both of which check the current thread's grayscale status stored in GrayFlagRequestHolder. If the value is GrayStatusEnum.ALL, all servers are returned regardless of version; if it is GrayStatusEnum.PROD, only production-version servers are returned; if it is GrayStatusEnum.GRAY, only grayscale-version servers are returned. The version numbers are configured in GrayVersionProperties, and a server matches when the version set in its Nacos metadata (metadata.version) equals the configured version number.
public abstract class AbstractGrayLoadBalancerRule extends AbstractLoadBalancerRule {
    @Autowired
    private GrayVersionProperties grayVersionProperties;

    @Value("${spring.cloud.nacos.discovery.metadata.version}")
    private String metaVersion;

    /**
     * Servers that are up and reachable, filtered by the grayscale flag.
     */
    public List<Server> getReachableServers() {
        ILoadBalancer lb = getLoadBalancer();
        if (lb == null) {
            return new ArrayList<>();
        }
        List<Server> reachableServers = lb.getReachableServers();
        return getGrayServers(reachableServers);
    }

    /**
     * All known servers, reachable or not, filtered by the grayscale flag.
     */
    public List<Server> getAllServers() {
        ILoadBalancer lb = getLoadBalancer();
        if (lb == null) {
            return new ArrayList<>();
        }
        List<Server> allServers = lb.getAllServers();
        return getGrayServers(allServers);
    }

    /**
     * Filters the server list according to the grayscale flag of the current request.
     */
    protected List<Server> getGrayServers(List<Server> servers) {
        List<Server> result = new ArrayList<>();
        if (servers == null) {
            return result;
        }
        // Without a flag, fall back to the version of the current service
        String currentVersion = metaVersion;
        GrayStatusEnum grayStatusEnum = GrayFlagRequestHolder.getGrayTag();
        if (grayStatusEnum != null) {
            switch (grayStatusEnum) {
                case ALL:
                    return servers;
                case PROD:
                    currentVersion = grayVersionProperties.getProdVersion();
                    break;
                case GRAY:
                    currentVersion = grayVersionProperties.getGrayVersion();
                    break;
                default:
                    break;
            }
        }

        for (Server server : servers) {
            // Only Nacos servers carry the metadata needed for version matching
            if (!(server instanceof NacosServer)) {
                continue;
            }
            NacosServer nacosServer = (NacosServer) server;
            Map<String, String> metadata = nacosServer.getMetadata();
            String version = metadata.get("version");
            // Keep the server if its metadata version equals the requested version
            if (version != null && version.equals(currentVersion)) {
                result.add(server);
            }
        }
        return result;
    }
}

  • Custom round-robin rule GrayRoundRobinRule: the code is too long to include in full, so only a fragment is shown. I copied Ribbon's round-robin algorithm directly and replaced the way it obtains the server list with the methods of the custom AbstractGrayLoadBalancerRule. Other algorithms can be implemented in a similar way.

[Figure: GrayRoundRobinRule code fragment]
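
Since the GrayRoundRobinRule source is not reproduced in the text, here is a simplified, JDK-only sketch of the same idea: filter the instances by version first, then rotate over what is left. It is not Ribbon's actual RoundRobinRule and not the demo's code. The Instance class and the method names are hypothetical, and the retry and liveness checks of the real rule are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class GrayRoundRobinSketch {
    /** Hypothetical stand-in for a registry instance: an address plus its metadata. */
    static final class Instance {
        final String address;
        final Map<String, String> metadata;

        Instance(String address, Map<String, String> metadata) {
            this.address = address;
            this.metadata = metadata;
        }
    }

    private final AtomicInteger counter = new AtomicInteger();

    /** Keeps only the instances whose metadata "version" equals the wanted version. */
    static List<Instance> filterByVersion(List<Instance> all, String version) {
        List<Instance> result = new ArrayList<>();
        for (Instance instance : all) {
            if (version.equals(instance.metadata.get("version"))) {
                result.add(instance);
            }
        }
        return result;
    }

    /** Picks the next matching instance in round-robin order, or null if none match. */
    Instance choose(List<Instance> all, String version) {
        List<Instance> candidates = filterByVersion(all, version);
        if (candidates.isEmpty()) {
            return null;
        }
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }
}
```

Filtering before rotating means the rotation only ever spreads load across instances of the requested version, which is the behavior the custom rule is meant to provide.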

Grayscale release design in the business services

Custom Spring MVC request interceptor

A custom Spring MVC request interceptor reads the grayscale request header sent by the upstream service. If the header is present, its value is stored in GrayFlagRequestHolder so that any subsequent RPC call passes the grayscale flag on.

@SuppressWarnings("all")
public class GrayMvcHandlerInterceptor implements HandlerInterceptor {
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        String grayTag = request.getHeader(GrayConstant.GRAY_HEADER);
        // If the grayscale flag is present in the HTTP header, store it in the holder so it can be passed on
        if (grayTag != null) {
            GrayFlagRequestHolder.setGrayTag(GrayStatusEnum.getByVal(grayTag));
        }
        return true;
    }

    @Override
    public void postHandle(HttpServletRequest request, HttpServletResponse response, Object handler, ModelAndView modelAndView) throws Exception {
        // Nothing to do
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) throws Exception {
        // Clear the flag once the request has completed
        GrayFlagRequestHolder.remove();
    }
}

Custom OpenFeign request interceptor

A custom OpenFeign request interceptor reads the grayscale flag that the Spring MVC interceptor stored in GrayFlagRequestHolder and puts it into the request header of the downstream call, passing the flag on.

public class GrayFeignRequestInterceptor implements RequestInterceptor {
    @Override
    public void apply(RequestTemplate template) {
        // If a grayscale flag is present, pass it on in the HTTP header
        GrayStatusEnum grayStatusEnum = GrayFlagRequestHolder.getGrayTag();
        if (grayStatusEnum != null) {
            template.header(GrayConstant.GRAY_HEADER, Collections.singleton(grayStatusEnum.getVal()));
        }
    }
}
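
Taken together, the two interceptors form a small relay: header in, ThreadLocal, header out. The following JDK-only sketch models that relay with plain maps in place of the servlet and Feign types. All names in it are hypothetical, and it assumes the outbound call happens on the same thread that handled the inbound request.

```java
import java.util.HashMap;
import java.util.Map;

final class GrayPropagationSketch {
    static final String GRAY_HEADER = "gray";
    private static final ThreadLocal<String> FLAG = new ThreadLocal<>();

    /** Inbound side (the MVC interceptor's role): copy the header into the ThreadLocal. */
    static void onInbound(Map<String, String> inboundHeaders) {
        String tag = inboundHeaders.get(GRAY_HEADER);
        if (tag != null) {
            FLAG.set(tag);
        }
    }

    /** Outbound side (the Feign interceptor's role): copy the flag into the outgoing headers. */
    static Map<String, String> outboundHeaders() {
        Map<String, String> headers = new HashMap<>();
        String tag = FLAG.get();
        if (tag != null) {
            headers.put(GRAY_HEADER, tag);
        }
        return headers;
    }

    /** End of request (afterCompletion's role): clear the flag. */
    static void onCompletion() {
        FLAG.remove();
    }
}
```

Because a ThreadLocal is bound to one thread, a downstream call issued from a different thread (for example an asynchronous task) would not see the flag; in that case the flag has to be handed over explicitly.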

Basic information design

Some basic parameters are defined here, such as whether grayscale is enabled and which requests should use the grayscale version, in preparation for the rest of the logic.

  • Unified grayscale request header set when calling business services
public interface GrayConstant {
    /**
     * Unified grayscale request header
     */
    String GRAY_HEADER = "gray";
}
  • Grayscale version status enumeration
public enum GrayStatusEnum {
    ALL("ALL", "may call services of any version"),
    PROD("PROD", "may only call production-version services"),
    GRAY("GRAY", "may only call grayscale-version services");

    GrayStatusEnum(String val, String desc) {
        this.val = val;
        this.desc = desc;
    }

    private String val;
    private String desc;

    public String getVal() {
        return val;
    }

    public static GrayStatusEnum getByVal(String val) {
        if (val == null) {
            return null;
        }
        for (GrayStatusEnum value : values()) {
            if (value.val.equals(val)) {
                return value;
            }
        }
        return null;
    }
}
  • Gateway grayscale configuration information class
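@Data
@Configuration
@RefreshScope
@ConfigurationProperties("kerwin.tool.gray.gateway")
public class GrayGatewayProperties {
    /**
     * Grayscale switch: when on, the grayscale logic is applied; when off, requests are handled normally.
     * PS: once grayscale testing is finished, all online instances are usually upgraded to the
     * grayscale version; the grayscale logic should be switched off at that point.
     */
    private Boolean enabled = false;

    /**
     * Custom grayscale request header. If its value equals grayHeadValue, the grayscale
     * version is called (used for in-house testing).
     */
    private String grayHeadKey = "gray";

    /**
     * Value that the custom grayscale request header must match
     */
    private String grayHeadValue = "gray-996";

    /**
     * IPs that use the grayscale version
     */
    private List<String> grayIPList = new ArrayList<>();

    /**
     * Cities that use the grayscale version
     */
    private List<String> grayCityList = new ArrayList<>();

    /**
     * User numbers that use the grayscale version.
     * (Our system does not resolve the user number at the gateway; implement this if you need it.)
     */
    private List<String> grayUserNoList = new ArrayList<>();
}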
  • Global version configuration information class
@Data
@Configuration
@RefreshScope
@ConfigurationProperties("kerwin.tool.gray.version")
public class GrayVersionProperties {
    /**
     * Current production version number
     */
    private String prodVersion;

    /**
     * Grayscale version number
     */
    private String grayVersion;
}
  • Global auto-configuration class
@Configuration
// @ConditionalOnProperty controls whether the grayscale auto-configuration is enabled; it is not loaded by default
@ConditionalOnProperty(value = "kerwin.tool.gray.load", havingValue = "true")
@EnableConfigurationProperties(GrayVersionProperties.class)
public class GrayAutoConfiguration {
    @Configuration(proxyBeanMethods = false)
    @ConditionalOnClass(value = GlobalFilter.class)
    @EnableConfigurationProperties(GrayGatewayProperties.class)
    static class GrayGatewayFilterAutoConfiguration {
        @Bean
        public GrayGatewayBeginFilter grayGatewayBeginFilter() {
            return new GrayGatewayBeginFilter();
        }

        @Bean
        public GrayGatewayAfterFilter grayGatewayAfterFilter() {
            return new GrayGatewayAfterFilter();
        }

        @Bean
        public GrayGatewayExceptionHandler grayGatewayExceptionHandler() {
            return new GrayGatewayExceptionHandler();
        }
    }

    @Configuration(proxyBeanMethods = false)
    @ConditionalOnClass(value = WebMvcConfigurer.class)
    static class GrayWebMvcAutoConfiguration {
        /**
         * Spring MVC request interceptor
         * @return WebMvcConfigurer
         */
        @Bean
        public WebMvcConfigurer webMvcConfigurer() {
            return new WebMvcConfigurer() {
                @Override
                public void addInterceptors(InterceptorRegistry registry) {
                    registry.addInterceptor(new GrayMvcHandlerInterceptor());
                }
            };
        }
    }

    @Configuration
    @ConditionalOnClass(value = RequestInterceptor.class)
    static class GrayFeignInterceptorAutoConfiguration {
        /**
         * Feign interceptor
         * @return GrayFeignRequestInterceptor
         */
        @Bean
        public GrayFeignRequestInterceptor grayFeignRequestInterceptor() {
            return new GrayFeignRequestInterceptor();
        }
    }
}

Running the project

Five services are started here to demonstrate the grayscale release: one gateway service, user service V1, order service V1, user service V2, and order service V2.
PS: The Nacos namespace used here is spring-cloud-gray-example. You can create it yourself or substitute your own namespace. The configuration files are included in the source code; if you run into problems, refer to the source.

Configure Nacos global configuration file (common-config.yaml)

All services will use this configuration

kerwin:
  tool:
    gray:
      ## Whether to load the grayscale auto-configuration class; it is not loaded if this is unset
      load: true
      ## Production and grayscale version numbers
      version:
        prodVersion: V1
        grayVersion: V2

## Make Ribbon use the custom grayscale round-robin rule when calling user-app and order-app
user-app:
  ribbon:
    NFLoadBalancerRuleClassName: com.kerwin.gray.loadbalancer.GrayRoundRobinRule
order-app:
  ribbon:
    NFLoadBalancerRuleClassName: com.kerwin.gray.loadbalancer.GrayRoundRobinRule


Configure the gateway Nacos configuration file (gateway-app.yaml)

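kerwin:
  tool:
    gray:
      gateway:
        ## Whether the grayscale release feature is enabled
        enabled: true
        ## Custom grayscale request header
        grayHeadKey: gray
        ## Value that the custom grayscale request header must match
        grayHeadValue: gray-996
        ## IPs that use the grayscale version
        grayIPList:
          - '127.0.0.1'
        ## Cities that use the grayscale version ("本地" means "local")
        grayCityList:
          - 本地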


Start the gateway service

Start a single gateway instance. Starting it in Debug mode makes it convenient to step through the source code.

Start the V1 and V2 versions of the business service (both the user service and the order service are started in this way)

  • Start the service once in Debug mode; IDEA then shows a run configuration named after the startup class.
  • Click Edit to edit this run configuration.
  • Copy the run configuration to use as the V2 version, and change its Name to something you can tell apart.
  • Configure the startup parameters: first click Modify options, then check Add VM options, and finally fill in the service port and the Nacos metadata.version. My user service V1 is configured with -Dserver.port=7201 -Dspring.cloud.nacos.discovery.metadata.version=V1 and user service V2 with -Dserver.port=7202 -Dspring.cloud.nacos.discovery.metadata.version=V2. The order service is configured similarly. Click Apply when done.
  • Finally, all five services appear in the list of started services.

Grayscale effect demo

The user-app service in the source code provides an interface for obtaining user information; its response carries the port and version of the current service. The order-app service provides an interface for obtaining order information; it remotely calls user-app to obtain the user associated with the order, and its response also carries the port and version of the current service.

Scenario 1 (grayscale switch off: service versions are not distinguished)

There are two ways to turn off the grayscale switch:

  • 1. Before the project starts, modify kerwin.tool.gray.load in the Nacos global configuration file, which controls whether the grayscale auto-configuration class is loaded. Unless it is set to true, none of the grayscale classes are loaded.
  • 2. Turn off the gateway grayscale switch by modifying kerwin.tool.gray.gateway.enabled in the gateway's Nacos configuration file. Unless it is set to true, no grayscale judgment is performed.

Call demo

With the switch off, a call does not necessarily pair Order V1 with User V1; Order V1 may equally call User V2.

  • First call: the Order service version is V1 and the User service version is also V1.
  • Second call: the Order service version is V2 and the User service version is also V2.

Scenario 2 (grayscale switch on: only the production version is called)

Set kerwin.tool.gray.gateway.enabled to true in the gateway's Nacos configuration file. When the request matches none of the configured grayscale IPs or cities, every call goes to V1, because the version configuration in GrayVersionProperties sets the production version to V1 and the grayscale version to V2.

Scenario 3 (grayscale switch on: the grayscale version is called when the request header, IP, or city matches)

This scenario is tested with the request header: if the gateway is accessed with the header gray=gray-996, all of that traffic goes to the grayscale version V2.
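
A test request of this kind can be built with the JDK's own HTTP client. The sketch below assumes Java 11 or later, which is newer than the demo strictly requires. The class and method names are hypothetical, and the gateway URL must be supplied by the caller because the article does not state the gateway's port or routes.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class GrayRequestSketch {
    /** Builds a GET request that carries the grayscale test header. */
    static HttpRequest grayRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("gray", "gray-996") // must match grayHeadKey and grayHeadValue
                .GET()
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), after which the version reported in the response body shows which instance handled the call.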

Source code

  • The source code is hosted on Gitee.

Open issues

  • 1. If the project uses distributed task scheduling, how is the grayscale version distinguished?
    • This is fairly easy to solve. Taking xxl-job as an example, register separate executors, and have the grayscale version register with the grayscale executor when it is released.
  • 2. If the project uses MQ, how is grayscale controlled for sending and receiving messages?
    • The idea is the same as for distributed task scheduling. Run two sets of MQ servers: production services use the production MQ, and grayscale services send their messages to the grayscale MQ.
  • 3. The implementation described here is not very complicated, but it is also not strictly necessary; it is offered only as a reference.
    • A simpler alternative is to route at the gateway with Nginx + Lua scripts, put the entire set of grayscale services in a dedicated Nacos grayscale namespace, and keep production in the production namespace. The two sets of services are then isolated, and distributed task scheduling, MQ, and other settings can each live in the configuration files of their own namespace.


Origin blog.csdn.net/weixin_44606481/article/details/131726893