Spring Cloud: setting timeouts and the retry mechanism for Feign and Ribbon

Preface

When one microservice calls another, we typically use Feign and Ribbon. Suppose an instance fails and the service governance mechanism has not yet detected and removed it. A client that routes a request to that node will then fail. To build a more robust system, we want a failed request to be retried according to a defined policy instead of returning the failure immediately.

First look at a configuration:

# Eager-load configuration; the default is lazy loading
ribbon:
  eager-load:
    enabled: true
    clients: zoo-plus-email

zoo-plus-email:
  ribbon:
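    # Load-balancing rule used by Ribbon
    NFLoadBalancerRuleClassName: com.netflix.loadbalancer.RandomRule
    # Maximum retries on the same server, not counting the first call
    MaxAutoRetries: 1
    # Maximum number of other servers to retry on
    MaxAutoRetriesNextServer: 1
    # Retry all operations (not only GET), whether the failure is a request timeout or a socket read timeout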
    OkToRetryOnAllOperations: true
    ReadTimeout: 3000
    ConnectTimeout: 3000

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 4000

In general, the Ribbon timeout should be less than the Hystrix timeout, because the Hystrix timeout also has to cover Ribbon's retries.

Feign retry:

Because Ribbon's retry mechanism conflicts with Feign's own, Feign's retry is turned off by default: in the Spring Cloud source code, FeignClientsConfiguration registers Retryer.NEVER_RETRY unless you supply your own Retryer bean.

To enable Feign's retry mechanism, register a Retryer bean as follows (in the Feign source code, Retryer.Default has a default maxAttempts of 5):

@Bean
Retryer feignRetryer() {
    return new Retryer.Default();
}

Ribbon retry:

ribbon:
  ReadTimeout: 3000
  ConnectTimeout: 3000
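  MaxAutoRetries: 1 # Maximum retries on the same instance, not counting the first call
  MaxAutoRetriesNextServer: 1 # Maximum number of other load-balanced instances to retry on, not counting the first call
  OkToRetryOnAllOperations: false # Whether to retry all operations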

Number of retries: MaxAutoRetries + MaxAutoRetriesNextServer + (MaxAutoRetries * MaxAutoRetriesNextServer)

With the configuration above that is 1 + 1 + (1 * 1) = 3 retries, which together with the first call makes 4 calls in total.
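
The count can be checked with a short Java sketch. The class and method names (RetryCount, retries, totalCalls) are hypothetical and not part of Ribbon; the code only reproduces the arithmetic from the formula in the text.

```java
// Hypothetical helper, not part of Ribbon: reproduces the retry arithmetic only.
final class RetryCount {

    // Retries after the first call, using the formula from the text:
    // MaxAutoRetries + MaxAutoRetriesNextServer + MaxAutoRetries * MaxAutoRetriesNextServer
    static int retries(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return maxAutoRetries + maxAutoRetriesNextServer
                + maxAutoRetries * maxAutoRetriesNextServer;
    }

    // Total calls = first call + retries, which is the same as
    // (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1).
    static int totalCalls(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return 1 + retries(maxAutoRetries, maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        System.out.println(retries(1, 1));    // 3
        System.out.println(totalCalls(1, 1)); // 4
    }
}
```

Writing the total as a product makes it easier to see how quickly the call count grows: raising either parameter by one adds a whole extra round of attempts.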

Note: if the retries run past Hystrix's timeout, the circuit breaker and fallback logic are executed immediately. The Hystrix timeout must therefore be calculated from the parameters configured above, so that it cannot expire while retries are still in progress; otherwise the retry mechanism is pointless.

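Hystrix timeout calculation: (1 + MaxAutoRetries + MaxAutoRetriesNextServer) * ReadTimeout. With the Ribbon retry configuration above, the Hystrix timeout should therefore be (1 + 1 + 1) * 3 s = 9 seconds. Note that this formula budgets for three attempts only. If all four calls counted earlier each ran to the full ReadTimeout, the total would be (1 + MaxAutoRetries) * (1 + MaxAutoRetriesNextServer) * ReadTimeout = 2 * 2 * 3 s = 12 seconds, before any connect time is added.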
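
As a rough sketch, the calculation can be written in Java to check a configured Hystrix timeout against the retry budget. HystrixBudget and its methods are hypothetical names, not Hystrix or Ribbon APIs. The second method is an assumption that goes beyond the formula in the text: it covers the case where every one of the (1 + MaxAutoRetries) * (1 + MaxAutoRetriesNextServer) calls runs to the full ReadTimeout.

```java
// Hypothetical helper, not a Hystrix or Ribbon API.
final class HystrixBudget {

    // Formula from the text: (1 + MaxAutoRetries + MaxAutoRetriesNextServer) * ReadTimeout
    static long quotedTimeoutMs(int maxAutoRetries, int maxAutoRetriesNextServer,
                                long readTimeoutMs) {
        return (1L + maxAutoRetries + maxAutoRetriesNextServer) * readTimeoutMs;
    }

    // Assumption: stricter bound in which every call hits the read timeout.
    static long allCallsTimeoutMs(int maxAutoRetries, int maxAutoRetriesNextServer,
                                  long readTimeoutMs) {
        return (1L + maxAutoRetries) * (1L + maxAutoRetriesNextServer) * readTimeoutMs;
    }

    // True when Hystrix would time out before the retry budget is used up.
    static boolean cutsRetriesShort(long hystrixTimeoutMs, long budgetMs) {
        return hystrixTimeoutMs < budgetMs;
    }

    public static void main(String[] args) {
        long quoted = quotedTimeoutMs(1, 1, 3000);     // 9000
        long allCalls = allCallsTimeoutMs(1, 1, 3000); // 12000
        System.out.println(quoted + " ms, " + allCalls + " ms");
        // A 4000 ms Hystrix timeout, as in the first configuration, is below both budgets.
        System.out.println(cutsRetriesShort(4000, quoted)); // true
    }
}
```

Neither figure includes ConnectTimeout, so a real budget may need extra headroom on top of whichever bound you choose.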

When a Ribbon call times out and Hystrix has not yet timed out, the request is retried. When OkToRetryOnAllOperations is set to false, only GET requests are retried on every kind of failure; when it is set to true, all requests are retried. For write operations such as PUT or POST, retrying against a server interface that is not idempotent can produce bad results, so use OkToRetryOnAllOperations with caution.
If you do not configure Ribbon's retry counts, it still retries using its default values.
Note: by default, GET requests are retried on both connection failures and read failures. Non-GET requests are retried only on connection failures.
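
The retry rules described above can be summarised in a small Java sketch. This models the behaviour as stated in the text, not Ribbon's actual implementation; RetryDecision, Failure and shouldRetry are hypothetical names.

```java
// Hypothetical model of the rules in the text, not Ribbon's own code.
final class RetryDecision {

    // The two failure kinds the text distinguishes.
    enum Failure { CONNECT, READ }

    static boolean shouldRetry(String httpMethod, Failure failure,
                               boolean okToRetryOnAllOperations) {
        if (okToRetryOnAllOperations) {
            return true; // every request is retried, idempotent or not
        }
        if ("GET".equalsIgnoreCase(httpMethod)) {
            return true; // GET is retried on connect and read failures alike
        }
        // Non-GET requests are retried only when the connection itself failed.
        return failure == Failure.CONNECT;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry("GET", Failure.READ, false));     // true
        System.out.println(shouldRetry("POST", Failure.READ, false));    // false
        System.out.println(shouldRetry("POST", Failure.CONNECT, false)); // true
        System.out.println(shouldRetry("POST", Failure.READ, true));     // true
    }
}
```

The POST read-failure case is the one to watch: with OkToRetryOnAllOperations set to true, the server may already have processed the first request by the time the retry arrives.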

Origin blog.csdn.net/qq_36850813/article/details/102816423