Spring Cloud Eureka registration internals: a registration-failure pitfall and how to avoid it

Preface

  Eureka consists of two parts, Eureka Server and Eureka Client. Eureka Server acts as the registry; relative to it, Eureka Client is a client that must register its own information with the registry. This article explains what to watch out for with the RetryableClientQuarantineRefreshPercentage parameter when Eureka Client registers with Eureka Server.

Eureka Client registration process analysis

  When Eureka Client registers with Eureka Server, the first thing it needs to know is the Eureka Server address. This is set by the eureka.client.service-url.defaultZone parameter. For example, the properties file of the Eureka Client contains:

eureka.client.service-url.defaultZone=http://localhost:8761/eureka,http://localhost:8762/eureka,http://localhost:8763/eureka,http://localhost:8764/eureka

  As shown above, the Eureka Client is configured with four Eureka Server addresses, on ports 8761, 8762, 8763 and 8764. This raises two questions:

  • Does Eureka Client register its information with all four addresses?
  • How does the Eureka Client registration mechanism work?

  The source code answers both questions. At startup, Eureka Client sends its registration request through the following code:
The execute method of RetryableEurekaHttpClient

 @Override
  protected <R> EurekaHttpResponse<R> execute(RequestExecutor<R> requestExecutor) {
      List<EurekaEndpoint> candidateHosts = null;
      int endpointIdx = 0;
      for (int retry = 0; retry < numberOfRetries; retry++) {
          EurekaHttpClient currentHttpClient = delegate.get();
          EurekaEndpoint currentEndpoint = null;
          if (currentHttpClient == null) {
              if (candidateHosts == null) {
                  candidateHosts = getHostCandidates();
                  if (candidateHosts.isEmpty()) {
                      throw new TransportException("There is no known eureka server; cluster server list is empty");
                  }
              }
              if (endpointIdx >= candidateHosts.size()) {
                  throw new TransportException("Cannot execute request on any known server");
              }

              currentEndpoint = candidateHosts.get(endpointIdx++);
              currentHttpClient = clientFactory.newClient(currentEndpoint);
          }

          try {
              EurekaHttpResponse<R> response = requestExecutor.execute(currentHttpClient);
              if (serverStatusEvaluator.accept(response.getStatusCode(), requestExecutor.getRequestType())) {
                  delegate.set(currentHttpClient);
                  if (retry > 0) {
                      logger.info("Request execution succeeded on retry #{}", retry);
                  }
                  return response;
              }
              logger.warn("Request execution failure with status code {}; retrying on another server if available", response.getStatusCode());
          } catch (Exception e) {
              logger.warn("Request execution failed with message: {}", e.getMessage());  // just log message as the underlying client should log the stacktrace
          }

          // Connection error or 5xx from the server that must be retried on another server
          delegate.compareAndSet(currentHttpClient, null);
          if (currentEndpoint != null) {
              quarantineSet.add(currentEndpoint);
          }
      }
      throw new TransportException("Retry limit reached; giving up on completing the request");
  }

  As I understand it, the code boils down to the following:

int endpointIdx = 0;
// Holds all Eureka Server endpoints (8761, 8762, 8763, 8764)
List<EurekaEndpoint> candidateHosts = null;
// numberOfRetries is hard-coded in the source; its default is 3
for (int retry = 0; retry < numberOfRetries; retry++) {
    /**
     * On the first pass through the loop, fetch the full list of
     * Eureka Servers (8761, 8762, 8763, 8764).
     */
    if (candidateHosts == null) {
        candidateHosts = getHostCandidates();
    }
    /**
     * endpointIdx is incremented on each pass, so the Eureka Servers are
     * taken in order and the registration POST request is sent to each in turn.
     */
    EurekaEndpoint currentEndpoint = candidateHosts.get(endpointIdx++);
    EurekaHttpClient currentHttpClient = clientFactory.newClient(currentEndpoint);
    try {
        /**
         * Send the registration POST request. On success, return and leave
         * the loop; on failure, move on to the next Eureka Server by endpointIdx.
         */
        EurekaHttpResponse<R> response = requestExecutor.execute(currentHttpClient);
        return response;
    } catch (Exception e) {
        // The registration POST to the registry (Eureka Server) threw an exception; log it ...
    }
    // If registration failed, save the current endpoint in quarantineSet (a Set)
    if (currentEndpoint != null) {
        quarantineSet.add(currentEndpoint);
    }
}
// If every attempt failed, report it by throwing an exception
throw new TransportException("Retry limit reached; giving up on completing the request");

  The key call in the code above is List<EurekaEndpoint> candidateHosts = getHostCandidates(). Here is the source of the getHostCandidates() method:

    private List<EurekaEndpoint> getHostCandidates() {
        List<EurekaEndpoint> candidateHosts = clusterResolver.getClusterEndpoints();
        quarantineSet.retainAll(candidateHosts);

        // If enough hosts are bad, we have no choice but start over again
        int threshold = (int) (candidateHosts.size() * transportConfig.getRetryableClientQuarantineRefreshPercentage());
        if (quarantineSet.isEmpty()) {
            // no-op
        } else if (quarantineSet.size() >= threshold) {
            logger.debug("Clearing quarantined list of size {}", quarantineSet.size());
            quarantineSet.clear();
        } else {
            List<EurekaEndpoint> remainingHosts = new ArrayList<>(candidateHosts.size());
            for (EurekaEndpoint endpoint : candidateHosts) {
                if (!quarantineSet.contains(endpoint)) {
                    remainingHosts.add(endpoint);
                }
            }
            candidateHosts = remainingHosts;
        }

        return candidateHosts;
    }

  As I understand it, the code reduces to the following, keeping only the key logic:

private List<EurekaEndpoint> getHostCandidates() {
    /**
     * Get all registries (Eureka Servers) configured in defaultZone;
     * in this article's example, the four Eureka Servers (8761, 8762, 8763, 8764).
     */
    List<EurekaEndpoint> candidateHosts = clusterResolver.getClusterEndpoints();
    /**
     * quarantineSet holds the Eureka Servers that are unavailable.
     * This keeps only the quarantined servers that are still in the full
     * list (the intersection of the two).
     */
    quarantineSet.retainAll(candidateHosts);
    /**
     * Compute the threshold from RetryableClientQuarantineRefreshPercentage.
     * It is then compared with the number of unavailable Eureka Servers in
     * quarantineSet to decide whether to return the full list or the list
     * with the unavailable Eureka Servers filtered out.
     */
    int threshold = (int) (candidateHosts.size()
            * transportConfig.getRetryableClientQuarantineRefreshPercentage());
    if (quarantineSet.isEmpty()) {
        /**
         * On the first call quarantineSet is empty, so the full
         * Eureka Server list is returned as is.
         */
    } else if (quarantineSet.size() >= threshold) {
        /**
         * Compare the number of unavailable Eureka Servers with the threshold.
         * If it reaches the threshold, discard the stored entries and
         * return the full Eureka Server list.
         */
        quarantineSet.clear();
    } else {
        /**
         * Use the unavailable Eureka Servers in quarantineSet to filter the
         * full list, leaving the Eureka Server addresses that the
         * Eureka Client should try to register with.
         */
        List<EurekaEndpoint> remainingHosts = new ArrayList<>(candidateHosts.size());
        for (EurekaEndpoint endpoint : candidateHosts) {
            if (!quarantineSet.contains(endpoint)) {
                remainingHosts.add(endpoint);
            }
        }
        candidateHosts = remainingHosts;
    }
    return candidateHosts;
}

  The source code shows that when Eureka Client sends a registration request to Eureka Server (using the server list from defaultZone), it stops as soon as one request succeeds and sends no registration request to the remaining servers. In this article's example, four servers are configured (8761, 8762, 8763, 8764). If the Eureka Server on 8761 is UP, the Eureka Client registers with it successfully and never sends a registration request to the servers on 8762, 8763 or 8764 (this is the return response inside the for loop).

  This raises another question: what if the Eureka Server on 8761 is down?
According to the source, Eureka Client first sends the registration request to the server on 8761. If that server is down, it is added to quarantineSet and the client moves on to the server on 8762. If 8762 is also down, it too is added to quarantineSet and the client moves on to 8763. If 8763 is down as well, it is added to quarantineSet, and this time the loop ends, which ends the registration attempt.

  One might ask whether registration is then attempted on 8764. The answer is no, because the loop runs only 3 times by default. So even if the server on 8764 is UP, it never receives the registration request from the Eureka Client.
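
  The retry behaviour described above can be reproduced outside Eureka with a small simulation. This is only a sketch under stated assumptions: servers are represented by their port numbers, the class RetryLoopDemo and its register method are hypothetical names, and the retry count of 3 mirrors the default mentioned in this article. It is not the real RetryableEurekaHttpClient.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the retry loop; not part of Eureka.
class RetryLoopDemo {
    // Mirrors the default retry count described in the article
    static final int NUMBER_OF_RETRIES = 3;

    /**
     * Tries the servers in order and stops at the first one that is up.
     * Returns the ports that were contacted; failed ports go into quarantine.
     */
    static List<Integer> register(List<Integer> servers, Set<Integer> upServers,
                                  Set<Integer> quarantine) {
        List<Integer> tried = new ArrayList<>();
        int endpointIdx = 0;
        for (int retry = 0; retry < NUMBER_OF_RETRIES && endpointIdx < servers.size(); retry++) {
            int port = servers.get(endpointIdx++);
            tried.add(port);
            if (upServers.contains(port)) {
                return tried; // success: no further server is contacted
            }
            quarantine.add(port); // failure: remember the server
        }
        return tried; // retry limit reached
    }

    public static void main(String[] args) {
        List<Integer> servers = Arrays.asList(8761, 8762, 8763, 8764);
        Set<Integer> quarantine = new LinkedHashSet<>();
        // Only 8764 is up, but it is fourth in the list
        List<Integer> tried = register(servers, Collections.singleton(8764), quarantine);
        // prints: tried=[8761, 8762, 8763], quarantined=[8761, 8762, 8763]
        System.out.println("tried=" + tried + ", quarantined=" + quarantine);
    }
}
```

  The healthy server on 8764 is never contacted because the three retries are used up on the servers listed before it.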

  Besides being triggered when the Eureka Client starts, registration is also triggered in another way: by a scheduled background task.

  Suppose the scenario above takes place when the Eureka Client starts. Because every registration attempt at startup failed, the scheduled background task enters the registration process when it next runs. Note that quarantineSet now has a size of 3 (the Eureka Servers on 8761, 8762 and 8763 that failed earlier).
So when the program enters getHostCandidates() again, the if (quarantineSet.isEmpty()) condition is not met, and the else if (quarantineSet.size() >= threshold) check is evaluated next. If it holds, quarantineSet is cleared and the full Eureka Server list is returned; if it does not, the contents of quarantineSet are used to filter the full Eureka Server list. In this article's example:

  • quarantineSet holds three Eureka Servers (8761, 8762, 8763).
  • The full Eureka Server list is (8761, 8762, 8763, 8764), so filtering would return only the Eureka Server on 8764.

  In this example, the Eureka Servers on 8761, 8762 and 8763 are down and the one on 8764 is UP. What we want is to reach the final else branch, so that filtering leaves only the 8764 server. Unfortunately, execution never gets there: it is intercepted by the else if (quarantineSet.size() >= threshold) branch above, and the full Eureka Server list is returned. As a result, Eureka Client again sends its registration requests, in order, to the three servers that are down (8761, 8762, 8763).
So where does the problem lie? The key is how the threshold is derived: quarantineSet.size() is 3, which is greater than the threshold, so quarantineSet is cleared and the full server list is returned.
  The threshold is the size of the full Eureka Server list multiplied by a configurable parameter. In this example my properties file sets nothing except defaultZone, so the parameter takes its default value, which the source shows to be 0.66:

final class PropertyBasedTransportConfigConstants {
    /**
     * Parts of the source omitted
     */
    static class Values {
        static final int SESSION_RECONNECT_INTERVAL = 20 * 60;
        // Default value: 0.66
        static final double QUARANTINE_REFRESH_PERCENTAGE = 0.66;
        static final int DATA_STALENESS_TRHESHOLD = 5 * 60;
        static final int ASYNC_RESOLVER_REFRESH_INTERVAL = 5 * 60 * 1000;
        static final int ASYNC_RESOLVER_WARMUP_TIMEOUT = 5000;
        static final int ASYNC_EXECUTOR_THREADPOOL_SIZE = 5;
    }
}

/**
 * @return the percentage of the full endpoints set above which the
 * quarantine set is cleared in the range [0, 1.0]
 */
double getRetryableClientQuarantineRefreshPercentage();

  Now the behaviour is easy to explain. With a value of 0.66 and a full list of 4 Eureka Servers, the threshold is (int) (4 * 0.66) = 2. Because each registration pass makes 3 attempts, quarantineSet holds 3 entries when the second registration pass starts, which always reaches the threshold. The result is that as long as 8761, 8762 and 8763 stay down, Eureka Client never registers successfully, even though 8764 is healthy. Also note that the documented range of this parameter is 0 to 1.
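
  The arithmetic is worth spelling out, since the cast truncates instead of rounding. The following sketch repeats the same expression used in getHostCandidates(); the class name ThresholdDemo and the helper threshold are hypothetical and exist only for this illustration.

```java
// Hypothetical helper that repeats the threshold arithmetic from getHostCandidates().
class ThresholdDemo {
    /** The product is cast to int, so the fractional part is dropped. */
    static int threshold(int serverCount, double refreshPercentage) {
        return (int) (serverCount * refreshPercentage);
    }

    public static void main(String[] args) {
        int quarantined = 3;           // 8761, 8762 and 8763 failed at startup
        int t = threshold(4, 0.66);    // 4 * 0.66 = 2.64, truncated to 2
        // prints: threshold=2 quarantineCleared=true
        System.out.println("threshold=" + t + " quarantineCleared=" + (quarantined >= t));
    }
}
```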

  Source analysis has led us to the root of the problem, and with it to a fix: increase the value of this parameter.
It is set with the following property:

eureka.client.transport.retryableClientQuarantineRefreshPercentage=xxx

  Next, modify the properties file as follows:

 

eureka.client.service-url.defaultZone=http://localhost:8761/eureka,http://localhost:8762/eureka,http://localhost:8763/eureka,http://localhost:8764/eureka
eureka.client.transport.retryableClientQuarantineRefreshPercentage=1

With this configuration, let's walk through the process again:

  • At startup, Eureka Client tries to register (8761, 8762 and 8763 are down), so quarantineSet ends up with 3 entries.
  • The scheduled task then triggers registration. Because the parameter was raised from 0.66 to 1, the threshold is now 4, while quarantineSet holds 3 entries. Execution therefore skips the else if (quarantineSet.size() >= threshold) branch and enters the final else branch.
  • The else branch does the filtering, and the returned list contains only the Eureka Server on 8764.
  • Eureka Client sends the registration request to the Eureka Server on 8764, gets a successful response, and returns.
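
  The walkthrough above can be checked with a small simulation that runs a startup pass followed by one scheduled pass, once with 0.66 and once with 1. It is a simplified sketch, not Eureka code: QuarantineDemo, hostCandidates and register are hypothetical names, servers are plain port numbers, and the retainAll step is left out because the server list never changes here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the client side; not part of Eureka.
class QuarantineDemo {
    static final int NUMBER_OF_RETRIES = 3; // default described in the article

    final List<Integer> allServers;
    final double refreshPercentage;
    final Set<Integer> quarantine = new HashSet<>();

    QuarantineDemo(List<Integer> allServers, double refreshPercentage) {
        this.allServers = allServers;
        this.refreshPercentage = refreshPercentage;
    }

    /** Simplified getHostCandidates(): clear the quarantine or filter by it. */
    List<Integer> hostCandidates() {
        int threshold = (int) (allServers.size() * refreshPercentage);
        if (quarantine.isEmpty()) {
            return allServers;
        }
        if (quarantine.size() >= threshold) {
            quarantine.clear();
            return allServers;
        }
        List<Integer> remaining = new ArrayList<>();
        for (Integer server : allServers) {
            if (!quarantine.contains(server)) {
                remaining.add(server);
            }
        }
        return remaining;
    }

    /** One registration pass; returns the port that accepted it, or -1. */
    int register(Set<Integer> upServers) {
        List<Integer> candidates = hostCandidates();
        int idx = 0;
        for (int retry = 0; retry < NUMBER_OF_RETRIES && idx < candidates.size(); retry++) {
            int port = candidates.get(idx++);
            if (upServers.contains(port)) {
                return port;
            }
            quarantine.add(port);
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> servers = Arrays.asList(8761, 8762, 8763, 8764);
        Set<Integer> up = Collections.singleton(8764);
        for (double pct : new double[] {0.66, 1.0}) {
            QuarantineDemo client = new QuarantineDemo(servers, pct);
            int startup = client.register(up);   // startup pass
            int scheduled = client.register(up); // first scheduled pass
            System.out.println("percentage=" + pct + " startup=" + startup
                    + " scheduled=" + scheduled);
        }
        // prints:
        // percentage=0.66 startup=-1 scheduled=-1
        // percentage=1.0 startup=-1 scheduled=8764
    }
}
```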

A remaining problem

  At this point the problem seems solved, but one question remains: can this parameter be set arbitrarily high?
  Suppose I set it to 10. The Javadoc says the value lies between 0 and 1, but it does not say what happens when it is greater than 1. Let's analyze it using the same process.
  The earlier analysis assumed that the servers on 8761, 8762 and 8763 are down and the server on 8764 is up. Now change that premise.
  Assume that all four Eureka Servers (8761, 8762, 8763, 8764) are down from the start.

  • At startup, Eureka Client tries to register (8761, 8762 and 8763 are down), so quarantineSet ends up with 3 entries.
  • The scheduled task then triggers registration. Because the parameter was raised from 0.66 to 10, the threshold is now 40, while quarantineSet holds 3 entries. Execution therefore skips the else if (quarantineSet.size() >= threshold) branch and enters the final else branch.
  • The else branch does the filtering, and the returned list contains only the Eureka Server on 8764.
  • Eureka Client sends the registration request to the Eureka Server on 8764. Because 8764 is also down, registration fails, and quarantineSet now contains (8761, 8762, 8763, 8764).
  • When the scheduled task fires again, the if (quarantineSet.isEmpty()) branch is not taken because quarantineSet holds 4 entries, and the else if (quarantineSet.size() >= threshold) branch is not taken because the threshold is 40.
  • Execution ends up in the else branch, which is meant to use quarantineSet to filter the servers that are down out of the full Eureka Server list. But quarantineSet now contains the full list, so the filtered result is an empty list. From then on, even if any server in the list (8761, 8762, 8763, 8764) comes back UP, the Eureka Client can no longer complete registration.
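
  The dead end in the last step can be shown in isolation by calling the candidate selection with a quarantine set that already contains every server. As before, this is a sketch under assumptions: OversizedPercentageDemo and its hostCandidates method are hypothetical names that mimic the branches of getHostCandidates() with port numbers in place of endpoints.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the candidate selection; not part of Eureka.
class OversizedPercentageDemo {
    /** Mimics the three branches of getHostCandidates() (retainAll omitted). */
    static List<Integer> hostCandidates(List<Integer> all, Set<Integer> quarantine,
                                        double refreshPercentage) {
        int threshold = (int) (all.size() * refreshPercentage);
        if (quarantine.isEmpty()) {
            return all;
        }
        if (quarantine.size() >= threshold) {
            quarantine.clear();
            return all;
        }
        List<Integer> remaining = new ArrayList<>();
        for (Integer server : all) {
            if (!quarantine.contains(server)) {
                remaining.add(server);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<Integer> servers = Arrays.asList(8761, 8762, 8763, 8764);

        // All four servers failed earlier; the parameter is set to 10 (threshold 40)
        Set<Integer> quarantine = new HashSet<>(servers);
        // prints: percentage=10.0 candidates=[]
        System.out.println("percentage=10.0 candidates="
                + hostCandidates(servers, quarantine, 10.0));

        // Same starting point with the parameter set to 1 (threshold 4)
        Set<Integer> quarantineAtOne = new HashSet<>(servers);
        // prints: percentage=1.0 candidates=[8761, 8762, 8763, 8764]
        System.out.println("percentage=1.0 candidates="
                + hostCandidates(servers, quarantineAtOne, 1.0));
    }
}
```

  In this model a value of 1 still lets the quarantine set be cleared once every server is in it, because 4 >= 4, whereas any value above 1 puts the threshold out of reach for good. That suggests keeping the parameter within the documented range of 0 to 1.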

 


Origin www.cnblogs.com/chihirotan/p/11568185.html