Eureka client renewal and server-side expired lease cleanup source code analysis

In the previous article, EurekaClient automatic assembly and startup process analysis, we mentioned that, in addition to running the registration process, DiscoveryClient also schedules a heartbeat task during construction:

scheduler.schedule(
        new TimedSupervisorTask(
                "heartbeat",
                scheduler,
                heartbeatExecutor,
                renewalIntervalInSecs,
                TimeUnit.SECONDS,
                expBackOffBound,
                new HeartbeatThread()
        ),
        renewalIntervalInSecs, TimeUnit.SECONDS);
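
The expBackOffBound argument hints at how TimedSupervisorTask paces the heartbeat: as far as I can tell from the client source, a run that times out doubles the delay before the next run, up to a cap of the base interval multiplied by expBackOffBound, and a successful run resets the delay. The following is only a minimal sketch of that delay rule, not the Netflix implementation; the class BackoffDelay and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: capped exponential back-off for a periodic task. */
final class BackoffDelay {
    private final long baseDelayMs;
    private final long maxDelayMs;
    private final AtomicLong currentDelayMs;

    BackoffDelay(long baseDelayMs, int expBackOffBound) {
        this.baseDelayMs = baseDelayMs;
        // Assumption: the cap is the base delay multiplied by the bound.
        this.maxDelayMs = baseDelayMs * expBackOffBound;
        this.currentDelayMs = new AtomicLong(baseDelayMs);
    }

    /** A successful run resets the delay to the base interval. */
    long onSuccess() {
        currentDelayMs.set(baseDelayMs);
        return baseDelayMs;
    }

    /** A timed-out run doubles the delay, never exceeding the cap. */
    long onTimeout() {
        return currentDelayMs.updateAndGet(d -> Math.min(maxDelayMs, d * 2));
    }

    long current() {
        return currentDelayMs.get();
    }
}
```

With a 30-second interval and a bound of 10, repeated timeouts would stretch the delay to 60, 120 and 240 seconds and then hold it at 300 seconds until a heartbeat succeeds.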
                    

The HeartbeatThread is as follows:

    private class HeartbeatThread implements Runnable {

        public void run() {
            // Renew the lease
            if (renew()) {
                // Renewal succeeded: update the last-successful-heartbeat timestamp
                lastSuccessfulHeartbeatTimestamp = System.currentTimeMillis();
            }
        }
    }

    boolean renew() {
        EurekaHttpResponse<InstanceInfo> httpResponse;
        try {
            // Send the renewal (heartbeat) request
            httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
            logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
            if (httpResponse.getStatusCode() == 404) {
                REREGISTER_COUNTER.increment();
                logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
                long timestamp = instanceInfo.setIsDirtyWithTime();
                // Register again
                boolean success = register();
                if (success) {
                    instanceInfo.unsetIsDirty(timestamp);
                }
                return success;
            }
            return httpResponse.getStatusCode() == 200;
        } catch (Throwable e) {
            logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
            return false;
        }
    }

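The renewal request is sent directly here. If the server answers 404, meaning it holds no lease for this instance, the client registers again; any other non-200 status, or an exception, simply counts as a failed heartbeat.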
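
The decision made by renew() can be reduced to a few lines. This is a minimal sketch under stated assumptions: the class HeartbeatDecision is hypothetical, and the two suppliers stand in for the real HTTP heartbeat call and for register().

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

/** Hypothetical reduction of the client-side renew() decision. */
final class HeartbeatDecision {

    /**
     * @param sendHeartbeat stand-in for the heartbeat call; returns the HTTP status code
     * @param register      stand-in for register(); returns true when registration succeeds
     * @return true when the instance ends up with a live lease on the server
     */
    static boolean renew(IntSupplier sendHeartbeat, BooleanSupplier register) {
        try {
            int status = sendHeartbeat.getAsInt();
            if (status == 404) {
                // The server has no lease for this instance: register again.
                return register.getAsBoolean();
            }
            return status == 200;
        } catch (RuntimeException e) {
            // Any transport failure counts as a failed heartbeat.
            return false;
        }
    }
}
```

Only a true result updates lastSuccessfulHeartbeatTimestamp in the real client, so a 500 response or a network error leaves the timestamp untouched.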

Server-side handling of the renewal request

The endpoint that handles the renewal request on the server is in the InstanceResource class:

    @PUT
    public Response renewLease(
            @HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication,
            @QueryParam("overriddenstatus") String overriddenStatus,
            @QueryParam("status") String status,
            @QueryParam("lastDirtyTimestamp") String lastDirtyTimestamp) {
        boolean isFromReplicaNode = "true".equals(isReplication);
        // Renew the lease
        boolean isSuccess = registry.renew(app.getName(), id, isFromReplicaNode);

        // Renewal failed
        if (!isSuccess) {
            logger.warn("Not Found (Renew): {} - {}", app.getName(), id);
            return Response.status(Status.NOT_FOUND).build();
        }
        // Compare the client's lastDirtyTimestamp with the server's copy; if they differ, the client needs to register again
        Response response = null;
        if (lastDirtyTimestamp != null && serverConfig.shouldSyncWhenTimestampDiffers()) {
            response = this.validateDirtyTimestamp(Long.valueOf(lastDirtyTimestamp), isFromReplicaNode);
            if (response.getStatus() == Response.Status.NOT_FOUND.getStatusCode()
                    && (overriddenStatus != null)
                    && !(InstanceStatus.UNKNOWN.name().equals(overriddenStatus))
                    && isFromReplicaNode) {
                registry.storeOverriddenStatusIfRequired(app.getAppName(), id, InstanceStatus.valueOf(overriddenStatus));
            }
        } else {
            response = Response.ok().build();
        }
        logger.debug("Found (Renew): {} - {}; reply status={}", app.getName(), id, response.getStatus());
        return response;
    }

As the code shows, after a successful renewal the server still compares the lastDirtyTimestamp sent by the client with its own copy. We will not go into that check here. Next, look at the renewal itself:

    public boolean renew(final String appName, final String id, final boolean isReplication) {
        if (super.renew(appName, id, isReplication)) {
            // Replicate the heartbeat to peer nodes
            replicateToPeers(Action.Heartbeat, appName, id, null, null, isReplication);
            return true;
        }
        return false;
    }

Cluster synchronization was covered in the previous article, so it is not repeated here. The core of the renewal is as follows:

    public boolean renew(String appName, String id, boolean isReplication) {
        RENEW.increment(isReplication);
        // Look up the existing lease
        Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
        Lease<InstanceInfo> leaseToRenew = null;
        if (gMap != null) {
            leaseToRenew = gMap.get(id);
        }
        // The lease does not exist
        if (leaseToRenew == null) {
            RENEW_NOT_FOUND.increment(isReplication);
            logger.warn("DS: Registry: lease doesn't exist, registering resource: {} - {}", appName, id);
            return false;
        } else {
            // Get the instance that holds the lease
            InstanceInfo instanceInfo = leaseToRenew.getHolder();
            // Set the instance status
            if (instanceInfo != null) {
                // touchASGCache(instanceInfo.getASGName());
                InstanceStatus overriddenInstanceStatus = this.getOverriddenInstanceStatus(
                        instanceInfo, leaseToRenew, isReplication);
                if (overriddenInstanceStatus == InstanceStatus.UNKNOWN) {
                    logger.info("Instance status UNKNOWN possibly due to deleted override for instance {}"
                            + "; re-register required", instanceInfo.getId());
                    RENEW_NOT_FOUND.increment(isReplication);
                    return false;
                }
                if (!instanceInfo.getStatus().equals(overriddenInstanceStatus)) {
                    logger.info(
                            "The instance status {} is different from overridden instance status {} for instance {}. "
                                    + "Hence setting the status to overridden status", instanceInfo.getStatus().name(),
                                    instanceInfo.getOverriddenStatus().name(),
                                    instanceInfo.getId());
                    // Overwrite the current status
                    instanceInfo.setStatusWithoutDirty(overriddenInstanceStatus);

                }
            }
            renewsLastMin.increment();
            // Update the lease's last-update timestamp
            leaseToRenew.renew();
            return true;
        }
    }

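For readers who have gone through the previous article, the overall flow is fairly simple: look up the lease, return false if it is missing (which the client sees as a 404), otherwise apply any overridden status and refresh the lease timestamp.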
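
To isolate the lookup-and-refresh part of the server-side renewal, here is a minimal sketch. It assumes a nested map of application name to instance id to lease, as in the code above, and deliberately leaves out the overridden-status handling, metrics and peer replication. MiniRegistry and its nested Lease class are hypothetical names, not Eureka classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, stripped-down registry: appName -> (instanceId -> lease). */
final class MiniRegistry {

    static final class Lease {
        final long durationMs;
        volatile long lastUpdateTimestamp;

        Lease(long durationMs, long nowMs) {
            this.durationMs = durationMs;
            this.lastUpdateTimestamp = nowMs;
        }

        /** Renewing a lease only refreshes its last-update timestamp. */
        void renew(long nowMs) {
            lastUpdateTimestamp = nowMs;
        }
    }

    private final Map<String, Map<String, Lease>> registry = new ConcurrentHashMap<>();

    void register(String appName, String id, long durationMs, long nowMs) {
        registry.computeIfAbsent(appName, k -> new ConcurrentHashMap<>())
                .put(id, new Lease(durationMs, nowMs));
    }

    /** Returns false when no lease exists, which the caller would map to a 404. */
    boolean renew(String appName, String id, long nowMs) {
        Map<String, Lease> leases = registry.get(appName);
        Lease lease = (leases == null) ? null : leases.get(id);
        if (lease == null) {
            return false;
        }
        lease.renew(nowMs);
        return true;
    }

    /** Test helper: assumes the lease exists. */
    long lastUpdate(String appName, String id) {
        return registry.get(appName).get(id).lastUpdateTimestamp;
    }
}
```

Passing the current time in as a parameter is a choice made for this sketch so the behaviour can be checked deterministically; the real Lease reads the system clock itself.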

Server-side expired lease cleanup

In the article Eureka application registration and cluster data synchronization source code analysis, we already saw the following line of code:

int registryCount = registry.syncUp();

That line initiates cluster data synchronization. The line that follows it is where the server-side expired lease cleanup begins:

registry.openForTraffic(applicationInfoManager, registryCount);

At the end of the openForTraffic method, postInit is called, and postInit schedules an EvictionTask on a timer. This task is responsible for cleaning up expired leases:

evictionTimer.schedule(evictionTaskRef.get(),
        serverConfig.getEvictionIntervalTimerInMs(),
        serverConfig.getEvictionIntervalTimerInMs());

Take a look at this task:

class EvictionTask extends TimerTask {

   @Override
   public void run() {
       try {
           // Compensation time in milliseconds
           long compensationTimeMs = getCompensationTimeMs();
           logger.info("Running the evict task with compensationTime {}ms", compensationTimeMs);
           // Eviction logic
           evict(compensationTimeMs);
       } catch (Throwable e) {
           logger.error("Could not run the evict task", e);
       }
   }

}

The compensation time is obtained as follows:

long getCompensationTimeMs() {
    long currNanos = getCurrentTimeNano();
    long lastNanos = lastExecutionNanosRef.getAndSet(currNanos);
    if (lastNanos == 0L) {
        // First run: there is no previous execution to compare against
        return 0L;
    }

    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(currNanos - lastNanos);
    // Time since the last execution minus the configured eviction interval
    long compensationTime = elapsedMs - serverConfig.getEvictionIntervalTimerInMs();
    return compensationTime <= 0L ? 0L : compensationTime;
}
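
The compensation time is simply how late the task ran relative to its schedule, so that a delayed run (for example after a GC pause) does not make leases look more expired than they are. A small sketch with concrete numbers may help; CompensationClock is a hypothetical class, the 60,000 ms interval is an assumed value for the example, and the current time is passed in rather than read from the system clock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-alone version of the compensation-time calculation. */
final class CompensationClock {
    private final AtomicLong lastExecutionNanos = new AtomicLong(0L);
    private final long intervalMs;

    CompensationClock(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Returns how many milliseconds later than the interval this run started. */
    long compensationMs(long nowNanos) {
        long last = lastExecutionNanos.getAndSet(nowNanos);
        if (last == 0L) {
            return 0L; // first run, nothing to compensate
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(nowNanos - last);
        return Math.max(0L, elapsedMs - intervalMs);
    }

    public static void main(String[] args) {
        CompensationClock clock = new CompensationClock(60_000L);
        System.out.println(clock.compensationMs(1_000_000_000L));   // first run
        System.out.println(clock.compensationMs(62_500_000_000L));  // 61.5 s after the first run
        System.out.println(clock.compensationMs(121_500_000_000L)); // 59 s after the second run
    }
}
```

In this example the second run starts 61.5 seconds after the first, which is 1,500 ms late, so 1,500 ms is added to every lease's allowance in that run. The third run starts early, so its compensation is clamped to zero.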

Next, look at the core eviction logic:

    public void evict(long additionalLeaseMs) {
        logger.debug("Running the evict task");

        if (!isLeaseExpirationEnabled()) {
            logger.debug("DS: lease expiration is currently disabled.");
            return;
        }

        // 1. Collect all expired leases
        List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
        for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
            Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
            if (leaseMap != null) {
                for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
                    Lease<InstanceInfo> lease = leaseEntry.getValue();
                    if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
                        expiredLeases.add(lease);
                    }
                }
            }
        }

        // 2. Calculate how many leases may be evicted
        int registrySize = (int) getLocalRegistrySize();
        int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
        int evictionLimit = registrySize - registrySizeThreshold;
        int toEvict = Math.min(expiredLeases.size(), evictionLimit);
        
        // 3. Evict
        if (toEvict > 0) {
            logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);

            Random random = new Random(System.currentTimeMillis());
            for (int i = 0; i < toEvict; i++) {
                // Pick a random item (Knuth shuffle algorithm)
                int next = i + random.nextInt(expiredLeases.size() - i);
                Collections.swap(expiredLeases, i, next);
                Lease<InstanceInfo> lease = expiredLeases.get(i);

                String appName = lease.getHolder().getAppName();
                String id = lease.getHolder().getId();
                EXPIRED.increment();
                logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
                internalCancel(appName, id, false);
            }
        }
    }
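
The cap on evictions and the random pick in evict can be tried out in isolation. The sketch below copies the arithmetic and the partial shuffle from the method above into a hypothetical EvictionSelection class; it works on a copy of the input list and takes the Random as a parameter, both of which are choices made for this example rather than something the Eureka code does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper isolating the eviction cap and the random selection. */
final class EvictionSelection {

    /** Number of leases that may be evicted in one run. */
    static int evictionLimit(int registrySize, double renewalPercentThreshold) {
        int registrySizeThreshold = (int) (registrySize * renewalPercentThreshold);
        return registrySize - registrySizeThreshold;
    }

    /** Picks min(expired.size(), limit) items uniformly at random, without repeats. */
    static <T> List<T> pickRandom(List<T> expired, int limit, Random random) {
        List<T> pool = new ArrayList<>(expired);
        int toEvict = Math.min(pool.size(), Math.max(0, limit));
        for (int i = 0; i < toEvict; i++) {
            // Partial Fisher-Yates shuffle: swap a random remaining item into slot i.
            int next = i + random.nextInt(pool.size() - i);
            Collections.swap(pool, i, next);
        }
        return new ArrayList<>(pool.subList(0, toEvict));
    }
}
```

For a registry of 100 instances and a threshold of 0.85 the limit is 15. If 40 leases had expired, 15 of them would be picked at random in this run and the remaining 25 would be left for later runs.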

The eviction process consists of three main steps:

  1. Get all expired leases. Whether a lease has expired is determined by isExpired:

public boolean isExpired(long additionalLeaseMs) {
    return (evictionTimestamp > 0 || System.currentTimeMillis() >
            (lastUpdateTimestamp + duration + additionalLeaseMs));
}

     That is: eviction timestamp (set when the lease is cancelled) > 0 || current time > (last update time + lease duration + compensation time)

  2. Calculate the number of leases that may be evicted. getRenewalPercentThreshold() defaults to 0.85, so each run evicts at most the smaller of the number of expired leases and 15% of all registered instances.
  3. Evict. The leases to evict are picked at random, which prevents a single application from having all of its instances evicted at once. Eviction is the reverse of registration:

    protected boolean internalCancel(String appName, String id, boolean isReplication) {
        try {
            read.lock();
            CANCEL.increment(isReplication);
            Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
            Lease<InstanceInfo> leaseToCancel = null;
            if (gMap != null) {
                leaseToCancel = gMap.remove(id);
            }
            synchronized (recentCanceledQueue) {
                recentCanceledQueue.add(new Pair<Long, String>(System.currentTimeMillis(), appName + "(" + id + ")"));
            }
            InstanceStatus instanceStatus = overriddenInstanceStatusMap.remove(id);
            if (instanceStatus != null) {
                logger.debug("Removed instance id {} from the overridden map which has value {}", id, instanceStatus.name());
            }
            if (leaseToCancel == null) {
                CANCEL_NOT_FOUND.increment(isReplication);
                logger.warn("DS: Registry: cancel failed because Lease is not registered for: {}/{}", appName, id);
                return false;
            } else {
                leaseToCancel.cancel();
                InstanceInfo instanceInfo = leaseToCancel.getHolder();
                String vip = null;
                String svip = null;
                if (instanceInfo != null) {
                    instanceInfo.setActionType(ActionType.DELETED);
                    recentlyChangedQueue.add(new RecentlyChangedItem(leaseToCancel));
                    instanceInfo.setLastUpdatedTimestamp();
                    vip = instanceInfo.getVIPAddress();
                    svip = instanceInfo.getSecureVipAddress();
                }
                invalidateCache(appName, vip, svip);
                logger.info("Cancelled instance {}/{} (replication={})", appName, id, isReplication);
                return true;
            }
        } finally {
            read.unlock();
        }
    }
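
Going back to step 1, the expiry condition is easy to check against concrete numbers. The sketch below restates it as a pure function in a hypothetical LeaseExpiry class, with the current time passed in as a parameter; the 90-second lease duration is an assumed value for the example.

```java
/** Hypothetical pure-function restatement of the lease expiry condition. */
final class LeaseExpiry {

    static boolean isExpired(long evictionTimestamp,
                             long lastUpdateTimestamp,
                             long durationMs,
                             long additionalLeaseMs,
                             long nowMs) {
        // Expired if the lease was already cancelled, or if its allowance has run out.
        return evictionTimestamp > 0
                || nowMs > lastUpdateTimestamp + durationMs + additionalLeaseMs;
    }

    public static void main(String[] args) {
        long duration = 90_000L; // assumed 90-second lease
        // Last renewed at t=0, checked at t=91,000 ms, no compensation.
        System.out.println(isExpired(0L, 0L, duration, 0L, 91_000L));
        // Same lease, but the eviction task ran 1,500 ms late.
        System.out.println(isExpired(0L, 0L, duration, 1_500L, 91_000L));
    }
}
```

The two calls differ only in the compensation time: a lease that is 1,000 ms past its duration counts as expired when the task ran on time, but not when the task itself ran 1,500 ms late.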

