[Learning Source Code Together - Microservices] Netflix Eureka Source Code, Part 12: EurekaServer cluster mode source code analysis

Foreword

Recap of the previous article

The previous article looked at the Eureka registry's self-protection mechanism, as well as the bug mentioned there.

Suddenly it's 2020. This series has been running since 12.17; keeping up the daily learning and blog output has not been easy, but I have gained a lot over this period.

Today I shared this Eureka source code analysis with my group at the company, and judging by the feedback it was worthwhile. Next I will continue with the source code of Feign, Ribbon, and Hystrix, still publishing the articles as a series.

Outline of this article

This article mainly explains data synchronization in EurekaServer cluster mode. The contents are as follows:

  1. Eureka Server cluster mechanism
  2. Registry synchronization for registration, offline (cancellation), and renewal
  3. The three queues behind registry synchronization

Technical highlights:

  1. A three-level queuing mechanism that batches registry synchronization requests

Explanation

Original content is not easy to produce; please credit the source if you reprint!

Blog: 一枝花算不算浪漫 (https://www.cnblogs.com/wang-meng)
WeChat official account: 一枝花算不算浪漫

Source code analysis

Eureka Server cluster mechanism


When an Eureka Server handles a registration, offline (cancellation), or renewal, it synchronizes that information to the other Eureka Server nodes.

As you might guess, this synchronization is certainly not real-time. Let's keep going and look at the registry synchronization mechanism.

Registry synchronization for registration, offline, and renewal

Let's take an Eureka Client registration as an example and see how the Eureka Server synchronizes it to the other nodes.

PeerAwareInstanceRegistryImpl.java :

public void register(final InstanceInfo info, final boolean isReplication) {
    int leaseDuration = Lease.DEFAULT_DURATION_IN_SECS;
    if (info.getLeaseInfo() != null && info.getLeaseInfo().getDurationInSecs() > 0) {
        leaseDuration = info.getLeaseInfo().getDurationInSecs();
    }
    super.register(info, leaseDuration, isReplication);
    replicateToPeers(Action.Register, info.getAppName(), info.getId(), info, null, isReplication);
}

private void replicateToPeers(Action action, String appName, String id,
                                  InstanceInfo info /* optional */,
                                  InstanceStatus newStatus /* optional */, boolean isReplication) {
    Stopwatch tracer = action.getTimer().start();
    try {
        if (isReplication) {
            numberOfReplicationsLastMin.increment();
        }
        // If it is a replication already, do not replicate again as this will create a poison replication
        if (peerEurekaNodes == Collections.EMPTY_LIST || isReplication) {
            return;
        }

        for (final PeerEurekaNode node : peerEurekaNodes.getPeerEurekaNodes()) {
            // If the url represents this host, do not replicate to yourself.
            if (peerEurekaNodes.isThisMyUrl(node.getServiceUrl())) {
                continue;
            }
            replicateInstanceActionsToPeers(action, appName, id, info, newStatus, node);
        }
    } finally {
        tracer.stop();
    }
}

private void replicateInstanceActionsToPeers(Action action, String appName,
                                                 String id, InstanceInfo info, InstanceStatus newStatus,
                                                 PeerEurekaNode node) {
    try {
        InstanceInfo infoFromRegistry = null;
        CurrentRequestVersion.set(Version.V2);
        switch (action) {
            case Cancel:
                node.cancel(appName, id);
                break;
            case Heartbeat:
                InstanceStatus overriddenStatus = overriddenInstanceStatusMap.get(id);
                infoFromRegistry = getInstanceByAppAndId(appName, id, false);
                node.heartbeat(appName, id, infoFromRegistry, overriddenStatus, false);
                break;
            case Register:
                node.register(info);
                break;
            case StatusUpdate:
                infoFromRegistry = getInstanceByAppAndId(appName, id, false);
                node.statusUpdate(appName, id, newStatus, infoFromRegistry);
                break;
            case DeleteStatusOverride:
                infoFromRegistry = getInstanceByAppAndId(appName, id, false);
                node.deleteStatusOverride(appName, id, infoFromRegistry);
                break;
        }
    } catch (Throwable t) {
        logger.error("Cannot replicate information to {} for action {}", node.getServiceUrl(), action.name(), t);
    }
}
  1. After the local registration completes, register() calls replicateToPeers(). Note the isReplication parameter: true means this request is itself a replication from another Eureka Server node; false means the registration came straight from an EurekaClient.
  2. At the start of replicateToPeers(), if isReplication is true the method returns immediately; only instances registered directly by a client need to be propagated to the other nodes, otherwise there is nothing to synchronize (see the sketch after this list).
  3. peerEurekaNodes.getPeerEurekaNodes() returns all the Eureka Server peer nodes; the loop synchronizes the data to each of them by calling replicateInstanceActionsToPeers().
  4. replicateInstanceActionsToPeers() then executes different logic depending on the action: registration, offline (cancel), renewal (heartbeat), and so on.
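
The isReplication guard is what prevents registrations from bouncing between peers forever: a client-originated registration fans out to every peer exactly once, while a registration that arrived as a replication is applied locally and goes no further. Below is a minimal, self-contained sketch of that idea (my own illustration, not Eureka code; all names are made up):

import java.util.ArrayList;
import java.util.List;

// Toy model of the "poison replication" guard.
class PeerNodeSketch {
    final List<PeerNodeSketch> peers = new ArrayList<>();
    int localRegistrations = 0;

    void register(String instanceId, boolean isReplication) {
        localRegistrations++;                  // update the local registry
        if (isReplication) {
            return;                            // came from a peer: never replicate again
        }
        for (PeerNodeSketch peer : peers) {
            peer.register(instanceId, true);   // fan out exactly one hop
        }
    }
}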

The next step is where the synchronization logic is actually implemented. It mainly uses three queues to batch the synchronization requests: the requests are packed into batches, which are then sent to each EurekaServer as HTTP requests.

The three queues behind registry synchronization

At this point we really enter the synchronization logic. We still follow the registration flow above as the main line and continue from the code we left off at:

PeerEurekaNode.java :

public void register(final InstanceInfo info) throws Exception {
    long expiryTime = System.currentTimeMillis() + getLeaseRenewalOf(info);
    batchingDispatcher.process(
            taskId("register", info),
            new InstanceReplicationTask(targetHost, Action.Register, info, null, true) {
                public EurekaHttpResponse<Void> execute() {
                    return replicationClient.register(info);
                }
            },
            expiryTime
    );
}

This executes the batchingDispatcher.process() method. Following it in, we reach the TaskDispatchers.createBatchingTaskDispatcher() method; look at the process() method of the anonymous inner class defined there:

void process(ID id, T task, long expiryTime) {
    // put every incoming request into the acceptorQueue
    acceptorQueue.add(new TaskHolder<ID, T>(id, task, expiryTime));
    acceptedTasks++;
}

Every task that needs to synchronize data is placed into the acceptorQueue.
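
For orientation, here is what each queued TaskHolder carries, reconstructed from how it is used in the code below (the real class is Eureka's TaskHolder; treat the field names here as assumptions):

// Illustrative only: a TaskHolder bundles one replication task with the metadata the batcher needs.
class TaskHolderSketch<ID, T> {
    final ID id;             // e.g. taskId("register", info); later tasks with the same id overwrite earlier ones
    final T task;            // the InstanceReplicationTask to execute against the peer node
    final long expiryTime;   // tasks older than this are dropped and counted as expiredTasks
    final long submitTimestamp = System.currentTimeMillis(); // used to enforce maxBatchingDelay

    TaskHolderSketch(ID id, T task, long expiryTime) {
        this.id = id;
        this.task = task;
        this.expiryTime = expiryTime;
    }
}
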
Now back in createBatchingTaskDispatcher(), look at AcceptorExecutor: its constructor starts a background thread:

ThreadGroup threadGroup = new ThreadGroup("eurekaTaskExecutors");

this.acceptorThread = new Thread(threadGroup, new AcceptorRunner(), "TaskAcceptor-" + id);

We continue with AcceptorRunner.java:

class AcceptorRunner implements Runnable {
    @Override
    public void run() {
        long scheduleTime = 0;
        while (!isShutdown.get()) {
            try {
                // drain the data sitting in the acceptorQueue
                drainInputQueues();

                int totalItems = processingOrder.size();

                long now = System.currentTimeMillis();
                if (scheduleTime < now) {
                    scheduleTime = now + trafficShaper.transmissionDelay();
                }
                if (scheduleTime <= now) {
                    // split processingOrder into batches, then hand them off
                    assignBatchWork();
                    assignSingleItemWork();
                }

                // If no worker is requesting data or there is a delay injected by the traffic shaper,
                // sleep for some time to avoid tight loop.
                if (totalItems == processingOrder.size()) {
                    Thread.sleep(10);
                }
            } catch (InterruptedException ex) {
                // Ignore
            } catch (Throwable e) {
                // Safe-guard, so we never exit this loop in an uncontrolled way.
                logger.warn("Discovery AcceptorThread error", e);
            }
        }
    }

    private void drainInputQueues() throws InterruptedException {
        do {
            drainAcceptorQueue();

            if (!isShutdown.get()) {
                // If all queues are empty, block for a while on the acceptor queue
                if (reprocessQueue.isEmpty() && acceptorQueue.isEmpty() && pendingTasks.isEmpty()) {
                    TaskHolder<ID, T> taskHolder = acceptorQueue.poll(10, TimeUnit.MILLISECONDS);
                    if (taskHolder != null) {
                        appendTaskHolder(taskHolder);
                    }
                }
            }
        } while (!reprocessQueue.isEmpty() || !acceptorQueue.isEmpty() || pendingTasks.isEmpty());
    }

    private void drainAcceptorQueue() {
        while (!acceptorQueue.isEmpty()) {
            // move entries from the acceptorQueue into processingOrder so they can later be split into batches
            appendTaskHolder(acceptorQueue.poll());
        }
    }

    private void appendTaskHolder(TaskHolder<ID, T> taskHolder) {
        if (isFull()) {
            pendingTasks.remove(processingOrder.poll());
            queueOverflows++;
        }
        TaskHolder<ID, T> previousTask = pendingTasks.put(taskHolder.getId(), taskHolder);
        if (previousTask == null) {
            processingOrder.add(taskHolder.getId());
        } else {
            overriddenTasks++;
        }
    }
            
}

Read this code carefully and you can see that the items taken from the acceptorQueue above are put into processingOrder, and processingOrder is itself a queue.

In the run() loop of AcceptorRunner.java, assignBatchWork() is called, and that is where processingOrder gets packed into batches. Let's look at the code:

void assignBatchWork() {
    if (hasEnoughTasksForNextBatch()) {
        if (batchWorkRequests.tryAcquire(1)) {
            long now = System.currentTimeMillis();
            int len = Math.min(maxBatchingSize, processingOrder.size());
            List<TaskHolder<ID, T>> holders = new ArrayList<>(len);
            while (holders.size() < len && !processingOrder.isEmpty()) {
                ID id = processingOrder.poll();
                TaskHolder<ID, T> holder = pendingTasks.remove(id);
                if (holder.getExpiryTime() > now) {
                    holders.add(holder);
                } else {
                    expiredTasks++;
                }
            }
            if (holders.isEmpty()) {
                batchWorkRequests.release();
            } else {
                batchSizeMetric.record(holders.size(), TimeUnit.MILLISECONDS);
                // put the assembled batch into the batchWorkQueue
                batchWorkQueue.add(holders);
            }
        }
    }
}

private boolean hasEnoughTasksForNextBatch() {
    if (processingOrder.isEmpty()) {
        return false;
    }
    // maxBufferSize defaults to 250
    if (pendingTasks.size() >= maxBufferSize) {
        return true;
    }

    TaskHolder<ID, T> nextHolder = pendingTasks.get(processingOrder.peek());
    // maxBatchingDelay defaults to 500ms
    long delay = System.currentTimeMillis() - nextHolder.getSubmitTimestamp();
    return delay >= maxBatchingDelay;
}

The batching rules are defined here: maxBufferSize defaults to 250 and maxBatchingDelay defaults to 500ms; once a batch is assembled it is sent to the server side. As for how it is sent, go back to PeerEurekaNode.java: the register() call at the very beginning invokes PeerEurekaNode.register(), so let's look at its constructor:

PeerEurekaNode(PeerAwareInstanceRegistry registry, String targetHost, String serviceUrl,
                                     HttpReplicationClient replicationClient, EurekaServerConfig config,
                                     int batchSize, long maxBatchingDelayMs,
                                     long retrySleepTimeMs, long serverUnavailableSleepTimeMs) {
    this.registry = registry;
    this.targetHost = targetHost;
    this.replicationClient = replicationClient;

    this.serviceUrl = serviceUrl;
    this.config = config;
    this.maxProcessingDelayMs = config.getMaxTimeForReplication();

    String batcherName = getBatcherName();
    ReplicationTaskProcessor taskProcessor = new ReplicationTaskProcessor(targetHost, replicationClient);
    this.batchingDispatcher = TaskDispatchers.createBatchingTaskDispatcher(
            batcherName,
            config.getMaxElementsInPeerReplicationPool(),
            batchSize,
            config.getMaxThreadsForPeerReplication(),
            maxBatchingDelayMs,
            serverUnavailableSleepTimeMs,
            retrySleepTimeMs,
            taskProcessor
    );
}

A ReplicationTaskProcessor is instantiated here. Following it in, we see that it implements TaskProcessor, so its process() method is what gets executed when a batch is dispatched. It looks like this:

public ProcessingResult process(List<ReplicationTask> tasks) {
    ReplicationList list = createReplicationListOf(tasks);
    try {
        EurekaHttpResponse<ReplicationListResponse> response = replicationClient.submitBatchUpdates(list);
        int statusCode = response.getStatusCode();
        if (!isSuccess(statusCode)) {
            if (statusCode == 503) {
                logger.warn("Server busy (503) HTTP status code received from the peer {}; rescheduling tasks after delay", peerId);
                return ProcessingResult.Congestion;
            } else {
                // Unexpected error returned from the server. This should ideally never happen.
                logger.error("Batch update failure with HTTP status code {}; discarding {} replication tasks", statusCode, tasks.size());
                return ProcessingResult.PermanentError;
            }
        } else {
            handleBatchResponse(tasks, response.getEntity().getResponseList());
        }
    } catch (Throwable e) {
        if (isNetworkConnectException(e)) {
            logNetworkErrorSample(null, e);
            return ProcessingResult.TransientError;
        } else {
            logger.error("Not re-trying this exception because it does not seem to be a network exception", e);
            return ProcessingResult.PermanentError;
        }
    }
    return ProcessingResult.Success;
}

Here the List<ReplicationTask> tasks is sent to the peer server through submitBatchUpdates(). On the receiving server, PeerReplicationResource.batchReplication() handles the batch: it loops over the entries and ultimately calls ApplicationResource.addInstance(), which is the registration method we started from. (Note the return values above: as the log messages indicate, congestion and transient network errors let the batch be rescheduled, while permanent errors discard the tasks.)

That wraps up the EurekaServer synchronization logic. The main data structures involved are the three queues, with data synchronized in batches through a batch list.
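
To make the three-level structure concrete, here is a compact, self-contained sketch of the pattern (my own simplification, not Eureka's classes): producers only touch an acceptor queue; a single background thread deduplicates tasks by id into a pending map plus an ordering queue; batches of up to maxBatchSize tasks are then handed to worker threads through a batch work queue.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ThreeLevelBatcherSketch<ID, T> {
    // level 1: producers (register/cancel/heartbeat calls) only ever touch this queue
    private final BlockingQueue<Map.Entry<ID, T>> acceptorQueue = new LinkedBlockingQueue<>();
    // level 2: a single background thread drains level 1 into these two structures,
    // deduplicating by id (a newer task for the same instance replaces the older one)
    private final Map<ID, T> pendingTasks = new HashMap<>();
    private final Deque<ID> processingOrder = new ArrayDeque<>();
    // level 3: batches that worker threads pick up and send in a single HTTP call
    private final BlockingQueue<List<T>> batchWorkQueue = new LinkedBlockingQueue<>();
    private final int maxBatchSize;

    ThreeLevelBatcherSketch(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    void process(ID id, T task) {
        acceptorQueue.add(Map.entry(id, task));
    }

    void drainAcceptorQueue() {
        for (Map.Entry<ID, T> e; (e = acceptorQueue.poll()) != null; ) {
            if (pendingTasks.put(e.getKey(), e.getValue()) == null) {
                processingOrder.add(e.getKey());   // only brand-new ids keep their place in line
            }
        }
    }

    void assignBatchWork() {
        int len = Math.min(maxBatchSize, processingOrder.size());
        if (len == 0) {
            return;
        }
        List<T> batch = new ArrayList<>(len);
        while (batch.size() < len) {
            batch.add(pendingTasks.remove(processingOrder.poll()));
        }
        batchWorkQueue.add(batch);   // workers take from here and call the peer once per batch
    }
}

The real AcceptorExecutor additionally has a reprocessQueue for failed batches and a TrafficShaper that delays dispatch after congestion or network errors, but the data flow is the same: acceptorQueue -> pendingTasks/processingOrder -> batchWorkQueue.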

Note one very important point: the Client registers by calling the addInstance() method, and on the server side it is PeerAwareInstanceRegistryImpl that performs the synchronization to the other EurekaServer nodes.

The interface an EurekaServer uses to replicate a registration to its peers is still addInstance(). Doesn't that create an endless loop of calls? Certainly not: addInstance() also takes an isReplication parameter, and at the end it calls the following server-side method: registry.register(info, "true".equals(isReplication));
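
For reference, the receiving endpoint looks roughly like this trimmed sketch (simplified from eureka-core's ApplicationResource; annotations and validation are abridged and details may vary by version). The only point that matters here is how the replication header becomes the boolean handed to registry.register():

// Trimmed, illustrative view -- not the complete method.
@POST
@Consumes({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML})
public Response addInstance(InstanceInfo info,
                            @HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication) {
    // ... field validation omitted in this sketch ...
    registry.register(info, "true".equals(isReplication));
    return Response.status(204).build();  // 204 to be backwards compatible
}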

As we know, when an EurekaClient registers it does not send isReplication (it is null), so the flag ends up false. When a peer Server replicates, however, the call looks like this:

PeerReplicationResource:

private static Builder handleRegister(ReplicationInstance instanceInfo, ApplicationResource applicationResource) {
    applicationResource.addInstance(instanceInfo.getInstanceInfo(), REPLICATION);
    return new Builder().setStatusCode(Status.OK.getStatusCode());
}

Here REPLICATION is true.

In addition, AbstractJersey2EurekaHttpClient has an addExtraHeaders() method that is applied when the register request is sent, as shown below:

(figure: the addExtraHeaders() method in AbstractJersey2EurekaHttpClient)

If the request is sent through Jersey2ReplicationClient, the x-netflix-discovery-replication header is set to true, and when addInstance() is executed later on the receiving side it picks this header up as its isReplication parameter.
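
The screenshot above shows that override. From memory it looks roughly like the following (the parameter type may differ slightly by Jersey version), so treat this as a sketch rather than a verbatim copy:

// In Jersey2ReplicationClient: mark outgoing requests as server-to-server replication.
// AbstractJersey2EurekaHttpClient invokes addExtraHeaders() before sending register() and friends.
@Override
protected void addExtraHeaders(Invocation.Builder webResource) {
    webResource.header(PeerEurekaNode.HEADER_REPLICATION, "true");
}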

Summary

As usual, here is a flow chart; everything analysed in the text is captured in this single figure:

(figure: Eureka registry cluster synchronization flow)

Declaration

This article was first published on my blog: https://www.cnblogs.com/wang-meng and on the WeChat official account 一枝花算不算浪漫. Please credit the source if you reprint!

If you are interested, feel free to follow my personal WeChat official account: 一枝花算不算浪漫

