Nacos Registration Center AP Architecture Analysis Flowchart

 

Nacos registration center principle and core source code analysis

Advance Group

1. Introduction

Since Spring Cloud Netflix entered maintenance mode on December 12, 2018, many developers have shifted their attention to Spring Cloud Alibaba, and Alibaba has increased its investment in the Spring Cloud ecosystem, so Spring Cloud Alibaba is now very popular in China. This article shares some of the core principles of the registration center module of Nacos (version 1.4.2), a very important component of Spring Cloud Alibaba.

The content of this article is as follows:

  • Nacos core concepts
  • Nacos Server registry structure
  • Nacos Client integrates with Spring Cloud
  • How to support high-concurrency registration (asynchronous task and memory queue design principle and source code analysis)
  • How the registry prevents multi-node read-write concurrency conflicts (an implementation of the Copy On Write idea)
  • Heartbeat mechanism and service health check design principle and source code analysis
  • Design principle and source code analysis of heartbeat detection under cluster architecture
  • Service change event publishing mechanism and source code analysis
  • Service offline mechanism and source code analysis

Part of the core function source code map of the registration center

2. Nacos core concepts

  • Namespace: Used for tenant-level isolation. The same Group or Data ID can exist under different namespaces. A common use is to isolate resources (such as configurations and services) between environments, for example development, test, and production. If no namespace is configured, the default is public.
  • Group: Service grouping. Different services can be placed in the same group.
  • Service: The service identifier, which uniquely identifies the service it refers to. By default, the value of ${spring.application.name} is used as the service name.
  • Cluster: All instances of the same service form a default cluster. A cluster can be further divided as needed, and the unit of division can be a virtual cluster.
  • Instance: A process with an accessible network address (IP:Port) that provides one or more services.

Illustration of the above concept:

Usage example:

The figure above is a usage example from a real project I worked on previously. Every company has its own division rules, so the Nacos registry design should be applied flexibly according to the company's actual business situation.

3. Registry structure of Nacos Server

When it comes to service registration, the first thing to look at is how the registry is structured. The Nacos registry is a double Map structure, defined as follows:

The comments in the source code already explain the storage structure of this double Map. The key of the outer Map is the Namespace and its value is another Map. The key of the inner Map is group::serviceName and its value is a Service object. The Service object in turn contains a Map, as follows:

The key of this Map is the Cluster name and the value is a Cluster object. The Cluster object holds two Sets that store Instance objects. An Instance is the actual instance information registered by the client.

One of the two Sets stores ephemeral (temporary) instances and the other stores persistent instances. Which Set an instance goes into depends on the client configuration. By default the client is configured with ephemeral=true. To store the instance persistently, set ephemeral=false; the parameter is then sent to Nacos Server when the client registers, and Nacos Server stores the instance persistently.

Note: Nacos currently uses local file storage for persistent storage.
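
The layout described above can be sketched roughly as follows. This is a simplified model written for illustration, not the Nacos source: the class name RegistrySketch, the register method, and the field names are assumptions, the group::serviceName key format is taken from the text above, and thread safety of the two Sets is ignored.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, hypothetical model of the double Map registry; not the Nacos source.
class RegistrySketch {

    static final class Instance {
        final String ip;
        final int port;
        final boolean ephemeral;

        Instance(String ip, int port, boolean ephemeral) {
            this.ip = ip;
            this.port = port;
            this.ephemeral = ephemeral;
        }
    }

    static final class Cluster {
        // One Set per instance type, as described in the text.
        final Set<Instance> ephemeralInstances = new HashSet<>();
        final Set<Instance> persistentInstances = new HashSet<>();
    }

    static final class Service {
        // Key: cluster name, value: Cluster object.
        final Map<String, Cluster> clusterMap = new ConcurrentHashMap<>();
    }

    // Outer key: namespace. Inner key: group::serviceName.
    final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    void register(String namespace, String group, String serviceName,
                  String clusterName, Instance instance) {
        Service service = serviceMap
                .computeIfAbsent(namespace, ns -> new ConcurrentHashMap<>())
                .computeIfAbsent(group + "::" + serviceName, key -> new Service());
        Cluster cluster = service.clusterMap.computeIfAbsent(clusterName, name -> new Cluster());
        // The ephemeral flag sent by the client decides which Set receives the instance.
        if (instance.ephemeral) {
            cluster.ephemeralInstances.add(instance);
        } else {
            cluster.persistentInstances.add(instance);
        }
    }
}
```

Looking up an instance therefore walks four levels: namespace, group::serviceName, cluster name, and finally one of the two Sets.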

4. Nacos Client and Spring Cloud integration

Client service registration

Nacos 1.x communicates over HTTP; gRPC was added in version 2.0. This article uses version 1.4.2, so the analysis is based on HTTP. The service registration interface is /nacos/v1/ns/instance, and its full source path is com.alibaba.nacos.naming.controllers.InstanceController#register.

Nacos provides several client integration methods. This article analyzes the source code of the Spring Cloud integration. Readers interested in the other methods can find them on the Nacos official website.

4.1 Add dependencies

4.2 Configure Nacos server address

Readers familiar with Spring Boot will recognize the spring.factories file, which is where auto-configuration classes are declared. The only entry to focus on here is NacosServiceRegistryAutoConfiguration. This class registers three beans: NacosServiceRegistry, NacosRegistration, and NacosAutoServiceRegistration. The Nacos client relies mainly on these classes for service registration and discovery. First, look at the class hierarchy of NacosAutoServiceRegistration:

Careful readers may have noticed that AbstractAutoServiceRegistration, the parent class of NacosAutoServiceRegistration, implements the ApplicationListener interface, so it must listen for some event. Sure enough, the following code is found in AbstractAutoServiceRegistration:

The source above shows that a register method is eventually called; this is the method that actually registers the current instance with Nacos Server.

The source shows that the reqApi method is called in the end: it sends a POST request to the /nacos/v1/ns/instance interface of Nacos Server to register the current instance. This completes the analysis of the client's core registration flow.
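
To make the shape of that request concrete, here is a minimal sketch that only builds (and does not send) a registration request with the JDK's java.net.http API. It is not the client's reqApi code: the class and method names are hypothetical, and the query parameter names (serviceName, ip, port, ephemeral) are assumptions based on the Nacos 1.x Open API, so check them against the official documentation before relying on them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; it only constructs the request object.
class RegisterRequestSketch {

    static HttpRequest buildRegisterRequest(String serverAddr, String serviceName,
                                            String ip, int port, boolean ephemeral) {
        // Parameter names are assumptions based on the Nacos 1.x Open API.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("serviceName", serviceName);
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", String.valueOf(ephemeral));

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }

        URI uri = URI.create("http://" + serverAddr + "/nacos/v1/ns/instance?" + query);
        // Registration uses the POST method on this path.
        return HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

Sending the request would be a matter of passing it to an HttpClient; that step is left out here because it needs a running Nacos Server.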


Client service discovery

The service discovery interface is /nacos/v1/ns/instance/list, and its full source path is com.alibaba.nacos.naming.controllers.InstanceController#list.

spring.factories also declares the NacosDiscoveryClientConfiguration class, which registers a NacosWatch bean with Spring. The class diagram of NacosWatch is as follows:


As the figure shows, this class implements Lifecycle, a lifecycle interface provided by Spring. When a bean implements this interface, its start() method is called back after Spring has loaded and initialized all beans. In this method the service is pulled and the local cache is updated. The code is as follows:

The source shows that the serverProxy.queryList method is called at the end. This method also sends an HTTP request, calling the /nacos/v1/ns/instance/list interface of Nacos Server to pull the service.

This completes the source-level analysis of how the Spring Cloud Nacos client pulls services. The code is relatively simple: it builds the parameters required by the list interface and then sends an HTTP request to pull the service.

Note the call to scheduleUpdateIfAbsent in the source. It submits an UpdateTask, a class that implements the Runnable interface. The main code is as follows:

The source shows that this code pulls the service on a timer, every 10s (the interval is returned by the /nacos/v1/ns/instance/list interface). One clever design detail is worth mentioning: in the updateServiceNow method, a UDP port is passed in when calling the server's /nacos/v1/ns/instance/list interface. If Nacos Server detects a change in the Service, it uses this port to notify the clients that have subscribed to that Service.
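
The timed pull can be pictured as a task that reschedules itself after every run. The sketch below is an illustration under assumptions, not the Nacos UpdateTask: the class name, the PullResult type, and the fetcher function (which stands in for the HTTP call to the list interface) are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical self-rescheduling pull task; not the Nacos UpdateTask.
class UpdateTaskSketch implements Runnable {

    /** Result of one pull: the host list plus the delay before the next pull. */
    static final class PullResult {
        final List<String> hosts;
        final long cacheMillis;

        PullResult(List<String> hosts, long cacheMillis) {
            this.hosts = hosts;
            this.cacheMillis = cacheMillis;
        }
    }

    private static final long DEFAULT_DELAY_MS = 10_000L;

    private final String serviceName;
    private final Function<String, PullResult> fetcher; // stands in for the HTTP call
    private final Map<String, List<String>> localCache;
    private final ScheduledExecutorService executor;

    UpdateTaskSketch(String serviceName, Function<String, PullResult> fetcher,
                     Map<String, List<String>> localCache, ScheduledExecutorService executor) {
        this.serviceName = serviceName;
        this.fetcher = fetcher;
        this.localCache = localCache;
        this.executor = executor;
    }

    @Override
    public void run() {
        long delay = DEFAULT_DELAY_MS; // used when the pull fails
        try {
            PullResult result = fetcher.apply(serviceName);
            localCache.put(serviceName, result.hosts); // refresh the local cache
            delay = result.cacheMillis;                // interval dictated by the server
        } catch (RuntimeException e) {
            // Keep the previous cache entry and try again on the next run.
        } finally {
            if (!executor.isShutdown()) {
                executor.schedule(this, delay, TimeUnit.MILLISECONDS);
            }
        }
    }
}
```

Rescheduling from inside run(), rather than using a fixed-rate schedule, lets each response decide when the next pull happens.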


5. How to support high-concurrency registration (asynchronous task and memory queue design principle and source code analysis)


So far we have analyzed the service registration and service pull logic of the Spring Cloud Nacos client. Next we analyze the core logic and source code of the Nacos Server registration center, starting with how Nacos supports high-concurrency Instance registration.

The short answer: service registration goes through an in-memory queue. When a client registers with Nacos Server, the server does not write the information into the registry synchronously. Instead it first writes the information to an in-memory queue, and a separate thread pool consumes the queue to perform the registration.

The source code is as follows:

The source shows that listener.onChange() is eventually executed with the Instances passed in, and the real registration logic runs there. This design increases the number of concurrent registrations that Nacos Server can handle. If you are concerned about Nacos performance, see the official stress test report ( https://nacos.io/zh-cn/docs/nacos-naming-benchmark.html ) or run a stress test yourself.

Note also that the queue is consumed by a JDK thread pool. The code that instantiates the thread pool is:
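
The queue-then-consume pattern can be sketched as below. This is a minimal illustration and not the Nacos implementation: the class name, the String[] task format, the queue capacity, and the plain Map used as a registry are assumptions, and the put call merely stands in for listener.onChange().

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of asynchronous registration through an in-memory queue.
class NotifierSketch implements Runnable {

    // Bounded in-memory queue; the capacity here is an arbitrary choice.
    private final BlockingQueue<String[]> tasks = new ArrayBlockingQueue<>(1024);

    // Stand-in for the registry: service key -> latest registered instance.
    final Map<String, String> registry = new ConcurrentHashMap<>();

    /** Called on the request thread: enqueue and return immediately. */
    boolean addTask(String serviceKey, String instance) {
        return tasks.offer(new String[] {serviceKey, instance});
    }

    /** Runs on a separate consumer thread and performs the actual registration. */
    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String[] task = tasks.take();   // blocks until a task arrives
                registry.put(task[0], task[1]); // stands in for listener.onChange(...)
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and exit
        }
    }
}
```

The request thread only pays the cost of an offer(), so a burst of registrations is absorbed by the queue while the consumer works through it at its own pace.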

6. How does the registry prevent multi-node read and write concurrent conflicts (implementation of the idea of Copy On Write)

How does Nacos Server resolve read-write concurrency conflicts when writing Instances to the registry?
The answer is the Copy On Write idea. The source code is as follows:

There is a lot of source code here and it is hard to follow, so I added annotations to it. In short, the updateIps method receives a List<Instance> ips and compares it with the Instances currently in the registry to work out which instances need to be added, updated, or deleted. It then performs the related operations, such as setting some Instance properties and starting or removing heartbeat checks, and finally replaces the in-memory registry directly with the processed List<Instance> ips. A read request that arrives at the same time reads the old registry data, so concurrent read-write conflicts are avoided. This is the Copy On Write idea. The JDK's concurrency package (java.util.concurrent) also contains implementations of this idea, such as CopyOnWriteArrayList.
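
A stripped-down version of the idea might look like this. It is a sketch under assumptions rather than the Nacos updateIps method: instances are plain strings, only removals are computed, and the class and method names other than updateIps are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical copy-on-write holder for the instances of one cluster.
class CopyOnWriteClusterSketch {

    // Readers always see a complete, unmodifiable snapshot.
    private volatile Set<String> instances = Collections.emptySet();

    Set<String> allInstances() {
        return instances;
    }

    /** Builds a new set from the incoming list, then swaps it in; returns the removed entries. */
    List<String> updateIps(List<String> ips) {
        Set<String> old = instances;
        Set<String> updated = new HashSet<>(ips); // work on a fresh copy, never on the live set

        List<String> removed = new ArrayList<>();
        for (String ip : old) {
            if (!updated.contains(ip)) {
                removed.add(ip); // real code would also stop the health check here
            }
        }

        // A single reference assignment publishes the new registry state.
        instances = Collections.unmodifiableSet(updated);
        return removed;
    }
}
```

A reader that obtained the old snapshot before the swap keeps iterating over it safely; it simply sees slightly stale data until its next read.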

7. Heartbeat mechanism and service health check design principle and source code analysis

The client calls the /nacos/v1/ns/instance/beat interface to send heartbeats. The main logic is:
7.1 If the corresponding Instance is not found in Nacos Server, construct a new Instance. The source code is as follows:

7.2 If the Instance already exists in Nacos Server, update lastBeat, healthy, and so on. The source code is as follows:

The server also starts a thread that checks the clients' heartbeat information to determine whether each client is alive. How does Nacos start heartbeat detection, and how does it detect heartbeats?
See the source code:

The default heartbeat timeout is 15s. If the lastBeat time of an Instance is more than 15s behind the current time, the Instance is judged unhealthy: its healthy flag is set to false, then a service change event is published, followed by a heartbeat timeout event.
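
The timeout rule can be sketched as follows. This is an illustration only, not the Nacos health check task: the class, the Instance fields, and the method names are assumptions, and event publishing is reduced to returning the list of instances whose state changed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the heartbeat timeout rule.
class BeatCheckSketch {

    static final long HEART_BEAT_TIMEOUT_MS = 15_000L; // 15s default from the text

    static final class Instance {
        final String address;
        volatile long lastBeat;
        volatile boolean healthy = true;

        Instance(String address, long lastBeat) {
            this.address = address;
            this.lastBeat = lastBeat;
        }
    }

    /** Heartbeat received: refresh lastBeat and mark the instance healthy again. */
    static void onBeat(Instance instance, long now) {
        instance.lastBeat = now;
        instance.healthy = true;
    }

    /** Marks timed-out instances unhealthy and returns those whose state changed. */
    static List<Instance> check(List<Instance> instances, long now) {
        List<Instance> changed = new ArrayList<>();
        for (Instance instance : instances) {
            if (instance.healthy && now - instance.lastBeat > HEART_BEAT_TIMEOUT_MS) {
                instance.healthy = false;
                changed.add(instance); // a real server would publish change events here
            }
        }
        return changed;
    }
}
```

For example, with the current time at 16000 ms, an instance whose lastBeat is 0 is marked unhealthy, while one whose lastBeat is 10000 stays healthy.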


8. Design principle and source code analysis of heartbeat detection under cluster architecture


Having analyzed the heartbeat detection code, one more design detail deserves mention. When Nacos runs as a cluster, not every node checks the heartbeat of all Instances. Instead, an algorithm determines which Instances each node is responsible for. Here is how it is calculated in the source:

Roughly, the algorithm computes a hash of the serviceName and takes it modulo the number of machines in the Nacos cluster. If the result falls between index and lastIndex, the positions of the current Nacos Server node in the cluster list, the current node performs the heartbeat detection.
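
The calculation described above can be sketched like this. The code is a reconstruction from the description, not a copy of the Nacos source: the class name, the way the hash is made non-negative, and the fallback when the local node is missing from the list are assumptions.

```java
import java.util.List;

// Hypothetical sketch of "which node checks which service".
class DistroResponsibleSketch {

    /** Non-negative hash of the service name (the exact formula is an assumption). */
    static int distroHash(String serviceName) {
        return Math.abs(serviceName.hashCode() % Integer.MAX_VALUE);
    }

    /** Returns true if the node at localAddress should run health checks for serviceName. */
    static boolean responsible(List<String> servers, String localAddress, String serviceName) {
        if (servers.isEmpty()) {
            return false;
        }
        int index = servers.indexOf(localAddress);
        int lastIndex = servers.lastIndexOf(localAddress);
        if (index < 0 || lastIndex < 0) {
            return true; // assumption: a node missing from the list handles the service itself
        }
        // Hash modulo cluster size selects one position in the server list.
        int target = distroHash(serviceName) % servers.size();
        return target >= index && target <= lastIndex;
    }
}
```

When every address appears once in the list, index equals lastIndex, so exactly one node is responsible for any given service name.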


9. Service change event release mechanism and source code analysis


As mentioned above, Nacos Server exposes the interface /nacos/v1/ns/instance/list, which returns the Instance list for a given serviceName, and the client polls it periodically. If an Instance in the Service changes, the client notices on its next call, but with a delay; the default pull interval is 10s. Nacos handles this detail well: Nacos Client does not rely solely on the timed task to detect Service changes. To reduce the delay, it uses UDP change notifications. When the client calls the /nacos/v1/ns/instance/list interface, it passes in a UDP port. Inside the interface, the clients that subscribe to the Service are added to com.alibaba.nacos.naming.push.PushService#clientMap. When an Instance in the Service changes, the server looks up the clients subscribed to that Service and notifies them over UDP.
The source code is as follows:
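
The subscribe-then-push flow can be sketched as below. This is an illustration under assumptions and not the PushService source: apart from clientMap, every name is hypothetical, the payload is an arbitrary string, and the sketch only builds the UDP packets without sending them.

```java
import java.net.DatagramPacket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of tracking subscribers and preparing UDP notifications.
class PushSketch {

    // serviceName -> UDP endpoints of the clients that pulled this service
    private final Map<String, Set<InetSocketAddress>> clientMap = new ConcurrentHashMap<>();

    /** Called while handling the list interface: remember the caller's UDP endpoint. */
    void addClient(String serviceName, String clientIp, int udpPort) {
        if (udpPort <= 0) {
            return; // the client did not supply a UDP port, so it cannot be pushed to
        }
        clientMap.computeIfAbsent(serviceName, key -> ConcurrentHashMap.newKeySet())
                 .add(new InetSocketAddress(clientIp, udpPort));
    }

    /** Called when a service changes: build one packet per subscribed client. */
    List<DatagramPacket> buildPushPackets(String serviceName, String payload) {
        byte[] data = payload.getBytes(StandardCharsets.UTF_8);
        List<DatagramPacket> packets = new ArrayList<>();
        for (InetSocketAddress client : clientMap.getOrDefault(serviceName, Collections.emptySet())) {
            packets.add(new DatagramPacket(data, data.length, client.getAddress(), client.getPort()));
        }
        return packets;
    }
}
```

Each packet could then be sent with DatagramSocket#send. Because UDP gives no delivery guarantee, the periodic pull described earlier remains the fallback.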

10. Service offline mechanism and source code analysis


The service offline interface is /nacos/v1/ns/instance, the same path as registration, but with the HTTP method DELETE. Roughly, the logic is: first obtain a copy of the Instances for the given namespaceId and serviceName from the registry, then delete the instance going offline from the copy, and finally replace the corresponding data in the registry with the processed copy. The first half of the source builds the copy of Instances with the instance removed; the rest is the same as the registration logic. The source code is as follows:

Origin blog.csdn.net/m0_70793154/article/details/126977680