Source code reading | Young people may skip martial ethics, but you have to be able to read the Nacos source code

Why do I read source code so often? Because reading source code brings you closer to the experts. Haha, I'm only half joking.

This article walks you through the Nacos source code and shows you some techniques for reading source code along the way. Let's start!

First, here is a high-resolution map of the source code that I put together, so that you can get an overall picture of the Nacos source code.

[Figure: map of the Nacos source code]

With this map in hand, the Nacos source code is much easier to follow.

How to find the entry point

First of all, we need an entry point into the Nacos source code, so let's start with the Nacos dependency:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
</dependency>

Open this dependency's POM file and you will find that it depends on another component:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-alibaba-nacos-discovery</artifactId>
</dependency>

Inside that dependency, we find the following:

[Figure: contents of the spring-cloud-alibaba-nacos-discovery dependency, including spring.factories]

In this picture we spot a familiar configuration file, spring.factories, which is the file that Spring Boot auto-configuration relies on:

org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
  com.alibaba.cloud.nacos.discovery.NacosDiscoveryAutoConfiguration,\
  com.alibaba.cloud.nacos.ribbon.RibbonNacosAutoConfiguration,\
  com.alibaba.cloud.nacos.endpoint.NacosDiscoveryEndpointAutoConfiguration,\
  com.alibaba.cloud.nacos.registry.NacosServiceRegistryAutoConfiguration,\
  com.alibaba.cloud.nacos.discovery.NacosDiscoveryClientConfiguration,\
  com.alibaba.cloud.nacos.discovery.reactive.NacosReactiveDiscoveryClientConfiguration,\
  com.alibaba.cloud.nacos.discovery.configclient.NacosConfigServerAutoConfiguration
org.springframework.cloud.bootstrap.BootstrapConfiguration=\
  com.alibaba.cloud.nacos.discovery.configclient.NacosDiscoveryClientConfigServiceBootstrapConfiguration

Because this article focuses on the service registration source code, we only need to look at the NacosServiceRegistryAutoConfiguration auto-configuration class:

public class NacosServiceRegistryAutoConfiguration {

	@Bean
	public NacosServiceRegistry nacosServiceRegistry(
			NacosDiscoveryProperties nacosDiscoveryProperties) {
		return new NacosServiceRegistry(nacosDiscoveryProperties);
	}

	@Bean
	@ConditionalOnBean(AutoServiceRegistrationProperties.class)
	public NacosRegistration nacosRegistration(
			NacosDiscoveryProperties nacosDiscoveryProperties,
			ApplicationContext context) {
		return new NacosRegistration(nacosDiscoveryProperties, context);
	}

	@Bean
	@ConditionalOnBean(AutoServiceRegistrationProperties.class)
	public NacosAutoServiceRegistration nacosAutoServiceRegistration(
			NacosServiceRegistry registry,
			AutoServiceRegistrationProperties autoServiceRegistrationProperties,
			NacosRegistration registration) {
		return new NacosAutoServiceRegistration(registry,
				autoServiceRegistrationProperties, registration);
	}

}

What we see are three bean declarations. Here is a little trick for reading source code: among the beans declared in an auto-configuration class, look first at the one with "Auto" in its name, because it is often the entry point. Here that is NacosAutoServiceRegistration, so let's open it and see what is inside:

	@Override
	protected void register() {
		if (!this.registration.getNacosDiscoveryProperties().isRegisterEnabled()) {
			log.debug("Registration disabled.");
			return;
		}
		if (this.registration.getPort() < 0) {
			this.registration.setPort(getPort().get());
		}
		super.register();
	}

It contains a register() method. My guess is that this is the entry point for registration, so I set a breakpoint here and start a service in debug mode to see whether this method gets called:

Client registration

Here is a screenshot of the call chain leading to the register method, captured in the debugger:

[Figure: debugger call chain of the register method]

In this call chain there is an onApplicationEvent callback, which belongs to the class AbstractAutoServiceRegistration.
This class implements the ApplicationListener interface. After Spring starts, it publishes an event through its event multicaster, which then calls back the onApplicationEvent method of the listening components. We start the analysis from this method:

public void onApplicationEvent(WebServerInitializedEvent event) {
    bind(event); // bind the port and start
}

@Deprecated
public void bind(WebServerInitializedEvent event) {
    // set the port
    this.port.compareAndSet(0, event.getWebServer().getPort());
    // start the client registration component
    this.start();
}

public void start() {
    // branch code omitted
    // call register
    register();
}
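
The callback chain above can be modelled in plain Java without Spring. The following is only a simplified sketch: AutoRegistrationSketch, onWebServerInitialized and the registered list are hypothetical names invented for illustration, and the branch code that the excerpt omits is left out here as well. Only the compareAndSet(0, port) idiom is taken from the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the onApplicationEvent -> bind -> start -> register chain.
// All names are illustrative; this is not the Spring Cloud implementation.
class AutoRegistrationSketch {
    // 0 means "port not known yet", mirroring the compareAndSet(0, ...) call above
    private final AtomicInteger port = new AtomicInteger(0);
    // records every registration so that the effect can be observed
    final List<String> registered = new ArrayList<>();

    // callback invoked once the (pretend) web server is up
    void onWebServerInitialized(int webServerPort) {
        bind(webServerPort);
    }

    private void bind(int webServerPort) {
        // only the first event sets the port; later events leave it unchanged
        port.compareAndSet(0, webServerPort);
        start();
    }

    private void start() {
        register();
    }

    private void register() {
        registered.add("instance:" + port.get());
    }

    int getPort() {
        return port.get();
    }

    public static void main(String[] args) {
        AutoRegistrationSketch sketch = new AutoRegistrationSketch();
        sketch.onWebServerInitialized(8080);
        System.out.println(sketch.registered);
    }
}
```

The point of compareAndSet(0, ...) is that the port is written only while it still holds its initial value, so a second event cannot overwrite it.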

Spring Cloud supports a variety of registry implementations, but since we only depend on the Nacos registry here, the call goes straight to the register method of NacosServiceRegistry:

public void register(Registration registration) {
    // branch code omitted
    // get the service id
    String serviceId = registration.getServiceId();
    // get the group configuration
    String group = nacosDiscoveryProperties.getGroup();
    // wrap the registration in a service instance
    Instance instance = getNacosInstanceFromRegistration(registration);
    // call the naming service's registerInstance method to register the instance
    namingService.registerInstance(serviceId, group, instance);
}

Step into the registerInstance method:

    public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
        if (instance.isEphemeral()) {
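            // branch code omitted
            // set up the heartbeat with the server; by default a heartbeat packet is sent every 5 seconds
            this.beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
        }
        // send the registration request to the server over HTTP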
        this.serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
    }
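
To make the heartbeat idea concrete, here is a minimal sketch of a periodic task built on ScheduledExecutorService. It is an assumption-laden illustration, not the BeatReactor implementation: HeartbeatSketch and awaitBeats are hypothetical names, the "beat" is just a Runnable, and the real client may schedule its beats differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative periodic heartbeat; not the BeatReactor implementation.
class HeartbeatSketch {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "heartbeat-sketch");
                t.setDaemon(true); // do not keep the JVM alive just for heartbeats
                return t;
            });

    // Runs sendBeat immediately and then once per period.
    void start(long periodMillis, Runnable sendBeat) {
        executor.scheduleAtFixedRate(() -> {
            try {
                sendBeat.run();
            } catch (RuntimeException e) {
                // swallow the failure so that later beats are still scheduled
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        executor.shutdownNow();
    }

    // Helper: starts a heartbeat and waits until `beats` beats were sent or the timeout expires.
    static boolean awaitBeats(int beats, long periodMillis, long timeoutMillis) {
        HeartbeatSketch heartbeat = new HeartbeatSketch();
        CountDownLatch latch = new CountDownLatch(beats);
        heartbeat.start(periodMillis, latch::countDown);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            heartbeat.stop();
        }
    }

    public static void main(String[] args) {
        // a short period keeps the demo quick; the default mentioned above would be 5000 ms
        System.out.println(awaitBeats(3, 100, 2000));
    }
}
```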

serverProxy sends the request to the server endpoint "/nacos/v1/ns/instance" by calling reqAPI, a method that wraps the HTTP call:

    public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
        LogUtils.NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", this.namespaceId, serviceName, instance);
        Map<String, String> params = new HashMap<>(9);
        params.put("namespaceId", this.namespaceId);
        params.put("serviceName", serviceName);
        params.put("groupName", groupName);
        params.put("clusterName", instance.getClusterName());
        params.put("ip", instance.getIp());
        params.put("port", String.valueOf(instance.getPort()));
        params.put("weight", String.valueOf(instance.getWeight()));
        params.put("enable", String.valueOf(instance.isEnabled()));
        params.put("healthy", String.valueOf(instance.isHealthy()));
        params.put("ephemeral", String.valueOf(instance.isEphemeral()));
        params.put("metadata", JSON.toJSONString(instance.getMetadata()));
        this.reqAPI(UtilAndComs.NACOS_URL_INSTANCE, params, "POST");
    }
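
A parameter map like the one above has to be turned into a form-encoded string before it can travel in an HTTP request. The sketch below shows one way to do that with the JDK's URLEncoder. RegisterParamsSketch and encode are hypothetical names, the sample values are made up, and how the Nacos client itself encodes the parameters is not shown in the excerpt.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative form encoding of a parameter map; not the Nacos client code.
class RegisterParamsSketch {
    // Joins the entries as key=value pairs separated by '&', URL-encoding both parts.
    static String encode(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(urlEncode(entry.getKey()) + "=" + urlEncode(entry.getValue()));
        }
        return joiner.toString();
    }

    private static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("serviceName", "demo");
        params.put("port", "8080");
        params.put("metadata", "{\"zone\":\"a\"}");
        System.out.println(encode(params));
    }
}
```

Encoding matters mostly for values such as the metadata JSON, whose braces, quotes and colons are not legal in a query string as-is.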

Nacos is usually deployed as a cluster, so how does the client pick a node to send the request to? There must be some load-balancing logic. Let's open reqAPI to see how it is implemented:

            if (servers != null && !servers.isEmpty()) {
                Random random = new Random(System.currentTimeMillis());
                // pick a random start index; servers holds the addresses of all Nacos nodes
                int index = random.nextInt(servers.size());
                // try each node in turn, starting at index: return as soon as a call succeeds, otherwise move on to the next node until one succeeds
                for(int i = 0; i < servers.size(); ++i) {
                    String server = (String)servers.get(index);

                    try {
                        return this.callServer(api, params, server, method);
                    } catch (NacosException var11) {
                        exception = var11;
                        LogUtils.NAMING_LOGGER.error("request {} failed.", server, var11);
                    } catch (Exception var12) {
                        exception = var12;
                        LogUtils.NAMING_LOGGER.error("request {} failed.", server, var12);
                    }
                    // advance index by 1; the modulo keeps it within bounds
                    index = (index + 1) % servers.size();
                }

                throw new IllegalStateException("failed to req API:" + api + " after all servers(" + servers + ") tried: " + ((Exception)exception).getMessage());
            }
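
The same "random start, then walk the ring" strategy can be written as a small standalone method. This is a sketch under assumptions: FailoverSketch and callAny are hypothetical names, the remote call is replaced by a Function, and only RuntimeException is treated as a failed call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

// Illustrative random-start failover over a server list; not the Nacos client code.
class FailoverSketch {
    // Tries every server at most once, starting from a random position.
    static String callAny(List<String> servers, Random random, Function<String, String> call) {
        if (servers == null || servers.isEmpty()) {
            throw new IllegalArgumentException("no servers configured");
        }
        RuntimeException last = null;
        int index = random.nextInt(servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(index);
            try {
                return call.apply(server); // first success wins
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next server
            }
            index = (index + 1) % servers.size(); // wrap around at the end of the list
        }
        throw new IllegalStateException("all servers failed: " + servers, last);
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("10.0.0.1:8848", "10.0.0.2:8848", "10.0.0.3:8848");
        System.out.println(callAny(servers, new Random(), server -> "ok from " + server));
    }
}
```

The random start spreads requests across the cluster, while the loop guarantees that a healthy node is found if one exists.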

That completes the analysis of the client-side registration code, but the article does not end here. Next we look at how the server handles the registration request sent by the client:

The server processes the client registration request

To read the server-side code, you first need to download the Nacos source code.

From the address of the service registration API (/nacos/v1/ns/instance), we can locate the corresponding controller: com.alibaba.nacos.naming.controllers.InstanceController.

Because an instance is registered with a POST request, go straight to the register method annotated with @PostMapping:

    @CanDistro
    @PostMapping
    public String register(HttpServletRequest request) throws Exception {
        // get the service name
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        // get the namespace id
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);

        // register the instance
        serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
        return "ok";
    }

We click to enter the registerInstance method:

    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {

        createEmptyService(namespaceId, serviceName, instance.isEphemeral());

        Service service = getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }
        // perform the add-instance operation
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }

Analysis

In Nacos, after an instance is registered, the registration information must also be synchronized to the other nodes. Nacos has two synchronization modes, AP and CP, and the difference between them lies mainly in how registration information is replicated to the other nodes of the cluster.
Nacos uses the value of the ephemeral field to decide whether to synchronize in AP mode or in CP mode. AP mode is the default.
The entry point is com.alibaba.nacos.naming.core.ServiceManager.addInstance():

    public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
        // build the key for the service
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
        // get the service
        Service service = getService(namespaceId, serviceName);
        // guard the update with a synchronized block
        synchronized (service) {
            List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);

            Instances instances = new Instances();
            instances.setInstanceList(instanceList);
            // call consistencyService.put to handle the instances that need to be synchronized
            consistencyService.put(key, instances);
        }
    }

Next, step into the consistencyService.put method:

[Figure: implementation classes of the put method]

Clicking on the put method shows three implementation classes. From the context (or by debugging), you can infer that the DelegateConsistencyServiceImpl implementation is the one used here.

    @Override
    public void put(String key, Record value) throws NacosException {
        // this put method is where the choice between AP and CP synchronization is made
        mapConsistencyService(key).put(key, value);
    }

The following method decides from the key whether registration information is synchronized in AP or CP mode; the key itself was built from the ephemeral field:

   private ConsistencyService mapConsistencyService(String key) {
        return KeyBuilder.matchEphemeralKey(key) ? ephemeralConsistencyService : persistentConsistencyService;
    }
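
The routing idea is easy to reproduce in isolation: encode the ephemeral flag into the key when it is built, then inspect the key when it is routed. The sketch below assumes made-up key prefixes ("iplist.", "ephemeral.", "persistent.") and hypothetical names (KeyRoutingSketch, route); the actual key format used by KeyBuilder is not shown in this article.

```java
// Illustrative key-based routing between AP and CP; the prefixes are invented for this sketch.
class KeyRoutingSketch {
    static final String PREFIX = "iplist.";
    static final String EPHEMERAL_MARKER = "ephemeral.";
    static final String PERSISTENT_MARKER = "persistent.";

    // The ephemeral flag is baked into the key, so later stages need only the key.
    static String buildInstanceListKey(String namespaceId, String serviceName, boolean ephemeral) {
        String marker = ephemeral ? EPHEMERAL_MARKER : PERSISTENT_MARKER;
        return PREFIX + marker + namespaceId + "##" + serviceName;
    }

    static boolean matchEphemeralKey(String key) {
        return key.startsWith(PREFIX + EPHEMERAL_MARKER);
    }

    // Returns the name of the consistency mode that would handle this key.
    static String route(String key) {
        return matchEphemeralKey(key) ? "AP" : "CP";
    }

    public static void main(String[] args) {
        System.out.println(route(buildInstanceListKey("public", "demo", true)));
        System.out.println(route(buildInstanceListKey("public", "demo", false)));
    }
}
```

Carrying the flag inside the key keeps the put(key, value) interface identical for both modes, which is what lets a delegating implementation choose between them.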

AP mode synchronization process (ephemeralConsistencyService)

The local server first processes the registration information and then synchronizes it to the other nodes:

    @Override
    public void put(String key, Record value) throws NacosException {
        // update the local registration list
        onPut(key, value);
        // add a task to the blocking queue to synchronize the data to the other cluster nodes
        taskDispatcher.addTask(key);
    }

Process local registered nodes

Nacos treats the key as a task and adds it to the blocking queue named tasks inside the notifier, which is consumed by a single thread. When the notifier is initialized, it is submitted to a thread pool as a runnable (the pool has only one core thread).

Here is a point worth knowing: many distributed frameworks use a single thread consuming a blocking queue to handle time-consuming tasks. On the one hand the queue absorbs concurrent requests, and on the other hand, because updates are applied one at a time, it avoids the write-write conflicts that concurrency would cause.

The main logic of the thread is to read the blocking queue in a loop, process the registration information, and update the in-memory registration list.
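
The single-consumer pattern described above can be sketched in a few lines. This is an illustration only: NotifierSketch, addTask, lookup and stopAndWait are hypothetical names, the "registry" is a plain map from service name to address, and a stop marker is added so that the example terminates.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative single-threaded consumer of a blocking queue; not the Nacos Notifier.
class NotifierSketch {
    // marker task telling the worker to exit
    private static final Map.Entry<String, String> STOP =
            new AbstractMap.SimpleImmutableEntry<>("", "");

    private final BlockingQueue<Map.Entry<String, String>> tasks = new LinkedBlockingQueue<>();
    private final Map<String, String> registry = new ConcurrentHashMap<>();
    private final Thread worker = new Thread(this::loop, "notifier-sketch");

    void start() {
        worker.setDaemon(true);
        worker.start();
    }

    // Any thread may enqueue; only the worker thread writes to the registry.
    void addTask(String key, String value) {
        tasks.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
    }

    String lookup(String key) {
        return registry.get(key);
    }

    // Enqueues the stop marker and waits for the worker to drain the queue and exit.
    boolean stopAndWait(long timeoutMillis) {
        tasks.add(STOP);
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    private void loop() {
        try {
            while (true) {
                Map.Entry<String, String> task = tasks.take(); // blocks while the queue is empty
                if (task == STOP) {
                    return;
                }
                registry.put(task.getKey(), task.getValue()); // applied strictly in queue order
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        NotifierSketch notifier = new NotifierSketch();
        notifier.start();
        notifier.addTask("demo", "10.0.0.1:8080");
        notifier.addTask("demo", "10.0.0.2:8080");
        notifier.stopAndWait(1000);
        System.out.println(notifier.lookup("demo"));
    }
}
```

Because the queue is FIFO and there is exactly one consumer, two updates to the same key are applied in the order they were enqueued, without any locking around the registry.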

Synchronize registration information to other cluster nodes

Nacos likewise stores the registration key as a task, this time in the blocking queue of a TaskScheduler inside TaskDispatcher, and then starts a thread that reads the blocking queue in a loop:

       @Override
        public void run() {

            List<String> keys = new ArrayList<>();
            while (true) {
                    String key = queue.poll(partitionConfig.getTaskDispatchPeriod(),
                        TimeUnit.MILLISECONDS);
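                    // checks omitted
                    // add the key to be synchronized
                    keys.add(key);
                    // count it
                    dataSize++;
                    // start a sync if either the number of keys has reached the configured batch size, or the time since the last sync exceeds the configured period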
                    if (dataSize == partitionConfig.getBatchSyncKeyCount() ||
                        (System.currentTimeMillis() - lastDispatchTime) > partitionConfig.getTaskDispatchPeriod()) {
                        // loop over all cluster nodes and synchronize to each over HTTP
                        for (Server member : dataSyncer.getServers()) {
                            if (NetUtils.localServer().equals(member.getKey())) {
                                continue;
                            }
                            SyncTask syncTask = new SyncTask();
                            syncTask.setKeys(keys);
                            syncTask.setTargetServer(member.getKey());

                            if (Loggers.DISTRO.isDebugEnabled() && StringUtils.isNotBlank(key)) {
                                Loggers.DISTRO.debug("add sync task: {}", JSON.toJSONString(syncTask));
                            }

                            dataSyncer.submit(syncTask, 0);
                        }
                        // record the time of this sync
                        lastDispatchTime = System.currentTimeMillis();
                        // reset the counter
                        dataSize = 0;
                    }
            }
        }
    }

The AP synchronization process is very simple, but it contains two design ideas that solve the problem of synchronizing one key at a time.
If Nacos started a synchronization every time a new key arrived, network resources would be wasted, because each request would carry only one key or a few keys.

Solutions for synchronizing a small number of keys:
  1. Start a batch synchronization only once the configured number of keys has accumulated
  2. If the time since the last synchronization exceeds the configured period, ignore the number of keys and start the synchronization immediately
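
The two rules above (flush on count, or flush on elapsed time) can be captured in a small batching class. This is a sketch under assumptions: BatchSyncSketch and offer are hypothetical names, the clock is injected so that the behaviour can be exercised without sleeping, and the time rule is only evaluated when a key arrives, whereas the loop above also wakes up periodically through its timed poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Illustrative "flush on count or on elapsed time" batching; not the TaskDispatcher code.
class BatchSyncSketch {
    private final int batchSize;
    private final long periodMillis;
    private final LongSupplier clock;            // injected time source, in milliseconds
    private final Consumer<List<String>> sender; // receives each completed batch
    private final List<String> pending = new ArrayList<>();
    private long lastDispatchTime;

    BatchSyncSketch(int batchSize, long periodMillis, LongSupplier clock, Consumer<List<String>> sender) {
        this.batchSize = batchSize;
        this.periodMillis = periodMillis;
        this.clock = clock;
        this.sender = sender;
        this.lastDispatchTime = clock.getAsLong();
    }

    // Buffers the key and dispatches the batch when either rule is satisfied.
    void offer(String key) {
        pending.add(key);
        long now = clock.getAsLong();
        boolean enoughKeys = pending.size() >= batchSize;
        boolean waitedTooLong = now - lastDispatchTime > periodMillis;
        if (enoughKeys || waitedTooLong) {
            sender.accept(new ArrayList<>(pending)); // hand over a copy of the batch
            pending.clear();
            lastDispatchTime = now;
        }
    }

    public static void main(String[] args) {
        BatchSyncSketch batcher = new BatchSyncSketch(3, 2000, System::currentTimeMillis,
                batch -> System.out.println("sync " + batch));
        for (String key : new String[] {"k1", "k2", "k3", "k4"}) {
            batcher.offer(key);
        }
    }
}
```

The count rule bounds the number of requests under heavy load, and the time rule bounds how stale a lone key can become under light load.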

CP mode synchronization process (RaftConsistencyServiceImpl)

CP mode pursues data consistency. To guarantee consistency, a leader must be elected. The leader writes the data first and then notifies the followers to fetch the latest registration data (or actively pushes it to them).

Nacos implements CP mode by electing a leader with the Raft protocol.

As before, step into the put method, this time in RaftConsistencyServiceImpl:

    @Override
    public void put(String key, Record value) throws NacosException {
        try {
            raftCore.signalPublish(key, value);
        } catch (Exception e) {
            Loggers.RAFT.error("Raft put failed.", e);
            throw new NacosException(NacosException.SERVER_ERROR, "Raft put failed, key:" + key + ", value:" + value, e);
        }
    }

Inside the raftCore.signalPublish method, I have extracted the key pieces of code:

        // first check whether the current Nacos node is the leader; if it is not, look up the leader's IP and forward the request to the leader, otherwise continue below
        if (!isLeader()) {
            JSONObject params = new JSONObject();
            params.put("key", key);
            params.put("value", value);
            Map<String, String> parameters = new HashMap<>(1);
            parameters.put("key", key);

            raftProxy.proxyPostLarge(getLeader().ip, API_PUB, params.toJSONString(), parameters);
            return;
        }
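
The forwarding rule can be shown with a toy model in which the HTTP hop is replaced by a direct method call. Everything here is hypothetical (RaftNodeSketch, setLeader, the published list); the sketch only illustrates the "not the leader, so forward and return" branch above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of forwarding a write to the leader; not the RaftCore implementation.
class RaftNodeSketch {
    private final String name;
    private RaftNodeSketch leader; // null until a leader is known
    final List<String> published = new ArrayList<>();

    RaftNodeSketch(String name) {
        this.name = name;
    }

    void setLeader(RaftNodeSketch leader) {
        this.leader = leader;
    }

    boolean isLeader() {
        return leader == this;
    }

    void signalPublish(String key) {
        if (leader == null) {
            throw new IllegalStateException("node " + name + " knows no leader");
        }
        if (!isLeader()) {
            leader.signalPublish(key); // stands in for the HTTP forward to the leader
            return;
        }
        published.add(key); // only the leader accepts the write
    }

    public static void main(String[] args) {
        RaftNodeSketch a = new RaftNodeSketch("a");
        RaftNodeSketch b = new RaftNodeSketch("b");
        a.setLeader(a);
        b.setLeader(a);
        b.signalPublish("service-key");
        System.out.println(a.published);
    }
}
```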

The local registration list is again processed through a queue:

onPublish(datum, peers.local());

public void onPublish(Datum datum, RaftPeer source) throws Exception {
       
        // add the key as a task to the blocking queue
        notifier.addTask(datum.key, ApplyAction.CHANGE);

        Loggers.RAFT.info("data added/updated, key={}, term={}", datum.key, local.term);
    }

Then loop over all cluster nodes and send each of them an HTTP synchronization request:

 for (final String server : peers.allServersIncludeMyself()) {
                // the leader itself needs no synchronization
                if (isLeader(server)) {
                    latch.countDown();
                    continue;
                }
                // build the URL and send the sync request to the other cluster node
                final String url = buildURL(server, API_ON_PUB);
                HttpClient.asyncHttpPostLarge(url, Arrays.asList("key=" + key), content, new AsyncCompletionHandler<Integer>() {
                    @Override
                    public Integer onCompleted(Response response) throws Exception {
                        if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                            Loggers.RAFT.warn("[RAFT] failed to publish data to peer, datumId={}, peer={}, http code={}",
                                datum.key, server, response.getStatusCode());
                            return 1;
                        }
                        latch.countDown();
                        return 0;
                    }

                    @Override
                    public STATE onContentWriteCompleted() {
                        return STATE.CONTINUE;
                    }
                });

            }
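
The loop above counts a latch down once per successful peer, but the excerpt does not show how the latch is created or awaited. The sketch below assumes it is sized to a majority of the cluster, which is the usual Raft rule; that sizing, the synchronous calls, and the names QuorumPublishSketch and publish are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Illustrative majority acknowledgement with a CountDownLatch; not the RaftCore code.
class QuorumPublishSketch {
    // Returns true if a majority of the nodes (leader included) acknowledged in time.
    static boolean publish(List<String> servers, String leader,
                           Predicate<String> sendToPeer, long timeoutMillis) {
        int majority = servers.size() / 2 + 1; // assumption: quorum is a simple majority
        CountDownLatch latch = new CountDownLatch(majority);
        for (String server : servers) {
            if (server.equals(leader)) {
                latch.countDown(); // the leader counts as one acknowledgement
                continue;
            }
            boolean acknowledged;
            try {
                acknowledged = sendToPeer.test(server); // synchronous here for simplicity
            } catch (RuntimeException e) {
                acknowledged = false;
            }
            if (acknowledged) {
                latch.countDown();
            }
        }
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("n1", "n2", "n3");
        // n3 is unreachable, but n1 (leader) and n2 still form a majority
        System.out.println(publish(servers, "n1", peer -> !peer.equals("n3"), 100));
    }
}
```

A latch fits this job because the caller only needs to know when enough acknowledgements have arrived, not which peers sent them.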

That completes the analysis of the main source code for Nacos service registration and service instance synchronization.

Summary

If you are new to the Nacos source code, look over the map at the top of the article a few times, locate the corresponding places in the source code, and then read the map and this article together to follow the whole flow from end to end. I believe you will get a lot out of it. Reading source code can be painful, but if you persevere and master the techniques, you will find that even difficult source code can be worked through. I plan to write a dedicated article on how to read source code efficiently, and I hope it will help those who are new to it.


Origin blog.csdn.net/weixin_34311210/article/details/112856979