Nacos Registry Source Code Analysis, Part 1: Service Registration and Heartbeats

Source code analysis

Nacos client registration analysis

Dependency

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
</dependency>

The Nacos client is wired in through Spring Boot auto-configuration.
The starting point is the spring.factories file shipped inside the dependency.

NacosServiceRegistryAutoConfiguration

NacosServiceRegistryAutoConfiguration registers a NacosAutoServiceRegistration bean:

    @Bean
    @ConditionalOnBean(AutoServiceRegistrationProperties.class)
    public NacosAutoServiceRegistration nacosAutoServiceRegistration(
            NacosServiceRegistry registry,
            AutoServiceRegistrationProperties autoServiceRegistrationProperties,
            NacosRegistration registration) {
        return new NacosAutoServiceRegistration(registry,
                autoServiceRegistrationProperties, registration);
    }

NacosAutoServiceRegistration

    public NacosAutoServiceRegistration(ServiceRegistry<Registration> serviceRegistry,
            AutoServiceRegistrationProperties autoServiceRegistrationProperties,
            NacosRegistration registration) {
        super(serviceRegistry, autoServiceRegistrationProperties);
        this.registration = registration;
    }

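Its class diagram shows that the parent class, AbstractAutoServiceRegistration, implements the ApplicationListener interface and therefore listens for events published while the Spring container starts.

When it receives a WebServerInitializedEvent (published once the web server has finished initializing), it calls the bind method: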

public void bind(WebServerInitializedEvent event) {
    ApplicationContext context = event.getApplicationContext();
    if (context instanceof ConfigurableWebServerApplicationContext) {
        if ("management".equals(((ConfigurableWebServerApplicationContext) context)
                .getServerNamespace())) {
            return;
        }
    }
    this.port.compareAndSet(0, event.getWebServer().getPort());
    // Start service registration
    this.start();
}

// The start method
public void start() {
    if (!isEnabled()) {
        if (logger.isDebugEnabled()) {
            logger.debug("Discovery Lifecycle disabled. Not starting");
        }
        return;
    }

    // only initialize if nonSecurePort is greater than 0 and it isn't already running
    // because of containerPortInitializer below
    if (!this.running.get()) {
        // Publish the pre-registration event
        this.context.publishEvent(
                new InstancePreRegisteredEvent(this, getRegistration()));
        // Register the service
        register();
        if (shouldRegisterManagement()) {
            registerManagement();
        }
        // Publish the registration-completed event
        this.context.publishEvent(
                new InstanceRegisteredEvent<>(this, getConfiguration()));
        this.running.compareAndSet(false, true);
    }

}



// The register method
protected void register() {
    this.serviceRegistry.register(getRegistration());
}
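
The bind/start/register flow can be reduced to a small plain-Java sketch. This is not the Spring Cloud code: the class name AutoRegistrationSketch and the events list are made up for illustration, and no Spring types are involved. It only shows how the two atomic guards keep the first web-server port and make registration happen once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the auto-registration listener; no Spring types are used. */
class AutoRegistrationSketch {
    private final AtomicInteger port = new AtomicInteger(0);
    private final AtomicBoolean running = new AtomicBoolean(false);
    // Records what would be published or called, in order.
    final List<String> events = new ArrayList<>();

    /** Plays the role of bind(WebServerInitializedEvent): keep the first port, then start. */
    void bind(int webServerPort) {
        port.compareAndSet(0, webServerPort); // only the first non-zero port wins
        start();
    }

    void start() {
        if (!running.get()) {
            events.add("pre-registered");        // InstancePreRegisteredEvent
            events.add("register:" + port.get()); // serviceRegistry.register(...)
            events.add("registered");             // InstanceRegisteredEvent
            running.compareAndSet(false, true);
        }
    }

    int port() {
        return port.get();
    }

    public static void main(String[] args) {
        AutoRegistrationSketch sketch = new AutoRegistrationSketch();
        sketch.bind(8080);
        sketch.bind(9090); // a second event changes nothing
        System.out.println(sketch.events + " port=" + sketch.port());
    }
}
```

A second bind call is ignored because running is already true and the port is no longer 0.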


NacosServiceRegistry

NacosServiceRegistry implements Spring Cloud's ServiceRegistry interface, the contract for service registration that declares methods such as register and deregister.

NacosServiceRegistry implements register as follows:

public void register(Registration registration) {

    if (StringUtils.isEmpty(registration.getServiceId())) {
        log.warn("No service to register for nacos client...");
        return;
    }
    // Get the Nacos naming service, i.e. the registry client
    NamingService namingService = namingService();
    // Get the serviceId
    String serviceId = registration.getServiceId();
    // Get the group
    String group = nacosDiscoveryProperties.getGroup();
    // Wrap the basic instance information: cluster name, ephemeral flag, weight, IP, port, etc.
    Instance instance = getNacosInstanceFromRegistration(registration);

    try {
        // Register the instance
        namingService.registerInstance(serviceId, group, instance);
        log.info("nacos registry, {} {} {}:{} register finished", group, serviceId,
                instance.getIp(), instance.getPort());
    }
    catch (Exception e) {
        if (nacosDiscoveryProperties.isFailFast()) {
            log.error("nacos registry, {} register failed...{},", serviceId,
                    registration.toString(), e);
            rethrowRuntimeException(e);
        }
        else {
            log.warn("Failfast is false. {} register failed...{},", serviceId,
                    registration.toString(), e);
        }
    }
}

Preparing the registration
The registration is ultimately performed by the registerInstance method of NamingService.
The default implementation of the NamingService interface is NacosNamingService.

NacosNamingService

public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
    NamingUtils.checkInstanceIsLegal(instance);
    String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
    // For an ephemeral instance, schedule a task that sends heartbeats to the Nacos server
    if (instance.isEphemeral()) {
        BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
        beatReactor.addBeatInfo(groupedServiceName, beatInfo);
    }
    // Send the registration request
    serverProxy.registerService(groupedServiceName, groupName, instance);
}
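
The grouped service name combines group and service into one string. The sketch below shows the idea under two assumptions: the separator is "@@" (as in the example key DEFAULT_GROUP@@order-service that appears later in this article) and a name without a separator belongs to DEFAULT_GROUP. The class and method names are hypothetical, not the NamingUtils API.

```java
/** Hypothetical helper that mimics the "group@@service" naming convention. */
class GroupedNameSketch {
    // Assumed separator, taken from the example key shown in this article.
    static final String SPLITTER = "@@";
    static final String DEFAULT_GROUP = "DEFAULT_GROUP";

    /** Builds "group@@service". */
    static String groupedName(String serviceName, String groupName) {
        return groupName + SPLITTER + serviceName;
    }

    /** Extracts the group; falls back to DEFAULT_GROUP when no separator is present (assumption). */
    static String groupOf(String groupedServiceName) {
        int idx = groupedServiceName.indexOf(SPLITTER);
        return idx < 0 ? DEFAULT_GROUP : groupedServiceName.substring(0, idx);
    }

    /** Extracts the plain service name. */
    static String serviceOf(String groupedServiceName) {
        int idx = groupedServiceName.indexOf(SPLITTER);
        return idx < 0 ? groupedServiceName : groupedServiceName.substring(idx + SPLITTER.length());
    }

    public static void main(String[] args) {
        String grouped = groupedName("order-service", "DEFAULT_GROUP");
        System.out.println(grouped + " -> " + groupOf(grouped) + " / " + serviceOf(grouped));
    }
}
```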

The registration request itself is sent by the registerService method of NamingProxy.

NamingProxy

public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {

    NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", namespaceId, serviceName,
            instance);
    // Assemble the request parameters
    final Map<String, String> params = new HashMap<String, String>(16);
    // Namespace
    params.put(CommonParams.NAMESPACE_ID, namespaceId);
    // Service name
    params.put(CommonParams.SERVICE_NAME, serviceName);
    // Group name
    params.put(CommonParams.GROUP_NAME, groupName);
    // Cluster name
    params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
    // IP
    params.put("ip", instance.getIp());
    // Port
    params.put("port", String.valueOf(instance.getPort()));
    // Weight
    params.put("weight", String.valueOf(instance.getWeight()));
    // Whether the instance is enabled
    params.put("enable", String.valueOf(instance.isEnabled()));
    // Health status
    params.put("healthy", String.valueOf(instance.isHealthy()));
    // Whether the instance is ephemeral
    params.put("ephemeral", String.valueOf(instance.isEphemeral()));
    params.put("metadata", JacksonUtils.toJson(instance.getMetadata()));
    // Send the parameters above to /nacos/v1/ns/instance with a POST request
    reqApi(UtilAndComs.nacosUrlInstance, params, HttpMethod.POST);
}
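
To make the request shape concrete, here is a minimal sketch that turns a parameter map into a URL-encoded form string of the kind sent to /nacos/v1/ns/instance. The literal parameter names (namespaceId, serviceName, groupName) are assumed values of the CommonParams constants, and RegisterParamsSketch is a hypothetical class; the real client delegates this to reqApi.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch: builds the form-encoded registration parameters. */
class RegisterParamsSketch {

    /** Collects a subset of the registration parameters in insertion order. */
    static Map<String, String> registrationParams(String namespaceId, String groupedServiceName,
            String groupName, String ip, int port, boolean ephemeral) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("namespaceId", namespaceId);        // assumed value of CommonParams.NAMESPACE_ID
        params.put("serviceName", groupedServiceName); // assumed value of CommonParams.SERVICE_NAME
        params.put("groupName", groupName);            // assumed value of CommonParams.GROUP_NAME
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", String.valueOf(ephemeral));
        return params;
    }

    /** Encodes the map as key=value pairs joined by '&'. */
    static String toForm(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8) + "="
                        + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        System.out.println(toForm(registrationParams(
                "public", "DEFAULT_GROUP@@order-service", "DEFAULT_GROUP", "10.0.0.1", 8080, true)));
    }
}
```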

Sending heartbeats (ephemeral instances)

BeatReactor is the core class for sending heartbeats.

public BeatInfo buildBeatInfo(String groupedServiceName, Instance instance) {
    BeatInfo beatInfo = new BeatInfo();
    beatInfo.setServiceName(groupedServiceName);
    beatInfo.setIp(instance.getIp());
    beatInfo.setPort(instance.getPort());
    beatInfo.setCluster(instance.getClusterName());
    beatInfo.setWeight(instance.getWeight());
    beatInfo.setMetadata(instance.getMetadata());
    beatInfo.setScheduled(false);
    beatInfo.setPeriod(instance.getInstanceHeartBeatInterval());
    return beatInfo;
}

The heartbeat period defaults to 5 seconds.

public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
    NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
    String key = buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort());
    BeatInfo existBeat = null;
    //fix #1733
    if ((existBeat = dom2Beat.remove(key)) != null) {
        existBeat.setStopped(true);
    }
    dom2Beat.put(key, beatInfo);
    // Schedule the heartbeat task on the executor; it first runs after beatInfo.getPeriod() ms
    // and then reschedules itself at the end of every run
    executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
    MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
}

The heartbeat task is encapsulated in the BeatTask class, a Runnable whose run method is shown below:

BeatTask

    @Override
    public void run() {
        if (beatInfo.isStopped()) {
            return;
        }
        // Get the heartbeat period
        long nextTime = beatInfo.getPeriod();
        try {
            // Send the heartbeat
            JsonNode result = serverProxy.sendBeat(beatInfo, BeatReactor.this.lightBeatEnabled);
            long interval = result.get("clientBeatInterval").asLong();
            boolean lightBeatEnabled = false;
            if (result.has(CommonParams.LIGHT_BEAT_ENABLED)) {
                lightBeatEnabled = result.get(CommonParams.LIGHT_BEAT_ENABLED).asBoolean();
            }
            BeatReactor.this.lightBeatEnabled = lightBeatEnabled;
            if (interval > 0) {
                nextTime = interval;
            }
            int code = NamingResponseCode.OK;
            if (result.has(CommonParams.CODE)) {
                code = result.get(CommonParams.CODE).asInt();
            }
            // If the server reports that the instance was not found, register it again
            if (code == NamingResponseCode.RESOURCE_NOT_FOUND) {
                Instance instance = new Instance();
                instance.setPort(beatInfo.getPort());
                instance.setIp(beatInfo.getIp());
                instance.setWeight(beatInfo.getWeight());
                instance.setMetadata(beatInfo.getMetadata());
                instance.setClusterName(beatInfo.getCluster());
                instance.setServiceName(beatInfo.getServiceName());
                instance.setInstanceId(instance.getInstanceId());
                instance.setEphemeral(true);
                try {
                    serverProxy.registerService(beatInfo.getServiceName(),
                            NamingUtils.getGroupName(beatInfo.getServiceName()), instance);
                } catch (Exception ignore) {
                }
            }
        } catch (NacosException ex) {
            NAMING_LOGGER.error("[CLIENT-BEAT] failed to send beat: {}, code: {}, msg: {}",
                    JacksonUtils.toJson(beatInfo), ex.getErrCode(), ex.getErrMsg());

        }
        // Schedule the next heartbeat
        executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS);
    }
}
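
The scheduling pattern used by BeatTask (a one-shot task that schedules its own successor and lets the server override the delay) can be sketched without any Nacos classes. SelfReschedulingBeat and the LongSupplier standing in for sendBeat are hypothetical; the supplier returns the interval the "server" asks for, or 0 to keep the configured period.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a heartbeat that reschedules itself after every run. */
class SelfReschedulingBeat {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "beat-sketch");
                t.setDaemon(true);
                return t;
            });
    private final AtomicInteger sent = new AtomicInteger();
    private final long periodMillis;
    private volatile boolean stopped;

    SelfReschedulingBeat(long periodMillis) {
        this.periodMillis = periodMillis;
    }

    /** Schedules the first beat; sendBeat returns the server-suggested interval in ms. */
    void start(LongSupplier sendBeat) {
        executor.schedule(() -> beat(sendBeat), periodMillis, TimeUnit.MILLISECONDS);
    }

    private void beat(LongSupplier sendBeat) {
        if (stopped) {
            return;
        }
        long nextTime = periodMillis;
        try {
            long serverInterval = sendBeat.getAsLong();
            sent.incrementAndGet();
            if (serverInterval > 0) {
                nextTime = serverInterval; // the server may override the period
            }
        } catch (RuntimeException e) {
            // A failed beat is not fatal; the next one is still scheduled below.
        }
        if (!stopped) {
            executor.schedule(() -> beat(sendBeat), nextTime, TimeUnit.MILLISECONDS);
        }
    }

    /** Polls until at least n beats were sent or the timeout elapses. */
    boolean awaitBeats(int n, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (sent.get() < n && System.nanoTime() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return sent.get() >= n;
    }

    void stop() {
        stopped = true;
        executor.shutdownNow();
    }
}
```

Because each run schedules the next one only after it finishes, a slow or failing beat delays the following beat instead of piling up overlapping runs.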

The heartbeat is sent by com.alibaba.nacos.client.naming.net.NamingProxy#sendBeat:

public JsonNode sendBeat(BeatInfo beatInfo, boolean lightBeatEnabled) throws NacosException {

    if (NAMING_LOGGER.isDebugEnabled()) {
        NAMING_LOGGER.debug("[BEAT] {} sending beat to server: {}", namespaceId, beatInfo.toString());
    }
    // Assemble the request parameters
    Map<String, String> params = new HashMap<String, String>(8);
    Map<String, String> bodyMap = new HashMap<String, String>(2);
    if (!lightBeatEnabled) {
        bodyMap.put("beat", JacksonUtils.toJson(beatInfo));
    }
    params.put(CommonParams.NAMESPACE_ID, namespaceId);
    params.put(CommonParams.SERVICE_NAME, beatInfo.getServiceName());
    params.put(CommonParams.CLUSTER_NAME, beatInfo.getCluster());
    params.put("ip", beatInfo.getIp());
    params.put("port", String.valueOf(beatInfo.getPort()));
    // Send the request; the path is /v1/ns/instance/beat
    String result = reqApi(UtilAndComs.nacosUrlBase + "/instance/beat", params, bodyMap, HttpMethod.PUT);
    return JacksonUtils.toObj(result);
}

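The analysis above covers ephemeral instances. For persistent instances, the Nacos server actively probes the instance instead of waiting for client heartbeats, and their data is kept consistent across the cluster with a Raft-based protocol.

Nacos instances are either ephemeral or persistent.

The type can be set in the YAML configuration: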

spring:
  application:
    name: order-service
  cloud:
    nacos:
      discovery:
        ephemeral: false # false: persistent instance; true: ephemeral instance
      server-addr: 127.0.0.1:8848

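Server-side source code analysis

To follow the server side, check out the Nacos server source code.

How the server accepts a registration

The entry point can be found through the registration endpoint /nacos/v1/ns/instance: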

public String register(HttpServletRequest request) throws Exception {
    // Get the namespaceId
    final String namespaceId = WebUtils
            .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    NamingUtils.checkServiceNameFormat(serviceName);
    // Parse the instance information into an Instance object
    final Instance instance = parseInstance(request);
    // Register the instance
    serviceManager.registerInstance(namespaceId, serviceName, instance);
    return "ok";
}

This leads into the serviceManager.registerInstance() method.

ServiceManager

public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    // Create an empty service (on the first registration, an empty service is created and put into the registry)
    createEmptyService(namespaceId, serviceName, instance.isEphemeral());
    // Fetch the service that was just created
    Service service = getService(namespaceId, serviceName);

    if (service == null) {
        throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
    }

    // Add the instance being registered to the service
    addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}



// The addInstance method
/**
 * Add instance to service.
 *
 * @param namespaceId namespace
 * @param serviceName service name
 * @param ephemeral   whether instance is ephemeral
 * @param ips         instances
 * @throws NacosException nacos exception
 */
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
    throws NacosException {
    // Key used to listen for instance-list changes; it uniquely identifies the service, e.g.
    // com.alibaba.nacos.naming.iplist.ephemeral.public##DEFAULT_GROUP@@order-service
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    // Get the service
    Service service = getService(namespaceId, serviceName);
    // Synchronize on the service to avoid concurrent modification
    synchronized (service) {
        // 1) Compute the instance list after the update
        List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
        // 2) Wrap the instance list in an Instances object
        Instances instances = new Instances();
        instances.setInstanceList(instanceList);
        // 3) Update the registry and synchronize the data across the Nacos cluster
        consistencyService.put(key, instances);
    }
}

Updating the instance list

private List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
    return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
}


public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips)
        throws NacosException {
    // Fetch the stored instance list of this service by namespace and service name
    Datum datum = consistencyService
            .get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
    // Get the instances currently held by the service
    List<Instance> currentIPs = service.allIPs(ephemeral);
    Map<String, Instance> currentInstances = new HashMap<>(currentIPs.size());
    Set<String> currentInstanceIds = Sets.newHashSet();
    // Index the existing instances by ip:port and collect their IDs
    for (Instance instance : currentIPs) {
        currentInstances.put(instance.toIpAddr(), instance);
        currentInstanceIds.add(instance.getInstanceId());
    }
    // Holds the instance list after the update
    Map<String, Instance> instanceMap;
    if (datum != null && null != datum.value) {
        instanceMap = setValid(((Instances) datum.value).getInstanceList(), currentInstances);
    } else {
        instanceMap = new HashMap<>(ips.length);
    }

    for (Instance instance : ips) {
        if (!service.getClusterMap().containsKey(instance.getClusterName())) {
            Cluster cluster = new Cluster(instance.getClusterName(), service);
            cluster.init();
            service.getClusterMap().put(instance.getClusterName(), cluster);
            Loggers.SRV_LOG
                    .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                            instance.getClusterName(), instance.toJson());
        }

        if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
            instanceMap.remove(instance.getDatumKey());
        } else {
            // Add or update: reuse the old instanceId if the instance already exists,
            // otherwise generate a new one
            Instance oldInstance = instanceMap.get(instance.getDatumKey());
            if (oldInstance != null) {
                instance.setInstanceId(oldInstance.getInstanceId());
            } else {
                instance.setInstanceId(instance.generateInstanceId(currentInstanceIds));
            }
            // Put the instance into the result map
            instanceMap.put(instance.getDatumKey(), instance);
        }

    }

    if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
        throw new IllegalArgumentException(
                "ip list can not be empty, service: " + service.getName() + ", ip list: " + JacksonUtils
                        .toJson(instanceMap.values()));
    }

    return new ArrayList<>(instanceMap.values());
}

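In short, the method first fetches the existing instance list and compares the incoming instances against it: instances that are new get added, and instances that already exist keep their old instance ID. It then returns the up-to-date instance list.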
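
The merge step can be illustrated with a stripped-down sketch. Inst is a hypothetical stand-in for Instance, the key is simply ip:port, and the generated ID format ("<key>#new") is invented for the example; only the add path is modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of merging incoming instances into the existing list. */
class InstanceMergeSketch {

    /** Minimal stand-in for an instance: address plus an ID assigned by the server. */
    static final class Inst {
        final String ip;
        final int port;
        String id;

        Inst(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }

        String key() {
            return ip + ":" + port;
        }
    }

    /** Known instances keep their old ID; new ones get a generated ID (format is made up). */
    static List<Inst> merge(Map<String, Inst> current, List<Inst> incoming) {
        Map<String, Inst> result = new LinkedHashMap<>(current);
        for (Inst inst : incoming) {
            Inst old = result.get(inst.key());
            inst.id = (old != null) ? old.id : inst.key() + "#new";
            result.put(inst.key(), inst);
        }
        return new ArrayList<>(result.values());
    }

    public static void main(String[] args) {
        Map<String, Inst> current = new LinkedHashMap<>();
        Inst known = new Inst("10.0.0.1", 8080);
        known.id = "id-1";
        current.put(known.key(), known);

        List<Inst> incoming = new ArrayList<>();
        incoming.add(new Inst("10.0.0.1", 8080));
        incoming.add(new Inst("10.0.0.2", 8080));
        for (Inst inst : merge(current, incoming)) {
            System.out.println(inst.key() + " -> " + inst.id);
        }
    }
}
```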

Cluster synchronization on the Nacos server

DistroConsistencyServiceImpl

public void put(String key, Record value) throws NacosException {
    // Update the instance in the local registry
    onPut(key, value);
    // Synchronize to the rest of the cluster
    distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE,
            globalConfig.getTaskDispatchPeriod() / 2);
}

The onPut method

public void onPut(String key, Record value) {
    // Is this an ephemeral instance list?
    if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
        Datum<Instances> datum = new Datum<>();
        datum.value = (Instances) value;
        datum.key = key;
        datum.timestamp.incrementAndGet();
        dataStore.put(key, datum);
    }

    if (!listeners.containsKey(key)) {
        return;
    }
    // Put the task into a blocking queue for asynchronous processing (key point)
    notifier.addTask(key, DataOperation.CHANGE);
}

Notifier
Notifier implements the Runnable interface.

public void addTask(String datumKey, DataOperation action) {

    if (services.containsKey(datumKey) && action == DataOperation.CHANGE) {
        return;
    }
    if (action == DataOperation.CHANGE) {
        services.put(datumKey, StringUtils.EMPTY);
    }
    tasks.offer(Pair.with(datumKey, action));
}

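The code above puts the task into a blocking queue for asynchronous consumption.

A single-threaded executor keeps taking tasks from the blocking queue and applies the instance-list update. In its run method, each task is taken from the queue and then handed over for processing.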
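
The combination of a "pending" map and a blocking queue can be sketched as follows. NotifierSketch is hypothetical: it uses putIfAbsent for the duplicate check and a non-blocking poll in drainOne so the example can run on one thread, whereas the real Notifier checks with containsKey and its consumer thread blocks on the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of a change queue that drops duplicate pending keys. */
class NotifierSketch {
    // Keys that are already queued and not yet handled.
    private final ConcurrentHashMap<String, String> pending = new ConcurrentHashMap<>();
    private final BlockingQueue<String> tasks = new LinkedBlockingQueue<>();
    // Keys in the order they were handled (only touched by the consumer).
    final List<String> handled = new ArrayList<>();

    /** Enqueues a change for the key unless one is already waiting. */
    void addTask(String datumKey) {
        if (pending.putIfAbsent(datumKey, "") != null) {
            return; // a change for this key is already queued
        }
        tasks.offer(datumKey);
    }

    int queued() {
        return tasks.size();
    }

    /** One iteration of the consumer loop; returns false when the queue is empty. */
    boolean drainOne() {
        String datumKey = tasks.poll();
        if (datumKey == null) {
            return false;
        }
        pending.remove(datumKey); // later changes to this key may be queued again
        handled.add(datumKey);    // stands in for notifying the listeners
        return true;
    }

    public static void main(String[] args) {
        NotifierSketch notifier = new NotifierSketch();
        notifier.addTask("k1");
        notifier.addTask("k1"); // dropped, k1 is still pending
        notifier.addTask("k2");
        while (notifier.drainOne()) {
            // keep draining
        }
        System.out.println(notifier.handled);
    }
}
```

Dropping duplicates is safe here because the handler always reads the latest value from the data store rather than from the queued task.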

The handle method:

    private void handle(Pair<String, DataOperation> pair) {
        try {
            String datumKey = pair.getValue0();
            DataOperation action = pair.getValue1();

            services.remove(datumKey);

            int count = 0;

            if (!listeners.containsKey(datumKey)) {
                return;
            }

            for (RecordListener listener : listeners.get(datumKey)) {

                count++;

                try {
                    if (action == DataOperation.CHANGE) {
                        listener.onChange(datumKey, dataStore.get(datumKey).value);
                        continue;
                    }

                    if (action == DataOperation.DELETE) {
                        listener.onDelete(datumKey);
                        continue;
                    }
                } catch (Throwable e) {
                    Loggers.DISTRO.error("[NACOS-DISTRO] error while notifying listener of key: {}", datumKey, e);
                }
            }

            if (Loggers.DISTRO.isDebugEnabled()) {
                Loggers.DISTRO
                        .debug("[NACOS-DISTRO] datum change notified, key: {}, listener count: {}, action: {}",
                                datumKey, count, action.name());
            }
        } catch (Throwable e) {
            Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e);
        }
    }
}

The key part is the onChange callback:

public void onChange(String key, Instances value) throws Exception {

    Loggers.SRV_LOG.info("[NACOS-RAFT] datum is changed, key: {}, value: {}", key, value);

    for (Instance instance : value.getInstanceList()) {

        if (instance == null) {
            // Reject this abnormal instance list:
            throw new RuntimeException("got null instance " + key);
        }

        if (instance.getWeight() > 10000.0D) {
            instance.setWeight(10000.0D);
        }

        if (instance.getWeight() < 0.01D && instance.getWeight() > 0.0D) {
            instance.setWeight(0.01D);
        }
    }
    // Update the instance list
    updateIPs(value.getInstanceList(), KeyBuilder.matchEphemeralInstanceListKey(key));

    recalculateChecksum();
}
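
The weight handling in onChange is a clamp with one special case: a weight of exactly 0 is left untouched. A small sketch of the same rule, in a hypothetical helper class:

```java
/** Hypothetical helper reproducing the weight normalization seen in onChange. */
class WeightClampSketch {

    /** Caps at 10000, raises tiny positive weights to 0.01, leaves 0 and normal values alone. */
    static double clamp(double weight) {
        if (weight > 10000.0D) {
            return 10000.0D;
        }
        if (weight < 0.01D && weight > 0.0D) {
            return 0.01D;
        }
        return weight;
    }

    public static void main(String[] args) {
        System.out.println(clamp(50000.0D));
        System.out.println(clamp(0.001D));
        System.out.println(clamp(0.0D));
        System.out.println(clamp(1.0D));
    }
}
```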

The updateIPs method

public void updateIPs(Collection<Instance> instances, boolean ephemeral) {
    // Prepare a map: the key is the cluster name, the value is the list of instances in that cluster
    Map<String, List<Instance>> ipMap = new HashMap<>(clusterMap.size());
    // Add an empty list for every cluster of this service
    for (String clusterName : clusterMap.keySet()) {
        ipMap.put(clusterName, new ArrayList<>());
    }
    // Iterate over the instances to update
    for (Instance instance : instances) {
        try {
            if (instance == null) {
                Loggers.SRV_LOG.error("[NACOS-DOM] received malformed ip: null");
                continue;
            }
            // If the instance has no clusterName, use the default cluster
            if (StringUtils.isEmpty(instance.getClusterName())) {
                instance.setClusterName(UtilsAndCommons.DEFAULT_CLUSTER_NAME);
            }
            // If the cluster does not exist yet, create it
            if (!clusterMap.containsKey(instance.getClusterName())) {
                Loggers.SRV_LOG
                    .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                          instance.getClusterName(), instance.toJson());
                Cluster cluster = new Cluster(instance.getClusterName(), this);
                cluster.init();
                getClusterMap().put(instance.getClusterName(), cluster);
            }
            // Get the instance list of this cluster, creating it if absent
            List<Instance> clusterIPs = ipMap.get(instance.getClusterName());
            if (clusterIPs == null) {
                clusterIPs = new LinkedList<>();
                ipMap.put(instance.getClusterName(), clusterIPs);
            }
            // Add the instance to the list
            clusterIPs.add(instance);
        } catch (Exception e) {
            Loggers.SRV_LOG.error("[NACOS-DOM] failed to process ip: " + instance, e);
        }
    }

    for (Map.Entry<String, List<Instance>> entry : ipMap.entrySet()) {
        //make every ip mine
        List<Instance> entryIPs = entry.getValue();
        // Write the instance list into clusterMap (the registry)
        clusterMap.get(entry.getKey()).updateIps(entryIPs, ephemeral);
    }

    setLastModifiedMillis(System.currentTimeMillis());
    // Publish the service-changed notification
    getPushService().serviceChanged(this);
    StringBuilder stringBuilder = new StringBuilder();

    for (Instance instance : allIPs()) {
        stringBuilder.append(instance.toIpAddr()).append("_").append(instance.isHealthy()).append(",");
    }

    Loggers.EVT_LOG.info("[IP-UPDATED] namespace: {}, service: {}, ips: {}", getNamespaceId(), getName(),
                         stringBuilder.toString());

}

The updateIps method of Cluster, called from the method above:

public void updateIps(List<Instance> ips, boolean ephemeral) {

    Set<Instance> toUpdateInstances = ephemeral ? ephemeralInstances : persistentInstances;

    HashMap<String, Instance> oldIpMap = new HashMap<>(toUpdateInstances.size());

    for (Instance ip : toUpdateInstances) {
        oldIpMap.put(ip.getDatumKey(), ip);
    }

    List<Instance> updatedIPs = updatedIps(ips, oldIpMap.values());
    if (updatedIPs.size() > 0) {
        for (Instance ip : updatedIPs) {
            Instance oldIP = oldIpMap.get(ip.getDatumKey());

            // do not update the ip validation status of updated ips
            // because the checker has the most precise result
            // Only when ip is not marked, don't we update the health status of IP:
            if (!ip.isMarked()) {
                ip.setHealthy(oldIP.isHealthy());
            }

            if (ip.isHealthy() != oldIP.isHealthy()) {
                // ip validation status updated
                Loggers.EVT_LOG.info("{} {SYNC} IP-{} {}:{}@{}", getService().getName(),
                        (ip.isHealthy() ? "ENABLED" : "DISABLED"), ip.getIp(), ip.getPort(), getName());
            }

            if (ip.getWeight() != oldIP.getWeight()) {
                // ip validation status updated
                Loggers.EVT_LOG.info("{} {SYNC} {IP-UPDATED} {}->{}", getService().getName(), oldIP.toString(),
                        ip.toString());
            }
        }
    }
    // Reset the health-check status of newly added instances
    List<Instance> newIPs = subtract(ips, oldIpMap.values());
    if (newIPs.size() > 0) {
        Loggers.EVT_LOG
                .info("{} {SYNC} {IP-NEW} cluster: {}, new ips size: {}, content: {}", getService().getName(),
                        getName(), newIPs.size(), newIPs.toString());

        for (Instance ip : newIPs) {
            HealthCheckStatus.reset(ip);
        }
    }
    // Drop the health-check status of instances that are being removed
    List<Instance> deadIPs = subtract(oldIpMap.values(), ips);

    if (deadIPs.size() > 0) {
        Loggers.EVT_LOG
                .info("{} {SYNC} {IP-DEAD} cluster: {}, dead ips size: {}, content: {}", getService().getName(),
                        getName(), deadIPs.size(), deadIPs.toString());

        for (Instance ip : deadIPs) {
            HealthCheckStatus.remv(ip);
        }
    }

    toUpdateInstances = new HashSet<>(ips);
    // Replace the old instance set with the new one
    if (ephemeral) {
        ephemeralInstances = toUpdateInstances;
    } else {
        persistentInstances = toUpdateInstances;
    }
}

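getPushService().serviceChanged(this);
This call publishes a service-changed event, and the listener for that event pushes the updated service information to subscribed clients over UDP. This is how the server proactively notifies clients after a service changes; the details are left for the reader to explore.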

Cluster synchronization
The sync method

public void sync(DistroKey distroKey, DataOperation action, long delay) {
    // Iterate over every node in the Nacos cluster except this one
    for (Member each : memberManager.allMembersWithoutSelf()) {
        DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(),
                                                      each.getAddress());
        // Define a Distro synchronization task
        DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay);
        // Hand it to the delay task engine for execution
        distroTaskEngineHolder.getDelayTaskExecuteEngine().addTask(distroKeyWithTarget, distroDelayTask);
        if (Loggers.DISTRO.isDebugEnabled()) {
            Loggers.DISTRO.debug("[DISTRO-SCHEDULE] {} to {}", distroKey, each.getAddress());
        }
    }
}

The important part is the distroTaskEngineHolder.getDelayTaskExecuteEngine() method:

// It only returns a field, which looks uninteresting at first
public DistroDelayTaskExecuteEngine getDelayTaskExecuteEngine() {
    return delayTaskExecuteEngine;
}

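So how is delayTaskExecuteEngine instantiated? It is created through a no-argument constructor. That constructor calls the parent class constructor, which in turn calls its own parent class constructor: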

public NacosDelayTaskExecuteEngine(String name, Logger logger) {
    this(name, 32, logger, 100L);
}

The this(...) call resolves to:

public NacosDelayTaskExecuteEngine(String name, int initCapacity, Logger logger, long processInterval) {
    super(logger);
    tasks = new ConcurrentHashMap<Object, AbstractDelayTask>(initCapacity);
    processingExecutor = ExecutorFactory.newSingleScheduledExecutorService(new NameThreadFactory(name));
    // A single thread processes the queued tasks at a fixed delay
    processingExecutor
            .scheduleWithFixedDelay(new ProcessRunnable(), processInterval, processInterval, TimeUnit.MILLISECONDS);
}

ProcessRunnable
As the name suggests, this is the Runnable executed by that thread.


    private class ProcessRunnable implements Runnable {

        @Override
        public void run() {
            try {
                processTasks();
            } catch (Throwable e) {
                getEngineLog().error(e.toString(), e);
            }
        }
    }

The processTasks method

protected void processTasks() {
    Collection<Object> keys = getAllTaskKeys();
    for (Object taskKey : keys) {
        AbstractDelayTask task = removeTask(taskKey);
        if (null == task) {
            continue;
        }
        NacosTaskProcessor processor = getProcessor(taskKey);
        if (null == processor) {
            getEngineLog().error("processor not found for task, so discarded. " + task);
            continue;
        }
        try {
            // Try to run the synchronization task; retry if it fails
            if (!processor.process(task)) {
                retryFailedTask(taskKey, task);
            }
        } catch (Throwable e) {
            getEngineLog().error("Nacos task execute error : " + e.toString(), e);
            retryFailedTask(taskKey, task);
        }
    }
}
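
The remove, process, retry-on-failure loop can be sketched in isolation. DelayTaskEngineSketch is hypothetical: tasks are plain strings, the processor is a Predicate that returns false on failure, and processTasks is called directly here instead of from a scheduled thread. Treating a failed task as "put it back for the next pass" is an assumption about what retryFailedTask does.

```java
import java.util.ArrayList;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical sketch of a keyed task map whose failed tasks are re-queued. */
class DelayTaskEngineSketch {
    private final ConcurrentHashMap<String, String> tasks = new ConcurrentHashMap<>();
    private final Predicate<String> processor;

    DelayTaskEngineSketch(Predicate<String> processor) {
        this.processor = processor;
    }

    /** A later task with the same key replaces the earlier one. */
    void addTask(String key, String task) {
        tasks.put(key, task);
    }

    int size() {
        return tasks.size();
    }

    /** One pass over all current keys, as the periodic runnable would do. */
    void processTasks() {
        for (String key : new ArrayList<>(tasks.keySet())) {
            String task = tasks.remove(key);
            if (task == null) {
                continue; // already taken by someone else
            }
            boolean ok;
            try {
                ok = processor.test(task);
            } catch (RuntimeException e) {
                ok = false;
            }
            if (!ok) {
                addTask(key, task); // assumed retry behavior: try again on the next pass
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        DelayTaskEngineSketch engine = new DelayTaskEngineSketch(task -> ++attempts[0] > 1);
        engine.addTask("node-b", "sync order-service");
        engine.processTasks(); // first attempt fails, task is re-queued
        engine.processTasks(); // second attempt succeeds
        System.out.println("attempts=" + attempts[0] + " remaining=" + engine.size());
    }
}
```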

// The process method

    @Override
    public boolean process(NacosTask task) {
        if (!(task instanceof DistroDelayTask)) {
            return true;
        }
        DistroDelayTask distroDelayTask = (DistroDelayTask) task;
        DistroKey distroKey = distroDelayTask.getDistroKey();
        if (DataOperation.CHANGE.equals(distroDelayTask.getAction())) {
            DistroSyncChangeTask syncChangeTask = new DistroSyncChangeTask(distroKey, distroComponentHolder);
            distroTaskEngineHolder.getExecuteWorkersManager().addTask(distroKey, syncChangeTask);
            return true;
        }
        return false;
    }

distroTaskEngineHolder.getExecuteWorkersManager()
Following this call eventually leads to the TaskExecuteWorker constructor:

public TaskExecuteWorker(final String name, final int mod, final int total, final Logger logger) {
    this.name = name + "_" + mod + "%" + total;
    this.queue = new ArrayBlockingQueue<Runnable>(QUEUE_CAPACITY);
    this.closed = new AtomicBoolean(false);
    this.log = null == logger ? LoggerFactory.getLogger(TaskExecuteWorker.class) : logger;
    // Start a thread that takes tasks from the queue and runs them
    new InnerWorker(name).start();
}

DistroSyncChangeTask
The worker thread takes this task from the blocking queue and runs it to perform the synchronization:

public void run() {
    Loggers.DISTRO.info("[DISTRO-START] {}", toString());
    try {
        String type = getDistroKey().getResourceType();
        DistroData distroData = distroComponentHolder.findDataStorage(type).getDistroData(getDistroKey());
        distroData.setType(DataOperation.CHANGE);
        boolean result = distroComponentHolder.findTransportAgent(type).syncData(distroData, getDistroKey().getTargetServer());
        if (!result) {
            handleFailedTask();
        }
        Loggers.DISTRO.info("[DISTRO-END] {} result: {}", toString(), result);
    } catch (Exception e) {
        Loggers.DISTRO.warn("[DISTRO] Sync data change failed.", e);
        handleFailedTask();
    }
}

syncData

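The code shows that Distro-based cluster synchronization is asynchronous, and that a failed task is put back into the queue and retried. It therefore does not guarantee strong consistency of the synchronized data; it is an AP-style consistency strategy.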

[Figure: server-side flow chart]

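What does the Nacos registry look like?
A: Nacos uses a multi-level storage model. The outermost level is the namespace, which isolates environments; below it is the group; each group contains services; a service can be divided into clusters; and each cluster contains multiple instances. The registry is therefore a nested Map of type:
Map<String, Map<String, Service>>
The outer key is the namespace_id and the inner key is group + serviceName.
Each Service maintains a Map<String, Cluster> whose key is the clusterName and whose value holds the cluster information.
Each Cluster maintains a Set of Instance objects, representing the instances in that cluster.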
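The nesting described above can be sketched as follows. The classes are simplified stand-ins written for this example, not the real Nacos types, and the `register` and `instanceCount` helpers are hypothetical; only the shape namespace -> service -> cluster -> instances is taken from the text.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-ins for the registry model; only the nesting is illustrated.
class RegistrySketch {

    static final class Instance {
        final String ip;
        final int port;

        Instance(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Instance)) {
                return false;
            }
            Instance other = (Instance) o;
            return port == other.port && ip.equals(other.ip);
        }

        @Override
        public int hashCode() {
            return Objects.hash(ip, port);
        }
    }

    static final class Cluster {
        final Set<Instance> instances = ConcurrentHashMap.newKeySet();
    }

    static final class Service {
        // clusterName -> Cluster
        final Map<String, Cluster> clusterMap = new ConcurrentHashMap<>();
    }

    // namespaceId -> (group + serviceName -> Service)
    private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    void register(String namespaceId, String groupedName, String clusterName, Instance instance) {
        serviceMap.computeIfAbsent(namespaceId, k -> new ConcurrentHashMap<>())
                .computeIfAbsent(groupedName, k -> new Service())
                .clusterMap.computeIfAbsent(clusterName, k -> new Cluster())
                .instances.add(instance);
    }

    int instanceCount(String namespaceId, String groupedName, String clusterName) {
        Map<String, Service> services = serviceMap.get(namespaceId);
        if (services == null) {
            return 0;
        }
        Service service = services.get(groupedName);
        if (service == null) {
            return 0;
        }
        Cluster cluster = service.clusterMap.get(clusterName);
        return cluster == null ? 0 : cluster.instances.size();
    }
}
```

Because instances live in a Set, registering the same ip and port twice leaves a single entry in the cluster.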

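How does Nacos keep concurrent writes safe?
A: When an instance is registered, Nacos locks the service. Writes to different services do not contend with each other, and writes to the same service are made mutually exclusive by the lock. In addition, the instance list is updated asynchronously by a thread pool that has exactly one thread, so updates are applied one at a time.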
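A minimal sketch of the two mechanisms just described, a per-service lock plus a single-threaded updater, might look like this. All names (`SerializedWriteSketch`, `staged`, `published`, `demo`) are hypothetical and chosen for this example; the real Nacos classes are structured differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: per-service locking plus a single-threaded updater.
class SerializedWriteSketch {
    private final Map<String, Object> locks = new ConcurrentHashMap<>();
    private final Map<String, List<String>> staged = new ConcurrentHashMap<>();
    private final Map<String, List<String>> published = new ConcurrentHashMap<>();
    // One thread only, so queued updates are applied strictly in submission order.
    private final ExecutorService updater = Executors.newSingleThreadExecutor();

    void register(String service, String instance) {
        Object lock = locks.computeIfAbsent(service, k -> new Object());
        synchronized (lock) { // writers to the same service are mutually exclusive
            List<String> next = new ArrayList<>(staged.getOrDefault(service, Collections.emptyList()));
            next.add(instance);
            staged.put(service, next);
            updater.execute(() -> published.put(service, next)); // applied asynchronously
        }
    }

    /** Registers threads * perThread instances concurrently and returns the published count. */
    static int demo(int threads, int perThread) {
        SerializedWriteSketch sketch = new SerializedWriteSketch();
        List<Thread> writers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread writer = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sketch.register("order-service", "10.0." + id + "." + i + ":8080");
                }
            });
            writers.add(writer);
            writer.start();
        }
        try {
            for (Thread writer : writers) {
                writer.join();
            }
            sketch.updater.shutdown();
            sketch.updater.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sketch.published.getOrDefault("order-service", Collections.emptyList()).size();
    }
}
```

Calling `SerializedWriteSketch.demo(4, 25)` registers 100 distinct instances from four threads; no registration is lost because each new list is built under the service lock and the updater applies the lists in order.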

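How does Nacos avoid conflicts between concurrent reads and writes?
A: Nacos updates the instance list with a copy-on-write approach: it copies the old instance list, applies the changes to the copy, and then replaces the old list with the updated copy.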
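The copy-on-write idea can be sketched in a few lines. This is a generic illustration with hypothetical names, not the Nacos Cluster implementation: readers always see a complete snapshot, while a writer builds a new list and swaps the reference.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Generic copy-on-write sketch; class and method names are hypothetical.
class CopyOnWriteSketch {
    // Readers dereference this field without locking; volatile makes the swap visible.
    private volatile List<String> instances = Collections.emptyList();

    List<String> read() {
        return instances; // an immutable snapshot, never modified in place
    }

    synchronized void add(String instance) {
        List<String> copy = new ArrayList<>(instances); // 1. copy the old list
        copy.add(instance);                             // 2. modify the copy
        instances = Collections.unmodifiableList(copy); // 3. replace the old list
    }
}
```

A reader that obtained a snapshot before an `add` keeps iterating over the old list safely; only later reads see the new one.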

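How does Nacos handle concurrent write requests from the hundreds of thousands of services inside Alibaba?
A: Nacos puts service registration tasks into a blocking queue and uses a thread pool to apply the instance updates asynchronously, which improves write throughput under concurrency.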
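The queue-plus-worker pattern, which is also what the TaskExecuteWorker constructor shown earlier sets up, can be sketched as follows. The class below is a hypothetical illustration: the capacity of 1024, the polling interval, and the `demo` helper are assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a blocking queue drained by one background worker.
class QueueWorkerSketch {
    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
    private volatile boolean closed = false;

    QueueWorkerSketch() {
        Thread worker = new Thread(() -> {
            while (!closed) {
                try {
                    // Wait briefly for a task so the loop can notice close().
                    Runnable task = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (task != null) {
                        task.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "queue-worker-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    /** Returns false when the queue is full; the caller is never blocked. */
    boolean submit(Runnable task) {
        return queue.offer(task);
    }

    void close() {
        closed = true;
    }

    /** Submits the given number of tasks and returns how many were executed. */
    static int demo(int tasks) {
        QueueWorkerSketch worker = new QueueWorkerSketch();
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            worker.submit(() -> {
                done.incrementAndGet();
                latch.countDown();
            });
        }
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        worker.close();
        return done.get();
    }
}
```

The request thread only pays the cost of enqueueing, which is why this design lets the server accept writes faster than it applies them.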

Server side: receiving client heartbeats
Controller

    @CanDistro
    @PutMapping("/beat")
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public ObjectNode beat(HttpServletRequest request) throws Exception {

        ObjectNode result = JacksonUtils.createEmptyJsonNode();
        result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, switchDomain.getClientBeatInterval());

        String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY);
        RsInfo clientBeat = null;
        if (StringUtils.isNotBlank(beat)) {
            clientBeat = JacksonUtils.toObj(beat, RsInfo.class);
        }
        String clusterName = WebUtils
                .optional(request, CommonParams.CLUSTER_NAME, UtilsAndCommons.DEFAULT_CLUSTER_NAME);
        String ip = WebUtils.optional(request, "ip", StringUtils.EMPTY);
        int port = Integer.parseInt(WebUtils.optional(request, "port", "0"));
        if (clientBeat != null) {
            if (StringUtils.isNotBlank(clientBeat.getCluster())) {
                clusterName = clientBeat.getCluster();
            } else {
                // fix #2533
                clientBeat.setCluster(clusterName);
            }
            ip = clientBeat.getIp();
            port = clientBeat.getPort();
        }
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        NamingUtils.checkServiceNameFormat(serviceName);
        Loggers.SRV_LOG.debug("[CLIENT-BEAT] full arguments: beat: {}, serviceName: {}", clientBeat, serviceName);
        // Look up the instance in the Nacos registry
        Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port);
        // If the instance does not exist, register it again
        if (instance == null) {
            if (clientBeat == null) {
                result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND);
                return result;
            }

            Loggers.SRV_LOG.warn("[CLIENT-BEAT] The instance has been removed for health mechanism, "
                    + "perform data compensation operations, beat: {}, serviceName: {}", clientBeat, serviceName);

            instance = new Instance();
            instance.setPort(clientBeat.getPort());
            instance.setIp(clientBeat.getIp());
            instance.setWeight(clientBeat.getWeight());
            instance.setMetadata(clientBeat.getMetadata());
            instance.setClusterName(clusterName);
            instance.setServiceName(serviceName);
            instance.setInstanceId(instance.getInstanceId());
            instance.setEphemeral(clientBeat.isEphemeral());

            serviceManager.registerInstance(namespaceId, serviceName, instance);
        }
        // Fetch the service by namespace and service name
        Service service = serviceManager.getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.SERVER_ERROR,
                    "service not found: " + serviceName + "@" + namespaceId);
        }
        if (clientBeat == null) {
            clientBeat = new RsInfo();
            clientBeat.setIp(ip);
            clientBeat.setPort(port);
            clientBeat.setCluster(clusterName);
        }
        // Process the client heartbeat
        service.processClientBeat(clientBeat);

        result.put(CommonParams.CODE, NamingResponseCode.OK);
        if (instance.containsMetadata(PreservedMetadataKeys.HEART_BEAT_INTERVAL)) {
            result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, instance.getInstanceHeartBeatInterval());
        }
        result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled());
        return result;
    }

The processClientBeat method

    public void processClientBeat(final RsInfo rsInfo) {
        // Build the processor and hand it the service and the heartbeat info
        ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor();
        clientBeatProcessor.setService(this);
        clientBeatProcessor.setRsInfo(rsInfo);
        // Schedule the task on the health-check executor with a delay of 0 seconds
        HealthCheckReactor.scheduleNow(clientBeatProcessor);
    }

ClientBeatProcessor

@Override
public void run() {
    Service service = this.service;
    if (Loggers.EVT_LOG.isDebugEnabled()) {
        Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString());
    }

    String ip = rsInfo.getIp();
    String clusterName = rsInfo.getCluster();
    int port = rsInfo.getPort();
    // Look up the cluster
    Cluster cluster = service.getClusterMap().get(clusterName);
    // Get all ephemeral instances in the cluster
    List<Instance> instances = cluster.allIPs(true);

    for (Instance instance : instances) {
        // Find the instance that sent this heartbeat
        if (instance.getIp().equals(ip) && instance.getPort() == port) {
            if (Loggers.EVT_LOG.isDebugEnabled()) {
                Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString());
            }
            // Update the instance's last heartbeat time (lastBeat)
            instance.setLastBeat(System.currentTimeMillis());
            if (!instance.isMarked()) {
                if (!instance.isHealthy()) {
                    instance.setHealthy(true);
                    Loggers.EVT_LOG
                        .info("service: {} {POS} {IP-ENABLED} valid: {}:{}@{}, region: {}, msg: client beat ok",
                              cluster.getService().getName(), ip, port, cluster.getName(),
                              UtilsAndCommons.LOCALHOST_SITE);
                    getPushService().serviceChanged(service);
                }
            }
        }
    }
}

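The core of handling a heartbeat request is updating the instance's last heartbeat time, lastBeat, which becomes the key indicator for deciding whether the instance's heartbeat has expired.
If the instance was unhealthy, it is set back to healthy and the change is pushed to clients as a UDP message through a listener.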
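How lastBeat could feed an expiry decision is sketched below. This section does not show the checking side, so everything here is an assumption for illustration: the class name, the three states, and the 15-second and 30-second thresholds are not taken from the code above.

```java
// Hypothetical sketch of classifying an instance by the age of its last heartbeat.
class BeatTimeoutSketch {
    // Assumed thresholds, for illustration only.
    static final long HEARTBEAT_TIMEOUT_MS = 15_000L; // silence after which the instance is unhealthy
    static final long DELETE_TIMEOUT_MS = 30_000L;    // silence after which the instance is removed

    enum State { HEALTHY, UNHEALTHY, EXPIRED }

    static State classify(long lastBeatMillis, long nowMillis) {
        long silence = nowMillis - lastBeatMillis;
        if (silence > DELETE_TIMEOUT_MS) {
            return State.EXPIRED;
        }
        if (silence > HEARTBEAT_TIMEOUT_MS) {
            return State.UNHEALTHY;
        }
        return State.HEALTHY;
    }
}
```

Each heartbeat resets lastBeat to the current time, so an instance that keeps beating always falls into the first band.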

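Client side: receiving UDP push notifications

PushReceiver

[Image: PushReceiver source screenshot]
PushReceiver implements the Runnable interface.
Let's look at the constructor first: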

    public PushReceiver(HostReactor hostReactor) {
        try {
            this.hostReactor = hostReactor;
            // Create the UDP socket
            this.udpSocket = new DatagramSocket();
            this.executorService = new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
                @Override
                public Thread newThread(Runnable r) {
                    Thread thread = new Thread(r);
                    thread.setDaemon(true);
                    thread.setName("com.alibaba.nacos.naming.push.receiver");
                    return thread;
                }
            });
            // Start the receiver task so it is ready for pushed changes
            this.executorService.execute(this);
        } catch (Exception e) {
            NAMING_LOGGER.error("[NA] init udp socket failed", e);
        }
    }

The run method

    @Override
    public void run() {
        while (!closed) {
            try {

                // byte[] is initialized with 0 full filled by default
                byte[] buffer = new byte[UDP_MSS];
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                // Block until data arrives from the server
                udpSocket.receive(packet);

                String json = new String(IoUtils.tryDecompress(packet.getData()), UTF_8).trim();
                NAMING_LOGGER.info("received push data: " + json + " from " + packet.getAddress().toString());

                PushPacket pushPacket = JacksonUtils.toObj(json, PushPacket.class);
                String ack;
                if ("dom".equals(pushPacket.type) || "service".equals(pushPacket.type)) {
                    // Process the received service data
                    hostReactor.processServiceJson(pushPacket.data);

                    // send ack to server
                    ack = "{\"type\": \"push-ack\"" + ", \"lastRefTime\":\"" + pushPacket.lastRefTime + "\", \"data\":"
                            + "\"\"}";
                } else if ("dump".equals(pushPacket.type)) {
                    // dump data to server
                    ack = "{\"type\": \"dump-ack\"" + ", \"lastRefTime\": \"" + pushPacket.lastRefTime + "\", \"data\":"
                            + "\"" + StringUtils.escapeJavaScript(JacksonUtils.toJson(hostReactor.getServiceInfoMap()))
                            + "\"}";
                } else {
                    // do nothing send ack only
                    ack = "{\"type\": \"unknown-ack\"" + ", \"lastRefTime\":\"" + pushPacket.lastRefTime
                            + "\", \"data\":" + "\"\"}";
                }
                // Send the ack back to the server
                udpSocket.send(new DatagramPacket(ack.getBytes(UTF_8), ack.getBytes(UTF_8).length,
                        packet.getSocketAddress()));
            } catch (Exception e) {
                if (closed) {
                    return;
                }
                NAMING_LOGGER.error("[NA] error while receiving push data", e);
            }
        }
    }

Now look at the hostReactor.processServiceJson(pushPacket.data) method.
HostReactor

public ServiceInfo processServiceJson(String json) {
    // Parse the ServiceInfo from the JSON
    ServiceInfo serviceInfo = JacksonUtils.toObj(json, ServiceInfo.class);
    String serviceKey = serviceInfo.getKey();
    if (serviceKey == null) {
        return null;
    }
    // Look up the cached ServiceInfo
    ServiceInfo oldService = serviceInfoMap.get(serviceKey);

    // If a cached copy exists, work out which data needs updating
    boolean changed = false;
    if (oldService != null) {
        // Warn if the received data is older than the cached data
        if (oldService.getLastRefTime() > serviceInfo.getLastRefTime()) {
            NAMING_LOGGER.warn("out of date data received, old-t: " + oldService.getLastRefTime() + ", new-t: "
                               + serviceInfo.getLastRefTime());
        }
        // Put the new data into the cache
        serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);

        // (Omitted here) the cached data is compared with the new data, producing
        // newHosts: added instances; remvHosts: instances to remove;
        // modHosts: instances that were modified
        if (newHosts.size() > 0 || remvHosts.size() > 0 || modHosts.size() > 0) {
            // Publish an instance-change event
            NotifyCenter.publishEvent(new InstancesChangeEvent(
                serviceInfo.getName(), serviceInfo.getGroupName(),
                serviceInfo.getClusters(), serviceInfo.getHosts()));
            DiskCache.write(serviceInfo, cacheDir);
        }

    } else {
        // No local cache entry exists
        changed = true;
        // Put the new data into the cache
        serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
        // Publish an instance-change event directly
        NotifyCenter.publishEvent(new InstancesChangeEvent(
            serviceInfo.getName(), serviceInfo.getGroupName(),
            serviceInfo.getClusters(), serviceInfo.getHosts()));
        serviceInfo.setJsonFromServer(json);
        DiskCache.write(serviceInfo, cacheDir);
    }
    // ...
    return serviceInfo;
}
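The listing above elides the comparison that produces newHosts, remvHosts, and modHosts. A minimal sketch of such a comparison follows. It is an assumption-laden illustration rather than the HostReactor code: hosts are modelled as a map from an "ip:port" key to a string describing the instance, and `HostDiffSketch`, `Diff`, and `diff` are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of diffing a cached host list against a freshly received one.
class HostDiffSketch {

    static final class Diff {
        final Set<String> newHosts = new TreeSet<>();  // present only in the new data
        final Set<String> remvHosts = new TreeSet<>(); // present only in the cache
        final Set<String> modHosts = new TreeSet<>();  // present in both, but different

        boolean changed() {
            return !newHosts.isEmpty() || !remvHosts.isEmpty() || !modHosts.isEmpty();
        }
    }

    /** Keys are "ip:port"; values are any string rendering of the instance state. */
    static Diff diff(Map<String, String> oldHosts, Map<String, String> latestHosts) {
        Diff result = new Diff();
        for (Map.Entry<String, String> entry : latestHosts.entrySet()) {
            String cached = oldHosts.get(entry.getKey());
            if (cached == null) {
                result.newHosts.add(entry.getKey());
            } else if (!cached.equals(entry.getValue())) {
                result.modHosts.add(entry.getKey());
            }
        }
        for (String key : oldHosts.keySet()) {
            if (!latestHosts.containsKey(key)) {
                result.remvHosts.add(key);
            }
        }
        return result;
    }
}
```

Only when `changed()` is true would an event be published and the disk cache rewritten, mirroring the condition in the listing.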

Reposted from blog.csdn.net/qq_42600094/article/details/130702923