Nacos kernel design and source code analysis (2): exploring the mysteries of an open source project

Nacos kernel design

Nacos Consistency Protocol

Why does Nacos need a consensus protocol?
One of the goals of open-source Nacos is to keep deployment, operation, and maintenance costs as low as possible, so that a single package is all a user needs to start Nacos quickly in either stand-alone or cluster mode. Nacos is a component that has to store data, so meeting this goal means implementing data storage inside Nacos itself. In stand-alone mode this is not a big problem, since a simple embedded relational database is enough. In cluster mode, however, Nacos has to keep data consistent and synchronized across nodes. To solve this problem it introduces consensus algorithms, which guarantee the consistency of data between nodes.

Why did Nacos choose Raft and Distro?
Why does Nacos run both a CP protocol and an AP protocol in a single cluster? The answer starts from the Nacos use case: Nacos integrates service registration and discovery with configuration management. The data consistency guarantees between cluster nodes therefore have to be considered from two aspects.

From the perspective of service registration and discovery
The service registry is a critical component in today's microservice systems. Services learn about each other's healthy, available instances from the registry, which places high demands on its availability: in any scenario, registration and discovery should keep serving requests as far as possible. In addition, Nacos service registration uses heartbeats to compensate for lost data automatically. If data is lost, this mechanism quickly restores it.
To meet these availability requirements, a strong consistency consensus algorithm is not suitable here. Such an algorithm can serve requests only while more than half of the cluster's nodes are available; otherwise the whole algorithm simply "goes on strike". An eventual consistency algorithm, by contrast, keeps the service available and guarantees that the data on all nodes converges within a bounded period of time.
All of the above applies to non-persistent services in Nacos service registration, that is, services whose clients must report heartbeats to renew their instances. For persistent services, all data is created by calling the Nacos server directly, so Nacos itself must guarantee strong consistency between nodes. For this type of service data, a strong consistency consensus algorithm is chosen.
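The heartbeat-based compensation described above can be illustrated with a minimal sketch. This is not the Nacos implementation: the class `EphemeralRegistrySketch`, its method names, and the timeout value are all hypothetical. It only shows the idea that a heartbeat both renews a known instance and re-creates one that a node has lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry for ephemeral instances; not Nacos code. */
final class EphemeralRegistrySketch {

    private final long timeoutMillis;

    // instance address -> timestamp of the last heartbeat received
    private final Map<String, Long> lastBeat = new ConcurrentHashMap<>();

    EphemeralRegistrySketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** A heartbeat renews a known instance and re-creates a lost one. */
    void onHeartbeat(String address, long nowMillis) {
        lastBeat.put(address, nowMillis);
    }

    /** Simulates a node losing its data, for example after a restart. */
    void loseAllData() {
        lastBeat.clear();
    }

    /** Drops instances whose last heartbeat is older than the timeout. */
    void expire(long nowMillis) {
        lastBeat.values().removeIf(last -> nowMillis - last > timeoutMillis);
    }

    /** Returns the currently known instance addresses in sorted order. */
    List<String> instances() {
        List<String> result = new ArrayList<>(lastBeat.keySet());
        Collections.sort(result);
        return result;
    }
}
```

Because the next heartbeat restores the entry, losing data on one node is only a temporary inconsistency, which is why eventual consistency is acceptable for this kind of data.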

From the perspective of configuration management
Configuration data is created and managed directly on the Nacos server. A configuration can be considered saved only once a majority of nodes have stored it; otherwise the change may be lost. Losing a change is a very serious problem: if an important configuration change is published and then lost, it is likely to cause a serious production failure. For configuration data, therefore, a majority of the nodes in the cluster must be strongly consistent, and only a strong consistency consensus algorithm can be used.
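The majority rule can be made concrete with a little arithmetic. The sketch below is only an illustration of the quorum condition, with the hypothetical names `MajorityWriteSketch`, `quorum`, and `isCommitted`. It does not reproduce any Raft or JRaft code.

```java
/** Hypothetical helper illustrating the majority (quorum) rule; not Nacos or JRaft code. */
final class MajorityWriteSketch {

    /** Smallest number of nodes that forms a strict majority of the cluster. */
    static int quorum(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /** A write counts as saved only once a majority of nodes have acknowledged it. */
    static boolean isCommitted(int clusterSize, int acknowledgements) {
        return acknowledgements >= quorum(clusterSize);
    }

    /** Number of nodes that may fail while the cluster can still accept writes. */
    static int toleratedFailures(int clusterSize) {
        return clusterSize - quorum(clusterSize);
    }
}
```

For a three-node cluster the quorum is two, so one node may fail. A four-node cluster needs three acknowledgements and still tolerates only one failure, which is why odd cluster sizes are the usual choice.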

Why Raft and Distro?
Among strong consistency consensus algorithms, Raft is the most widely used in production today. It is comparatively easy to understand, and there are many mature industrial implementations and related protocols, such as Ant Financial's JRaft, Zookeeper's ZAB, Consul's Raft, Baidu's braft, and Apache Ratis. Because Nacos is built on the Java technology stack, the choice was limited to JRaft, ZAB, and Apache Ratis. ZAB is tightly bound to Zookeeper, and the Nacos team wanted to be able to communicate with the maintainers of the Raft library at any time, so JRaft was selected. Another reason is that JRaft supports multiple RaftGroups, which makes it possible for Nacos to shard its data later.

Distro is an eventual consistency protocol developed by Alibaba. There are many eventual consistency protocols, such as Gossip and the data synchronization algorithm in Eureka. Distro builds on the strengths of Gossip and Eureka and optimizes them. In native Gossip, the node that a message is sent to is selected at random, so the same message is inevitably sent to the same node more than once, which increases network traffic and adds processing load on the receiving nodes. Distro introduces the concept of an authoritative server: each node is responsible for part of the data and synchronizes that data to the other nodes, which effectively reduces message redundancy.
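One way to picture the authoritative server idea is a deterministic mapping from a service name to exactly one node. The sketch below assumes a simple hash-modulo mapping. The class `DistroResponsibilitySketch` and its methods are hypothetical, and the real Distro mapping may differ in detail.

```java
import java.util.List;

/** Hypothetical mapping of services to responsible nodes; not the actual Distro code. */
final class DistroResponsibilitySketch {

    /** Picks the single node that is authoritative for the given service. */
    static String responsibleNode(String serviceName, List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("cluster has no nodes");
        }
        // floorMod keeps the index non-negative even for negative hash codes
        int index = Math.floorMod(serviceName.hashCode(), nodes.size());
        return nodes.get(index);
    }

    /** A node handles writes for a service only if it is the authoritative one. */
    static boolean isResponsible(String self, String serviceName, List<String> nodes) {
        return self.equals(responsibleNode(serviceName, nodes));
    }
}
```

Every node computes the same answer from the same node list, so a write for a service always has one owner, and only that owner needs to push the data to the others.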

Next, let's see how they are implemented in the source code.

Let's start with the class and interface design. To achieve reusability and low coupling, Nacos is divided into separate modules.

Let's see how Nacos starts up.

Startup first needs to read the configuration, so everything is called from this class:

import com.alibaba.nacos.api.config.ConfigFactory;
import com.alibaba.nacos.api.config.ConfigService;
import com.alibaba.nacos.api.exception.NacosException;
import com.alibaba.nacos.api.naming.NamingFactory;
import com.alibaba.nacos.api.naming.NamingMaintainFactory;
import com.alibaba.nacos.api.naming.NamingMaintainService;
import com.alibaba.nacos.api.naming.NamingService;

import java.util.Properties;

/**
 * Nacos Factory.
 *
 * @author Nacos
 */
public class NacosFactory {
    
    /**
     * Create config service.
     *
     * @param properties init param
     * @return config
     * @throws NacosException Exception
     */
    public static ConfigService createConfigService(Properties properties) throws NacosException {
        return ConfigFactory.createConfigService(properties);
    }
    
    /**
     * Create config service.
     *
     * @param serverAddr server list
     * @return config
     * @throws NacosException Exception
     */
    public static ConfigService createConfigService(String serverAddr) throws NacosException {
        return ConfigFactory.createConfigService(serverAddr);
    }
    
    /**
     * Create naming service.
     *
     * @param serverAddr server list
     * @return Naming
     * @throws NacosException Exception
     */
    public static NamingService createNamingService(String serverAddr) throws NacosException {
        return NamingFactory.createNamingService(serverAddr);
    }
    
    /**
     * Create naming service.
     *
     * @param properties init param
     * @return Naming
     * @throws NacosException Exception
     */
    public static NamingService createNamingService(Properties properties) throws NacosException {
        return NamingFactory.createNamingService(properties);
    }
    
    /**
     * Create maintain service.
     *
     * @param serverAddr server address
     * @return NamingMaintainService
     * @throws NacosException Exception
     */
    public static NamingMaintainService createMaintainService(String serverAddr) throws NacosException {
        return NamingMaintainFactory.createMaintainService(serverAddr);
    }
    
    /**
     * Create maintain service.
     *
     * @param properties init param
     * @return NamingMaintainService
     * @throws NacosException Exception
     */
    public static NamingMaintainService createMaintainService(Properties properties) throws NacosException {
        return NamingMaintainFactory.createMaintainService(properties);
    }
}
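As a rough illustration of what the Properties-based factory methods expect, here is a minimal sketch that assembles the init parameters with the standard library only. The helper `ClientPropertiesSketch` is hypothetical, and the key names "serverAddr" and "namespace" are an assumption about the client's property constants, so check them against the Nacos version you use.

```java
import java.util.Properties;

/** Hypothetical helper that prepares init parameters; the key names are assumptions. */
final class ClientPropertiesSketch {

    /** Builds the Properties object that would be handed to the factory. */
    static Properties build(String serverAddr, String namespace) {
        Properties properties = new Properties();
        // Assumed key for the server list, for example "127.0.0.1:8848"
        properties.setProperty("serverAddr", serverAddr);
        // Assumed key for the namespace; left out when no namespace is given
        if (namespace != null && !namespace.isEmpty()) {
            properties.setProperty("namespace", namespace);
        }
        return properties;
    }
}
```

With the Nacos client on the classpath, the result would be passed to NacosFactory.createNamingService(properties) or NacosFactory.createConfigService(properties).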

All Nacos registration, subscription, and related operations go through the naming service.
Its initialization code:

    /**
     * Main initialization logic.
     *
     * @param properties init param
     * @throws NacosException Exception
     */
    private void init(Properties properties) throws NacosException {
        // Validate the init parameters
        ValidatorUtils.checkInitParam(properties);
        // Initialize the namespace
        this.namespace = InitUtils.initNamespaceForNaming(properties);
        InitUtils.initSerialization();
        initServerAddr(properties);
        InitUtils.initWebRootContext(properties);
        initCacheDir();
        initLogName(properties);
        
        // Concrete server proxy implementation; also initializes its thread pool configuration
        this.serverProxy = new NamingProxy(this.namespace, this.endpoint, this.serverList, properties);
        // Service heartbeat
        this.beatReactor = new BeatReactor(this.serverProxy, initClientBeatThreadCount(properties));
        // Host (service instance) lookup
        this.hostReactor = new HostReactor(this.serverProxy, beatReactor, this.cacheDir, isLoadCacheAtStart(properties),
                isPushEmptyProtect(properties), initPollingThreadCount(properties));
    }

The next step is to register each service instance

    @Override
    public void registerInstance(String serviceName, String groupName, String ip, int port, String clusterName)
            throws NacosException {
        Instance instance = new Instance();
        instance.setIp(ip);
        instance.setPort(port);
        instance.setWeight(1.0);
        instance.setClusterName(clusterName);

        registerInstance(serviceName, groupName, instance);
    }

    @Override
    public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
        // Check that the instance is legal
        NamingUtils.checkInstanceIsLegal(instance);
        // Build the service name qualified by its group
        String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
        // Is this an ephemeral instance?
        if (instance.isEphemeral()) {
            // Ephemeral instances also get a heartbeat task
            BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
            beatReactor.addBeatInfo(groupedServiceName, beatInfo);
        }
        // Register the instance
        serverProxy.registerService(groupedServiceName, groupName, instance);
    }
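The control flow of the registration method can be reduced to a small stand-alone sketch: build the grouped name, add a heartbeat entry only for ephemeral instances, then register. Everything here is hypothetical (the class `RegisterFlowSketch`, the "#" key format), and the "@@" separator between group and service is an assumption about how the grouped name is formed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the client registration flow; not Nacos code. */
final class RegisterFlowSketch {

    // Heartbeat tasks keyed by "groupedName#address"; only ephemeral instances appear here
    final Map<String, String> heartbeats = new LinkedHashMap<>();

    // Every registration request that would be sent to the server
    final List<String> registered = new ArrayList<>();

    /** Assumed grouped-name format: group, then "@@", then service. */
    static String groupedName(String serviceName, String groupName) {
        return groupName + "@@" + serviceName;
    }

    void register(String serviceName, String groupName, String address, boolean ephemeral) {
        String key = groupedName(serviceName, groupName) + "#" + address;
        if (ephemeral) {
            // Ephemeral instances must keep renewing themselves with heartbeats
            heartbeats.put(key, address);
        }
        // Both ephemeral and persistent instances are registered with the server
        registered.add(key);
    }
}
```

The asymmetry is the point: a persistent instance is registered once and lives on the server, while an ephemeral one stays alive only as long as its heartbeats keep arriving.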

When the client shuts down, the components created during initialization are shut down, together with the thread pools they own.

    @Override
    public void shutDown() throws NacosException {
        beatReactor.shutdown();
        hostReactor.shutdown();
        serverProxy.shutdown();
    }
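What shutting down a component's thread pool typically involves can be sketched with the standard executor API. This is a generic pattern, not the code Nacos uses internally; `GracefulShutdownSketch` and its timeout parameter are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Generic graceful-shutdown pattern for a thread pool; not the Nacos implementation. */
final class GracefulShutdownSketch {

    /**
     * Stops accepting new tasks, waits for running ones, and forces
     * termination if they do not finish in time.
     *
     * @return true if the pool terminated cleanly within the timeout
     */
    static boolean shutdown(ExecutorService executor, long timeoutMillis) {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                executor.shutdownNow();
                return false;
            }
            return true;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            // Restore the interrupt flag so callers can react to it
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A client holding several pools, such as one for heartbeats and one for polling, would apply this to each of them in turn.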

For the details of the shutdown process, see the previous article, "Nacos source code analysis (1): the clever thread pool design, which you can reuse in your own project".

This article ends here. The next article will share how Nacos implements a distributed consensus protocol.



Origin blog.csdn.net/weixin_44302240/article/details/123576352