Summary of problems encountered in the project (2)

CAP theory

The CAP theory is a fundamental theory in distributed systems. It states that a distributed system can satisfy at most two of the three characteristics of consistency (Consistency), availability (Availability) and partition tolerance (Partition Tolerance) at the same time; all three cannot be satisfied simultaneously.

Specifically, the CAP theory observes that, due to factors such as network partitions and machine failures, different nodes in a distributed system may be in different states, and how to handle these states has to be considered when designing the system. The theory defines three characteristics, namely consistency (Consistency), availability (Availability) and partition tolerance (Partition Tolerance):

  1. Consistency: the data on all nodes in the distributed system is consistent, that is, every node sees the same data at any point in time.

  2. Availability: the system can be accessed and responds to requests at any time, that is, it remains highly available and does not become unavailable just because some nodes fail.

  3. Partition tolerance (Partition Tolerance): even if network partitions, machine failures or other factors prevent nodes from communicating with each other, the system can still continue to operate. Normally the nodes of a distributed system form a connected network; when faults disconnect some nodes, the network is split into several areas and the data ends up scattered across these disconnected regions. This is called a partition.

The CAP theory points out that at most two of these characteristics can be satisfied at the same time, never all three. This is because, when a node fails or the network partitions, the system has to make a trade-off between consistency and availability. If it chooses to guarantee consistency, it must stop serving requests and wait for the failure to recover or the network connection to be restored before continuing to provide service, which reduces availability; if it chooses to guarantee availability, it must give up strict consistency and tolerate inconsistent data between nodes, which reduces consistency.
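
To make this trade-off concrete, here is a minimal, purely illustrative Java sketch of a read path during a network partition; the class and method names are made up and do not come from any real system. A CP-style node refuses to answer when it cannot guarantee it has the latest data, while an AP-style node answers with whatever (possibly stale) data it holds locally.

// Hypothetical illustration of the CP vs. AP choice during a partition.
public class ReadHandler {

    enum Mode { CP, AP }

    private final Mode mode;
    private final java.util.Map<String, String> localReplica = new java.util.HashMap<>();

    ReadHandler(Mode mode) {
        this.mode = mode;
    }

    String read(String key, boolean partitioned) {
        if (partitioned && mode == Mode.CP) {
            // CP: reject the request rather than risk returning stale data,
            // sacrificing availability until the partition heals.
            throw new IllegalStateException("Service unavailable: cannot guarantee consistency");
        }
        // AP (or no partition): answer from the local replica, which may be
        // stale during a partition, sacrificing strict consistency.
        return localReplica.get(key);
    }
}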

In a real distributed system, two of the three characteristics have to be chosen according to actual needs, and a trade-off has to be made. For example, systems with very high consistency requirements, such as financial systems, can choose consistency and partition tolerance at the cost of availability; large-scale Internet applications can choose availability and partition tolerance and tolerate temporary data inconsistency between nodes.

Is CP or AP guaranteed in Nacos?

Nacos supports both CP mode and AP mode, which can be configured according to actual needs.

In Nacos, you can choose to use CP mode or AP mode by configuring the nacos.core.cluster.switch parameter. If this parameter is set to nacos, AP mode will be used; if this parameter is set to raft, CP mode will be used.

Specifically, the AP mode of Nacos means that the service can still be provided even in the event of a network partition or node failure, guaranteeing the availability of the system. The CP mode of Nacos guarantees data consistency and partition tolerance, and can meet financial and other scenarios with relatively high requirements on data consistency.

For example, if our application has high requirements on data consistency, such as a funds transaction system in the financial field, we can configure Nacos in CP mode. In CP mode, Nacos synchronizes data to all nodes to ensure consistency, but in the event of a node failure or network partition the availability of the system may drop. To keep availability high, multiple nodes can be deployed and fronted by load balancing.

If our application does not require strict data consistency, such as a typical Internet application, we can configure Nacos in AP mode. In AP mode, Nacos tolerates data inconsistency between nodes in exchange for availability. However, when a node fails or a network partition occurs, different nodes may temporarily hold inconsistent data, which the application has to handle itself.

For example, the following is a sample Nacos configuration file using AP mode:

# Use AP mode
nacos.core.cluster.switch=nacos

In the configuration above, setting the nacos.core.cluster.switch parameter to nacos selects AP mode; to use CP mode, simply set the parameter to raft.

Where is this configuration set?

This configuration can be set in the Nacos configuration file. Specifically, you can set the value of this parameter in the conf/application.properties file to choose whether to use the CP mode or the AP mode.

For example, if you want to configure Nacos in CP mode, you can add the following configuration to the conf/application.properties file:

# Use CP mode
nacos.core.cluster.switch=raft

If you want to configure Nacos in AP mode, you can add the following configuration to the conf/application.properties file:

# Use AP mode
nacos.core.cluster.switch=nacos

Note that after modifying the Nacos configuration file you need to restart Nacos for the change to take effect. In addition, to prevent the configuration file from being accidentally modified or tampered with, it is recommended to manage the Nacos configuration with a secure configuration management tool in production environments.

Does Nacos use AP or CP by default?

Nacos uses AP mode by default; that is, the default value of the nacos.core.cluster.switch parameter is nacos. Of course, users can reconfigure Nacos in CP mode or AP mode according to actual needs.

In AP mode, the nodes in a Nacos cluster maintain eventual consistency of data through asynchronous communication. In this mode, if a network partition occurs between nodes, some nodes may hold inconsistent data, but Nacos gives priority to keeping the service available, so this mode suits Internet applications and other scenarios that do not require strict data consistency.

In CP mode, the nodes in a Nacos cluster use a consensus protocol to guarantee data consistency. In this mode, if a network partition occurs between nodes, some nodes may be temporarily taken out of service to preserve consistency, so some requests will fail. This mode therefore suits scenarios with high consistency requirements, such as applications in the financial field.
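
As a client-side note: when Nacos is used through Spring Cloud Alibaba, whether a given instance is registered as an ephemeral instance (handled by the AP-style protocol) or a persistent instance (handled by the Raft-based CP protocol) is commonly controlled by the ephemeral flag. The snippet below is only a sketch; it assumes the spring-cloud-starter-alibaba-nacos-discovery dependency, and the exact property name can vary between versions:

# Register this instance as a persistent (non-ephemeral) instance, i.e. the CP path
spring.cloud.nacos.discovery.ephemeral=false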

Does this refer to the AP and CP of the Nacos cluster?

Yes, the default AP mode of Nacos mentioned here refers to the AP mode of the Nacos cluster. By default the cluster uses asynchronous communication to maintain eventual consistency of data; if Nacos is configured in CP mode, a consensus protocol is used to guarantee consistency. The AP and CP modes of Nacos apply to the whole cluster, not to a single instance, and all instances in a cluster should use the same mode to ensure data consistency and correct operation of the cluster.

Do you think the registration center should be CP or AP?

I think that in most cases the registration center should be AP. If the registration center is CP, it means that when we register an instance with it or remove an instance, we must wait until the data in the registry cluster reaches consistency, which takes time. As the scale of business applications grows and applications go online and offline frequently, the pressure on the registration center becomes high, which affects the efficiency of service discovery and service invocation. If the registration center is AP, the registry cluster can keep providing service no matter what happens. It may occasionally return a service node that has already gone offline, but most microservice frameworks and components now provide fault tolerance and retry, which covers this case. Moreover, with AP the registration center does not need to spend resources keeping data strictly consistent in real time; guaranteeing eventual consistency is enough, so the pressure on the registration center is lower. In addition, consider Zookeeper as a registration center: Zookeeper guarantees CP, so if the majority of nodes in the cluster go down, the remaining Zookeeper nodes cannot provide service even though they are still alive, which is not suitable for a registry. So, in general, it is better for the registration center to guarantee AP, just as Eureka and Nacos do by default.

Summary: AP. In most cases strict data consistency is not required; it is enough to guarantee eventual consistency (via asynchronous communication). With CP, waiting for the data to become consistent is time-consuming.

How does the Nacos service determine the status of the service instance?

After a service is registered, the Nacos Client maintains a regular heartbeat to keep notifying the Nacos Server that the service is still available, so that it will not be removed. By default a heartbeat is sent every 5 s. The Nacos Server runs a scheduled task to check the health of registered service instances: for an instance from which no heartbeat has been received for more than 15 s, the healthy attribute is set to false (such an instance is not returned to clients during service discovery); if no heartbeat has been received for more than 30 s, the instance is removed directly (a removed instance re-registers itself if it resumes sending heartbeats).
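
The timing rules above can be illustrated with a simplified sketch of a server-side check. This is only an illustration of the 5 s / 15 s / 30 s logic described here, not Nacos's actual implementation, and all names in it are made up:

// Illustrative only: a periodic health check following the rules described above.
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class InstanceHealthChecker {

    static class Instance {
        volatile long lastHeartbeatMillis;   // updated each time a heartbeat arrives
        volatile boolean healthy = true;     // false => hidden from service discovery
    }

    private final Map<String, Instance> instances = new ConcurrentHashMap<>();

    // Intended to be run by a scheduled task, for example every 5 seconds.
    void check() {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<String, Instance>> it = instances.entrySet().iterator();
        while (it.hasNext()) {
            Instance inst = it.next().getValue();
            long silentFor = now - inst.lastHeartbeatMillis;
            if (silentFor > 30_000) {
                // No heartbeat for more than 30 s: remove the instance entirely.
                it.remove();
            } else if (silentFor > 15_000) {
                // No heartbeat for more than 15 s: mark unhealthy so it is not returned to clients.
                inst.healthy = false;
            }
        }
    }
}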

How is the bottom layer of load balancing in Nacos implemented?

Load balancing in Nacos is implemented on the client side, and the underlying implementation is provided by Ribbon. Specifically, when Nacos is used with Spring Cloud, the client integrates two components, OpenFeign and Ribbon: Ribbon is responsible for client-side load balancing, while OpenFeign provides declarative service invocation on top of Ribbon.

In Nacos, the client uses the Nacos Client to subscribe to service instance information and periodically pulls the latest list of service instances from the Nacos Server. After the client obtains the instance list it hands the instances to Ribbon, and Ribbon selects a suitable instance for each request according to the configured load balancing strategy. When the service instances change, the Nacos Client notifies Ribbon in time and Ribbon re-selects among the instances, achieving dynamic load balancing.
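
As a usage sketch of this flow: with Spring Cloud, a caller typically declares a Feign client against the service name registered in Nacos, and Ribbon picks a concrete instance for each call. The service name order-service and the path /orders/{id} below are made-up examples, and the sketch assumes the spring-cloud-starter-openfeign and spring-cloud-starter-alibaba-nacos-discovery dependencies:

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// "order-service" is the (hypothetical) name the provider registered with Nacos.
// Ribbon resolves it to a concrete instance from the list pulled from Nacos Server.
@FeignClient(name = "order-service")
public interface OrderClient {

    @GetMapping("/orders/{id}")
    String getOrder(@PathVariable("id") Long id);
}

The application class also needs @EnableFeignClients for the interface above to be picked up; injecting OrderClient and calling getOrder(...) then goes through Ribbon's load balancing transparently.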

Nacos supports a variety of load balancing strategies, including round robin, random, weighted and other strategies. Users can configure different load balancing strategies in the Nacos Client according to actual needs. For example, when using Ribbon as the load balancer, you can add the following configuration to the configuration file:

# Use the round-robin strategy
ribbon.NFLoadBalancerRuleClassName=com.netflix.loadbalancer.RoundRobinRule

In the configuration above, setting the ribbon.NFLoadBalancerRuleClassName parameter to com.netflix.loadbalancer.RoundRobinRule selects the round-robin strategy. To use another load balancing strategy, simply set this parameter to the corresponding rule class name.
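
If the strategy should apply to only one service rather than globally, Ribbon also allows the rule to be scoped with the service name as a prefix. Here provider-service is a made-up service name used purely for illustration:

# Use the random strategy only for calls to provider-service
provider-service.ribbon.NFLoadBalancerRuleClassName=com.netflix.loadbalancer.RandomRule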
