The difference between ZooKeeper and Eureka, and how to choose

Introduction

Eureka is an open-source Netflix product that provides service registration and discovery, along with a corresponding Java client. In its implementation, all nodes are peers, so the failure of some registry nodes does not affect the rest of the cluster: even if only one node in the cluster survives, it can still serve discovery requests normally. And even if every registry node goes down, Eureka clients fall back on their locally cached service information. This keeps calls between our microservices robust.

ZooKeeper is an open-source coordination service that mainly provides distributed configuration, synchronization, and naming/registration for large-scale distributed systems. It was originally a sub-project of Hadoop, used to manage coordination data in the cluster, and has since been promoted to an independent top-level Apache project. It is also used as a service-discovery solution in many scenarios.

Comparison

Distributed systems are governed by the well-known CAP theorem: C (data consistency), A (service availability), and P (partition tolerance, i.e., tolerance of network-partition failures). No distributed system can satisfy all three properties at once; at most two can hold simultaneously.

ZooKeeper

ZooKeeper is designed as a CP system: any read from ZooKeeper at any time returns a consistent result, and the system tolerates network partitions, but it cannot guarantee the availability of every request. In practice, when using ZooKeeper to fetch a service list, if the cluster is in the middle of a leader election, or if more than half of the machines in the ensemble are unavailable, no data can be obtained. ZooKeeper therefore cannot guarantee service availability.
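The "more than half" rule above can be made concrete with a little arithmetic. This is a minimal sketch of the majority-quorum math, not ZooKeeper source code:

```java
public class QuorumMath {
    // Smallest number of servers that forms a strict majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // How many server failures the ensemble can tolerate while staying available.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        // A 5-node ensemble needs 3 live servers and survives 2 failures.
        System.out.println(quorumSize(5));        // 3
        System.out.println(tolerableFailures(5)); // 2
    }
}
```

Note that a 6-node ensemble also tolerates only 2 failures (its quorum is 4), which is why ZooKeeper ensembles are usually sized with an odd number of servers.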

It is true that in most distributed environments, especially those involving data storage, consistency must be guaranteed first, which is why ZooKeeper was designed as CP. But service discovery is different: for the same service, even if different registry nodes hold slightly different provider information, the consequences are not catastrophic. For a service consumer, being able to consume at all is what matters most; attempting a call with possibly stale instance information is better than failing to call because no instance information could be obtained. (A quick failure lets the consumer refresh its configuration and retry.) So for service discovery, availability matters more than data consistency: AP beats CP.

Eureka

Spring Cloud Netflix, by contrast, follows the AP principle in its Eureka design. Eureka Server can also run multiple instances in a cluster to avoid a single point of failure, but instead of ZooKeeper's leader-election process, Eureka Server uses peer-to-peer communication. The architecture is decentralized: there is no master/slave distinction, and every peer is equal. Nodes register with one another to improve availability, and each node is configured with one or more valid serviceUrls pointing at the other nodes. Each node can be regarded as a replica of the others.
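As a rough illustration, two peer Eureka Servers can point their serviceUrls at each other. This is a minimal sketch assuming Spring Cloud Netflix and hypothetical hostnames peer1/peer2:

```yaml
# application.yml on peer1 (peer2 mirrors this, pointing back at peer1)
spring:
  application:
    name: eureka-server
eureka:
  instance:
    hostname: peer1
  client:
    # Register with, and replicate from, the other peer.
    serviceUrl:
      defaultZone: http://peer2:8761/eureka/
```

With this layout each server is simultaneously a registry and a client of its peers, which is exactly the "each node is a replica of the others" behavior described above.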

If a Eureka Server goes down, Eureka Client requests automatically fail over to another Eureka Server node; when the failed server recovers, Eureka brings it back into the cluster. When a node accepts a client request, every write operation triggers a replicateToPeer (inter-node replication) call, which forwards the request to all other Eureka Server nodes the node currently knows about.

After a new Eureka Server node starts, it first tries to fetch the full instance registry from a neighboring node to initialize itself. Eureka Server obtains the list of all nodes via the getEurekaServiceUrls() method and refreshes it periodically through heartbeat renewals. With the default configuration, if Eureka Server does not receive a heartbeat from a service instance within a certain period, it evicts that instance (90 seconds by default, configurable via eureka.instance.lease-expiration-duration-in-seconds). However, when an Eureka Server node loses too many heartbeats in a short time (for example, during a network partition), the node enters self-preservation mode.
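The heartbeat and eviction timings mentioned above map to two Spring Cloud properties. A minimal sketch with the default values:

```yaml
eureka:
  instance:
    # Client side: how often an instance sends a heartbeat renewal (seconds).
    lease-renewal-interval-in-seconds: 30
    # Server side: evict the instance if no heartbeat arrives
    # within this many seconds (default 90).
    lease-expiration-duration-in-seconds: 90
```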

What is self-preservation mode? With the default configuration, if the number of heartbeat renewals Eureka Server receives per minute drops below a threshold (number of instances × (60 / heartbeat interval in seconds) × the self-preservation coefficient) and stays below it for 15 minutes, self-preservation is triggered. In this mode, Eureka Server protects the information in its registry and stops evicting any service instance; once the renewal rate climbs back above the threshold, the node automatically exits self-preservation. The design philosophy was mentioned earlier: it is better to keep possibly stale registration information than to blindly evict instances that may still be healthy. The mode can be disabled with eureka.server.enable-self-preservation=false; eureka.instance.lease-renewal-interval-in-seconds changes the heartbeat interval, and eureka.server.renewal-percent-threshold changes the self-preservation coefficient (default 0.85).
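Following the formula in the paragraph above, the threshold can be sketched as a small calculation (an illustration of the formula, not Eureka's actual source code):

```java
public class SelfPreservationThreshold {
    // instances * (60 / heartbeat interval in seconds) * coefficient,
    // per the formula described in the text.
    static int threshold(int instanceCount, int heartbeatIntervalSeconds,
                         double renewalPercentThreshold) {
        int expectedRenewalsPerMinute = instanceCount * (60 / heartbeatIntervalSeconds);
        return (int) (expectedRenewalsPerMinute * renewalPercentThreshold);
    }

    public static void main(String[] args) {
        // 10 instances, default 30s heartbeat, default 0.85 coefficient:
        // 10 * 2 = 20 expected renewals/min, threshold = 17.
        System.out.println(threshold(10, 30, 0.85)); // 17
    }
}
```

In this example, if the server receives fewer than 17 renewals per minute for 15 minutes, it stops evicting instances.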

Summary

ZooKeeper is CP-based and does not guarantee high availability: if it is electing a leader, or more than half of the ensemble is unavailable, no data can be served. Eureka is AP-based and stays available; even if every server goes down, clients can still use locally cached data. For a registry, the data does not change frequently; it changes only on releases and machine failures. For such rarely changing data, strict CP is a poor fit, while an AP system can sacrifice consistency to preserve availability when problems occur, returning old or cached data.

So in theory, Eureka is the better fit for a registry. In practice, though, many projects still use ZooKeeper, because their clusters are not large and more than half of the registry machines rarely go down at once, so there is usually no real problem.


Origin: blog.csdn.net/qq_39809613/article/details/108438659