Distributed configuration service etcd vs. distributed coordination service ZooKeeper

etcd is a highly available key-value store used mainly for shared configuration and service discovery. Developed and maintained by CoreOS and inspired by ZooKeeper and Doozer, etcd is written in Go and replicates its log through the Raft consensus algorithm to guarantee strong consistency. Raft is a relatively new consensus algorithm from Stanford, designed for log replication in distributed systems. Raft achieves consistency by electing a leader, which then replicates log entries to the other nodes; any node may become the leader. Google's container cluster management system Kubernetes, the open source PaaS platform Cloud Foundry, and CoreOS's Fleet all use etcd extensively.
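To make the election idea concrete, here is a minimal sketch of majority voting. It is a deliberate simplification, not Raft as etcd implements it: it ignores election timeouts, log comparison, and network messages, and the names `Node`, `request_vote`, and `run_election` are made up for illustration. It only shows that each node grants one vote per term and that a candidate needs a majority to become leader.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # term -> name of the candidate this node voted for in that term
    voted_for: dict = field(default_factory=dict)

    def request_vote(self, term: int, candidate: str) -> bool:
        """Grant at most one vote per term, first come, first served."""
        if term not in self.voted_for:
            self.voted_for[term] = candidate
        return self.voted_for[term] == candidate


def run_election(nodes, candidate: str, term: int) -> bool:
    """Return True if the candidate collects votes from a majority of nodes."""
    votes = sum(node.request_vote(term, candidate) for node in nodes)
    return votes >= len(nodes) // 2 + 1


cluster = [Node("a"), Node("b"), Node("c")]
print(run_election(cluster, "a", term=1))  # True: all three nodes vote for a
print(run_election(cluster, "b", term=1))  # False: votes for term 1 are already spent
print(run_election(cluster, "b", term=2))  # True: a new term allows a new vote
```

Because every node votes only once per term, two candidates can never both reach a majority in the same term, which is why at most one leader exists per term.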

An etcd cluster works on the basis of the Raft consensus algorithm. As of version 0.5.0, etcd ships its own Raft implementation instead of relying on the third-party library go-raft as before. The advantage of Raft is that it efficiently keeps the log contents of every node in a distributed system consistent while giving the cluster a degree of fault tolerance. Even if some nodes or network links fail, the cluster keeps operating correctly as long as a majority of nodes remain available. Even when so many nodes fail (generally half or more of the cluster) that the cluster becomes unavailable, the data held by the nodes is still guaranteed not to contain erroneous results.
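The fault-tolerance claim comes down to simple majority arithmetic. The sketch below is illustrative only; the function names `quorum`, `tolerated_failures`, and `is_available` are assumptions, not part of etcd.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes may fail while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


def is_available(cluster_size: int, failed: int) -> bool:
    """The cluster can make progress only while a majority is alive."""
    return cluster_size - failed >= quorum(cluster_size)


for n in (1, 3, 4, 5):
    print(f"size={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

A 4-node cluster tolerates one failure, the same as a 3-node cluster, which is why clusters of this kind are usually deployed with an odd number of nodes.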

 

Background

CoreOS uses etcd as its cluster configuration service, and etcd has several notable features: it is a key-value data store, a single instance can handle about 1000 write operations per second, and it exposes a convenient REST interface. ZooKeeper is a distributed coordination service that is prominent in the Hadoop ecosystem, providing distributed locks, data synchronization, and other services.

Functionally, both can handle the synchronization and configuration problems that arise in a distributed cluster. However, the two services were designed for different purposes, which leads to a number of small differences between them.

etcd

Purpose: a highly available key-value store, mainly used for shared configuration and service discovery

Interface: REST interface (HTTP+JSON), which is easy for every host in the cluster to access

Functions: key-value storage, cluster queue synchronization, watching a key for value changes, querying the history of a key's values, and so on

Distributed protocol: Raft consensus protocol, which provides strong consistency guarantees

Deployment form: a small cluster (the etcd server nodes) serving a large cluster (the other nodes, which use the service directly); the overall cluster can reach thousands of nodes

Implementation language: Go, whose efficiency is close to that of C and which was designed with concurrency and communication between concurrent tasks in mind. Performs very well in small clusters
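As an illustration of the HTTP+JSON interface, the sketch below builds requests for setting, reading, and watching a key. It assumes the etcd v2 HTTP API (`/v2/keys/...`) and a server at `http://127.0.0.1:2379`; newer etcd releases favor the gRPC-based v3 API instead. The helper names and the key `config/db_host` are hypothetical. Nothing is sent over the network, so the block runs without an etcd server.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE = "http://127.0.0.1:2379"  # assumed local etcd client endpoint


def key_url(key: str, **params) -> str:
    """Build the v2 keys URL for a key, with optional query parameters."""
    url = f"{BASE}/v2/keys/{quote(key.lstrip('/'))}"
    return f"{url}?{urlencode(params)}" if params else url


def put_request(key: str, value: str) -> Request:
    """Set a key: PUT with a form-encoded `value` field."""
    body = urlencode({"value": value}).encode()
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(key_url(key), data=body, headers=headers, method="PUT")


def get_request(key: str) -> Request:
    """Read a key: plain GET; the server answers with JSON."""
    return Request(key_url(key), method="GET")


def watch_request(key: str) -> Request:
    """Watch a key: a GET with wait=true blocks until the key changes."""
    return Request(key_url(key, wait="true"), method="GET")


for req in (put_request("config/db_host", "10.0.0.5"),
            get_request("config/db_host"),
            watch_request("config/db_host")):
    print(req.get_method(), req.full_url)
```

To actually send one of these against a running server, pass the request to `urllib.request.urlopen` and decode the JSON body of the response.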

ZooKeeper

Purpose: an efficient and reliable distributed coordination system

Interface: a custom TCP-based protocol, so the corresponding client library must be installed

Functions: key-value storage, ephemeral (temporary) nodes in the cluster, watching key-value changes, and so on

Distributed protocol: Zab, a protocol that improves on the Paxos consensus protocol. Provides strong consistency guarantees

Deployment form: a small cluster (the ZooKeeper server nodes) serving a large cluster (the other nodes, which use the service directly); the overall cluster can reach thousands of nodes

Implementation language: Java. The implementation involves more code than the Go one, and performance in small clusters is average; at large scale, with multi-threading optimizations, it is not much different from Go

Summary:

Because of their different design goals, etcd, with its native interface and the way it delivers its service, is better suited as a cluster configuration server that stores large amounts of configuration data for the cluster. Its convenient REST interface also makes it easy for any node in the cluster to use the key-value service. ZooKeeper is better suited to providing distributed coordination services; implementing a distributed lock model with it is much simpler than with etcd. In practice, choose the service that matches your own use case.
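One reason locks are simple in ZooKeeper is the common recipe built on ephemeral sequential nodes: each client creates a numbered node under a lock path, the lowest number holds the lock, and a node disappears when its owner releases it or its session ends. The sketch below is an in-memory simulation of that idea only. It does not use any ZooKeeper client API, and `LockQueue` and its methods are hypothetical names.

```python
import itertools


class LockQueue:
    """In-memory stand-in for a lock path holding ephemeral sequential nodes."""

    def __init__(self):
        self._counter = itertools.count()
        self._nodes = {}  # sequence number -> owning client

    def join(self, client: str) -> int:
        """Create the client's sequential node and return its sequence number."""
        seq = next(self._counter)
        self._nodes[seq] = client
        return seq

    def holder(self):
        """The client with the lowest sequence number holds the lock."""
        if not self._nodes:
            return None
        return self._nodes[min(self._nodes)]

    def leave(self, seq: int) -> None:
        """Remove a node, as happens on release or when a session expires."""
        self._nodes.pop(seq, None)


queue = LockQueue()
a = queue.join("client-a")
b = queue.join("client-b")
print(queue.holder())  # client-a
queue.leave(a)         # client-a releases the lock or its session ends
print(queue.holder())  # client-b
```

In a real deployment each waiting client would watch the node just ahead of its own rather than poll, so that only one client wakes up when the lock is released.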

 

 
