Introduction to CoreOS etcd

etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles leader elections during network partitions and will tolerate machine failure, including the leader.

etcd is an open-source project initiated by CoreOS. It is a distributed, consistent key-value (KV) store used for shared configuration and service discovery.

Your applications can read and write data into etcd. A simple use-case is to store database connection details or feature flags in etcd as key value pairs. These values can be watched, allowing your app to reconfigure itself when they change.
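
As a minimal sketch of that pattern, the example below uses the official Go client (go.etcd.io/etcd/client/v3) to write a feature flag, read it back, and watch it for changes. The endpoint address and key name are illustrative assumptions.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect to a local etcd member (endpoint is an assumed example).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Store a feature flag as a key-value pair.
	if _, err := cli.Put(ctx, "/config/feature-x", "enabled"); err != nil {
		log.Fatal(err)
	}

	// Read it back.
	resp, err := cli.Get(ctx, "/config/feature-x")
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Printf("%s = %s\n", kv.Key, kv.Value)
	}

	// Watch the key so the application can reconfigure itself on change.
	for wresp := range cli.Watch(ctx, "/config/feature-x") {
		for _, ev := range wresp.Events {
			fmt.Printf("%s %s = %s\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}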

 

Advanced uses take advantage of the consistency guarantees to implement database leader elections or do distributed locking across a cluster of workers.
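
As an example of the locking case, the Go client ships a concurrency helper package built on those guarantees. The sketch below acquires a cluster-wide mutex; the endpoint and lock prefix are assumptions, and the session's lease releases the lock automatically if the worker dies.

package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// A session ties the lock to a lease, so a crashed worker cannot hold it forever.
	session, err := concurrency.NewSession(cli)
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// "/locks/nightly-job" is an arbitrary example prefix.
	mutex := concurrency.NewMutex(session, "/locks/nightly-job")

	ctx := context.Background()
	if err := mutex.Lock(ctx); err != nil {
		log.Fatal(err)
	}
	log.Println("lock acquired; only one worker runs this section at a time")
	if err := mutex.Unlock(ctx); err != nil {
		log.Fatal(err)
	}
}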

etcd uses the Raft protocol to keep the state of every node in the cluster consistent. Put simply, an etcd cluster is a distributed system: multiple nodes communicate with each other and present a single service to the outside world. Each node stores a complete copy of the data, and Raft ensures that every copy stays consistent.
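
One way to observe this from the outside is the client's Status call, which reports each endpoint's view of the Raft term, Raft index, and current leader ID; on a healthy cluster every member agrees on the leader and converges to the same index. The endpoint addresses below are assumptions.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Assumed addresses for a 3-member cluster.
	endpoints := []string{"10.0.0.1:2379", "10.0.0.2:2379", "10.0.0.3:2379"}

	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints, DialTimeout: 5 * time.Second})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Ask each member for its own view of the Raft state.
	for _, ep := range endpoints {
		st, err := cli.Status(context.Background(), ep)
		if err != nil {
			log.Printf("%s: %v", ep, err)
			continue
		}
		fmt.Printf("%s leader=%x raftTerm=%d raftIndex=%d\n", ep, st.Leader, st.RaftTerm, st.RaftIndex)
	}
}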

 

Disaster recovery

etcd is designed to withstand machine failures. An etcd cluster automatically recovers from temporary failures (e.g., machine reboots) and tolerates up to (N-1)/2 permanent failures for a cluster of N members. When a member permanently fails, whether due to hardware failure or disk corruption, it loses access to the cluster. If the cluster permanently loses more than (N-1)/2 members then it disastrously fails, irrevocably losing quorum. Once quorum is lost, the cluster cannot reach consensus and therefore cannot continue accepting updates.
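
For example, a 5-member cluster tolerates 2 permanent failures; losing a third loses quorum. Recovering from that situation requires restoring from a backup, and the Go client can stream a snapshot of the keyspace for this purpose, as sketched below. The endpoint and output filename are assumptions; restoring the file is done separately with the etcdctl/etcdutl snapshot restore tooling.

package main

import (
	"context"
	"io"
	"log"
	"os"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Stream a point-in-time snapshot of the keyspace from one member.
	rc, err := cli.Snapshot(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer rc.Close()

	// "backup.db" is an example path; keep backups off the cluster machines.
	f, err := os.Create("backup.db")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := io.Copy(f, rc); err != nil {
		log.Fatal(err)
	}
	log.Println("snapshot written to backup.db")
}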

There are many systems that provide configuration sharing and service discovery; the best known is ZooKeeper (hereinafter referred to as ZK), and etcd can be regarded as a rising star. In terms of implementation, the ease of understanding its consensus protocol, operations and maintenance, and security, etcd has advantages over ZooKeeper.

 

Clustering Guide

Starting an etcd cluster statically requires that each member knows every other member in the cluster. In a number of cases, the IPs of the cluster members may be unknown ahead of time. In these cases, the etcd cluster can be bootstrapped with the help of a discovery service.

 

Once an etcd cluster is up and running, adding or removing members is done via runtime reconfiguration. To better understand the design behind runtime reconfiguration, we suggest reading the runtime configuration design document.
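
Runtime reconfiguration is exposed through the members API. The sketch below lists the current members and adds a new one by its peer URL; the URL is an assumption, and the new etcd process must afterwards be started with the existing-cluster settings before it can join.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// List the current cluster membership.
	lresp, err := cli.MemberList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range lresp.Members {
		fmt.Printf("%x %s %v\n", m.ID, m.Name, m.PeerURLs)
	}

	// Register a new member by its peer URL (example address).
	aresp, err := cli.MemberAdd(ctx, []string{"http://10.0.0.4:2380"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("added member %x\n", aresp.Member.ID)
}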

 

etcd usage scenarios

Similar to ZK, etcd has many usage scenarios, including:

Configuration management

Service registration and discovery

Leader election (see the sketch after this list)

Application scheduling

Distributed queue

Distributed lock
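
As a sketch of the leader-election scenario, the Go client's concurrency package provides an Election recipe: each candidate campaigns on a shared key prefix and exactly one of them holds leadership at a time. The prefix and candidate name are assumptions.

package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	session, err := concurrency.NewSession(cli)
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// All candidates campaign under the same example prefix.
	election := concurrency.NewElection(session, "/election/primary")

	ctx := context.Background()
	// Campaign blocks until this candidate becomes the leader.
	if err := election.Campaign(ctx, "worker-1"); err != nil {
		log.Fatal(err)
	}
	log.Println("worker-1 is now the leader")

	// Step down so another candidate can take over.
	if err := election.Resign(ctx); err != nil {
		log.Fatal(err)
	}
}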

 

Discovery service protocol

 

The discovery service protocol helps a new etcd member discover all the other members during the cluster bootstrap phase using a shared discovery URL.

 

The discovery service protocol is used only during the cluster bootstrap phase; it cannot be used for runtime reconfiguration or cluster monitoring.

 

The protocol uses a new discovery token to bootstrap one unique etcd cluster. Remember that one discovery token can represent only one etcd cluster. Once the discovery protocol has started on a token, even if it failed halfway, that token must not be reused to bootstrap another etcd cluster.

The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.
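
As an illustration of typical usage, a client first requests a fresh token URL sized for the intended cluster and then starts every new member with --discovery pointing at that URL. The cluster size of 3 below is an assumption.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Request a discovery token for a 3-member cluster (size is an example).
	resp, err := http.Get("https://discovery.etcd.io/new?size=3")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Prints a URL containing a unique token; each new member is then
	// started with --discovery set to this URL.
	fmt.Println(string(body))
}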
