2019/08/09 Basic concept of zookeeper (01)

** zookeeper mainly provides services for distributed systems: ES clusters, distributed file systems, MySQL master-slave replication; broadly, components that do not run on the same node form a distributed system.
Distributed system: a system whose hardware or software components are spread across different computers in a network and that communicate and cooperate with each other only through message passing.
**

Features:
distribution, peer-to-peer structure, concurrency (serving external requests at the same time), lack of a global clock, and failures inevitably occur (the network is not always reliable)

Typical problems faced:
abnormal communication, network partition, tri-state outcomes (success, failure, timeout), node failure

CAP: proposed in July 2000 by Eric Brewer at ACM PODC (a conference);
proved in 2002 by Seth Gilbert and Nancy Lynch of MIT;
any distributed system can satisfy at most two of the three properties.
P in CAP: partition tolerance
CP: consistency + partition tolerance
AP: availability + partition tolerance
Guaranteeing availability alone while the data is inconsistent is meaningless, so a distributed system always keeps P and chooses between C and A
BASE:
BA: basically available (when a distributed system fails, it is allowed to lose part of its availability)
S: soft state (the system is allowed to have an intermediate state, and that state is assumed not to affect the system's overall availability)
E: eventual consistency (after all replicas in the system have synchronized, the system can eventually reach a consistent state)
Variants of consistency: causal consistency; session consistency (data read within one session is always consistent); monotonic read consistency; monotonic write consistency; read-your-writes consistency (the data you write can always be read back by yourself; as long as you are the only writer, your own reads stay consistent).
To ensure consistency in a distributed system, there are multiple protocols:
2PC: Two-Phase Commit; every data change must be divided into 2 steps:
1. prepare (request and confirm)
2. commit (request and execute)
3PC: Three-Phase Commit: CanCommit (judge which nodes in the cluster can commit) -> PreCommit (pre-commit) -> DoCommit (actual commit)
Paxos protocol (named after the ancient Greek island of Paxos): proposed by Leslie Lamport in 1990; widely recognized as one of the most effective methods for solving the consistency problem in distributed systems.
But nowadays there are two simpler solutions that provide service coordination and registration management for distributed systems: zookeeper and etcd (etcd is the one used in the Docker/container ecosystem). Zookeeper is not implemented with raw Paxos: it is built on Paxos but changes the protocol and the implementation approach, because Paxos is genuinely difficult to implement as a computer algorithm.

2PC again: Two-Phase Commit; every data change must be carried out in 2 steps:
1. prepare (request and confirm)
2. commit (request and execute)
The problem: participants that finish first are blocked waiting for the slowest one (the bucket theory, i.e. the short-board effect); only when the last request completes is the phase considered fully done, and the next phase cannot be requested until everything has completed. A toy sketch follows.
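A minimal toy sketch of the 2PC coordinator logic just described (the `Participant` interface is hypothetical, purely for illustration). It makes the blocking visible: the coordinator cannot decide until the slowest participant has voted:

```java
import java.util.List;

// Hypothetical participant interface, for illustration only.
interface Participant {
    boolean prepare();   // phase 1: vote yes/no (may be arbitrarily slow)
    void commit();       // phase 2, on unanimous yes
    void rollback();     // phase 2, otherwise
}

public class TwoPhaseCommit {
    // Returns true if the transaction committed on all participants.
    static boolean run(List<Participant> participants) {
        // Phase 1: prepare. The coordinator blocks here until the
        // slowest participant answers (the short-board effect).
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) { allYes = false; break; }
        }
        // Phase 2: commit only if every vote was yes, else roll back.
        for (Participant p : participants) {
            if (allYes) p.commit(); else p.rollback();
        }
        return allYes;
    }
}
```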

Google Chubby: a highly available distributed lock service. GFS (the Google File System; HDFS is the Hadoop counterpart) and BigTable (of which HBase is the open-source clone) both use Chubby.
Roles: distributed coordination, metadata storage, master election;

There is a common pattern: a shared service coordinates the processes running on multiple nodes. Coordination between nodes requires sending messages between them, but if a node hangs in the middle there is no way to keep coordinating.
So the nodes do not communicate with each other directly when they coordinate; they all communicate with shared storage. Each node periodically synchronizes its state to the shared storage, so every node guarantees that the other nodes can see its relevant information. The storage keeps one data node per server; once a server hangs and its connection terminates, that data node is removed, and when the other nodes can no longer obtain data about that node, the node is considered to no longer exist.
The shared storage service itself supports master election among its members and must also be highly available; it is generally deployed with an odd number of nodes, three or more, so that if one of them hangs the others continue to serve. A sample ensemble config is sketched below.
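For example, a three-node ZooKeeper ensemble is configured roughly like this (a sketch only; hostnames and paths are placeholders):

```
# zoo.cfg (illustrative values)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
# server.N=host:quorumPort:electionPort
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```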

The knock-off ("shanzhai") version of Chubby is zookeeper.

HDFS / HBase both rely on Zookeeper.
Zookeeper is an open-source distributed coordination service, created by the well-known Internet company Yahoo; it is the open-source implementation of Chubby. In other words, zk is a typical solution for distributed data consistency: distributed applications can implement data publish/subscribe, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues based on it;

HDFS + MapReduce = Hadoop
HBase, the open-source clone of BigTable, provides columnar data services on top of HDFS

Puppet pushes updates to subscribers one by one, in sequence; zk can notify all subscribers at once.

Basic concepts:
Cluster roles: Leader, Follower, Observer (a zk cluster has at least three nodes, generally one leader and two followers)
Leader: produced by election; provides read and write services;
Follower: participates in elections and can be elected (if the leader hangs, a follower can become the new master); provides read service;
Observer: does not vote in elections and cannot be elected (can never become master); provides read service only;
Session: the zk client and server establish a long-lived TCP connection that lasts as long as the node participates in the distributed system;
sessionTimeout: if network problems break the connection but the client reconnects within sessionTimeout, the session remains valid (a connection sketch follows)
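A minimal sketch of establishing such a session with the official Java client (the connect string and timeout are placeholder values):

```java
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class SessionDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // 5000 ms is the sessionTimeout: if the TCP link breaks but the
        // client reconnects within this window, the session survives.
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000,
            (WatchedEvent e) -> {
                if (e.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            });
        connected.await(); // block until the session is established
        System.out.println("session id: 0x" + Long.toHexString(zk.getSessionId()));
        zk.close();
    }
}
```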

zk's data management model is an inverted tree. A distributed application can register a child node in the tree, and under that node each host can have a leaf node that stores information about the current host: online, offline, or nonexistent.
While a host is connected to zk, its session information persists; when the host goes down, the session closes and the information disappears.
Each node of the distributed application establishes a persistent session with zk and, through that session, keeps refreshing its own status information on its node in zk, for example every three seconds; as long as the other nodes can see that this node's entry is still there, they consider the node alive. A sketch of this presence pattern follows.
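A minimal sketch of this presence pattern with the Java client (paths and data are illustrative): each host registers an ephemeral znode that vanishes with its session, and any peer lists the children to see who is alive:

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class PresenceDemo {
    public static void main(String[] args) throws Exception {
        // Assumes the session connects as in the sketch above.
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000, e -> {});
        // Application root: persistent, survives sessions; create once.
        try {
            zk.create("/app", new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } catch (KeeperException.NodeExistsException ignored) {}
        // Per-host leaf: ephemeral, so zk removes it automatically when
        // this host's session terminates; presence == node existence.
        zk.create("/app/host1", "online".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        // Any peer can list the children to see which hosts are alive.
        List<String> alive = zk.getChildren("/app", false);
        System.out.println("alive hosts: " + alive);
    }
}
```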

Data node (ZNode): the data unit in zk's data model. zk stores its data in memory, and the data model is a tree structure (the ZNode tree); each ZNode holds its own data in memory.
Nodes come in two classes:
Persistent nodes: disappear only after being explicitly deleted
Ephemeral (temporary) nodes: disappear automatically when the session terminates (the session being the link established between the zk client and server)

Each data node carries version numbers.
Version (Stat): zk maintains a data structure called Stat for each ZNode, which records the ZNode's three version numbers:
version: version of the current znode's data
cversion: version of the current znode's children
aversion: version of the current znode's ACL (access control list)
A small example of reading these follows.
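A minimal sketch of reading a znode's Stat with the Java client, plus using `version` as a compare-and-set guard on writes (the path and data are placeholders):

```java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class StatDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000, e -> {});
        Stat stat = new Stat();
        byte[] data = zk.getData("/app/config", false, stat); // fills stat
        System.out.println("version  = " + stat.getVersion());  // data version
        System.out.println("cversion = " + stat.getCversion()); // children version
        System.out.println("aversion = " + stat.getAversion()); // ACL version
        // Passing the expected version makes the write a compare-and-set:
        // it fails with BadVersionException if someone changed the node
        // after we read it.
        zk.setData("/app/config", "new-value".getBytes(), stat.getVersion());
        zk.close();
    }
}
```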
Multiple distributed systems may share one zk, so zk needs to isolate them from each other.
ACL: zk uses an ACL mechanism for permission control; the five permissions are
CREATE, READ, WRITE, DELETE, ADMIN
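A minimal sketch of attaching an ACL at creation time with the Java client (here a node that anyone may read but no one may modify through this entry; path and data are placeholders):

```java
import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class AclDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000, e -> {});
        // Perms is a bitmask over the five rights CREATE/READ/WRITE/DELETE/ADMIN;
        // this entry grants READ to the "world:anyone" identity only.
        List<ACL> readOnly =
            Arrays.asList(new ACL(ZooDefs.Perms.READ, new Id("world", "anyone")));
        zk.create("/app/public-info", "read me".getBytes(),
                  readOnly, CreateMode.PERSISTENT);
        zk.close();
    }
}
```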

Event listener (Watcher):
a trigger mechanism that users register on zk; when the specified events occur, zk notifies the relevant clients;
a node registers a listener with zk, and once the data changes, zk immediately notifies the nodes that registered the listener; then, for example, everyone can go elect a new "village head" (leader). A watcher sketch follows.
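A minimal sketch of registering a watcher with the Java client (path is a placeholder). Note that zk watches are one-shot: they fire once and must be re-registered:

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class WatchDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000, e -> {});
        Watcher onChange = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                // Fires once when /app/leader is created, deleted, or changed.
                System.out.println("event " + event.getType()
                                   + " on " + event.getPath());
            }
        };
        // exists() also works when the node is absent; the watcher then
        // fires on NodeCreated.
        zk.exists("/app/leader", onChange);
        Thread.sleep(60_000); // keep the session open to receive the event
    }
}
```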

All distributed nodes can perform election, coordination, and publish/subscribe based on zk; zk itself realizes this through its own special protocol:
ZAB protocol: Zookeeper Atomic Broadcast, zk's atomic broadcast protocol. The zab protocol carries out leader election among the participating nodes,
ensures that a new leader can be re-elected when the current leader crashes, guarantees data integrity,
and supports a crash-recovery mechanism.
All transaction requests are handled by the leader; the other servers, followers and observers, only provide read service.

When a client sends a write request, it goes to zk's leader. After the leader receives it, it turns the modification request into a proposal and broadcasts it to the zk cluster, asking whether the modification is agreed to. The followers "raise their hands" to agree; once more than half agree, the data is applied to the source data.

All transactions in zk must be accepted and processed by the leader. No matter which zk node the client is connected to, if that node is not the leader it transfers the request to the leader; the leader turns each client request into a proposal and broadcasts it to the current cluster. When more than half of the follower votes are received, the operation is confirmed, and the result is broadcast to the other followers to complete the commit. A toy model of this majority rule follows.
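A toy model of the "more than half" rule just described, only to make the arithmetic concrete (this is not ZooKeeper's real implementation):

```java
public class QuorumToy {
    // True if a proposal with `acks` acknowledgements out of `ensembleSize`
    // voting servers may be committed under the majority rule. The leader's
    // own implicit ack is counted in `acks`.
    static boolean canCommit(int acks, int ensembleSize) {
        return acks > ensembleSize / 2;
    }

    public static void main(String[] args) {
        System.out.println(canCommit(2, 3)); // true: leader + 1 follower of 3
        System.out.println(canCommit(1, 3)); // false: 1 ack is no majority
        System.out.println(canCommit(3, 5)); // true: 3 of 5
    }
}
```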

There are three states in the ZAB protocol:
(1) Looking: the node has just started and is looking for the leader (the election process)
(2) Following: a leader already exists and this node follows it
(3) Leading: the state of the leader itself
A node may transition among these three states. When zk starts, all nodes are in the default Looking state; the whole cluster then tries to elect a leader; the elected node switches to Leading and the others switch to Following.
After the election, the zab protocol enters the atomic broadcast phase: the leader's data is packaged and synchronized to the follower nodes. The leader and followers use a heartbeat-detection mechanism to check whether each follower is in a normal state: each follower keeps updating the data version of its data node in the cluster, and as long as it keeps updating, the leader considers it normal. If a follower times out and stops updating its designated information, it is considered to have lost its connection. If the leader itself is lost, everyone switches back to Looking and proposes itself as the new leader.
Phases: election, discovery, synchronization, broadcast, etc. A client-side election recipe is sketched after this paragraph.
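For application-level master election (distinct from ZAB's internal leader election), a common recipe uses ephemeral sequential znodes: the candidate with the lowest sequence number is the master, and each candidate watches its predecessor. A minimal sketch, assuming a persistent /election node already exists (paths are placeholders, error handling omitted):

```java
import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ElectionDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 5000, e -> {});
        // Each candidate creates an ephemeral sequential node, e.g.
        // /election/n_0000000042; it vanishes if the candidate dies.
        String me = zk.create("/election/n_", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        List<String> children = zk.getChildren("/election", false);
        Collections.sort(children);
        // The candidate holding the smallest sequence number is the master.
        boolean master = me.endsWith(children.get(0));
        System.out.println(master ? "I am the master" : "I am a follower");
        // A non-master would now set a watch on its predecessor node and
        // re-run this check when that node disappears.
    }
}
```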
