Data model
ZooKeeper's data model closely resembles the tree structure of the Unix file system, but it has no separate concepts of directories and files. Every data node is called a ZNode, and node paths look like Unix file system paths. A node can both hold data and have child nodes created under it.
In ZooKeeper, a transaction is any operation that changes the state of the ZooKeeper server, such as creating or deleting a data node, updating node content, or creating and invalidating a client session. ZooKeeper assigns each transaction request a globally unique transaction ID, the ZXID, which is a 64-bit number; each ZXID corresponds to one update operation.
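As an illustration of what a 64-bit ZXID looks like, here is a minimal sketch. The text above only states that the ZXID is 64 bits wide; the split into a leader epoch (high 32 bits) and a per-epoch counter (low 32 bits) is how ZooKeeper lays the value out internally. The class and method names are hypothetical and use only the JDK.

```java
// Sketch only: splits a 64-bit ZXID into its two 32-bit halves.
// Class and method names are hypothetical, not part of the ZooKeeper client API.
final class ZxidSketch {

    // High 32 bits: the leader epoch in which the transaction was proposed.
    static long epochOf(long zxid) {
        return zxid >> 32L;
    }

    // Low 32 bits: a counter that increases with each transaction in the epoch.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Combine an epoch and a counter back into one ZXID.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32L) | (counter & 0xffffffffL);
    }

    public static void main(String[] args) {
        long zxid = makeZxid(5, 3);
        System.out.println(Long.toHexString(zxid)); // 500000003
        System.out.println(epochOf(zxid));          // 5
        System.out.println(counterOf(zxid));        // 3
    }
}
```

Because the epoch sits in the high bits, comparing two ZXIDs as plain numbers orders transactions first by epoch and then by counter.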
Node characteristics
Node type
Node types are: persistent (PERSISTENT), persistent sequential (PERSISTENT_SEQUENTIAL), ephemeral (EPHEMERAL), and ephemeral sequential (EPHEMERAL_SEQUENTIAL).
A persistent node, once created, stays on the ZooKeeper server until a delete operation explicitly removes it.
An ephemeral node has its life cycle bound to the client session that created it. If that session becomes invalid, the node is removed automatically.
A sequential node is one for which ZooKeeper automatically appends a numeric suffix to the given node name during creation, producing the complete node name. The upper limit of this suffix is the maximum value of a signed 32-bit integer.
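The naming of sequential nodes can be sketched as follows. This is not ZooKeeper's own code: the class, method, and example path are hypothetical, and the sketch assumes the suffix is rendered as a zero-padded 10-digit decimal, which is wide enough for the largest signed 32-bit value.

```java
import java.util.Locale;

// Sketch only: shows how a sequential node name could be derived from a prefix.
// Names and the example path are hypothetical.
final class SequentialNameSketch {

    // Append the counter as a zero-padded, 10-digit decimal suffix.
    static String sequentialName(String prefix, int counter) {
        return prefix + String.format(Locale.ENGLISH, "%010d", counter);
    }

    public static void main(String[] args) {
        System.out.println(sequentialName("/locks/lock-", 7));
        // /locks/lock-0000000007
        System.out.println(sequentialName("/locks/lock-", Integer.MAX_VALUE));
        // /locks/lock-2147483647
    }
}
```

Zero padding keeps the lexicographic order of the names identical to their creation order, which is what lock and queue recipes built on sequential nodes rely on.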
Stat status attributes
In addition to its data content, a data node stores state information about itself, encapsulated in a Stat object. The Stat attributes are described below:
State attribute | Description |
---|---|
czxid | Indicates the transaction ID when the data node was created |
mzxid | Indicates the transaction ID when the node was last updated |
pzxid | Indicates the transaction ID of the last change to the node's child list. Only adding or removing a child updates pzxid; a change to a child's data content does not |
ctime | Indicates the time when the node was created |
mtime | Indicates the time when the node was last updated |
version | The version number of the data node |
cversion | The version number of the node's child list |
aversion | ACL version number of the node |
ephemeralOwner | The sessionID of the session that created the ephemeral node. If the node is a persistent node, this attribute value is 0 |
dataLength | The length of the data content |
numChildren | The number of child nodes of the current node |
version
A version here is the number of times the node's data content (version), child list (cversion), or ACL information (aversion) has been modified. When a data node is created, its version is 0, meaning the data has been updated 0 times since creation. After the first update to the data content, version becomes 1.
Versions guard against concurrency problems in distributed updates, for example in distributed lock services, by making an update atomic in the manner of an optimistic lock. When client A performs an update, it sends the version it last read. If another client has updated the node on the ZooKeeper server in the meantime, the node's version has changed and no longer matches the version client A carries, so A's update is rejected.
If an update does not need this atomicity guarantee, the client can pass -1 as the version, which tells the server to apply the update to whatever the latest version is.
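The version check can be sketched with a small in-memory stand-in for a data node. This is not the ZooKeeper client API: the class is hypothetical, and it throws IllegalStateException where a real ZooKeeper server would report a bad-version error to the client.

```java
// Sketch only: an in-memory node that mimics version-checked updates.
// VersionedNodeSketch is hypothetical and talks to no ZooKeeper server.
final class VersionedNodeSketch {
    private byte[] data = new byte[0];
    private int version = 0; // 0 right after creation

    synchronized int getVersion() {
        return version;
    }

    synchronized byte[] getData() {
        return data.clone();
    }

    // Apply the update only if expectedVersion matches, or if it is -1.
    // Returns the new version number.
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new IllegalStateException(
                "bad version: expected " + expectedVersion + ", actual " + version);
        }
        data = newData.clone();
        version++;
        return version;
    }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch();
        int seenByA = node.getVersion();             // client A reads version 0
        node.setData("from B".getBytes(), seenByA);  // client B updates first
        try {
            node.setData("from A".getBytes(), seenByA); // A still carries version 0
        } catch (IllegalStateException e) {
            System.out.println("A rejected: " + e.getMessage());
        }
        System.out.println(node.setData("forced".getBytes(), -1)); // 2
    }
}
```

A client whose update is rejected would normally re-read the node, take the new version, and retry.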
Watcher
ZooKeeper provides the Watcher mechanism to implement publish/subscribe for distributed data. A client registers a Watcher with the server; when a specified event on the server triggers that Watcher, the server sends an event notification to the client that registered it.
WatchedEvent
In ZooKeeper, the Watcher interface represents a standard event handler. It defines the event notification logic: two enumerations, KeeperState and EventType, which represent the notification state and the event type, and the callback method process(WatchedEvent event).
KeeperState | EventType | Triggering conditions |
---|---|---|
SyncConnected | None | Client and server successfully establish a session |
SyncConnected | NodeCreated | The corresponding data node monitored by Watcher is created |
SyncConnected | NodeDeleted | The corresponding data node monitored by Watcher is deleted |
SyncConnected | NodeDataChanged | The data content of the corresponding data node monitored by Watcher changes |
SyncConnected | NodeChildrenChanged | The child node list of the corresponding data node monitored by Watcher changes |
Disconnected | None | The client disconnects from the ZooKeeper server |
Expired | None | Session timeout |
AuthFailed | None | Use wrong scheme for permission check or SASL permission check failed |
ConnectedReadOnly | None | Read only mode |
```java
package org.apache.zookeeper;

import org.apache.yetus.audience.InterfaceAudience;
import org.apache.zookeeper.proto.WatcherEvent;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.Watcher.Event.KeeperState;

/**
 * A WatchedEvent represents a change on the ZooKeeper that a Watcher
 * is able to respond to. The WatchedEvent includes exactly what happened,
 * the current state of the ZooKeeper, and the path of the znode that
 * was involved in the event.
 */
@InterfaceAudience.Public
public class WatchedEvent {
    final private KeeperState keeperState;
    final private EventType eventType;
    private String path;

    /**
     * Create a WatchedEvent with specified type, state and path
     */
    public WatchedEvent(EventType eventType, KeeperState keeperState, String path) {
        this.keeperState = keeperState;
        this.eventType = eventType;
        this.path = path;
    }

    /**
     * Convert a WatcherEvent sent over the wire into a full-fledged WatcherEvent
     */
    public WatchedEvent(WatcherEvent eventMessage) {
        keeperState = KeeperState.fromInt(eventMessage.getState());
        eventType = EventType.fromInt(eventMessage.getType());
        path = eventMessage.getPath();
    }

    public KeeperState getState() {
        return keeperState;
    }

    public EventType getType() {
        return eventType;
    }

    public String getPath() {
        return path;
    }

    @Override
    public String toString() {
        return "WatchedEvent state:" + keeperState
            + " type:" + eventType + " path:" + path;
    }

    /**
     * Convert WatchedEvent to type that can be sent over network
     */
    public WatcherEvent getWrapper() {
        return new WatcherEvent(eventType.getIntValue(),
                                keeperState.getIntValue(),
                                path);
    }
}
```
WatchedEvent encapsulates three basic attributes of each server event: notification state (KeeperState), event type (EventType), and node path (path). After the server creates a WatchedEvent, it calls getWrapper to wrap it into a serializable WatcherEvent and sends that to the client over the network. The client converts the received WatcherEvent back into a WatchedEvent and passes it to the process method. WatchedEvent is therefore the logical event object used inside the server and client programs, while WatcherEvent implements the serialization interface needed for network transmission.
Both WatchedEvent and WatcherEvent are thin wrappers around the server event. The client cannot obtain the node's old or new data content from the event itself and must fetch the data actively. ZooKeeper thus combines push and pull: the client registers a watch, the server pushes a notification, and the client pulls the data.
Watcher features
The working mechanism of Watcher can be summarized as: client registers Watcher, server processes Watcher, and client calls back Watcher.
When the client registers a Watcher, it does not send the Watcher object to the server; it only sets a boolean flag on the request. The server, in turn, stores only the ServerCnxn object for the current connection (ServerCnxn represents the connection between a client and the server). This keeps network transmission overhead low.
One-time: on both the server and the client, once a Watcher is triggered, ZooKeeper removes it from the corresponding storage.
The client invokes Watcher callbacks serially and synchronously, which guarantees their execution order.
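The one-time behavior and the push-pull pattern described above can be sketched with an in-memory store. Everything here is hypothetical and simplified: a single process, no sessions, and a java.util.function.Consumer standing in for the Watcher callback. It is not how the ZooKeeper server is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Sketch only: an in-memory store with one-shot watches.
// OneShotWatchSketch is hypothetical and not part of ZooKeeper.
final class OneShotWatchSketch {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Read the data and optionally leave a one-time watch on the path.
    String getData(String path, Consumer<String> watcher) {
        if (watcher != null) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
        }
        return data.get(path);
    }

    // Update the data, then fire and discard every watch on the path.
    void setData(String path, String value) {
        data.put(path, value);
        List<Consumer<String>> fired = watches.remove(path); // removed before firing
        if (fired != null) {
            for (Consumer<String> watcher : fired) {
                watcher.accept(path); // the notification carries the path, not the data
            }
        }
    }

    public static void main(String[] args) {
        OneShotWatchSketch store = new OneShotWatchSketch();
        List<String> notified = new ArrayList<>();
        store.getData("/config", notified::add);
        store.setData("/config", "v1"); // fires the watch
        store.setData("/config", "v2"); // no watch left, nothing fires
        System.out.println(notified);                       // [/config]
        System.out.println(store.getData("/config", null)); // v2
    }
}
```

A client that wants continuous notifications has to register a new watch each time it pulls the data, typically inside the callback itself.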
ACL permission control
The Unix/Linux file system uses the UGO (User, Group, Others) permission control mechanism: a file or directory carries separate permissions for its creator (User), the creator's group (Group), and all other users (Others). UGO is a coarse-grained model. It cannot express the following scenario: user U1 creates a file F1 and wants U1's own group G1 to have read, write, and execute permissions on F1, another group G2 to have read permission only, and another user U3 to have no permissions at all.
An access control list (ACL) can apply fine-grained permission control to any user or group. A valid ACL entry is usually written as "scheme:id:permission", combining a permission mode (Scheme), an authorization object (ID), and a permission (Permission).
The permission mode determines the verification strategy used during the permission check. The following four modes are commonly used:
Permission mode | Description |
---|---|
IP | Permission control by IP address. For example, "ip:192.168.0.1" applies the permission control to that IP address. The IP mode also supports network segments: "ip:192.168.0.1/24" applies the permission control to the segment 192.168.0.*. |
Digest | The most commonly used permission mode. It uses an identifier of the form "username:password", which makes it easy to separate permissions for different applications. ZooKeeper encodes the identifier in two steps: it hashes it with SHA-1 and then Base64-encodes the hash. |
World | The most open permission mode. The data node is accessible to all users, who can operate on its data without any permission check. World can be regarded as a special Digest mode with a single identifier, "world:anyone". |
Super | Also a special Digest mode. In Super mode, the super user can perform any operation on any data node. |
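The network segment matching described for the IP mode can be sketched as a prefix comparison. This is an illustration under stated assumptions, not ZooKeeper's own matching code: it handles IPv4 only, and the class and method names are hypothetical.

```java
// Sketch only: IPv4 prefix matching for IDs such as "192.168.0.1/24".
// IpSchemeSketch is hypothetical and not part of ZooKeeper.
final class IpSchemeSketch {

    // Pack a dotted-quad IPv4 address into one 32-bit int.
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ipv4);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True if clientIp falls inside the address or segment named by aclId.
    static boolean matches(String aclId, String clientIp) {
        String[] idParts = aclId.split("/");
        int bits = idParts.length == 2 ? Integer.parseInt(idParts[1]) : 32;
        // Java masks shift counts to 5 bits, so a 0-bit prefix needs its own case.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(idParts[0]) & mask) == (toInt(clientIp) & mask);
    }

    public static void main(String[] args) {
        System.out.println(matches("192.168.0.1/24", "192.168.0.77")); // true
        System.out.println(matches("192.168.0.1/24", "192.168.1.77")); // false
        System.out.println(matches("192.168.0.1", "192.168.0.2"));     // false
    }
}
```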
The authorization object is the user or entity to which the permission is granted, for example an IP address or a machine. Its form depends on the permission mode:
Permission mode | Authorization object (ID) |
---|---|
IP | IP address or IP network segment |
Digest | Custom, usually "username:BASE64(SHA-1(username:password))" |
World | Only one ID: anyone |
Super | Same as Digest mode |
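The Digest ID format "username:BASE64(SHA-1(username:password))" from the table can be reproduced with the JDK alone. The sketch below is a stand-in for illustration, not ZooKeeper's own helper; the class name is hypothetical, and UTF-8 is an assumption about how the string is turned into bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Sketch only: builds a Digest-mode ID from a username and password.
// DigestIdSketch is hypothetical; UTF-8 encoding is an assumption.
final class DigestIdSketch {

    // Returns username:BASE64(SHA-1(username:password)).
    static String digestId(String username, String password) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest(
                (username + ":" + password).getBytes(StandardCharsets.UTF_8));
            return username + ":" + Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        // Prints "abc:" followed by a 28-character Base64 string.
        System.out.println(digestId("abc", "123456"));
    }
}
```

Because SHA-1 yields 20 bytes, the Base64 part is always 28 characters long and ends in a single "=" padding character.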
Permission refers to the operations a client is allowed to perform once the permission check passes:
Permission | Description |
---|---|
CREATE(c) | Data node creation permission, allowing authorized objects to create child nodes under the data node |
DELETE(d) | Child node deletion permission, allowing authorized objects to delete child nodes of the data node |
READ(r) | Read permission, allowing authorized objects to access the data node and read its data content or child node list |
WRITE(w) | Update permission, allowing authorized objects to update the data content of the data node |
ADMIN(a) | The management authority of the data node, allowing authorized objects to perform related ACL setting operations on the data node |
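A permission string such as "cdrwa" maps naturally onto a bitmask. The sketch below assumes the bit values ZooKeeper uses for its permission constants (READ 1, WRITE 2, CREATE 4, DELETE 8, ADMIN 16); the class and the parsing helper are hypothetical.

```java
// Sketch only: turns permission letters into a bitmask and checks them.
// PermsSketch and its methods are hypothetical helpers.
final class PermsSketch {
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;

    // Parse letters such as "cdrwa" into the combined bitmask.
    static int parse(String letters) {
        int perms = 0;
        for (char c : letters.toCharArray()) {
            switch (c) {
                case 'r': perms |= READ; break;
                case 'w': perms |= WRITE; break;
                case 'c': perms |= CREATE; break;
                case 'd': perms |= DELETE; break;
                case 'a': perms |= ADMIN; break;
                default:
                    throw new IllegalArgumentException("unknown permission letter: " + c);
            }
        }
        return perms;
    }

    // True if every bit in required is present in granted.
    static boolean allows(int granted, int required) {
        return (granted & required) == required;
    }

    public static void main(String[] args) {
        System.out.println(parse("cdrwa"));                  // 31
        System.out.println(allows(parse("rw"), READ));       // true
        System.out.println(allows(parse("rw"), ADMIN));      // false
    }
}
```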
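Set ACL method 1, when creating the node: create [-s] [-e] path data acl
create -e /temp nothing digest:abc:123456:cdrwa
Set ACL method 2, on an existing node: setAcl path acl
setAcl /temp digest:abc:123456:cdrwa
In both commands the ID after digest: is stored as given. For the ACL to match a client that later authenticates with addauth digest abc:123456, the 123456 in the ID should be replaced by BASE64(SHA-1(abc:123456)), as listed in the authorization object table above.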