[Series] Preparing for spring recruitment: 28 soul-searching ZooKeeper questions to help you win over the interviewer!

1. What is ZooKeeper?

ZooKeeper is an open-source distributed coordination service. It acts as a cluster manager: it monitors the state of each node in the cluster and decides on the next reasonable operation based on the feedback the nodes submit. In the end it offers users a simple, easy-to-use interface on top of a system that is efficient and functionally stable.
Distributed applications can build on ZooKeeper to implement data publish/subscribe, load balancing, naming services, distributed coordination/notification, cluster management, Master election, distributed locks, and distributed queues.

ZooKeeper guarantees the following distributed consistency properties:

  1. Sequential consistency
  2. Atomicity
  3. Single system image
  4. Reliability
  5. Timeliness (eventual consistency)
Client read requests can be handled by any machine in the cluster. If a read request registers a listener on a node, that listener is also handled by the ZooKeeper machine the client is connected to. Write requests are sent to the other ZooKeeper machines as well, and a request returns success only after they reach agreement. Therefore, as the number of machines in a ZooKeeper cluster grows, read throughput rises but write throughput falls.
Ordering is a very important ZooKeeper property: all updates are globally ordered, and each update has a unique timestamp called the zxid (ZooKeeper Transaction Id). Read requests are only ordered relative to updates, that is, the result returned for a read request carries the latest zxid of that ZooKeeper server.

2. What does ZooKeeper provide?

  1. File system
  2. Notification mechanism

3. Zookeeper file system

ZooKeeper provides a multi-level node namespace (a node is called a znode). Unlike a file system, every node can have data associated with it, whereas in a file system only file nodes can store data and directory nodes cannot.
To guarantee high throughput and low latency, ZooKeeper keeps this tree-shaped directory structure in memory. Because of this, ZooKeeper cannot be used to store large amounts of data: the data stored on each node is limited to 1 MB.

4. What is the ZAB protocol?

ZAB is an atomic broadcast protocol with crash-recovery support, designed specifically for the ZooKeeper distributed coordination service.
The ZAB protocol has two basic modes: crash recovery and message broadcast.
When the whole ZooKeeper cluster has just started, or when the Leader server crashes or restarts, or when a network failure leaves no majority of servers able to communicate normally with the Leader, all processes (servers) enter crash recovery mode. A new Leader server is elected first, and then the Follower servers in the cluster start synchronizing data with the new Leader. Once more than half of the machines in the cluster have finished synchronizing with the Leader, the cluster leaves recovery mode and enters message broadcast mode: the Leader server starts accepting client transaction requests and generates transaction proposals to process them.

5. The four types of znode

1, PERSISTENT: persistent node
Unless it is deleted manually, the node stays on ZooKeeper permanently.
2, EPHEMERAL: ephemeral node
The lifecycle of an ephemeral node is bound to the client session. Once the client session expires (a client disconnecting from ZooKeeper does not necessarily mean its session has expired), all ephemeral nodes created by that client are removed.
3, PERSISTENT_SEQUENTIAL: persistent sequential node
Same basic characteristics as a persistent node, plus a sequence attribute: the node name is suffixed with an auto-incrementing integer maintained by the parent node.
4, EPHEMERAL_SEQUENTIAL: ephemeral sequential node
Same basic characteristics as an ephemeral node, plus a sequence attribute: the node name is suffixed with an auto-incrementing integer maintained by the parent node.
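
To make the sequence attribute concrete, here is a minimal in-memory sketch in Python of how a parent node could hand out sequential names. It is not ZooKeeper client code: the class `ParentNodeSketch` and its method are hypothetical names, and the 10-digit zero-padded suffix is the format ZooKeeper uses for sequential znodes.

```python
class ParentNodeSketch:
    """Hypothetical stand-in for a parent znode that owns a sequence counter."""

    def __init__(self, path):
        self.path = path
        self._next_sequence = 0  # counter maintained by the parent
        self.children = []

    def create_sequential(self, prefix):
        # Append the parent's counter, zero-padded to 10 digits, to the requested name.
        name = f"{prefix}{self._next_sequence:010d}"
        self._next_sequence += 1
        self.children.append(name)
        return f"{self.path}/{name}"


parent = ParentNodeSketch("/tasks")
first = parent.create_sequential("job-")
second = parent.create_sequential("job-")
```

Because the suffix is zero-padded, sorting the child names as plain strings also sorts them by creation order.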

6. Zookeeper Watcher mechanism - data change notification 

ZooKeeper lets a client register a Watcher with the server on a given znode. When a specified event on the server triggers that Watcher, the server sends an event notification to the corresponding client, which is how distributed notification is implemented. The client then reacts in its business logic according to the notification state and event type carried by the Watcher notification.

Working Mechanism: 

  1. The client registers the watcher
  2. The server processes the watcher
  3. The client calls back the watcher

Summary of Watcher features:

1, One-time

On both the server and the client, once a Watcher has been triggered, ZooKeeper removes it from the corresponding storage. This design effectively reduces pressure on the server. Otherwise, for a very frequently updated node, the server would keep sending event notifications to the client, putting heavy pressure on both the network and the server.

2, Serial execution on the client

The client's Watcher callbacks are executed serially and synchronously.

3, Lightweight

3.1, A Watcher notification is very simple: it only tells the client that an event has occurred, without the specific details of the event.
3.2, When the client registers a Watcher with the server, it does not pass the actual Watcher object to the server; it only marks the client request with a boolean attribute.

4, Watcher event notifications are sent asynchronously. Watcher events are sent from the server to the client asynchronously, which raises a problem: different clients communicate with the server over sockets, so network latency or other factors can cause clients to receive an event at different times. ZooKeeper itself provides an ordering guarantee: a client perceives a change to the znode it watches only after it has received the watch event. So we cannot expect to observe every single change to a node with ZooKeeper. ZooKeeper only guarantees eventual consistency, not strong consistency.

5, Operations that register a watcher: getData, exists, getChildren

6, Operations that trigger a watcher: create, delete, setData

7, When a client connects to a new server, the watch is triggered by any session event. While the connection to a server is lost, watches cannot be received. When the client reconnects, any previously registered watches are re-registered if needed. Normally this is completely transparent. A watch can be lost in only one special case: for an exists watch on a znode that has not yet been created, if the znode is created and then deleted while the client is disconnected, the watch event may be lost.

7. How the client registers a Watcher

  1. Call one of the three APIs getData() / getChildren() / exists() and pass in a Watcher object
  2. Mark the request and wrap the Watcher in a WatchRegistration
  3. Wrap it into a Packet object and send the request to the server
  4. After the server's response is received, register the Watcher with ZKWatchManager for management
  5. The request returns and registration is complete.

8. How the server processes a Watcher

1, The server receives and stores the Watcher

When the server receives a client request, it checks while processing the request whether a Watcher needs to be registered. If so, it stores the data node's path and the ServerCnxn in the WatchManager's WatchTable and Watch2Paths. (A ServerCnxn represents one connection between a client and the server and implements the Watcher's process interface, so here it can be treated as a Watcher object.)

2, Triggering the Watcher

Take the server receiving a setData() transaction request, which triggers a NodeDataChanged event, as an example:
2.1 Wrap a WatchedEvent
The notification state (SyncConnected), the event type (NodeDataChanged), and the node path are wrapped into a WatchedEvent object.
2.2 Query the Watcher
Look up the Watcher in WatchTable by node path.
2.3 Not found: no client has registered a Watcher on this data node.
2.4 Found: extract the Watcher and delete it from WatchTable and Watch2Paths (this shows that a Watcher is one-time on the server side: once it has fired, it is no longer valid).

3, Call the process method to trigger the Watcher

This step mainly sends the Watcher event notification over the TCP connection that the ServerCnxn corresponds to.
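
The store, look up, and delete steps can be illustrated with a small in-memory model in Python. This is only a sketch of the idea, not ZooKeeper's server code: the class `WatchManagerSketch`, its method names, and the use of plain strings as connection ids are all assumptions made for illustration.

```python
from collections import defaultdict


class WatchManagerSketch:
    """Hypothetical model of the two lookup tables described above."""

    def __init__(self):
        self.watch_table = defaultdict(set)  # path -> watchers on that path
        self.watch2paths = defaultdict(set)  # watcher -> paths it watches

    def add_watch(self, path, watcher):
        self.watch_table[path].add(watcher)
        self.watch2paths[watcher].add(path)

    def trigger_watch(self, path, event_type):
        # Removing the entry while reading it is what makes the watch one-time.
        watchers = self.watch_table.pop(path, set())
        for watcher in watchers:
            self.watch2paths[watcher].discard(path)
        # Each tuple stands in for a WatchedEvent delivered to one connection.
        return [(watcher, event_type, path) for watcher in sorted(watchers)]


manager = WatchManagerSketch()
manager.add_watch("/config", "conn-1")
first_fire = manager.trigger_watch("/config", "NodeDataChanged")
second_fire = manager.trigger_watch("/config", "NodeDataChanged")
```

The second call returns an empty list because the first call already removed the watch; a client that wants further notifications has to register again.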

9. How the client calls back the Watcher

The client's SendThread thread receives the event notification and hands it to the EventThread thread, which calls back the Watcher.
The client-side Watcher mechanism is also one-time: once triggered, the Watcher is no longer valid.

10. ACL access control mechanisms 

UGO (User/Group/Others)
Currently used in Linux/Unix file systems, and also the most widely used access control scheme. It is a coarse-grained file system access control model.
ACL (Access Control List)
It includes three aspects:

Permission scheme (Scheme)
1, IP: controls permissions at the granularity of IP addresses
2, Digest: the most commonly used. It configures permissions with a permission identifier of the form username:password, which makes it easy to distinguish different applications for access control
3, World: the most open access control scheme. It is a special digest scheme with only one permission identifier, "world:anyone"
4, Super: the superuser

Authorization object
The authorization object is the user or designated entity to which permissions are granted, for example an IP address or a machine.

Permission

  1. CREATE: permission to create child nodes; allows the authorization object to create child nodes under this znode
  2. DELETE: permission to delete child nodes; allows the authorization object to delete this data node's children
  3. READ: read permission on the data node; allows the authorization object to access the data node and read its data content or its list of child nodes
  4. WRITE: update permission on the data node; allows the authorization object to update the data node
  5. ADMIN: administration permission on the data node; allows the authorization object to perform ACL-related settings on the data node
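
The five permissions combine as bit flags. The Python sketch below mirrors the numeric values ZooKeeper defines in `ZooDefs.Perms` (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16); the `Perm` enum and the `allows` helper are hypothetical names used only for illustration.

```python
from enum import IntFlag


class Perm(IntFlag):
    """Permission bits, using the same values as ZooDefs.Perms."""
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16


def allows(granted, needed):
    # An operation is allowed only if every needed bit is present in the grant.
    return (granted & needed) == needed


read_write = Perm.READ | Perm.WRITE
all_perms = Perm.READ | Perm.WRITE | Perm.CREATE | Perm.DELETE | Perm.ADMIN
```

Here `allows(read_write, Perm.WRITE)` is true while `allows(read_write, Perm.CREATE)` is false, so a client with only read and write access cannot create child nodes.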

11. The Chroot feature

Version 3.2.0 added the Chroot feature, which lets each client set a namespace for itself. If a client sets a Chroot, every operation it performs on the server is confined to its own namespace.
By setting a Chroot, a client can be mapped to a subtree of the ZooKeeper server. In scenarios where multiple applications share one ZooKeeper cluster, this is very helpful for isolating the applications from one another.
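
The namespace mapping amounts to prefixing every client path with the Chroot. The Python function below is a hypothetical helper that sketches this translation; the paths `/app1` and `/config` are example values, not names from the article.

```python
def to_server_path(chroot, client_path):
    """Map a path as seen by the client to the real path on the server."""
    if client_path == "/":
        # The client's root is the chroot node itself.
        return chroot
    return chroot + client_path


config_path = to_server_path("/app1", "/config")
root_path = to_server_path("/app1", "/")
```

A client with Chroot `/app1` that reads `/config` therefore touches `/app1/config` on the server and cannot see another application's subtree.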

12. Session Management

Bucketing strategy: similar sessions are managed in the same bucket, so that ZooKeeper can isolate sessions in different buckets and process the sessions in one bucket together.
Allocation principle: the "next timeout checkpoint" (ExpirationTime) of each session.
The formula is:

ExpirationTime_ = currentTime + sessionTimeout
ExpirationTime = (ExpirationTime_ / ExpirationInterval + 1) * ExpirationInterval

ExpirationInterval is the interval at which ZooKeeper checks for session timeouts; it defaults to tickTime.
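
The bucketing formula can be checked with a few lines of Python. This is a direct transcription under the assumption that the division is integer division and that all values are in milliseconds; the function name and the sample numbers are hypothetical.

```python
def expiration_bucket(current_time_ms, session_timeout_ms, expiration_interval_ms):
    """Round a session's raw expiry up to the next checkpoint boundary."""
    raw_expiration = current_time_ms + session_timeout_ms
    # Integer division drops the remainder; adding 1 moves to the next boundary.
    return (raw_expiration // expiration_interval_ms + 1) * expiration_interval_ms


bucket_a = expiration_bucket(1000, 5000, 2000)  # raw expiry 6000
bucket_b = expiration_bucket(1500, 5000, 2000)  # raw expiry 6500
```

Both sessions land in the bucket at 8000, so a single timeout check can handle them together even though their raw expiry times differ.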

13. Server Role

Leader

1, The sole scheduler and processor of transaction requests; guarantees the order of transaction processing in the cluster
2, The scheduler of the servers within the cluster

Follower

1, Handles clients' non-transactional requests and forwards transactional requests to the Leader server
2, Takes part in voting on transaction request Proposals
3, Takes part in Leader election voting

Observer

  1. A server role introduced after version 3.0; it improves the cluster's non-transactional processing capacity without affecting its transaction processing capacity
  2. Handles clients' non-transactional requests and forwards transactional requests to the Leader server
  3. Does not take part in any form of voting

14. Server working states in ZooKeeper

A server has four states: LOOKING, FOLLOWING, LEADING, and OBSERVING.
  1. LOOKING: looking for a Leader. A server in this state believes the cluster currently has no Leader, so it needs to enter Leader election.
  2. FOLLOWING: follower state. The current server's role is Follower.
  3. LEADING: leader state. The current server's role is Leader.
  4. OBSERVING: observer state. The current server's role is Observer.

15. Data Synchronization

After the cluster finishes Leader election, the Learners (the collective term for Followers and Observers) register with the Leader server. Once a Learner server has completed registration with the Leader server, it enters the data synchronization phase.
The data synchronization flow (all carried out by message passing):
Learner registers with the Leader
Data synchronization
Synchronization confirmation
ZooKeeper data synchronization usually falls into four categories:
  1. Direct differential synchronization (DIFF synchronization)
  2. Rollback first, then differential synchronization (TRUNC + DIFF synchronization)
  3. Rollback-only synchronization (TRUNC synchronization)
  4. Full synchronization (SNAP synchronization)
Before synchronizing data, the Leader server initializes the synchronization:
peerLastZxid:
  • The lastZxid extracted from the ACKEPOCH message sent when the Learner server registers (the last ZXID processed by that Learner server)
minCommittedLog:
  • The smallest ZXID in the Leader server's Proposal cache queue committedLog
maxCommittedLog:
  • The largest ZXID in the Leader server's Proposal cache queue committedLog
Direct differential synchronization (DIFF synchronization)
  • Scenario: peerLastZxid is between minCommittedLog and maxCommittedLog
Rollback first, then differential synchronization (TRUNC + DIFF synchronization)
  • Scenario: the new Leader server finds that the Learner server contains a transaction record that the Leader itself does not have. The Learner server must then roll back the transaction, to the ZXID that exists on the Leader server and is closest to peerLastZxid
Rollback-only synchronization (TRUNC synchronization)
  • Scenario: peerLastZxid is greater than maxCommittedLog
Full synchronization (SNAP synchronization)
  • Scenario 1: peerLastZxid is less than minCommittedLog
  • Scenario 2: the Leader server has no Proposal cache queue and peerLastZxid is not equal to lastProcessZxid
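
The scenarios above can be read as a decision function. The Python sketch below is a simplification built only from those rules, not the Leader's actual code: the function name and the `leader_has_peer_txn` flag (standing in for "the Leader also has the Learner's last transaction") are assumptions, and Scenario 2 of SNAP (no cache queue at all) is left out.

```python
def choose_sync_mode(peer_last_zxid, min_committed_log, max_committed_log,
                     leader_has_peer_txn=True):
    """Pick a synchronization mode from the Learner's last zxid (simplified)."""
    if peer_last_zxid > max_committed_log:
        return "TRUNC"        # Learner is ahead of the Leader: roll back only
    if peer_last_zxid < min_committed_log:
        return "SNAP"         # Learner is too far behind the cache: full copy
    if not leader_has_peer_txn:
        return "TRUNC+DIFF"   # in range, but holds a record the Leader lacks
    return "DIFF"             # in range: send only the missing proposals


modes = [
    choose_sync_mode(5, 3, 10),
    choose_sync_mode(12, 3, 10),
    choose_sync_mode(1, 3, 10),
    choose_sync_mode(5, 3, 10, leader_has_peer_txn=False),
]
```

With a cache queue covering zxids 3 to 10, the four calls yield DIFF, TRUNC, SNAP, and TRUNC+DIFF respectively.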

16. How does ZooKeeper guarantee that transactions are ordered consistently?

ZooKeeper identifies transactions with a globally increasing transaction id. Every proposal is stamped with a zxid when it is proposed. The zxid is actually a 64-bit number: the high 32 bits are the epoch, which identifies the leader's term, and whenever a new leader is elected the epoch is incremented; the low 32 bits are an incrementing counter. When a new proposal is generated, it goes through a two-phase process similar to a database's two-phase commit: the transaction request is first sent to the other servers, and if more than half of the machines can execute it successfully, the commit begins.
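
The 64-bit layout is easy to see with bit operations. The following Python sketch packs and unpacks a zxid under the layout described above (epoch in the high 32 bits, counter in the low 32 bits); the helper names are hypothetical.

```python
COUNTER_MASK = 0xFFFFFFFF  # low 32 bits


def make_zxid(epoch, counter):
    """Pack an epoch and a counter into one 64-bit transaction id."""
    return (epoch << 32) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a packed zxid."""
    return zxid >> 32, zxid & COUNTER_MASK


zxid = make_zxid(2, 5)
epoch, counter = split_zxid(zxid)
```

Because the epoch occupies the high bits, any zxid from a newer leader compares greater than every zxid from an older leader, regardless of the counter values.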

17. Why does a distributed cluster need a Master?

In a distributed environment, some business logic only needs to be executed by one machine in the cluster, and the other machines can share the result. This greatly reduces duplicated computation and improves performance, which is why leader election is needed.

18. How is a ZooKeeper node failure handled?

ZooKeeper itself is a cluster, and a configuration of at least three servers is recommended. ZooKeeper must also ensure that when one node goes down, the other nodes continue to provide service.
If a Follower goes down, there are still two servers providing access. Because the data on ZooKeeper has multiple replicas, no data is lost.
If the Leader goes down, ZooKeeper elects a new Leader.
The ZooKeeper cluster's rule is that as long as more than half of the nodes are healthy, the cluster can provide service normally. The cluster fails only when so many nodes are down that half or fewer of them can still work. So:
A 3-node cluster can lose one node (the Leader can still get 2 votes > 1.5)
A 2-node cluster cannot lose any node (the Leader can get only 1 vote <= 1)
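
The majority arithmetic can be written out in Python as a quick check. The two function names are hypothetical; the rule itself (strictly more than half of the nodes must be alive) is the one stated above.

```python
def has_quorum(alive_nodes, total_nodes):
    """The cluster can serve only if strictly more than half the nodes are alive."""
    return alive_nodes > total_nodes / 2


def tolerated_failures(total_nodes):
    """Largest number of nodes that can fail while a majority survives."""
    return (total_nodes - 1) // 2


three_node_ok = has_quorum(2, 3)  # 2 > 1.5
two_node_ok = has_quorum(1, 2)    # 1 > 1 is false
```

A 4-node cluster tolerates only one failure, the same as a 3-node cluster, which is one reason odd cluster sizes are usually chosen.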

19. The difference between ZooKeeper load balancing and nginx load balancing

ZooKeeper load balancing is fully controllable, whereas nginx can only adjust weights, and anything else you want to control requires writing your own plug-in. However, nginx's throughput is much higher than ZooKeeper's, so which one to use should depend on the business.

20. What deployment modes does ZooKeeper have?

Deployment modes: standalone mode, pseudo-cluster mode, and cluster mode.

21. What is the minimum number of machines in a cluster, and what is the cluster rule?

The cluster rule is 2N + 1 machines with N > 0, i.e. at least three.

22. Does the cluster support adding machines dynamically?

This is really horizontal scaling, an area where ZooKeeper is not very strong. There are two ways:
Restart all: stop all ZooKeeper services, change the configuration, then start them again. This does not affect existing client sessions.
Restart one by one: under the rule that more than half of the machines must be alive, restarting one machine does not stop the cluster from providing service. This is the more common approach.
Dynamic scaling is supported from version 3.5 onward.

23. Is a ZooKeeper watch on a node permanent? Why not?

No. The official statement: a watch event is a one-time trigger. When the data on which the watch was set changes, the server sends the change to the clients that set the watch in order to notify them.
Why is it not permanent? For example, if the server changes frequently and many clients are watching, every change would have to be notified to all clients, putting heavy pressure on the network and the servers.
Typically a client executes getData("/node A", true). If node A is changed or deleted, the client gets its watch event. But if node A changes again afterwards and the client has not set a new watch, nothing more is sent to the client.
In practice, in many cases our clients do not need to know about every change on the server; getting the latest data is enough.

24. What Java clients does ZooKeeper have?

Java clients: the zkclient that comes with zk, and Apache's open-source Curator.

25. What is Chubby, and how does it compare with ZooKeeper?

Chubby is from Google. It fully implements the Paxos algorithm and is not open source. ZooKeeper is an open-source implementation of Chubby and uses the ZAB protocol, a variant of the Paxos algorithm.

26. Name a few commonly used ZooKeeper commands.

Commonly used commands: ls, get, set, create, delete, and so on.

27. What are the similarities and differences between ZAB and Paxos?

Similarities:

1, Both have a role similar to a Leader process, which coordinates the running of multiple Follower processes
2, The Leader process waits for correct feedback from more than half of the Followers before committing a proposal
3, In the ZAB protocol every Proposal contains an epoch value representing the current Leader's term; in Paxos the corresponding name is Ballot
Difference:
ZAB is used to build a highly available distributed primary-backup data system (ZooKeeper), while Paxos is used to build a distributed consistent state machine system.

28. Zookeeper typical application scenarios

ZooKeeper is a typical distributed data management and coordination framework based on the publish/subscribe model; developers can use it to publish and subscribe to distributed data.
By combining its rich set of data node types with the Watcher event notification mechanism, it is very convenient to build a series of core functions that distributed applications involve, such as:
  1. Data Publish / Subscribe
  2. Load Balancing
  3. Naming Service
  4. Distributed Coordination / Notification
  5. Cluster Management
  6. Master election
  7. Distributed Lock
  8. Distributed Queue 

1. Data Publish / Subscribe

Introduction
A data publish/subscribe system, the so-called configuration center, is as the name suggests a system where publishers publish data for subscribers to subscribe to.

Purpose

Dynamic retrieval of data (configuration information)
Centralized management of data (configuration information) and dynamic updates to it
Design patterns
Push mode
Pull mode
Characteristics of the data (configuration information)
  1. The amount of data is usually small
  2. The data content is updated dynamically at runtime
  3. All machines in the cluster share the same configuration
E.g. machine list information, runtime switch configuration, database configuration information
Implementation based on ZooKeeper
  • Data storage: the data (configuration information) is stored on a ZooKeeper data node
  • Data retrieval: at startup and initialization, the application reads the data from the ZooKeeper data node and registers a data-change Watcher on that node
  • Data change: when the data changes, the corresponding ZooKeeper node is updated. ZooKeeper sends a data-change notification to every client, and each client re-reads the changed data after receiving the notification.

2. Load Balancing

zk naming service
A naming service obtains the address of a resource or service by a specified name. With zk you create a global path, and this path can serve as a name that points to a cluster in the cluster, the address of a provided service, a remote object, and so on.
Distributed notification and coordination
For a scheduling system: an operator sends a notification through the console, which actually changes the state of a node, and zk then sends the change to all clients that registered a watcher on that node. For progress reporting: each worker process creates an ephemeral node under a given directory and carries its work progress data in it, so the aggregating process can watch for changes to the directory's child nodes and obtain a real-time global view of work progress.
zk naming service (file system)
A naming service obtains the address of a resource or service by a specified name. With zk you create a global path, which is a unique path, and this path can serve as a name that points to a cluster in the cluster, the address of a provided service, a remote object, and so on.
zk configuration management (file system, notification mechanism)
Programs are deployed in a distributed fashion on different machines, and their configuration is placed under a znode in zk. When the configuration changes, that is, when the znode changes, you change the content of the corresponding directory node in zk and use a watcher to notify every client so that it updates its configuration.
ZooKeeper cluster management (file system, notification mechanism)
Cluster management cares about just two things: whether machines leave or join, and electing a master.
For the first, all machines agree to create an ephemeral directory node under a parent directory and then watch for child-node change messages on that parent directory node. Once a machine dies, its connection to ZooKeeper is broken, the ephemeral directory node it created is deleted, and all other machines are notified that a sibling directory has been removed, so everyone knows that machine is gone.
Adding a new machine is similar: all machines are notified that a new sibling directory has joined, and the head count goes up again. For the second, we change things slightly: all machines create ephemeral sequential directory nodes, and each time the machine with the smallest number is chosen as master.
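
The "smallest number wins" rule can be sketched in a few lines of Python. This only models the selection step on a list of child names; it assumes the names end in the 10-digit sequence suffix that ZooKeeper appends to sequential nodes, and the function names and the `member-` prefix are hypothetical.

```python
def sequence_of(node_name):
    """Read the 10-digit sequence suffix of a sequential node name."""
    return int(node_name[-10:])


def elect_master(children):
    """Pick the child with the smallest sequence number as master."""
    return min(children, key=sequence_of)


children = ["member-0000000003", "member-0000000001", "member-0000000002"]
master = elect_master(children)

# When the master's ephemeral node disappears, the next smallest takes over.
survivors = [name for name in children if name != master]
next_master = elect_master(survivors)
```

Because every machine applies the same rule to the same child list, they all agree on the master without talking to each other directly.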
ZooKeeper distributed lock (file system, notification mechanism)
With ZooKeeper's consistent file system, the locking problem becomes easy. Lock services fall into two categories: one maintains exclusivity, the other controls ordering.
For the first category, we treat a znode on ZooKeeper as a lock, implemented by creating a znode. All clients try to create the /distribute_lock node, and the client that succeeds in creating it owns the lock. Deleting the /distribute_lock node that you created releases the lock.
For the second category, /distribute_lock already exists in advance, and all clients create ephemeral sequential directory nodes under it. As with master election, the client with the lowest number gets the lock and deletes its node when it is done, which is convenient.
ZooKeeper queue management (file system, notification mechanism)
There are two types of queue:
1, A synchronization queue: the queue becomes usable only when all of its members have gathered; otherwise it keeps waiting for all members to arrive.
2, A queue that enqueues and dequeues in FIFO order.
For the first type, create ephemeral directory nodes under an agreed directory and watch whether the number of nodes has reached the number we require.
For the second type, the basic principle is the same as the ordering-control scenario in the distributed lock service: items are numbered when they enter the queue and leave the queue by number. Create PERSISTENT_SEQUENTIAL nodes under a specific directory; when creation succeeds, a Watcher notifies the waiting queue, and the queue deletes the node with the smallest sequence number to consume it. In this scenario ZooKeeper's znodes store the messages: the data stored in a znode is the content of the message in the queue, and the SEQUENTIAL sequence number is the message's number, so messages can be taken out in order. Because the nodes created are persistent, there is no need to worry about losing queue messages.
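
The FIFO behaviour of the second queue type can be modelled in memory with Python. This sketch is not a ZooKeeper client: the class `SequentialQueueSketch`, the `item-` prefix, and the dictionary standing in for the queue directory are assumptions for illustration.

```python
class SequentialQueueSketch:
    """Hypothetical in-memory model of a queue built on sequential nodes."""

    def __init__(self):
        self._next_sequence = 0
        self._nodes = {}  # node name -> message stored in that node

    def enqueue(self, message):
        # Stands in for creating a PERSISTENT_SEQUENTIAL node holding the message.
        name = f"item-{self._next_sequence:010d}"
        self._next_sequence += 1
        self._nodes[name] = message
        return name

    def dequeue(self):
        # Consume by deleting the node with the smallest sequence number.
        if not self._nodes:
            return None
        smallest = min(self._nodes)
        return self._nodes.pop(smallest)


queue = SequentialQueueSketch()
queue.enqueue("first")
queue.enqueue("second")
consumed = [queue.dequeue(), queue.dequeue(), queue.dequeue()]
```

Messages come out in the order they were created, and the third call returns None because the queue is empty.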



Origin juejin.im/post/5e0600995188251264167d2d