Preface
As a distributed coordination service, Zookeeper provides some basic services for distributed systems, such as naming services, configuration management, synchronization, etc., allowing developers to handle distributed problems more easily.
In distributed systems, coordination is a critical task. For example, how to let a group of independent processes or machines know what tasks they should perform, how to synchronize their status to other processes or machines, and how to handle failures or exceptions. These problems are all solved by Zookeeper.
This article takes an in-depth look at the internal implementation of Zookeeper, starting from its various components and interfaces and analyzing its working principles and design ideas. I hope this source-code walkthrough gives you a solid understanding of how Zookeeper works, so that you can use it more effectively to solve your own distributed problems.
Zookeeper startup
No matter what code we are looking at, we should start from the entry point of the program so that we can better understand the overall structure and operation process.
From the zkServer.sh script, you can see that the startup class is QuorumPeerMain.
Zookeeper is started through the main method, which finally calls the runFromConfig method to configure and start a ZooKeeper cluster node.
runFromConfig configures the QuorumPeer with the transaction log and snapshot directory paths, the election algorithm, the server ID, the tick time, and the in-memory database instance, among other settings. It also creates the factory class (NIO/Netty) for server-client connections. Finally, it calls the start method to launch the service.
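The settings listed above can be pictured with a small, self-contained sketch. This is not the real QuorumPeer class; the field names and values below are illustrative only (3 is ZooKeeper's default electionAlg, meaning FastLeaderElection, and 2000 ms is the conventional tickTime):

```java
// Illustrative sketch of the configuration runFromConfig applies to a peer.
// NOT the real QuorumPeer API; fields and defaults are for illustration.
public class QuorumPeerConfigSketch {
    long serverId;      // "myid": unique id of this node in the ensemble
    String dataLogDir;  // transaction log directory
    String dataDir;     // snapshot directory
    int electionAlg;    // 3 = FastLeaderElection (the default)
    int tickTime;       // base clock period in milliseconds

    // Hypothetical helper that fills in typical defaults for one node.
    static QuorumPeerConfigSketch defaults(long serverId) {
        QuorumPeerConfigSketch p = new QuorumPeerConfigSketch();
        p.serverId = serverId;
        p.dataLogDir = "/var/lib/zookeeper/log";
        p.dataDir = "/var/lib/zookeeper/data";
        p.electionAlg = 3;
        p.tickTime = 2000;
        return p;
    }

    public static void main(String[] args) {
        QuorumPeerConfigSketch peer = defaults(1);
        System.out.println("server " + peer.serverId + " electionAlg=" + peer.electionAlg);
    }
}
```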
The start method mainly completes the following four tasks:
- Load data from disk into memory to provide necessary data reserves for subsequent processing and response.
- Establish a Socket to handle the client's request and implement communication and interaction with the client.
- Prepare for Leader election and determine the election algorithm.
- Conduct the Leader election, then monitor node state and handle it accordingly, detecting and dealing with node failures or abnormal states in a timely manner to ensure the reliability and stability of the whole system.
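As a rough map of the startup flow, the four tasks above can be lined up in a runnable sketch. The step descriptions are paraphrases of the source-code stages, not real method names:

```java
// Illustrative sketch of the order of operations in QuorumPeer's start():
// the step strings summarize each stage; none of this is the real API.
public class StartSketch {
    static String[] startupSteps() {
        return new String[] {
            "load snapshot and transaction logs from disk into memory",
            "open the server socket (cnxnFactory) for client requests",
            "prepare leader election (QuorumCnxManager + election algorithm)",
            "enter the run() loop to monitor and process node state"
        };
    }

    public static void main(String[] args) {
        String[] steps = startupSteps();
        for (int i = 0; i < steps.length; i++) {
            System.out.println((i + 1) + ". " + steps[i]);
        }
    }
}
```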
Load disk data
ZooKeeper operates at the memory level. In order to ensure reliability, data will be persisted to files in the form of transaction logs, so data is loaded into memory first at startup.
The main purpose of this code is to load the Zookeeper database from disk into memory and to check the related epoch information. getDataTree().lastProcessedZxid gets the latest zxid the server has processed; ZxidUtils.getEpochFromZxid(lastProcessedZxid) extracts the epoch from that zxid (its high 32 bits); readLongFromFile(CURRENT_EPOCH_FILENAME) reads the current epoch from the epoch file. If the current epoch on disk is older than the epoch extracted from the last zxid, the state is inconsistent and an IOException is thrown.
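A simplified, runnable sketch of this consistency check, assuming (as in ZooKeeper) that the epoch occupies the high 32 bits of the 64-bit zxid; the method names here are illustrative, not the real ones:

```java
// Simplified sketch of the epoch consistency check performed at load time.
// Assumption: a 64-bit zxid whose high 32 bits carry the epoch.
import java.io.IOException;

public class EpochCheckSketch {
    static long getEpochFromZxid(long zxid) {
        return zxid >> 32;  // high 32 bits are the epoch
    }

    // Throws if the last processed zxid belongs to a newer epoch than the
    // one recorded in the current-epoch file, i.e. the file is stale.
    static void check(long lastProcessedZxid, long currentEpoch) throws IOException {
        long epochOfZxid = getEpochFromZxid(lastProcessedZxid);
        if (epochOfZxid > currentEpoch) {
            throw new IOException("current epoch " + currentEpoch
                    + " is older than epoch " + epochOfZxid + " of the last zxid");
        }
    }

    public static void main(String[] args) throws IOException {
        long zxid = (5L << 32) | 42;  // epoch 5, counter 42
        check(zxid, 5);               // consistent: no exception
        try {
            check(zxid, 4);           // stale epoch file: inconsistent
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```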
Communication and interaction with clients
The server uses cnxnFactory to establish a Socket for receiving and processing client requests. There are two implementations: NIOServerCnxnFactory, shown here, and NettyServerCnxnFactory. Whether NIO or Netty is used can be set in the configuration. If you are not yet familiar with NIO or Netty, you can read the network programming column first.
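The choice between the two factories follows a common Java pattern: read an implementation class name from a system property and fall back to a default. The sketch below only demonstrates that selection logic; the property name zookeeper.serverCnxnFactory and the two class names are ZooKeeper's real ones, but the helper method is hypothetical:

```java
// Sketch of factory selection by system property, with NIO as the fallback.
// Only the selection pattern is shown; no real factory is instantiated.
public class FactorySelectionSketch {
    static String chooseFactory() {
        String clazz = System.getProperty("zookeeper.serverCnxnFactory");
        if (clazz == null) {
            // Default when nothing is configured.
            return "org.apache.zookeeper.server.NIOServerCnxnFactory";
        }
        return clazz;
    }

    public static void main(String[] args) {
        System.out.println(chooseFactory());  // default: NIO factory
        System.setProperty("zookeeper.serverCnxnFactory",
                "org.apache.zookeeper.server.NettyServerCnxnFactory");
        System.out.println(chooseFactory());  // now: Netty factory
    }
}
```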
The core is the processing of client requests, which passes through a chain of processors:
- CommitProcessor: the transaction commit processor. It waits for the cluster's votes on a Proposal until the Proposal can be committed.
- SyncRequestProcessor: persists transaction requests (write requests) to the local disk.
- AckRequestProcessor: unique to the Leader. After SyncRequestProcessor finishes recording the transaction log, it sends an ACK to the Proposal's vote collector, notifying it that this server has completed the transaction log record for the Proposal.
- FollowerRequestProcessor: handles read and write requests from clients on a Follower and forwards them to other processors as needed, for example forwarding write requests to the Leader node.
- SendAckRequestProcessor: sends an acknowledgment (ACK) to the Leader node to notify it that the request has been processed.
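The processors above form a chain of responsibility: each stage does its work and hands the request to the next. The real processors run asynchronously with their own queues; the synchronous sketch below only illustrates how a request flows through a follower-style chain, using the real processor names as labels:

```java
// Minimal chain-of-responsibility sketch of a request-processor pipeline.
// ZooKeeper's real processors are asynchronous; this is a synchronous toy.
import java.util.ArrayList;
import java.util.List;

public class ProcessorChainSketch {
    interface RequestProcessor {
        void processRequest(String request);
    }

    static class TracingProcessor implements RequestProcessor {
        private final String name;
        private final RequestProcessor next;  // null terminates the chain
        private final List<String> trace;
        TracingProcessor(String name, RequestProcessor next, List<String> trace) {
            this.name = name;
            this.next = next;
            this.trace = trace;
        }
        public void processRequest(String request) {
            trace.add(name);                  // record that this stage ran
            if (next != null) next.processRequest(request);
        }
    }

    // Build a follower-style chain and return the order the stages ran in.
    static List<String> runChain(String request) {
        List<String> trace = new ArrayList<>();
        RequestProcessor chain =
            new TracingProcessor("FollowerRequestProcessor",
                new TracingProcessor("SyncRequestProcessor",
                    new TracingProcessor("SendAckRequestProcessor", null, trace),
                    trace),
                trace);
        chain.processRequest(request);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(runChain("create /foo"));
    }
}
```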
Leader election preparation
startLeaderElection mainly calls createElectionAlgorithm, which creates QuorumCnxManager, the manager of network connections between cluster nodes, and selects an election algorithm. The algorithm is set through the configuration file, with FastLeaderElection as the default.
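The dispatch inside createElectionAlgorithm can be pictured as a switch on the configured algorithm number; in recent ZooKeeper versions only value 3 (FastLeaderElection) remains supported. The helper below is a hypothetical sketch of that dispatch, not the real method:

```java
// Hypothetical sketch of election-algorithm selection by config value.
// In current ZooKeeper, 3 (FastLeaderElection) is the only supported choice.
public class ElectionAlgSketch {
    static String createElectionAlgorithm(int electionAlg) {
        switch (electionAlg) {
            case 3:
                return "FastLeaderElection";
            default:
                throw new IllegalArgumentException(
                        "unsupported election algorithm: " + electionAlg);
        }
    }

    public static void main(String[] args) {
        System.out.println(createElectionAlgorithm(3));
    }
}
```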
Node status processing
super.start() executes the run() method of the QuorumPeer class (QuorumPeer.java), which is mainly responsible for monitoring and processing the node state.
When Zookeeper starts, every node is initially in the LOOKING state. Through the election, one of the servers becomes the Leader and the other servers become Followers. If the code is hard to follow, you can take a look at the ZAB protocol.
Summary
Digging into the source code, I was struck by how simply and cleverly the functionality is implemented. Although the code handles quite complex functionality, its concise organization and coding style make it fairly easy to read and understand. Particularly notable are the extensive use of network programming, the careful treatment of connection safety and optimization, and the way the ZAB protocol elegantly solves the distributed consistency problem. These design ideas provide a valuable reference for building distributed systems, and I look forward to applying them in my own future work.