Introduction to ZooKeeper and environment setup

1. Introduction to ZooKeeper

  • 1. ZooKeeper is an efficient distributed coordination service that exposes common services such as naming, configuration management, synchronization control, and group services. We can use ZK for tasks such as consensus, cluster management, and leader election.
    ZooKeeper is a highly available distributed management and coordination framework built on the ZAB protocol (ZooKeeper Atomic Broadcast). The framework keeps data consistent in a distributed environment, and this is why ZooKeeper has become a powerful tool for solving distributed consistency problems. It provides the following guarantees:
    * Sequential consistency: Transaction requests issued by one client are applied to ZooKeeper strictly in the order in which that client issued them.
    * Atomicity: The result of a transaction request is applied consistently across the whole cluster. Either every machine applies the transaction or none does; it never happens that some machines apply it and others do not.
    * Single view: No matter which ZooKeeper server a client connects to, it sees the same server-side data model.
    * Reliability: Once a server has applied a transaction and responded to the client, the state change caused by that transaction is preserved until another transaction changes it.
    * Real-time: Strictly speaking, real-time would mean that a client can read the new data from the server immediately after the transaction is applied. ZooKeeper guarantees only that, within a certain period of time, the client will be able to read the latest data state from the server.

  • 2. ZooKeeper design goals
    Goal 1: Simple data structure. ZooKeeper lets distributed processes coordinate with each other through a simple tree structure (also called a tree namespace).
    Goal 2: It can be clustered. A ZooKeeper cluster is usually made up of 3 to 5 machines. As long as more than half of the machines in the cluster are working normally, the cluster as a whole can keep serving clients.
    Goal 3: Sequential access. For each update request from a client, ZooKeeper assigns a globally unique, increasing number that reflects the order of all transaction operations. Applications can use this feature to build higher-level synchronization.
    Goal 4: High performance. ZooKeeper keeps all data in memory and serves non-transactional (read) requests directly, so it performs especially well in read-dominated scenarios. In a JMeter stress test with 100% read requests, the result was about 120,000 to 130,000 QPS.
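
The "more than half" rule in Goal 2 can be checked with a little arithmetic. The following is a minimal sketch in shell; the function names quorum_size and tolerated_failures are illustrative and are not part of ZooKeeper.

```shell
#!/bin/sh
# Smallest number of servers that is strictly more than half of an
# ensemble of size $1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority of size $1 still remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes.
for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A 4-machine cluster tolerates only one failure, the same as a 3-machine cluster, which is one reason odd cluster sizes such as 3 or 5 are the usual choice.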

  • 3. ZooKeeper server roles
    ZK servers fall into three roles: Leader, Follower, and Observer. Followers and Observers are collectively referred to as Learners.
    Leader: responsible for processing clients' write requests;
    Follower: serves clients' read requests, forwards write requests to the Leader, and participates in leader election;
    Observer: a special "Follower" that serves clients' read requests but does not take part in elections.
  • 4. ZooKeeper application scenarios
    4.1. Configuration management: Configuration management is very common in distributed applications. Ordinary application systems often need to share things such as a machine list, runtime switch settings, and database configuration information. This kind of global configuration usually has three characteristics:
    A. The amount of data is relatively small;
    B. The content of the data changes dynamically at runtime;
    C. Every node in the cluster shares the information, and the configuration is consistent.
    4.2. Cluster management: ZooKeeper can track the service status of the machines in the current cluster, and it can also elect a "manager" for the cluster (leader election, another ZooKeeper feature), which gives the cluster fault tolerance. Typical needs are:
    A. Knowing how many machines are currently working in the cluster;
    B. Collecting runtime status data from each machine in the cluster;
    C. Bringing machines in the cluster online and taking them offline.
    4.3. Publish and subscribe: ZooKeeper is a distributed data management and coordination framework that follows a typical publish/subscribe model. Developers can use it to publish and subscribe to distributed data.
    4.4. Database switching: For example, at initialization we read the database configuration stored on a ZooKeeper node. When the configuration changes, ZooKeeper sends a change notification to each client, and each client can then fetch the latest data again after receiving it.
    4.5. Distributed log collection: We can build a log system that collects all log information in the cluster for unified management.
    4.6. ZooKeeper is highly available in distributed scenarios, but implementing distributed functions with the native API is difficult and too time-consuming for a team, and a home-grown implementation may not be stable. It is better to use a mature third-party client such as the Curator framework, which is a top-level Apache project.

Note: To understand ZooKeeper's underlying implementation, study two topics: ZAB and Paxos.

2. ZooKeeper environment setup

1. Unzip ZooKeeper: tar -zxvf zookeeper-3.4.5.tar.gz -C /usr/local
2. Rename: mv /usr/local/zookeeper-3.4.5 /usr/local/zookeeper
3. Modify environment variables: vim /etc/profile

export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=.:$HADOOP_HOME/bin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/....

4. Apply the changes: source /etc/profile
5. Rename the sample configuration file: cd /usr/local/zookeeper/conf, then mv zoo_sample.cfg zoo.cfg
6. Edit the configuration: vim zoo.cfg

(1)dataDir=/usr/local/zookeeper/data
(2)server.0=192.168.5.121:2888:3888
   server.1=192.168.5.122:2888:3888
   server.2=192.168.5.123:2888:3888
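
Put together, the edited file might look like the one produced by the sketch below. It assumes the remaining values (tickTime, initLimit, syncLimit, clientPort) are left at the defaults shipped in zoo_sample.cfg. The helper name write_zoo_cfg is illustrative, and the file is written to a scratch directory rather than the live conf directory.

```shell
#!/bin/sh
# Write a three-node zoo.cfg into the directory given as $1.
# write_zoo_cfg is a hypothetical helper name; the first four values are
# assumed to be the unchanged defaults from zoo_sample.cfg.
write_zoo_cfg() {
    cat > "$1/zoo.cfg" <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/usr/local/zookeeper/data
server.0=192.168.5.121:2888:3888
server.1=192.168.5.122:2888:3888
server.2=192.168.5.123:2888:3888
EOF
}

# Generate the file in a scratch directory and count the server entries.
conf_dir=$(mktemp -d)
write_zoo_cfg "$conf_dir"
grep -c '^server\.' "$conf_dir/zoo.cfg"
```

In each server.N=host:2888:3888 line, N is the server identifier, 2888 is the port followers use to connect to the leader, and 3888 is the port used for leader election.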

7. Server identification configuration:
Create the data directory: mkdir /usr/local/zookeeper/data
Create a file named myid in that directory with the content 0: vim /usr/local/zookeeper/data/myid (the content is the server identifier: 0)
8. Copy the zookeeper directory and the /etc/profile file to server1 and server2.
9. On server1 and server2, change the value in the myid file to 1 and 2 respectively (vim /usr/local/zookeeper/data/myid).
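
The value in myid has to match the N of that machine's own server.N line in zoo.cfg. The sketch below shows one way to write the file and check that it is consistent. The helper names write_myid and myid_matches_cfg are illustrative, and the example runs in a scratch directory rather than /usr/local/zookeeper.

```shell
#!/bin/sh
# Write the server id ($2) into the myid file under data directory $1.
write_myid() {
    mkdir -p "$1" && printf '%s\n' "$2" > "$1/myid"
}

# Succeed only if zoo.cfg ($1) has a server.N line for the id stored in
# the myid file ($2).
myid_matches_cfg() {
    id=$(cat "$2") || return 1
    grep -q "^server\.$id=" "$1"
}

# Demonstrate with a two-line config in a scratch directory.
work=$(mktemp -d)
printf 'server.0=192.168.5.121:2888:3888\nserver.1=192.168.5.122:2888:3888\n' > "$work/zoo.cfg"
write_myid "$work/data" 1
if myid_matches_cfg "$work/zoo.cfg" "$work/data/myid"; then
    echo "myid is consistent with zoo.cfg"
fi
```

On a real node the same check would be pointed at /usr/local/zookeeper/conf/zoo.cfg and /usr/local/zookeeper/data/myid.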
10. Start ZooKeeper:

Path: cd /usr/local/zookeeper/bin
Run: zkServer.sh start (start it on all three machines)
Status: zkServer.sh status (check the ZK mode on all three nodes; there should be one leader and two followers)
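
The status check can be scripted. The following is a rough sketch that pulls the role out of the "Mode:" line printed by zkServer.sh status and checks that a set of roles contains exactly one leader. The helper names zk_mode and has_single_leader are illustrative, and the sample status text is written by hand rather than captured from a live server.

```shell
#!/bin/sh
# Print the role found in the text produced by `zkServer.sh status`.
# Reads the status text on stdin.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Succeed only when the roles passed as arguments contain exactly one leader.
has_single_leader() {
    leaders=0
    for role in "$@"; do
        if [ "$role" = "leader" ]; then
            leaders=$((leaders + 1))
        fi
    done
    [ "$leaders" -eq 1 ]
}

# Sample status text for illustration only.
sample='Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower'
role=$(printf '%s\n' "$sample" | zk_mode)
echo "this node reports: $role"

# Roles gathered by hand from the three nodes.
if has_single_leader follower leader follower; then
    echo "ensemble has exactly one leader"
fi
```

On a running node, assuming zkServer.sh is on the PATH, the role can be read with: zkServer.sh status 2>&1 | zk_mode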

11. Operate ZooKeeper (from an Xshell session)
Command: zkCli.sh enters the ZooKeeper client. Then operate it with the following commands:
List child nodes: ls /zookeeper (paths must start with /)
Create a node and assign a value: create /bhz hadoop
Get: get /bhz
Set: set /bhz baihezhuo
Recursively delete a node: rmr /path
Delete the specified node: delete /path/child
Nodes can be created as one of two types: ephemeral or persistent.
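
The same client commands can also be run one at a time from a shell script, because zkCli.sh accepts a single command after its options. The sketch below wraps that in a small function. The names zk, ZKCLI, and ZK_SERVER are illustrative, and the server address 127.0.0.1:2181 is an assumption (the default client port on the local machine). The -e flag of create makes the node ephemeral; without it the node is persistent.

```shell
#!/bin/sh
# Run a single zkCli.sh command against one server and exit.
# ZKCLI and ZK_SERVER are illustrative variables that can be overridden.
zk() {
    "${ZKCLI:-zkCli.sh}" -server "${ZK_SERVER:-127.0.0.1:2181}" "$@"
}

# Only talk to ZooKeeper when the client script is actually on PATH.
if command -v zkCli.sh >/dev/null 2>&1; then
    zk create /bhz hadoop           # persistent node: stays until deleted
    zk create -e /bhz/worker1 up    # ephemeral node: removed when its session ends
    zk get /bhz
fi
```

Because each zk call opens its own session, the ephemeral node created here disappears once that call's session ends. To watch an ephemeral node live, create it inside an interactive zkCli.sh session and keep that session open.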
