The company I work for uses Alibaba's internal Dubbo offering, EDAS, which relies on Alibaba's own EDAS service. If you have used open-source Dubbo, you will know ZooKeeper: it is widely used both in big-data applications and for RPC coordination. Whether you have used ZooKeeper before or not, this article focuses on deploying ZooKeeper standalone and as a cluster. You can only understand it deeply by actually doing it.
Source code: https://github.com/limingios/netFuture/ [zookeeper]
(1) Introduction to ZooKeeper
- History
ZooKeeper originated in a research group at Yahoo Research. Researchers there found that many large internal Yahoo systems depended on similar components for distributed coordination, but those components usually had single points of failure. Yahoo's developers therefore set out to build a general-purpose distributed coordination framework with no single point of failure, so that application developers could focus on business logic.
There is an amusing anecdote about the project's name. Early on, since many internal Yahoo projects were named after animals (the famous Pig project, for example), the engineers wanted an animal name for this one too. Raghu Ramakrishnan, then chief scientist of the institute, joked: "If this continues, we will become a zoo!" That settled it: with each distributed component named after an animal, Yahoo's distributed systems already looked like a large zoo, and this project happened to be the one coordinating that distributed environment, so the name ZooKeeper was born.
- Why ZooKeeper became popular
In the 1960s the mainframe was invented and, thanks to its enormous computing and I/O capacity, stability, and security, became the mainstream of computing. But mainframes also had fatal drawbacks: they were expensive to build, complicated to operate, and single points of failure, and training mainframe specialists was especially costly. These problems increasingly held back mainframe adoption. Meanwhile, as PC performance improved and networking spread, everyone turned to minicomputers and ordinary PC servers to build distributed systems at lower cost. Distributed systems then grew more and more popular, and coordination frameworks like ZooKeeper grew with them.
- Zookeeper's official website
Download link: https://www-eu.apache.org/dist/zookeeper/
Source address: https://github.com/apache/zookeeper
(2) Cluster deployment
There are two types of clusters, one is a distributed cluster and the other is a pseudo-distributed cluster.
Distributed: each instance runs on its own separate host, and all instances use the same port.
Pseudo-distributed: multiple instances run on one host and are distinguished by port. Pseudo-distributed setups are rare in real production environments.
Pseudo-distributed clusters are actually the fiddlier ones to operate.
Mac: install Vagrant: https://idig8.com/2018/07/29/docker-zhongji-07/
Windows: install Vagrant: https://idig8.com/2018/07/29/docker-zhongji-08/
System type | IP address | Node role | CPU | Memory | Hostname |
---|---|---|---|---|---|
Centos7 | 192.168.69.100 | Pseudo-distributed | 2 | 2G | zookeeper-virtua |
Centos7 | 192.168.69.101 | True Distributed-Leader | 2 | 2G | zookeeper-Leader |
Centos7 | 192.168.69.102 | True Distributed-Subordinate 1 | 2 | 2G | zookeeper-Follower1 |
Centos7 | 192.168.69.103 | True Distributed-Subordinate 2 | 2 | 2G | zookeeper-Follower2 |
A small tip: mark code blocks with a language so they get syntax coloring. I had neglected this before and my eyes hurt looking at plain text; with color it reads much better.
- (2.1) Pseudo-environment configuration
I still use Vagrant. I am so used to it that I hardly ever create a virtual machine by hand anymore.
(2.1.1) Basic settings
``` bash
su                          # password: vagrant
cd ~
vi /etc/ssh/sshd_config
sudo systemctl restart sshd
vi /etc/resolv.conf         # set nameserver to 8.8.8.8
service network restart
```
![](https://upload-images.jianshu.io/upload_images/11223715-78a8109c41a094e6.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-2aab7202b523bc3f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-1916bbc2f18c3145.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
>(2.1.2) JDK installation
>The script is in my source repo.
``` bash
vi pro.sh
sh pro.sh
```
(2.1.3) Zookeeper download
The latest release at the time of writing is 3.5.4, but I am still using 3.4.10.
``` bash
wget https://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
```
(2.1.4) Unzip zookeeper
``` bash
tar zxvf zookeeper-3.4.10.tar.gz
```
(2.1.5) Enter the conf directory in zk and copy 3 files
``` bash
cd /root/zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo1.cfg
cp zoo_sample.cfg zoo2.cfg
cp zoo_sample.cfg zoo3.cfg
```
(2.1.6) Edit these 3 files zoo1.cfg, zoo2.cfg, zoo3.cfg
(2.1.6.1) Edit zoo1.cfg
``` bash
vi zoo1.cfg
```
``` bash
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored
dataDir=/apps/servers/data/d_1
dataLogDir=/apps/servers/logs/logs_1
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections
#maxClientCnxns=60
# autopurge settings; see the administrator guide before enabling
#autopurge.snapRetainCount=3
#autopurge.purgeInterval=1
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
```
(2.1.6.2) Edit zoo2.cfg
``` bash
vi zoo2.cfg
```
``` bash
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/servers/data/d_2
dataLogDir=/apps/servers/logs/logs_2
clientPort=2182
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
```
(2.1.6.3) Edit zoo3.cfg
``` bash
vi zoo3.cfg
```
``` bash
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/servers/data/d_3
dataLogDir=/apps/servers/logs/logs_3
clientPort=2183
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
```
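The three config files differ only in the instance number, so they can also be generated in one loop. A minimal sketch, assuming the same paths and ports as above; the output directory `CONF_DIR` is my own stand-in for illustration:

```shell
#!/bin/sh
# Generate zoo1.cfg..zoo3.cfg; only the instance number i differs.
CONF_DIR="${CONF_DIR:-/tmp/zk-conf}"   # illustration path, not from the article
mkdir -p "$CONF_DIR"
for i in 1 2 3; do
  cat > "$CONF_DIR/zoo$i.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/servers/data/d_$i
dataLogDir=/apps/servers/logs/logs_$i
clientPort=218$i
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
EOF
done
```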
(2.1.7) Create data directory and log directory
``` bash
mkdir -p /apps/servers/data/d_1
mkdir -p /apps/servers/data/d_2
mkdir -p /apps/servers/data/d_3
mkdir -p /apps/servers/logs/logs_1
mkdir -p /apps/servers/logs/logs_2
mkdir -p /apps/servers/logs/logs_3
echo "1" > /apps/servers/data/d_1/myid
echo "2" > /apps/servers/data/d_2/myid
echo "3" > /apps/servers/data/d_3/myid
```
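These directories and myid files can likewise be created in a loop; a sketch, where the scratch variable `BASE` (mine, for portability) stands in for `/apps/servers`:

```shell
#!/bin/sh
# Create each instance's data/log dirs and write its myid file.
BASE="${BASE:-/tmp/zk-dirs}"   # stand-in for /apps/servers; adjust as needed
for i in 1 2 3; do
  mkdir -p "$BASE/data/d_$i" "$BASE/logs/logs_$i"
  echo "$i" > "$BASE/data/d_$i/myid"   # myid must match the server.N entry
done
```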
![](https://upload-images.jianshu.io/upload_images/11223715-29358fa9e009905c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
> (2.1.8) Enter the bin directory and start each instance
``` bash
cd /root/zookeeper-3.4.10/bin
sh zkServer.sh start ../conf/zoo1.cfg
sh zkServer.sh start ../conf/zoo2.cfg
sh zkServer.sh start ../conf/zoo3.cfg
```
(2.1.9) Enter each one to see the effect
``` bash
source /etc/profile
sh zkCli.sh -server localhost:2181
sh zkCli.sh -server localhost:2182
sh zkCli.sh -server localhost:2183
```
That completes the pseudo-distributed setup. The real focus, though, is the fully distributed cluster; read on.
- (1.2) Distributed environment configuration
(1.2.1) Basic settings (all three machines need to be set)
``` bash
su                          # password: vagrant
cd ~
vi /etc/ssh/sshd_config
sudo systemctl restart sshd
vi /etc/resolv.conf         # set nameserver to 8.8.8.8
service network restart
```
![](https://upload-images.jianshu.io/upload_images/11223715-af39780dc5b88a2c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-1006580435e435f3.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-514cba97d35d79e3.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-42238d619aa86169.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-099a2ba509e81d91.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-008debaab296be6f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-fc58d43a8ce53bf2.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-faa0c439160805be.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
![](https://upload-images.jianshu.io/upload_images/11223715-d036b005acb8483f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
>(1.2.2) JDK installation (required on all three machines)
>The script is in my source repo.
``` bash
vi pro.sh
sh pro.sh
```
(1.2.3) Zookeeper download (required on all three machines)
The latest release at the time of writing is 3.5.4, but I am still using 3.4.10.
Why three machines? A ZooKeeper ensemble needs a strict majority of servers (a quorum) alive to operate, so odd-sized clusters are preferred: an even-sized cluster tolerates no more failures than the next smaller odd-sized one.
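The quorum arithmetic behind the odd-number preference is easy to check: a quorum is floor(n/2)+1 servers, so the cluster survives n minus that many failures.

```shell
#!/bin/sh
# For n servers, quorum = n/2 + 1 (integer division); the cluster stays up
# as long as at most n - quorum servers fail.
for n in 3 4 5 6; do
  q=$(( n / 2 + 1 ))
  echo "n=$n quorum=$q tolerates=$(( n - q ))"
done
# Note: n=3 and n=4 both tolerate exactly 1 failure, so the 4th server
# adds no fault tolerance.
```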
``` bash
wget https://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
```
(1.2.4) Unzip zookeeper
``` bash
tar zxvf zookeeper-3.4.10.tar.gz
```
(1.2.4) Configure the cfg file (all three machines need to be set)
``` bash
cd ~/zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo.cfg
```
(1.2.5) Configure the cfg file. The configuration is identical on all three machines, so I show it once rather than repeating it.
``` bash
vi zoo.cfg
```
``` bash
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
dataLogDir=/tmp/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=192.168.69.101:2888:3888
server.1=192.168.69.102:2888:3888
server.2=192.168.69.103:2888:3888
```
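A common pitfall is a myid value that matches no `server.<id>` line, in which case the node cannot join the quorum. A small self-contained sanity check for that (a throwaway directory stands in for /tmp/zookeeper so it runs anywhere):

```shell
#!/bin/sh
# Check that the id in myid appears as a server.<id> line in zoo.cfg.
DIR=$(mktemp -d)               # throwaway stand-in for /tmp/zookeeper
printf '%s\n' \
  'server.0=192.168.69.101:2888:3888' \
  'server.1=192.168.69.102:2888:3888' \
  'server.2=192.168.69.103:2888:3888' > "$DIR/zoo.cfg"
echo 0 > "$DIR/myid"           # this host plays server.0
id=$(cat "$DIR/myid")
if grep -q "^server\.$id=" "$DIR/zoo.cfg"; then
  echo "myid $id OK"
else
  echo "myid $id has no matching server entry" >&2
fi
```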
(1.2.6) Configure myid
The myid file must live in the directory pointed to by dataDir=/tmp/zookeeper.
``` bash
cd /tmp
mkdir zookeeper
cd zookeeper
```
(1.2.6.1) 192.168.69.101 configure myid
``` bash
echo '0' > myid
cat myid
```
(1.2.6.2) 192.168.69.102 configure myid
``` bash
echo '1' > myid
cat myid
```
(1.2.6.3) 192.168.69.103 configure myid
``` bash
echo '2' > myid
cat myid
```
Run the start command on each of the 3 virtual machines
``` bash
cd ~/zookeeper-3.4.10/bin
sh zkServer.sh start
```
#### (3) Conceptual combing
- (3.1) Zoo.cfg configuration
parameter | meaning |
---|---|
tickTime | Basic heartbeat interval in milliseconds (2000 here) |
initLimit | Maximum time for followers to connect and sync to the leader when joining, initLimit*tickTime |
syncLimit | Maximum time between a leader/follower request and its acknowledgement, syncLimit*tickTime |
dataDir | Data (snapshot) directory |
dataLogDir | Transaction log directory |
clientPort | Port the server listens on for client connections |
server.A=B:C:D | A: server id (the myid value); B: server IP; C: leader/follower communication port; D: leader-election port |
- (3.2) Role
Leader:
As the master node of the ZooKeeper cluster, the Leader is responsible for responding to all requests that change ZooKeeper state. It orders and numbers every state-update request to guarantee FIFO processing of the cluster's internal messages, and all write operations go through the leader.
Follower :
The Follower's logic is simpler. Besides serving read requests on its own server, a follower also processes the leader's proposals and commits them locally when the leader commits. Note too that the leader and followers make up the voting members (the quorum) of the ZooKeeper cluster; only they take part in electing a new leader and acknowledging the leader's proposals.
Observer :
If the read load on a ZooKeeper cluster is very high, or clients span many data centers, some Observer servers can be added to raise read throughput. An Observer is similar to a Follower, with two small differences: first, it is not a voting member, so it neither takes part in elections nor acknowledges proposals; second, it does not persist transactions to disk, so once restarted it must resynchronize the entire namespace from the leader.
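As a sketch of how an observer would be added to this setup (the fourth machine and its IP are hypothetical, not part of the article's cluster), ZooKeeper uses `peerType=observer` on the observer itself plus an `:observer` tag on its entry in every server's zoo.cfg:

``` bash
# In the observer machine's own zoo.cfg:
peerType=observer
# In the zoo.cfg of every server, including the observer itself:
server.3=192.168.69.104:2888:3888:observer   # hypothetical 4th machine
```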
- (3.3) Zookeeper features
A ZooKeeper ensemble is composed of multiple servers:
1. One leader, multiple followers.
2. Every server keeps a copy of the data.
3. Global data consistency: followers serve reads, and write requests are forwarded to the leader.
4. Ordering: the leader applies update requests in order, and updates from the same client are applied in the order they were sent.
5. Atomicity: a data update either succeeds completely or fails completely.
6. Single system image: a client sees the same data view no matter which server it connects to.
7. Timeliness: within a bounded delay, clients can read the latest data.
PS: This article mainly covered ZooKeeper's principles and cluster deployment without going into every detail. Next time I will talk about actually using ZooKeeper.