Big Data Learning Path 37: ZooKeeper Introduction, Cluster Installation, and an Automatic Startup Script

I. Introduction to ZooKeeper:

ZooKeeper is a coordination service for distributed systems.

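Scenarios:

For example, in the HDFS HA mechanism, the two NameNodes need to be aware of each other's state.

For example, in HBase, the HMaster needs to detect RegionServers coming online and going offline.

For example, the servers in a Solr cluster need to keep their configuration files in sync.

All of these can be built on ZooKeeper,

because ZooKeeper provides the following core functions:

1. It stores data for users.

2. It lets users read that data.

3. It provides a watch (notification) service that tells users when the data changes.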

II. Installing ZooKeeper

1. Extract the archive

tar -zxvf zookeeper-3.4.13.tar.gz -C app/

2. Delete the files that are not needed (run these inside the extracted zookeeper-3.4.13 directory)

rm -rf *.txt *.xml contrib/ docs/ dist-maven/ src/ recipes/ zookeeper-3.4.13.jar.*

rm -rf zookeeper.out

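3. Edit the configuration file

The file is conf/zoo.cfg; create it by renaming the sample that ships in the conf directory: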

mv zoo_sample.cfg  zoo.cfg

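tickTime=2000   ------>   The heartbeat interval (one tick) is 2 seconds. ZooKeeper is itself
a distributed system, so its own nodes have to communicate with each other.
initLimit=10    ------>    The initial synchronization phase may take at most 10 ticks.
syncLimit=5     ------>    When one internal node sends a request to another node,
at most 5 ticks may pass between sending the request and receiving the response.
dataDir=/root/zkdata   ------>   The directory where ZooKeeper stores its data
clientPort=2181  ------>   The port that clients connect to
server.1=marshal01:2888:3888 --
server.2=marshal02:2888:3888   --
server.3=marshal03:2888:3888     -->Server list: 2888 is the port used for communication between the nodes, 3888 is the port used for leader election
server.4=marshal04:2888:3888   --
server.5=marshal05:2888:3888 --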
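
Because initLimit and syncLimit are counted in ticks, the real time limits depend on tickTime. The following is a minimal sketch that reads the three values from a zoo.cfg and converts the limits to milliseconds. The helper names cfg_value and zk_timeouts_ms are made up for this illustration, and the parsing assumes plain key=value lines with no spaces around the equals sign.

```shell
#!/bin/bash
# Sketch only: cfg_value and zk_timeouts_ms are hypothetical helper names.

# Print the value of KEY from a key=value file; comment lines never match.
cfg_value() {
    local key="$1" file="$2"
    awk -F= -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Print "<initLimit in ms> <syncLimit in ms>" for the given zoo.cfg.
zk_timeouts_ms() {
    local file="$1" tick init_ticks sync_ticks
    tick=$(cfg_value tickTime "$file")
    init_ticks=$(cfg_value initLimit "$file")
    sync_ticks=$(cfg_value syncLimit "$file")
    echo "$((tick * init_ticks)) $((tick * sync_ticks))"
}
```

With the values used in this article, zk_timeouts_ms prints "20000 10000": the initial synchronization may take up to 20 seconds, and a request between nodes may wait up to 10 seconds for its response.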

The complete configuration file:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/root/zkdata
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=marshal01:2888:3888
server.2=marshal02:2888:3888
server.3=marshal03:2888:3888
server.4=marshal04:2888:3888
server.5=marshal05:2888:3888
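
A ZooKeeper ensemble keeps working only while a majority of the listed servers is up, so the number of server.N lines determines how many failures the cluster can survive. As a rough sketch (zk_quorum is a hypothetical helper name, not something ZooKeeper ships), the arithmetic can be read straight from the file:

```shell
#!/bin/bash
# Sketch only: zk_quorum is a hypothetical helper name.

# Count the server.N entries in a zoo.cfg and print the majority size
# and the number of servers that may fail while a majority remains.
zk_quorum() {
    local file="$1" n
    n=$(grep -c '^server\.[0-9][0-9]*=' "$file")
    echo "servers=$n majority=$((n / 2 + 1)) tolerated_failures=$(((n - 1) / 2))"
}
```

For the five servers configured above, the majority is 3, so two machines can be down at the same time.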

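4. Create the data directory

It must be created on every machine in the ZK cluster.

5. Create the myid file in the data directory

It must be created on every machine in the ZK cluster,

and its content differs on each machine: it must equal the N of that machine's server.N entry in zoo.cfg. For example, on marshal01 (server.1), from /root: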

echo 1 > zkdata/myid
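
Typing the number by hand on five machines is easy to get wrong. One possible way to avoid that, sketched here under the assumption that each machine's hostname is spelled exactly as in its server.N line, is to look the number up in zoo.cfg. The function name zk_myid_for_host is made up for this example.

```shell
#!/bin/bash
# Sketch only: zk_myid_for_host is a hypothetical helper name.

# Print the N of the server.N entry in FILE whose host part equals HOST.
# Prints nothing when the host is not listed.
zk_myid_for_host() {
    local host="$1" file="$2"
    awk -F= -v h="$host" '
        /^server\.[0-9]+=/ {
            id = $1
            sub(/^server\./, "", id)   # keep only the number after "server."
            split($2, parts, ":")      # parts[1] is the host name
            if (parts[1] == h) { print id; exit }
        }' "$file"
}
```

On each machine it could then be used as: zk_myid_for_host "$(hostname)" /root/app/zookeeper-3.4.13/conf/zoo.cfg > /root/zkdata/myid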

6. Distribute the installation to each machine (run from inside the app directory):

scp -r  zookeeper-3.4.13/ marshal05:$PWD
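
The command above copies to marshal05 only; presumably it is repeated for each of the other machines. A small sketch that prints one scp command per target host, so the list can be reviewed before anything is copied (zk_scp_commands is a hypothetical helper name, and which hosts still need a copy is left to the reader):

```shell
#!/bin/bash
# Sketch only: zk_scp_commands is a hypothetical helper name.

# Usage: zk_scp_commands SRC DEST HOST...
# Prints (does not run) one "scp -r" command per host.
zk_scp_commands() {
    local src="$1" dest="$2" host
    shift 2
    for host in "$@"; do
        echo "scp -r $src $host:$dest"
    done
}
```

For example, zk_scp_commands zookeeper-3.4.13/ "$PWD" marshal04 marshal05 prints two commands; appending "| sh" runs them once they look right.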

7. Start ZooKeeper

On every node, run: bin/zkServer.sh start

Once startup has finished, check the cluster state with: bin/zkServer.sh status
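
In a healthy ensemble, the status command reports "Mode: leader" on exactly one node and "Mode: follower" on the others. If the status output of all nodes is collected into one stream, a short filter can count the modes. This is only a sketch, and zk_mode_summary is a made-up name:

```shell
#!/bin/bash
# Sketch only: zk_mode_summary is a hypothetical helper name.

# Read the output of "zkServer.sh status" (from one or more nodes) on stdin
# and print how many nodes reported each mode.
zk_mode_summary() {
    awk '/^Mode: / { count[$2]++ }
         END { printf "leader=%d follower=%d\n", count["leader"], count["follower"] }'
}
```

For the five-node cluster in this article the expected summary is leader=1 follower=4. Anything else suggests that a node is down or that the election has not finished yet.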

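Starting the cluster node by node like this is tedious. ZooKeeper does not provide a script that automates it, but we can write one ourselves:

First, a small tip: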

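How do you quickly kill every Java process?
killall java
Hadoop and ZooKeeper both run as Java processes, so this command stops them as well.
How do you view information about the Java processes?
ps -ef | grep java
-e shows all processes
-f uses the full output format

On a minimal CentOS installation the killall command may not be available; in that case, install it with yum: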

yum  -y install psmisc
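
Since killall java stops every Java process on the machine, it can be useful to target ZooKeeper alone. The ZooKeeper server runs the main class QuorumPeerMain, which is how it appears in the output of the JDK's jps tool. A minimal sketch that picks out its PID (zk_pids_from_jps is a hypothetical helper name):

```shell
#!/bin/bash
# Sketch only: zk_pids_from_jps is a hypothetical helper name.

# Read "jps" output ("<pid> <main class>" per line) on stdin and print
# only the PIDs whose main class is QuorumPeerMain.
zk_pids_from_jps() {
    awk '$2 == "QuorumPeerMain" { print $1 }'
}
```

A possible use is: jps | zk_pids_from_jps | xargs -r kill. For a normal shutdown, bin/zkServer.sh stop remains the simpler choice.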

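Below is the automation script I wrote. It follows the approach HDFS uses for startup: the hosts to start are listed in zk-slaves, and the script reads them one by one and starts the ZooKeeper service on each machine over ssh: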

zk-slaves:

marshal
marshal01
marshal02
marshal03
marshal04
marshal05

start-zk.sh:

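#!/bin/bash
# Start ZooKeeper on every host listed in /root/zk-slaves, then print each node's status.
echo '----------------------------------'
# Start phase: take the first column of each line in the host list.
for i in $(awk '{print $1}' /root/zk-slaves)
do
    echo "starting $i zookeeper"
    # Sourcing /etc/profile loads the environment (for example JAVA_HOME)
    # that a non-interactive ssh command would otherwise lack.
    ssh "$i" 'source /etc/profile;/root/app/zookeeper-3.4.13/bin/zkServer.sh start'
    echo '-----------------------------------'
done

echo '-----------------------------------'

# Status phase: ask each node for its current mode.
for i in $(awk '{print $1}' /root/zk-slaves)
do
    echo "checking $i zookeeper status"
    ssh "$i" 'source /etc/profile;/root/app/zookeeper-3.4.13/bin/zkServer.sh status'
    echo '-----------------------------------'
done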
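
The same loop works for stopping or restarting the cluster, since zkServer.sh also accepts stop and restart. Below is one possible generalization, not part of the original script: the function name zk_all and the ZK_HOME and SSH variables are assumptions introduced here, with SSH overridable so the commands can be previewed without connecting anywhere.

```shell
#!/bin/bash
# Sketch only: zk_all, ZK_HOME and SSH are names introduced for this example.
ZK_HOME=${ZK_HOME:-/root/app/zookeeper-3.4.13}   # install path used in this article
SSH=${SSH:-ssh}                                  # set SSH=echo for a dry run

# Usage: zk_all ACTION HOSTS_FILE
# Runs "zkServer.sh ACTION" (start, stop, status, restart) on every host
# named in the first column of HOSTS_FILE, skipping empty lines.
zk_all() {
    local action="$1" hosts_file="$2" host
    for host in $(awk 'NF { print $1 }' "$hosts_file"); do
        echo "$action $host zookeeper"
        $SSH "$host" "source /etc/profile; $ZK_HOME/bin/zkServer.sh $action"
        echo '-----------------------------------'
    done
}
```

For example, zk_all stop /root/zk-slaves stops every node, and running it with SSH=echo set beforehand only prints what would be executed.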
