Hadoop cluster (1): ZooKeeper setup

As a Hadoop beginner, it is natural to start with installation. Hadoop's main strength is that it is distributed, so the installation here is a distributed one as well.

The full installation covers ZooKeeper + HDFS + HBase. To keep each post short, I will record the steps in three blog posts.

This article covers preparing the cluster environment and installing ZooKeeper.

~~~~~~~~~~~~Environment preparation part~~~~~~~~~~~~~~~~~~

IP deployment plan (three virtual machines, all running CentOS 6.7)

Zookeeper host assignment

192.168.67.101 c6701
192.168.67.102 c6702
192.168.67.103 c6703

HDFS host assignment

192.168.67.101 c6701  --Namenode+datanode
192.168.67.102 c6702  --datanode
192.168.67.103 c6703  --datanode

HBase host assignment

192.168.67.101 c6701  --Master + region
192.168.67.102 c6702  --region
192.168.67.103 c6703  --region

1. In order to manage each service separately, I created a user for each service separately.

The created users are:

zk
hdfs
hbase

2. Set up password-free ssh for each user and distribute the key to all nodes, so that I only need to run commands on c6701 and can install c6702 and c6703 remotely

ssh-keygen -t rsa -f ~/.ssh/id_rsa

Then copy the key to all nodes (this only lets c6701 reach c6702 and c6703 without a password; access in the other direction still requires one)

ssh-copy-id  c6701  
ssh-copy-id  c6702 
ssh-copy-id  c6703
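
Since the same key has to go to every node for every service user, the three commands above can be wrapped in a loop. This is only a sketch: the function name copy_key_to_nodes and the DRY_RUN switch are my own additions, and the node list is taken from the deployment plan above.

```shell
# Hosts from the deployment plan above.
NODES="c6701 c6702 c6703"

# copy_key_to_nodes USER : run ssh-copy-id for USER on every node.
# With DRY_RUN=1 the commands are only printed, not executed.
copy_key_to_nodes() {
    user=$1
    for node in $NODES; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh-copy-id $user@$node"
        else
            ssh-copy-id "$user@$node"
        fi
    done
}

# Usage: DRY_RUN=1 copy_key_to_nodes zk
```

Running it with DRY_RUN=1 first shows exactly which commands would be issued before any password prompt appears.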

3. Note that /etc/hosts must be configured

[root@c6701 ~]# more /etc/hosts 
192.168.67.101 c6701.org c6701
192.168.67.102 c6702.org c6702
192.168.67.103 c6703.org c6703
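
Before moving on, it can help to confirm that every node name really has an entry in the hosts file. The helper below is a hypothetical sketch (check_hosts is not part of any package); it takes the file as a parameter so it can be tried on a copy first, and it assumes the short hostnames from the plan above.

```shell
# check_hosts FILE HOST... : print "missing: HOST" for every HOST that has
# no entry in FILE; the return status is 1 if anything is missing.
check_hosts() {
    file=$1; shift
    missing=0
    for h in "$@"; do
        # Look for the name as a whole field (after the IP) on a non-comment line.
        if ! awk -v h="$h" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }' "$file"; then
            echo "missing: $h"
            missing=1
        fi
    done
    return $missing
}

# Usage: check_hosts /etc/hosts c6701 c6702 c6703
```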

4. Everything below is run on c6701. Some commands are executed remotely over ssh along the way, which also verifies that password-free ssh works

ssh c6702  "cat /proc/cpuinfo"
ssh c6702  "hostname"
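
The two commands above succeed even when ssh falls back to asking for a password, so they do not strictly prove that key login works. A stricter check, sketched below, uses ssh's BatchMode option, which makes ssh fail instead of prompting. The function names are my own; the 5-second ConnectTimeout is an arbitrary choice.

```shell
# can_login_without_password HOST : succeeds only if ssh works with no prompt.
# BatchMode=yes disables password prompts, so a missing key is a failure.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# check_all_nodes HOST... : print one "ok" or "FAILED" line per host.
check_all_nodes() {
    rc=0
    for host in "$@"; do
        if can_login_without_password "$host"; then
            echo "$host ok"
        else
            echo "$host FAILED"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_all_nodes c6702 c6703
```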

5. Download and install the software (intranet address)

cd /tmp/software
wget http://192.21.104.48/deploy/jdk-8u144-linux-x64.tar.gz
wget http://192.21.104.48/deploy/zookeeper-3.4.6.tar.gz
wget http://192.21.104.48/deploy/hbase-1.1.3.tar.gz
wget http://192.21.104.48/deploy/hadoop-2.6.0-EDH-0u2.tar.gz
wget http://192.21.104.48/deploy/hadoop-2.7.1.tar.gz

6. Install the JDK; it must be installed on all nodes

# tar -xzvf jdk-8u144-linux-x64.tar.gz -C /usr/local

7. Add the following information to .bash_profile

export JAVA_HOME=/usr/local/jdk1.8.0_144
export JRE_HOME=/usr/local/jdk1.8.0_144/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH

Then reload the profile:

source ~/.bash_profile

8. Check the version information

# java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

 

~~~~~~~~~~~~zookeeper installation part~~~~~~~~~~~~~~~~~~

1. Install Zookeeper on C6701

useradd zk
echo  "zk:zk"  | chpasswd
su - zk
mkdir zk
tar -zxvf /tmp/software/zookeeper-3.4.6.tar.gz -C /home/zk/zk

~~~~~~~~~~~~~zoo.cfg configuration~~~~~~~~~~~~~~~
$ pwd
/home/zk/zk/zookeeper-3.4.6/conf
$ cat /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg |grep -v  '^#'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/log
clientPort=2181
autopurge.snapRetainCount=3
autopurge.purgeInterval=6
server.1=c6701:2888:3888
server.2=c6702:2888:3888
server.3=c6703:2888:3888
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
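
In this file tickTime is in milliseconds, while initLimit and syncLimit are counted in ticks, so the real timeouts are tickTime multiplied by the limit: 10 x 2000 ms = 20000 ms for initLimit and 5 x 2000 ms = 10000 ms for syncLimit. The small sketch below does that arithmetic from a zoo.cfg file; the helper names cfg_value and limit_ms are my own.

```shell
# cfg_value FILE KEY : print the value of the first KEY=... line in FILE.
cfg_value() {
    awk -F= -v k="$2" '$1 == k { print $2; exit }' "$1"
}

# limit_ms FILE KEY : convert a limit given in ticks (initLimit, syncLimit)
# to milliseconds by multiplying it with tickTime.
limit_ms() {
    tick=$(cfg_value "$1" tickTime)
    ticks=$(cfg_value "$1" "$2")
    echo $(( tick * ticks ))
}

# Usage: limit_ms /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg initLimit
```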

2. Create the data and log directories named in zoo.cfg

#mkdir -p /data/zookeeper/data
#mkdir -p /data/zookeeper/log
#chown -R zk:zk /data/zookeeper
#chown -R zk:zk /data/zookeeper/data
#chown -R zk:zk /data/zookeeper/log

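3. Create the myid file in the data directory and write 1 into it. The 1 written here is the 1 from server.1 in zoo.cfg. Once everything is configured and the ZooKeeper directory has been copied to the other machines, change myid on each machine to match its own entry: c6702 gets 2, and c6703 gets 3 in the same way. The log directory is where ZooKeeper writes its transaction logs, as set by dataLogDir.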

# su - zk -c  "echo 1 > /data/zookeeper/data/myid"
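
Because the number in myid has to agree with the server.N line for that host, it can be derived from zoo.cfg instead of typed by hand. A minimal sketch, assuming short hostnames without dots as in the zoo.cfg above (the function name myid_for_host is hypothetical):

```shell
# myid_for_host FILE HOST : print N from the line "server.N=HOST:..." in FILE.
# Returns a non-zero status if HOST has no server entry.
myid_for_host() {
    awk -F'[.=:]' -v h="$2" '
        $1 == "server" && $3 == h { print $2; found = 1; exit }
        END { exit !found }' "$1"
}

# Usage: myid_for_host /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg c6702
```

The printed value is what belongs in /data/zookeeper/data/myid on that host.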

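4. Add the environment variable: in /etc/profile, set ZOOKEEPER_HOME and append ":$ZOOKEEPER_HOME/bin" to the end of the existing PATH

A note on where to set environment variables: besides /etc/profile, they can also go into the .bashrc file in a user's home directory. The difference is that .bashrc applies only to that user, while /etc/profile applies to all users. Bash reads /etc/profile when a login shell starts and reads ~/.bashrc for interactive non-login shells, so a value set in /etc/profile is available to everyone unless a per-user file overrides it. It is enough to be roughly aware of this.

Then run source /etc/profile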
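
The profile change can be made repeatable so that running it twice does not add duplicate lines. The sketch below is one way to do it; add_zk_to_path is a name I made up, and the ZooKeeper path in the usage line is the install directory used earlier in this article.

```shell
# add_zk_to_path FILE ZK_HOME : append ZOOKEEPER_HOME and PATH exports to FILE,
# unless FILE already contains a ZOOKEEPER_HOME export.
add_zk_to_path() {
    if grep -q '^export ZOOKEEPER_HOME=' "$1" 2>/dev/null; then
        return 0
    fi
    {
        echo "export ZOOKEEPER_HOME=$2"
        # Single quotes keep the variables unexpanded in the profile file.
        echo 'export PATH=$PATH:$ZOOKEEPER_HOME/bin'
    } >> "$1"
}

# Usage (as root): add_zk_to_path /etc/profile /home/zk/zk/zookeeper-3.4.6
```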

 

5. Install ZooKeeper on c6702

# ssh c6702 "useradd zk"
# ssh c6702 "echo 'zk:zk' | chpasswd"

Set up password-free ssh for the zk user

# ssh-copy-id zk@c6702

Copy the software (the ZooKeeper tarball must be included, since it is unpacked in the next step)

# ssh c6702 "mkdir -p /tmp/software"
# scp -r /tmp/software/* root@c6702:/tmp/software
# ssh c6702 "chmod 777 /tmp/software/*"

Create the directory and unpack the software

# ssh zk@c6702 "mkdir zk"
# ssh zk@c6702 "tar -zxvf /tmp/software/zookeeper-3.4.6.tar.gz -C /home/zk/zk"
# ssh zk@c6702 "ls -al zk"
# ssh zk@c6702 "ls -al zk/zookeeper*"
# ssh zk@c6702 "rm /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg"
# scp -r /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg zk@c6702:/home/zk/zk/zookeeper-3.4.6/conf/.
# ssh zk@c6702 "cat /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg |grep -v '^#'"

Create the directories named in zoo.cfg

# ssh c6702 "mkdir -p /data/zookeeper/data"
# ssh c6702 "chown -R zk:zk /data/zookeeper"
# ssh c6702 "chown -R zk:zk /data/zookeeper/data"
# ssh c6702 "mkdir -p /data/zookeeper/log"
# ssh c6702 "chown -R zk:zk /data/zookeeper/log"

Create the myid file and write 2 into it

# ssh zk@c6702 "echo 2 > /data/zookeeper/data/myid"

6. Install ZooKeeper on c6703

# ssh c6703 "useradd zk"
# ssh c6703 "echo 'zk:zk' | chpasswd"

Set up password-free ssh for the zk user

# ssh-copy-id zk@c6703

Copy the software (the ZooKeeper tarball must be included, since it is unpacked in the next step)

# ssh c6703 "mkdir -p /tmp/software"
# scp -r /tmp/software/* root@c6703:/tmp/software
# ssh c6703 "chmod 777 /tmp/software/*"

Create the directory and unpack the software

# ssh zk@c6703 "mkdir zk"
# ssh zk@c6703 "tar -zxvf /tmp/software/zookeeper-3.4.6.tar.gz -C /home/zk/zk"
# ssh zk@c6703 "ls -al zk"
# ssh zk@c6703 "ls -al zk/zookeeper*"
# ssh zk@c6703 "rm /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg"
# scp -r /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg zk@c6703:/home/zk/zk/zookeeper-3.4.6/conf/.
# ssh zk@c6703 "cat /home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg |grep -v '^#'"

Create the directories named in zoo.cfg

# ssh c6703 "mkdir -p /data/zookeeper/data"
# ssh c6703 "chown -R zk:zk /data/zookeeper"
# ssh c6703 "chown -R zk:zk /data/zookeeper/data"
# ssh c6703 "mkdir -p /data/zookeeper/log"
# ssh c6703 "chown -R zk:zk /data/zookeeper/log"

Create the myid file and write 3 into it

# ssh zk@c6703 "echo 3 > /data/zookeeper/data/myid"
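
The steps for c6702 and c6703 differ only in the hostname and the myid value, so they can be generated from one template. The sketch below only prints a condensed version of the per-node commands (user creation and the tarball copy are left out) so the plan can be reviewed before anything is run; node_commands and the two variables are names I introduced for illustration.

```shell
# Paths taken from the steps above.
ZK_TARBALL=/tmp/software/zookeeper-3.4.6.tar.gz
ZK_CONF=/home/zk/zk/zookeeper-3.4.6/conf/zoo.cfg

# node_commands HOST MYID : print (not run) the setup commands for one node.
node_commands() {
    host=$1; id=$2
    cat <<EOF
ssh $host "mkdir -p /data/zookeeper/data /data/zookeeper/log"
ssh $host "chown -R zk:zk /data/zookeeper"
ssh zk@$host "mkdir -p zk && tar -zxf $ZK_TARBALL -C /home/zk/zk"
scp $ZK_CONF zk@$host:$ZK_CONF
ssh zk@$host "echo $id > /data/zookeeper/data/myid"
EOF
}

# Usage: node_commands c6702 2; node_commands c6703 3
```

Printing first and executing by hand keeps the root and zk steps visible, which matters here because the directory creation runs as root while the unpacking runs as zk.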

7. Start ZooKeeper

/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh start
/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh status

Remote start commands

ssh zk@c6702 "/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh start"
ssh zk@c6703 "/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh start"
ssh zk@c6702 "/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh status"
ssh zk@c6703 "/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh status"

 

8. Status before all three ZooKeeper nodes have been started

[vagrant@c7003 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/vagrant/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.

9. A leader can only be elected once enough ZooKeeper nodes are running. Here all three were started; strictly, a majority of the ensemble (two of three) is sufficient

$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/vagrant/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[vagrant@c7002 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/vagrant/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
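
The status output can be checked mechanically. The sketch below extracts the "Mode:" line from the output of zkServer.sh status, computes the majority size for an ensemble, and tests whether exactly one node reports itself as leader. All three function names are hypothetical, and treating a missing "Mode:" line as "down" is my own convention.

```shell
# mode_of : read "zkServer.sh status" output on stdin and print the mode,
# or "down" when no "Mode:" line is present.
mode_of() {
    awk -F': ' '/^Mode:/ { print $2; found = 1 } END { if (!found) print "down" }'
}

# quorum_size N : smallest majority of an N-node ensemble (integer division).
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# has_one_leader : read one mode per line on stdin; succeed if exactly one
# line is "leader".
has_one_leader() {
    [ "$(grep -cx leader)" -eq 1 ]
}

# Usage:
# for h in c6701 c6702 c6703; do
#     ssh zk@$h "/home/zk/zk/zookeeper-3.4.6/bin/zkServer.sh status" 2>&1 | mode_of
# done | has_one_leader && echo "ensemble has a leader"
```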

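=======Update 2018.1.13==========

How clients connect

The client connects to the servers in the list in random order. It is not possible to specify the order or to prefer reading from a particular ZooKeeper server.

./zkCli.sh -server c6701:2181,c6702:2181,c6703:2181

The output is as follows:

First connection: connected to c6701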

[zookeeper@c6702 bin]$ /usr/local/hadoop/zookeeper-3.4.6/bin/zkCli.sh -server c6701:2181,c6702:2181,c6703:2181
Connecting to c6701:2181,c6702:2181,c6703:2181
......
2018-01-11 21:07:30,797 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=c6701:2181,c6702:2181,c6703:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@4b4bc1e
Welcome to ZooKeeper!
2018-01-11 21:07:30,830 [myid:] - INFO  [main-SendThread(c6701.python279.org:2181):ClientCnxn$SendThread@975] - Opening socket connection to server c6701.python279.org/192.168.67.101:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-01-11 21:07:30,873 [myid:] - INFO  [main-SendThread(c6701.python279.org:2181):ClientCnxn$SendThread@852] - Socket connection established to c6701.python279.org/192.168.67.101:2181, initiating session
2018-01-11 21:07:30,916 [myid:] - INFO  [main-SendThread(c6701.python279.org:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server c6701.python279.org/192.168.67.101:2181, sessionid = 0x160e70285f70001, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null

Second connection: connected to c6702

Connecting to c6701:2181,c6702:2181,c6703:2181
......
2018-01-11 21:10:18,442 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=c6701:2181,c6702:2181,c6703:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@4b4bc1e
Welcome to ZooKeeper!
2018-01-11 21:10:18,489 [myid:] - INFO  [main-SendThread(c6702.python279.org:2181):ClientCnxn$SendThread@975] - Opening socket connection to server c6702.python279.org/192.168.67.102:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-01-11 21:10:18,508 [myid:] - INFO  [main-SendThread(c6702.python279.org:2181):ClientCnxn$SendThread@852] - Socket connection established to c6702.python279.org/192.168.67.102:2181, initiating session
2018-01-11 21:10:18,561 [myid:] - INFO  [main-SendThread(c6702.python279.org:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server c6702.python279.org/192.168.67.102:2181, sessionid = 0x260e70284650002, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
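
The -server argument is just a comma-separated list of host:port pairs, so it can be built from the node list rather than typed out. A small sketch (connect_string is a name I chose; 2181 is the clientPort from zoo.cfg):

```shell
# connect_string PORT HOST... : join the hosts as "host:port,host:port,...".
connect_string() {
    port=$1; shift
    out=""
    for h in "$@"; do
        # ${out:+$out,} adds the separating comma only after the first entry.
        out="${out:+$out,}$h:$port"
    done
    echo "$out"
}

# Usage: ./zkCli.sh -server "$(connect_string 2181 c6701 c6702 c6703)"
```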

 

 

At this point the base environment is deployed and ZooKeeper is installed. The next article continues with the HDFS installation.

 http://blog.51cto.com/hsbxxl/1971241
