Installing Zookeeper
First, use the XFTP tool to upload the Zookeeper and HBase archives to one node of the cluster (hadoop2).
After uploading the files to that node, use scp to send them to the other two machines in the cluster:
scp apache-zookeeper-3.5.6-bin.tar.gz jee@hadoop3:/home/jee/software/
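The same command, pointed at hadoop4, covers the third machine (assuming the same /home/jee/software/ path exists there):
scp apache-zookeeper-3.5.6-bin.tar.gz jee@hadoop4:/home/jee/software/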
Extract the Zookeeper archive on hadoop2:
sudo tar -zxvf apache-zookeeper-3.5.6-bin.tar.gz -C /usr/local/
The extracted directory name is a bit long, so rename it:
mv apache-zookeeper-3.5.6-bin/ zookeeper
Then change the owner and group (you need to switch to the root user for this):
chown -R jee:jee zookeeper/
(after extraction the directory belongs to root; we change it to our ordinary user)
Create a zkData folder for storing data (note: switch back to the ordinary user first, or this folder's owner and group will be root):
mkdir zkData
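Put together, the steps so far look like the sketch below, using sudo instead of switching to root and assuming the archive was uploaded to /home/jee/software/:
cd /home/jee/software
sudo tar -zxvf apache-zookeeper-3.5.6-bin.tar.gz -C /usr/local/   # extract as root
cd /usr/local
sudo mv apache-zookeeper-3.5.6-bin/ zookeeper                     # shorten the name
sudo chown -R jee:jee zookeeper/                                  # hand ownership to our user
mkdir /usr/local/zookeeper/zkData                                 # create as user jee, not root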
Next, configure Zookeeper.
Enter the path: /usr/local/zookeeper/conf
Rename zoo_sample.cfg to zoo.cfg:
mv zoo_sample.cfg zoo.cfg
Modify zoo.cfg with the following settings:
# folder for storing data
dataDir=/usr/local/zookeeper/zkData
# the nodes Zookeeper is distributed across and their ports; the number after "server." is the id, which we configure below
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
server.4=hadoop4:2888:3888
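For reference, a complete zoo.cfg might look like the sketch below; the tickTime/initLimit/syncLimit/clientPort values are the defaults that zoo_sample.cfg ships with, not something this guide changes:
tickTime=2000        # basic time unit in ms
initLimit=10         # ticks a follower may take to connect to the leader
syncLimit=5          # ticks a follower may lag before being dropped
clientPort=2181      # port clients connect to (matches the status output later)
dataDir=/usr/local/zookeeper/zkData
server.2=hadoop2:2888:3888   # 2888: follower-to-leader port, 3888: leader-election port
server.3=hadoop3:2888:3888
server.4=hadoop4:2888:3888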
Configuring myid
On hadoop2, create a myid file under Zookeeper's zkData directory and write the id into it.
hadoop2's myid is 2
hadoop3's myid is 3
hadoop4's myid is 4
The id you write must match the server.X numbers configured in zoo.cfg.
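A one-line way to create the file; a sketch for hadoop2:
echo 2 > /usr/local/zookeeper/zkData/myid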
At this point, the Zookeeper configuration on a single node is complete. Next, on the other two nodes, extract the archive, change the owner and group, and create the zkData folder, just as above.
With Zookeeper installed on the other two nodes, only the configuration files still need changing, so we simply distribute the files we just configured to those two nodes.
Run the following commands on the node that is already configured:
rsync -rvl /usr/local/zookeeper/conf/* jee@hadoop3:/usr/local/zookeeper/conf
rsync -rvl /usr/local/zookeeper/conf/* jee@hadoop4:/usr/local/zookeeper/conf
Then create myid under zkData on these two nodes:
Under hadoop3's zkData directory, create myid and write 3.
Under hadoop4's zkData directory, create myid and write 4.
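As before, a sketch of the one-liners, run on hadoop3 and hadoop4 respectively:
echo 3 > /usr/local/zookeeper/zkData/myid   # on hadoop3
echo 4 > /usr/local/zookeeper/zkData/myid   # on hadoop4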
Then start the cluster to check whether the configuration succeeded.
Start and stop the cluster:
Start:
/usr/local/zookeeper/bin/zkServer.sh start
Stop:
/usr/local/zookeeper/bin/zkServer.sh stop
Note: every server in the Zookeeper cluster must be started/stopped individually. Once all Zookeeper servers are running, exactly one of them is the Leader and the rest are followers.
View status:
Run: /usr/local/zookeeper/bin/zkServer.sh status
The output looks like this:
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
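Starting three daemons by hand gets tedious. A hypothetical helper script (assuming passwordless ssh between the nodes as user jee, which the scp/rsync steps above already suggest) might look like:
#!/bin/bash
# zk.sh start|stop|status - run zkServer.sh on every node in the cluster
for host in hadoop2 hadoop3 hadoop4; do
    echo "---- $host ----"
    ssh jee@$host "/usr/local/zookeeper/bin/zkServer.sh $1"
done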
Configuring HBase
First, extract the archive to /usr/local/, rename the directory to hbase, and change the owner and group:
sudo chown -R jee:jee hbase/
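Spelled out, and assuming an HBase tarball named hbase-2.2.2-bin.tar.gz uploaded to /home/jee/software/ (the exact version is not stated above; substitute your own):
sudo tar -zxvf /home/jee/software/hbase-2.2.2-bin.tar.gz -C /usr/local/
cd /usr/local
sudo mv hbase-2.2.2/ hbase      # shorten the directory name
sudo chown -R jee:jee hbase/    # hand ownership to our user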
Modify the configuration file hbase-env.sh (see the sketch after this list):
1. Set the JDK path in JAVA_HOME.
2. Set HBASE_MANAGES_ZK=false.
The default is true, which means HBase uses its bundled Zookeeper; set it to false to use the independently installed Zookeeper.
3. If you are on JDK 1.8 or later, remove (or comment out) the HBASE_MASTER_OPTS and HBASE_REGIONSERVER_OPTS settings.
4. Create the log directory and point HBASE_LOG_DIR at it.
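A sketch of the relevant lines in hbase-env.sh; the JAVA_HOME and log-directory paths are assumptions to adapt to your machines:
export JAVA_HOME=/usr/local/jdk1.8.0_231        # your JDK path (assumed)
export HBASE_MANAGES_ZK=false                   # use the external Zookeeper
export HBASE_LOG_DIR=/usr/local/hbase/logs      # create this directory first (assumed path)
# on JDK 1.8+, comment out the PermSize options, which that JDK no longer supports:
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"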
Next, configure hbase-site.xml:
<configuration>
<!-- hostnames of the Zookeeper nodes -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
<description>Comma-separated list of servers in the Zookeeper quorum.
</description>
</property>
<!-- HBase's storage root directory; set to /hbase under the Hadoop HDFS root -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop2:9000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/zookeeper/zkData</value>
</property>
</configuration>
Overwrite the regionservers file
Write the cluster machine names into it; every node listed in regionservers runs as a RegionServer.
Configure the worker node names:
hadoop2
hadoop3
hadoop4
Soft-link HBase to Hadoop's configuration:
Place Hadoop's core-site.xml and hdfs-site.xml into the hbase/conf directory.
ln is the symbolic-link command:
ln -s /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hbase/conf/core-site.xml
ln -s /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hbase/conf/hdfs-site.xml
At this point, the HBase configuration on a single node is done. Copy this node's HBase to the other nodes:
sudo scp -r /usr/local/hbase root@hadoop3:/usr/local/hbase
sudo scp -r /usr/local/hbase root@hadoop4:/usr/local/hbase
Then fix the owner and group of the copied HBase directories.
Also, the symbolic links do not survive the copy, so we need to recreate the two links on hadoop3 and hadoop4 as well.
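A sketch of that cleanup, run from hadoop2 against the paths configured above; -sf overwrites whatever the copy left behind at the link paths:
ssh root@hadoop3 "chown -R jee:jee /usr/local/hbase"
ssh root@hadoop3 "ln -sf /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hbase/conf/core-site.xml"
ssh root@hadoop3 "ln -sf /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hbase/conf/hdfs-site.xml"
# repeat the three commands for hadoop4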
In addition, HBase is very sensitive to time: a gap of even a few minutes between nodes can break HBase commands, so we need to set up clock synchronization to keep every node in the cluster on the same time.
Make one node the time master and the others its clients; the clients then pull the time from the master.
We use hadoop2 as the master and hadoop3/hadoop4 as clients. Setup:
1. First, install NTP on hadoop2:
sudo apt-get install ntp
2. Edit /etc/ntp.conf on the time server (hadoop2). The main settings:
(1) which machines may access the time service and synchronize against it (access control);
(2) the time source for the server itself, and the stratum (level) of the time service.
# allow only IPs in the 192.168.124.0/24 subnet to connect and synchronize time; no remote management
# (adjust the subnet to your own local network)
restrict 192.168.124.0 mask 255.255.255.0 nomodify notrap
# the two lines below need no changes
server 127.127.1.0                 # the local clock
fudge 127.127.1.0 stratum 10       # same address as the server line; stratum is the time server's level
Stop ntp on the machines whose clocks need syncing (every machine except hadoop2):
sudo service ntp stop
Synchronize the time (run on every machine except hadoop2):
sudo ntpdate hadoop2
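To check that the server is actually serving time, ntpq (installed alongside ntp) can query it; a quick check on hadoop2:
ntpq -p    # lists the time sources; the local clock (127.127.1.0) should appear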
On every machine except hadoop2, put the sync command in crontab (switch to the root user first):
crontab -e # open crontab
0 8 * * * ntpdate hadoop2 # the line to add: synchronize once a day at 8 a.m.