Environment preparation
zookeeper:zookeeper-3.4.14
hadoop:hadoop-2.8.5
hbase:hbase-1.4.13
master: namenode, resourcemanager
slave1: secondarynamenode, datanode
slave2: datanode
One, Hadoop cluster construction
1. Decompress the hadoop installation package
tar zxvf hadoop-2.8.5.tar.gz
2. Add JAVA_HOME to the three configuration files of hadoop-env.sh, mapred-env.sh, and yarn-env.sh
cd hadoop-2.8.5/etc/hadoop/
export JAVA_HOME=/usr/local/jdk1.8.0_191
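All three files need the same line, so it can be appended in one pass; a minimal sketch assuming the paths used in this guide:

```shell
# Append JAVA_HOME to the three env scripts (paths from this guide).
cd hadoop-2.8.5/etc/hadoop/
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/usr/local/jdk1.8.0_191' >> "$f"
done
```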
3. Modify core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoopdata</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<!--
These two properties set where the namenode and datanode store their data.
If they are not set, the data is placed under the hadoop.tmp.dir directory configured in core-site.xml.
-->
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hadoopdata/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hadoopdata/data</value>
</property>
<!-- Number of data replicas; do not exceed the number of datanodes -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- Node that runs the secondarynamenode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave1:50090</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
4. Modify slaves and add data nodes
slave1
slave2
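Equivalently, the slaves file can be written in one command (run from hadoop-2.8.5/etc/hadoop):

```shell
# Overwrite the slaves file with the two datanode hosts.
cat > slaves <<'EOF'
slave1
slave2
EOF
```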
5. Configure passwordless SSH login between the nodes (this must be configured, otherwise later steps will hang)
ssh-keygen   # press Enter four times to accept the defaults
ssh-copy-id [email protected]
ssh-copy-id [email protected]
Repeat this on every node, copying the local ssh key to all the other machines.
6. Distribute the hadoop installation package to other nodes
scp -r hadoop-2.8.5 slave1:/usr/local/
scp -r hadoop-2.8.5 slave2:/usr/local/
7. Add the following IP/hostname mappings to /etc/hosts on all nodes
192.168.214.129 master
192.168.214.130 slave1
192.168.214.131 slave2
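The three mappings can be appended in one shot on each node (run as root); a sketch, assuming none of the entries are already present:

```shell
# Append the cluster hostname mappings to /etc/hosts.
cat >> /etc/hosts <<'EOF'
192.168.214.129 master
192.168.214.130 slave1
192.168.214.131 slave2
EOF
```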
8. Add Hadoop environment variables to all nodes
vi /etc/profile
export HADOOP_HOME=/usr/local/hadoop-2.8.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
9. Format the file system on the namenode master node
cd bin/
hdfs namenode -format
If the output contains "has been successfully formatted", the formatting succeeded.
10. The master node starts the hdfs system, and the resourcemanager node starts yarn
start-dfs.sh
start-yarn.sh
To shut down:
cd sbin
stop-yarn.sh
stop-dfs.sh
11. When Hadoop needs to be reinstalled:
1) Delete the hadoop logs on each machine
By default the logs are under HADOOP_HOME/logs. If they are not deleted, the log files accumulate and fill up the disk.
2) Delete the data and files generated by the old namenode and datanode
Delete the directory configured as hadoop.tmp.dir; if dfs.namenode.name.dir and dfs.datanode.data.dir are also configured, delete those directories as well.
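Steps 1) and 2) amount to a short cleanup; a sketch using this guide's paths (on other layouts, delete whatever hadoop.tmp.dir, dfs.namenode.name.dir, and dfs.datanode.data.dir point at):

```shell
# Run on every node; double-check the paths before deleting anything.
rm -rf /usr/local/hadoop-2.8.5/logs/*   # 1) old logs
rm -rf /home/hadoop/hadoopdata          # 2) hadoop.tmp.dir (here it also
                                        #    holds the name/ and data/ dirs)
```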
12. When starting Hadoop, a datanode fails to start
Check the hadoop.tmp.dir parameter in core-site.xml, find the VERSION file in the current/ subdirectory under that path on the datanode, and change its clusterID to match the namenode's:
/home/hadoop/hadoopdata/name/current
/home/hadoop/hadoopdata/data/current
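The clusterID fix can be scripted; a minimal sketch assuming the directories above, run where both VERSION files are reachable (on a separate datanode host, fetch the namenode's VERSION first):

```shell
# Copy the namenode's clusterID into the datanode's VERSION file.
NN_VERSION=/home/hadoop/hadoopdata/name/current/VERSION
DN_VERSION=/home/hadoop/hadoopdata/data/current/VERSION
cid=$(grep '^clusterID=' "$NN_VERSION" | cut -d= -f2)
sed -i "s/^clusterID=.*/clusterID=$cid/" "$DN_VERSION"
```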
Two, Zookeeper cluster construction
1. Unzip the zookeeper installation package
tar -zxvf zookeeper-3.4.14.tar.gz
2. Copy conf/zoo_sample.cfg to zoo.cfg, modify zoo.cfg as below, and create the new directories data/ and log/
dataDir=/usr/local/zookeeper-3.4.14/data
dataLogDir=/usr/local/zookeeper-3.4.14/log
server.0=192.168.214.129:2888:3888
server.1=192.168.214.130:2888:3888
server.2=192.168.214.131:2888:3888
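The whole step can be scripted; a minimal sketch assuming the install path from step 1 (the sed line removes the sample's default dataDir=/tmp/zookeeper so only one dataDir remains):

```shell
ZK_HOME=/usr/local/zookeeper-3.4.14          # install path from step 1
mkdir -p "$ZK_HOME/data" "$ZK_HOME/log"      # the new data/ and log/ dirs
cp "$ZK_HOME/conf/zoo_sample.cfg" "$ZK_HOME/conf/zoo.cfg"
sed -i '/^dataDir=/d' "$ZK_HOME/conf/zoo.cfg"   # drop the sample default
cat >> "$ZK_HOME/conf/zoo.cfg" <<'EOF'
dataDir=/usr/local/zookeeper-3.4.14/data
dataLogDir=/usr/local/zookeeper-3.4.14/log
server.0=192.168.214.129:2888:3888
server.1=192.168.214.130:2888:3888
server.2=192.168.214.131:2888:3888
EOF
```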
3. Distribute the zookeeper installation package to other nodes
scp -r zookeeper-3.4.14 slave1:/usr/local/
scp -r zookeeper-3.4.14 slave2:/usr/local/
4. On every machine, create a myid file under /usr/local/zookeeper-3.4.14/data/ and set its content to that server's number (0, 1, 2) from the server.N lines in step 2
vi /usr/local/zookeeper-3.4.14/data/myid
0 (on master)
1 (on slave1)
2 (on slave2)
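myid holds a single number; a sketch for master (use 1 on slave1 and 2 on slave2, matching the server.N lines in zoo.cfg):

```shell
DATA_DIR=/usr/local/zookeeper-3.4.14/data   # dataDir from zoo.cfg
mkdir -p "$DATA_DIR"
echo 0 > "$DATA_DIR/myid"   # 0 on master, 1 on slave1, 2 on slave2
```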
5. Start, check, and stop
bin/zkServer.sh start
bin/zkServer.sh status
bin/zkServer.sh stop
Once all three nodes are started, status should report one leader and two followers.
Three, HBase cluster construction
1. Decompress the hbase installation package
tar -zxvf hbase-1.4.13-bin.tar.gz
2. Modify hbase-env.sh, hbase-site.xml, backup-masters, and regionservers
cd hbase-1.4.13/conf
hbase-env.sh
# Add JAVA_HOME
export JAVA_HOME=/usr/local/jdk1.8.0_191
# Do not use the bundled zookeeper; since a zookeeper cluster was built above, set this to false
export HBASE_MANAGES_ZK=false
hbase-site.xml
First create the hbase root directory hdfs://master:9000/user/hbase on HDFS
hadoop fs -mkdir -p /user/hbase
<configuration>
<!--
Optional. If these are configured, hbase.zookeeper.property.dataDir must match
the dataDir path set in the zookeeper configuration file zoo.cfg. For example,
with dataDir=/var/zookeeper in zoo.cfg:
<property>
<name>hbase.master</name>
<value>master</value>
</property>
<property>
<name>hbase.master.port</name>
<value>15999</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/var/zookeeper</value>
</property>
-->
<!-- Where hbase stores its data on hdfs -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/user/hbase</value>
</property>
<!-- Run hbase as a distributed cluster -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- The zookeeper cluster (the server.N hosts from zoo.cfg) -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
<!-- Thrift server transport options: framed transport and compact protocol -->
<property>
<name>hbase.regionserver.thrift.framed</name>
<value>true</value>
</property>
<property>
<name>hbase.regionserver.thrift.compact</name>
<value>true</value>
</property>
</configuration>
backup-masters: add the standby hbase master node
vi backup-masters
slave1
regionservers: list the RegionServer nodes
vi regionservers
# remove the default localhost entry
master
slave1
slave2
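The two files can also be generated directly; a minimal sketch assuming the conf path used in this guide (mkdir -p only keeps the sketch self-contained):

```shell
HBASE_CONF=hbase-1.4.13/conf                 # conf path used in this guide
mkdir -p "$HBASE_CONF"
# standby master on slave1
echo slave1 > "$HBASE_CONF/backup-masters"
# region servers on all three nodes (replaces the default localhost)
printf '%s\n' master slave1 slave2 > "$HBASE_CONF/regionservers"
```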
3. Copy the hadoop configuration files core-site.xml and hdfs-site.xml into the hbase conf directory
cp /usr/local/hadoop-2.8.5/etc/hadoop/core-site.xml ./
cp /usr/local/hadoop-2.8.5/etc/hadoop/hdfs-site.xml ./
4. Distribute the hbase installation package to other nodes
scp -r hbase-1.4.13 slave1:/usr/local/
scp -r hbase-1.4.13 slave2:/usr/local/
5. Add environment variables
vi /etc/profile
export HBASE_HOME=/usr/local/hbase-1.4.13
export PATH=$PATH:$HBASE_HOME/bin:$HBASE_HOME/sbin
source /etc/profile
6. Start the hbase cluster
start-hbase.sh
Open another session and run hbase shell to enter the HBase console; if it connects, the installation succeeded.
7. Error when starting HBase
Caused by: java.io.IOException: Problem binding to /10.12.4.75:60000 : Cannot assign requested address. To switch ports use the 'hbase.master.port' configuration property.
Solution: edit /etc/hosts and append the name reported by hostname -f to the line for the local IP.