Upload the hadoop archive, decompress it, and test that it works
tar zxvf hadoop-2.6.4.tar.gz
*decompress*
rm hadoop-2.6.4.tar.gz
*delete the archive*
cd /opt/hadoop-2.6.4/bin
*enter the extracted directory*
./hadoop version
*test whether this version works*
Add an environment variable to make it globally available
vi ~/.bashrc
In vi, add the variables at the very bottom
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin
source ~/.bashrc
*apply the environment variables*
hadoop version
*test - if no error is reported, it succeeded*
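The two export lines above can be sanity-checked with a small shell snippet (paths as used in this guide):

```shell
# Same exports as added to ~/.bashrc above
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin
# Verify that the hadoop bin directory really ended up on PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) msg="PATH ok" ;;
  *) msg="PATH missing $HADOOP_HOME/bin" ;;
esac
echo "$msg"
```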
Master:
Copy the extracted, configured hadoop folder from the master to slave0 and slave1
cd /opt/
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc
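The four scp commands above follow the same pattern per slave; as a sketch they could be generated in a loop (shown here as a dry run that only echoes the commands, with the hostnames slave0/slave1 assumed from this guide):

```shell
# Dry run: build the copy commands for each slave instead of executing them
cmds=$(for host in slave0 slave1; do
  echo "scp -r /opt/hadoop-2.6.4 hadoop@$host:/opt/"
  echo "scp ~/.bashrc hadoop@$host:~/.bashrc"
done)
echo "$cmds"
```

Removing the surrounding echo/quotes would run the copies for real.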
slave0, slave1:
apply:
source ~/.bashrc
test:
hadoop version
Fix ownership:
cd /
sudo chown hadoop:hadoop opt/
Cluster construction:
Modify permissions (on each node, if not already done):
cd /
sudo chown hadoop:hadoop opt/
Configure cluster building:
modify core-site.xml
1. Enter the directory:
cd /opt/hadoop-2.6.4/etc/hadoop
2. Edit core-site.xml
vi core-site.xml
<!--Specify the address of the HDFS NameNode-->
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
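For reference, Hadoop only reads these properties if they sit inside the file's configuration root element, so the complete core-site.xml would look roughly like this (values taken from the properties above):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Address of the HDFS NameNode -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
```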
3. Edit hdfs-site.xml
vi hdfs-site.xml
<!--Specify the data node directory-->
<property>
<name>dfs.data.dir</name>
<value>/opt/dfs/data</value>
</property>
<!--Specify the name node directory-->
<property>
<name>dfs.name.dir</name>
<value>/opt/dfs/name</value>
</property>
<!--Specify the number of replicas HDFS keeps-->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>master:50090</value>
</property>
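The dfs.name.dir and dfs.data.dir paths above must exist and be writable by the hadoop user; a minimal sketch of creating them (demonstrated under a temporary base directory so it is safe to run anywhere; on the real nodes the base would be /opt):

```shell
# Temporary stand-in for /opt
base=$(mktemp -d)
# Create the name node and data node directories from hdfs-site.xml
mkdir -p "$base/dfs/name" "$base/dfs/data"
ls "$base/dfs"
```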
4. Create mapred-site.xml from the mapred-site.xml.template template file
cp mapred-site.xml.template mapred-site.xml
5. Edit mapred-site.xml
vi mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
<!--Tell hadoop to run on YARN from now on-->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
6. Edit yarn-site.xml
vi yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!--Specify that data is fetched via shuffle-->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
7. Configure hadoop-env.sh
vi hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101
8. Configure mapred-env.sh
vi mapred-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101
9. Configure the slaves file to specify which machines act as slaves
vi slaves
Delete the original localhost
add
slave0
slave1
10. Configure masters
vi masters
master
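Steps 9 and 10 can also be done non-interactively instead of with vi; a sketch (run in a temporary directory here for safety; on the master the directory would be /opt/hadoop-2.6.4/etc/hadoop):

```shell
# Temporary stand-in for /opt/hadoop-2.6.4/etc/hadoop
cd "$(mktemp -d)"
# slaves: one worker hostname per line (the original localhost removed)
printf 'slave0\nslave1\n' > slaves
# masters: the master hostname
printf 'master\n' > masters
cat slaves masters
```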
11. After the configuration succeeds on the master,
copy the entire hadoop folder from the master to slave0 and slave1:
cd /opt
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc
slave0:
source ~/.bashrc
slave1:
source ~/.bashrc
test:
slave0:
hadoop version
slave1:
hadoop version
Seeing the version number means it is OK.
Start the cluster:
1. Format the NameNode:
hadoop namenode -format
2. Start the cluster:
cd /opt/hadoop-2.6.4/sbin
./start-all.sh
Check the processes with jps
No NameNode in the output means the startup failed
./stop-all.sh
Inspection showed core-site.xml had been written incorrectly; after fixing it, copy it to the slaves:
cd /opt/hadoop-2.6.4/etc/hadoop
scp core-site.xml hadoop@slave0:/opt/hadoop-2.6.4/etc/hadoop/
scp core-site.xml hadoop@slave1:/opt/hadoop-2.6.4/etc/hadoop/
Test again:
hadoop namenode -format
cd /opt/hadoop-2.6.4/sbin
./start-all.sh
Master slave0 slave1:
jps
The master shows a NameNode process, and slave0 and slave1 show DataNode processes, which means the cluster started successfully.
In a browser on a client machine you can also test:
http://masterIP:50070
Finally, an FAQ summary:
1: Master: cd /opt/dfs/name/current ; cat VERSION
Slave0, Slave1: cd /opt/dfs/data/current ; cat VERSION
Compare whether the clusterIDs on the master and slaves are consistent (consistent means they belong to one cluster).
If they are inconsistent:
Master: cd /opt/hadoop-2.6.4/sbin ; ./stop-dfs.sh
Master, Slave0, Slave1: rm -rf /opt/dfs
Master: hdfs namenode -format (or hadoop namenode -format), then restart the cluster: ./start-dfs.sh
2: Check whether the firewalls on the three machines are off:
firewall-cmd --state
systemctl stop firewalld.service
systemctl disable firewalld.service
3: Things that must be checked beforehand: turn off the firewall, set the IP, set the hostname, set up passwordless ssh login, install the jdk, install hadoop, and configure /etc/hosts.
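The clusterID comparison in FAQ item 1 can be scripted; a hedged sketch, using fabricated VERSION files in a temp directory for illustration (on a real cluster the files are /opt/dfs/name/current/VERSION on the master and /opt/dfs/data/current/VERSION on each slave, the latter read over ssh):

```shell
dir=$(mktemp -d)
mkdir -p "$dir/name/current" "$dir/data/current"
# Fabricated VERSION files standing in for the real ones
echo "clusterID=CID-1234" > "$dir/name/current/VERSION"
echo "clusterID=CID-1234" > "$dir/data/current/VERSION"
# Extract and compare the clusterID lines
n=$(grep clusterID "$dir/name/current/VERSION")
d=$(grep clusterID "$dir/data/current/VERSION")
if [ "$n" = "$d" ]; then result="same cluster"; else result="mismatch - wipe /opt/dfs and reformat"; fi
echo "$result"
```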