Big data platform construction (4): Installing Hadoop 2.6.4

Upload the Hadoop tarball, decompress it, and test that it works

tar zxvf hadoop-2.6.4.tar.gz
*decompress the archive*
rm hadoop-2.6.4.tar.gz
*delete the compressed package*
cd /opt/hadoop-2.6.4/bin
*enter the bin directory of the extracted package*
./hadoop version
*test whether this version works*

Add environment variables so the hadoop command is globally available:

sudo vi ~/.bashrc

In vi, add the following lines at the very bottom of the file:


export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin
source ~/.bashrc
*apply the environment variables*
hadoop version
*test: if no error is reported, it succeeded*
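As an extra sanity check (a small sketch; the path assumes the install location above), confirm that the variables resolve correctly:

echo $HADOOP_HOME
*should print /opt/hadoop-2.6.4*
which hadoop
*should point into /opt/hadoop-2.6.4/bin*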

Master:
Send the decompressed Hadoop folder from the master to slave0 and slave1, together with the .bashrc:

cd /opt/
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc

slave0, slave1:
Apply the environment variables:

source ~/.bashrc

Test:

hadoop version

Fix the ownership of /opt (so the hadoop user can write to it):

cd /
sudo chown hadoop.hadoop opt/

Cluster construction:

Modify permissions on the master in the same way:

cd /
sudo chown hadoop.hadoop opt/

Configure the cluster:
Modify core-site.xml
1. Enter the configuration directory:

cd /opt/hadoop-2.6.4/etc/hadoop

2. Edit core-site.xml

vi core-site.xml 

<!-- Specify the address of the HDFS NameNode -->
<property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
</property>

<!-- Disable HDFS permission checking -->
<property>
        <name>dfs.permissions</name>
        <value>false</value>
</property>
3. Edit hdfs-site.xml

sudo vi hdfs-site.xml 
<!-- DataNode data directory -->
<property>
        <name>dfs.data.dir</name>
        <value>/opt/dfs/data</value>
</property>

<!-- NameNode metadata directory -->
<property>
        <name>dfs.name.dir</name>
        <value>/opt/dfs/name</value>
</property>

<!-- Number of HDFS replicas -->
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>

<!-- SecondaryNameNode HTTP address -->
<property>
        <name>dfs.secondary.http.address</name>
        <value>master:50090</value>
</property>
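The dfs.name.dir and dfs.data.dir directories above are normally created when the NameNode is formatted and the DataNodes start, but it does no harm to create them up front on every node (paths taken from the configuration above):

mkdir -p /opt/dfs/name /opt/dfs/data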

4. Create mapred-site.xml from the mapred-site.xml.template template file

cp mapred-site.xml.template mapred-site.xml

5. Edit mapred-site.xml

vi mapred-site.xml
<!-- JobTracker address (legacy MRv1 setting) -->
<property>
        <name>mapred.job.tracker</name>
        <value>master:9001</value>
</property>

<!-- Tell Hadoop to run MapReduce on YARN -->
<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
</property>

<!-- JobHistory server address -->
<property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
</property>
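Because mapreduce.jobhistory.address is configured, the JobHistory server can also be started once the cluster is running; start-all.sh does not start it for you:

/opt/hadoop-2.6.4/sbin/mr-jobhistory-daemon.sh start historyserver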

6. Edit yarn-site.xml

vi yarn-site.xml
<!-- ResourceManager hostname -->
<property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
</property>

<!-- Fetch intermediate data via shuffle -->
<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
</property>

<!-- Enable log aggregation -->
<property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
</property>
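With yarn.log-aggregation-enable set to true, the logs of a finished application can later be pulled back from the master with the yarn CLI (the application ID can be read from the ResourceManager web UI at http://master:8088):

yarn logs -applicationId <application_id>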

7. Configure hadoop-env.sh

vi hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101

8. Configure mapred-env.sh

vi mapred-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101
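A quick check that the JAVA_HOME written into both files actually points at a JDK (path taken from the lines above):

ls /opt/jdk1.8.0_101/bin/java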

9. Configure the slaves file, which lists the machines that act as slaves.

vi slaves

Delete the original localhost and add:

slave0
slave1

10. Configure masters

vi masters
master

11. After the configuration on the master is complete, transfer the entire Hadoop folder from the master to the same location on slave0 and slave1:

cd /opt
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc
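To spot-check that the configured files actually arrived on the slaves (any of the edited files will do; core-site.xml is used here as an example):

ssh hadoop@slave0 grep -A1 fs.default.name /opt/hadoop-2.6.4/etc/hadoop/core-site.xml
ssh hadoop@slave1 grep -A1 fs.default.name /opt/hadoop-2.6.4/etc/hadoop/core-site.xml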

slave0:

source ~/.bashrc

slave1:

source ~/.bashrc

Test:
slave0:

hadoop version

slave1:

hadoop version

If the version number is displayed, everything is OK.




Start the cluster:

1. Format the NameNode:

hadoop namenode -format

2. Start the cluster:

cd /opt/hadoop-2.6.4/sbin
./start-all.sh

Check the processes with jps.
If no NameNode process appears, the startup failed.

./stop-all.sh

On inspection, core-site.xml had been written incorrectly. After correcting it on the master, copy it to the slaves:

cd /opt/hadoop-2.6.4/etc/hadoop

scp core-site.xml hadoop@slave0:/opt/hadoop-2.6.4/etc/hadoop/
scp core-site.xml hadoop@slave1:/opt/hadoop-2.6.4/etc/hadoop/

Test again:

hadoop namenode -format
cd  /opt/hadoop-2.6.4/sbin
./start-all.sh 

Master slave0 slave1:

jps

If the master shows a NameNode process and slave0 and slave1 each show a DataNode process, the cluster has started successfully.
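Assuming the default daemons started by start-all.sh with this configuration, jps should show roughly the following (plus Jps itself; PIDs omitted):

master: NameNode, SecondaryNameNode, ResourceManager
slave0 / slave1: DataNode, NodeManager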

You can also check from a browser on a client machine:
http://<master IP>:50070
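The same information is available on the master from the command line; a quick way to confirm that both DataNodes have registered:

hdfs dfsadmin -report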

Finally, an FAQ summary:

1: Check the clusterID on each node.

Master:

cd /opt/dfs/name/current
cat VERSION

slave0, slave1:

cd /opt/dfs/data/current
cat VERSION

Compare whether the clusterIDs on the master and the slaves are consistent (consistent means they belong to the same cluster). If they are inconsistent:

Master:

cd /opt/hadoop-2.6.4/sbin
./stop-dfs.sh

Master, slave0, slave1:

rm -rf /opt/dfs

Master:

hdfs namenode -format   (or hadoop namenode -format)

then restart the cluster:

./start-dfs.sh
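A quicker way to compare the IDs (same VERSION files as above):

grep clusterID /opt/dfs/name/current/VERSION
*on the master*
grep clusterID /opt/dfs/data/current/VERSION
*on each slave*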

2: Check whether the firewalls on all three machines are turned off:

firewall-cmd --state
systemctl stop firewalld.service
systemctl disable firewalld.service

3: The following should all have been done beforehand: turn off the firewall, set the IP, set the hostname, configure /etc/hosts, set up passwordless SSH login, install the JDK, install Hadoop.
