Building a Hadoop Environment: Fully Distributed Mode

1. Hadoop installation preparation:

(1) Install the Linux operating system
(2) Turn off the firewall
(3) Install the JDK on Linux
(4) Configure passwordless SSH login among the three servers hadoop2, hadoop3, and hadoop4 (steps (2) and (4) are sketched after this list)
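
For reference, a minimal sketch of steps (2) and (4), assuming CentOS 7 (firewalld) and the same user account on every node:

systemctl stop firewalld            //stop the firewall now
systemctl disable firewalld         //keep it off after reboots

ssh-keygen -t rsa                   //on hadoop2, generate a key pair (accept the default prompts)
ssh-copy-id hadoop2                 //push the public key to every node, including hadoop2 itself
ssh-copy-id hadoop3
ssh-copy-id hadoop4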

2. Extract the archive and configure Hadoop environment variables

(1) Copy the Hadoop installation package to the /opt/software directory
(2) Extract the installation package to the /opt/module directory
command:

tar -zxvf hadoop-2.8.4.tar.gz -C /opt/module          //extract hadoop-2.8.4.tar.gz into /opt/module

(3) Configure environment variables
a. Edit the environment variable configuration file
command:

vi ~/.bash_profile          //edit the environment variable file

Add the following content:

HADOOP_HOME=/opt/module/hadoop-2.8.4
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH

b. Reload the environment variables
command:

source ~/.bash_profile
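
To confirm the variables took effect, hadoop version is a quick check:

hadoop version          //should print the Hadoop 2.8.4 version banner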

3. Modify the configuration files

(1) Modify hadoop-env.sh
command:

cd /opt/module/hadoop-2.8.4/etc/hadoop       //enter the configuration directory (all files in this step live here)

vi hadoop-env.sh                             //edit the hadoop-env.sh file

Modify the content as follows:

export JAVA_HOME=/opt/module/jdk1.8.0_144          //set JAVA_HOME to your own JDK path (around line 25 of the file)
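
If you prefer a non-interactive edit, a one-line sed sketch (assuming the stock hadoop-env.sh, which ships with an export JAVA_HOME=... line):

sed -i 's#^export JAVA_HOME=.*#export JAVA_HOME=/opt/module/jdk1.8.0_144#' hadoop-env.sh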

(2) Modify hdfs-site.xml
command:

vi hdfs-site.xml                          //edit the hdfs-site.xml file

Modify the content as follows:

<!-- Replication factor for data blocks; the default is 3 -->
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop3:50090</value>
</property>
<!-- HDFS permission checking; the default is true -->
<!--
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>  
-->

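Note that every Hadoop configuration file keeps its properties inside a single <configuration> root element; the snippets in this guide show only what to add inside it. The complete hdfs-site.xml would therefore look roughly like:

<?xml version="1.0"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop3:50090</value>
    </property>
</configuration>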

(3) Modify core-site.xml
command:

vi core-site.xml

Modify the content as follows:

<!-- HDFS master node (NameNode) address; 9000 is the RPC communication port -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop2:9000</value>
</property> 
<!-- Directory where HDFS stores data blocks and metadata; be sure to change this -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.8.4/data/tmp</value>      
</property> 

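The format step in section 4 should create this directory tree on its own, but creating it up front does no harm; a sketch:

mkdir -p /opt/module/hadoop-2.8.4/data/tmp        //create the data directory configured above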

(4) Modify mapred-site.xml (this file does not exist by default; copy it from mapred-site.xml.template first)

command:

cp mapred-site.xml.template mapred-site.xml       //create mapred-site.xml from the template

vi mapred-site.xml             //edit the mapred-site.xml file

Modify the content as follows:

<!-- Framework that runs MapReduce jobs: Yarn -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>      
</property>            


(5) Modify yarn-site.xml

command:

vi yarn-site.xml 

Modify the content as follows:

<!-- Yarn master node (ResourceManager) -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop2</value>      
</property>
<!-- Auxiliary service the NodeManager uses to run MR jobs: shuffle -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>      
</property>


(6) Modify slaves

command:

vi slaves

Modify the content as follows, listing each slave node hostname on its own line with nothing else (hadoop3 and hadoop4 act as the slave nodes):

hadoop3
hadoop4

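(In Hadoop 3.x this file is named workers instead of slaves; the format is the same.)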

4. Format the HDFS NameNode (note: run the format on the server where the NameNode lives, i.e., format on hadoop2)

command:

cd /opt/module/hadoop-2.8.4/data/tmp         //the HDFS data block and metadata directory configured in step 3 (the cd is optional; the format can run from any directory)

hdfs namenode -format                  //format the NameNode

If the format succeeds, the output includes a line such as:

Storage: Storage directory /opt/module/hadoop-2.8.4/data/tmp/dfs/name has been successfully formatted


Note
To reformat: first stop the cluster, then delete the tmp folder pointed to by hadoop.tmp.dir, recreate an empty tmp folder, reformat, and finally start the cluster again (see the sketch below).
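
A sketch of that reformat sequence, assuming the hadoop.tmp.dir configured in step 3:

stop-all.sh                                       //stop the whole cluster first
rm -rf /opt/module/hadoop-2.8.4/data/tmp          //delete the old data directory
mkdir -p /opt/module/hadoop-2.8.4/data/tmp        //recreate it empty
hdfs namenode -format                             //reformat the NameNode
start-all.sh                                      //start the cluster again

If the DataNodes have already run, clear the same data directory on hadoop3 and hadoop4 too; otherwise they will refuse to join the newly formatted NameNode because of a cluster ID mismatch.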

5. Use scp to copy the configured Hadoop from hadoop2 to the other two machines:

command:

//copy to hadoop3
scp -r /opt/module/hadoop-2.8.4/ root@hadoop3:/opt/module/

//copy to hadoop4
scp -r /opt/module/hadoop-2.8.4/ root@hadoop4:/opt/module/
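
Note that step 2 set HADOOP_HOME and PATH only on hadoop2. If you want to run hadoop commands on the other nodes as well, one way (assuming the same user and paths everywhere) is to copy the profile too:

scp ~/.bash_profile root@hadoop3:~/        //then run source ~/.bash_profile on hadoop3
scp ~/.bash_profile root@hadoop4:~/        //then run source ~/.bash_profile on hadoop4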

6. Start the Hadoop cluster

(1) Start
command:

start-all.sh             //run on hadoop2, since this machine is the master node


(2) Verify the startup:
Run jps on each of hadoop2, hadoop3, and hadoop4 and compare the Java processes with the cluster plan; a sketch of the expected output follows.
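Given the configuration above (NameNode and ResourceManager on hadoop2, SecondaryNameNode on hadoop3, DataNode and NodeManager on hadoop3 and hadoop4), jps should report roughly the following processes; the numeric process IDs will vary:

jps        //on hadoop2
NameNode
ResourceManager
Jps

jps        //on hadoop3
DataNode
NodeManager
SecondaryNameNode
Jps

jps        //on hadoop4
DataNode
NodeManager
Jps

You can also open the NameNode web UI at http://hadoop2:50070 and the Yarn web UI at http://hadoop2:8088 (the Hadoop 2.x defaults) to confirm that both slave nodes have registered.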

If the processes match the plan above, the fully distributed Hadoop installation is successful.
