Hadoop 2.5.2 cluster deployment

1. Environment

When reprinting, please credit the source: http://eksliang.iteye.com/blog/2223784

Prepare three virtual machines and install 64-bit CentOS on each.

  • 192.168.177.131 mast1.com mast1
  • 192.168.177.132 mast2.com mast2
  • 192.168.177.133 mast3.com mast3
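For the hostnames above to resolve on every node, each machine's /etc/hosts typically needs matching entries (a sketch; adjust if you use DNS instead):

```
192.168.177.131 mast1.com mast1
192.168.177.132 mast2.com mast2
192.168.177.133 mast3.com mast3
```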

Here mast1 acts as the NameNode; mast2 and mast3 act as DataNode nodes.

 

 

2. Preparations before installation

  1. Install the JDK
  2. Create a hadoop user on each machine and configure passwordless SSH public-key login

These steps are omitted here; for configuring passwordless SSH public-key login, see: http://eksliang.iteye.com/blog/2187265
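In outline, the key setup on mast1 as the hadoop user looks like the following sketch; it is not a substitute for the linked walkthrough, and it assumes the mast2/mast3 hostnames resolve:

```shell
# Generate a keypair with an empty passphrase (run once as the hadoop user on mast1)
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Append the public key to each DataNode's authorized_keys
ssh-copy-id hadoop@mast2
ssh-copy-id hadoop@mast3
# Verify: this should log in and print the hostname without prompting for a password
ssh hadoop@mast2 hostname
```

These commands require the remote hosts to be up, so run them interactively rather than from a script the first time (ssh-copy-id will prompt for the password once).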

 

3. Start deployment

3.1. Download Hadoop 2.5.2

Download address: http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.5.2/

 

3.2. Configure hadoop-2.5.2/etc/hadoop

First configure the machine mast1. After configuration, copy the configuration environment to mast2 and mast3.

3.2.1. core-site.xml
<configuration>  
    <property>  
        <name>fs.defaultFS</name>  
        <value>hdfs://mast1:9000</value>  
    </property>
    
    <property>  
        <name>io.file.buffer.size</name>  
        <value>4096</value>  
    </property>  
</configuration>  

 

  • fs.defaultFS: the URI of the default file system, i.e. the NameNode's RPC address
  • io.file.buffer.size: the buffer size used when reading and writing files
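As a quick sanity check of such a property, the value can be pulled out of the XML with standard tools; a sketch against a throwaway copy of the file (the /tmp path is only for illustration, the real file is etc/hadoop/core-site.xml):

```shell
# Create a scratch copy mirroring the fs.defaultFS property above (illustration only)
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mast1:9000</value>
  </property>
</configuration>
EOF

# Grab the line after the property name and strip the <value> tags
grep -A1 '<name>fs.defaultFS</name>' /tmp/core-site-sample.xml \
  | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
# prints: hdfs://mast1:9000
```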
3.2.2. hdfs-site.xml
<configuration>  
    <property>  
        <name>dfs.nameservices</name>  
        <value>ns</value>  
    </property>  

    <property>
	<name>dfs.namenode.http-address</name>
	<value>mast1:50070</value>
    </property>

    <property>  
        <name>dfs.namenode.secondary.http-address</name>  
        <value>mast1:50090</value>  
    </property>  
    
    <property>  
        <name>dfs.namenode.name.dir</name>  
        <value>file:///home/hadoop/workspace/hdfs/name</value>  
    </property>  
    <property>  
        <name>dfs.datanode.data.dir</name>  
        <value>file:///home/hadoop/workspace/hdfs/data</value>  
    </property>  
    <property>  
        <name>dfs.replication</name>  
        <value>2</value>  
    </property>
  
    <property>  
        <name>dfs.webhdfs.enabled</name>  
        <value>true</value>  
    </property>  
</configuration>

 

  • dfs.namenode.secondary.http-address: the HTTP address of the SecondaryNameNode service
  • dfs.webhdfs.enabled: enables the WebHDFS (REST API) feature on the NameNode and DataNodes
3.2.3. mapred-site.xml
<configuration>  
    <property>  
        <name>mapreduce.framework.name</name>  
        <value>yarn</value>  
    </property>  
    <property>  
        <name>mapreduce.jobtracker.http.address</name>  
        <value>mast1:50030</value>  
    </property>  
    <property>  
        <name>mapreduce.jobhistory.address</name>  
        <value>mast1:10020</value>  
    </property>  
    <property>  
        <name>mapreduce.jobhistory.webapp.address</name>  
        <value>mast1:19888</value>  
    </property>  
</configuration>

 

  • mapreduce.jobhistory.address: the IPC port of the MapReduce JobHistory server
  • mapreduce.jobhistory.webapp.address: the HTTP port of the MapReduce JobHistory web UI
3.2.4. yarn-site.xml
<configuration>  
    <property>  
        <name>yarn.nodemanager.aux-services</name>  
        <value>mapreduce_shuffle</value>  
    </property>
    
     <property>  
        <name>yarn.resourcemanager.scheduler.address</name>  
        <value>mast1:8030</value>  
    </property>
    
    <property>  
        <name>yarn.resourcemanager.resource-tracker.address</name>  
        <value>mast1:8031</value>  
    </property>

    <property>  
        <name>yarn.resourcemanager.address</name>  
        <value>mast1:8032</value>  
    </property>  
    
    <property>  
        <name>yarn.resourcemanager.admin.address</name>  
        <value>mast1:8033</value>  
    </property>  
    <property>  
        <name>yarn.resourcemanager.webapp.address</name>  
        <value>mast1:8088</value>  
    </property>  
</configuration>

 

 

3.2.5. slaves: the file listing the DataNode nodes
mast2
mast3

 

 

3.2.6. Set JAVA_HOME

Add the JAVA_HOME setting to both hadoop-env.sh and yarn-env.sh:

 

#export JAVA_HOME=${JAVA_HOME}   # original line
export JAVA_HOME=/usr/local/java/jdk1.7.0_67

 Although the JAVA_HOME environment variable is already set, Hadoop still complains at startup that it cannot find Java, so there is no way around specifying the absolute path here.
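Hardcoding the path can also be scripted. A minimal sketch on a scratch copy (the /tmp path is illustrative; the real files are hadoop-env.sh and yarn-env.sh under etc/hadoop):

```shell
# Work on a scratch file that mimics the stock hadoop-env.sh line (illustration only)
demo=/tmp/hadoop-env-demo.sh
printf 'export JAVA_HOME=${JAVA_HOME}\n' > "$demo"

# Replace the indirect value with the absolute JDK path
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/java/jdk1.7.0_67|' "$demo"
cat "$demo"   # prints: export JAVA_HOME=/usr/local/java/jdk1.7.0_67
```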

 

 

3.2.7. Configure Hadoop's environment variables; mine are below for reference
[hadoop@Mast1 hadoop]$ vim ~/.bash_profile
export HADOOP_HOME="/home/hadoop/hadoop-2.5.2"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

 Reminder: from 2.5.0 onward, the two environment variables HADOOP_COMMON_LIB_NATIVE_DIR and HADOOP_OPTS must be added, otherwise a minor error is reported when starting the cluster

 

 

3.3. Copy the configuration to mast2 and mast3

Reminder: perform the copy as the hadoop user

scp -r ~/.bash_profile hadoop@mast2:/home/hadoop/
scp -r ~/.bash_profile hadoop@mast3:/home/hadoop/
scp -r $HADOOP_HOME/etc/hadoop hadoop@mast2:/home/hadoop/hadoop-2.5.2/etc/
scp -r $HADOOP_HOME/etc/hadoop hadoop@mast3:/home/hadoop/hadoop-2.5.2/etc/
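The four copies can equally be written as a loop over the DataNode hosts (assumes passwordless ssh for the hadoop user is already in place):

```shell
# Same copies as above, looped over the DataNode hosts
for h in mast2 mast3; do
  scp -r ~/.bash_profile "hadoop@$h:/home/hadoop/"
  scp -r "$HADOOP_HOME/etc/hadoop" "hadoop@$h:/home/hadoop/hadoop-2.5.2/etc/"
done
```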

 

 

3.4. Format the file system

bin/hdfs namenode -format

 

 

3.5. Start and stop HDFS (the file system) and YARN (the resource manager)

#Start the HDFS distributed file system
[hadoop@Mast1 hadoop-2.5.2]$ sbin/start-dfs.sh
#Stop the HDFS distributed file system
[hadoop@Mast1 hadoop-2.5.2]$ sbin/stop-dfs.sh
#Start the YARN resource manager
[hadoop@Mast1 hadoop-2.5.2]$ sbin/start-yarn.sh
#Stop the YARN resource manager
[hadoop@Mast1 hadoop-2.5.2]$ sbin/stop-yarn.sh

 

3.6. Verify startup with jps

#Run jps on mast1 (the NameNode); you should see NameNode and ResourceManager
[hadoop@Mast1 hadoop-2.5.2]$ jps
3428 NameNode
4057 ResourceManager
4307 Jps

#Switch to mast2 or mast3 (a DataNode) and run jps
[hadoop@Mast2 ~]$ jps
2726 DataNode
3154 Jps
3012 NodeManager
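This check can be scripted. A hypothetical helper (the function name and the captured sample output are illustrative) that greps jps output for the expected daemon names:

```shell
# Return success only if every expected daemon name appears in the jps output
check_daemons() {
  local jps_out=$1; shift
  local d
  for d in "$@"; do
    if ! printf '%s\n' "$jps_out" | grep -qw "$d"; then
      echo "MISSING: $d"
      return 1
    fi
  done
  echo "ALL RUNNING"
}

# On a real node you would capture the live output: jps_out=$(jps)
sample='2726 DataNode
3154 Jps
3012 NodeManager'
check_daemons "$sample" DataNode NodeManager   # prints: ALL RUNNING
```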

 

3.7. Verify in the browser

http://mast1:50070/ (NameNode web UI)

http://mast1:8088/ (YARN ResourceManager web UI)

http://mast2:50075/ (DataNode web UI)

 Remarks:

  1. Hadoop 2.5.2's official documentation ships inside the download package, under the share/doc/hadoop directory of the unpacked hadoop-2.5.2 folder; there you can look up all the default settings for core, hdfs, mapreduce and yarn, along with the available operations
  2. A well-written Chinese blog on Hadoop's configuration parameters: http://segmentfault.com/a/1190000000709725#articleHeader2
