Configuring Hadoop on CentOS 6.5

1. Hadoop cluster setup (self-study notes, CentOS 6.5)

1.1. Configure passwordless SSH access between the virtual machines

ssh-keygen -t rsa              # generate an RSA key pair (the public key is what gets distributed)
ssh-copy-id root@<target-ip>   # copy the local public key to the VM at the target IP
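
A quick check that the key actually works (root@<target-ip> being the same host targeted by ssh-copy-id above):

ssh root@<target-ip>   # should log in without prompting for a password
exit                   # return to the local machine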

1.2. Configure hostname-to-IP mapping between the virtual machines (access by hostname instead of IP)

vi /etc/hosts  # map hostnames to IP addresses
# for example
192.168.32.11 vm1
192.168.32.12 vm2
192.168.32.13 vm3
192.168.32.14 vm4
192.168.32.15 vm5
192.168.32.16 vm6
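
To confirm the mapping took effect, ping one of the hostnames (vm2 here is just one of the entries above):

ping -c 3 vm2   # should resolve to 192.168.32.12 and receive replies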

1.3. Install hadoop

Upload the hadoop package from the local machine:

rz   # upload a file; Xshell pops up a dialog to pick the file to upload

Extract the hadoop package to the /local/ directory (the target directory is up to you):

tar -zxvf xxxxx -C /local/

The extracted folder usually carries a version suffix; rename it however you prefer:

cd /local/              # enter the directory hadoop was extracted into
mv hadoopx.x.x hadoop   # rename the folder to drop the version suffix

Configure the hadoop environment variables:

vi /etc/profile
# append at the end of the file
export HADOOP_HOME=/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
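
After saving, reload the profile so the new variables take effect in the current shell, and check that the hadoop command is found (assuming the package was extracted to /local/hadoop as above):

source /etc/profile
hadoop version   # should print the installed hadoop version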

(1) Changes to core-site.xml
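
The XML files edited in this and the following steps all live in hadoop's configuration directory, which in the 2.x layout is etc/hadoop under the install directory:

cd /local/hadoop/etc/hadoop   # also holds hdfs-site.xml, yarn-site.xml, mapred-site.xml
vi core-site.xml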

<configuration>
	<property>
		<name>fs.defaultFS</name>
        <!-- this machine's hostname : port -->
       	<value>hdfs://vm1:9000</value>
    </property>
	<property>
		<name>hadoop.tmp.dir</name>
        <value>/local/hadoop/dfs/tmp</value>
	</property>
	<property>
		<name>io.file.buffer.size</name>
        <value>131072</value>
	</property>
</configuration>
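
Once the file is in place, hdfs getconf is a quick way to confirm which value hadoop actually picks up:

hdfs getconf -confKey fs.defaultFS   # should print hdfs://vm1:9000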

(2) Changes to hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>vm1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>vm2:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///local/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///local/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
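
The name and data directories referenced above will normally be created when the NameNode is formatted and the DataNodes first start, but creating them up front makes path typos easier to spot (optional):

mkdir -p /local/hadoop/dfs/name /local/hadoop/dfs/data /local/hadoop/dfs/tmp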

(3) Changes to yarn-site.xml

<configuration>
    <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>vm1</value>
    </property>
    <property>
            <name>yarn.resourcemanager.address</name>
            <value>vm1:8032</value>
    </property>
    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
    		<value>vm1:8031</value>
    </property>
    <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>vm1:8030</value>
    </property>
    <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>vm1:8033</value>
    </property>
    <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>vm1:8088</value>
    </property>
    <property>
    		<name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>
    <property>
    		<name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
    </property>
    <property>
    		<name>yarn.nodemanager.vmem-pmem-ratio</name>
            <value>6</value>
    <description>Ratio of virtual memory to physical memory allowed for each task</description>
    </property>
</configuration>

(4) Changes to mapred-site.xml

By default this file ships with a .template suffix, so rename it first:

mv mapred-site.xml.template mapred-site.xml

Then change its contents:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>vm1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>vm1:19888</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
    	  <value>
    	    /local/hadoop/etc/*,
            /local/hadoop/etc/hadoop/*,
            /local/hadoop/lib/*,
            /local/hadoop/share/hadoop/common/*,
            /local/hadoop/share/hadoop/common/lib/*,
            /local/hadoop/share/hadoop/mapreduce/*,
            /local/hadoop/share/hadoop/mapreduce/lib/*,
            /local/hadoop/share/hadoop/hdfs/*,
            /local/hadoop/share/hadoop/hdfs/lib/*,
            /local/hadoop/share/hadoop/yarn/*,
            /local/hadoop/share/hadoop/yarn/lib/*
        </value>
    </property>
    <property>
        <name>mapred.remote.os</name>
        <value>Linux</value>
    </property>
</configuration>
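
One file not shown above is the worker list: in the Hadoop 2.x layout, the hosts that should run DataNode and NodeManager go into etc/hadoop/slaves, one per line (a sketch assuming vm2 and vm3 are the workers, matching the scp targets below):

cd /local/hadoop/etc/hadoop
echo "vm2" >  slaves   # replace the default "localhost" entry
echo "vm3" >> slaves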

Copy the hadoop directory to the other nodes (all three machines use the same path, so there is no need to redo the configuration on each one):

scp -r hadoop/ vm3:/local/
scp -r hadoop/ vm2:/local/
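
The /etc/hosts mapping and the /etc/profile additions made earlier also have to exist on every node; if they were only made on the first machine, copy or repeat them there as well (assuming root access to the other VMs as above):

scp /etc/hosts vm2:/etc/hosts
scp /etc/hosts vm3:/etc/hosts
# then add the same HADOOP_HOME / PATH exports to /etc/profile on vm2 and vm3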

(5) Format the NameNode and start the cluster on the first machine

Enter the hadoop root directory:

bin/hdfs namenode -format              # format the NameNode
chkconfig --level 2345 iptables off    # disable the firewall (simply turned off here for convenience)
sbin/start-dfs.sh                      # start HDFS
sbin/start-yarn.sh                     # start YARN
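
Once the start scripts finish, jps (shipped with the JDK) shows which daemons actually came up; with the configuration above you would expect NameNode and ResourceManager on vm1, SecondaryNameNode on vm2, and DataNode plus NodeManager on the worker nodes:

jps   # run on each node to list the running Hadoop JVM processes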

Now open a browser and visit the address configured on the first machine: http://192.168.32.11:50070/
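
The ResourceManager web UI configured in yarn-site.xml should be reachable as well; both can also be checked from the command line (same host/IP mapping as above):

curl -I http://192.168.32.11:50070/   # NameNode web UI
curl -I http://192.168.32.11:8088/    # ResourceManager (YARN) web UI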

The configuration is complete. (I'm still a beginner, so corrections are welcome.)

Origin: blog.csdn.net/qq_37771811/article/details/104817109