Hadoop Installation and Configuration on CentOS (Part 2)

1> Install the JDK package

tar -zxvf jdk1.8.0_162.tar.gz

Install the Hadoop package

tar -zxvf hadoop-2.6.5.tar.gz

Configure the environment variables (typically appended to /etc/profile, then reloaded with "source /etc/profile"):

 

export JAVA_HOME=/usr/soft/jdk1.8.0_162

export JRE_HOME=$JAVA_HOME/jre

export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

 

export HADOOP_HOME=/usr/soft/hadoop-2.6.5

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
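The two PATH exports above compose, so Hadoop's bin/sbin end up ahead of the JDK's directories in the search order. A minimal sketch to see the resulting order (paths are the article's example locations):

```shell
# Rebuild PATH the same way the two export lines above do,
# then print the first four entries to see the final search order.
JAVA_HOME=/usr/soft/jdk1.8.0_162
JRE_HOME=$JAVA_HOME/jre
HADOOP_HOME=/usr/soft/hadoop-2.6.5
PATH="$JAVA_HOME/bin:$JRE_HOME/bin:$PATH"
PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
echo "$PATH" | tr ':' '\n' | head -4
```

Because the Hadoop export runs last, its bin and sbin directories are searched first.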

 

 

2> In the Hadoop installation directory, the share directory holds the files that support development.

share/doc is the documentation; it has no effect on development or deployment and can be deleted to save disk space.

share/hadoop contains the Java libraries (jars) used for development.

 

In the Hadoop installation directory, the sbin directory holds the cluster control scripts, such as the start and stop scripts.

 

3> In the Hadoop installation directory, etc/hadoop holds all of Hadoop's configuration files.

Things to pay attention to:

(1)hadoop-env.sh

    export JAVA_HOME=/usr/soft/jdk1.8.0_162
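The stock hadoop-env.sh often leaves the line as "export JAVA_HOME=${JAVA_HOME}", which fails when daemons are launched over non-interactive ssh, so the path must be hard-coded. A hedged sketch that patches the line non-interactively (demonstrated on a throwaway file so it is safe to run anywhere; point ENV_FILE at etc/hadoop/hadoop-env.sh for the real edit):

```shell
# Replace the JAVA_HOME line in a hadoop-env.sh-style file with a fixed path.
ENV_FILE=$(mktemp)
echo 'export JAVA_HOME=${JAVA_HOME}' > "$ENV_FILE"   # stand-in for the stock default line
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/soft/jdk1.8.0_162|' "$ENV_FILE"
grep '^export JAVA_HOME=' "$ENV_FILE"
```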

 

 

(2) core-site.xml -- Hadoop's common (site-wide) configuration

<configuration>
  <!-- the default filesystem Hadoop runs on -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop02:9000/</value>
  </property>
  <!-- base directory for Hadoop's working/temporary data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/soft/hadoop-2.6.5/tmp/</value>
  </property>
</configuration>
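To sanity-check a value without starting any daemon, the property can simply be grepped back out of the file. A rough sketch that assumes the one-tag-per-line layout shown above (demonstrated on a heredoc copy; point CONF at etc/hadoop/core-site.xml on a real install):

```shell
# Extract the value of fs.defaultFS from a core-site.xml-style file.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop02:9000/</value>
  </property>
</configuration>
EOF
grep -A1 '<name>fs.defaultFS</name>' "$CONF" | sed -n 's|.*<value>\(.*\)</value>.*|\1|p'
```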

 

(3) hdfs-site.xml

<configuration>
  <!-- number of replicas kept for each HDFS block; 1 is enough for a single-node setup -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

 

 

(4) mapred-site.xml -- this file does not exist by default and has to be created from the template that ships alongside it:

cp mapred-site.xml.template  mapred-site.xml
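A slightly safer variant of the copy: only create mapred-site.xml when it does not already exist, so an already-tuned configuration is never overwritten. Demonstrated in a temporary directory; run the same guard inside etc/hadoop on a real install:

```shell
# Create mapred-site.xml from the template only if it is missing.
CONF_DIR=$(mktemp -d)
touch "$CONF_DIR/mapred-site.xml.template"   # stand-in for the shipped template
[ -f "$CONF_DIR/mapred-site.xml" ] || cp "$CONF_DIR/mapred-site.xml.template" "$CONF_DIR/mapred-site.xml"
ls "$CONF_DIR"
```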

 

<configuration>
  <!-- the execution framework MapReduce runs on (here: YARN) -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

 

(5) yarn-site.xml

<configuration>
  <!-- hostname of the YARN master node (the ResourceManager) -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop02</value>
  </property>
  <!-- auxiliary service the NodeManagers run so MapReduce can shuffle data -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

 

(6) slaves -- the list of slave (worker) nodes, one hostname per line

hadoop02

 

 

---------------------------------------------------------------------------

4> Starting and stopping

 

First, format the HDFS filesystem (needed only once, before the first start):

hdfs namenode -format

 

Manual start:

 

start-dfs.sh

start-yarn.sh

mr-jobhistory-daemon.sh start historyserver

 

Manual stop:

 

mr-jobhistory-daemon.sh stop historyserver

stop-yarn.sh

stop-dfs.sh

--------------------------------------------------------------------------

How to add a new node on the fly (without stopping the cluster):

(1) Add the new machine's hostname to the slaves configuration file

(2) Install the JDK, Hadoop, and SSH on the new node

(3) Copy the cluster's Hadoop configuration to the new node, overwriting its defaults

(4) Start only the datanode service on the new node:

hadoop-daemon.sh start datanode
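The steps above can be sketched as a script. hadoop03 is a hypothetical new node name; the remote steps assume passwordless ssh and identical install paths, so they are left as comments. The slaves edit is demonstrated on a throwaway file (point SLAVES_FILE at etc/hadoop/slaves on the real master):

```shell
# Register a new worker in the slaves file, skipping duplicates.
SLAVES_FILE=$(mktemp)
echo 'hadoop02' > "$SLAVES_FILE"   # stand-in for the current single-node contents
NEW_NODE=hadoop03                  # hypothetical new machine name
grep -qx "$NEW_NODE" "$SLAVES_FILE" || echo "$NEW_NODE" >> "$SLAVES_FILE"
cat "$SLAVES_FILE"
# Then, on the real cluster:
#   scp -r /usr/soft/hadoop-2.6.5/etc/hadoop/* hadoop03:/usr/soft/hadoop-2.6.5/etc/hadoop/
#   ssh hadoop03 'hadoop-daemon.sh start datanode'
```

The grep guard makes the script safe to rerun: a node already listed in slaves is never appended twice.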

 

 

 

 

 

 
