1> Install the JDK package
tar -zxvf jdk1.8.0_162.tar.gz
Install the Hadoop package
tar -zxvf hadoop-2.6.5.tar.gz
Configure the environment variables (e.g. in /etc/profile or ~/.bashrc, then reload with source):
export JAVA_HOME=/usr/soft/jdk1.8.0_162
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HADOOP_HOME=/usr/soft/hadoop-2.6.5
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
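After reloading the profile, the exports above can be sanity-checked with a short sketch. It assumes nothing beyond a POSIX shell; it only reports whether each tool resolves on PATH:

```shell
# Check that the JDK and Hadoop binaries are reachable after reloading the
# profile; collects one status line per tool and prints the report.
status=""
for tool in java hadoop; do
  if command -v "$tool" >/dev/null 2>&1; then
    status="$status$tool found at $(command -v "$tool")\n"
  else
    status="$status$tool not on PATH yet\n"
  fi
done
printf "%b" "$status"
```

If either tool reports "not on PATH yet", re-check the export lines and re-source the profile.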
2> In the Hadoop installation directory, the share directory holds the libraries that support development
share/doc is documentation only; it has no impact on development or deployment and can be deleted to save disk space
share/hadoop contains the Java jar packages needed for development
The sbin directory holds the cluster control scripts, such as the start/stop scripts
3> The etc/hadoop directory under the installation directory holds the Hadoop configuration files
Files to pay attention to:
(1)hadoop-env.sh
export JAVA_HOME=/usr/soft/jdk1.8.0_162
(2) core-site.xml -- common configuration for running Hadoop
<configuration>
<property> -- configures the default filesystem that hadoop runs on
<name>fs.defaultFS</name>
<value>hdfs://hadoop02:9000/</value>
</property>
<property> -- Hadoop's working directory (base path for temporary data)
<name>hadoop.tmp.dir</name>
<value>/usr/soft/hadoop-2.6.5/tmp/</value>
</property>
</configuration>
(3)hdfs-site.xml
<configuration>
<property> -- number of replicas kept for each HDFS block
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
(4) mapred-site.xml -- this file does not exist by default and must be created from its template:
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property> -- resource management framework that MapReduce runs on
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(5)yarn-site.xml
<configuration>
<property> -- hostname of the YARN master node (the ResourceManager)
<name>yarn.resourcemanager.hostname</name>
<value>hadoop02</value>
</property>
<property> -- auxiliary service the NodeManager runs for the MapReduce shuffle
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
(6) slaves -- lists the slave node hostnames, one per line
hadoop02
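Configuration like the core-site.xml fragment above can also be generated from a heredoc, which is handy when preparing files for a new node. A minimal sketch; CONF_DIR is a scratch path chosen here for illustration (on a real node it would be $HADOOP_HOME/etc/hadoop):

```shell
# Write the core-site.xml shown above into a scratch directory.
# CONF_DIR is illustrative only; adjust it for a real deployment.
CONF_DIR="${CONF_DIR:-/tmp/hadoop-conf-demo}"
mkdir -p "$CONF_DIR"
# Quoted 'EOF' prevents variable expansion, so the XML is written verbatim.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop02:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/soft/hadoop-2.6.5/tmp/</value>
  </property>
</configuration>
EOF
grep -q 'fs.defaultFS' "$CONF_DIR/core-site.xml" && echo "core-site.xml written"
```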
---------------------------------------------------------------------------
4> Start/Stop
First, format the HDFS filesystem:
hdfs namenode -format
Manual start:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
Manual stop:
mr-jobhistory-daemon.sh stop historyserver
stop-yarn.sh
stop-dfs.sh
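Whether the daemons actually came up after the start commands can be checked with jps, which ships with the JDK. A hedged sketch; the names below are the standard Hadoop 2.x daemon process names for this single-node setup:

```shell
# Report which of the expected Hadoop 2.x daemons are currently running.
# If jps is missing, the JDK is probably not on PATH.
expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer"
report=""
if command -v jps >/dev/null 2>&1; then
  running=$(jps)
  for d in $expected; do
    if echo "$running" | grep -q "$d"; then
      report="$report$d: up\n"
    else
      report="$report$d: down\n"
    fi
  done
else
  report="jps not found (is the JDK on PATH?)\n"
fi
printf "%b" "$report"
```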
--------------------------------------------------------------------------
How to add a node to a running cluster:
(1) Add the new machine's hostname to the slaves configuration file
(2) Install the JDK, Hadoop, and SSH on the new node
(3) Overwrite the new node's Hadoop configuration with the cluster's configuration
(4) Then just start the DataNode service on it:
hadoop-daemon.sh start datanode
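Steps (1), (3), and (4) above can be sketched as commands run from the master node (step (2), installing the software on the new node, stays manual). hadoop03 is a hypothetical new node name, and passwordless ssh to it is assumed; with DRY_RUN=1 (the default here) the commands are only printed, not executed:

```shell
# Hedged sketch of expanding the cluster. hadoop03 is hypothetical;
# DRY_RUN=1 echoes each command instead of running it.
NEW_NODE="hadoop03"
CONF="${HADOOP_HOME:-/usr/soft/hadoop-2.6.5}/etc/hadoop"
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }
run sh -c "echo $NEW_NODE >> $CONF/slaves"           # (1) register the node
run scp -r "$CONF" "$NEW_NODE:$CONF"                 # (3) push the cluster config
run ssh "$NEW_NODE" hadoop-daemon.sh start datanode  # (4) start only the DataNode
```

Set DRY_RUN=0 only after verifying the printed commands match your environment.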