Hadoop 3.0.0 cluster setup
- download hadoop package
wget -c http://ftp.jaist.ac.jp/pub/apache/hadoop/common/hadoop-3.0.0/hadoop-3.0.0.tar.gz
- decompress
tar -zxvf hadoop-3.0.0.tar.gz -C /usr/java/
- configure
- Configure environment variables: open the
vim /etc/profile
file and add
export HADOOP_HOME=/usr/java/hadoop-3.0.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
then make it take effect immediately
source /etc/profile
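To confirm the environment took effect, a quick check can be run (a sketch, assuming the install path used above):

```shell
# Re-create the exports from /etc/profile and inspect them; on a real
# node, `hadoop version` additionally confirms the binary resolves via PATH.
export HADOOP_HOME=/usr/java/hadoop-3.0.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
echo "$HADOOP_HOME"              # prints /usr/java/hadoop-3.0.0
echo "$PATH" | grep -o "$HADOOP_HOME/bin"
```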
- configure
/etc/hosts
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
- turn off the firewall (and keep it off across reboots)
systemctl stop firewalld
systemctl disable firewalld
- Configure the
vim core-site.xml
file (in the etc/hadoop directory) and add the following configuration; replace wujinlei with the user that runs Hadoop:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.wujinlei.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.wujinlei.hosts</name>
    <value>*</value>
  </property>
</configuration>
- Configure the
vim hdfs-site.xml
file and add the following configuration. Port 9870 is opened on the master for external access to the web page (NameNode HTTP UI) to view the cluster status; port 9864 is opened on the slave machines for external access to the web pages (DataNode HTTP UI).
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/wujinlei/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/wujinlei/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>master:9870</value>
  </property>
  <property>
    <!-- bind on every datanode, not only the master -->
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:9864</value>
  </property>
</configuration>
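The two local paths in hdfs-site.xml must exist and be writable by the Hadoop user on every node (hdfs namenode -format creates the name directory itself, but the parent must be writable). A minimal sketch; DFS_BASE points at a scratch directory here so the snippet runs anywhere, on the cluster substitute /home/wujinlei/hadoop/dfs:

```shell
# Create the NameNode metadata and DataNode block directories.
# DFS_BASE is a scratch dir for illustration; use the real configured path.
DFS_BASE=$(mktemp -d)
mkdir -p "$DFS_BASE/name" "$DFS_BASE/data"
ls "$DFS_BASE"
```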
- Configure the
vim yarn-site.xml
file and add the following configuration. Port 8088 is opened on the master for external access to the web page, to view cluster task scheduling.
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>
      /usr/java/hadoop-3.0.0/etc/hadoop,
      /usr/java/hadoop-3.0.0/share/hadoop/common/lib/*,
      /usr/java/hadoop-3.0.0/share/hadoop/common/*,
      /usr/java/hadoop-3.0.0/share/hadoop/hdfs,
      /usr/java/hadoop-3.0.0/share/hadoop/hdfs/lib/*,
      /usr/java/hadoop-3.0.0/share/hadoop/hdfs/*,
      /usr/java/hadoop-3.0.0/share/hadoop/mapreduce/*,
      /usr/java/hadoop-3.0.0/share/hadoop/yarn,
      /usr/java/hadoop-3.0.0/share/hadoop/yarn/lib/*,
      /usr/java/hadoop-3.0.0/share/hadoop/yarn/*,
      /usr/java/jdk1.8.0_45/lib/tools.jar
    </value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>
- Configure the
vim mapred-site.xml
file and add the following configuration
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
- Edit the
workers
file under etc/hadoop (Hadoop 3 uses workers in place of the old slaves file) as follows
slave1
slave2
- start the cluster
- Copy the above configured Hadoop files to the other two nodes, slave1 and slave2
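One way to push the files, assuming passwordless SSH from the master and the same /usr/java layout on every node (a sketch, not the only way):

```shell
# Copy the configured install and the profile additions to each slave.
for node in slave1 slave2; do
  scp -r /usr/java/hadoop-3.0.0 "$node":/usr/java/
  scp /etc/profile "$node":/etc/profile
done
# then run `source /etc/profile` on each slave
```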
- Format the NameNode before the first start, command
hdfs namenode -format
Run this only once; reformatting an existing cluster wipes the HDFS metadata.
- Start components individually
- start dfs, command
start-dfs.sh
- start yarn, command
start-yarn.sh
- start all
- Command
start-all.sh
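After start-all.sh, a quick sanity check on each node (a sketch; the daemon names are the standard Hadoop 3 ones):

```shell
# List the running Java processes.
# On the master, expect NameNode, SecondaryNameNode and ResourceManager;
# on slave1/slave2, expect DataNode and NodeManager.
jps
```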
- visit web pages
http://192.168.56.101:8088 (YARN ResourceManager UI)
http://192.168.56.101:9870 (HDFS NameNode UI)
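Reachability can also be checked from the command line (a hypothetical check; adjust the IP to your master):

```shell
# Expect HTTP status 200 from both UIs once the cluster is up.
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.56.101:8088
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.56.101:9870
```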
- start the history service, command
mapred historyserver
This service serves historical job details; with the mapred-site.xml settings above, its web UI listens on port 19888.