First, YARN configuration:
1. Configure yarn-env.sh: set JAVA_HOME to your JDK installation directory.
2. Configure yarn-site.xml: add the following inside the <configuration> element, replacing node1 with your own hostname:
<!-- How the Reducer fetches data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Address of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node1</value>
</property>
3. Configure mapred-site.xml:
Copy the template file:
cp mapred-site.xml.template mapred-site.xml
Then add the following to mapred-site.xml, inside the <configuration> element:
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
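To confirm the edits took effect, the values can be read back from the files. The helper below is a minimal sketch and not part of Hadoop: get_prop is a hypothetical name, and it assumes each <name> and its <value> sit on consecutive lines, as in the snippets above.

```shell
# get_prop FILE NAME: print the <value> of property NAME from a Hadoop XML config.
# Hypothetical helper; assumes <value> is on the line right after <name>.
get_prop() {
  grep -A1 "<name>$2</name>" "$1" |
    sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage, run from the directory that holds the config files
# (etc/hadoop under the Hadoop install directory in Hadoop 2.x):
#   get_prop yarn-site.xml yarn.resourcemanager.hostname
#   get_prop mapred-site.xml mapreduce.framework.name
```

With the settings shown above, the two calls should print node1 and yarn respectively.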
Second, start the cluster:
1. Before starting YARN, make sure the NameNode and DataNode are already running.
2. Start ResourceManager:
sbin/yarn-daemon.sh start resourcemanager
3. Start NodeManager:
sbin/yarn-daemon.sh start nodemanager
4. Check: run jps; the ResourceManager and NodeManager processes should be listed.
Run netstat -lnpt; port 8088 should be in the listening state.
Open http://node1:8088 in a browser (substitute your own hostname); the ResourceManager web page should load.
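The checks in step 4 can also be scripted. The sketch below uses a hypothetical helper, missing_daemons, that reads jps output on standard input and prints every expected daemon that is absent. The list of four daemons reflects the processes this walkthrough starts and is an assumption about your setup.

```shell
# missing_daemons: read jps-style lines ("<pid> <ClassName>") on stdin and
# print the name of each expected daemon that does not appear.
# Hypothetical helper; adjust the daemon list to match your node.
missing_daemons() {
  jps_out=$(cat)
  for d in NameNode DataNode ResourceManager NodeManager; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    printf '%s\n' "$jps_out" | grep -qw "$d" || echo "$d"
  done
}

# Usage on the cluster node:
#   jps | missing_daemons          # no output means all four are running
#   netstat -lnpt | grep 8088      # the ResourceManager web port
# Assuming curl is installed, this prints the HTTP status of the web UI:
#   curl -s -o /dev/null -w '%{http_code}\n' http://node1:8088/cluster
```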
Third, run the WordCount program:
1. Delete the previous output (the job fails if the output directory already exists):
hdfs dfs -rm -r /user/root/output
2. Run the program:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/root/input /user/root/output
While the job runs, its progress is printed to the console.
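For intuition about what the job computes, the pipeline below is a local approximation in plain shell, not the Hadoop implementation. It splits its input on whitespace, counts each word, and prints the word and its count separated by a tab, which mirrors the layout of the example's output files. wordcount_local is a hypothetical name.

```shell
# wordcount_local: count whitespace-separated words read from stdin.
# Local sketch only; it does not touch HDFS or YARN.
wordcount_local() {
  tr -s '[:space:]' '\n' |     # one word per line
    grep -v '^$' |             # drop empty lines left by leading whitespace
    LC_ALL=C sort |            # byte-order sort, so equal words are adjacent
    uniq -c |                  # prefix each distinct word with its count
    awk '{print $2 "\t" $1}'   # reorder to "word<TAB>count"
}

# Usage with any local text file:
#   wordcount_local < input.txt
```

To inspect the real result once the job finishes, hdfs dfs -cat /user/root/output/part-r-* prints the reducer output files, assuming the default output file naming.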