Big Data (8): Setting Up and First Use of MapReduce on YARN


Review:

1. Developing MR computation programs (the topic of the previous part).

2. Hadoop 2.x introduces YARN for resource management; MR itself no longer runs any long-lived background service.

  YARN model: containers, inside which our AppMaster and map/reduce Tasks run.

  This decouples computation from resource management.

  mapreduce on yarn

  Architecture: RM (ResourceManager) and NM (NodeManager)

Build:

  Deploy the RM separately from the NN; run one NM on each DN (same nodes, same count).

                    (cluster layout diagram)
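The original diagram is not preserved, so here is a reconstruction rather than the author's figure: a 4-node layout consistent with the rule above and the configuration below (RM on node03/node04, ZK on node02~04, one NM per DN). The DN placement is an assumption.

node01: NN
node02: DN  NM  ZK
node03: DN  NM  ZK  RM (rm1)
node04: DN  NM  ZK  RM (rm2)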

---------- following the official documentation:

mapred-site.xml > mapreduce on yarn

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>


yarn-site.xml

//shuffle: map output is pulled over to the reducers (M -shuffle-> R)
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>node02:2181,node03:2181,node04:2181</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mashibing</value>
</property>

<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node03</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>node04</value>
</property>
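RM HA keeps its election state in ZooKeeper, so before starting YARN it is worth confirming that the ZK ensemble on node02~04 (set up earlier for HDFS HA) is alive. A quick check, assuming zkServer.sh is on the PATH:

# run on node02, node03 and node04
zkServer.sh status    # expect one "leader" and two "follower" nodes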

 

Process:
As with HDFS, everything here is done as root.
node01:

cd $HADOOP_HOME/etc/hadoop
cp mapred-site.xml.template mapred-site.xml    
vi mapred-site.xml
vi yarn-site.xml
scp mapred-site.xml yarn-site.xml node02:`pwd`
scp mapred-site.xml yarn-site.xml node03:`pwd`
scp mapred-site.xml yarn-site.xml node04:`pwd`
vi slaves    // nothing to change here; it was already configured when HDFS was set up
start-yarn.sh
node03~04:
yarn-daemon.sh start resourcemanager
http://node03:8088
http://node04:8088
Visiting the standby RM shows: "This is standby RM. Redirecting to the current active RM: http://node03:8088/"
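The active/standby roles can also be confirmed from the command line with the standard YARN admin tool (the rm1/rm2 ids come from yarn-site.xml above):

yarn rmadmin -getServiceState rm1    # e.g. active
yarn rmadmin -getServiceState rm2    # e.g. standby
jps    # node03/node04 should each show a ResourceManager process; DN nodes show NodeManager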

 

------- Official MR example: wordcount (wc)
Hands-on: running MR on YARN:

hdfs dfs -mkdir -p /data/wc/input
hdfs dfs -D dfs.blocksize=1048576 -put data.txt /data/wc/input
cd $HADOOP_HOME
cd share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.6.5.jar wordcount /data/wc/input /data/wc/output
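While the job runs, it can be tracked on the RM web UI (port 8088) or with the standard yarn CLI; the application id below is a placeholder to substitute with the real one:

yarn application -list                       # running applications with state and tracking URL
yarn application -status <application_id>    # details for one application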

Check the result:
1) web UI:
2) CLI:

hdfs dfs -ls /data/wc/output
-rw-r--r--   2 root supergroup       0 2019-06-22 11:37 /data/wc/output/_SUCCESS      // success marker file
-rw-r--r--   2 root supergroup  788922 2019-06-22 11:37 /data/wc/output/part-r-00000  // data file
part-r-00000 vs part-m-00000
r / m: r = the job ran map + reduce; m = map-only
hdfs dfs -cat /data/wc/output/part-r-00000
hdfs dfs -get /data/wc/output/part-r-00000 ./
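Once part-r-00000 is local, a quick spot-check of the word counts (plain shell; assumes the file sits in the current directory):

head part-r-00000                   # first lines, each "word<TAB>count"
sort -k2,2nr part-r-00000 | head    # the ten most frequent words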

A question to think about:
When data.txt was uploaded it was split into two blocks, yet the computation still produces the correct result. Why? That will be answered in the source-code analysis later.
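To actually see the two blocks produced by the 1 MB blocksize (-D dfs.blocksize=1048576) used at upload time, fsck can list them:

hdfs fsck /data/wc/input/data.txt -files -blocks -locations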

 


Origin www.cnblogs.com/littlepage/p/11166161.html