MapReduce programming practice
1. The origin of big data
1. Examples:
(1) Product recommendation
Question 1: How can a large number of orders be stored?
Question 2: How can computations be run over a large number of orders?
(2) Weather forecasting
Question 1: How can a large amount of weather data be stored?
Question 2: How can computations be run over a large amount of weather data?
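2. The core problems of big data:
(1) Data storage: distributed file systems
(2) Data computation: distributed computing (MapReduce)
3. Where the MapReduce computing model comes from
(1) It is based on a paper published by Google: MapReduce
(2) The motivating problem: PageRank (web page ranking), used as an example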
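To make the PageRank problem concrete, here is a minimal single-machine sketch in plain Java. It assumes the commonly cited update rule, where each page receives (1 - d) / N plus d times the sum of rank(q) / outLinks(q) over every page q that links to it. The three-page graph, the damping factor 0.85, the iteration count, and the class name `PageRankSketch` are illustrative assumptions, not taken from the Google paper or from Hadoop. The comments point out which steps resemble a map and which a reduce.

```java
import java.util.Arrays;

// Toy PageRank by repeated iteration; the graph and constants are illustrative assumptions.
class PageRankSketch {

    // links[i] lists the pages that page i links to.
    // This sketch assumes every page has at least one outgoing link.
    static double[] pageRank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start with equal rank for every page
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);
            for (int page = 0; page < n; page++) {
                // Map-like step: a page sends an equal share of its rank to each page it links to.
                double share = damping * rank[page] / links[page].length;
                for (int target : links[page]) {
                    // Reduce-like step: shares are summed per target page.
                    next[target] += share;
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Page 0 links to pages 1 and 2, page 1 links to page 2, page 2 links to page 0.
        int[][] links = { {1, 2}, {2}, {0} };
        System.out.println(Arrays.toString(pageRank(links, 0.85, 50)));
    }
}
```

In this toy graph, page 2 ends up with the highest rank because it is linked from both other pages. On a real web graph the per-page contributions are far too numerous for one machine, which is the kind of workload the MapReduce model was designed to distribute.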
2. Demo of the MapReduce programming model: WordCount
hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /input/data.txt /output/wc1
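The command above runs the WordCount example bundled with Hadoop, reading /input/data.txt and writing the result to /output/wc1. What such a job computes can be sketched in plain Java, with no Hadoop classes involved. The sketch below is only a single-process simulation of the map, shuffle, and reduce steps. The class name `WordCountSketch` and the two sample lines are hypothetical, and a real job would spread these steps across many machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the WordCount data flow; no Hadoop API is used.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in order over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("I love Beijing", "I love China");
        System.out.println(wordCount(lines)); // prints {Beijing=1, China=1, I=2, love=2}
    }
}
```

The shuffle step is kept as a separate method on purpose: grouping values by key between map and reduce is the part the framework does for you in a real MapReduce job.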
3. MapReduce's data processing flow (very, very, very important)
4. Develop your own MapReduce program (Java program)
5. YARN scheduling process