[Teacher Qiang] Computing Aggregations in MongoDB with MapReduce

MapReduce can express very complex aggregation logic and is very flexible, but it is slow and should not be used for real-time data analysis. A MapReduce job can run in parallel on multiple servers: each server is responsible only for its own part of the workload, and the partial results are then sent to the master server, which merges them, computes the final result set, and returns it to the client.
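
The merge step on the master server can be pictured with a small plain-JavaScript sketch. It does not talk to MongoDB at all; the `partials` data, `reduceCounts`, and `mergePartials` are hypothetical names made up for illustration, and the only assumption is that each server has already reduced its share of the data to one value per key.

```javascript
// Hypothetical partial results: one object per server, one count per key.
const partials = [
  { a: 3, b: 1 },
  { a: 2, c: 5 },
  { b: 4, c: 1 },
];

// The same reduce logic is used on each server and again when merging,
// so it has to accept its own output as input.
function reduceCounts(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Merge step: collect the partial values per key, then reduce them once more.
function mergePartials(partialResults, reduceFn) {
  const byKey = {};
  for (const partial of partialResults) {
    for (const [key, value] of Object.entries(partial)) {
      if (!byKey[key]) byKey[key] = [];
      byKey[key].push(value);
    }
  }
  const merged = {};
  for (const [key, values] of Object.entries(byKey)) {
    merged[key] = reduceFn(key, values);
  }
  return merged;
}

const merged = mergePartials(partials, reduceCounts);
console.log(merged);
```

Because the partial results are reduced a second time during the merge, a reduce function used this way should give the same answer no matter how the data was split across servers.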

When a MongoDB query returns a large amount of data, more complex statistical and aggregation operations can take a long time; such operations can be implemented with MongoDB's MapReduce.


MapReduce is a flexible and powerful data aggregation tool. Its advantage is that an aggregation task can be split into multiple smaller tasks that are processed in parallel on several servers. MongoDB provides its own MapReduce implementation, and the map and reduce functions are, naturally, written in JavaScript.
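
As a rough sketch of what those JavaScript functions can look like, the example below sums salaries per department. The collection name `emp` and the fields `deptno` and `sal` are assumptions made for illustration and do not come from the article. The `db.emp.mapReduce(...)` call is only attempted when a global `db` exists, that is, inside the mongo shell.

```javascript
// Hypothetical collection `emp` with documents such as
// { ename: "SMITH", deptno: 20, sal: 800 }.

// Map: MongoDB invokes this once per document, with `this` bound to the
// document. `emit` is provided by the server while the map function runs.
function mapFn() {
  emit(this.deptno, this.sal);
}

// Reduce: combines all values emitted for one key into a single value.
// It may be called again on its own output, so it returns the same kind
// of value (a number) that it receives.
function reduceFn(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Runs only inside the mongo shell, where the global `db` is defined.
if (typeof db !== "undefined") {
  var result = db.emp.mapReduce(mapFn, reduceFn, { out: { inline: 1 } });
  printjson(result);
}
```

With `out: { inline: 1 }` the result is returned to the client directly instead of being written to a collection, which is convenient for trying things out on small data sets.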


MapReduce in MongoDB consists mainly of the following stages:

  • Map: applies the map operation to each document in the collection.

  • Shuffle: groups the emitted values by key, producing for each distinct key a list of one or more values.

  • Reduce: processes the elements of a value list until it contains only one element. The value list is then handed back to the Shuffle step, and the cycle repeats until each key corresponds to exactly one value list containing exactly one element; that is the MapReduce result.

  • Finalize: optional. After the final MapReduce result has been obtained, it can be post-processed, for example to "prune" the output.
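
The stages above can be simulated in a few lines of plain JavaScript. This is only a sketch of the idea, not how MongoDB is implemented: `runMapReduce` and the sample `employees` documents are hypothetical, and here the map function receives the document and an `emit` callback as arguments instead of using `this`.

```javascript
// Plain-JavaScript simulation of the stages; no MongoDB server is involved.
function runMapReduce(docs, mapFn, reduceFn, finalizeFn) {
  // Map: run mapFn on every document and collect the emitted pairs.
  const pairs = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => pairs.push([key, value]));
  }

  // Shuffle: group the emitted values by key, one list of values per key.
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  const results = new Map();
  for (const [key, values] of groups) {
    // Reduce: repeat until the list for this key holds a single element.
    // Reducing two values at a time forces several reduce passes.
    let list = values;
    while (list.length > 1) {
      const next = [];
      for (let i = 0; i < list.length; i += 2) {
        next.push(reduceFn(key, list.slice(i, i + 2)));
      }
      list = next;
    }
    // Finalize (optional): post-process the single remaining value.
    results.set(key, finalizeFn ? finalizeFn(key, list[0]) : list[0]);
  }
  return results;
}

// Hypothetical sample documents.
const employees = [
  { deptno: 10, sal: 2450 },
  { deptno: 20, sal: 800 },
  { deptno: 20, sal: 1100 },
  { deptno: 10, sal: 1300 },
  { deptno: 20, sal: 3000 },
];

const totals = runMapReduce(
  employees,
  (doc, emit) => emit(doc.deptno, doc.sal),
  (key, values) => values.reduce((a, b) => a + b, 0),
  (key, total) => ({ deptno: key, total: total })
);
console.log([...totals.values()]);
```

For the sample data this yields a total of 3750 for department 10 and 4900 for department 20. Since the reduce function is applied repeatedly to its own output, it has to return the same kind of value it receives.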

Origin blog.csdn.net/zhaoyuqiang/article/details/105160309