Differences between MapReduce and Spark

https://blog.csdn.net/wyz0516071128/article/details/81219342

Although both frameworks use the same concurrency model on the server (daemon) side, at the task level they adopt different parallelism mechanisms: Hadoop MapReduce uses a multi-process model, while Spark uses a multi-threaded model.
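To make the process/thread split concrete, here is a minimal sketch (the application name, instance count, and core count below are hypothetical values, not from the source): in Spark, an executor is a single JVM process, and `spark.executor.cores` controls how many task threads run concurrently inside it, whereas MapReduce launches a separate JVM/container for every map or reduce task.

```scala
import org.apache.spark.SparkConf

// A minimal sketch, assuming a YARN-style deployment; the values below are
// hypothetical and chosen only to illustrate the process/thread distinction.
object ThreadModelDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("ThreadModelDemo")
      // Each executor is ONE long-lived JVM process...
      .set("spark.executor.instances", "2")
      // ...and runs up to this many concurrent task THREADS inside that JVM.
      .set("spark.executor.cores", "4")
    // By contrast, a MapReduce job launches a separate JVM (container) for
    // every map or reduce task and tears it down when the task finishes.
    println(conf.toDebugString)
  }
}
```

One practical consequence of this design: Spark tasks start faster because no new JVM is launched per task, at the cost of weaker resource isolation between tasks sharing an executor.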

Overall, Spark uses the classic scheduler/workers mode. The first step of every Spark application is to build a reusable pool of resources (long-lived executors), and all ShuffleMapTasks and ResultTasks then run inside that pool. (Note that although Spark programs can be written very flexibly and are no longer limited to Mapper and Reducer functions, internally the Spark engine expresses even a complex application with only two kinds of tasks: ShuffleMapTask and ResultTask.) A MapReduce application works differently: it does not build a reusable resource pool; instead, each task dynamically requests its own resources and releases them as soon as it finishes running.
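To see where the two internal task types appear, here is a minimal sketch (a hypothetical word-count job, not from the source): the stage on the map side of the shuffle boundary runs as ShuffleMapTasks, and the final stage that returns results to the driver runs as ResultTasks.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A hypothetical two-stage job illustrating Spark's two internal task types.
object TaskTypesDemo {
  def main(args: Array[String]): Unit = {
    // local[4]: tasks run as 4 threads inside this single JVM.
    val sc = new SparkContext(
      new SparkConf().setAppName("TaskTypesDemo").setMaster("local[4]"))

    val counts = sc.parallelize(Seq("a", "b", "a", "c"))
      .map(w => (w, 1))   // Stage 1: executed by ShuffleMapTasks,
      .reduceByKey(_ + _) //   which write shuffle output at this boundary.
      .collect()          // Stage 2: executed by ResultTasks, which send
                          //   their results back to the driver.
    counts.foreach(println)
    sc.stop()
  }
}
```

All of these tasks run inside the executors created when the application starts; no new process is launched per task.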
