Similarities and differences of the four frameworks: MapReduce, Tez, Storm, and Spark


1) MapReduce: An offline (batch) computing framework that abstracts an algorithm into two stages, Map and Reduce. It is well suited to data-intensive computing.
2) Spark: MapReduce is not well suited to iterative or interactive computing, because it is a disk-based computing framework. Spark is an in-memory computing framework: it keeps data in memory as much as possible, which improves the computational efficiency of iterative and interactive applications.
3) Storm: MapReduce is also not suited to stream computing and real-time analysis, such as counting advertising clicks. Storm is better at this kind of computing, and its real-time performance is much better than that of MapReduce.
4) Tez: A computing framework that runs on YARN and supports DAG jobs. It generalizes the MapReduce data-processing model: it splits the Map and Reduce processes into several sub-processes and can combine multiple Map/Reduce tasks into one larger DAG task, which reduces the intermediate file storage between Map/Reduce jobs. A reasonable combination of these sub-processes can also reduce the running time of the task.
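
The two-stage model from item 1 and the job chaining from item 4 can be illustrated with a small, purely in-memory sketch. It is not Hadoop or Tez code: `run_stage`, `wc_map`, `wc_reduce`, `by_count_map`, and `by_count_reduce` are hypothetical names invented for this example, and the input lines are made up. The first stage counts words; the second stage consumes the first stage's output directly, loosely mirroring how a DAG of stages avoids writing intermediate files between separate Map/Reduce jobs.

```python
from collections import defaultdict


def run_stage(records, mapper, reducer):
    """Run one Map -> shuffle -> Reduce pass in memory (illustrative only)."""
    groups = defaultdict(list)
    # Map phase: each input record may emit any number of (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle: collect all values that share a key.
            groups[key].append(value)
    # Reduce phase: process each key with its grouped values, in key order.
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


def wc_map(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield word.lower(), 1


def wc_reduce(word, counts):
    # Sum the ones emitted for this word.
    yield word, sum(counts)


def by_count_map(pair):
    # Re-key the word-count output by its count.
    word, count = pair
    yield count, word


def by_count_reduce(count, words):
    # Collect all words that share the same count.
    yield count, sorted(words)


# Made-up sample input for the sketch.
lines = ["Spark and Storm", "Storm and Tez", "Tez and Spark and MapReduce"]

# Stage 1: word count.
counts = run_stage(lines, wc_map, wc_reduce)

# Stage 2: fed directly from stage 1, with no intermediate file in between.
grouped = run_stage(counts, by_count_map, by_count_reduce)

print(counts)
print(grouped)
```

In a real MapReduce deployment each of the two stages would be a separate job whose output is written to distributed storage before the next job reads it; passing `counts` straight into the second stage is the simplification this sketch uses to suggest what a combined DAG saves.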
