Getting started with big data: an introduction to stream computing

1. What is stream computing?

Stream computing is best understood in contrast to batch computing. The MapReduce we studied earlier is batch processing: it is offline computation, where all of the data being processed was recorded before some point in time. The management systems we usually develop work the same way, since they query data that was entered at some point in the past. Stream computing, by contrast, starts computing the moment each record arrives. It is also fast, with latency down to the millisecond level, and once a computation finishes the result can be fed back in real time or stored. It generally suits scenarios that need real-time push, such as traffic, e-commerce, and weather.
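The difference between the two models can be sketched in a few lines of plain Python. This is only an illustration and does not use any of the frameworks discussed here; the function names batch_count and stream_count are made up for this example. The batch version sees the whole stored data set and returns one final answer, while the stream version updates its state and emits a result as each record arrives.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def batch_count(events: List[str]) -> Dict[str, int]:
    """Batch style: the data is already stored, so compute once over all of it."""
    counts: Dict[str, int] = {}
    for key in events:
        counts[key] = counts.get(key, 0) + 1
    return counts


def stream_count(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stream style: update running state and emit a result per incoming record."""
    counts: Dict[str, int] = {}
    for key in events:
        counts[key] = counts.get(key, 0) + 1
        # The updated count is available immediately, before later records arrive.
        yield key, counts[key]


if __name__ == "__main__":
    events = ["traffic", "weather", "traffic"]  # hypothetical sample records
    print(batch_count(events))
    for update in stream_count(iter(events)):
        print(update)
```

In a real deployment the input to the stream version would be an unbounded source such as a message queue rather than a short list, which is why the result has to be produced incrementally.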

2. What stream computing frameworks are there?

Stream computing frameworks include Storm, Spark Streaming, Samza, and Flink.

Name                              Company    Applicable scenario                                          Type
Storm                             Twitter    Streaming                                                    Stream computing
Spark Streaming                   Apache     Needs both offline and real-time computing at the same time  Hybrid computing
Samza                             LinkedIn   Stream processing combined with Kafka                        Stream computing
Flink                             Apache     Fast processing                                              Hybrid computing
Hadoop (not a stream framework)   Apache     -                                                            Offline computing

3. Architectural differences between the stream computing frameworks

Storm architecture

Spark Streaming architecture

Samza architecture

Flink architecture


Origin www.cnblogs.com/shun7man/p/12392736.html