[Big data] [Spark] Structured Streaming basics

I. Overview

Most stream processing engines (such as Storm and Spark Streaming) focus only on computing over the data stream itself: for example, applying a map function to transform each record in the stream, or a reduce function to aggregate multiple records. In practice, however, most streaming applications need much more than a stream processing engine, and the stream computation is only one part of the application. This has given rise to a new term: continuous application.
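As a rough illustration of the "simple" stream computation described above, here is a minimal sketch in plain Python (not the Spark or Storm API). The names sensor_stream, transform and aggregate, as well as the sample values, are hypothetical and exist only for this example.

```python
from functools import reduce
from typing import Iterable, Iterator


def sensor_stream() -> Iterator[int]:
    """Hypothetical source that yields records one at a time, like a stream."""
    yield from [3, 8, 1, 5]


def transform(records: Iterable[int]) -> Iterator[int]:
    """The "map" step: convert each record independently as it arrives."""
    return (r * 10 for r in records)


def aggregate(records: Iterable[int]) -> int:
    """The "reduce" step: fold many records into a single value."""
    return reduce(lambda acc, r: acc + r, records, 0)


# Chain the two steps; records flow through lazily, one by one.
total = aggregate(transform(sensor_stream()))
print(total)
```

This covers only the computation itself. Everything around it in a real application (storage, serving results, recovering from failures) is outside what such a pipeline handles, which is the gap the continuous application idea is meant to address.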
Spark 2.0 introduced Structured Streaming to support exactly this kind of continuous application. First, Structured Streaming is a higher-level API than Spark Streaming, built mainly on Spark's higher-level batch APIs such as Dataset/DataFrame. In addition, Structured Streaming provides many features that other stream processing engines do not.

Origin blog.csdn.net/cheidou123/article/details/94221898