Spark notes: DStream

3.1 What is a DStream

The Discretized Stream (DStream) is the basic abstraction of Spark Streaming. It represents a continuous stream of data: either the input stream received from a source, or the result stream produced by applying Spark operations to another stream. Internally, a DStream is represented by a continuous series of RDDs, and each RDD holds the data that arrived during one time interval.
Operations on the data are performed RDD by RDD: any operation applied to a DStream is applied to the RDD of each interval.
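
The idea of a stream cut into per-interval RDDs can be sketched in plain Python, without Spark. This is only a conceptual model, not how Spark implements it: the helper names `discretize` and `map_stream`, the `(timestamp, value)` record shape, and the 1.0 second interval are all assumptions made for illustration. Each inner list stands in for one RDD.

```python
def discretize(records, batch_interval):
    """Group (timestamp, value) records into per-interval batches.

    Each batch stands in for one RDD of the DStream and holds the values
    whose timestamp falls in [k * batch_interval, (k + 1) * batch_interval).
    """
    batches = {}
    for timestamp, value in records:
        index = int(timestamp // batch_interval)
        batches.setdefault(index, []).append(value)
    if not batches:
        return []
    # Intervals that received no data still produce an (empty) batch.
    first, last = min(batches), max(batches)
    return [batches.get(k, []) for k in range(first, last + 1)]


def map_stream(batches, fn):
    """Apply fn to every element, one batch (RDD stand-in) at a time."""
    return [[fn(value) for value in batch] for batch in batches]


# Hypothetical input: five records arriving over three seconds.
records = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (2.5, "d"), (2.7, "e")]
batches = discretize(records, batch_interval=1.0)
upper = map_stream(batches, str.upper)

print(batches)  # [['a', 'b'], ['c'], ['d', 'e']]
print(upper)    # [['A', 'B'], ['C'], ['D', 'E']]
```

The result of `map_stream` has the same shape as its input, one batch per interval, which mirrors how a transformation on a DStream yields another DStream.
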
A DStream can be created from the data stream produced by a source, or by applying operations to an existing DStream, which yields a new DStream.
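
Both ways of obtaining a DStream might look like the following PySpark sketch, assuming a PySpark version that still ships the `pyspark.streaming` module. The host, port, application name, one-second batch interval, and the helper names `word_counts` and `main` are assumptions chosen for illustration, not values from these notes.

```python
def word_counts(lines):
    """Derive a new DStream of (word, count) pairs from a DStream of text lines."""
    words = lines.flatMap(lambda line: line.split(" "))
    pairs = words.map(lambda word: (word, 1))
    return pairs.reduceByKey(lambda a, b: a + b)


def main(host="localhost", port=9999, batch_seconds=1):
    # Imported here so the file still loads where PySpark is not installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "DStreamSketch")  # assumed app name
    ssc = StreamingContext(sc, batch_seconds)

    lines = ssc.socketTextStream(host, port)  # DStream created from a source
    counts = word_counts(lines)               # DStream derived from another DStream
    counts.pprint()                           # print a few results of every batch

    ssc.start()
    ssc.awaitTermination()
```

To try it, start a text server such as `nc -lk 9999` in another terminal and then call `main()`. The counts are computed per batch, so each printed block covers only the lines received during that interval.
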

The workflow is as follows: Spark Streaming receives the real-time data and divides it into batches, and the Spark engine then processes each batch to generate the final stream of results, which is also delivered in batches.

Origin blog.51cto.com/14473726/2435677