DStream output operations

Output operations specify what is done with the stream data produced by the transformations (for example, pushing the
results to an external database or printing them to the screen). As with lazily evaluated RDDs, if neither a
DStream nor any DStream derived from it has an output operation applied, none of those
DStreams is evaluated. If the StreamingContext has no output operation registered at all, the whole
context will not start.

print()

Prints the first 10 elements of every batch of data in the DStream on the driver node that runs the streaming
application. This is useful for development and debugging. In the Python API, the same operation is called pprint().
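
A minimal PySpark sketch of pprint() as the output operation that triggers evaluation. The names build_batches and run_demo, the application name and the local[2] master are assumptions made for illustration; the input is an in-memory queue of RDDs rather than a real source.

```python
def build_batches():
    """Return two small in-memory batches used as the demo input stream."""
    # The first batch has 15 elements so the 10-element limit of pprint() is visible.
    return [list(range(0, 15)), list(range(100, 103))]


def run_demo():
    """Stream the demo batches through Spark and print each batch on the driver.

    Needs a local pyspark installation; nothing runs until this is called.
    """
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "PprintDemo")
    ssc = StreamingContext(sc, 1)  # 1-second batch interval

    # queueStream turns a list of RDDs into a DStream, one RDD per batch.
    rdds = [sc.parallelize(batch) for batch in build_batches()]
    doubled = ssc.queueStream(rdds).map(lambda x: x * 2)

    # pprint() is the output operation. Without it, `doubled` would never be
    # evaluated and the context would have nothing to execute.
    doubled.pprint()  # shows at most the first 10 elements of each batch

    ssc.start()
    ssc.awaitTerminationOrTimeout(5)
    ssc.stop(stopSparkContext=True, stopGraceFully=False)
```

Calling run_demo() on a machine with pyspark installed should print one block per batch on the driver console.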

 

saveAsTextFiles(prefix, [suffix])

Saves the contents of this DStream as text files. The file name for each batch
is generated from the prefix and suffix parameters: "prefix-TIME_IN_MS[.suffix]".
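
The naming rule can be sketched in Python as follows. batch_file_name is a hypothetical helper that only mirrors the "prefix-TIME_IN_MS[.suffix]" pattern described above, it is not part of the Spark API; the output path and application name in run_demo are likewise assumptions.

```python
def batch_file_name(prefix, time_ms, suffix=None):
    """Build "prefix-TIME_IN_MS[.suffix]" for one batch (illustrative helper)."""
    name = f"{prefix}-{time_ms}"
    if suffix:
        name += f".{suffix}"
    return name


def run_demo(output_prefix="/tmp/stream-out/words"):
    """Write each batch of a small in-memory stream as text.

    Needs a local pyspark installation; nothing runs until this is called.
    """
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "SaveAsTextFilesDemo")
    ssc = StreamingContext(sc, 1)  # 1-second batch interval

    rdds = [sc.parallelize(["a", "b"]), sc.parallelize(["c"])]
    lines = ssc.queueStream(rdds)

    # Each batch is saved under a path following the pattern above; like
    # RDD.saveAsTextFile, the path is a directory containing part files.
    lines.saveAsTextFiles(output_prefix, "txt")

    ssc.start()
    ssc.awaitTerminationOrTimeout(5)
    ssc.stop(stopSparkContext=True, stopGraceFully=False)
```

For example, batch_file_name("words", 1565920000000, "txt") gives "words-1565920000000.txt".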

  

saveAsObjectFiles(prefix, [suffix])

Saves the contents of this DStream as SequenceFiles of serialized Java objects.
The file name for each batch is generated from the prefix and suffix parameters:
"prefix-TIME_IN_MS[.suffix]". Currently unavailable in the Python API.

 

saveAsHadoopFiles(prefix, [suffix])

Saves the contents of this DStream as Hadoop files. The file name for each batch
is generated from the prefix and suffix parameters: "prefix-TIME_IN_MS[.suffix]". Currently unavailable in the Python API.

 

foreachRDD(func)

This is the most general output operation: it applies the function func to each RDD generated from the stream. The function func
should push the data in each RDD to an external system, for example by saving the RDD to files or writing it over the network to a database. Note: func
is executed in the driver process running the streaming application, and it usually contains RDD actions that force the computation of the stream's RDDs.
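
A common way to structure func is to open one connection per partition instead of one per record. The sketch below assumes a hypothetical FakeConnection class standing in for a real database or socket client; send_partition, push_rdd and attach_output are illustrative names, while foreachRDD and foreachPartition are the actual PySpark calls.

```python
class FakeConnection:
    """Hypothetical stand-in for a real database or socket client."""

    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, record):
        self.sent.append(record)

    def close(self):
        self.closed = True


def send_partition(records, connect=FakeConnection):
    """Push all records of one partition through a single connection.

    When passed to rdd.foreachPartition, this runs on a worker.
    """
    conn = connect()
    try:
        for record in records:
            conn.send(record)
    finally:
        conn.close()  # always release the connection, even on errors


def push_rdd(rdd):
    """The func given to foreachRDD; its body runs in the driver.

    foreachPartition is an RDD action, so it forces the computation of the
    batch and schedules send_partition on the workers.
    """
    rdd.foreachPartition(send_partition)


def attach_output(dstream):
    """Register push_rdd as the output operation of a DStream."""
    dstream.foreachRDD(push_rdd)
```

Creating the connection inside send_partition rather than in push_rdd avoids having to serialize a connection object from the driver to the workers.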

 

Writing data to HBase
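
A minimal sketch of one way to do this from Python with foreachRDD, under several assumptions: the third-party happybase client is installed on the workers, an HBase Thrift server is reachable at the given host, and a table named word_counts with a column family cf already exists. The DStream is assumed to contain (word, count) pairs. All of these names are hypothetical and chosen only for illustration.

```python
def to_hbase_put(word, count, column=b"cf:count"):
    """Turn a (word, count) pair into (row_key, {column: value}), all as bytes."""
    return word.encode("utf-8"), {column: str(count).encode("utf-8")}


def write_partition(records, host="localhost", table_name="word_counts"):
    """Write one partition of (word, count) pairs to HBase over Thrift.

    Runs on a worker; host and table_name are assumed values.
    """
    import happybase  # third-party client, assumed to be installed on workers

    connection = happybase.Connection(host)
    try:
        table = connection.table(table_name)
        # The batch buffers puts and sends them in groups of up to 1000 rows;
        # leaving the with-block flushes whatever is still buffered.
        with table.batch(batch_size=1000) as batch:
            for word, count in records:
                row_key, data = to_hbase_put(word, count)
                batch.put(row_key, data)
    finally:
        connection.close()


def attach_hbase_sink(counts_dstream):
    """Register the HBase writer as the output operation of the DStream."""
    counts_dstream.foreachRDD(
        lambda rdd: rdd.foreachPartition(write_partition)
    )
```

One connection is opened per partition, following the usual foreachRDD pattern of keeping connection setup on the workers.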

 


Origin www.cnblogs.com/JBLi/p/11364269.html