Spark Streaming socket stream processing: using the nc program as the data input source

Client code, which initiates a connection request to the server. Create the file with:
vim NetworkWordCount.py
#!/usr/bin/env python3
from __future__ import print_function
import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

if __name__ == "__main__":
    # Expect exactly two arguments: the hostname and port of the data source.
    if len(sys.argv) != 3:
        print("Usage: NetworkWordCount.py <hostname> <port>", file=sys.stderr)
        sys.exit(-1)
    sc = SparkContext(appName="PSWC")
    # Process the stream in 1-second batches.
    ssc = StreamingContext(sc, 1)
    # Connect as a client to the socket server and read lines of text.
    lines = ssc.socketTextStream(sys.argv[1], int(sys.argv[2]))
    # Split each line into words, then count each word within the batch.
    counts = lines.flatMap(lambda line: line.split(" ")) \
        .map(lambda x: (x, 1)) \
        .reduceByKey(lambda a, b: a + b)
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
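
To see what the flatMap, map, and reduceByKey chain computes for a single batch, here is a minimal sketch in plain Python that needs no Spark installation. The function name count_words is hypothetical and only for illustration; it mimics the per-batch logic and is not how Spark executes it internally.

```python
from collections import Counter


def count_words(lines):
    """Count words in one batch of lines (hypothetical helper, not a Spark API)."""
    # flatMap step: split every line on spaces and flatten into one word list.
    words = [word for line in lines for word in line.split(" ")]
    # map step: pair each word with the count 1.
    pairs = [(word, 1) for word in words]
    # reduceByKey step: add up the counts that share the same word.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


if __name__ == "__main__":
    print(count_words(["hello spark", "hello world"]))
    # prints {'hello': 2, 'spark': 1, 'world': 1}
```

In the streaming program the same computation runs once per 1-second batch, so the counts cover only the lines received in that interval rather than a running total.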
Server side: here the nc program plays the role of the server.
$ nc -lk 9999
Run nc in a new terminal window; this window is the data source. The -l option puts nc in listening mode, and -k keeps it listening after a client disconnects, so it can accept repeated connections. Any free port will do; 9999 is used here. Type text into this window and press Enter, and the window running the Spark program prints the word frequency counts automatically.
Then, in another terminal, submit the client program:
$ cd /usr/local/spark/mycode/streaming/socket
$ /usr/local/spark/bin/spark-submit NetworkWordCount.py localhost 9999
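
If nc is not installed, the listening side can be approximated with a few lines of Python. This is only a sketch under assumptions: the helper names make_server and send_lines are hypothetical, and unlike nc it sends a fixed list of lines to each client instead of reading from the keyboard.

```python
import socket


def make_server(host="localhost", port=9999):
    """Create a listening TCP socket (hypothetical helper standing in for nc -l)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts on the same port.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def send_lines(srv, lines):
    """Accept one client and send it each line terminated by a newline."""
    conn, _addr = srv.accept()
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))


if __name__ == "__main__":
    server = make_server()
    try:
        # Keep accepting new clients, roughly like the -k option of nc.
        while True:
            send_lines(server, ["hello spark", "hello world"])
    finally:
        server.close()
```

Start this script instead of nc, then submit NetworkWordCount.py with localhost 9999 as before. Because the connection is closed after the lines are sent, the Spark receiver reconnects and receives the same lines again.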

Origin blog.csdn.net/qq_45371603/article/details/104617169