Spark Streaming WordCount Example (Part 1)

I. Case introduction
Use the netcat tool to continuously send data to port 9999; Spark Streaming reads the data from that port and counts the number of occurrences of each distinct word.
II. Using netcat
1. Install netcat on the virtual machine
[root@hadoop1 spark]# yum install -y nc
2. Start the program and send data
[root@hadoop1 spark]# nc -lk 9999
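For example, after starting nc you can type a few lines such as the following (the words are purely illustrative):

hello spark
hello streaming
hello world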
III. Code implementation
1. Maven dependency

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
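For projects built with sbt rather than Maven, the same dependency (assuming Scala 2.11 and Spark 2.1.1 as above) can be declared roughly as:

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.1.1"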

2. Scala code

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SparkStreamingDemo {
  def main(args: Array[String]): Unit = {
    // 1. Initialize the Spark configuration
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("SparkStreamingDemo")
    // 2. Initialize the StreamingContext with a 5-second batch interval
    val ssc = new StreamingContext(sparkConf, Seconds(5))
    // 3. Create a DStream by monitoring the port; the data is read in line by line
    val lineStreams = ssc.socketTextStream("hadoop1", 9999)
    // 4. Split each line into individual words
    val wordStreams = lineStreams.flatMap(_.split(" "))
    // 5. Map each word to a tuple (word, 1)
    val wordAndOneStreams = wordStreams.map((_, 1))
    // 6. Sum the counts for identical words
    val wordAndCountStreams = wordAndOneStreams.reduceByKey(_ + _)
    // 7. Print the result of each batch
    wordAndCountStreams.print()
    // 8. Start the StreamingContext and wait for termination
    ssc.start()
    ssc.awaitTermination()
  }
}
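Assuming the illustrative input typed into nc above arrives within a single batch, the console output for that 5-second batch should look roughly like the following (the timestamp will differ on each run):

-------------------------------------------
Time: 1582000000000 ms
-------------------------------------------
(hello,3)
(spark,1)
(streaming,1)
(world,1)

Note that the socket receiver occupies one thread of its own, so the master URL must provide at least two threads; local[*] is sufficient as long as the machine has more than one core.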