Spark Streaming's transform operation: a real-time blacklist filtering example

The transform operation, applied to a DStream, runs an arbitrary RDD-to-RDD function on every batch. This makes it possible to express operations the DStream API does not provide directly. For example, the DStream API offers no way to join each batch of a DStream with a fixed RDD: DStream.join() can only join another DStream, pairing up the batch RDDs of the two streams as each batch is computed. With transform, we can implement the batch-to-static-RDD join ourselves.
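Conceptually, transform lifts any RDD => RDD function onto every micro-batch of the stream. The following non-Spark sketch illustrates that idea; the MicroBatchStream type is invented here purely for illustration (a DStream is not implemented this way), with each batch stood in for by a plain Seq:

```scala
// Toy stand-in for a DStream: a sequence of "batches", each a plain Seq.
// transform applies an arbitrary batch-to-batch function to every batch,
// just as DStream.transform applies an arbitrary RDD-to-RDD function.
case class MicroBatchStream[A](batches: Seq[Seq[A]]) {
  def transform[B](f: Seq[A] => Seq[B]): MicroBatchStream[B] =
    MicroBatchStream(batches.map(f))
}

object TransformSketch {
  def main(args: Array[String]): Unit = {
    val stream = MicroBatchStream(Seq(Seq(1, 2, 3), Seq(4, 5)))
    // An arbitrary per-batch function: keep even numbers, then double them.
    val out = stream.transform(batch => batch.filter(_ % 2 == 0).map(_ * 2))
    println(out.batches) // List(List(4), List(8))
  }
}
```

Because the function receives the whole batch, it can do anything an RDD can, including joining against data that lives outside the stream, which is exactly what the blacklist example below relies on.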
Example:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TransformDemo {
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org").setLevel(Level.WARN)
    val config = new SparkConf().setAppName("TransformDemo").setMaster("local[2]")
    val ssc = new StreamingContext(config, Seconds(2))
    // Define the blacklist
    val blackList = Array(("tom", true), ("jim", true))
    // Turn it into an RDD so each batch can be joined against it
    val blackListRDD = ssc.sparkContext.parallelize(blackList)
    // Define a socket input stream; each line has the form "name clickDate"
    ssc.socketTextStream("hadoop01", 8888).map(line => {
      val fields = line.split(" ")
      val name = fields(0)
      val clickDate = fields(1)
      (name, clickDate)
    }).transform(rdd => {
      // Blacklist filtering: left-outer-join the incoming batch with the blacklist RDD
      // (tom,2017-03-02) leftOuterJoin (tom,true)  ===> (tom,(2017-03-02,Some(true)))
      rdd.leftOuterJoin(blackListRDD).filter(tuple => {
        // Keep only records whose blacklist lookup came back empty (None),
        // i.e. names that are NOT blacklisted, such as (jom,(2017-09-09,None))
        tuple._2._2.isEmpty
      })
    }).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
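The join-then-filter step can be checked without a cluster. This sketch mimics the semantics of RDD.leftOuterJoin on plain Scala collections (the helper leftOuterJoin and the sample names/dates are made up for illustration, not part of the Spark API):

```scala
object BlacklistFilterSketch {
  // Mimic RDD.leftOuterJoin on plain pairs: every left-side record survives,
  // paired with Some(rightValue) when its key exists on the right, else None.
  def leftOuterJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, Option[W]))] = {
    val rightMap = right.toMap
    left.map { case (k, v) => (k, (v, rightMap.get(k))) }
  }

  def main(args: Array[String]): Unit = {
    val clicks = Seq(("tom", "2017-03-02"), ("leo", "2017-03-02"))
    val blackList = Seq(("tom", true), ("jim", true))
    // Keep only records whose blacklist lookup is empty, as in the filter above
    val allowed = leftOuterJoin(clicks, blackList).filter(_._2._2.isEmpty)
    println(allowed) // List((leo,(2017-03-02,None)))
  }
}
```

Running this shows why the filter predicate works: blacklisted names carry Some(true) after the join and are dropped, while everyone else carries None and passes through.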
