How to stop Spark Streaming gracefully

There are many articles on the Internet about how to stop Spark Streaming gracefully. I tested a simple method and am sharing it here.

Let's start with a simple Spark Streaming job that reads data from a file and saves the results to a specified directory:

package SparkStream

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
  * Created by admin on 2019/3/21.
  * Purpose: demonstrates normal use of Spark Streaming.
  */
object SparkStreaming {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("stop spark streaming").setMaster("local[2]")
    //5-second batch interval
    val ssc = new StreamingContext(conf, Seconds(5))
    val file = ssc.textFileStream("C://test//stu1.txt")
    val res = file.map { line =>
      //fields are delimited by "|"
      val arr = line.split("\\|")
      (arr(0) + "88", 1)
    }.reduceByKey(_ + _)
    res.saveAsTextFiles("c://test//result")
    ssc.start()
    ssc.awaitTermination()
  }
}
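
To make the transformation concrete, here is a tiny plain-Scala sketch of what one batch computes. The input lines are invented for illustration; only the "|" delimiter and the arr(0) + "88" key come from the job above, and groupBy plus sum stands in for reduceByKey.

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    //hypothetical input lines for one batch
    val batch = Seq("tom|20|beijing", "tom|21|shanghai", "amy|30|shenzhen")
    val res = batch
      .map { line =>
        val arr = line.split("\\|")
        (arr(0) + "88", 1)
      }
      .groupBy(_._1)
      .map { case (key, pairs) => (key, pairs.map(_._2).sum) }
    println(res) //Map(tom88 -> 2, amy88 -> 1)
  }
}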

So how do we stop Spark Streaming gracefully? That is the focus of this article. The general idea:

In the driver, add a small polling loop that checks every 10 seconds whether a marker directory exists on HDFS. Once the directory is found, the loop calls the stop method of the StreamingContext, and the application terminates itself gracefully.

package SparkStream

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
  * Created by admin on 2019/3/21.
  * Purpose: demonstrates how to stop Spark Streaming gracefully.
  */
object StopSparkStreaming {
  //marker directory; creating it signals the job to stop
  val shutdownMarker = "c://test//source1"
  var stopFlag: Boolean = false

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("stop spark streaming").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
    val file = ssc.textFileStream("C://test//stu1.txt")
    val res = file.map { line =>
      val arr = line.split("\\|")
      (arr(0) + "88", 1)
    }.reduceByKey(_ + _)
    res.saveAsTextFiles("c://test//result")
    ssc.start()
    //polling interval in milliseconds
    val checkIntervalMillis = 10000
    var isStopped = false
    while (!isStopped) {
      println("calling awaitTerminationOrTimeout")
      //wait for execution to stop; any exception raised during processing is rethrown in this thread
      //returns true if the context has stopped, or false if the timeout elapsed while it is still running
      isStopped = ssc.awaitTerminationOrTimeout(checkIntervalMillis)
      if (isStopped) {
        println("confirmed! The streaming context is stopped. Exiting application...")
      } else {
        println("Streaming App is still running. Timeout...")
      }
      //check whether the marker directory exists
      checkShutdownMarker()
      if (!isStopped && stopFlag) {
        println("stopping ssc right now")
        //first true: also stop the underlying SparkContext, whether or not the StreamingContext itself has been started
        //second true: stop gracefully by waiting for the processing of all received data to complete
        ssc.stop(true, true)
        println("ssc is stopped!!!!!!!")
      }
    }
  }

  def checkShutdownMarker(): Unit = {
    if (!stopFlag) {
      //check whether the shutdown-marker directory exists
      val fs = FileSystem.get(new Configuration())
      //exists returns true if the path is present, false otherwise
      stopFlag = fs.exists(new Path(shutdownMarker))
    }
  }
}

First let the streaming job run normally for a while, then create the directory c://test//source1. The driver log is shown below (only partial records are kept):

Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:48:40 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:48:50 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:00 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:10 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:20 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:30 INFO InputInfoTracker: remove old batch metadata: 1553154505000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:40 INFO InputInfoTracker: remove old batch metadata: 1553154515000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:50 INFO InputInfoTracker: remove old batch metadata: 1553154525000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:00 INFO InputInfoTracker: remove old batch metadata: 1553154535000 ms
19/03/21 15:50:00 INFO BlockManager: Removing RDD 35
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:10 INFO InputInfoTracker: remove old batch metadata: 1553154545000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:20 INFO InputInfoTracker: remove old batch metadata: 1553154555000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:30 INFO InputInfoTracker: remove old batch metadata: 1553154565000 ms
Streaming App is still running. Timeout...
stopping ssc right now
19/03/21 15:50:32 INFO JobGenerator: Stopping JobGenerator gracefully
19/03/21 15:50:32 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation
19/03/21 15:50:32 INFO JobGenerator: Waited for all received blocks to be consumed for job generation
19/03/21 15:50:35 INFO FileInputDStream: Finding new files took 2 ms
19/03/21 15:50:35 INFO FileInputDStream: New files at time 1553154635000 ms:

19/03/21 15:50:35 INFO JobScheduler: Added jobs for time 1553154635000 ms
19/03/21 15:50:35 INFO JobScheduler: Starting job streaming job 1553154635000 ms.0 from job set of time 1553154635000 ms
19/03/21 15:50:35 INFO SparkContext: Starting job: saveAsTextFiles at StopSparkStreaming.scala:28
19/03/21 15:50:35 INFO DAGScheduler: Registering RDD 132 (map at StopSparkStreaming.scala:23)
19/03/21 15:50:35 INFO DAGScheduler: Got job 26 (saveAsTextFiles at StopSparkStreaming.scala:28) with 2 output partitions
19/03/21 15:50:35 INFO DAGScheduler: Final stage: ResultStage 53 (saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:35 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 52)
19/03/21 15:50:35 INFO DAGScheduler: Missing parents: List()
19/03/21 15:50:35 INFO DAGScheduler: Submitting ResultStage 53 (MapPartitionsRDD[134] at saveAsTextFiles at StopSparkStreaming.scala:28), which has no missing parents
19/03/21 15:50:35 INFO MemoryStore: Block broadcast_26 stored as values in memory (estimated size 64.5 KB, free 324.6 KB)
19/03/21 15:50:35 INFO MemoryStore: Block broadcast_26_piece0 stored as bytes in memory (estimated size 22.2 KB, free 346.8 KB)
19/03/21 15:50:35 INFO BlockManagerInfo: Added broadcast_26_piece0 in memory on localhost:50448 (size: 22.2 KB, free: 1121.9 MB)
19/03/21 15:50:35 INFO SparkContext: Created broadcast 26 from broadcast at DAGScheduler.scala:1006
19/03/21 15:50:35 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 53 (MapPartitionsRDD[134] at saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:35 INFO TaskSchedulerImpl: Adding task set 53.0 with 2 tasks
19/03/21 15:50:35 INFO TaskSetManager: Starting task 0.0 in stage 53.0 (TID 52, localhost, partition 0,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:35 INFO TaskSetManager: Starting task 1.0 in stage 53.0 (TID 53, localhost, partition 1,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:35 INFO Executor: Running task 0.0 in stage 53.0 (TID 52)
19/03/21 15:50:35 INFO Executor: Running task 1.0 in stage 53.0 (TID 53)
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
19/03/21 15:50:35 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0053_m_000001_53' to file:/c:/test/result-1553154635000/_temporary/0/task_201903211550_0053_m_000001
19/03/21 15:50:35 INFO SparkHadoopMapRedUtil: attempt_201903211550_0053_m_000001_53: Committed
19/03/21 15:50:35 INFO Executor: Finished task 1.0 in stage 53.0 (TID 53). 2080 bytes result sent to driver
19/03/21 15:50:35 INFO TaskSetManager: Finished task 1.0 in stage 53.0 (TID 53) in 277 ms on localhost (1/2)
19/03/21 15:50:35 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0053_m_000000_52' to file:/c:/test/result-1553154635000/_temporary/0/task_201903211550_0053_m_000000
19/03/21 15:50:35 INFO SparkHadoopMapRedUtil: attempt_201903211550_0053_m_000000_52: Committed
19/03/21 15:50:35 INFO Executor: Finished task 0.0 in stage 53.0 (TID 52). 2080 bytes result sent to driver
19/03/21 15:50:35 INFO TaskSetManager: Finished task 0.0 in stage 53.0 (TID 52) in 335 ms on localhost (2/2)
19/03/21 15:50:35 INFO TaskSchedulerImpl: Removed TaskSet 53.0, whose tasks have all completed, from pool 
19/03/21 15:50:35 INFO DAGScheduler: ResultStage 53 (saveAsTextFiles at StopSparkStreaming.scala:28) finished in 0.335 s
19/03/21 15:50:35 INFO DAGScheduler: Job 26 finished: saveAsTextFiles at StopSparkStreaming.scala:28, took 0.351543 s
19/03/21 15:50:35 INFO JobScheduler: Finished job streaming job 1553154635000 ms.0 from job set of time 1553154635000 ms
19/03/21 15:50:35 INFO JobScheduler: Total delay: 0.680 s for time 1553154635000 ms (execution: 0.630 s)
19/03/21 15:50:35 INFO ShuffledRDD: Removing RDD 128 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 128
19/03/21 15:50:35 INFO MapPartitionsRDD: Removing RDD 127 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 127
19/03/21 15:50:35 INFO MapPartitionsRDD: Removing RDD 126 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 126
19/03/21 15:50:35 INFO UnionRDD: Removing RDD 70 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 70
19/03/21 15:50:35 INFO FileInputDStream: Cleared 1 old files that were older than 1553154575000 ms: 1553154570000 ms
19/03/21 15:50:35 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
19/03/21 15:50:35 INFO InputInfoTracker: remove old batch metadata: 1553154570000 ms
19/03/21 15:50:40 INFO RecurringTimer: Stopped timer for JobGenerator after time 1553154640000
19/03/21 15:50:40 INFO FileInputDStream: Finding new files took 5 ms
19/03/21 15:50:40 INFO FileInputDStream: New files at time 1553154640000 ms:

19/03/21 15:50:40 INFO JobScheduler: Added jobs for time 1553154640000 ms
19/03/21 15:50:40 INFO JobScheduler: Starting job streaming job 1553154640000 ms.0 from job set of time 1553154640000 ms
19/03/21 15:50:40 INFO JobGenerator: Stopped generation timer
19/03/21 15:50:40 INFO JobGenerator: Waiting for jobs to be processed and checkpoints to be written
19/03/21 15:50:40 INFO SparkContext: Starting job: saveAsTextFiles at StopSparkStreaming.scala:28
19/03/21 15:50:40 INFO DAGScheduler: Registering RDD 137 (map at StopSparkStreaming.scala:23)
19/03/21 15:50:40 INFO DAGScheduler: Got job 27 (saveAsTextFiles at StopSparkStreaming.scala:28) with 2 output partitions
19/03/21 15:50:40 INFO DAGScheduler: Final stage: ResultStage 55 (saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:40 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 54)
19/03/21 15:50:40 INFO DAGScheduler: Missing parents: List()
19/03/21 15:50:40 INFO DAGScheduler: Submitting ResultStage 55 (MapPartitionsRDD[139] at saveAsTextFiles at StopSparkStreaming.scala:28), which has no missing parents
19/03/21 15:50:40 INFO MemoryStore: Block broadcast_27 stored as values in memory (estimated size 64.5 KB, free 411.2 KB)
19/03/21 15:50:40 INFO MemoryStore: Block broadcast_27_piece0 stored as bytes in memory (estimated size 22.2 KB, free 433.4 KB)
19/03/21 15:50:40 INFO BlockManagerInfo: Added broadcast_27_piece0 in memory on localhost:50448 (size: 22.2 KB, free: 1121.9 MB)
19/03/21 15:50:40 INFO SparkContext: Created broadcast 27 from broadcast at DAGScheduler.scala:1006
19/03/21 15:50:40 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 55 (MapPartitionsRDD[139] at saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:40 INFO TaskSchedulerImpl: Adding task set 55.0 with 2 tasks
19/03/21 15:50:40 INFO TaskSetManager: Starting task 0.0 in stage 55.0 (TID 54, localhost, partition 0,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:40 INFO TaskSetManager: Starting task 1.0 in stage 55.0 (TID 55, localhost, partition 1,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:40 INFO Executor: Running task 1.0 in stage 55.0 (TID 55)
19/03/21 15:50:40 INFO Executor: Running task 0.0 in stage 55.0 (TID 54)
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
19/03/21 15:50:40 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0055_m_000001_55' to file:/c:/test/result-1553154640000/_temporary/0/task_201903211550_0055_m_000001
19/03/21 15:50:40 INFO SparkHadoopMapRedUtil: attempt_201903211550_0055_m_000001_55: Committed
19/03/21 15:50:40 INFO Executor: Finished task 1.0 in stage 55.0 (TID 55). 2080 bytes result sent to driver
19/03/21 15:50:40 INFO TaskSetManager: Finished task 1.0 in stage 55.0 (TID 55) in 303 ms on localhost (1/2)
19/03/21 15:50:40 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0055_m_000000_54' to file:/c:/test/result-1553154640000/_temporary/0/task_201903211550_0055_m_000000
19/03/21 15:50:40 INFO SparkHadoopMapRedUtil: attempt_201903211550_0055_m_000000_54: Committed
19/03/21 15:50:40 INFO Executor: Finished task 0.0 in stage 55.0 (TID 54). 2080 bytes result sent to driver
19/03/21 15:50:40 INFO TaskSetManager: Finished task 0.0 in stage 55.0 (TID 54) in 359 ms on localhost (2/2)
19/03/21 15:50:40 INFO TaskSchedulerImpl: Removed TaskSet 55.0, whose tasks have all completed, from pool 
19/03/21 15:50:40 INFO DAGScheduler: ResultStage 55 (saveAsTextFiles at StopSparkStreaming.scala:28) finished in 0.361 s
19/03/21 15:50:40 INFO DAGScheduler: Job 27 finished: saveAsTextFiles at StopSparkStreaming.scala:28, took 0.378762 s
19/03/21 15:50:40 INFO JobScheduler: Finished job streaming job 1553154640000 ms.0 from job set of time 1553154640000 ms
19/03/21 15:50:40 INFO JobScheduler: Total delay: 0.772 s for time 1553154640000 ms (execution: 0.741 s)
19/03/21 15:50:40 INFO ShuffledRDD: Removing RDD 133 from persistence list
19/03/21 15:50:40 INFO MapPartitionsRDD: Removing RDD 132 from persistence list
19/03/21 15:50:40 INFO BlockManager: Removing RDD 132
19/03/21 15:50:40 INFO BlockManager: Removing RDD 133
19/03/21 15:50:40 INFO MapPartitionsRDD: Removing RDD 131 from persistence list
19/03/21 15:50:40 INFO BlockManager: Removing RDD 131
19/03/21 15:50:40 INFO UnionRDD: Removing RDD 75 from persistence list
19/03/21 15:50:40 INFO FileInputDStream: Cleared 1 old files that were older than 1553154580000 ms: 1553154575000 ms
19/03/21 15:50:40 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
19/03/21 15:50:40 INFO InputInfoTracker: remove old batch metadata: 1553154575000 ms
19/03/21 15:50:40 INFO BlockManager: Removing RDD 75
19/03/21 15:50:40 INFO JobGenerator: Waited for jobs to be processed and checkpoints to be written
19/03/21 15:50:40 INFO JobGenerator: Stopped JobGenerator
19/03/21 15:50:40 INFO JobScheduler: Stopped JobScheduler
19/03/21 15:50:40 INFO StreamingContext: StreamingContext stopped successfully
19/03/21 15:50:40 INFO SparkUI: Stopped Spark web UI at http://192.168.17.10:4040
19/03/21 15:50:40 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/03/21 15:50:41 INFO MemoryStore: MemoryStore cleared
19/03/21 15:50:41 INFO BlockManager: BlockManager stopped
19/03/21 15:50:41 INFO BlockManagerMaster: BlockManagerMaster stopped
19/03/21 15:50:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/03/21 15:50:41 INFO SparkContext: Successfully stopped SparkContext
ssc is stopped!!!!!!!
calling awaitTerminationOrTimeout
confirmed! The streaming context is stopped. Exiting application...
19/03/21 15:50:41 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
19/03/21 15:50:41 INFO ShutdownHookManager: Shutdown hook called
19/03/21 15:50:41 INFO ShutdownHookManager: Deleting directory C:\Users\admin\AppData\Local\Temp\spark-ec9347c3-5dbd-4f27-a52f-40e71d565521
19/03/21 15:50:41 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.

Process finished with exit code 0

It can be seen that from "19/03/21 15:49:10" to "19/03/21 15:50:30", the driver checks for the marker directory (c://test//source1) every 10 seconds. Because the directory does not exist yet, each check times out and the streaming job keeps running. Once the directory is detected, ssc.stop(true, true) is called to stop the context gracefully, and the while loop then exits.
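
In this test the marker directory was created by hand. It can also be created from code; here is a minimal sketch (the helper object CreateShutdownMarker is an assumption, not part of the original job) that uses the same Hadoop FileSystem API as checkShutdownMarker, so both sides agree on which file system holds the marker.

package SparkStream

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CreateShutdownMarker {
  def main(args: Array[String]): Unit = {
    //the same default Configuration as checkShutdownMarker: on a cluster
    //this resolves to HDFS, locally to the local file system
    val fs = FileSystem.get(new Configuration())
    val marker = new Path("c://test//source1")
    //mkdirs creates the marker directory (returns true on success)
    if (fs.mkdirs(marker)) {
      println(s"shutdown marker created: $marker")
    }
  }
}

Remember to delete the marker (for example with fs.delete(marker, true)) before the next run, otherwise the job will stop itself again right after starting.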

Origin: blog.csdn.net/zhaoxiangchong/article/details/88719500