Spark broadcast variables and accumulators

I. Broadcast variable illustration

  

II. Code 

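import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
conf.setMaster("local").setAppName("brocast")
val sc = new SparkContext(conf)
// Small lookup list that lives on the Driver
val list = List("hello xasxt")
// Ship one read-only copy of the list to each Executor
val broadCast = sc.broadcast(list)
val lineRDD = sc.textFile("./words.txt")
// Keep only the lines contained in the broadcast list and print them
lineRDD.filter { x => broadCast.value.contains(x) }.foreach { println }
sc.stop()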

III. Precautions

  1. An RDD itself cannot be broadcast, because an RDD (resilient distributed dataset) does not store data. The result of an RDD can be broadcast: collect it to the Driver first, and make sure the collected data is not too large.

  2. Broadcast variables can only be defined on the Driver side, not on the Executor side.

  3. The value of a broadcast variable can be modified on the Driver side, but not on the Executor side.
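
Precaution 1 can be sketched as follows: collect the result of a small RDD to the Driver, then broadcast the collected value. This is a minimal sketch only. The object name BroadcastCollectedResult, the helper filterAllowed and the sample strings are hypothetical and chosen for illustration; it assumes Spark is on the classpath and runs in local mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast

// Hypothetical example object, not part of the original post.
object BroadcastCollectedResult {

  // Keeps only the lines whose text appears in the (small) allowed data set.
  def filterAllowed(sc: SparkContext, lines: Seq[String], allowed: Seq[String]): Array[String] = {
    val allowedRDD = sc.parallelize(allowed)
    // An RDD cannot be broadcast directly: collect its result to the Driver first.
    // This is only reasonable when the collected data is small.
    val allowedSet: Set[String] = allowedRDD.collect().toSet
    // The broadcast variable is defined on the Driver side.
    val allowedBc: Broadcast[Set[String]] = sc.broadcast(allowedSet)
    // On the Executor side the value is only read, never modified.
    sc.parallelize(lines).filter(line => allowedBc.value.contains(line)).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("broadcastCollected")
    val sc = new SparkContext(conf)
    try {
      filterAllowed(sc, Seq("hello xasxt", "other line"), Seq("hello xasxt")).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Collecting to a Set rather than a List makes each contains lookup on the Executors cheaper; this is a design choice of the sketch, not something the original code requires.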

IV. Accumulator illustration

  

V. Code

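import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
conf.setMaster("local").setAppName("accumulator")
val sc = new SparkContext(conf)
// Accumulator defined on the Driver with initial value 0
// (sc.accumulator is deprecated since Spark 2.0 in favor of sc.longAccumulator)
val accumulator = sc.accumulator(0)
// Each Executor task adds 1 per line of the file
sc.textFile("./words.txt").foreach { x => accumulator.add(1) }
// Read the total on the Driver
println(accumulator.value)
sc.stop()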

VI. Precautions

  1. The accumulator is defined and assigned its initial value on the Driver side. Its value can only be read on the Driver side; Executors can only update it.
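
The division of roles in this precaution can be sketched as follows. This is a minimal sketch that assumes Spark 2.0 or later, where sc.longAccumulator is available. The object name AccumulatorSketch, the helper countEmpty and the sample lines are hypothetical and used only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object, not part of the original post.
object AccumulatorSketch {

  // Counts empty lines with an accumulator and returns the total read on the Driver.
  def countEmpty(sc: SparkContext, lines: Seq[String]): Long = {
    // Driver side: define the accumulator; a long accumulator starts at 0.
    val emptyLines = sc.longAccumulator("emptyLines")
    // Executor side: tasks only call add(); they do not read the value.
    sc.parallelize(lines).foreach { line =>
      if (line.isEmpty) emptyLines.add(1)
    }
    // Driver side: read the merged total after the foreach action has finished.
    val total: Long = emptyLines.value
    total
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("accumulatorSketch")
    val sc = new SparkContext(conf)
    try {
      println(countEmpty(sc, Seq("hello", "", "world", "")))
    } finally {
      sc.stop()
    }
  }
}
```

The accumulator is updated inside foreach, which is an action, so the value is complete by the time the Driver reads it.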

  


Origin www.cnblogs.com/yszd/p/11228392.html