Spark - Programming and Execution Principles

 

 

The following example illustrates how a chain of RDD transformations is turned into a DAG.

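// Two small pair RDDs with distinct keys
val rdd = sc.parallelize(List(('a',1),('a',2)))
val rdd2 = sc.parallelize(List(('b',1),('b',2)))

// union concatenates the two RDDs and keeps duplicates
val x1 = rdd union rdd2
val x2 = rdd union rdd2
val x3 = rdd union rdd2
val x4 = rdd union rdd2
// join pairs up values that share the same key
val a1 = x1 join x2
val a2 = x3 join x4
val a3 = a1 union a2
// collect is the action that triggers execution
a3.collect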

Result:
res14: Array[(Char, (Int, Int))] = Array((a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)))
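
To see why the result has 16 elements, the same union and join steps can be modeled with plain Scala collections. This is only a sketch of the semantics, not how Spark executes the job: `innerJoin` is a hypothetical helper written for this illustration, and the names `left`, `right`, `unioned`, `joined` and `result` are not part of the original example. It assumes that `union` keeps duplicates and that `join` emits one pair for every combination of values sharing a key.

```scala
// Plain Scala collections standing in for the two RDDs (no Spark needed).
val left  = List(('a', 1), ('a', 2))
val right = List(('b', 1), ('b', 2))

// Hypothetical helper that mimics an inner join on key-value pairs:
// every left value is paired with every right value that has the same key.
def innerJoin[K, V, W](l: Seq[(K, V)], r: Seq[(K, W)]): Seq[(K, (V, W))] =
  for {
    (k1, v) <- l
    (k2, w) <- r
    if k1 == k2
  } yield (k1, (v, w))

val unioned = left ++ right               // 4 pairs, duplicates are kept
val joined  = innerJoin(unioned, unioned) // 2 keys x (2 x 2) combinations = 8 pairs
val result  = joined ++ joined            // union of two identical joins = 16 pairs

println(result.size)               // prints 16
println(result.count(_._1 == 'a')) // prints 8
```

Each key contributes 2 x 2 = 4 pairs per join, so one join yields 8 pairs and the final union yields 16, matching the output above. The order of elements returned by Spark may differ from this sketch, because it depends on how the data is partitioned.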

The DAG diagram is as follows:

[Figure: DAG of the job above]

Reference: Simple example of Spark
