To illustrate with an example:
// Two small pair RDDs with disjoint keys
val rdd = sc.parallelize(List(('a', 1), ('a', 2)))
val rdd2 = sc.parallelize(List(('b', 1), ('b', 2)))
// Four identical unions of the two inputs
val x1 = rdd union rdd2
val x2 = rdd union rdd2
val x3 = rdd union rdd2
val x4 = rdd union rdd2
// Join the unions pairwise, then union the two join results
val a1 = x1 join x2
val a2 = x3 join x4
val a3 = a1 union a2
a3.collect
// result
res14: Array[(Char, (Int, Int))] = Array((a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)))
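To see why every pair shows up twice in the result, the same computation can be reproduced with plain Scala collections, without Spark. This is only a sketch: it assumes the operation that combines rdd and rdd2 is a union, and the join helper is a hypothetical local stand-in for the RDD join, not Spark's implementation. The element order will differ from the Spark output, since Spark's order depends on partitioning.

```scala
object UnionJoinSketch {
  // Hypothetical local stand-in for an inner join on pair collections:
  // emits (key, (leftValue, rightValue)) for every pair of records sharing a key.
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (k, v)  <- left
      (k2, w) <- right
      if k == k2
    } yield (k, (v, w))

  val rdd  = Seq(('a', 1), ('a', 2))
  val rdd2 = Seq(('b', 1), ('b', 2))

  // x1..x4 in the original are all the same union, so one value is enough here.
  val x = rdd ++ rdd2   // 4 records: two for 'a', two for 'b'

  val a1 = join(x, x)   // 2 x 2 combinations per key = 8 records
  val a2 = join(x, x)   // the same 8 records again
  val a3 = a1 ++ a2     // union keeps duplicates: 16 records

  def main(args: Array[String]): Unit = {
    println(a3.size)                 // prints 16
    println(a3.count(_._1 == 'a'))   // prints 8
    println(a3.count(_._1 == 'b'))   // prints 8
  }
}
```

Each join contributes the four value combinations per key once, and the final union does not deduplicate, which accounts for the 16 elements in the collected array.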
The resulting DAG is as follows:
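The lineage behind such a DAG can also be printed as text with the RDD method toDebugString, which is handy when the diagram from the Spark UI is not at hand. The sketch below assumes a local master, a hypothetical object name LineageSketch, and that the operation combining rdd and rdd2 is a union; the exact text printed varies with the Spark version and partition counts.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LineageSketch {
  // Rebuilds the example's final RDD (assumption: the inputs are combined with union).
  def build(sc: SparkContext): RDD[(Char, (Int, Int))] = {
    val rdd  = sc.parallelize(List(('a', 1), ('a', 2)))
    val rdd2 = sc.parallelize(List(('b', 1), ('b', 2)))
    val x1 = rdd union rdd2
    val x2 = rdd union rdd2
    val x3 = rdd union rdd2
    val x4 = rdd union rdd2
    val a1 = x1 join x2
    val a2 = x3 join x4
    a1 union a2
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for running outside a cluster.
    val conf = new SparkConf().setAppName("lineage-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val a3 = build(sc)
      // Prints the chain of parent RDDs; indentation marks shuffle boundaries.
      println(a3.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where sc already exists, calling a3.toDebugString directly on the RDD from the example gives the same information.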
References