Spark: Programming and Execution Principles

An example illustrates the idea:

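// Two small pair RDDs, keyed by 'a' and 'b'
val rdd = sc.parallelize(List(('a',1),('a',2)))
val rdd2 = sc.parallelize(List(('b',1),('b',2)))

// Four identical unions; each holds all four records
val x1 = rdd union rdd2
val x2 = rdd union rdd2
val x3 = rdd union rdd2
val x4 = rdd union rdd2
// Inner join by key: each key yields 2 x 2 value pairs
val a1 = x1 join x2
val a2 = x3 join x4
val a3 = a1 union a2
// collect is the action that triggers execution of the whole graph
a3.collect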

# Result
res14: Array[(Char, (Int, Int))] = Array((a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (a,(1,1)), (a,(1,2)), (a,(2,1)), (a,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)), (b,(1,1)), (b,(1,2)), (b,(2,1)), (b,(2,2)))
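
To see where the 16 records come from, the same union and join semantics can be mimicked on plain Scala collections. This is only a sketch: `JoinSketch` and `innerJoin` are hypothetical names introduced for illustration, not Spark APIs, and the element order may differ from what Spark returns because Spark orders the output by partition.

```scala
// Plain-Scala sketch of the union/join semantics above (no Spark required).
// JoinSketch and innerJoin are illustrative names, not part of Spark.
object JoinSketch {
  // Inner join by key: every left value is paired with every right value
  // that shares the same key.
  def innerJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (k, v)  <- left
      (k2, w) <- right
      if k == k2
    } yield (k, (v, w))

  val rdd  = List(('a', 1), ('a', 2))
  val rdd2 = List(('b', 1), ('b', 2))

  // union simply concatenates: 4 records, 2 per key
  val x = rdd ++ rdd2

  // 2 keys, each with 2 x 2 value combinations: 8 records
  val a1 = innerJoin(x, x)
  val a2 = innerJoin(x, x)

  // union of the two joins: 8 + 8 = 16 records, duplicates kept
  val a3 = a1 ++ a2
}
```

Each key contributes 2 x 2 = 4 pairs per join, and because union does not deduplicate, the two identical joins together give the 16 records in the result.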

The DAG looks like this:
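
The same dependency graph can also be inspected as text with `RDD.toDebugString`, which prints the lineage of an RDD. The following is a minimal sketch, assuming a standalone local run: the `local[2]` master, the application name, and the `LineageSketch` object are assumptions made for illustration. In spark-shell, the existing `sc` can be passed to `lineage` directly.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// LineageSketch is a hypothetical wrapper for illustration only.
object LineageSketch {
  // Rebuilds the example graph and returns its lineage as text.
  def lineage(sc: SparkContext): String = {
    val rdd  = sc.parallelize(List(('a', 1), ('a', 2)))
    val rdd2 = sc.parallelize(List(('b', 1), ('b', 2)))
    val a1 = (rdd union rdd2) join (rdd union rdd2)
    val a2 = (rdd union rdd2) join (rdd union rdd2)
    val a3 = a1 union a2
    // No action is run here; toDebugString only describes the dependencies.
    a3.toDebugString
  }

  def main(args: Array[String]): Unit = {
    // Master and app name are assumptions for a local test run.
    val conf = new SparkConf().setMaster("local[2]").setAppName("lineage-sketch")
    val sc = new SparkContext(conf)
    try println(lineage(sc))
    finally sc.stop()
  }
}
```

In the printed lineage, `union` appears as a narrow dependency on its parents, while each `join` introduces shuffle dependencies, which is where Spark splits the job into stages.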

 

 

References

Spark Simple Examples

 

 

 

 

 

 

 

 


Reposted from xxniao.iteye.com/blog/2324202