WordCount example
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("GroupAndReduce").setMaster("local")
val sc = new SparkContext(conf)
val words = Array("one", "two", "two", "three", "three", "three")
val wordsRDD = sc.parallelize(words).map(word => (word, 1))
val wordsCountWithGroup = wordsRDD.
  groupByKey().                        // after groupByKey, pair._2 is already a collection of all the values for that key
  map(pair => (pair._1, pair._2.sum)). // pair._1 is the word, pair._2 the collection of 1s
  collect().
  foreach(println)
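The grouping-then-summing step can be sanity-checked with plain Scala collections, without a Spark runtime; `groupBy` here plays the role of `groupByKey` (variable names are illustrative only):

```scala
// Same word-count logic on a local collection: pair each word with 1,
// group the pairs by word, then sum the 1s in each group.
val words = Array("one", "two", "two", "three", "three", "three")
val counts = words.map(w => (w, 1))
  .groupBy(_._1)  // Map[String, Array[(String, Int)]]
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
// counts("three") == 3, counts("two") == 2, counts("one") == 1
```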
To instead aggregate the strings ("abc") for each key into a single string:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("GroupAndReduce").setMaster("local")
val sc = new SparkContext(conf)
val words = Array("one", "two", "two", "three", "three", "three")
val wordsRDD = sc.parallelize(words).map(row => (row, "abc"))
val wordsCountWithGroup = wordsRDD.
  groupByKey().
  map(pair => {
    // join all values for this key into one sorted, "@@@"-separated string
    val onestr = pair._2.toArray.sorted.mkString("@@@")
    (pair._1, onestr)
  }).
  collect().
  foreach(println)
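As above, the string-joining version can be mirrored with plain Scala collections to see what each key ends up with (a local sketch, not the RDD code itself):

```scala
// Tag every word with "abc", group by word, then sort and join each
// group's values with "@@@" -- mirroring the groupByKey/map chain above.
val words = Array("one", "two", "two", "three", "three", "three")
val joined = words.map(w => (w, "abc"))
  .groupBy(_._1)
  .map { case (word, vals) => (word, vals.map(_._2).sorted.mkString("@@@")) }
// joined("three") == "abc@@@abc@@@abc"
```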