Spark flatMap

val rdd1 = sc.parallelize(Seq(
  "one two three four five six seven",
  "one two three four five six seven",
  "one two three four five six seven"))

Then:

rdd1.map(_.split(" ")).collect

Result (map yields one array per input string):
Array[Array[String]] = Array(Array(one, two, three, four, five, six, seven),
Array(one, two, three, four, five, six, seven),
Array(one, two, three, four, five, six, seven))

rdd1.flatMap(_.split(" ")).collect

Result (flatMap flattens the per-string arrays into a single collection of words):
Array[String] = Array(one, two, three, four, five, six, seven,
one, two, three, four, five, six, seven,
one, two, three, four, five, six, seven)
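
The same difference can be reproduced without a Spark cluster. Below is a minimal sketch using plain Scala collections instead of an RDD; the object name FlatMapSketch and the shortened input lines are made up for illustration and are not part of the original example. It shows that flatMap behaves like map followed by flatten.

```scala
object FlatMapSketch {
  // Hypothetical, shortened input; a local Seq stands in for the RDD.
  val lines: Seq[String] = Seq("one two three", "four five")

  // map keeps exactly one output element per input element,
  // so the result is nested: one Seq of words per line.
  val mapped: Seq[Seq[String]] = lines.map(_.split(" ").toSeq)

  // flatMap applies the same function, then concatenates the
  // per-line results into a single flat Seq of words.
  val flattened: Seq[String] = lines.flatMap(_.split(" ").toSeq)

  // The same result written as two explicit steps.
  val viaFlatten: Seq[String] = mapped.flatten

  def main(args: Array[String]): Unit = {
    println(mapped.map(_.mkString("[", ", ", "]")).mkString(", "))
    // prints: [one, two, three], [four, five]
    println(flattened.mkString(", "))
    // prints: one, two, three, four, five
    println(flattened == viaFlatten)
    // prints: true
  }
}
```

On an RDD the method names are the same (rdd1.map, rdd1.flatMap); the only difference here is that the work runs locally rather than across partitions.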


Reposted from blog.csdn.net/guotong1988/article/details/104048197