Repost: Spark RDD Operator Exercises

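Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Article link: https://blog.csdn.net/qq_40825218/article/details/83720732
The following data is given: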

12 张三 25 男 chinese 50
12 张三 25 男 math 60
12 张三 25 男 english 70
12 李四 20 男 chinese 50
12 李四 20 男 math 50
12 李四 20 男 english 50
12 王芳 19 女 chinese 70
12 王芳 19 女 math 70
12 王芳 19 女 english 70
13 张大三 25 男 chinese 60
13 张大三 25 男 math 60
13 张大三 25 男 english 70
13 李大四 20 男 chinese 50
13 李大四 20 男 math 60
13 李大四 20 男 english 50
13 王小芳 19 女 chinese 70
13 王小芳 19 女 math 80
13 王小芳 19 女 english 70
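
Every exercise below splits a line on spaces and picks fields by index. As a minimal sketch, the same layout can be captured once in a case class. The field names (`classId`, `name`, `age`, `sex`, `subject`, `score`) and the `ScoreParser` helper are assumptions inferred from how the exercises index each line; they are not part of the original post.

```scala
import scala.util.Try

// Hypothetical record type; field order follows the columns of the sample data.
final case class ScoreRecord(
  classId: String,
  name: String,
  age: Int,
  sex: String,
  subject: String,
  score: Int
)

object ScoreParser {
  // Parse one whitespace-separated line; returns None if the line does not
  // have exactly six fields or if age/score are not integers.
  def parse(line: String): Option[ScoreRecord] =
    line.trim.split("\\s+") match {
      case Array(c, n, a, s, sub, sc) =>
        for {
          age   <- Try(a.toInt).toOption
          score <- Try(sc.toInt).toOption
        } yield ScoreRecord(c, n, age, s, sub, score)
      case _ => None
    }
}
```

In a Spark shell, something like `file.flatMap(ScoreParser.parse(_))` would then give an RDD of records, so filters could read `_.age < 20` instead of `_.split(",")(2).toInt < 20`.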
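The requirements are as follows:

1. How many people took the exam in total?
val file = sc.textFile("file:///jar/score")
// Keep only "class,name" so that each student collapses to one entry after distinct
val name = file.map(x => {val line = x.split(" ");line(0) + "," + line(1)})
val numPeo = name.distinct.count()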


1.1 How many people under 20 took the exam?
val file = sc.textFile("file:///jar/score")
val age = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) + "," + line(2)})
val numPeo = age.distinct.filter(_.split(",")(2).toInt<20).count()

1.2 How many people aged exactly 20 took the exam?
val file = sc.textFile("file:///jar/score")
val age = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) + "," + line(2)})
val numPeo = age.distinct.filter(_.split(",")(2).toInt == 20).count()

1.3 How many people over 20 took the exam?
val file = sc.textFile("file:///jar/score")
val age = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) + "," + line(2)})
val numPeo = age.distinct.filter(_.split(",")(2).toInt > 20).count()

2. How many boys took the exam?
val file = sc.textFile("file:///jar/score")
val sex = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) + "," + line(3)})
val numPeo = sex.distinct.filter(_.split(",")(2) == "男").count()

2.1 How many girls took the exam?
val file = sc.textFile("file:///jar/score")
val sex = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) + "," + line(3)})
val numPeo = sex.distinct.filter(_.split(",")(2) == "女").count()

3. How many people from class 12 took the exam?
val file = sc.textFile("file:///jar/score")
val classNum = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) })
val numPeo = classNum.distinct.filter(_.split(",")(0).toInt == 12).count()
sc.makeRDD(Array(numPeo)).saveAsTextFile("file:///jar/result/class12numPeo")

3.1 How many people from class 13 took the exam?
val file = sc.textFile("file:///jar/score")
val classNum = file.map(x => {val line = x.split(" ");line(0) + "," + line(1) })
val numPeo = classNum.distinct.filter(_.split(",")(0).toInt == 13).count()
sc.makeRDD(Array(numPeo)).saveAsTextFile("file:///jar/result/class13numPeo")

4. What is the average Chinese score?
val chineseLine = file.map(x => {val line = x.split(" "); line(4)+ "," + line(5)})
val chineseGennal = chineseLine.filter(_.split(",")(0) == "chinese")
val chineseLength = chineseGennal.count.toInt // 6 records
val chineseSum = chineseGennal.map(_.split(",")(1).toInt).reduce(_ + _) // 350
val chineseAvg = chineseSum/chineseLength // 58 (integer division drops the fraction of 58.33)
sc.makeRDD(Array(chineseGennal.map(_.split(",")(1).toInt)
                              .reduce(_ + _)/chineseGennal.count.toInt))
                              .saveAsTextFile("file:///jar/result/chineseAvg")

4.1 What is the average math score?
val mathLine = file.map(x => {val line = x.split(" "); line(4)+ "," + line(5)})
val mathGennal = mathLine.filter(_.split(",")(0) == "math")
val mathLength = mathGennal.count.toInt
val mathSum = mathGennal.map(_.split(",")(1).toInt).reduce(_ + _)
val mathAvg = mathSum/mathLength
sc.makeRDD(Array(mathGennal.map(_.split(",")(1).toInt)
                           .reduce(_ + _)/mathGennal.count.toInt))
                           .saveAsTextFile("file:///jar/result/mathAvg")

4.2 What is the average English score?
val englishLine = file.map(x => {val line = x.split(" "); line(4)+ "," + line(5)})
val englishGennal = englishLine.filter(_.split(",")(0) == "english")
val englishLength = englishGennal.count.toInt
val englishSum = englishGennal.map(_.split(",")(1).toInt).reduce(_ + _)
val englishAvg = englishSum/englishLength
sc.makeRDD(Array(englishGennal.map(_.split(",")(1).toInt)
                              .reduce(_ + _)/englishGennal.count.toInt))
                              .saveAsTextFile("file:///jar/result/englishAvg")
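
The averages in section 4 divide one `Int` by another, so the fractional part is discarded. The sketch below uses plain Scala collections rather than an RDD to compare the truncated and exact results for the six Chinese scores in the sample data. `AverageDemo` and its method names are illustrative only, and the input is assumed to be non-empty.

```scala
object AverageDemo {
  // Chinese scores copied from the sample data.
  val chineseScores: List[Int] = List(50, 50, 70, 60, 50, 70)

  // Int / Int discards the fractional part (assumes xs is non-empty).
  def truncatedAvg(xs: List[Int]): Int = xs.sum / xs.size

  // Converting the sum to Double first keeps the fraction.
  def exactAvg(xs: List[Int]): Double = xs.sum.toDouble / xs.size

  def main(args: Array[String]): Unit = {
    println(truncatedAvg(chineseScores)) // 58
    println(exactAvg(chineseScores))     // roughly 58.33
  }
}
```

The same change applies to the RDD version: calling `.toDouble` on the sum before dividing gives the exact average.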

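5. What is each person's average score?
val scoreLine = file.map(x => {val line = x.split(" "); (line(0)+","+line(1),line(5).toInt)})
// Accumulate (sum, count) per student, then divide (integer division)
val perScore = scoreLine.map(a => (a._1,(a._2,1)))
                        .reduceByKey((a,b) => (a._1+b._1,a._2+b._2))
                        .map(y => (y._1,y._2._1/y._2._2))
// Save in a separate statement so that perScore stays an RDD instead of Unit
perScore.saveAsTextFile("file:///jar/result/perScore")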

6. What is the average score of class 12?
val classScore12 = file.map(x => {val line = x.split(" "); (line(0),line(5).toInt)})
                       .filter(a =>(a._1 == "12"))
classScore12.map(a => (a._1,(a._2,1)))
            .reduceByKey((a,b) => (a._1+b._1,a._2+b._2))
            .map(y => (y._1,y._2._1/y._2._2))//12,60
            .saveAsTextFile("file:///jar/result/perClass12")

6.1 What is the overall average score of the boys in class 12?
val BoyclassScore12 = file.map(x => {val line = x.split(" "); (line(0) + "," + line(3) + "," + line(5).toInt)}).filter(_.split(",")(0) == "12").filter(_.split(",")(1)=="男")
val BoyclassScore12Num = BoyclassScore12.count//6
val BoyclassScore12Sum= BoyclassScore12.map(y => {val row = y.split(",");row(2).toInt}).reduce(_+_)//330
val BoyperClass12 = BoyclassScore12Sum/BoyclassScore12Num//55


6.2 What is the overall average score of the girls in class 12?
val GirlclassScore12 = file.map(x => {val line = x.split(" "); (line(0) + "," + line(3) + "," + line(5).toInt)}).filter(_.split(",")(0) == "12").filter(_.split(",")(1)=="女")
val GirlclassScore12Num = GirlclassScore12.count//3
val GirlclassScore12Sum= GirlclassScore12.map(y => {val row = y.split(",");row(2).toInt}).reduce(_+_)//210
val GirlperClass12 = GirlclassScore12Sum/GirlclassScore12Num//70

6.3.0 What is the average score of class 13?
val classScore13 = file.map(x => {val line = x.split(" "); (line(0),line(5).toInt)}).filter(a =>(a._1 == "13"))
val perClass13 = classScore13.map(a => (a._1,(a._2,1))).reduceByKey((a,b) => (a._1+b._1,a._2+b._2)).map(y => (y._1,y._2._1/y._2._2))//13,63

6.3.1 What is the overall average score of the boys in class 13?
val BoyclassScore13 = file.map(x => {val line = x.split(" "); (line(0) + "," + line(3) + "," + line(5).toInt)}).filter(_.split(",")(0) == "13").filter(_.split(",")(1)=="男")
val BoyclassScore13Num = BoyclassScore13.count//6
val BoyclassScore13Sum= BoyclassScore13.map(y => {val row = y.split(",");row(2).toInt}).reduce(_+_)//350
val BoyperClass13 = BoyclassScore13Sum/BoyclassScore13Num//58

6.3.2 What is the overall average score of the girls in class 13?
val GirlclassScore13 = file.map(x => {val line = x.split(" "); (line(0) + "," + line(3) + "," + line(5).toInt)}).filter(_.split(",")(0) == "13").filter(_.split(",")(1)=="女")
val GirlclassScore13Num = GirlclassScore13.count//3
val GirlclassScore13Sum= GirlclassScore13.map(y => {val row = y.split(",");row(2).toInt}).reduce(_+_)//220
val GirlperClass13 = GirlclassScore13Sum/GirlclassScore13Num//73

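7. What is the highest Chinese score in the whole school?
val chineseLine = file.map(x => {val line = x.split(" "); line(4)+ "," + line(5)})
// max compares the "subject,score" strings lexicographically,
// which is only safe while every score has the same number of digits
val chineseMax = chineseLine.distinct
                            .filter(_.split(",")(0) == "chinese")
                            .max // "chinese,70"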
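
Calling `max` on "subject,score" strings compares them character by character, so it can pick the wrong row once scores differ in length. The sketch below shows this on plain Scala collections. The rows are hypothetical (a score of 100 does not occur in the sample data) and `MaxPitfall` is an illustrative name.

```scala
object MaxPitfall {
  // Hypothetical rows; the 100 is added only to expose the problem.
  val rows: List[String] = List("chinese,70", "chinese,100", "chinese,50")

  // String ordering is lexicographic: '7' sorts after '1', so "chinese,70" wins.
  def stringMax(xs: List[String]): String = xs.max

  // Extract the score as an Int before comparing.
  def numericMax(xs: List[String]): Int = xs.map(_.split(",")(1).toInt).max

  def main(args: Array[String]): Unit = {
    println(stringMax(rows))  // chinese,70
    println(numericMax(rows)) // 100
  }
}
```

On an RDD the same idea applies: map each row to its numeric score (or to a tuple ordered by score) before calling `max` or `min`.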


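7.1 What is the lowest Chinese score in class 12?
val chineseLine12 = file.map(x => {val line = x.split(" "); line(0)+ "," + line(4)+ "," + line(5)})
// min compares the "class,subject,score" strings lexicographically
val chineseMin12 = chineseLine12.distinct
                                .filter(_.split(",")(0).toInt == 12)
                                .filter(_.split(",")(1) == "chinese")
                                .min // "12,chinese,50"
// min returns a plain String, so wrap it in an RDD before saving
sc.makeRDD(Array(chineseMin12)).saveAsTextFile("file:///jar/result/chineseMin12")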

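// Alternative for question 7: compare the scores as Int instead of as strings.
// Saved under chineseMax (an assumed directory name), because saveAsTextFile
// fails if the target directory, here chineseMin12, already exists.
val chineseMax = file.map(x => {val line = x.split(" "); (line(4),line(5).toInt)})
sc.makeRDD(Array(chineseMax.filter( a=>(a._1.equals("chinese")))
                           .map(a => (a._2)).max))
                           .saveAsTextFile("file:///jar/result/chineseMax")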

7.2 What is the highest math score in class 13?
val mathLine13 = file.map(x => {val line = x.split(" "); line(0)+ "," + line(4)+ "," + line(5)})
val mathMax13 = mathLine13.distinct.filter(_.split(",")(0).toInt == 13).filter(_.split(",")(1) == "math").max//mathMax13: String = 13,math,80

val mathLine13 = file.map(x => {val line = x.split(" "); (line(0)+ "," + line(4),line(5).toInt)})
val mathMax13 = mathLine13.filter(a => (a._1.split(",")(1).equals("math")) && (a._1.split(",")(0).equals("13"))).max//mathMax13: (String, Int) = (13,math,80)

8. How many girls in class 12 have a total score above 150?
val sumScore12Line = file.map(x => {val line = x.split(" "); (line(0)+","+line(1)+","+line(3),line(5).toInt)})
val sumScore12Dayu150 = sumScore12Line.reduceByKey(_+_).filter(a => (a._2>150 && a._1.split(",")(0).equals("12") && a._1.split(",")(2).equals("女"))).count

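9. What is the average score of each student whose total score is above 150, whose math score is at least 70, and who is at least 19 years old?
val complex1 = file.map(x => {val line = x.split(" "); (line(0)+","+line(1)+","+line(3),line(5).toInt)})
// The age (line(2)) is appended to the key so that it can be filtered on below
val complex2 = file.map(x => {val line = x.split(" "); (line(0)+","+line(1)+","+line(3)+","+line(4)+","+line(2),line(5).toInt)})
// Keep students whose total score is above 150 and compute their average score
val com1 = complex1.map(a => (a._1, (a._2, 1))).reduceByKey((a,b) => (a._1+b._1,a._2+b._2)).filter(a => (a._2._1>150)).map(t => (t._1,t._2._1/t._2._2))
// Keep students whose math score is at least 70 and whose age is at least 19
val com2 = complex2.filter(a => {val line = a._1.split(","); line(3).equals("math") && a._2>=70 && line(4).toInt>=19})
                   .map(a => {val line2 = a._1.split(","); (line2(0)+","+line2(1)+","+line2(2),a._2.toInt)})

// Join on "class,name,sex" and keep only the average score from com1
(com1).join(com2).map(a =>(a._1,a._2._1))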
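
To sanity-check question 9 without a cluster, the three conditions as stated in the question (total above 150, math at least 70, age at least 19) can be applied to the sample data with plain Scala collections. This is only a cross-check sketch, not the RDD solution. `Question9Check` and `qualifying` are illustrative names, and the tuple layout (class, name, age, sex, subject, score) is an assumption that mirrors the column order of the data.

```scala
object Question9Check {
  // (class, name, age, sex, subject, score) rows copied from the sample data.
  val rows: List[(String, String, Int, String, String, Int)] = List(
    ("12", "张三", 25, "男", "chinese", 50), ("12", "张三", 25, "男", "math", 60),
    ("12", "张三", 25, "男", "english", 70), ("12", "李四", 20, "男", "chinese", 50),
    ("12", "李四", 20, "男", "math", 50), ("12", "李四", 20, "男", "english", 50),
    ("12", "王芳", 19, "女", "chinese", 70), ("12", "王芳", 19, "女", "math", 70),
    ("12", "王芳", 19, "女", "english", 70), ("13", "张大三", 25, "男", "chinese", 60),
    ("13", "张大三", 25, "男", "math", 60), ("13", "张大三", 25, "男", "english", 70),
    ("13", "李大四", 20, "男", "chinese", 50), ("13", "李大四", 20, "男", "math", 60),
    ("13", "李大四", 20, "男", "english", 50), ("13", "王小芳", 19, "女", "chinese", 70),
    ("13", "王小芳", 19, "女", "math", 80), ("13", "王小芳", 19, "女", "english", 70)
  )

  // Group by (class, name), keep students meeting all three conditions,
  // and map each to an integer average, as the RDD code does.
  def qualifying: Map[(String, String), Int] =
    rows.groupBy(r => (r._1, r._2)).collect {
      case (key, rs)
          if rs.map(_._6).sum > 150 &&
            rs.exists(r => r._5 == "math" && r._6 >= 70) &&
            rs.head._3 >= 19 =>
        key -> (rs.map(_._6).sum / rs.size)
    }

  def main(args: Array[String]): Unit =
    qualifying.foreach(println)
}
```

For the sample data this leaves two students, 王芳 of class 12 with an average of 70 and 王小芳 of class 13 with an average of 73.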
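----------------
Copyright notice: This is an original article by CSDN blogger 「王峥jeff」, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/qq_40825218/article/details/83720732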

Reposted from blog.csdn.net/wdr2003/article/details/102599686