Spark custom sorting rules (1)

1. A case class that implements custom sorting needs to extend the Ordered trait; it does not need to extend Serializable, and objects are created without `new` (a minimal sketch follows this list).
2. A regular class that implements custom sorting needs to extend the Ordered trait and also Serializable (this article uses a regular class).
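
A minimal sketch of the case-class variant from rule 1, assuming a hypothetical name Person2 (the article itself uses a regular class below):

// Case class: extends Ordered only; Serializable is mixed in automatically for case classes,
// and no `new` is needed when constructing instances
case class Person2(name: String, age: Int, fv: Int) extends Ordered[Person2] {
  override def compare(that: Person2): Int =
    if (this.fv != that.fv) that.fv - this.fv else that.age - this.age
}

// Usage: val p = Person2("mimi1", 22, 86)   // no `new`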




Fields: name, age, and fv (face value).
The input is Array("mimi1 22 86", "mimi2 22 86", "mimi3 23 87").
Sort in descending order of fv, then in descending order of age.

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object CustomSort_1 {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName(this.getClass.getName).setMaster("local[2]")
    val sc: SparkContext = new SparkContext(conf)
    val userInfo: RDD[String] = sc.parallelize(Array("mimi1 22 86", "mimi2 22 86", "mimi3 23 87"))
    // Split each line of text and return a person1 object
    val personRDD: RDD[person1] = userInfo.map(x => {
      val arr = x.split(" ")
      val name = arr(0)
      val age = arr(1).toInt
      val fv = arr(2).toInt
      new person1(name, age, fv)
    })
    // Specify the sort rule: x => x means each element is ordered by person1's compare method
    val sorted: RDD[person1] = personRDD.sortBy(x => x)
    println(sorted.collect.toBuffer)
  }
}
// A regular class implementing custom sorting must extend the Ordered trait and Serializable
// A case class implementing custom sorting must extend the Ordered trait, but needs neither Serializable nor `new` when constructing objects
// e.g. case class person1 ...; used as person1(name, age, fv)
class person1(val name: String, val age: Int, val fv: Int) extends Serializable with Ordered[person1] {

  override def compare(that: person1): Int = {
    if (this.fv != that.fv)
      that.fv - this.fv
    else
      that.age - this.age
  }

  override def toString: String = s"$name,$age,$fv"
}
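
The sign convention in compare is what produces the descending order: when this has the larger fv, compare returns a negative number, so that element sorts earlier under sortBy's default ascending ordering. A quick standalone check of two sample records (CompareCheck is just an illustrative name, not part of the original program):

object CompareCheck {
  def main(args: Array[String]): Unit = {
    val a = new person1("mimi3", 23, 87)
    val b = new person1("mimi1", 22, 86)
    // a.compare(b) = b.fv - a.fv = 86 - 87 = -1, so a < b in this ordering
    // and the record with the higher fv comes first in the sorted output
    println(a.compare(b)) // -1
  }
}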

Result:

ArrayBuffer(mimi3,23,87, mimi1,22,86, mimi2,22,86)


Source: blog.csdn.net/qq_42706464/article/details/108354900