[Spark] Renaming DataFrame Columns in Spark

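// Assumes a SparkSession named spark is in scope (spark-shell provides one)
import spark.implicits._  // enables toDF and the $"..." column syntax
val df = Seq((2L, "a", "foo", 3.0)).toDF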
df.printSchema
// root
//  |-- _1: long (nullable = false)
//  |-- _2: string (nullable = true)
//  |-- _3: string (nullable = true)
//  |-- _4: double (nullable = false)

The simplest approach is the toDF method, which takes a new name for every column, in order:

val schemas = Seq("id", "x1", "x2", "x3")  // one new name per existing column
val dfRenamed = df.toDF(schemas: _*)
 
dfRenamed.printSchema
// root
// |-- id: long (nullable = false)
// |-- x1: string (nullable = true)
// |-- x2: string (nullable = true)
// |-- x3: double (nullable = false)
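
Because toDF expects exactly one name per existing column, the list of names is often derived from df.columns instead of being typed by hand. The following is a minimal sketch of that idea, not something from the original post: the local SparkSession settings and the "c" prefix naming scheme are assumptions chosen for illustration. It is meant to be pasted into a Scala REPL with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("rename-with-todf")
  .getOrCreate()
import spark.implicits._

val df = Seq((2L, "a", "foo", 3.0)).toDF

// Derive the new names from the old ones: "_1" becomes "c1", "_2" becomes "c2", ...
val derivedNames = df.columns.map(name => "c" + name.stripPrefix("_"))

// toDF needs as many names as there are columns, which holds here by construction.
val dfDerived = df.toDF(derivedNames: _*)

println(dfDerived.columns.mkString(", "))  // c1, c2, c3, c4
```

Deriving the names this way keeps the name count in step with the column count even if the schema changes later.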

To rename individual columns, you can use either select with alias:

df.select($"_1".alias("x1"))

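The select above returns only the aliased column. The same idea generalizes easily to multiple columns while keeping the rest: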

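import org.apache.spark.sql.functions.col

// Old name -> new name; columns missing from the map keep their current name
val lookup = Map("_1" -> "foo", "_3" -> "bar")

df.select(df.columns.map(c => col(c).as(lookup.getOrElse(c, c))): _*)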
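
The lookup-based select can be wrapped in a small function so the result is easy to inspect. This is only a sketch: renameColumns is a hypothetical helper name, not part of Spark, and the local SparkSession settings are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("rename-with-select")
  .getOrCreate()
import spark.implicits._

// Hypothetical helper (not a Spark API): rename the columns listed in `mapping`
// and pass every other column through unchanged.
def renameColumns(df: DataFrame, mapping: Map[String, String]): DataFrame =
  df.select(df.columns.map(c => col(c).as(mapping.getOrElse(c, c))): _*)

val df = Seq((2L, "a", "foo", 3.0)).toDF
val renamed = renameColumns(df, Map("_1" -> "foo", "_3" -> "bar"))

println(renamed.columns.mkString(", "))  // foo, _2, bar, _4
```

All columns are kept and the column order is preserved, because the select iterates over df.columns in order.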

or withColumnRenamed:

df.withColumnRenamed("_1", "x1")
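
withColumnRenamed handles one column per call, so renaming several columns this way means chaining calls. One possible way to do that is a foldLeft over the list of renames. The sketch below is an illustration rather than the original author's method: the rename pairs and the local SparkSession settings are assumptions. It also relies on withColumnRenamed being a no-op when the source column does not exist.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("rename-with-fold")
  .getOrCreate()
import spark.implicits._

val df = Seq((2L, "a", "foo", 3.0)).toDF

// Illustrative rename pairs; "missing" is not a column of df and is skipped silently.
val renames = Seq("_1" -> "id", "_3" -> "label", "missing" -> "ignored")

// Apply withColumnRenamed once per pair, threading the DataFrame through the fold.
val dfFolded = renames.foldLeft(df) { case (acc, (from, to)) =>
  acc.withColumnRenamed(from, to)
}

println(dfFolded.columns.mkString(", "))  // id, _2, label, _4
```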

Reposted from blog.csdn.net/beautiful_huang/article/details/103892036