我可以帮您将多个数据框列合并为一个字符串数组吗?
例如,我有这个数据框:
这是一个例子:
例如,我有这个数据框:
val df = sqlContext.createDataFrame(Seq((1, "Jack", "125", "Text"), (2,"Mary", "152", "Text2"))).toDF("Id", "Name", "Number", "Comment")
这是一个例子:
它看起来像这样:
scala> df.show
+---+----+------+-------+
| Id|Name|Number|Comment|
+---+----+------+-------+
| 1|Jack| 125| Text|
| 2|Mary| 152| Text2|
+---+----+------+-------+
scala> df.printSchema
root
|-- Id: integer (nullable = false)
|-- Name: string (nullable = true)
|-- Number: string (nullable = true)
|-- Comment: string (nullable = true)
我该如何进行转换,使其看起来像这样:
scala> df.show
+---+-----------------+
| Id| List|
+---+-----------------+
| 1| [Jack,125,Text]|
| 2| [Mary,152,Text2]|
+---+-----------------+
scala> df.printSchema
root
|-- Id: integer (nullable = false)
|-- List: Array (nullable = true)
| |-- element: string (containsNull = true)