I have a Spark DataFrame that looks like this:
+------+--------+--------------+--------------------+
|   dbn|    boro|total_students|                sBus|
+------+--------+--------------+--------------------+
|17K548|Brooklyn|           399|[B41, B43, B44-SB...|
|09X543|   Bronx|           378|[Bx13, Bx15, Bx17...|
|09X327|   Bronx|           543|[Bx1, Bx11, Bx13,...|
+------+--------+--------------+--------------------+
How can I duplicate each row once for every element in sBus, so that sBus becomes a plain string column?
The result should look like this:
+------+--------+--------------+--------------------+
|   dbn|    boro|total_students|                sBus|
+------+--------+--------------+--------------------+
|17K548|Brooklyn|           399|                 B41|
|17K548|Brooklyn|           399|                 B43|
|17K548|Brooklyn|           399|              B44-SB|
+------+--------+--------------+--------------------+
and so on...
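Do you want the Cartesian product of sBus and sSw as the result? - zero323
You can use the explode function (see for example http://stackoverflow.com/q/36484385/1560062), but if you have multiple columns it is not that simple. - zero323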