I found that PySpark has a method called drop, but it seems it can only drop one column at a time. Is there any way to drop multiple columns at once?
df.drop(['col1','col2'])
TypeError Traceback (most recent call last)
<ipython-input-96-653b0465e457> in <module>()
----> 1 selectedMachineView = machineView.drop([['GpuName','GPU1_TwoPartHwID']])
/usr/hdp/current/spark-client/python/pyspark/sql/dataframe.pyc in drop(self, col)
1257 jdf = self._jdf.drop(col._jc)
1258 else:
-> 1259 raise TypeError("col should be a string or a Column")
1260 return DataFrame(jdf, self.sql_ctx)
1261
TypeError: col should be a string or a Column
In DataFrame.drop(*cols), cols is a Python list; putting an asterisk in front of it unpacks it into positional arguments. - Mike Williamson
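
To make the comment above concrete, here is a minimal sketch of what the asterisk does. It runs without Spark: the drop function below is a hypothetical stand-in that only mimics a drop(*cols) signature, its error message is made up, and "MachineId" is an invented column name. It is not PySpark's implementation.

```python
# Hypothetical stand-in for a method with the signature drop(*cols).
# It only mimics the argument handling; it is not PySpark code.
def drop(columns, *cols):
    """Return the names in `columns` that are not listed in `cols`."""
    for c in cols:
        if not isinstance(c, str):
            raise TypeError("each col should be a string")
    return [name for name in columns if name not in cols]


# "MachineId" is an invented column name, used for illustration only.
all_columns = ["GpuName", "GPU1_TwoPartHwID", "MachineId"]
to_drop = ["GpuName", "GPU1_TwoPartHwID"]

# Passing the list itself hands over ONE argument (a list), which is rejected.
try:
    drop(all_columns, to_drop)
    list_accepted = True
except TypeError:
    list_accepted = False

# The asterisk unpacks the list into two separate string arguments.
remaining = drop(all_columns, *to_drop)
```

On a real DataFrame the equivalent call would be machineView.drop(*to_drop), assuming a PySpark version whose drop accepts several columns. The traceback above shows a signature of drop(self, col), so that particular installation may only take one column per call.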