python
I am reading a CSV file into a PySpark DataFrame (InputDataFrame) as follows::
InputDataFrame = spark.read.csv(path=file_path,inferSchema=True,ignoreLeadingWhiteSpace=True,header=True)
After reading the file, I use:
InputDataFrame.schema.names
to find the column names. But I get the following log on the console::
Traceback (most recent call last):
File "/snap/pycharm-community/143/helpers/pydev/_pydevd_bundle/pydevd_xml.py", line 284, in frame_vars_to_xml
xml += var_to_xml(v, str(k), evaluate_full_value=eval_full_val)
File "/snap/pycharm-community/143/helpers/pydev/_pydevd_bundle/pydevd_xml.py", line 384, in var_to_xml
xml_shape = ' shape="%s"' % make_valid_xml_value(str(v.shape))
File "/home/ajinkya/.local/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 1300, in __getattr__
"'%s' object has no attribute '%s'" % (self.__class__.__name__, name))
AttributeError: 'DataFrame' object has no attribute 'shape'
Unexpected error, recovered safely.
Can someone explain why this happens? Also, is there another way to find the inferred schema of a PySpark DataFrame?
''' Developing with the PyCharm IDE '''
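For inspecting the inferred schema, and for getting what pandas users would call the "shape", a small helper along these lines may be useful. This is only a sketch: `summarize` is a hypothetical name, not part of PySpark, and it relies only on the `count()`, `columns`, and `dtypes` members of a PySpark DataFrame.

```python
def summarize(df):
    """Return (row_count, column_count, dtypes) for a PySpark DataFrame.

    `summarize` is a hypothetical helper, not part of PySpark.
    """
    n_rows = df.count()       # triggers a Spark job; can be slow on large data
    n_cols = len(df.columns)  # column names come from the schema alone
    # dtypes is a list of (column name, type string) pairs
    return n_rows, n_cols, df.dtypes
```

It would be called as `summarize(InputDataFrame)`. The same type information is available as a printed tree from `InputDataFrame.printSchema()`, or as a `StructType` object from `InputDataFrame.schema`.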
InputDataFrame.printSchema() - MaFF
File "/snap/pycharm-community/143/helpers/pydev/_pydevd_bundle/pydevd_xml.py", line 384, in var_to_xml xml_shape = ' shape="%s"' % make_valid_xml_value(str(v.shape)): on line 384 you call v.shape, where v is a Spark DataFrame, and a Spark DataFrame has no shape attribute. - Paul
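The mechanism in the traceback can be reproduced without Spark. The sketch below uses a hypothetical stand-in class, `FakeDataFrame`, whose `__getattr__` behaves like the one shown at dataframe.py line 1300 (column names resolve, anything else raises AttributeError), and a hypothetical `probe_shape` function that plays the role of the debugger's variable view.

```python
class FakeDataFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame (not the real class)."""

    def __init__(self, columns):
        self.columns = columns

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails.
        # Column names resolve; every other name raises AttributeError.
        if name not in self.columns:
            raise AttributeError(
                "'%s' object has no attribute '%s'"
                % (self.__class__.__name__, name))
        return "Column<%s>" % name


def probe_shape(value):
    """Mimic a debugger that asks for .shape and recovers when it is missing."""
    try:
        return str(value.shape)
    except AttributeError as exc:
        return "recovered: %s" % exc


df = FakeDataFrame(["id", "name"])
print(df.name)          # a real column name resolves
print(probe_shape(df))  # 'shape' is not a column, so the lookup fails
```

Under this reading, the error comes from the IDE probing the variable, not from `InputDataFrame.schema.names` itself, which is consistent with the "Unexpected error, recovered safely." line in the log.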