Pandas.to_csv raises an error: 'ascii' codec can't encode character u'\u2013' in position 8: ordinal not in range(128)

I am trying to save a pandas DataFrame to a CSV file, but I get the following error:
df.to_csv(location, sep='|', index=False, header=True)

'ascii' codec can't encode character u'\u2013' in position 8: ordinal not in range(128)
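The message itself is an ordinary Python codec error, so it can be reproduced without pandas. Below is a minimal sketch, assuming a hypothetical string with an en dash (U+2013) at index 8; the `sample` value and the `encode_or_error` helper are made up for illustration and are not taken from the original data.

```python
# Hypothetical value: "Jan 2016" is 8 characters, so the en dash sits at index 8.
sample = u"Jan 2016\u2013Dec 2016"


def encode_or_error(text, codec):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


# ASCII covers code points 0..127 only, so U+2013 cannot be encoded.
ascii_result = encode_or_error(sample, "ascii")

# UTF-8 can represent every Unicode code point; the en dash becomes 3 bytes.
utf8_result = encode_or_error(sample, "utf-8")

print(ascii_result.start)   # index of the offending character
print(ascii_result.reason)  # short explanation from the codec
print(repr(utf8_result))
```

The `start` attribute of the exception corresponds to the "position 8" in the message above.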

My current pandas version is:

>>> import pandas as pd
>>> pd.__version__
u'0.19.2'
>>>

On another machine the same command works. The pandas version installed there is 0.18.1.

>>> import pandas as pd
>>> pd.__version__
u'0.18.1'
>>>

I know that adding encoding='utf-8' resolves this error, but I would like to know whether a recent change caused this problem in later versions of pandas.
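For completeness, here is a minimal sketch of the encoding='utf-8' workaround. The DataFrame contents and the temporary output path are hypothetical stand-ins for the real data and `location`; only the `to_csv` arguments mirror the call in the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame: one cell contains an en dash (U+2013).
df = pd.DataFrame(
    {"period": [u"Jan 2016\u2013Dec 2016"], "value": [1]},
    columns=["period", "value"],
)

# Hypothetical output path inside a fresh temporary directory.
location = os.path.join(tempfile.mkdtemp(), "out.csv")

# Same call as in the question, with an explicit encoding added.
df.to_csv(location, sep="|", index=False, header=True, encoding="utf-8")

# Read the raw bytes back to confirm the en dash was written as UTF-8.
with open(location, "rb") as handle:
    raw = handle.read()

print(b"\xe2\x80\x93" in raw)
```

Passing the encoding explicitly also makes the output independent of whatever default a given pandas or Python version picks.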

Thanks.

1 Answer

In https://github.com/pandas-dev/pandas/blob/v0.18.1/pandas/core/frame.py we find:

formatter = fmt.CSVFormatter(self, path_or_buf,
                                     line_terminator      = line_terminator,
                                     sep                  = sep,
                                     encoding             = encoding,
                                     compression          = compression,
                                     quoting              = quoting,
                                     na_rep               = na_rep,
                                     float_format         = float_format,
                                     cols                 = columns,
                                     header               = header,
                                     index                = index,
                                     index_label          = index_label,
                                     mode                 = mode,
                                     chunksize            = chunksize,
                                     quotechar            = quotechar,
                                     engine               = kwds.get("engine"),
                                     tupleize_cols        = tupleize_cols,
                                     date_format          = date_format,
                                     doublequote          = doublequote,
                                     escapechar           = escapechar,
                                     decimal              = decimal     ) 

And in https://github.com/pandas-dev/pandas/blob/v0.19.2/pandas/core/frame.py we find:

formatter = fmt.CSVFormatter(self,  path_or_buf,
                                    line_terminator =line_terminator,
                                    sep             =sep,
                                    encoding        =encoding,
                                    compression     =compression,
                                    quoting         =quoting,
                                    na_rep          =na_rep,
                                    float_format    =float_format,
                                    cols            =columns,
                                    header          =header,
                                    index           =index,
                                    index_label     =index_label,
                                    mode            =mode,
                                    chunksize       =chunksize,
                                    quotechar       =quotechar,
                                    
                                    tupleize_cols   =tupleize_cols,
                                    date_format     =date_format,
                                    doublequote     =doublequote,
                                    escapechar      =escapechar,
                                    decimal         =decimal)                                     

The only difference is the "engine" parameter... Now we would have to dig deeper into this "engine" parameter :-( here: https://github.com/pandas-dev/pandas/blob/v0.18.1/pandas/formats/format.py and here: https://github.com/pandas-dev/pandas/blob/v0.19.2/pandas/formats/format.py Good luck!
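The comparison of the two call sites can also be checked mechanically. This is only a sketch: the keyword names are copied by hand from the two snippets above, and the variable names are my own.

```python
# Keyword arguments passed to fmt.CSVFormatter in pandas v0.18.1 (from the snippet above).
kwargs_0_18_1 = [
    "line_terminator", "sep", "encoding", "compression", "quoting",
    "na_rep", "float_format", "cols", "header", "index", "index_label",
    "mode", "chunksize", "quotechar", "engine", "tupleize_cols",
    "date_format", "doublequote", "escapechar", "decimal",
]

# Keyword arguments passed to fmt.CSVFormatter in pandas v0.19.2 (from the snippet above).
kwargs_0_19_2 = [
    "line_terminator", "sep", "encoding", "compression", "quoting",
    "na_rep", "float_format", "cols", "header", "index", "index_label",
    "mode", "chunksize", "quotechar", "tupleize_cols",
    "date_format", "doublequote", "escapechar", "decimal",
]

# Names present in one version but not in the other.
removed = sorted(set(kwargs_0_18_1) - set(kwargs_0_19_2))
added = sorted(set(kwargs_0_19_2) - set(kwargs_0_18_1))

print(removed)  # ['engine']
print(added)    # []
```

This only confirms what differs at the call site; it says nothing about what changed inside CSVFormatter itself.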

Thank you for the answer and the helpful links. I upgraded to the latest version and everything works fine. - ProgSky

Content provided by Stack Overflow.