UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 6: ordinal not in range(128)

Question

python python-2.7 web-scraping python-unicode

10

I am trying to pull a list of 500 restaurants in Amsterdam from TripAdvisor, but after the 308th restaurant I get the following error:

Traceback (most recent call last):
  File "C:/Users/dtrinh/PycharmProjects/TripAdvisorData/LinkPull-HK.py", line 43, in <module>
    writer.writerow(rest_array)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 6: ordinal not in range(128)

I tried several things I found on StackOverflow, but nothing has worked so far. I was wondering if someone could take a look at my code and suggest a possible solution.

        for item in soup2.findAll('div', attrs={'class', 'title'}):
            if 'Cuisine' in item.text:
                item.text.strip()
                content = item.findNext('div', attrs=('class', 'content'))
                cuisine_type = content.text.encode('utf8', 'ignore').strip().split(r'\xa0')
        rest_array = [account_name, rest_address, postcode, phonenumber, cuisine_type]
        #print rest_array
        with open('ListingsPull-Amsterdam.csv', 'a') as file:
                writer = csv.writer(file)
                writer.writerow(rest_array)
    break

- wavey

1

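cuisine_type is a list because you used .split (and I'm not sure why you are splitting on a non-breaking space...), but the contents of the row passed to .writerow need to be strings or numbers. Also, when using the csv module in Python 2 you should open the CSV file in binary mode, as mentioned in the docs. You may find this article helpful: Pragmatic Unicode, written by SO veteran Ned Batchelder. - PM 2Ring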
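
The first point in this comment can be sketched as follows. The sample text is made up for illustration, and the sketch assumes the scraped cuisine text separates entries with U+00A0 (non-breaking space), as the asker's split suggests. Note that r'\xa0' is a raw string of four characters (backslash, x, a, 0), so splitting on it never matches the real character.

```python
# Hypothetical scraped text: cuisine names separated by non-breaking spaces.
raw = u"French\xa0European\xa0Dutch"

# r"\xa0" is the four characters backslash, x, a, 0 (not U+00A0),
# so this split finds nothing and returns the whole string in a list.
unsplit = raw.split(r"\xa0")

# Splitting on the real character works. Joining the parts back into ONE
# string means the CSV cell holds text rather than a Python list.
parts = [p.strip() for p in raw.split(u"\xa0") if p.strip()]
cuisine_type = u", ".join(parts)

print(unsplit == [raw])
print(cuisine_type)
```

With cuisine_type as a single string, every element of rest_array is a string or number, which is what .writerow expects.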

3 Answers

4

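You are writing non-ASCII characters to your CSV output file. Make sure you open the output file with an appropriate character encoding that allows the characters to be encoded. A safe choice is often UTF-8. Try this: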

with open('ListingsPull-Amsterdam.csv', 'a', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(rest_array)

Edit: this is for Python 3.x, sorry.
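
For readers who are on Python 3, a minimal self-contained sketch of the same idea follows. The file location and the row contents are placeholders, and newline='' is added because the csv module documentation recommends it when opening files for csv.writer. None of this applies to the asker's Python 2.7 setup.

```python
import csv
import os
import tempfile

# Placeholder row containing U+2019 (right single quotation mark).
rest_array = ["Bakker\u2019s", "Example Street 1", "1000 AA"]

# Temporary location so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "listings.csv")

# Python 3 only: open() accepts encoding=...; newline="" lets the csv
# module control line endings itself.
with open(path, "a", encoding="utf-8", newline="") as fd:
    csv.writer(fd).writerow(rest_array)

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8", newline="") as fd:
    rows = list(csv.reader(fd))

print(rows == [rest_array])
```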

- Irmen de Jong

1

I don't think he would get this error on Python 3. His question is tagged "python-2.7". - Laurent LAPORTE

-1

Add these lines at the beginning of your script.

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

- Rahul Prasad

- Laurent LAPORTE · Accepted Answer

rest_array contains Unicode strings. When you write rows with csv.writer, you need to serialize them as byte strings (you are on Python 2.7).

I suggest you use the "utf8" encoding:

with open('ListingsPull-Amsterdam.csv', mode='a') as fd:
    writer = csv.writer(fd)
    rest_array = [text.encode("utf8") for text in rest_array]
    writer.writerow(rest_array)
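
The list comprehension above assumes every element of rest_array is a Unicode string. A slightly more defensive sketch is given below for rows that also contain numbers, already-encoded byte strings, or a nested list such as cuisine_type. The helper names to_utf8_cell and append_row_py2 and the sample row are hypothetical, and the binary mode 'ab' follows the Python 2 csv documentation mentioned in the comments on the question.

```python
import csv

text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def to_utf8_cell(value):
    """Return `value` as a UTF-8 byte string for a Python 2 csv cell."""
    if isinstance(value, (list, tuple)):
        # Flatten a nested list (such as cuisine_type) into one cell.
        return b", ".join(to_utf8_cell(v) for v in value)
    if isinstance(value, bytes):
        return value  # already encoded; assumed to be UTF-8
    if not isinstance(value, text_type):
        value = text_type(value)  # numbers and similar values
    return value.encode("utf-8")


def append_row_py2(path, row):
    """Python 2 only: append one row to a file opened in binary mode."""
    with open(path, "ab") as fd:
        csv.writer(fd).writerow([to_utf8_cell(cell) for cell in row])


# Hypothetical row mixing Unicode text, a number and a list.
row = [u"Bakker\u2019s", 308, [u"French", u"Dutch"]]
encoded = [to_utf8_cell(cell) for cell in row]
print(encoded)
```

On Python 2.7 the call would look like append_row_py2('ListingsPull-Amsterdam.csv', rest_array). On Python 3, csv.writer expects text rather than bytes, so only the encoding helper is meaningful there.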

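Note: please don't use file as a variable name, because it shadows the built-in file() (in Python 2, the constructor of the file type, which behaves like open()).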

If you want to open this CSV file in Microsoft Excel, you may consider using another encoding, for instance "cp1252" (it can represent the character u"\u2019").
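
To illustrate the Excel remark, a small sketch: U+2019 maps to a single byte in cp1252, whereas characters outside that code page still raise UnicodeEncodeError unless an error handler is chosen. The sample strings and the choice of the "replace" handler are assumptions made for illustration.

```python
name = u"Bakker\u2019s"            # hypothetical restaurant name
as_cp1252 = name.encode("cp1252")  # U+2019 becomes the single byte 0x92
as_utf8 = name.encode("utf8")      # the same character takes three bytes

# cp1252 cannot represent every character, for example CJK text.
other = u"\u5317\u4eac"
try:
    other.encode("cp1252")
    representable = True
except UnicodeEncodeError:
    representable = False

# One possible fallback: substitute '?' for unrepresentable characters.
fallback = other.encode("cp1252", "replace")

print(repr(as_cp1252), repr(as_utf8), representable, repr(fallback))
```

Whether lossy replacement is acceptable depends on the data. UTF-8 keeps every character, while cp1252 trades coverage for compatibility with Excel's default handling of CSV files on Western European Windows systems.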