Python UnicodeEncodeError: 'ascii' codec can't encode characters

Question


I have some JSON data that contains Chinese characters.

{
  "font_size": "47",
  "sentences": [
    "你好",
    "sample sentence1",
    "sample sentence2",
    "sample sentence3",
    "sample sentence4",
    "sample sentence5",
    "sample sentence6",
    "sample sentence7",
    "sample sentence8",
    "sample sentence9"
  ]
}

I created a Flask app to receive the JSON data above, and I post the data with the following curl command.

curl -X POST \
  http://0.0.0.0:5000/ \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json;charset=UTF-8' \
  -H 'Postman-Token: af380f7a-42a8-cfbb-9177-74bb348ce5ed' \
  -d '{
  "font_size": "47",
  "sentences": [
    "你好",
    "sample sentence1",
    "sample sentence2",
    "sample sentence3",
    "sample sentence4",
    "sample sentence5",
    "sample sentence6",
    "sample sentence7",
    "sample sentence8",
    "sample sentence9"
  ]
}'

After receiving the JSON data from request.data, I parse it as JSON. Note that request.data is a str.

json_data = json.loads(request.data)

Then I need to use json_data to format a string.

subtitles.format(**json_data)

I get an error.

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

How can I fix this? Thanks in advance.

Edit

subtitles is read from a file.

subtitles_file = "{path}/templates/sorry/subtitles.ass".format(path=root_path)
with open(subtitles_file, 'r') as file:
     subtitles = file.read()

Python 2 or Python 3

I was using Python 2, which is why this error occurred. Python 3 handles this automatically, so enjoy Python 3.

- CoXier

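Yes, I know that link, but I don't know how to fix my program. Maybe I'm just being stupid. - CoXier

Where is subtitles defined? Try converting it to a unicode string before formatting. - sshashank124

Is this Python 2 or 3? For many questions it makes no difference, but for Unicode encoding problems it usually matters a great deal. - abarnert

Also, if this really is Python 2, why are you using Python 2? If you used the current version, these problems would simply go away. (Also, Python 3 is the primary supported version for Flask and most other packages.) - abarnert

Based on your comment on my answer, I've added the python-2.7 tag, but in future please add it yourself when you think it's relevant, especially when someone asks about it directly in the comments. - abarnert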


1 Answer


- abarnert · Accepted Answer

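In Python 2, when you open a file with open and call read, you get a plain str, not a unicode.

At the same time, even though request.data is a str rather than a unicode, json_data will contain unicode objects if any of its strings hold non-ASCII characters.

So when you call subtitles.format, Python tries to encode each unicode value with the default encoding, which is ASCII unless you have changed it. That is what raises this error.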
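To see the failure in isolation, here is a minimal sketch that spells out by hand the ASCII encode that Python 2 performs implicitly inside str.format. The payload bytes and variable names are illustrative, not taken from the question's app, and the explicit .encode("ascii") call is only a stand-in for the implicit conversion, which is why it fails the same way under Python 3.

```python
import json

# UTF-8 bytes as they might arrive in a request body; the escaped bytes
# are the two Chinese characters from the question's first sentence.
payload = b'{"sentences": ["\xe4\xbd\xa0\xe5\xa5\xbd"]}'

# Decoding first keeps this runnable on both Python 2 and Python 3.
data = json.loads(payload.decode("utf-8"))
text = data["sentences"][0]  # a text (unicode) string, not bytes

# Formatting a Python 2 byte-string template with this value makes the
# interpreter attempt the equivalent of the call below.
try:
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(error)
```

Both characters fall outside the ASCII range, which is why the message reports positions 0-1.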

The simplest fix is to make subtitles a unicode. Like this:

with open(subtitles_file, 'r') as file:
    subtitles = file.read().decode('utf-8')

…or:

import codecs

with codecs.open(subtitles_file, 'r', 'utf-8') as file:
    subtitles = file.read()

I'm guessing you want UTF-8; if your file is in a different encoding, use that one instead.
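For a version that can be run end to end, the sketch below reads the template as text and then formats it. It swaps in io.open, which accepts an encoding argument on both Python 2 and Python 3, in place of the two variants above. The template contents, its field names and the temporary file are assumptions made for illustration; the real subtitles.ass will look different.

```python
import io
import json
import os
import tempfile

# Hypothetical template standing in for subtitles.ass.
template_text = u"Fontsize: {font_size}\nDialogue: {sentences[0]}\n"

fd, path = tempfile.mkstemp(suffix=".ass")
os.close(fd)
try:
    # Write the stand-in template to disk as UTF-8.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(template_text)
    # Reading with an explicit encoding yields text (unicode), not bytes.
    with io.open(path, "r", encoding="utf-8") as f:
        subtitles = f.read()
finally:
    os.remove(path)

# \u4f60\u597d are the two Chinese characters from the question's payload.
json_data = json.loads(u'{"font_size": "47", "sentences": ["\u4f60\u597d"]}')

# Template and values are now both text, so no implicit ASCII step occurs.
result = subtitles.format(**json_data)
print(result)
```

If the result later has to be written to a file or returned in a response, encode it explicitly at that boundary, for example with result.encode("utf-8").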