UTF-8 encoding exception with subprocess.run

Question

I'm having trouble using subprocess.run to execute commands that contain accented characters (for example "é").

Consider this simple example:

# -*- coding: utf-8 -*-
import subprocess

cmd = "echo é"

result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE)

print("Output of subprocess.run : {}".format(result.stdout.hex()))
print("é char encoded manually : {}".format("é".encode("utf-8").hex()))

It produces the following output:

Output of subprocess.run : 820d0a
é char encoded manually : c3a9

I don't understand the value returned by subprocess.run. Shouldn't it also be c3a9? I understand that 0d0a is CR+LF, but why 82?

As a result, when I try to run this line:

output = result.stdout.decode("utf-8")

I get a UnicodeDecodeError with this message: 'utf-8' codec can't decode byte 0x82 in position 0: invalid start byte

I tried specifying the encoding explicitly, like this:

result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, encoding="utf-8")

But this raises the same exception ('utf-8' codec can't decode byte 0x82 in position 0: invalid start byte), this time when subprocess.run is called.

I'm running this on Windows 10 with Python 3.8.5.

Can anyone give me some hints on how to solve this?

- PaulM

1 Answer

- MagnusO_O · Accepted Answer

A possible fix: decode with cp437.

print("Output of subprocess.run : {}".format(result.stdout.decode('cp437')))

# or

result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, text=True, 
                        encoding="cp437")

print(f"Output of subprocess.run : {result.stdout}")
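
To see why cp437 works here, it may help to decode the exact bytes from the question. This is only a small sketch: in code page 437 the single byte 0x82 maps to "é", whereas UTF-8 encodes "é" as the two bytes c3 a9. The variable names are mine, not from the question.

```python
# The bytes reported in the question: 0x82 followed by CR+LF.
raw = bytes.fromhex("820d0a")

# Decoding with the legacy OEM code page 437 recovers the accented character.
decoded = raw.decode("cp437")
print(repr(decoded))

# The same character encoded both ways, for comparison with the question's output.
cp437_hex = "é".encode("cp437").hex()
utf8_hex = "é".encode("utf-8").hex()
print(cp437_hex, utf8_hex)
```

So the 82 is not corrupted data. It is "é" in a different encoding than the one being used to decode it.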

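Judging from other Stack Overflow answers, the Windows terminal code page problem seems to be an old issue that should have been fixed by now, but apparently it still exists.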

https://dev59.com/EFjUa4cB1Zd3GeqPPTuX#37260867

In any case, I don't have deeper insight into Windows 10 terminal encoding, but cp437 works on my Win10 system.
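
Since cp437 is just what my machine happens to use, hard-coding it may not carry over to other systems. Below is a rough sketch of how one might ask Windows for the active console output code page instead. `guess_console_encoding` is a hypothetical helper name, and I'm assuming that the shell's `echo` writes to the pipe in the console's code page, which I have not verified on every Windows setup. On non-Windows platforms it just falls back to the locale's preferred encoding.

```python
import locale
import sys


def guess_console_encoding() -> str:
    """Hypothetical helper: best guess at the encoding console programs emit."""
    if sys.platform == "win32":
        import ctypes

        # GetConsoleOutputCP is a Win32 API; it returns 0 when no console is attached.
        code_page = ctypes.windll.kernel32.GetConsoleOutputCP()
        if code_page:
            return f"cp{code_page}"
    # Fallback (and the only path on non-Windows systems).
    return locale.getpreferredencoding(False)


encoding = guess_console_encoding()
print(encoding)
```

The result could then be passed as `encoding=` to subprocess.run in place of the literal "cp437".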

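However, the Python 3.9.13 documentation (3. Using Python on Windows, section 3.7. UTF-8 mode) describes an option to change the encoding temporarily or permanently (note the caveats mentioned in the documentation).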
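
To check whether UTF-8 mode is actually active in a given interpreter, something like the following sketch might be useful. UTF-8 mode can be enabled with `python -X utf8` or by setting the `PYTHONUTF8=1` environment variable. As far as I understand, it changes Python's own default encoding, not the bytes the child process writes, so output from the shell's `echo` may still need an explicit code page.

```python
import locale
import sys

# 1 when UTF-8 mode is on (python -X utf8, or PYTHONUTF8=1), otherwise 0.
utf8_mode = sys.flags.utf8_mode

# The encoding Python uses by default for text-mode files and pipes.
default_encoding = locale.getpreferredencoding(False)

print(f"UTF-8 mode: {utf8_mode}, default text encoding: {default_encoding}")
```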