How do I encode a Python 3 string using \u escape codes?

Question

In Python 3, suppose I have the following:

>>> thai_string = 'สีเ'

Using encode gives:

>>> thai_string.encode('utf-8')
b'\xe0\xb8\xaa\xe0\xb8\xb5\xe0\xb9\x80'

My question is: how can I get encode() to return a bytes sequence that uses \u instead of \x? And how can I decode it back to a Python 3 str?

I tried the ascii builtin, but it returns:

>>> ascii(thai_string)
"'\\u0e2a\\u0e35\\u0e40'"

But that doesn't seem right, because I can't decode it to get thai_string back.
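One way to see why the ascii() result looks undecodable: it is the repr of the string, quotes included, and it is a str, not bytes. A minimal sketch, assuming the standard-library ast.literal_eval is an acceptable way to parse such a literal back (the names `literal` and `recovered` are illustrative only):

```python
import ast

thai_string = '\u0e2a\u0e35\u0e40'  # the same three characters as in the question

# ascii() returns a str that holds a Python string literal, quotes included
literal = ascii(thai_string)
print(literal)  # '\u0e2a\u0e35\u0e40'

# literal_eval parses that literal and returns the original str
recovered = ast.literal_eval(literal)
print(recovered == thai_string)  # True
```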

The Python documentation tells me:

\xhh Character with hex value hh
\uxxxx Character with 16-bit hex value xxxx

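The documentation says \u is only recognized in string literals, but I'm not sure what that means. Is this a hint that my question is flawed?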
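A small sketch of what "only recognized in string literals" can mean in practice, assuming the point is that the escape is resolved when source code is parsed, not when text arrives at run time. The variable names are illustrative:

```python
# In a normal string literal, the parser turns the escape into one character
parsed = '\u0e2a'
print(len(parsed))  # 1

# In a raw literal, the same six characters are kept as written
raw = r'\u0e2a'
print(len(raw))  # 6

# Turning that six-character text into the character takes an explicit decode
decoded = raw.encode('ascii').decode('unicode_escape')
print(decoded == parsed)  # True
```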

- Michael Currie

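What about .decode('utf-8')? Aren't strings in Python already Unicode? - Zizouz212

@Zizouz212, neither thai_string nor ascii(thai_string) has a decode method, and thai_string.encode('utf-8').decode('utf-8') just takes me back to where I started, thai_string, which is not the desired output. - Michael Currie

Python documentation related to the \u escape sequence: https://docs.python.org/3/reference/lexical_analysis.html and https://docs.python.org/3/library/codecs.html#encodings-and-unicode. - 0 _

Does this answer your question? How to work with surrogate pairs in Python? - ti7

@FelipeBuccioni That code will break strings that contain a backslash followed by a literal x. - benrg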

1 Answer

- Simeon Visser · Accepted Answer

You can use unicode_escape:

>>> thai_string.encode('unicode_escape')
b'\\u0e2a\\u0e35\\u0e40'

Note that encode() always returns a byte string (bytes), and that the unicode_escape encoding is intended to:

Produce a string that is suitable as Unicode literal in Python source code
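For the second half of the question (getting back to a str), a minimal round-trip sketch, assuming the same thai_string as in the question. Decoding with the same codec should reverse the encoding. The names `escaped`, `restored` and `escaped_text` are illustrative only:

```python
thai_string = '\u0e2a\u0e35\u0e40'  # the same three characters as in the question

# str -> bytes made of literal backslash-u escapes (pure ASCII)
escaped = thai_string.encode('unicode_escape')
print(escaped)  # b'\\u0e2a\\u0e35\\u0e40'

# bytes -> str, interpreting the escapes again
restored = escaped.decode('unicode_escape')
print(restored == thai_string)  # True

# If a str of escapes is wanted instead of bytes, decode the bytes as ASCII
escaped_text = escaped.decode('ascii')
print(escaped_text)  # \u0e2a\u0e35\u0e40
```

One caveat worth keeping in mind: the unicode_escape decoder reads its input as Latin-1, so it is best applied to bytes this codec produced (or other pure-ASCII escape text), not to arbitrary UTF-8 data.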