TypeError: 'str' does not support the buffer interface

Question

272

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(plaintext)

The Python code above gives me the following error:

Traceback (most recent call last):
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 33, in <module>
    compress_string()
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 15, in compress_string
    outfile.write(plaintext)
  File "C:\Python32\lib\gzip.py", line 312, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: 'str' does not support the buffer interface

- Future King

1

@MikePennington: could you explain why compressing text is not useful? - galinette

7 Answers

97

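There is a simpler solution to this problem.

You just need to add a t to the mode so that it becomes wt. This makes Python open the file as a text file rather than a binary file, so write() accepts a str directly. Note that gzip.open only supports text mode from Python 3.3 onward.

The complete program is: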

import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wt") as outfile:
    outfile.write(plaintext)

- user1175849

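Does it work on Python 2 as well? Is there a way to make the code run on both Python 2 and Python 3? - Loïc Faure-Lacroix

Wow, you're awesome! Thanks! Have an upvote. This should be the accepted answer :)) - Loïc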

15

Adding "t" can have side effects. On Windows, files written in text mode have their newlines ("\n") converted to CRLF ("\r\n"). - BitwiseMan
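The line-ending side effect of text mode can be avoided by telling the text layer not to translate newlines. The following is a minimal sketch, assuming Python 3.3 or later (where gzip.open accepts text mode and the encoding and newline arguments); the helper names write_text_gz and read_raw_gz and the file name are hypothetical.

```python
import gzip
import os
import tempfile


def write_text_gz(path, text):
    # newline="\n" disables translation of "\n" to the platform line
    # separator, so Windows does not turn it into "\r\n".
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as outfile:
        outfile.write(text)


def read_raw_gz(path):
    # Read back in binary mode to inspect the exact bytes that were stored.
    with gzip.open(path, "rb") as infile:
        return infile.read()


# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.gz")
write_text_gz(path, "first line\nsecond line\n")
raw = read_raw_gz(path)
print(raw)
```

Reading the file back in binary mode is only there to confirm that no carriage returns were added.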

43

You cannot serialize a Python 3 "string" to bytes without an explicit encoding step.

outfile.write(plaintext.encode('utf-8'))

is probably what you want. It also works for both Python 2.x and 3.x.
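The same encode-then-compress step can be tried without touching the file system. This is a small sketch for Python 3, using gzip.compress and gzip.decompress from the standard library; the sample text is only an illustration.

```python
import gzip

plaintext = "attack at dawn"

# str -> bytes: the compressor only accepts bytes-like data.
payload = plaintext.encode("utf-8")

# bytes in, bytes out.
compressed = gzip.compress(payload)

# Reverse both steps: decompress to bytes, then decode back to str.
restored = gzip.decompress(compressed).decode("utf-8")
print(restored)
```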

- user2665694

28

For Python 3.x you can convert your text to raw bytes with:

bytes("my data", "encoding")

For example:

bytes("attack at dawn", "utf-8")

The returned object can be used with outfile.write.
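As a quick check in Python 3, the bytes constructor and str.encode produce the same object, and the constructor refuses a str when no encoding is given. The variable names below are illustrative only.

```python
text = "attack at dawn"

# Two equivalent ways to turn a str into bytes.
as_bytes = bytes(text, "utf-8")
also_bytes = text.encode("utf-8")
print(as_bytes == also_bytes)

# Without an encoding, bytes() does not know how to convert a str.
missing_encoding_error = None
try:
    bytes(text)
except TypeError as exc:
    missing_encoding_error = exc
    print("TypeError:", exc)
```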

- Skurmedel

10

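This problem commonly occurs when switching from py2 to py3. In py2, plaintext is both a string and a byte array type; it is type-flexible and can swing both ways. In py3, plaintext is only a string, and outfile.write() actually takes a byte array when outfile is opened in binary mode, so an exception is raised. Changing the input to plaintext.encode('utf-8') fixes the problem. Read on if this bothers you.

In py2, the declaration for file.write made it look as if you passed in a string: file.write(str). In reality you were passing in a byte array, and you should have read the declaration as file.write(bytes). Read that way, the problem is simple: file.write(bytes) needs a bytes type, and in py3, to get bytes out of a str, you convert it: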

py3>> outfile.write(plaintext.encode('utf-8'))
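The rejection of str by a binary-mode gzip file can be reproduced in memory under Python 3. This sketch uses gzip.GzipFile over an io.BytesIO buffer instead of a real file; the exact wording of the error message varies between Python versions, so only the exception type is relied on.

```python
import gzip
import io

plaintext = "my string literal"
buffer = io.BytesIO()

with gzip.GzipFile(fileobj=buffer, mode="wb") as outfile:
    try:
        outfile.write(plaintext)          # str into a binary stream
        wrote_str = True
    except TypeError as exc:
        wrote_str = False
        print("rejected str:", exc)

    # Encoding first gives the stream the bytes it expects.
    outfile.write(plaintext.encode("utf-8"))

# The buffer now holds a complete gzip stream with only the encoded text.
restored = gzip.decompress(buffer.getvalue()).decode("utf-8")
print(restored)
```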

Why did the py2 docs declare that file.write took a string? Because in py2 the distinction in the declaration didn't matter:

py2>> str==bytes         #str and bytes aliased a single hybrid class in py2
True

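The str-bytes class of py2 has methods/constructors that make it behave like a string class in some ways and like a byte array class in others. That is convenient for file.write: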

py2>> plaintext='my string literal'
py2>> type(plaintext)
str                              #is it a string or is it a byte array? it's both!

py2>> outfile.write(plaintext)   #can use plaintext as a byte array

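Why did py3 break this nice system? Because in py2, basic string functions did not work properly for characters used in the rest of the world. For example, how do you measure the length of a word that contains a non-ASCII character?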

py2>> len('¡no')        #length of string=3, length of UTF-8 byte array=4, since UTF-8 is variable-length and non-ASCII chars take 2-4 bytes
4                       #always gives bytes.len not str.len

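All this time you thought you were asking for the len of a string in py2, you were actually getting the length of the encoded byte array. That ambiguity is the fundamental problem with double-duty classes: which version of any given method call should be implemented?

The good news is that py3 fixes this problem. It separates the str and bytes classes. The str class has string-like methods, and the separate bytes class has byte array methods: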

py3>> len('¡ok')       #string
3
py3>> len('¡ok'.encode('utf-8'))     #bytes
4
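The gap between character count and byte count grows with the code point. A short Python 3 sketch, with sample characters chosen only for illustration (written as escapes so the snippet does not depend on the source file encoding):

```python
# One character each: ASCII letter, inverted exclamation mark,
# euro sign, and an emoji outside the Basic Multilingual Plane.
samples = ["a", "\u00a1", "\u20ac", "\U0001F600"]

byte_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    byte_lengths[ch] = len(encoded)
    # len(ch) counts code points; len(encoded) counts UTF-8 bytes.
    print(ascii(ch), len(ch), len(encoded))
```

Each sample is a single character for str, while its UTF-8 form takes between one and four bytes.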

I hope this helps clear up the confusion and makes the migration a little easier to manage.

- Riaz Rizvi

4

>>> s = bytes("s","utf-8")
>>> print(s)
b's'
>>> s = s.decode("utf-8")
>>> print(s)
s

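This may be useful if you want to get rid of the annoying 'b' character. If anyone has a better idea, please suggest it or feel free to edit this answer. I'm just a newbie.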
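One related pitfall in Python 3: calling str() on a bytes object does not decode it, it returns the repr with the b prefix as text. A minimal sketch with illustrative variable names:

```python
data = "s".encode("utf-8")      # bytes
print(data)                     # shows the bytes repr, with the b prefix

text = data.decode("utf-8")     # proper conversion back to str
print(text)

# str() without an encoding keeps the prefix and quotes as characters.
wrong = str(data)
print(len(text), len(wrong))
```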

- Tapasit Suesasiton

You can also use s.encode('utf-8') instead of s = bytes("s", "utf-8"); it is just as Pythonic as s.decode('utf-8'). - Hans Zimermann

4

When unit testing Django with django.test.TestCase, I changed my Python2 syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content)
    ...

to use the Python3 .decode('utf8') syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content.decode('utf8'))
    ...
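The reason the decode is needed can be seen without Django: in Python 3 a str cannot be searched for inside bytes. In this sketch, content is a hypothetical stand-in for response.content, and the value 42 is made up for illustration.

```python
# Hypothetical stand-in for response.content, which is bytes.
content = b"<li>item 42</li>"
needle = str(42)

# Mixing str and bytes in a membership test raises TypeError.
mixed_types_error = None
try:
    needle in content
except TypeError as exc:
    mixed_types_error = exc
    print("TypeError:", exc)

# Either decode the bytes or encode the needle so both sides match.
found = needle in content.decode("utf8")
also_found = needle.encode("utf8") in content
print(found, also_found)
```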

- Aaron Lelevier

- Michał Niklas · Accepted Answer

If you use Python 3.x, the string type is not the same as in Python 2.x; you must convert it to bytes (encode it).

import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))

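Also, do not use variable names like string or file, because those are names of modules or functions.

EDIT @Tom

Yes, non-ASCII text can also be compressed and decompressed. I use Polish letters with UTF-8 encoding: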

import gzip

plaintext = 'Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
filename = 'foo.gz'
with gzip.open(filename, 'wb') as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
with gzip.open(filename, 'r') as infile:
    outfile_content = infile.read().decode('UTF-8')
print(outfile_content)
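As a variation on the round trip above, the read side can let gzip do the decoding by opening the file in text mode. This is a sketch assuming Python 3.3 or later, where gzip.open accepts "rt" and an encoding argument; the temporary path is hypothetical.

```python
import gzip
import os
import tempfile

plaintext = "Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "foo.gz")

# Write: encode explicitly and use binary mode.
with gzip.open(path, "wb") as outfile:
    outfile.write(plaintext.encode("utf-8"))

# Read: text mode decodes while reading, so read() returns str.
with gzip.open(path, "rt", encoding="utf-8") as infile:
    restored = infile.read()

print(restored)
```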