"ValueError: embedded null character" when using open()

Question

20

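I'm learning Python at school and am stuck on a homework problem. We have to take two files and compare them. I'm simply trying to open the files so I can use them, but I keep getting the error "ValueError: embedded null character".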

file1 = input("Enter the name of the first file: ")
file1_open = open(file1)
file1_content = file1_open.read()

What does this error mean?

- Erica

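Where do the files come from? - Padraic Cunningham

My teacher added test files that are used to run the program. The first file that causes the error in one of the testers is "Tests/4-test.txt". - Erica

Your string has an embedded null byte, which won't work in Python; you need to remove the null bytes. What OS are you using? - Padraic Cunningham

If you are on Linux, try this command: tr -d '\000' < Tests/4-test.txt > Tests/4-test_cleaned.txt, then use Tests/4-test_cleaned.txt. - Padraic Cunningham

@Erica Please try: file1_open = open(file1, 'rb'), then let us know. - pippo1980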
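
To make the comments above concrete, here is a minimal sketch showing that a null character in the file name alone is enough to trigger the error, before any file content is read. The helper name open_checked and the sample file name are hypothetical and do not come from the thread.

```python
def open_checked(name):
    """Open `name` for reading, failing early with a clearer message.

    Hypothetical helper: it only checks the name, not the file contents.
    """
    if "\0" in name:
        raise ValueError("file name contains a null character: " + repr(name))
    return open(name)


# "\0" inside a normal string literal is a single null character,
# so open() rejects this name while converting the path.
try:
    open("Tests\0-test.txt")
except ValueError as exc:
    error_type = type(exc).__name__  # "ValueError"
```

If repr(file1) shows \x00 for the name that was entered, the path string itself is the problem rather than the file it points to.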

10 Answers

8

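The default encoding of files for Python 3.5 is 'utf-8'.

The default encoding of files on Windows tends to be something else.

If you intend to open two text files, you can try this: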

import locale
locale.getdefaultlocale()
file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=locale.getdefaultlocale()[1])
file1_content = file1_open.read()

There should be some auto-detection in the standard library.

Otherwise you can write your own:

def guess_encoding(csv_file):
    """guess the encoding of the given file"""
    import io
    import locale
    with io.open(csv_file, "rb") as f:
        data = f.read(5)
    if data.startswith(b"\xEF\xBB\xBF"):  # UTF-8 with a "BOM"
        return "utf-8-sig"
    elif data.startswith(b"\xFF\xFE") or data.startswith(b"\xFE\xFF"):
        return "utf-16"
    else:  # in Windows, guessing utf-8 doesn't work, so we have to try
        try:
            with io.open(csv_file, encoding="utf-8") as f:
                preview = f.read(222222)
                return "utf-8"
        except UnicodeDecodeError:
            return locale.getdefaultlocale()[1]

Then

file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=guess_encoding(file1))
file1_content = file1_open.read()

- stonebig

3

I'm not allowed to import anything into my program. - Erica
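
Given the no-import constraint, a rough alternative to the encoding guess above is to try a short list of encodings in order. This is only a sketch: the function name read_text and the candidate list are assumptions, and it addresses decoding problems in the file contents, not a null character in the path.

```python
def read_text(path, encodings=("utf-8-sig", "latin-1")):
    """Return (text, encoding) using the first encoding that decodes the file.

    The candidate list is an assumption; latin-1 accepts any byte sequence,
    so it acts as a last resort and may mis-decode some characters.
    """
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode " + repr(path))
```

It could be called as file1_content, used_encoding = read_text(file1).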

6

Try using r (raw string format): r'D:\python_projects\templates\0.html'

- Rahul Pandey

5

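On Windows, when you specify the full path of a file name, use double backslashes as the separator instead of single backslashes. For example, use C:\\FileName.txt rather than C:\FileName.txt.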

- srinivasan dasarathi
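
As a small illustration of the answer above, the following sketch compares three ways of spelling the same Windows path in Python source. The variable names are made up for the example.

```python
escaped = "C:\\FileName.txt"   # each \\ in the source is one backslash
raw = r"C:\FileName.txt"       # raw string: backslashes are kept as written
forward = "C:/FileName.txt"    # forward slashes need no escaping

same_text = escaped == raw     # True
length = len(escaped)          # 15
```

The escaped and raw spellings produce the identical string, so the choice between them is a matter of readability.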

2

The first slash in the file path name raises the error.

You need a raw string, i.e. the r prefix.

FileHandle = open(r'..', encoding='utf8')

FilePath='C://FileName.txt'
FilePath=r'C:/FileName.txt'

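Both lines define a file path using forward slashes. The first is an ordinary string literal; the second is a raw string (r prefix). Since neither contains a backslash, neither can introduce an escape sequence such as \0.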

- abc

2

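I ran into this error when copying a file into a folder whose name starts with a digit. Writing the folder path with a double backslash (\\) before the digit solved the problem.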

- yunusemredemirbas

1

Instead of the path D:\path\0.html, try D:/path/0.html. The error occurs because Python interprets \0 in the string literal as a null character.

- Viraj Patel
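
The effect described above can be checked directly. This sketch uses a shortened, hypothetical path (templates\0.html) to show what the \0 escape does to a string literal and why digit-led folder names are affected.

```python
literal = "templates\0.html"    # \0 is an escape: one null character
raw = r"templates\0.html"       # raw: a backslash followed by "0"
forward = "templates/0.html"    # no backslash, so no escape at all

has_null = "\x00" in literal          # True
raw_has_null = "\x00" in raw          # False
lengths = (len(literal), len(raw))    # (15, 16)

# Any backslash followed by an octal digit (0-7) is an escape,
# which is why folder names that start with a digit are affected.
one = "\1" == "\x01"                  # True
```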

1

I got the same error with the following code:

with zipfile.ZipFile("C:\local_files\REPORT.zip",mode='w') as z:
    z.writestr(data)

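This happened because I passed the bytestring (data) to writestr() without specifying the file name, Report.zip, under which it should be saved. So I changed my code and then it worked.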

with zipfile.ZipFile(r"C:\local_files\REPORT.zip", mode='w') as z:
    z.writestr('Report.zip', data)

- kamakshi singh
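
For reference, a self-contained sketch of the two-argument form of writestr() is shown here. It writes to an in-memory buffer instead of a Windows path so that it runs anywhere; the archive member name report.txt and the sample bytes are assumptions for illustration.

```python
import io
import zipfile

data = b"report contents"          # sample payload (assumed)
buffer = io.BytesIO()              # in-memory stand-in for a file path

# writestr() takes the member name first, then the data to store.
with zipfile.ZipFile(buffer, mode="w") as z:
    z.writestr("report.txt", data)

# Read the archive back to confirm what was stored.
with zipfile.ZipFile(buffer) as z:
    names = z.namelist()
    restored = z.read("report.txt")
```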

1

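The problem is caused by bytes that need to be decoded.

When you enter a variable in the interpreter, it displays the variable's repr, whereas print() uses str (the same in this case) and ignores all unprintable characters such as \x00 and \x01, replacing them with something else.

One solution is to "decode" file1_content (ignoring those bytes):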

file1_content = ''.join(x for x in file1_content if x.isprintable())

- Francisco de Larrañaga
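
One caveat with the isprintable() filter above is that str.isprintable() also treats newlines as non-printable, so line breaks are removed along with the null characters. The sketch below contrasts it with removing only null characters; the sample string is invented for the example.

```python
sample = "line one\x00\nline two\x01\n"

# Keeps only printable characters: drops \x00 and \x01, but also "\n".
printable_only = "".join(ch for ch in sample if ch.isprintable())

# Removes only null characters and leaves everything else untouched.
nulls_removed = sample.replace("\x00", "")
```

For comparing two files line by line, the second form is likely the safer choice because it preserves the line structure.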

-1

If you are trying to open a file, you should use a path generated by os, like so:

import os
os.path.join("path","to","the","file")

- Sabito stands with Ukraine
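
A possible reason this helps is that joining components means no backslash is ever written before a digit in the source code. A minimal sketch, using hypothetical folder and file names:

```python
import os

# The separator is inserted by os.path.join, so the literal "0.html"
# is never preceded by a backslash in the source.
joined = os.path.join("templates", "0.html")
has_null = "\0" in joined   # False
```

This does not help if the null character is already present in one of the components, for example in text typed by the user.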

Content provided by Stack Overflow.

- Алексей Семенихин · Accepted Answer

It looks like you have a problem with the "\" and "/" characters. If you use them in your input, try changing one to the other...