How do I open a json.gz.part file in Python?

Question

3

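I have a directory with many json.gz files, and some of them are named json.gz.part. Supposedly some files were too large when they were saved, so they were split.

I tried to open them the usual way: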

import gzip
import json

with gzip.open(file, 'r') as fin:
    json_bytes = fin.read()                      # 1. bytes (decompressed)
json_str = json_bytes.decode('utf-8')            # 2. string (i.e. JSON)
bb = json.loads(json_str)                        # 3. Python object

But with the .gz.part files I get an error:

uncompress = self._decompressor.decompress(buf, size)

error: Error -3 while decompressing data: invalid code lengths set

I tried jiffyclub's solution, but I got the following error:

    _read_eof = gzip.GzipFile._read_eof

AttributeError: type object 'GzipFile' has no attribute '_read_eof'

Edit:

If I read line by line, I can read most of the file's content before the error occurs:

with gzip.open(file2, 'r') as fin:
    for line in fin:
        print(line.decode('utf-8'))

After most of the content has been printed, I get:

error: Error -3 while decompressing data: invalid code lengths set

But with this last approach I cannot convert the content into JSON.

- Marlon Teixeira

3

Does .part here mean there are other parts, or does it mean "partial download" and you need to wait for the download to finish? - tadman

1

That's a good point. I can try downloading them again to check. - Marlon Teixeira

1 Answer

- Silvio Borges · Accepted Answer

import gzip
import shutil

# open the .gz file
with gzip.open('file.gz.part', 'rb') as f_in:
    # open the decompressed file
    with open('file.part', 'wb') as f_out:
        # decompress the .gz file and write the decompressed data to the decompressed file
        shutil.copyfileobj(f_in, f_out)

# now you can open the decompressed file
with open('file.part', 'r') as f:
    # do something with the file
    contents = f.read()

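This code opens the .gz.part file, decompresses the data, and writes the decompressed data to a new file named file.part. You can then open file.part and read its contents like any other text file. This assumes the .gz.part file is a complete, valid gzip stream.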
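
If the .part file turns out to be an incomplete or damaged gzip stream (one of the two readings raised in the comments), it may still be possible to keep whatever decompresses cleanly before the failure. The following is only a minimal sketch of that idea, not something taken from the original answer: the helper name salvage_gzip and its chunk_size parameter are made up for illustration, and it only handles a single-member gzip stream.

```python
import zlib


def salvage_gzip(path, chunk_size=64 * 1024):
    """Decompress as much of a gzip file as possible.

    Returns (data, complete): the bytes recovered before any failure, and
    True only if the gzip stream reached its proper end marker.
    Hypothetical helper for illustration; handles one gzip member only.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    recovered = []
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # input ran out (possibly before the stream ended)
            try:
                recovered.append(decomp.decompress(chunk))
            except zlib.error:
                # Corrupt data: output from this chunk is lost, but
                # everything recovered from earlier chunks is kept.
                return b''.join(recovered), False
            if decomp.eof:
                break  # clean end of the gzip stream
    return b''.join(recovered), decomp.eof
```

Calling data, complete = salvage_gzip('file.gz.part') returns the recovered bytes plus a flag saying whether the stream ended cleanly. If the file happens to contain one JSON object per line, the complete lines in data can be passed to json.loads one at a time; a single large JSON document that is cut off part-way cannot be parsed as a whole, so re-downloading the file or obtaining the remaining parts is still the more reliable fix.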