如何使用Python读取7z文件的内容

Question

如何使用Python读取7z文件的内容

40

我该如何读取和保存7z文件的内容？我使用Python 2.7.9，我可以像这样进行解压缩或归档操作，但是我无法在Python中读取文件的内容，只能在CMD中列出文件的内容。

import subprocess
import os

source = 'filename.7z'
directory = 'C:\Directory'
pw = '123456'
subprocess.call(r'"C:\Program Files (x86)\7-Zip\7z.exe" x '+source +' -o'+directory+' -p'+pw)

- Ken Kem

可能是重复的问题：Python - 如何使用7zip进行压缩而不是zip，更改代码 - John Zwinck

7个回答

7

我最终陷入了这种情况：被迫使用7z，并且需要准确知道每个zip存档文件中提取的文件。为应对此问题，您可以检查调用7z后的输出并查找文件名。以下是7z输出的格式：

$ 7z l sample.zip

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,8 CPUs x64)

Scanning the drive for archives:
1 file, 472 bytes (1 KiB)

Listing archive: sample.zip

--
Path = sample.zip
Type = zip
Physical Size = 472

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2018-12-01 17:09:59 .....            0            0  sample1.txt
2018-12-01 17:10:01 .....            0            0  sample2.txt
2018-12-01 17:10:03 .....            0            0  sample3.txt
------------------- ----- ------------ ------------  ------------------------
2018-12-01 17:10:03                  0            0  3 files

以及如何使用Python解析该输出：

import subprocess

def find_header(split_line):
    return 'Name' in split_line and 'Date' in split_line

def all_hyphens(line):
    return set(line) == set('-')

def parse_lines(lines):
    found_header = False
    found_first_hyphens = False
    files = []
    for line in lines:

        # After the header is a row of hyphens
        # and the data ends with a row of hyphens
        if found_header:
            is_hyphen = all_hyphens(''.join(line.split()))

            if not found_first_hyphens:
                found_first_hyphens = True
                # now the data starts
                continue

            # Finding a second row of hyphens means we're done
            if found_first_hyphens and is_hyphen:
                return files

        split_line = line.split()

        # Check for the column headers
        if find_header(split_line):
            found_header=True
            continue

        if found_header and found_first_hyphens:
            files.append(split_line[-1])
            continue

    raise ValueError("We parsed this zipfile without finding a second row of hyphens")



byte_result=subprocess.check_output('7z l sample.zip', shell=True)
str_result = byte_result.decode('utf-8')
line_result = str_result.splitlines()
files = parse_lines(line_result)

- Kyle Heuton

5

注意，整段代码可以简化为 [l.split()[-1] for l in str_result.rsplit("\n\n",1)[-1].splitlines()[2:-2]]。 - bfontaine

5

你可以使用 libarchive 或者 pylzma。如果你可以升级到python3.3+，你也可以使用标准库中的lzma。

- mr nick

2

我一直使用Python 2.7.9，不知道3.3+是否有7z的标准库，非常感谢。 - Ken Kem

61

请注意，lzma 只适用于单个文件，而不是 7z 归档文件。 - bfontaine

17

所以lzma不是正确的库，我只是浪费时间让它能够工作了。点踩。 - shinzou

3

libarchive 默认情况下无法在 Windows 上工作。它会安装失败并显示以下错误：Library can not be loaded: Could not find module 'libarchive.so'.... 您需要自己编译 libarchive.dll 并将其及其依赖的其他几个 .dll 添加到系统 PATH 中识别的某个文件夹中。 - user136036

3

你可以使用pyunpack和patool库。

!pip install pyunpack

!pip install patool

from pyunpack import Archive
Archive('7z file source').extractall('destination')

refer

https://pypi.org/project/patool/
https://pypi.org/project/pyunpack/

- Ruman Khan

1

支出和调用7z将提取文件，然后您可以使用open()打开这些文件。

如果您想直接在Python中查看7z存档的内容，则需要使用库。这是一个：https://pypi.python.org/pypi/libarchive - 如我所说，我不是Python用户，无法保证它的可靠性，但通常在所有语言中使用第三方库都很容易。

一般来说，7z支持似乎有限。如果您可以使用替代格式（zip/gzip），那么我认为您会发现Python库（和示例代码）的范围更加全面。

希望对您有所帮助。

- EyePeaSea

在实践中，使用subprocess模块非常有用来使用7z，你真的需要单独从命令行测试7zip命令，并依靠文档正确构建命令。一旦你理解如何形成7z命令行命令，就可以准备在Python中构建你的子进程命令了。 - cdabel

0

以下是我如何使用Python获取test.7z中所有文件列表的方法：

from subprocess import Popen, PIPE
proc = Popen([r"C:\Program Files\7-Zip\7z.exe", "l", "-ba", "-slt", "test.7z"], stdout=PIPE)
files = [l.split('Path = ')[1] for l in proc.stdout.read().decode().splitlines() if l.startswith('Path = ')]

按照使用7zip命令行列出zip文件内容的非冗长机器友好输出方法进行操作。

如果您不想安装其他依赖包，这是一个有用的解决方案。

- Basj

0

通过以下方式提取目录中的所有 .7z 文件。

首先，安装

!pip install patool
!pip install pyunpack

然后

import os
from pyunpack import Archive

path = "path_to_file"
file_type = '.7z'

for filename in os.listdir(path=path):
    if filename.endswith(file_type):
        print(filename)
        print(f"{path}/{filename}")
        Archive(f"{path}/{filename}").extractall(f"{path}")```

- Emmanuel Ogungbemi

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Zhou Hongbo · Accepted Answer

如果您使用Python 3，那么有一个有用的库py7zr，它支持7zip压缩，解压缩，加密和解密。

import py7zr
with py7zr.SevenZipFile('sample.7z', mode='r') as z:
    z.extractall()