Downloading a compressed tar.gz file with Python Requests and extracting it with tar

13

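I need to download a tar.gz file with a requests call. I found that requests.get automatically decompresses the file. I tried the solution provided here, but when I try to extract the result with tar, it says the file is not in gzip format.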

I tried the following approaches:

response = requests.get(url,auth=(user, key),stream=True)
if response.status_code == 200:
    with open(target_path, 'wb') as f:
        f.write(response.raw)

raw = response.raw
with open(target_path, 'wb') as out_file:
    while True:
        chunk = raw.read(1024, decode_content=True)
        if not chunk:
            break
        out_file.write(chunk) 

All of the above produce the following error when I extract the file:

$ tar -xvzf /tmp/file.tar.gz -C /

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
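One way to confirm what tar is complaining about is to look at the first two bytes of the saved file: a gzip stream starts with the bytes 0x1f 0x8b. The helper below is only a minimal sketch, and `is_gzip` is a hypothetical name introduced for illustration, not part of requests or tar.

```python
# Hypothetical helper: check whether a file starts with the gzip magic number.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file at `path` begins with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC
```

If `is_gzip(target_path)` returns False for the downloaded file, the bytes were probably decompressed (or otherwise altered) before they were written to disk.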

Note: I cannot use urllib.open because authentication and other steps are required; I have to use the requests library.

2 Answers

25

You only need to change f.write(response.raw) to f.write(response.raw.read())

Try the code below; it should give you a valid tar.gz file.

import requests

url = 'https://pypi.python.org/packages/source/x/xlrd/xlrd-0.9.4.tar.gz'
target_path = 'xlrd-0.9.4.tar.gz'

response = requests.get(url, stream=True)
if response.status_code == 200:
    with open(target_path, 'wb') as f:
        f.write(response.raw.read())
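Note that response.raw.read() loads the whole archive into memory. For large files, a chunked copy may be preferable. The following is a minimal sketch under that assumption; `save_stream` and its `chunk_size` default are hypothetical names and values chosen for illustration. It accepts any binary file-like object, so it can be tried without a network connection.

```python
import shutil


def save_stream(raw, target_path, chunk_size=64 * 1024):
    """Copy a binary file-like object to `target_path` in fixed-size chunks."""
    with open(target_path, "wb") as out_file:
        # copyfileobj calls raw.read(chunk_size) repeatedly until it returns b"".
        shutil.copyfileobj(raw, out_file, chunk_size)
    return target_path
```

Assuming `response` comes from `requests.get(url, stream=True)`, calling `save_stream(response.raw, target_path)` should write the bytes as the server sent them, because the raw stream is not content-decoded unless `decode_content=True` is requested.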

0
For a solution that uses only the standard library (without the third-party requests library), you can use urllib.request.Request
from urllib import request

url = "https://pypi.python.org/packages/source/x/xlrd/xlrd-0.9.4.tar.gz"
target_path = "xlrd-0.9.4.tar.gz"

# https://docs.python.org/3/library/urllib.request.html#urllib.request.Request
# NOTE: Transfer-Encoding: chunked (streaming) will be auto-selected
with request.urlopen(request.Request(url), timeout=15.0) as response:
    if response.status == 200:
        with open(target_path, "wb") as f:
            f.write(response.read())
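The question mentions authentication with auth=(user, key), which requests sends as HTTP Basic authentication. If the server does accept Basic auth (an assumption here), the urllib approach can carry the same credentials in an Authorization header. This is a hedged sketch: `build_basic_auth_request` is a hypothetical helper name, and `user` and `key` stand for the credentials from the question.

```python
import base64
from urllib import request


def build_basic_auth_request(url, user, key):
    """Return a urllib Request carrying an HTTP Basic Authorization header."""
    # Basic auth is "user:key" base64-encoded, prefixed with "Basic ".
    token = base64.b64encode(f"{user}:{key}".encode("utf-8")).decode("ascii")
    return request.Request(url, headers={"Authorization": f"Basic {token}"})
```

The returned object can be passed to request.urlopen(...) in place of request.Request(url) in the snippet above.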
