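I'm using the requests module on Python 2.7 to POST a large chunk of data to a service I can't change. Since the data is mostly text, it is large but compresses very well. The server accepts gzip or deflate encoding, but I don't know how to tell requests to do a POST and encode the data correctly automatically.
Is there a minimal example available that shows how to do this?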
import json
import zlib
import requests

# Works if backend supports gzip
additional_headers['content-encoding'] = 'gzip'
request_body = zlib.compress(json.dumps(post_data))
r = requests.post('http://post.example.url', data=request_body, headers=additional_headers)
[...] the gzip library rather than zlib. So I set payload = gzip.compress(json.dumps(payload).encode('utf-8')) and set the headers Content-Type=application/json and Content-Encoding=gzip. - tobycoleman
I have tested the solution proposed by Robᵩ and, with some modifications, it does work.
Pseudocode (sorry, I extrapolated it from my own code, so I had to cut out some parts and have not tested it, but you should get the idea):
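import gzip
import StringIO
import requests

additional_headers['content-encoding'] = 'gzip'
# Compress json_body into an in-memory buffer (Python 2)
s = StringIO.StringIO()
g = gzip.GzipFile(fileobj=s, mode='w')
g.write(json_body)
g.close()  # close() writes the gzip trailer, so call it before getvalue()
gzipped_body = s.getvalue()
request_body = gzipped_body
r = requests.post(endpoint_url, data=request_body, headers=additional_headers)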
I needed my POSTs to be chunked, since I had several very large files being uploaded in parallel. Here is the solution I came up with.
import requests
import zlib


def chunked_read_and_compress(file_to_send, zlib_obj, chunk_size):
    """Generator that reads a file in chunks and compresses them"""
    compression_incomplete = True
    with open(file_to_send, 'rb') as f:
        # The zlib might not give us any data back, so we have nothing to yield, just
        # run another loop until we get data to yield.
        while compression_incomplete:
            plain_data = f.read(chunk_size)
            if plain_data:
                compressed_data = zlib_obj.compress(plain_data)
            else:
                compressed_data = zlib_obj.flush()
                compression_incomplete = False
            if compressed_data:
                yield compressed_data


def post_file_gzipped(url, file_to_send, chunk_size=5*1024*1024, compress_level=6, headers={}, requests_kwargs={}):
    """Post a file to a url that is content-encoded gzipped compressed and chunked (for large files)"""
    headers_to_send = {'Content-Encoding': 'gzip'}
    headers_to_send.update(headers)
    zlib_obj = zlib.compressobj(compress_level, zlib.DEFLATED, 31)
    return requests.post(url, data=chunked_read_and_compress(file_to_send, zlib_obj, chunk_size), headers=headers_to_send, **requests_kwargs)


resp = post_file_gzipped('http://httpbin.org/post', 'somefile')
resp.raise_for_status()
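The chunked compression idea above can be checked without a network connection. The following is a minimal Python 3 sketch, not the answer author's code: the helper name compress_chunks and the sample data are hypothetical, and it reads from an in-memory buffer instead of a file on disk. It joins the yielded chunks and confirms that they form one valid gzip stream.

```python
import gzip
import io
import zlib


def compress_chunks(fileobj, chunk_size=64 * 1024, level=6):
    """Yield gzip-framed chunks read from a binary file-like object."""
    # wbits=31 (16 + zlib.MAX_WBITS) asks zlib for a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    while True:
        plain = fileobj.read(chunk_size)
        if not plain:
            break
        compressed = compressor.compress(plain)
        if compressed:  # zlib may buffer input and return nothing yet
            yield compressed
    # flush() emits whatever is still buffered plus the gzip trailer.
    yield compressor.flush()


# Hypothetical sample data standing in for a large file on disk.
original = b"some highly repetitive text\n" * 10000
body = b"".join(compress_chunks(io.BytesIO(original), chunk_size=4096))

# The concatenated chunks should decode back to the original bytes.
assert gzip.decompress(body) == original
print(len(original), "->", len(body), "bytes")
```

Passing such a generator as data= makes requests stream the body with chunked transfer encoding, so the server has to accept a request without a Content-Length header.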
For Python 3:
from io import BytesIO
import gzip
import requests


def zip_payload(payload: str) -> bytes:
    btsio = BytesIO()
    g = gzip.GzipFile(fileobj=btsio, mode='w')
    g.write(bytes(payload, 'utf8'))
    g.close()
    return btsio.getvalue()


headers = {
    'Content-Encoding': 'gzip'
}
zipped_payload = zip_payload(payload)
requests.post(url, zipped_payload, headers=headers)
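For JSON payloads, the approach described in tobycoleman's comment (gzip.compress plus explicit Content-Type and Content-Encoding headers) could be wrapped in a small helper. This is a hedged Python 3 sketch: the function name build_gzip_json_request and the sample payload are made up for illustration, and nothing is sent over the network.

```python
import gzip
import json


def build_gzip_json_request(payload):
    """Return (body, headers) for a gzip-compressed JSON POST."""
    raw = json.dumps(payload).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
    }
    return body, headers


# Hypothetical payload, for illustration only.
body, headers = build_gzip_json_request({"message": "Hello world"})

# Round-trip check: this is the JSON a server would see after decoding.
assert json.loads(gzip.decompress(body).decode("utf-8")) == {"message": "Hello world"}

# With requests installed, the pair could then be sent as:
#   requests.post(url, data=body, headers=headers)
```

Because the body is passed as bytes, requests can compute Content-Length from the compressed data, so that header does not need to be set by hand.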
zipped_payload = gzip.compress("Hello world".encode('utf-8')) - illagrenan
The accepted answer is probably wrong because its headers are incorrect or missing:
# Code from the accepted answer:
additional_headers['content-encoding'] = 'gzip'
request_body = zlib.compress(json.dumps(post_data))

# For gzip: wbits=16+zlib.MAX_WBITS selects a gzip header and trailer
additional_headers['Content-Encoding'] = 'gzip'
compress = zlib.compressobj(wbits=16+zlib.MAX_WBITS)
body = compress.compress(data) + compress.flush()

# For deflate: the default wbits produces a zlib-wrapped stream
additional_headers['Content-Encoding'] = 'deflate'
compress = zlib.compressobj()
body = compress.compress(data) + compress.flush()
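The difference between these container formats can be seen by inspecting the compressed bytes. Below is a small Python 3 sketch, with a hypothetical helper compress_with and made-up sample data, that produces all three variants and checks which decoder accepts each one. The keyword form wbits= is assumed to need Python 3.3 or later.

```python
import gzip
import zlib

data = b"hello hello hello hello"


def compress_with(wbits):
    """Compress `data` using the container format selected by wbits."""
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(data) + compressor.flush()


gzip_body = compress_with(16 + zlib.MAX_WBITS)  # gzip header and trailer
zlib_body = compress_with(zlib.MAX_WBITS)       # zlib wrapper (HTTP "deflate")
raw_body = compress_with(-zlib.MAX_WBITS)       # raw deflate stream, no wrapper

# Only the gzip variant starts with the gzip magic number 1f 8b.
assert gzip_body[:2] == b"\x1f\x8b"
assert zlib_body[:2] != b"\x1f\x8b"

# Each variant decodes only with the matching decoder settings.
assert gzip.decompress(gzip_body) == data
assert zlib.decompress(zlib_body) == data
assert zlib.decompress(raw_body, -zlib.MAX_WBITS) == data

# Plain zlib.compress() also yields the zlib wrapper, not gzip.
assert zlib.compress(data)[:2] != b"\x1f\x8b"
```

This is why labelling the output of zlib.compress() as Content-Encoding: gzip may be rejected by a strict server, while a lenient one might detect the format and accept it anyway.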
# UNPROVEN
import gzip
import StringIO
import requests

r = requests.Request('POST', 'http://httpbin.org/post', data={"hello": "goodbye"})
p = r.prepare()
s = StringIO.StringIO()
g = gzip.GzipFile(fileobj=s, mode='w')
g.write(p.body)
g.close()
p.body = s.getvalue()
p.headers['content-encoding'] = 'gzip'
p.headers['content-length'] = str(len(p.body))  # Not sure about this
r = requests.Session().send(p)
[...] data= was used in the requests.post() call. - Robᵩ