Gzip and subprocess's stdout in Python

Question

3

I'm using Python 2.6.4 and found that I can't use gzip and subprocess together the way I'd hoped. Here is an example of the problem:

May 17 18:05:36> python
Python 2.6.4 (r264:75706, Mar 10 2010, 14:41:19)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import gzip
>>> import subprocess
>>> fh = gzip.open("tmp","wb")
>>> subprocess.Popen("echo HI", shell=True, stdout=fh).wait()
0
>>> fh.close()
>>>
[2]+  Stopped                 python
May 17 18:17:49> file tmp
tmp: data
May 17 18:17:53> less tmp
"tmp" may be a binary file.  See it anyway?
May 17 18:17:58> zcat tmp

zcat: tmp: not in gzip format

Here is what the file looks like in less:

HI
^_<8B>^H^Hh<C0><F1>K^B<FF>tmp^@^C^@^@^@^@^@^@^@^@^@

It looks like the text from stdout was written to the file as-is, followed by an empty gzip file. Indeed, if I strip off the "HI\n", I get this:

May 17 18:22:34> file tmp
tmp: gzip compressed data, was "tmp", last modified: Mon May 17 18:17:12 2010, max compression

What is going on here?

Update: an earlier question asks the same thing: Can I use an already-opened gzip file in Python?

- pythonic metaphor

4 Answers

8

Just pipe the output through gzip:

from subprocess import Popen,PIPE
GZ = Popen("gzip > outfile.gz",stdin=PIPE,shell=True)
P = Popen("echo HI",stdout=GZ.stdin,shell=True)
# these next three must be in order
P.wait()
GZ.stdin.close()
GZ.wait()

- amwinter

1

I'm not entirely sure why this doesn't work (perhaps the output redirection never calls Python's write, which is what gzip relies on?), but this works:

>>> fh.write(subprocess.Popen("echo Hi", shell=True, stdout=subprocess.PIPE).stdout.read())

- Personman

For a very large file, this could cause memory problems. - fodon
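One way around the memory concern is to copy the child's output in fixed-size chunks rather than calling read() once. This is only a sketch, assuming a modern Python 3 rather than the 2.6.4 from the question. The names gzip_command_output and demo are made up for illustration, and the child command is just a stand-in for "echo HI".

```python
import gzip
import os
import shutil
import subprocess
import sys
import tempfile


def gzip_command_output(cmd, out_path, chunk_size=64 * 1024):
    """Run cmd and stream its stdout into a gzip file at out_path."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        with gzip.open(out_path, "wb") as fh:
            # copyfileobj reads chunk_size bytes at a time, so memory use
            # stays bounded no matter how much the child prints.
            shutil.copyfileobj(proc.stdout, fh, chunk_size)
    finally:
        proc.stdout.close()
    return proc.wait()


def demo():
    """Compress the output of a tiny child process and read it back."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        # Stand-in for "echo HI" that does not depend on a shell.
        gzip_command_output([sys.executable, "-c", "print('HI')"], path)
        with gzip.open(path, "rb") as fh:
            return fh.read()
    finally:
        os.remove(path)
```

The data still passes through Python's write(), so it really is compressed, but only one chunk is held in memory at a time.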

-2

You don't need subprocess to write to a gzip.GzipFile. Instead, write to it like any other file-like object. The result is compressed automatically!

import gzip
with gzip.open("tmp.gz", "wb") as fh:
    fh.write('echo HI')

- unutbu

- Ignacio Vazquez-Abrams · Accepted Answer

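You can't use file-like objects with subprocess, only real files. The fileno() method of GzipFile returns the file descriptor (FD) of the underlying file, so that is what echo's output gets redirected to. The GzipFile is then closed, which writes out an empty gzip file.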
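The explanation above can be checked with a short experiment. This is a rough sketch, assuming Python 3 on a POSIX system. The function name show_shared_descriptor is hypothetical, and the child command stands in for "echo HI".

```python
import gzip
import os
import subprocess
import sys
import tempfile


def show_shared_descriptor():
    """Return (same_fd, file_bytes) for a GzipFile handed to a child process."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        raw = open(path, "wb")
        gz = gzip.GzipFile(fileobj=raw, mode="wb")
        # GzipFile.fileno() just reports the descriptor of the wrapped file.
        same_fd = gz.fileno() == raw.fileno()
        # The child writes straight to that descriptor; gz.write() is
        # never called, so nothing the child prints gets compressed.
        subprocess.Popen(
            [sys.executable, "-c", "print('HI')"], stdout=gz
        ).wait()
        gz.close()   # finishes a gzip stream that contains no data
        raw.close()  # gz.close() does not close a fileobj it was given
        with open(path, "rb") as f:
            return same_fd, f.read()
    finally:
        os.remove(path)
```

The returned bytes should contain the plain text HI alongside the gzip magic bytes \x1f\x8b, which matches what less showed in the question.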