How does zip(*[iter(s)]*n) work in Python?

Question

How does zip(*[iter(s)]*n) work in Python?

125

s = [1,2,3,4,5,6,7,8,9]
n = 3

list(zip(*[iter(s)]*n)) # returns [(1,2,3),(4,5,6),(7,8,9)]

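How does zip(*[iter(s)]*n) work? What would it look like if it were written with more verbose code?

This is a technique for splitting a list into chunks of equal size; see that question for a general overview of the problem.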

- Oliver Zheng

1

See here, which also explains how it works: https://dev59.com/T0vSa4cB1Zd3GeqPhMYR#2202485 - Matt Joiner

If the answers here aren't enough, see my blog: http://telliott99.blogspot.com/2010/01/chunks-of-sequence-in-python.html - telliott99

10

Very interesting, but this technique surely goes against Python's core value of readability! - Demis

9 Answers

53

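The other great answers and comments already explain the roles of argument unpacking and zip() well.

As Ignacio and ujukatzel say, you pass zip() three references to the same iterator, and zip() builds 3-tuples of the integers, in order, from each reference to the iterator: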

1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9
^                    ^                    ^            
      ^                    ^                    ^
            ^                    ^                    ^

Since you asked for a more verbose code sample:

chunk_size = 3
L = [1,2,3,4,5,6,7,8,9]

# iterate over L in steps of 3
for start in range(0,len(L),chunk_size): # xrange() in 2.x; range() in 3.x
    end = start + chunk_size
    print(L[start:end]) # three-item chunks

Following the values of start and end:

[0:3) #[1,2,3]
[3:6) #[4,5,6]
[6:9) #[7,8,9]

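As an aside, in Python 2 you can get the same result using map() with None as its first argument (Python 3's map() no longer accepts None as the function):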

>>> map(None,*[iter(s)]*3)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]
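
The map(None, ...) form above only works in Python 2. A rough Python 3 counterpart, offered here as a sketch rather than as part of the original answer, is itertools.zip_longest, which likewise pads a short final group with None instead of dropping it. The 10-element list is an assumption chosen to make the padding visible.

```python
from itertools import zip_longest

s = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The same iterator is passed three times; zip_longest pads the
# incomplete last group with None instead of discarding it.
groups = list(zip_longest(*[iter(s)] * 3))
print(groups)  # [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, None, None)]
```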

For more on zip() and map(), see: http://muffinresearch.co.uk/archives/2007/10/16/python-transposing-lists-with-map-and-zip/

- mechanical_meat

34

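I think one thing is missed in all the answers (probably obvious to those familiar with iterators, but not so obvious to others):

Since we are using the same iterator, it gets consumed, and zip uses whatever elements remain. So if we simply used the list instead of an iterator, e.g.: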

l = range(9)
list(zip(*([l]*3))) # note: not an iter here, the lists are not emptied as we iterate
# output 
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8, 8)]

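With an iterator, values are popped off and only the remaining ones stay available, so for zip, once 0 has been consumed, 1 is available, then 2, and so on. It is a very subtle, but quite clever, thing!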
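
To make the contrast concrete, here is a minimal Python 3 sketch (the variable names are illustrative, not from the answer) that compares one shared iterator with three independent ones over the same list:

```python
s = list(range(9))

shared = iter(s)
# One iterator object passed three times: every next() call advances
# the same position, so consecutive items land in the same tuple.
chunks = list(zip(shared, shared, shared))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

# Three separate iterators each keep their own position,
# so every tuple repeats a single value.
repeated = list(zip(iter(s), iter(s), iter(s)))
print(repeated[:3])  # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
```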

- gabhijit

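+1, you saved me! I can't believe the other answers skipped this crucial detail, assuming everyone knows it. Can you give a reference to any documentation that covers this? - Snehasish Karmakar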

10

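iter(s) returns an iterator for s.

[iter(s)]*n makes a list of n references to the same iterator for s.

So, when you call zip(*[iter(s)]*n), it takes one item from each of the n iterators in the list, in order. Since all the iterators are the same object, this simply groups the list into chunks of n.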
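
The "same object" claim can be checked directly. The following is a small sketch, assuming the s and n from the question, that inspects the list before it is handed to zip():

```python
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3

iterators = [iter(s)] * n
# List repetition copies references, not objects:
# all n slots hold the very same iterator.
same_object = all(it is iterators[0] for it in iterators)
print(same_object)  # True

# Advancing through one slot therefore advances them all.
first = next(iterators[0])
second = next(iterators[1])
print(first, second)  # 1 2
```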

- sttwister

9

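Not "n iterators for the same list", but "the same iterator object used n times". Different iterator objects don't share state, even when they iterate over the same list. - Thomas Wouters

Thanks, corrected. Indeed that was what I was "thinking", but I wrote something else. - sttwister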

7

Peeling away the layers of "cleverness", you may find this equivalent spelling easier to follow:

x = iter(s)
for a, b, c in zip(*([x] * n)):
    print(a, b, c)

which is, in turn, equivalent to the even less clever:

x = iter(s)
for a, b, c in zip(x, x, x):
    print(a, b, c)

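Now it should start to become clear. There is only a single iterator object, x. On each iteration, zip() internally calls next(x) three times, once for each iterator object passed to it. But it is the same iterator object each time here. So it delivers the first three next(x) results and leaves the shared iterator object waiting to deliver its fourth result next. Lather, rinse, repeat.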
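
The repeated next(x) calls can be spelled out by hand. The generator below is a hypothetical helper (the name chunks_like_zip is made up for illustration, and it assumes n >= 1); it is a sketch of the behaviour described above, not how zip() is actually implemented:

```python
def chunks_like_zip(s, n):
    """Hypothetical helper mirroring zip(*[iter(s)] * n); assumes n >= 1."""
    it = iter(s)  # the single shared iterator
    while True:
        group = []
        for _ in range(n):
            try:
                group.append(next(it))  # n calls to next() per output tuple
            except StopIteration:
                return  # an incomplete trailing group is dropped, as with zip()
        yield tuple(group)


s = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(list(chunks_like_zip(s, 3)))  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
```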

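By the way, I suspect you're parsing *([iter(x)]*n) wrong in your head. The trailing *n happens first, and then the prefix * is applied to the n-element list that *n created. f(*iterable) is a shortcut for calling f() with a variable number of arguments, one for each object that iterable delivers.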
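
The order of operations can be shown by splitting the expression into two statements. In this sketch, f is a hypothetical stand-in for any function taking a variable number of arguments, and the six-element list is an assumption for brevity:

```python
def f(*args):
    # Hypothetical stand-in for zip(): just reports what it received.
    return args


s = [1, 2, 3, 4, 5, 6]
n = 2

# f(*iterable) passes one argument per item the iterable delivers.
print(f(*[10, 20, 30]) == f(10, 20, 30))  # True

# Step 1: the trailing * n repeats the one-element list.
repeated = [iter(s)] * n
# Step 2: the prefix * spreads that list into n positional arguments.
pairs = list(zip(*repeated))
print(pairs)  # [(1, 2), (3, 4), (5, 6)]
```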

- Tim Peters

7

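One word of advice for using zip this way: it will truncate your list if its length is not evenly divisible by n. To work around this, you can either use itertools.izip_longest (itertools.zip_longest in Python 3) if you can accept fill values, or use something like this:

def n_split(iterable, n):
    num_extra = len(iterable) % n
    # list() so the result can be concatenated in Python 3, where zip() is lazy
    zipped = list(zip(*[iter(iterable)] * n))
    return zipped if not num_extra else zipped + [iterable[-num_extra:], ]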

Usage:

for ints in n_split(range(1,12), 3):
    print(', '.join([str(i) for i in ints]))

Output:

1, 2, 3
4, 5, 6
7, 8, 9
10, 11
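
For inputs that have no len() or slicing (generators, for example), one possible variant is sketched below. It is not the answer's code: n_split_any is a hypothetical name, and it relies on itertools.zip_longest with a private sentinel that is stripped again, so the short final chunk is kept without visible padding.

```python
from itertools import zip_longest


def n_split_any(iterable, n):
    """Hypothetical variant of n_split: keeps a short final chunk, needs no len()."""
    sentinel = object()  # unique marker that cannot collide with real data
    for group in zip_longest(*[iter(iterable)] * n, fillvalue=sentinel):
        # Drop the padding that zip_longest added to the last group.
        yield tuple(x for x in group if x is not sentinel)


print(list(n_split_any(range(1, 12), 3)))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11)]
```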

- jmagnusson

3

This is already documented in itertools as the grouper recipe: http://docs.python.org/2/library/itertools.html#recipes. No need to reinvent the wheel. - jamylak

2

I needed to break down each partial step to really internalize how it works. My notes from the REPL:

>>> # refresher on using list multiples to repeat item
>>> lst = list(range(15))
>>> lst
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> # lst id value
>>> id(lst)
139755081359872
>>> [id(x) for x in [lst]*3]
[139755081359872, 139755081359872, 139755081359872]

# replacing lst with an iterator of lst
# It's the same iterator three times
>>> [id(x) for x in [iter(lst)]*3 ]
[139755085005296, 139755085005296, 139755085005296]
# without starred expression zip would only see single n-item list.
>>> print([iter(lst)]*3)
[<list_iterator object at 0x7f1b440837c0>, <list_iterator object at 0x7f1b440837c0>, <list_iterator object at 0x7f1b440837c0>]
# Must use starred expression to expand n arguments
>>> print(*[iter(lst)]*3)
<list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0>

# by repeating the same iterator, n-times,
# each pass of zip will call the same iterator.__next__() n times
# this is equivalent to manually calling __next__() until complete
>>> iter_lst = iter(lst)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(0, 1, 2)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(3, 4, 5)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(6, 7, 8)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(9, 10, 11)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(12, 13, 14)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

# all together now!
# continuing with same iterator multiple times in list
>>> print(*[iter(lst)]*3)
<list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0>
>>> zip(*[iter(lst)]*3)
<zip object at 0x7f1b43f14e00>
>>> list(zip(*[iter(lst)]*3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]

# NOTE: must use list multiples. Explicit listing creates 3 unique iterators
>>> [iter(lst)]*3 == [iter(lst), iter(lst), iter(lst)]
False
>>> list(zip(*[iter(lst), iter(lst), iter(lst)]))
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), ....

- ChrisFreeman

1

If you type it out in the Python interpreter or IPython, it becomes easier to see what is happening, e.g. with n = 2:

In [35]: [iter("ABCDEFGH")]*2
Out[35]: [<iterator at 0x6be4128>, <iterator at 0x6be4128>]

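So, we have a list of two iterators that point to the same iterator object. Remember that iter on an object returns an iterator object, and in this case it is the same iterator twice because of the *2 Python syntactic sugar. Iterators also run only once.

Further, zip takes any number of iterables (sequences are iterables) and creates tuples from the i-th element of each input sequence. Since both iterators are identical in our case, zip advances the same iterator twice for each 2-element tuple of output.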

In [41]: help(zip)
Help on built-in function zip in module __builtin__:

zip(...)
    zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

    Return a list of tuples, where each tuple contains the i-th element
    from each of the argument sequences.  The returned list is truncated
    in length to the length of the shortest argument sequence.

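The unpacking (*) operator passes each iterator in the list to zip() as a separate argument; zip() then runs the iterator to exhaustion, which in this case means until there is not enough input left to create a 2-tuple.

This can be extended to any value of n, and zip(*[iter(s)]*n) works as described.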
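
As a quick check of the general case, here is a sketch with a hypothetical wrapper function, group, applied to the same "ABCDEFGH" string for two values of n. With n = 3 the leftover "G" and "H" are dropped because they cannot fill a tuple.

```python
def group(s, n):
    # Hypothetical wrapper around the idiom, for any n >= 1.
    return list(zip(*[iter(s)] * n))


pairs = group("ABCDEFGH", 2)
triples = group("ABCDEFGH", 3)
print(pairs)    # [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H')]
print(triples)  # [('A', 'B', 'C'), ('D', 'E', 'F')]
```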

- akhan

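Sorry for the late reply. But could you explain the part "it is the same iterator twice because of the *2 Python syntactic sugar. Iterators also run only once."? If that is so, why isn't the result [("A", "A")....]? Thanks. - Bowen Liu

@BowenLiu * is just a convenient way to repeat an object. Try it with scalars first, then with lists. Also try out the difference between print(*zip(*[iter("ABCDEFG")]*2)) and print(*zip(*[iter("ABCDEFG"), iter("ABCDEFG")])). Then start tearing the two statements down step by step to see what the iterator objects actually are in each. - akhan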

0

x = [1,2,3,4,5,6,7,8,9]
zip(*[iter(x)] * 3)

is equivalent to:

x = [1,2,3,4,5,6,7,8,9]
iter_var = iter(x)
zip(iter_var,iter_var,iter_var)

Each time zip() fetches the next value from iter_var, it moves on to the next value of x. Try running next(iter_var) to see how this works.
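
A short sketch of that experiment, assuming the same x as above: calling next(iter_var) by hand twice before zipping shows that zip() simply continues from wherever the shared iterator currently stands.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
iter_var = iter(x)

first = next(iter_var)   # consumes 1
second = next(iter_var)  # consumes 2

# zip() picks up at 3; the trailing incomplete group (just 9) is dropped.
rest = list(zip(iter_var, iter_var, iter_var))
print(first, second, rest)  # 1 2 [(3, 4, 5), (6, 7, 8)]
```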

- crimander_jones


- Ignacio Vazquez-Abrams · Accepted Answer

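iter() returns an iterator over a sequence. [x] * n produces a list containing n copies of x, i.e. a list of length n in which every element is x. *arg unpacks a sequence into arguments for a function call. Therefore you are passing the same iterator three times to zip(), and it pulls an item from the iterator each time.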

x = iter([1,2,3,4,5,6,7,8,9])
print(list(zip(x, x, x)))