How can I rearrange a list like this (Python)?

Question

3

For example, the list to_be contains 3 "a"s, 4 "b"s, 3 "c"s, 5 "d"s, ...

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]

Now I want it to look like this:

done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (note: some items occur more often than others, but they must still appear in a predefined order, alphabetical for example)

What is the fastest way to do this?

- Shane

Do you want to remove every item that is equal to the one before it? - Paulo Scardine

Try using the Counter collection. - A. Webb

5 Answers

2

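import collections

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
    # Emit one of each remaining key, in sorted order.
    answer.extend(sorted(counts))
    for k in counts:
        counts[k] -= 1
    # Drop keys that are used up (items() works on both Python 2 and 3).
    counts = {k: v for k, v in counts.items() if v > 0}

Now, answer looks like this: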

['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

- inspectorG4dget

1

I'm not sure whether this is the fastest, but here is my attempt:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> def sort_key(a):
...     d[a] += 1
...     return d[a],a
...

>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

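Wrapped in a function:

from collections import defaultdict

def weird_sort(x):
    # d counts how many times each element has been seen so far.
    d = defaultdict(int)
    def sort_key(a):
        d[a] += 1
        # Sort by occurrence number first, then by the element itself.
        return (d[a],a)
    return sorted(x,key=sort_key)

Of course, this requires the elements of your iterable to be hashable.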
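Because the key only depends on how many times an element has been seen so far, this approach does not need the input to be grouped beforehand. A small sketch of that, repeating the function so it runs on its own; the `shuffled` list is made up for illustration, and the approach relies on `sorted()` computing the key once per element in input order:

```python
from collections import defaultdict

def weird_sort(x):
    seen = defaultdict(int)
    def sort_key(a):
        # The n-th occurrence of an element gets the key (n, element).
        seen[a] += 1
        return (seen[a], a)
    return sorted(x, key=sort_key)

# Hypothetical input that is neither sorted nor grouped.
shuffled = ["d", "b", "a", "d", "c", "b", "a", "d"]
print(weird_sort(shuffled))
# ['a', 'b', 'c', 'd', 'a', 'b', 'd', 'd']
```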

- mgilson

0

Slightly less elegant than Lattyware's:

import collections
def rearrange(l):
    counts = collections.Counter(l)
    output = []
    # Keep going while any key still has occurrences left.
    while sum(counts.values()) > 0:
        output.extend(sorted([k for k, v in counts.items() if v > 0]))
        for k in counts:
            counts[k] = counts[k] - 1 if counts[k] > 0 else 0
    return output

- Jordan Lewis

0

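A hand-written state machine should be more efficient, but for relatively small lists (<5000) you can take advantage of Python's niceties to do this without any trouble: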

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]


def do_it(lst):
    lst = lst[:]
    result = []

    while True:
        group = set(lst)
        result.extend(sorted(group))
        for element in group:
            del lst[lst.index(element)]
        if not lst:
            break
    return result

done = do_it(to_be)

The "big O" complexity of the function above should be really high. I have not even tried to work it out.

- jsbueno


- Gareth Latty · Accepted Answer

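If I understand your requirement correctly, this can be done relatively easily by combining itertools.zip_longest, itertools.groupby and itertools.chain.from_iterable():

We first group the items into sets (the "a"s, the "b"s, and so on), zip them up to get them in the order you want (one from each set), use chain to produce a single list, and then remove the None values introduced by the zipping.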

>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

You may want to split out some of the list comprehensions to make this more readable:

>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

(The version given is for 3.x; for 2.x, you will want izip_longest().)

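As always, if you expect empty strings, 0, and the like, you will need to use if item is not None, and if you need to keep None values intact, create a sentinel object and do an identity check against it.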
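The sentinel variant can be sketched as follows. The names `_MISSING` and `interleave_groups` are hypothetical, chosen only for this illustration; the sketch assumes the input is already grouped, as in the question, since groupby() only merges adjacent equal items:

```python
import itertools

# Hypothetical marker object: no item in the input can be identical to it.
_MISSING = object()

def interleave_groups(items):
    """Take one item from each run of equal items in turn."""
    groups = [list(g) for _, g in itertools.groupby(items)]
    # Pad the shorter groups with the sentinel instead of None.
    rows = itertools.zip_longest(*groups, fillvalue=_MISSING)
    flat = itertools.chain.from_iterable(rows)
    # Identity check, so falsy items such as 0, "" and None survive.
    return [x for x in flat if x is not _MISSING]

print(interleave_groups([0, 0, None, None, None, "", ""]))
# [0, None, '', 0, None, '', None]
```

With the plain `if item` filter, every value in this example would have been dropped.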

You could also use the roundrobin() recipe given in the docs as an alternative to zipping, which makes it as simple as:

>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
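Note that roundrobin() is not importable from itertools; it is a recipe in the itertools documentation that you have to copy into your own code. A sketch modeled on that recipe is shown here, written from memory, so the exact wording in the current docs may differ:

```python
import itertools

def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') yields A D E B F C."""
    num_active = len(iterables)
    # Cycle over the bound __next__ method of each input iterator.
    nexts = itertools.cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for advance in nexts:
                yield advance()
        except StopIteration:
            # The iterator that just ran dry is dropped from the cycle.
            num_active -= 1
            nexts = itertools.cycle(itertools.islice(nexts, num_active))

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
groups = [list(x) for _, x in itertools.groupby(to_be)]
print(list(roundrobin(*groups)))
```

Since exhausted groups are simply skipped, no fill value is introduced and no filtering step is needed afterwards.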

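As a final note, the observant might notice that I make lists from the groupby() generator, which may seem wasteful. The reason comes from the docs:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.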
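A small sketch of what goes wrong without the list() call, using a made-up `data` list. What is left in the stale groups depends on the Python version, so the example only shows that items go missing:

```python
import itertools

data = ["a", "a", "b", "b", "b"]

# Each group is turned into a list while it is still the current group.
eager = [list(g) for _, g in itertools.groupby(data)]

# Here the group iterators are collected first and only read afterwards,
# by which time groupby() has already advanced past them.
stale = [g for _, g in itertools.groupby(data)]
late = [list(g) for g in stale]

print(eager)  # [['a', 'a'], ['b', 'b', 'b']]
print(late)   # items are missing; exact contents vary by Python version
```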