Logarithmic splitting of a Python list

Question

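I want to do the following...

I have a list of n elements. I want to split it into 32 sublists that contain more and more elements as we move toward the end of the original list. For example: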

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

I would like to get something like the following:

b = [[1],[2,3],[4,5,6,7],[8,9,10,11,12]]

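I have done the following for a list of 1024 elements:

b = []
for i in range(0, 32):
    c = a[i**2:(i+1)**2]
    b.append(c)

But I am stuck on how to handle other sizes reliably, such as 256, 512 or 2048, or a different number of sublists than 32.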

- vshotarov

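Why is exactly 3 duplicated in your sample output? In which cases should the algorithm duplicate items from the original list, and in which cases should it not? - Łukasz Rogalski

Yes, the list I need to process contains floats. But how would you do it if they were integers? I don't see why that would be easier. - vshotarov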

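Neither of your examples is a logarithmic split; both are arithmetic progressions. - Jared Goguen

So is your intent to get a more or less "even" split as the list length (1024 in your case) is adjusted up or down? Not that the sublists all have the same size, but that their sizes grow at roughly the same rate relative to the list length, however many divisions are made. - Two-Bit Alchemist

I edited the post, please look at the new example. Sorry for the confusion. It is the perfect example, because ideally the size grows by a factor of 2, but at the end we also have to include the last elements, otherwise they are left out. - vshotarov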

4 Answers

Something like this should do the trick.

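import numpy as np

b = []
# Note: the upper bound can overshoot, so trailing slices may come out empty.
for i in range(0, int(np.sqrt(2*len(a)))):
    c = a[i**2:min((i+1)**2, len(a))]
    b.append(c)

Not very Pythonic, but it does what you want.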

def splitList(a, n, inc):
    """
    a: list to split
    n: number of sublists
    inc: ideal difference between the sizes of two successive sublists
    """
    zr = len(a) # remaining number of elements to split into sublists
    st = 0 # starting index in the full list of the next sublist
    nr = n # remaining number of sublist to construct
    nc = 1 # number of elements in the next sublist
    #
    b=[]
    while (zr/nr >= nc and nr>1):
        b.append( a[st:st+nc] )
        st, zr, nr, nc = st+nc, zr-nc, nr-1, nc+inc
    #
    nc = int(zr/nr)
    for i in range(nr-1):
        b.append( a[st:st+nc] )
        st = st+nc
    #
    b.append( a[st:max(st+nc,len(a))] )
    return b

# Example call:
# b = splitList(a, 32, 2)
# splits a into 32 sublists, where each sublist ideally has 2 more elements
# than the previous one

- innoSPG

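I tested this, but it gives me empty lists at the end. I ran it on a list of 512 elements, and elements 23 through 32 of the output list are empty arrays. - vshotarov

@vshotarov, you are right, I had not tested it. The range needs adjusting. I will look at it later. In the meantime, you can remove the empty elements. - innoSPG

I could, but the point is to split the list into 32 lists. I would be happy if some of them contained the same number of elements, but there must be no empty lists. - vshotarov

@vshotarov, I got the part about 32 wrong. I only took the adjective "logarithmic" into account. - innoSPG

I now see that the adjective "logarithmic" is very misleading. - vshotarov

@vshotarov, I edited the post again to suggest a function. Although you have already accepted an answer, you may still want to take a look. According to your explanation, a sublist should never have fewer elements than the previous one. The accepted answer does not meet this criterion, which is why I suggest this one. - innoSPG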
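
The requirements stated in these comments (a fixed number of sublists, no empty sublists, and no sublist shorter than the one before it) can be checked mechanically. The following is a minimal sketch of such a check; the name `check_split` and its return format are made up for illustration and do not come from any of the answers.

```python
def check_split(original, parts, expected_count):
    """Return a list of problems found in a split; an empty list means none.

    Hypothetical helper: it only encodes the criteria discussed in the
    comments above, nothing more.
    """
    problems = []
    if len(parts) != expected_count:
        problems.append(f"expected {expected_count} sublists, got {len(parts)}")
    if any(len(p) == 0 for p in parts):
        problems.append("at least one sublist is empty")
    sizes = [len(p) for p in parts]
    if any(later < earlier for earlier, later in zip(sizes, sizes[1:])):
        problems.append("a sublist is shorter than the one before it")
    if [x for p in parts for x in p] != list(original):
        problems.append("sublists do not concatenate back to the original")
    return problems


a = list(range(1, 13))
good = [[1], [2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12]]
bad = [[1], [2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12], []]
print(check_split(a, good, 4))  # no problems reported
print(check_split(a, bad, 4))   # wrong count, an empty sublist, a shrinking sublist
```

Running a candidate function through a check like this on lengths such as 256, 512 and 2048 is a quick way to see which of the criteria it violates.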

There's always this.

>>> def log_list(l):
    if len(l) == 0:
        return [] #If the list is empty, return an empty list

    new_l = [] #Initialise new list
    new_l.append([l[0]]) #Add first iteration to new list inside of an array

    for i in l[1:]: #For each other iteration,
        if len(new_l) == len(new_l[-1]):
            new_l.append([i]) #Create new array if previous is full
        else:
            new_l[-1].append(i) #If previous not full, add to it

    return new_l

>>> log_list([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]

- GarethPW

This works, but the last list has fewer elements than some of the earlier ones. - vshotarov
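
One possible way to deal with a short final sublist is to fold it into the sublist before it. The sketch below is only an illustration under that assumption; `merge_short_tail` is a hypothetical helper name, and note that merging reduces the number of sublists by one, so it does not help when an exact count is required.

```python
def merge_short_tail(parts):
    """Fold the last sublist into the previous one if it is shorter than it."""
    parts = [list(p) for p in parts]  # copy, so the caller's lists are untouched
    if len(parts) >= 2 and len(parts[-1]) < len(parts[-2]):
        tail = parts.pop()
        parts[-1].extend(tail)
    return parts


# The split produced by the approach above for the numbers 1..12:
split = [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10], [11, 12]]
print(merge_short_tail(split))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10, 11, 12]]
```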

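This code is very messy, but it gets the job done. Note that if you slice the list logarithmically, you get some empty bins at the beginning. Your example gives an arithmetic index sequence.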

from math import log, exp

def split_list(_list, divs):
    n = float(len(_list))
    log_n = log(n)
    indices = [0] + [int(exp(log_n*i/divs)) for i in range(divs)]
    # indices[-1] is the last boundary; everything after it goes into the final bin
    unfiltered = [_list[indices[i]:indices[i+1]] for i in range(divs)] + [_list[indices[-1]:]]
    filtered = [sublist for sublist in unfiltered if sublist]
    return [[] for _ in range(divs- len(filtered))] + filtered


print(split_list(list(range(1024)), 32))

Edit: After reading the comments, here is an example that may fit your needs:

def split_list(_list):
    copy, output = _list[:], []
    length = 1
    while copy:
        output.append([])
        for _ in range(length):
            if len(copy) > 0:
                output[-1].append(copy.pop(0))
        length *= 2
    return output


print(split_list(list(range(15))))
# [[0], [1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]

Note that this code is not efficient, but it can serve as a template for writing a better algorithm.
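
As one possible direction for a more efficient variant, the same doubling split can be written with slices instead of repeated `pop(0)` calls, each of which shifts the whole remaining list. This is a sketch rather than the answer's own code; the name `split_doubling` is chosen here only to avoid clashing with `split_list`.

```python
def split_doubling(seq):
    """Split seq into chunks of size 1, 2, 4, ...; the last chunk may be shorter."""
    output, start, length = [], 0, 1
    while start < len(seq):
        output.append(seq[start:start + length])  # slicing clamps at the end
        start += length
        length *= 2
    return output


print(split_doubling(list(range(15))))
# [[0], [1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]
```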

- Jared Goguen

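This is interesting, but judging from your example, the first few lists are actually empty? And at one point a list has fewer elements than the one before it, which is a problem. - vshotarov

@vshotarov, please see the second example; it may be closer to what you want. - Jared Goguen

The problem with this is that it gives no control over the final number of lists. Otherwise it works great! - vshotarov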
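
If the number of sublists has to be fixed, one option is to decide the sizes up front from a list of weights. The sketch below is an assumption-laden illustration, not something proposed in the thread: `split_weighted` and its `weights` parameter are hypothetical. It gives every sublist one element, shares the rest in proportion to the integer weights, and hands any rounding leftover to the last sublist, so with non-decreasing weights the sizes never shrink and no sublist is empty.

```python
def split_weighted(lst, weights):
    """Split lst into len(weights) non-empty sublists sized roughly by weight.

    weights: positive integers, one per sublist; pass them in non-decreasing
    order if later sublists must never be shorter than earlier ones.
    """
    n = len(weights)
    if n == 0 or len(lst) < n:
        raise ValueError("need at least one element per sublist")
    spare = len(lst) - n                    # elements left after 1 per sublist
    total = sum(weights)
    sizes = [1 + spare * w // total for w in weights]  # exact integer math
    sizes[-1] += len(lst) - sum(sizes)      # rounding leftover goes to the end
    output, start = [], 0
    for size in sizes:
        output.append(lst[start:start + size])
        start += size
    return output


a = list(range(1, 13))
print(split_weighted(a, [1, 2, 4, 8]))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10, 11, 12]]
```

For the 32-sublist case, something like `split_weighted(a, list(range(1, 33)))` gives linearly growing sizes, while `[2**i for i in range(32)]` gives doubling weights; with only 1024 elements the latter leaves most early sublists at a single element.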

- tobias_k · Accepted Answer

Use an iterator, with a for loop over enumerate and itertools.islice:

import itertools
def logsplit(lst):
    iterator = iter(lst)
    for n, e in enumerate(iterator):
        yield itertools.chain([e], itertools.islice(iterator, n))

This works for any number of elements. For example:

for r in logsplit(range(50)):
    print(list(r))

Output:

[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
... some more ...
[36, 37, 38, 39, 40, 41, 42, 43, 44]
[45, 46, 47, 48, 49]

In fact, this is very similar to this question, except that it uses enumerate to get variable chunk sizes.
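
One thing to keep in mind with the generator above is that every chunk it yields is a lazy chain reading from the same shared iterator, so each chunk has to be consumed before the next one is requested (as the example loop does with `list(r)`). A minimal variant that materializes each chunk immediately could look like the sketch below; the name `logsplit_lists` is made up here.

```python
import itertools


def logsplit_lists(lst):
    """Like logsplit, but yields real lists of size 1, 2, 3, ... instead of lazy chains."""
    iterator = iter(lst)
    for n, first in enumerate(iterator):
        # Take the element enumerate just pulled, plus the next n elements.
        yield [first] + list(itertools.islice(iterator, n))


print(list(logsplit_lists(range(10))))
# [[0], [1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

Because the chunks are plain lists, the whole result can be collected with `list(...)` and indexed afterwards, at the cost of holding all chunks in memory.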