How to sort a list by the frequency of its elements

Question

22

I have a list of integers (or possibly even strings) that I would like to sort by frequency of occurrence in Python, for example:

a = [1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]

In this list, the number 5 occurs 4 times and 4 occurs 3 times. So the sorted output list would be:

result = [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

I tried using a.count(), but it only gives the number of occurrences of an element. I want to sort by it. Any ideas on how to do that?

Thanks.

- Kiran

Does the order of the numbers 4 and 3 in the output matter? - thefourtheye

No, it doesn't really matter, if that makes it simpler. - Kiran

3

Good, otherwise I would have had to sort again :-) - thefourtheye

7 Answers

10

Using Python 3.3 and the built-in sorted function, with the count as the key:

>>> a = [1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
>>> sorted(a, key=a.count)
[2, 1, 1, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
>>> sorted(a, key=a.count, reverse=True)
[5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

- thegrinner

I think using list.count will make this very inefficient. - thefourtheye

2

@thefourtheye I'd need to time it to confirm, but that sounds right. For small lists like the one in this example, though, it is perfectly fine. - thegrinner
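The efficiency concern raised in these comments can be sketched as follows. With key=a.count, every element triggers a full scan of the list, so computing the sort keys takes quadratic time. Counting once with collections.Counter makes key computation linear. The two function names are hypothetical, chosen only for illustration, and no timings are claimed here.

```python
from collections import Counter

def sort_by_count_quadratic(items):
    # list.count scans the whole list once per element: O(n^2) key computation.
    return sorted(items, key=items.count, reverse=True)

def sort_by_count_linear(items):
    # Count every element in a single pass, then look each count up in O(1).
    counts = Counter(items)
    return sorted(items, key=counts.__getitem__, reverse=True)

a = [1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
print(sort_by_count_quadratic(a) == sort_by_count_linear(a))  # True
```

Both variants use the same keys and the same stable sort, so they should return identical lists; only the cost of computing the keys differs.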

3

In [14]: import collections, itertools

In [15]: a = [1,1,2,3,3,3,4,4,4,5,5,5,5]

In [16]: counts = collections.Counter(a)

In [17]: list(itertools.chain.from_iterable([[k for _ in range(counts[k])] for k in sorted(counts, key=counts.__getitem__, reverse=True)]))
Out[17]: [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

Or:

answer = []
for k in sorted(counts, key=counts.__getitem__, reverse=True):
    answer.extend([k for _ in range(counts[k])])

Of course, [k for _ in range(counts[k])] can be replaced with [k]*counts[k].
So line 17 becomes

list(itertools.chain.from_iterable([[k]*counts[k] for k in sorted(counts, key=counts.__getitem__, reverse=True)]))

- inspectorG4dget

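@Aशwiniचhaudhary: I considered that, but it might not work particularly well if the elements aren't primitive types. There are also issues with references... - inspectorG4dget

If you are worried about mutable types, then Counter won't work at all. - Ashwini Chaudhary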
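To illustrate the point about mutable types: Counter is a dict subclass, so its keys must be hashable, and a list of lists cannot be counted directly. A minimal sketch of one workaround follows. It assumes that tuple(row) is an acceptable stand-in for each row, which is an assumption for illustration and not something suggested in the thread.

```python
from collections import Counter

rows = [[1, 2], [3], [1, 2], [3], [3]]

try:
    Counter(rows)  # lists are unhashable, so this raises TypeError
    failed = False
except TypeError:
    failed = True

# Workaround (assumption): count hashable tuple copies of the rows instead.
counts = Counter(tuple(row) for row in rows)
result = sorted(rows, key=lambda row: counts[tuple(row)], reverse=True)

print(failed)  # True
print(result)  # [[3], [3], [3], [1, 2], [1, 2]]
```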

1

If you are already using numpy, or if using it is an option, here is another alternative:

In [309]: import numpy as np

In [310]: a = [1, 2, 3, 3, 1, 3, 5, 4, 4, 4, 5, 5, 5]

In [311]: vals, counts = np.unique(a, return_counts=True)

In [312]: order = np.argsort(counts)[::-1]

In [313]: np.repeat(vals[order], counts[order])
Out[313]: array([5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 1, 1, 2])

That result is a numpy array. If you want a Python list, call the array's tolist() method:

In [314]: np.repeat(vals[order], counts[order]).tolist()
Out[314]: [5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 1, 1, 2]

- Warren Weckesser

0

Sort by number of occurrences in the array, and by value within groups of equal count:

rev = True

arr = [6, 6, 5, 2, 9, 2, 5, 9, 2, 5, 6, 5, 4, 6, 9, 1, 2, 3, 4, 7, 8, 8, 8, 2]
print(arr)

arr.sort(reverse=rev)

# Count how many times each value occurs.
ARR = {}
for n in arr:
  if n not in ARR:
    ARR[n] = 0
  ARR[n] += 1

# Sort by (count, value) and repeat each value count times.
arr = []
for k, v in sorted(ARR.items(), key=lambda kv: (kv[1], kv[0]), reverse=rev):
  arr.extend([k]*v)
print(arr)

Result:

[6, 6, 5, 2, 9, 2, 5, 9, 2, 5, 6, 5, 4, 6, 9, 1, 2, 3, 4, 7, 8, 8, 8, 2]
[2, 2, 2, 2, 2, 6, 6, 6, 6, 5, 5, 5, 5, 9, 9, 9, 8, 8, 8, 4, 4, 7, 3, 1]

- Ivan Motin

0

Dart solution

String sortedString = '';
Map map = {};
for (int i = 0; i < s.length; i++) {
   map[s[i]] = (map[s[i]] ?? 0) + 1;
   // OR 
   //  map.containsKey(s[i])
   //   ? map.update(s[i], (value) => ++value)
   //   : map.addAll({s[i]: 1});
}
var sortedByValueMap = Map.fromEntries(
    map.entries.toList()..sort((e1, e2) => e1.value.compareTo(e2.value)));
sortedByValueMap.forEach((key, value) {
  sortedString += key * value;
});
return sortedString.split('').reversed.join();
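For comparison with the Python answers, roughly the same idea for a string might look like this in Python. This is only a sketch: the function name frequency_sort is hypothetical, and the order of equally frequent characters may differ from what the Dart code produces.

```python
from collections import Counter

def frequency_sort(s):
    # most_common() yields (character, count) pairs, highest count first.
    return "".join(ch * n for ch, n in Counter(s).most_common())

print(frequency_sort("tree"))  # eetr
```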

- mohammed gamal

0

Not a very interesting way...

from collections import Counter

a = [1,1,2,3,3,3,4,4,4,5,5,5,5]

result = []
for v, times in sorted(Counter(a).items(), key=lambda x: x[1], reverse=True):
    result += [v] * times

One-liner:

from functools import reduce
reduce(lambda acc, b: acc + [b[0]] * b[1], sorted(Counter(a).items(), key=lambda x: x[1], reverse=True), [])

- Kei Minagawa


- thefourtheye · Accepted Answer

from collections import Counter
print([item for items, c in Counter(a).most_common() for item in [items] * c])
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

An even better (more efficient) implementation

from collections import Counter
from itertools import repeat, chain
print(list(chain.from_iterable(repeat(i, c) for i, c in Counter(a).most_common())))
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

Or

from collections import Counter
print(sorted(a, key=Counter(a).get, reverse=True))
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]

If you prefer to sort in place

a.sort(key=Counter(a).get, reverse=True)
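The comments on the question note that the relative order of equally frequent values (3 and 4 in the example) does not matter. If it did, one possible sketch is to sort on a composite key. Using ascending value as the tie-breaker is an assumption made for illustration, and the input below deliberately lists the 4s before the 3s to make the tie-break visible.

```python
from collections import Counter

a = [1, 1, 2, 4, 4, 4, 3, 3, 3, 5, 5, 5, 5]
counts = Counter(a)

# Negative count puts the most frequent values first; the value breaks ties.
result = sorted(a, key=lambda x: (-counts[x], x))
print(result)  # [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
```

The same key works for a list of strings, since strings compare with each other just as integers do.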