Find the most common subarray in a numpy array

Question

4

Example data:

array(
  [[ 1.,  1.],
   [ 2.,  1.],
   [ 0.,  1.],
   [ 0.,  0.],
   [ 0.,  0.]])

The desired result is:

>>> [0.,0.]

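i.e., the most common pair.

Approaches that don't seem to work:

Using the statistics module, because numpy arrays are not hashable.

Using scipy.stats.mode, because it returns the mode along each axis; e.g., for our example it gives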

mode=array([[ 0.,  1.]])

- draco_alpine

2 Answers

2

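One way to do this with the standard library is collections.Counter.

This gives you the most common pair and its count. Use the [0] index on Counter.most_common() to retrieve the entry with the highest count.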

import numpy as np
from collections import Counter

A = np.array(
  [[ 1.,  1.],
   [ 2.,  1.],
   [ 0.,  1.],
   [ 0.,  0.],
   [ 0.,  0.]])

c = Counter(map(tuple, A)).most_common()[0]

# ((0.0, 0.0), 2)

The only complication is that you need to convert each row to a tuple, because Counter only accepts hashable objects.
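
To make the hashability point concrete, here is a minimal sketch. The helper name most_common_row is made up for illustration and is not part of the original answer. It first shows that hashing a row of the array fails, then wraps the Counter approach in a function that converts each row to a tuple of plain Python numbers.

```python
import numpy as np
from collections import Counter

A = np.array([[1., 1.], [2., 1.], [0., 1.], [0., 0.], [0., 0.]])

# A numpy row cannot be hashed, which is why Counter cannot count rows directly.
try:
    hash(A[0])
    row_is_hashable = True
except TypeError:
    row_is_hashable = False

def most_common_row(arr):
    """Hypothetical helper: return (row, count) for the most frequent row of a 2-D array."""
    # tolist() yields plain Python numbers, so each tuple is hashable.
    counts = Counter(tuple(row.tolist()) for row in arr)
    # most_common(1) returns a one-element list holding the top (row, count) pair.
    return counts.most_common(1)[0]

row, count = most_common_row(A)
print(row, count)
```

Asking for most_common(1) avoids sorting the full list of counts when only the top entry is needed.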

- jpp

- piman314 · Accepted Answer

You can do this efficiently with numpy's unique function:

import numpy as np
a = np.array([[1., 1.], [2., 1.], [0., 1.], [0., 0.], [0., 0.]])
pairs, counts = np.unique(a, axis=0, return_counts=True)
print(pairs[counts.argmax()])

Output:

[0. 0.]
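
One thing worth noting about this approach is how ties are handled. The sketch below uses made-up data (not the question's array) in which two rows each appear twice. Because np.unique returns the unique rows in sorted order and argmax returns the first maximum, the tie resolves to the smallest row; a boolean mask on the counts recovers every tied row instead.

```python
import numpy as np

# Illustrative data: [0., 0.] and [2., 1.] both occur twice.
a = np.array([[1., 1.], [2., 1.], [2., 1.], [0., 0.], [0., 0.]])

pairs, counts = np.unique(a, axis=0, return_counts=True)

# argmax picks the first maximum; pairs is sorted, so [0., 0.] wins the tie.
first = pairs[counts.argmax()]

# Keep every row whose count equals the maximum count.
all_modes = pairs[counts == counts.max()]

print(first)
print(all_modes)
```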
