Ignoring NaN values with numpy.argpartition

Question

python, python-3.x, numpy, sorting, numpy-ndarray

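I have a large array of about 49 million items (7000*7000) in which I need to find the N largest items and their indices, ignoring all NaNs. I cannot remove the NaNs beforehand, because I need the indices of the N largest items in the first array to extract data from a second array, and the NaNs in that second array sit at different positions from those in the first. I tried: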

np.argpartition(first_array, -N)[-N:]

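This works well for arrays without NaN, but when NaNs are present they are treated as the largest items, because NumPy orders NaN after every other value when sorting.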

import numpy as np

x = np.array([np.nan, 2, -1, 2, -4, -8, -9, 6, -3]).reshape(3, 3)
y = np.argpartition(x.ravel(), -3)[-3:]  # flat indices of the 3 "largest" items
z = x.ravel()[y]
# this is the result I am getting: [2, 6, nan]
# but I need this: [2, 2, 6]

- Gokul

Check out np.nanargmax. - yatu

1 Answer

- Divakar · Accepted Answer

Count the NaNs and use that count as an offset when slicing the partitioned indices, then extract the values.

In [200]: N = 3

In [201]: c = np.isnan(x).sum()

In [204]: idx = np.argpartition(x.ravel() , -N-c)[-N-c:-c]

In [207]: val = x.flat[idx]

In [208]: idx,val
Out[208]: (array([1, 3, 7]), array([2., 2., 6.]))
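
One caveat with the slice above: when the array contains no NaN, c is 0 and `[-N-c:-c]` becomes `[-N:0]`, which is empty. The following is a minimal sketch of a variant that avoids the offset, not part of the original answer: it partitions only the non-NaN entries and maps the result back to indices of the full array. The names `nan_top_n` and `other` are hypothetical, chosen only for illustration.

```python
import numpy as np

def nan_top_n(a, n):
    """Return flat indices of the n largest non-NaN values of a (in no particular order)."""
    flat = np.asarray(a, dtype=float).ravel()
    # Flat positions of the entries that are not NaN.
    valid_idx = np.flatnonzero(~np.isnan(flat))
    # Cannot return more items than there are non-NaN values.
    n = min(n, valid_idx.size)
    if n == 0:
        return np.empty(0, dtype=np.intp)
    # Partition only the valid values, then map back to positions in the full array.
    part = np.argpartition(flat[valid_idx], -n)[-n:]
    return valid_idx[part]

x = np.array([np.nan, 2, -1, 2, -4, -8, -9, 6, -3]).reshape(3, 3)
other = np.arange(9).reshape(3, 3) * 10  # hypothetical second array of the same shape

idx = nan_top_n(x, 3)
rows, cols = np.unravel_index(idx, x.shape)  # 2-D coordinates, usable on either array

print(np.sort(idx))          # [1 3 7]
print(np.sort(x.flat[idx]))  # [2. 2. 6.]
print(other[rows, cols])     # matching entries of the second array
```

This sketch copies the non-NaN values once, which costs extra memory on a 7000*7000 array. The returned indices are unordered, as with `np.argpartition` itself, so sort them afterwards if a ranking is needed.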