Finding the peaks of a 2D histogram

Question


python numpy matplotlib


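I made a 2D histogram of some (x, y) data and got an image like this one:

[Image: histogram-2d]

I would like a way to get the (x, y) coordinates of the point(s) that hold the maximum values stored in H. For example, in the image above these would be two points with approximate coordinates (1090, 1040) and (1110, 1090).

Here is my code: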

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from os import getcwd
from os.path import join, realpath, dirname

# Path to dir where this code exists.
mypath = realpath(join(getcwd(), dirname(__file__)))
myfile = 'datafile.dat'

x, y = np.loadtxt(join(mypath,myfile), usecols=(1, 2), unpack=True)

fig = plt.figure()
ax = fig.add_subplot(111)

xmin, xmax = min(x), max(x)
ymin, ymax = min(y), max(y)

rang = [[xmin, xmax], [ymin, ymax]]

binsxy = [int((xmax - xmin) / 20), int((ymax - ymin) / 20)]

H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)

extent = [yedges[0], yedges[-1], xedges[0], xedges[-1]]
cp = ax.imshow(H.transpose()[::-1], interpolation='nearest', extent=extent, cmap=cm.jet)
fig.colorbar(cp)

plt.show()

Edit

I have tried the solutions posted by Marek and qarma, attempting to get the coordinates of the bins rather than their indices, like so:

# Marek's answer
x_cent, y_cent = np.unravel_index(H.argmax(), H.shape)
print('Marek')
print(x_cent, y_cent)
print(xedges[x_cent], yedges[y_cent])

# qarma's answer
idx = list(H.flatten()).index(H.max())
x_cent2, y_cent2 = idx // H.shape[1], idx % H.shape[1]
local_maxs = np.argwhere(H == H.max())
print('\nqarma')
print(x_cent2, y_cent2)
print(xedges[x_cent2], yedges[y_cent2])
print(xedges[local_maxs[0,0]], yedges[local_maxs[0,1]], xedges[local_maxs[1,0]], yedges[local_maxs[1,1]])

This results in:

Marek
(53, 50)
(1072.7838144329899, 1005.0837113402063)

qarma
(53, 50)
(1072.7838144329899, 1005.0837113402063)
(1072.7838144329899, 1005.0837113402063, 1092.8257731958763, 1065.3611340206187)

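So the coordinates of the maximum agree, which is good! Now I have a small issue: when I zoom in on the 2D plot, I see that the coordinates are slightly off for both the global and the local maximum:

[Image: zoomed-in view of the 2D histogram]

Why is this?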

- Gabriel

What about scipy.signal.argrelextrema? https://dev59.com/PW445IYBdhLWcg3w7unD#13491866 - Zeugma


Possible solution: Peak detection in a 2D array. Depending on your data, though, you may need to adjust the size of the neighborhood. - unutbu

That is a very good question, thanks for pointing me to it! I'll definitely look at it when I have more time, since it's quite long. Cheers! - Gabriel

3 Answers


Here is how to find the first global maximum.

idx = list(H.flatten()).index(H.max())
# integer division, so the result can be used as an index
x, y = idx // H.shape[1], idx % H.shape[1]

Finding the coordinates of all maxima is left as an exercise to the reader...

numpy.argwhere(H == H.max())

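Edit

Your code:

H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)

Here H contains the histogram values, while xedges and yedges are the boundaries of the histogram bins. Note that each edges array is one element longer than H along the corresponding dimension. Thus: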

for x, y in numpy.argwhere(H == H.max()):
    # the bin center lies between edges x and x+1
    print(numpy.average(xedges[x:x + 2]), numpy.average(yedges[y:y + 2]))
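
The same bin-center idea can be written in vectorized form. This is a minimal sketch, assuming synthetic Gaussian data in place of `datafile.dat`; the names `x_demo`, `y_demo`, `x_centers`, and `y_centers` are hypothetical and not part of the question's code.

```python
import numpy as np

# Synthetic stand-in for the (x, y) columns of the data file (an assumption).
rng = np.random.default_rng(0)
x_demo = rng.normal(1100.0, 30.0, size=5000)
y_demo = rng.normal(1050.0, 30.0, size=5000)

H, xedges, yedges = np.histogram2d(x_demo, y_demo, bins=[20, 25])

# Each edges array has one more element than H along its axis,
# so the centers are the midpoints of consecutive edges.
x_centers = 0.5 * (xedges[:-1] + xedges[1:])
y_centers = 0.5 * (yedges[:-1] + yedges[1:])

# Report the center of every bin that holds the global maximum.
for i, j in np.argwhere(H == H.max()):
    print(x_centers[i], y_centers[j])
```

Indexing `xedges[i]` directly returns the lower edge of bin `i`, which sits half a bin width below the center, so it may account for part of the offset seen when zooming in. Separately, `H.transpose()` puts x along the horizontal axis, so an `extent` passed to `imshow` would normally list the x edges first; the question's code lists the y edges first, which may also contribute to the apparent shift.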

- Dima Tisnek

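Could you please look at the edit I made and see whether you can explain the offset I'm seeing? - Gabriel

If the "edges" arrays have one more element, why is it "x + 2" rather than "x + 1"? - Gabriel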


Because len(foo[1:1+2]) == 2. - Dima Tisnek


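This question may help you: Python: get the position of the biggest item in a numpy array. You can get the maximum value with H.max(), then compare it against H and use numpy.nonzero to find the positions of all maxima: numpy.nonzero(H.max() == H). This is more expensive than H.argmax(), but you get all of the maxima.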
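
A small sketch of the difference between the two calls, using a toy array with a tied maximum; `H_demo`, `first`, and `all_maxima` are hypothetical names chosen for illustration.

```python
import numpy as np

# Toy array in which the maximum value 9 occurs twice.
H_demo = np.array([[1, 9, 2],
                   [9, 0, 3]])

# argmax reports only the first maximum in flattened (row-major) order.
first = np.unravel_index(H_demo.argmax(), H_demo.shape)

# nonzero on the boolean mask reports every position holding the maximum.
rows, cols = np.nonzero(H_demo == H_demo.max())
all_maxima = list(zip(rows.tolist(), cols.tolist()))

print(first, all_maxima)
```

The mask comparison scans the whole array a second time, which is the extra cost mentioned above.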

- David Marek

Could you please look at the edit I made and see whether you can explain the offset I'm seeing? - Gabriel


- erdogant · Accepted Answer

You can use the findpeaks library.

pip install findpeaks

I can't see your data, but let me try with another, similar example:

from findpeaks import findpeaks

# initialize with default parameters. The "denoise" parameter can be of use in your case
fp = findpeaks()
# import 2D example dataset
img = fp.import_example()
# make the fit
fp.fit(img)
# Make plot
fp.plot()

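The persistence can be used to determine the impact of each peak. You can see that points 1, 2, and 3 show the strongest peaks, followed by the rest.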

# Compute persistence
fp.plot_persistence()