Calculate the overlap area of two functions

Question

python numpy scipy montecarlo

9

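I need to calculate the area where two functions overlap. In this simplified example I use normal distributions, but I need a more general procedure that also works for other functions.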

The image below shows what I mean, where the red area is what I'm after: [image]

Here is the minimal working example I have so far:

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Generate random data, normally distributed.
a = np.random.normal(1., 0.1, 1000)
b = np.random.normal(1., 0.1, 1000)

# Obtain KDE estimates for each set of data.
xmin, xmax = -1., 2.
x_pts = np.mgrid[xmin:xmax:1000j]
# Kernels.
ker_a = stats.gaussian_kde(a)
ker_b = stats.gaussian_kde(b)
# KDEs for plotting.
kde_a = np.reshape(ker_a(x_pts).T, x_pts.shape)
kde_b = np.reshape(ker_b(x_pts).T, x_pts.shape)


# Random sample from a KDE distribution.
sample = ker_a.resample(size=1000)

# Compute the points below which to integrate.
iso = ker_b(sample)

# Filter the sample.
insample = ker_a(sample) < iso

# As per Monte Carlo, the integral is equivalent to the
# probability of drawing a point that gets through the
# filter.
integral = insample.sum() / float(insample.shape[0])

print(integral)

plt.xlim(0.4,1.9)
plt.plot(x_pts, kde_a)
plt.plot(x_pts, kde_b)

plt.show()

I use a Monte Carlo method to compute the integral.

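The problem with this approach is that when I evaluate the sampled points in either distribution with ker_b(sample) (or ker_a(sample)), the values I get lie directly on the KDE line. Because of this, even clearly overlapping distributions, which should return a common/overlap area very close to 1, return small values instead (the total area under either curve is 1, since they are probability density estimates).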

How can I modify this code to get the expected result?
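One possible way to keep the Monte Carlo approach, offered as a sketch and not as the fix the thread settles on: the overlap area satisfies ∫ min(f, g) dx = E_f[min(1, g(x)/f(x))], so each point drawn from ker_a can contribute the weight min(1, g/f) instead of passing or failing a filter. The filter in the code above estimates the probability that f(x) < g(x), which is a different quantity. The seed, the sample size, and the name overlap_mc are illustrative assumptions.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # illustrative seed, only for repeatability

# Two data sets drawn from the same normal distribution, as in the question.
a = np.random.normal(1.0, 0.1, 1000)
b = np.random.normal(1.0, 0.1, 1000)

ker_a = stats.gaussian_kde(a)
ker_b = stats.gaussian_kde(b)

# Draw points from the first KDE; for 1-D data the result has shape (1, n).
sample = ker_a.resample(size=5000)

# Evaluate both density estimates at the sampled points.
f_vals = ker_a(sample)
g_vals = ker_b(sample)

# Each point contributes min(1, g/f); the mean estimates the overlap area.
weights = np.minimum(1.0, g_vals / f_vals)
overlap_mc = weights.mean()

print(overlap_mc)
```

With both data sets drawn from the same distribution, the estimate should come out close to 1, which is the behaviour the question expects.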

This is how I applied Zhenya's answer.

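# Calculate overlap between the two KDEs.
from scipy.integrate import quad

def y_pts(pt):
    # Height of the lower of the two KDE curves at pt.
    y_pt = min(ker_a(pt)[0], ker_b(pt)[0])
    return y_pt
# Store overlap value; quad returns (integral, estimated absolute error).
overlap, abserr = quad(y_pts, -1., 2.)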

- Gabriel

1

http://stackoverflow.com/questions/15361125/calculate-area-between-two-curves-that-are-normal-distributions/15361352#15361352 - ev-br

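I'm looking at your answer in the linked question, and although I initially thought it only applied to normal distributions, it seems to apply here too. Would you mind posting your comment as an answer? That way, if it does work, I can mark it as accepted. Thanks. - Gabriel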

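That answer uses quadrature; is that an option here? If Monte Carlo is required, the code above needs some changes. I wish I understood your closing comment: the sentence beginning "the values I get lie directly on the KDE..." is cryptic to me. - Charles Pehlivanian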

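Hi @CharlesPehlivanian, by "lie directly on" I mean that evaluating a point in a kernel (for example ker_a) returns the kernel's value at that point, as any other function would. For example, f(x) = x^2 returns a value located on the quadratic curve at the given x, whereas to apply Monte Carlo I need the points randomly distributed below that curve. In any case, this seems like an overly complicated approach. If Zhenya posts his answer, I'll update the question to reflect it. - Gabriel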

I think I've found a fairly simple answer, linked here: https://dev59.com/7p7ha4cB1Zd3GeqPmKLv#48582324 - Karop

I think I've found a simple answer in R, linked here: https://dev59.com/7p7ha4cB1Zd3GeqPmKLv#48582324 - Karop

2 Answers

0

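I think another solution is to multiply the two curves and then integrate. You may need some form of normalization. It is analogous to orbital overlap in chemistry: https://en.wikipedia.org/wiki/Orbital_overlap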
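A minimal sketch of this idea, under the assumption that "some form of normalization" means dividing by the square root of the two self-overlaps (the answer does not say which normalization it has in mind). The helper name normalized_product_overlap, the two normal densities, and the integration limits are illustrative choices. Note that this measures similarity on a 0 to 1 scale but is not the same quantity as the shaded area under both curves.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Two example densities; any pair of non-negative functions would do.
f = stats.norm(loc=0.0, scale=1.0).pdf
g = stats.norm(loc=1.0, scale=1.0).pdf


def normalized_product_overlap(f, g, lo, hi):
    """Integral of f*g divided by sqrt(integral of f^2 * integral of g^2)."""
    fg, _ = quad(lambda x: f(x) * g(x), lo, hi)
    ff, _ = quad(lambda x: f(x) ** 2, lo, hi)
    gg, _ = quad(lambda x: g(x) ** 2, lo, hi)
    return fg / np.sqrt(ff * gg)


same = normalized_product_overlap(f, f, -10.0, 10.0)     # identical curves
shifted = normalized_product_overlap(f, g, -10.0, 10.0)  # means one sigma apart

print(same, shifted)
```

For two unit-variance normals whose means differ by 1, this normalized product works out analytically to exp(-1/4), while identical curves give 1.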

- wordsforthewise

- ev-br · Accepted Answer

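The red area in the plot is the integral of min(f(x), g(x)), where f and g are the two functions, green and blue. To compute the integral you can use any of the integrators from scipy.integrate (I'd suggest quad), or an MC integrator, although I don't quite see the point of that.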
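As a self-contained illustration of the quad suggestion, here is a sketch that uses two known normal densities in place of the KDEs, so the result can be checked against a closed form. The helper name overlap_area, the chosen densities, and the integration limits are assumptions made for the example; the points argument simply tells quad where the two curves cross, which is where min(f, g) has a kink.

```python
from scipy import stats
from scipy.integrate import quad

# Stand-ins for the two curves: N(0, 1) and N(1, 1).
f = stats.norm(loc=0.0, scale=1.0).pdf
g = stats.norm(loc=1.0, scale=1.0).pdf


def overlap_area(f, g, lo, hi, breakpoints=None):
    """Integrate the pointwise minimum of f and g over [lo, hi]."""
    area, abserr = quad(lambda x: min(f(x), g(x)), lo, hi, points=breakpoints)
    return area, abserr


# The curves cross at x = 0.5; the tails beyond the limits are negligible.
area, abserr = overlap_area(f, g, -10.0, 11.0, breakpoints=[0.5])

# Closed form for two unit-variance normals one sigma apart: 2 * Phi(-0.5).
expected = 2 * stats.norm.cdf(-0.5)

print(area, expected)
```

The same helper should accept any pair of scalar-valued callables, for example lambda x: ker_a(x)[0] for a gaussian_kde object, as long as the limits cover the region where both curves are non-negligible.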