How to plot maximum likelihood estimates in Python

Question

python numpy statistics machine-learning

5

I am drawing samples from an exponential distribution. In my first experiment I draw 1,000 samples from it, and in my second experiment I draw 10,000 samples (using numpy.random.exponential).

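I want to compare visually how the maximum likelihood estimates from the two experiments differ. (Because this is an exponential distribution, the MLE is just the sample mean, so in my second experiment the MLE should be closer to the true density.)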
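
For reference, the claim that the MLE is the sample mean follows from a short derivation. This is a sketch that assumes the distribution is parameterized by its mean \mu (the `scale` argument of numpy.random.exponential); the symbols \mu, \lambda, and \ell are introduced here and do not appear in the original post.

```latex
f(x;\mu) = \frac{1}{\mu}\, e^{-x/\mu}, \qquad x \ge 0

\ell(\mu) = \sum_{i=1}^{n} \log f(x_i;\mu)
          = -n \log \mu - \frac{1}{\mu} \sum_{i=1}^{n} x_i

\frac{d\ell}{d\mu} = -\frac{n}{\mu} + \frac{1}{\mu^{2}} \sum_{i=1}^{n} x_i = 0
\quad\Longrightarrow\quad
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}
```

If the distribution is instead written in terms of the rate \lambda = 1/\mu, the invariance of the MLE gives \hat{\lambda} = 1/\bar{x}.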

How can I make such a comparison in Python? I know how to draw plots in matplotlib, but I don't know what type of plot I should use.

- user103021

3

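I'm not sure I understand. You have two MLEs, that is, two numbers. Plotting them won't tell you much more than looking at the numbers themselves. Alternatively, you could compute the MLE for a range of sample sizes and plot sample size against MLE, then compare that with the true value. That might work better. - Avaris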

Sorry for the confusion. I want to plot something like this: http://nipy.sourceforge.net/nitime/_images/ar_est_2vars_01.png . I want to show the true density and my estimated version. - user103021

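There is still some confusion, but I think it is about the math. The MLE gives you an estimate of a single parameter, not of a density. For the exponential distribution, though, you can use the estimated mean to get an estimated density, because the mean and the density's parameter are directly related. Is that what you want? - Avaris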

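Yes, I am using the MLE to get an estimate of the density's parameter. But I want to create a fancy visualization for it. I want to compare the process for a small sample and a large sample (so with 10,000 samples the estimate will be closer to the true parameter than with 1,000 samples). Just plotting size against MLE is not fancy, because I only have 2 sizes. - user103021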

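Well, you know the exponential density function. For a range of x values, compute the density for both the true parameter and the estimated parameter, and plot them. But note that even 1,000 is a fairly large sample size, so your estimate will be very close to the true value. You may not see much difference between the densities. - Avaris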

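There is a closed-form expression for the distribution of the MLE of the rate of an exponential distribution (it's a normal!). But if you want to do it numerically, just repeat the experiment a few hundred times, then compute the densities and plot them. - D3C34C34D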
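
The numerical approach from the last comment could be sketched as follows. This is a minimal illustration, not code from the thread: the function names `simulate_mle` and `plot_mle_distributions`, the choice of 300 repetitions, the fixed seed, and a true mean of 1 are all assumptions made here.

```python
import numpy as np


def simulate_mle(n_samples, n_repeats=300, true_mean=1.0, seed=0):
    """Repeat the experiment n_repeats times and return one MLE per run.

    For the exponential distribution the MLE of the mean is the sample mean.
    """
    rng = np.random.default_rng(seed)
    # one row per repetition, one column per sample
    samples = rng.exponential(scale=true_mean, size=(n_repeats, n_samples))
    return samples.mean(axis=1)


def plot_mle_distributions(sizes=(1000, 10000), true_mean=1.0):
    """Overlay histograms of the simulated MLEs for each sample size."""
    # imported here so that simulate_mle can be used without matplotlib
    import matplotlib.pyplot as plt

    for n in sizes:
        mles = simulate_mle(n, true_mean=true_mean)
        plt.hist(mles, bins=30, density=True, alpha=0.5, label="N = %d" % n)
    plt.axvline(true_mean, color="k", label="true mean")
    plt.xlabel("MLE of the mean")
    plt.ylabel("density")
    plt.legend()
    plt.show()
```

Calling `plot_mle_distributions()` draws the two histograms on the same axes. The one for N = 10,000 should be visibly narrower than the one for N = 1,000, which is one way to show the effect of sample size with only two sizes.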

1 Answer

- ahans · Accepted Answer

Based on the clarification in the comments, I am guessing you want something like the following:

import numpy as np
import matplotlib.pyplot as plt

def plot_exponential_density(mu, xmax, fmt, label):
    # evaluate the exponential density with mean mu on [0, xmax) and plot it
    x = np.arange(0, xmax, 0.1)
    y = 1/mu * np.exp(-x/mu)
    plt.plot(x, y, fmt, label=label)

def sample_and_plot(N, color):
    # first sample N values in one vectorized call
    # (the scale defaults to 1, so the true mean is 1)
    samples = np.random.exponential(size=N)

    # determine the mean, which is the MLE of the distribution's mean
    mu = np.mean(samples)
    print("N = %d  ==> mu = %f" % (N, mu))

    # plot a histogram of the samples
    (n, bins) = np.histogram(samples, bins=int(np.sqrt(N)), density=True)
    # where="post" draws each bin's height starting at its left edge
    plt.step(bins[:-1], n, where="post", color=color, label="samples N = %d" % N)

    xmax = max(bins)

    # plot the density according to the estimated mean
    plot_exponential_density(mu, xmax, color + "--", label="estimated density N = %d" % N)

    return xmax


# sample 100 values, draw a histogram, and the density according to
# the estimated mean
xmax1 = sample_and_plot(100, 'r')
# do the same for 10000 samples
xmax2 = sample_and_plot(10000, 'b')

# finally plot the true density
plot_exponential_density(1, max(xmax1, xmax2), 'k', "true density")

# add a legend
plt.legend()

# and show the plot
plt.show()

[Figure: plot produced by the script above]

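I used 100 and 10,000 samples because with 1,000 samples the estimate is already quite good. But even with only 100 samples I am a bit surprised at how good the estimates of the mean, and therefore of the density, are. Without knowing that the samples come from an exponential distribution, I am not sure I would recognize an exponential distribution from the histogram alone...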
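
The accuracy at N = 100 is consistent with the standard error of the sample mean. An exponential distribution with mean mu has standard deviation mu, so the standard error is mu / sqrt(N): about 0.1 for N = 100 and 0.01 for N = 10,000 when mu = 1. The sketch below tabulates the MLE and this theoretical standard error for several sample sizes. The function name `mle_vs_sample_size`, the list of sizes, and the fixed seed are assumptions made for illustration and are not part of the answer above.

```python
import numpy as np


def mle_vs_sample_size(sizes, true_mean=1.0, seed=0):
    """Return the MLE of the mean and its theoretical std. error per sample size."""
    rng = np.random.default_rng(seed)
    # the MLE of the mean is the sample mean of each independent draw
    estimates = np.array(
        [rng.exponential(scale=true_mean, size=n).mean() for n in sizes]
    )
    # std. error of the sample mean: sigma / sqrt(N), with sigma = true_mean
    std_errors = true_mean / np.sqrt(np.asarray(sizes, dtype=float))
    return estimates, std_errors


sizes = [10, 100, 1000, 10000, 100000]
estimates, std_errors = mle_vs_sample_size(sizes)
for n, est, se in zip(sizes, estimates, std_errors):
    print("N = %6d  MLE = %.4f  theoretical std. error = %.4f" % (n, est, se))
```

Plotting `estimates` against `sizes` on a logarithmic x axis (for example with `plt.semilogx`) and marking the true mean with `plt.axhline` gives the size-versus-MLE plot suggested in the comments.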