How do I compute and plot a LOWESS curve in Python?


python matplotlib plot statistics statsmodels

6

How can I use Python to find and plot a LOWESS curve like the one below?

[Image: loess curve example]

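I am aware of the LOWESS implementation in statsmodels, but it does not appear to provide 95% confidence interval lines that I could shade between. Seaborn has a method that calls the statsmodels implementation, but it cannot plot confidence intervals either. Other StackOverflow answers give code for plotting a LOESS/LOWESS line, but without confidence intervals. Can anyone help? Does anyone know of an existing implementation that would let me do this?

Thanks in advance.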

- Suriname0

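Is this what you need? https://dev59.com/_lIH5IYBdhLWcg3watfP#59747610 - Jiadong

@ted930511 Thanks, but no; it looks like the earlier comments have been archived, but this question is about computing proper confidence intervals for a LOWESS curve. For searchers, the current answer is "use R if you want to compute LOWESS confidence intervals" or "if you must use Python, implement them from the original paper". - Suriname0

As far as I know, you have to implement it yourself. Take a look at this link: https://james-brennan.github.io/posts/lowess_conf/ - Jiadong

Yes, it looks like that blog post provides a usable implementation. I wish it had existed 3 years ago! If you leave an answer with that link and a short description of the blog's content, I'll mark it as the solution. - Suriname0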

1 Answer


- Jiadong · Accepted Answer

I found this link useful here; the code is below:

def lowess(x, y, f=1./3.):
    """
    Basic LOWESS smoother with uncertainty. 
    Note:
        - Not robust (so no iteration) and
             only normally distributed errors. 
        - No higher order polynomials d=1 
            so linear smoother.
    """
    # get some paras
    xwidth = f*(x.max()-x.min()) # effective width after reduction factor
    N = len(x) # number of obs
    # Don't assume the data is sorted
    order = np.argsort(x)
    # storage
    y_sm = np.zeros_like(y, dtype=float)
    y_stderr = np.zeros_like(y, dtype=float)
    # define the weighting function -- clipping too!
    tricube = lambda d : np.clip((1- np.abs(d)**3)**3, 0, 1)
    # run the regression for each observation i
    for i in range(N):
        dist = np.abs((x[order][i]-x[order]))/xwidth
        w = tricube(dist)
        # form linear system with the weights
        A = np.stack([w, x[order]*w]).T
        b = w * y[order]
        ATA = A.T.dot(A)
        ATb = A.T.dot(b)
        # solve the system
        sol = np.linalg.solve(ATA, ATb)
        # predict for the observation only
        yest = A[i].dot(sol)  # same as A.dot(sol)[i]; w[i] == 1 here
        place = order[i]
        y_sm[place] = yest
        # residual variance estimate (A is the weighted design matrix)
        sigma2 = np.sum((A.dot(sol) - y[order])**2) / N
        # Calculate the standard error
        y_stderr[place] = np.sqrt(sigma2 *
                                  A[i].dot(np.linalg.inv(ATA)).dot(A[i]))
    return y_sm, y_stderr
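
For readers following along, this is what the function above computes for each observation i, read directly off the code. The symbols h, d_j, w_j, a_j and b_j are my own notation, not the blog's. Note that each row is multiplied by w_j rather than by its square root, so the least-squares fit effectively weights squared residuals by w_j^2.

```latex
\begin{aligned}
h &= f\,\bigl(\max_j x_j - \min_j x_j\bigr), \qquad
d_j = \frac{|x_i - x_j|}{h}, \qquad
w_j = \operatorname{clip}\!\bigl((1 - d_j^3)^3,\,0,\,1\bigr) \\
a_j &= w_j\,(1,\; x_j)^\top, \qquad b_j = w_j\,y_j \\
\hat\beta &= (A^\top A)^{-1} A^\top b, \qquad \hat y_i = a_i^\top \hat\beta \\
\hat\sigma^2 &= \frac{1}{N}\sum_{j=1}^{N} \bigl(a_j^\top\hat\beta - y_j\bigr)^2 \\
\operatorname{se}(\hat y_i) &= \sqrt{\hat\sigma^2\; a_i^\top (A^\top A)^{-1} a_i}
\end{aligned}
```

The shaded band in the plot is then \hat y_i \pm 1.96\,\operatorname{se}(\hat y_i), which is an approximate 95% interval only under the normal-error assumption stated in the docstring.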


import numpy as np
import matplotlib.pyplot as plt


# make some data
x = 5*np.random.random(100)
y = np.sin(x) * 3*np.exp(-x) + np.random.normal(0, 0.2, 100)
order = np.argsort(x)

#run it
y_sm, y_std = lowess(x, y, f=1./5.)
# plot it
plt.plot(x[order], y_sm[order], color='tomato', label='LOWESS')
plt.fill_between(x[order], y_sm[order] - 1.96*y_std[order],
                 y_sm[order] + 1.96*y_std[order], alpha=0.3, label='LOWESS uncertainty')
plt.plot(x, y, 'k.', label='Observations')
plt.legend(loc='best')
plt.show()

# same fit again, this time with a +/- 1 standard error band
plt.figure()
plt.plot(x[order], y_sm[order], color='tomato', label='LOWESS')
plt.fill_between(x[order], y_sm[order] - y_std[order],
                 y_sm[order] + y_std[order], alpha=0.3, label='LOWESS +/- 1 std. error')
plt.plot(x, y, 'k.', label='Observations')
plt.legend(loc='best')
plt.show()
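
If the normal-error assumption behind the analytic standard error is a concern, a percentile bootstrap is one possible cross-check. To be clear, this is a different technique from the one in the blog post, and the sketch below is only an illustration: `local_linear` and `bootstrap_band` are hypothetical helper names, and the smoother here uses ordinary tricube-weighted least squares, so it will not reproduce the `lowess()` output above exactly.

```python
import numpy as np


def local_linear(x, y, x_eval, h):
    """Tricube-weighted local linear fit of y on x, evaluated at x_eval.

    h is the window half-width in the units of x (hypothetical helper).
    """
    fitted = np.empty(len(x_eval), dtype=float)
    for k, x0 in enumerate(x_eval):
        d = np.abs(x - x0) / h
        w = np.clip((1 - d**3)**3, 0, 1)  # tricube weights, zero outside the window
        # design matrix centred on x0, so the intercept is the fitted value at x0
        X = np.column_stack([np.ones_like(x), x - x0])
        XtW = X.T * w
        # lstsq instead of solve: tolerates a near-singular window in a resample
        beta = np.linalg.lstsq(XtW @ X, XtW @ y, rcond=None)[0]
        fitted[k] = beta[0]
    return fitted


def bootstrap_band(x, y, x_eval, f=1/3, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap band for the local linear fit (assumed approach)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    h = f * (x.max() - x.min())  # same window definition as lowess() above
    curves = np.empty((n_boot, len(x_eval)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample (x, y) pairs with replacement
        curves[b] = local_linear(x[idx], y[idx], x_eval, h)
    alpha = 1 - level
    lower, upper = np.quantile(curves, [alpha / 2, 1 - alpha / 2], axis=0)
    return local_linear(x, y, x_eval, h), lower, upper
```

The three returned arrays can be passed to `plt.plot` and `plt.fill_between` in the same way as `y_sm` and `y_std` above, using an evaluation grid such as `np.linspace(x.min(), x.max(), 50)` for `x_eval`. Resampling pairs keeps the sketch short; it is slower than the analytic formula because the smoother is refitted `n_boot` times.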