Plotting a FacetGrid of QQ-plots with seaborn

3
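I can't get a FacetGrid of QQ-plots to work with seaborn.
I have a matrix with m rows (observations) and n columns (features), and I want to draw a QQ-plot for each feature (column) to compare it against the normal distribution.
This is my code so far: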
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.stats as ss
import seaborn as sns

def qqplots(fpath, expr, title):

    def quantile_plot(x, **kwargs):
        x = ss.zscore(x)
        qntls, xr = ss.probplot(x, dist="norm")
        plt.scatter(xr, qntls, **kwargs)

    expr_m = pd.melt(expr)
    expr_m.columns = ["Feature", "Value"]
    n_feat = len(expr_m["Feature"].value_counts().index)

    n_cols = int(np.sqrt(n_feat)) + 1

    g = sns.FacetGrid(expr_m, col="Feature", col_wrap=n_cols)
    g.map(quantile_plot, "Value");
    plt.savefig(fpath + ".pdf", bbox_inches="tight")
    plt.savefig(fpath + ".png", bbox_inches="tight")
    plt.close()

qqplots("lognorm_qqplot", np.log2(expr), "Log-normal qqplot")

The variable expr is a pandas DataFrame with m rows (observations) and n columns (features).

This is the exception I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-52-f9333a55702e> in <module>()
     39     plt.close()
     40 
---> 41 qqplots("lognorm_qqplot", np.log2(expr), "Log-normal qqplot")

<ipython-input-52-f9333a55702e> in qqplots(fpath, expr, title)
     34 
     35     g = sns.FacetGrid(expr_m, col="Feature", col_wrap=n_cols)
---> 36     g.map(quantile_plot, "Value");
     37     plt.savefig(fpath + ".pdf", bbox_inches="tight")
     38     plt.savefig(fpath + ".png", bbox_inches="tight")

/usr/local/lib/python3.5/site-packages/seaborn/axisgrid.py in map(self, func, *args, **kwargs)
    726 
    727             # Draw the plot
--> 728             self._facet_plot(func, ax, plot_args, kwargs)
    729 
    730         # Finalize the annotations and layout

/usr/local/lib/python3.5/site-packages/seaborn/axisgrid.py in _facet_plot(self, func, ax, plot_args, plot_kwargs)
    810 
    811         # Draw the plot
--> 812         func(*plot_args, **plot_kwargs)
    813 
    814         # Sort out the supporting information

<ipython-input-52-f9333a55702e> in quantile_plot(y, **kwargs)
     25         y = ss.zscore(y)
     26         qntls, xr = ss.probplot(y, dist="norm")
---> 27         plt.scatter(xr, qntls, **kwargs)
     28 
     29     expr_m = pd.melt(expr)

/usr/local/lib/python3.5/site-packages/matplotlib/pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, hold, data, **kwargs)
   3249                          vmin=vmin, vmax=vmax, alpha=alpha,
   3250                          linewidths=linewidths, verts=verts,
-> 3251                          edgecolors=edgecolors, data=data, **kwargs)
   3252     finally:
   3253         ax.hold(washold)

/usr/local/lib/python3.5/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1810                     warnings.warn(msg % (label_namer, func.__name__),
   1811                                   RuntimeWarning, stacklevel=2)
-> 1812             return func(ax, *args, **kwargs)
   1813         pre_doc = inner.__doc__
   1814         if pre_doc is None:

/usr/local/lib/python3.5/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
   3838         y = np.ma.ravel(y)
   3839         if x.size != y.size:
-> 3840             raise ValueError("x and y must be the same size")
   3841 
   3842         s = np.ma.ravel(s)  # This doesn't have to match x, y in size.

ValueError: x and y must be the same size

Is "ss" a global variable or a module? - giosans
Oops, I forgot to add it. It should be scipy.stats. Thanks for pointing that out. - gc5
1
@fbrundu Not an answer, but you might want to look at how I implemented this here: http://phobson.github.io/mpl-probscale/tutorial/closer_look_at_viz.html#mapping-probability-plots-to-seaborn-facetgrids - Paul H
2 Answers

3
I got this working, and changed the colors to follow the seaborn palette, with the following code:
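import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.stats as ss
import seaborn as sns

def qqplots(fpath, expr, title):

    def quantile_plot(x, **kwargs):
        x = ss.zscore(x)
        # Let probplot draw the points and the fit line on the current axes
        ss.probplot(x, plot=plt)

    # Reshape to long form: one row per (feature, value) pair
    expr_m = pd.melt(expr)
    expr_m.columns = ["Feature", "Value"]
    n_feat = len(expr_m["Feature"].value_counts().index)

    n_cols = int(np.sqrt(n_feat)) + 1

    g = sns.FacetGrid(expr_m, col="Feature", col_wrap=n_cols)
    g.map(quantile_plot, "Value")
    # Recolor the points (line 0) and the fit line (line 1) with the seaborn palette
    for ax in g.axes:
        ax.get_lines()[0].set_markerfacecolor(sns.color_palette()[0])
        ax.get_lines()[1].set_color(sns.color_palette()[3])
    plt.savefig(fpath + ".pdf", bbox_inches="tight")
    plt.savefig(fpath + ".png", bbox_inches="tight")
    plt.close()

qqplots("lognorm_qqplot", np.log2(expr), "Log-normal qqplot")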
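
For context on why the original attempt raised the ValueError: scipy.stats.probplot returns a pair of tuples, (osm, osr) and (slope, intercept, r), so `qntls, xr = ss.probplot(...)` binds a 2-tuple of arrays to `qntls` and the three fit parameters to `xr`, and plt.scatter then sees inputs of different sizes. Below is a minimal sketch of a scatter-based variant that unpacks the result correctly. The synthetic sample and the seed are assumptions for illustration, not data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as ss


def quantile_plot(x, **kwargs):
    """Scatter ordered sample values against theoretical normal quantiles."""
    x = ss.zscore(x)
    # probplot returns ((osm, osr), (slope, intercept, r)):
    # osm = theoretical quantiles, osr = ordered sample values.
    (osm, osr), (slope, intercept, r) = ss.probplot(x, dist="norm")
    ax = plt.gca()
    ax.scatter(osm, osr, **kwargs)
    # Reference line from the least-squares fit that probplot returns.
    ax.plot(osm, slope * osm + intercept, color="k", linewidth=1)
    return osm, osr


rng = np.random.default_rng(0)  # illustrative synthetic sample (assumption)
sample = rng.normal(size=200)
osm, osr = quantile_plot(sample)
plt.close("all")
```

Because FacetGrid.map makes each facet the current axes before calling the function, a function like this could be passed to `g.map(quantile_plot, "Value")` in place of the one in the question.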

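This answer seems to break when applied to qqplot from statsmodels.api. I get an empty grid of plots, followed by each individual qqplot. - SeanM
I can't test it right now. If the code no longer works, please post another answer. Thanks. - gc5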

1

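To answer your question, "I can't get a FacetGrid of QQ-plots to work with seaborn", here is an example using seaborn's tips dataset.

One of the best ways to draw a QQ-plot is the statsmodels library, which has a built-in qqplot function. If no ax is passed as an argument, this function creates a new figure. Using it directly with FacetGrid.map() therefore produces separate figures instead of drawing everything on the grid. To work around this, you can use a user-defined function in which sm.qqplot receives the current axes, retrieved with plt.gca(). Here I created a new function called qqplot_new. In this example, the QQ-plots check whether the data follow a normal distribution.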

from matplotlib import pyplot as plt
import seaborn as sns
import statsmodels.api as sm

tips = sns.load_dataset("tips")

def qqplot_new(x, ax=None, **kwargs):
    if ax is None:
        ax = plt.gca()
    sm.qqplot(x, ax=ax, **kwargs)
    
g = sns.FacetGrid(tips, col="time",  row="sex")
g.map(qqplot_new, "total_bill", line='s')

Output: (image of the resulting grid of QQ-plots)

