Seaborn clustermap FloatingPointError: NaN dissimilarity value

4

I'm trying to run this code:

import pandas as pd
import seaborn as sns

df = pd.DataFrame(clusters, columns=cols)

sns.clustermap(df, cmap="vlag", vmin=0, vmax=1, metric="correlation", 
    z_score=None, standard_scale=None, yticklabels=True, 
    figsize=(size, size))

The value of clusters is:

clusters = [[0.89463602, 0.,         0.,         0.85185185, 0.9023569,  0.,
  0.,         0.83333333, 0.,         0.,         0.,        ],

 [0.75,       0.66666667, 0.,         0.,         0.69444444, 0.,
  0.89272031, 0.,         0.69444444, 0.,         0.69444444,],

 [0.85185185, 0.88910175, 0.,         0.,         0.9043771,  0.,
  0.,         0.,         0.89092141, 0.77777778, 0.69444444,],

 [0.75,       0.89825458, 0.,         0.,         0.77777778, 0.,
  0.8908046,  0.,         0.75,       0.91550069, 0.8,       ],]

I get the following error:

in linkage
    linkage_wrap(N, X, Z, mthidx[method])
FloatingPointError: NaN dissimilarity value.

Any idea what is causing it?


Of these options, metric='correlation' is the one you want to use, and the others are basically optional as far as the computation goes, right? - yosukesabai
1 Answer

5
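Two of your columns are all zeros, so they have zero variance. clustermap clusters the columns as well as the rows by default, and the correlation between a constant column and any other column is undefined (0/0), so the computation returns NaN.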
cols = ["col"+str(i) for i in range(11)]
df = pd.DataFrame(clusters, columns=cols)
df.corr()

            col0       col1  col2       col3       col4  col5       col6       col7       col8       col9      col10
col0    1.000000  -0.652805   NaN   0.755353   0.914034   NaN  -0.971167   0.755353  -0.607892  -0.232318  -0.792705
col1   -0.652805   1.000000   NaN  -0.967396  -0.353987   NaN   0.461102  -0.967396   0.982783   0.761192   0.976659
col2         NaN        NaN   NaN        NaN        NaN   NaN        NaN        NaN        NaN        NaN        NaN
col3    0.755353  -0.967396   NaN   1.000000   0.537949   NaN  -0.577350   1.000000  -0.978166  -0.573568  -0.990826
col4    0.914034  -0.353987   NaN   0.537949   1.000000   NaN  -0.943651   0.537949  -0.352431   0.181392  -0.546475
col5         NaN        NaN   NaN        NaN        NaN   NaN        NaN        NaN        NaN        NaN        NaN
col6   -0.971167   0.461102   NaN  -0.577350  -0.943651   NaN   1.000000  -0.577350   0.401476   0.079648   0.627048
col7    0.755353  -0.967396   NaN   1.000000   0.537949   NaN  -0.577350   1.000000  -0.978166  -0.573568  -0.990826
col8   -0.607892   0.982783   NaN  -0.978166  -0.352431   NaN   0.401476  -0.978166   1.000000   0.665620   0.962359
col9   -0.232318   0.761192   NaN  -0.573568   0.181392   NaN   0.079648  -0.573568   0.665620   1.000000   0.636492
col10  -0.792705   0.976659   NaN  -0.990826  -0.546475   NaN   0.627048  -0.990826   0.962359   0.636492   1.000000

df[['col2','col5']]

   col2 col5
0   0.0 0.0
1   0.0 0.0
2   0.0 0.0
3   0.0 0.0
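
To tie this back to the error message, here is a minimal sketch that computes the column-to-column correlation distances directly with SciPy's pdist. It assumes this is close to what the column clustering step works on; the names col_dist and nan_cols are only illustrative.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

clusters = [
    [0.89463602, 0., 0., 0.85185185, 0.9023569, 0., 0., 0.83333333, 0., 0., 0.],
    [0.75, 0.66666667, 0., 0., 0.69444444, 0., 0.89272031, 0., 0.69444444, 0., 0.69444444],
    [0.85185185, 0.88910175, 0., 0., 0.9043771, 0., 0., 0., 0.89092141, 0.77777778, 0.69444444],
    [0.75, 0.89825458, 0., 0., 0.77777778, 0., 0.8908046, 0., 0.75, 0.91550069, 0.8],
]
cols = ["col" + str(i) for i in range(11)]
df = pd.DataFrame(clusters, columns=cols)

# Pairwise correlation distance between columns (each column is one observation).
col_dist = pd.DataFrame(
    squareform(pdist(df.T.values, metric="correlation")),
    index=cols,
    columns=cols,
)

# Columns whose distance to every other column is NaN.
nan_cols = [c for c in cols if col_dist[c].drop(c).isna().all()]
print(nan_cols)  # for this data these should be the all-zero columns
```

Any NaN in these distances is enough to make the linkage step fail, which matches the "NaN dissimilarity value" message above.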

You can either drop these columns and then plot, or keep them and use euclidean or canberra as the metric instead.
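
Both options can be sketched as follows. This is only an illustration under a few assumptions: it calls scipy.cluster.hierarchy.linkage directly (the function named in the traceback) rather than seaborn, it uses method="average", and the names df_var, Z_cols and Z_all are made up for the example.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage

clusters = [
    [0.89463602, 0., 0., 0.85185185, 0.9023569, 0., 0., 0.83333333, 0., 0., 0.],
    [0.75, 0.66666667, 0., 0., 0.69444444, 0., 0.89272031, 0., 0.69444444, 0., 0.69444444],
    [0.85185185, 0.88910175, 0., 0., 0.9043771, 0., 0., 0., 0.89092141, 0.77777778, 0.69444444],
    [0.75, 0.89825458, 0., 0., 0.77777778, 0., 0.8908046, 0., 0.75, 0.91550069, 0.8],
]
cols = ["col" + str(i) for i in range(11)]
df = pd.DataFrame(clusters, columns=cols)

# Option 1: keep only columns with more than one distinct value,
# then cluster the remaining columns with the correlation metric.
df_var = df.loc[:, df.nunique() > 1]
Z_cols = linkage(df_var.T.values, method="average", metric="correlation")

# Option 2: keep every column and switch to a metric that does not
# divide by the variance, such as euclidean.
Z_all = linkage(df.T.values, method="average", metric="euclidean")

# Both linkage matrices should contain only finite values.
print(np.isfinite(Z_cols).all(), np.isfinite(Z_all).all())
```

With the first option, df_var can then be passed to sns.clustermap in place of df with the same arguments as in the question; with the second, only the metric argument changes.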

