Slicing a Series of Panels

Question


I have a simple dataframe:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0,5,(20, 2)), columns=['col1','col2'])
>>> df['ind1'] = list('AAAAAABBBBCCCCCCCCCC')
>>> df.set_index(['ind1'], inplace=True)
>>> df

      col1  col2
ind1            
A        0     4
A        1     2
A        1     0
A        4     1
A        1     3
A        0     0
B        0     4
B        2     0
B        3     1
B        0     3
C        1     3
C        2     1
C        4     0
C        4     0
C        4     1
C        3     0
C        4     4
C        0     2
C        0     2
C        1     2

I am trying to get the rolling correlation of its two columns:

>>> df.groupby(level=0).rolling(3,min_periods=1).corr()

ind1
A    <class 'pandas.core.panel.Panel'>
Dimensions: ...
B    <class 'pandas.core.panel.Panel'>
Dimensions: ...
C    <class 'pandas.core.panel.Panel'>
Dimensions: ...
dtype: object

The problem is that the result is a Series of Panels:

>>> type(df.groupby(level=0).rolling(3,min_periods=1).corr())

pandas.core.series.Series

I am able to get the desired coefficient for each row individually...

>>> df.groupby(level=0).rolling(3,min_periods=1).corr()['A']

<class 'pandas.core.panel.Panel'>
Dimensions: 10 (items) x 2 (major_axis) x 2 (minor_axis)
Items axis: C to C
Major_axis axis: col1 to col2
Minor_axis axis: col1 to col2

>>> df.groupby(level=0).rolling(3,min_periods=1).corr().loc['A'].ix[2]

          col1      col2
col1  1.000000 -0.866025
col2 -0.866025  1.000000

>>> df.groupby(level=0).rolling(3,min_periods=1).corr().loc['A'].ix[2,'col1','col2']

-0.86602540378443849

...but I don't know how to slice the result (a Series of Panels) so that I can assign it as a column of the existing dataframe. Something like:

df['cor_coeff'] = df.groupby(level=0).rolling(3,min_periods=1).corr()['some slicing']

Any clues? Or is there a better way to get the rolling correlation coefficient?

- kekert

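This is a well-written question - it contains everything needed to reproduce the problem, shows what you've tried so far, and shows what you'd like the solution to look like. Nice! - ASGM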

1 Answer

- ASGM · Accepted Answer

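Your problem is that you are calling .corr() without specifying the other argument. Even though your dataframe has only two columns, Pandas doesn't know which correlation you actually want, so it computes every possible one (col1 x col1, col1 x col2, col2 x col1, col2 x col2) and hands the results back in a 2x2 data structure. If you want the result of a single correlation, you need to specify it by setting the base column and the other column. If you weren't using groupby, you could do it like this: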

df['col1'].rolling(min_periods=1, window=3).corr(other=df['col2'])
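
To make the difference concrete, here is a minimal sketch that compares the two calls on the first six rows (group A) of the question's data. It assumes a recent pandas release, where Panel no longer exists and the pairwise result comes back as a DataFrame with a two-level row index; the variable names `pairwise` and `single` are only illustrative.

```python
import pandas as pd

# First six rows (group A) of the dataframe shown in the question.
df = pd.DataFrame({"col1": [0, 1, 1, 4, 1, 0], "col2": [4, 2, 0, 1, 3, 0]})

# Without `other`, every column is correlated with every column,
# so each row of the input yields a small correlation matrix.
pairwise = df.rolling(window=3, min_periods=1).corr()

# With `other`, only the col1-vs-col2 coefficient is computed,
# giving one value per row of the input.
single = df["col1"].rolling(window=3, min_periods=1).corr(other=df["col2"])

print(pairwise)
print(single)
```

The `single` Series has the same index as `df`, which is what makes it easy to attach as a new column.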

If you are using groupby, you need to nest it in an apply clause with a lambda function (or you could move it into a separate function if you prefer):

df.groupby(level=0).apply(lambda g: g['col1'].rolling(min_periods=1, window=3).corr(other=g['col2']))