I'm new to this, so please bear with me.
My dataframe has the following format:
date,name,country,tag,cat,score
2017-05-21,X,US,free,4,0.0573
2017-05-22,X,US,free,4,0.0626
2017-05-23,X,US,free,4,0.0584
2017-05-24,X,US,free,4,0.0563
2017-05-21,X,MX,free,4,0.0537
2017-05-22,X,MX,free,4,0.0640
2017-05-23,X,MX,free,4,0.0648
2017-05-24,X,MX,free,4,0.0668
I'm trying to come up with a way to find an X-day moving average within each country/tag/category group, so I need:
date,name,country,tag,cat,score,moving_average
2017-05-21,X,US,free,4,0.0573,0
2017-05-22,X,US,free,4,0.0626,0.0605
2017-05-23,X,US,free,4,0.0584,0.0594
2017-05-24,X,US,free,4,0.0563,and so on
...
2017-05-21,X,MX,free,4,0.0537,and so on
2017-05-22,X,MX,free,4,0.0640,and so on
2017-05-23,X,MX,free,4,0.0648,and so on
2017-05-24,X,MX,free,4,0.0668,and so on
I tried grouping by the columns I need and then applying the pd.rolling_mean function, but ended up with a bunch of NaN values.
df.groupby(['date', 'name', 'country', 'tag'])['score'].apply(pd.rolling_mean, 2, min_periods=2) # window size 2
How do I do this correctly?
Remove date from the grouping: df.groupby(['name', 'country', 'tag'])['score'].apply(pd.rolling_mean, 2, min_periods=2). - BENY
df.groupby(['col1', 'col2'])['col3'].apply(lambda x: x.rolling(window=3, center=True).mean()) works fine. rolling_mean has been deprecated; you should use rolling instead. - Woody Pride
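Putting the two comments together, here is a minimal sketch of what a grouped moving average could look like with the current .rolling API. Grouping by date leaves each group with a single row, so a window of 2 with min_periods=2 can never be filled, which would explain the NaN values. The sketch assumes the sample data from the question, a window of 2 (taken from the question's own attempt), that cat belongs in the grouping keys (the question mentions country/tag/category), and that transform is an acceptable way to keep the result aligned with the original rows. The names csv_text and window are illustrative only.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv_text = """date,name,country,tag,cat,score
2017-05-21,X,US,free,4,0.0573
2017-05-22,X,US,free,4,0.0626
2017-05-23,X,US,free,4,0.0584
2017-05-24,X,US,free,4,0.0563
2017-05-21,X,MX,free,4,0.0537
2017-05-22,X,MX,free,4,0.0640
2017-05-23,X,MX,free,4,0.0648
2017-05-24,X,MX,free,4,0.0668
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Group by everything EXCEPT date; date is the axis the window moves along.
group_cols = ["name", "country", "tag", "cat"]

# Rolling windows follow row order, so sort by date inside each group first.
df = df.sort_values(group_cols + ["date"])

window = 2  # assumed window size, matching the question's attempt

# transform returns a Series aligned with df's index, so it can be
# assigned straight back as a new column.
df["moving_average"] = (
    df.groupby(group_cols)["score"]
      .transform(lambda s: s.rolling(window=window, min_periods=window).mean())
)

print(df)
```

With min_periods equal to the window, the first row of each group stays NaN because there is no earlier value to average with. Setting min_periods=1 would instead give partial averages for those rows.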