import numpy as np
import pandas as pd
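np.random.seed(2016)
# Sample data: one random integer per calendar day over two years
dates = pd.date_range('1/1/2001', '1/1/2003', freq='D')
nums = [np.random.randint(100) for _ in range(len(dates))]
df = pd.DataFrame({'Dates': dates, 'DOW': dates.strftime('%a'), 'Nums': nums})
# Keep weekdays only, then drop two weekday rows (index labels 7 and 18)
df = df[(df.DOW != 'Sat') & (df.DOW != 'Sun')]
df = df.drop([7, 18]).reset_index(drop=True)
# Weekly bins: max and last value of Nums per week
df2 = df.groupby(pd.Grouper(freq='W', key='Dates'))['Nums'].agg(['max', 'last'])
# Previous week's max, then the change of this week's last value relative to it
df2['previous_max'] = df2['max'].shift(1)
df2['change'] = (df2['last'] - df2['previous_max']) / df2['previous_max']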
print(df2.head())
yields
            max  last  previous_max    change
Dates
2001-01-07   83    39           NaN       NaN
2001-01-14   75    75          83.0 -0.096386
2001-01-21   97    18          75.0 -0.760000
2001-01-28   72    37          97.0 -0.618557
2001-02-04   84    24          72.0 -0.666667
Using df.groupby together with a pd.Grouper object groups the rows by week. You can then use the agg method to find the max and last values of Nums in each group:
In [163]: df2 = df.groupby(pd.Grouper(freq='W', key='Dates'))['Nums'].agg(['max','last'])
In [164]: df2.head()
Out[164]:
            max  last
Dates
2001-01-07   83    39
2001-01-14   75    75
2001-01-21   97    18
2001-01-28   72    37
2001-02-04   84    24
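To make the weekly binning easier to follow, here is a minimal sketch on a small frame. The name toy and its values are made up for illustration and are not part of the example above. With freq='W', pandas uses weeks that end on Sunday, so each group is labelled with the Sunday that closes it, and 'last' picks the final row of that week in row order.

```python
import pandas as pd

# Hypothetical toy data: three weekdays in each of two consecutive weeks
# (2001-01-01 is a Monday).
toy = pd.DataFrame({
    'Dates': pd.to_datetime(['2001-01-01', '2001-01-03', '2001-01-05',
                             '2001-01-08', '2001-01-10', '2001-01-12']),
    'Nums': [5, 9, 2, 7, 1, 4],
})

# Group rows into Sunday-ending weekly bins keyed on the 'Dates' column,
# then take the largest and the last value of 'Nums' in each bin.
weekly = toy.groupby(pd.Grouper(freq='W', key='Dates'))['Nums'].agg(['max', 'last'])
print(weekly)
```

The first bin is labelled 2001-01-07 and the second 2001-01-14, even though neither Sunday appears in the data.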
Then use shift(1) to shift the max values down by one row:
In [165]: df2['previous_max'] = df2['max'].shift(1); df2.head()
Out[165]:
            max  last  previous_max
Dates
2001-01-07   83    39           NaN
2001-01-14   75    75          83.0
2001-01-21   97    18          75.0
2001-01-28   72    37          97.0
2001-02-04   84    24          72.0
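The effect of shift(1) can be seen in isolation with a short sketch. It reuses the first three max values shown above; the variable names are chosen only for this illustration. The first row has no predecessor, so it becomes NaN, and that is also why the integer column turns into floats.

```python
import pandas as pd

# First three weekly max values from the output above
weekly_max = pd.Series(
    [83, 75, 97],
    index=pd.to_datetime(['2001-01-07', '2001-01-14', '2001-01-21']),
    name='max',
)

# shift(1) keeps the index fixed and moves the values down one row,
# so each week sees the value that belonged to the previous week.
previous_max = weekly_max.shift(1)
print(previous_max)
```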
The percentage change can then be computed with a simple subtraction and division:
In [166]: df2['change'] = (df2['last']-df2['previous_max'])/df2['previous_max']; df2.head()
Out[166]:
            max  last  previous_max    change
Dates
2001-01-07   83    39           NaN       NaN
2001-01-14   75    75          83.0 -0.096386
2001-01-21   97    18          75.0 -0.760000
2001-01-28   72    37          97.0 -0.618557
2001-02-04   84    24          72.0 -0.666667
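If the calculation is needed more than once, the two steps could be wrapped in a small function. The sketch below is one possible way to do that. The helper change_vs_previous_max is a hypothetical name and not part of pandas, and the input rows are the first three weeks copied from the output above.

```python
import pandas as pd

# First three weekly rows copied from the output above
weekly = pd.DataFrame(
    {'max': [83, 75, 97], 'last': [39, 75, 18]},
    index=pd.to_datetime(['2001-01-07', '2001-01-14', '2001-01-21']),
)


def change_vs_previous_max(frame):
    """Hypothetical helper: (last - previous week's max) / previous week's max."""
    previous_max = frame['max'].shift(1)
    return (frame['last'] - previous_max) / previous_max


change = change_vs_previous_max(weekly)
print(change.round(6))
```

For the second week this gives (75 - 83) / 83, which is about -0.096386 and matches the change column above.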