以下是逐步操作的方法。如果您有任何问题,请告诉我。
import pandas as pd
pd.set_option('display.max_columns', None)
c = ['Store','Month Opened','State','City','Month End Date','Sales']
d = [['AAA','5/31/2020','NY','New York','5/31/2020',1000],
['AAA','5/31/2020','NY','New York','6/30/2020',5000],
['AAA','5/31/2020','NY','New York','7/30/2020',3000],
['AAA','5/31/2020','NY','New York','8/31/2020',4000],
['AAA','5/31/2020','NY','New York','9/30/2020',2000],
['BBB','6/30/2020','CT','Hartford','6/30/2020',100],
['BBB','6/30/2020','CT','Hartford','7/30/2020',200],
['CCC','3/31/2020','NJ','Cranbury','3/31/2020',1500]]
df = pd.DataFrame(d,columns = c)
df['Month Opened'] = pd.to_datetime(df['Month Opened'])
df['Month End Date'] = pd.to_datetime(df['Month End Date'])
df1 = df.sort_values('Month End Date').drop_duplicates('Store', keep='last').copy()
df1 = df1[df1['Month End Date'] != '2020-09-30']
df1['Target_End_Date'] = pd.to_datetime ('2020-09-30')
df1['repeats'] = df1['Target_End_Date'].dt.to_period('M').astype(int) - df1['Month End Date'].dt.to_period('M').astype(int)
df1['Month End Date'] = df1['Month End Date'] + pd.DateOffset(months =1)
df1['Sales'] = 0
df1 = df1.loc[df1.index.repeat(df1.repeats)].reset_index(drop=True)
df1['repeats'] = df1.groupby('Store').cumcount()
df1['Month End Date'] = df1.apply(lambda x: x['Month End Date'] + pd.DateOffset(months = x['repeats']), axis=1)
df1['Month End Date'] = pd.to_datetime(df1['Month End Date']) + pd.offsets.MonthEnd(0)
df1.drop(columns=['Target_End_Date','repeats'],inplace=True)
df = pd.concat([df, df1], ignore_index=True)
df = df.sort_values(by=['Store','Month End Date'],ignore_index=True)
print (df)
它的输出是:
Store Month Opened State City Month End Date Sales
0 AAA 2020-05-31 NY New York 2020-05-31 1000
1 AAA 2020-05-31 NY New York 2020-06-30 5000
2 AAA 2020-05-31 NY New York 2020-07-30 3000
3 AAA 2020-05-31 NY New York 2020-08-31 4000
4 AAA 2020-05-31 NY New York 2020-09-30 2000
5 BBB 2020-06-30 CT Hartford 2020-06-30 100
6 BBB 2020-06-30 CT Hartford 2020-07-30 200
7 BBB 2020-06-30 CT Hartford 2020-08-30 0
8 BBB 2020-06-30 CT Hartford 2020-09-30 0
9 CCC 2020-03-31 NJ Cranbury 2020-03-31 1500
10 CCC 2020-03-31 NJ Cranbury 2020-04-30 0
11 CCC 2020-03-31 NJ Cranbury 2020-05-31 0
12 CCC 2020-03-31 NJ Cranbury 2020-06-30 0
13 CCC 2020-03-31 NJ Cranbury 2020-07-31 0
14 CCC 2020-03-31 NJ Cranbury 2020-08-31 0
15 CCC 2020-03-31 NJ Cranbury 2020-09-30 0
请注意,我添加了一个带有CCC的条目,以展示更多的变化。