Here is the sample data:
import pandas as pd

data = {'Person': ['a','a','a','a','a','b','b','b','b','b','b'],
        'Sales': ['50','60','90','30','33','100','600','80','90','400','550'],
        'Price': ['10','12','8','10','12','10','13','16','14','12','10']}
data = pd.DataFrame(data)
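For each person (group), I want the price that goes with the second-highest sales value, computed on a rolling basis over that person's rows, but the window will be different for each group. The result should look like this: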
result={'Person':['a','a','a','a','a','b','b','b','b','b','b'],
'Sales':['50','60','90','30','33','100','600','80','90','400','550'],
'Price':['10','12','8','10','12','10','13','16','14','12','10'],
'Second_Highest_Price':['','10','12','12','12','','10','10','10','12','10']}
I tried using nlargest(2), but I'm not sure how to get it to work on a rolling basis.
data.groupby("Person").apply(custom_function_to_find_second_highest_sales)
- Matt
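One way to read the expected output is as an expanding window within each group: each row looks at all of that person's rows up to and including itself, finds the row with the second-highest Sales, and reports that row's Price. The sketch below follows that reading. It is only an illustration, not a definitive answer: the helper name `second_highest_price` is hypothetical, and the sketch assumes a unique index, Sales values that convert cleanly to numbers, and that tied Sales values count as separate rows.

```python
import pandas as pd

data = pd.DataFrame({
    'Person': ['a','a','a','a','a','b','b','b','b','b','b'],
    'Sales': ['50','60','90','30','33','100','600','80','90','400','550'],
    'Price': ['10','12','8','10','12','10','13','16','14','12','10'],
})

def second_highest_price(group):
    """Hypothetical helper: for each row, the Price of the row holding the
    second-highest Sales among the rows seen so far in this group."""
    # Sales are stored as strings, so convert before comparing magnitudes.
    sales = pd.to_numeric(group['Sales'])
    out = []
    for i in range(len(group)):
        # Expanding window: rows 0..i of this group, two largest Sales first.
        top2 = sales.iloc[:i + 1].nlargest(2)
        if len(top2) < 2:
            # Fewer than two rows so far: leave blank, as in the expected result.
            out.append('')
        else:
            # The last label in top2 belongs to the second-highest Sales row.
            out.append(group.loc[top2.index[-1], 'Price'])
    return pd.Series(out, index=group.index)

# Run the helper per group and stitch the pieces back together by index label.
parts = [second_highest_price(g) for _, g in data.groupby('Person', sort=False)]
data['Second_Highest_Price'] = pd.concat(parts)
print(data)
```

The explicit loop over groups is used here instead of `groupby(...).apply(...)` because the shape of what `apply` returns can vary with the pandas version; the same helper could likely be passed to `apply` after checking that. The approach recomputes `nlargest(2)` for every row, so the cost grows quadratically with group size, which should be fine for small groups like these.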