How to efficiently extract the second-level index with the maximum customer count in each group after a groupby
Suppose there is a DataFrame df that contains a number of states and 10 officers per state (named Officer 1 through Officer 10). The column Current Status always has the value Customer:
  State List Sales Officer Current Status
0         UP     Officer 4       Customer
1         MH     Officer 5       Customer
2         AP     Officer 6       Customer
3         AN     Officer 2       Customer
4         GJ     Officer 3       Customer
... and so on
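The expected output lists, for each state, the sales officer (or officers, in case of a tie) with the highest number of customers: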
State List  Sales Officer
AN          Officer 6        403
AP          Officer 1        266
            Officer 8        266
... and so on
So far, I have done the following:
df.groupby(['State List', 'Sales Officer'])['Current Status'].count()#.reset_index()
which gives me the following:
State List  Sales Officer
AN          Officer 1        376
            Officer 10       401
            Officer 2        353
            Officer 3        373
            Officer 4        375
            Officer 5        382
            Officer 6        403
            Officer 7        400
            Officer 8        385
            Officer 9        378
AP          Officer 1        266
            Officer 10       228
            Officer 2        240
            Officer 3        248
            Officer 4        235
            Officer 5        229
            Officer 6        242
            Officer 7        238
            Officer 8        266
            Officer 9        243
Now I am stuck on the next step: finding the Sales Officer with the most customers within each State List. Any ideas?
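For reference, here is a minimal sketch of one possible way to get from the grouped counts to the expected output. It is not necessarily the most efficient option. The toy data is made up purely for illustration, and the names counts, state_max and top are hypothetical helpers that do not appear in the original code. The idea is to broadcast each state's maximum count back onto every row of that state and keep only the rows that match it, so ties (such as Officer 1 and Officer 8 in AP) are all retained.

```python
import pandas as pd

# Toy data standing in for the real df (values invented for illustration).
df = pd.DataFrame({
    "State List": ["AN", "AN", "AN", "AP", "AP", "AP", "AP"],
    "Sales Officer": ["Officer 6", "Officer 6", "Officer 1",
                      "Officer 1", "Officer 8", "Officer 1", "Officer 8"],
    "Current Status": ["Customer"] * 7,
})

# Same aggregation as in the question: customers per (state, officer).
counts = df.groupby(["State List", "Sales Officer"])["Current Status"].count()

# Broadcast each state's maximum onto every row of that state,
# then keep the rows whose count equals that maximum (ties survive).
state_max = counts.groupby(level="State List").transform("max")
top = counts[counts == state_max]

print(top)
```

If only a single officer per state is needed and ties can be dropped, counts.groupby(level="State List").idxmax() should return one (State List, Sales Officer) label per state instead.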