Pandas - Group by two columns - unable to reset index

Question

I have a DataFrame like this:

Date Bought | Fruit
2018-01       Apple
2018-02       Orange
2018-02       Orange
2018-02       Lemon

I want to group the data by "Date Bought" and "Fruit" and count the occurrences. Expected result:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
2018-02       Lemon     1

What I get:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
              Lemon     1

Code used:

Initial attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count')

Second attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index()
ERROR: ValueError: cannot insert Fruit, already exists

Third attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index(inplace=True)
ERROR: TypeError: Cannot reset_index inplace on a Series to create a DataFrame

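The documentation says that groupby returns a "groupby object" rather than a standard DataFrame. How can I group the data as described above and keep the result as a DataFrame?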

- SheerKahn

1 Answer

- jezrael · Accepted Answer

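The problem is that resetting the index would produce two columns with the same name: the grouped result is a Series named Fruit, and its index also contains a level named Fruit. Because you are working with a Series, you can set the name parameter of Series.reset_index to give the value column a different name: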

df1 = (df.groupby(['Date Bought','Fruit'], sort=False)['Fruit']
         .agg('count')
         .reset_index(name='Count'))
print(df1)
  Date Bought   Fruit  Count
0     2018-01   Apple      1
1     2018-02  Orange      2
2     2018-02   Lemon      1
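
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable names counts, clash, and df2 are only illustrative. The groupby(...).size() variant is an alternative I am adding here, not part of the code above. It counts rows per group, which gives the same result as count on this data because there are no missing values.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date Bought": ["2018-01", "2018-02", "2018-02", "2018-02"],
    "Fruit": ["Apple", "Orange", "Orange", "Lemon"],
})

# The grouped count is a Series named 'Fruit' whose MultiIndex
# also has a level named 'Fruit'.
counts = df.groupby(["Date Bought", "Fruit"], sort=False)["Fruit"].agg("count")

# A plain reset_index() tries to insert the index level 'Fruit' next to
# the value column 'Fruit', which is the name clash from the question.
try:
    counts.reset_index()
    clash = False
except ValueError:
    clash = True

# Fix: give the value column its own name while resetting the index.
df1 = counts.reset_index(name="Count")

# Alternative (an assumption of this sketch, not from the answer above):
# size() counts rows per group, so no column has to be selected first.
df2 = (df.groupby(["Date Bought", "Fruit"], sort=False)
         .size()
         .reset_index(name="Count"))

print(df1)
print(df2)
```

The blank cell in the "What I get" output is only how pandas displays repeated values in the outer level of a MultiIndex. The value 2018-02 is still there, and it becomes an ordinary column value once the index is reset.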