Renaming the result columns of a Pandas groupby and count

3

Given the following DataFrame:

import numpy as np
import pandas as pd

# random_integers is deprecated; randint excludes the upper bound, hence 101
df = pd.DataFrame({'price': np.random.randint(0, 101, size=100)})
ranges = [0,10,20,30,40,50,60,70,80,90,100]
df.groupby(pd.cut(df.price, ranges)).count()

Output:

          price
 price  
(0, 10]     9
(10, 20]    11
(20, 30]    11
(30, 40]    9
(40, 50]    16
(50, 60]    7
(60, 70]    10
(70, 80]    9
(80, 90]    14
(90, 100]   4

How can I call reset_index and rename the columns to bins and counts? Thanks.

      bins    counts
0   (0, 10]     9
1   (10, 20]    11
2   (20, 30]    11
3   (30, 40]    9
4   (40, 50]    16
5   (50, 60]    7
6   (60, 70]    10
7   (70, 80]    9
8   (80, 90]    14
9   (90, 100]   4
3 Answers

4

This code works, but it is not concise. If you have other options, please share:

df.groupby(pd.cut(df.price, ranges)).count()\
.rename(columns={'price' : 'counts'})\
.reset_index()\
.rename(columns={'price': 'bins'})

Output:

      bins    counts
0   (0, 10]     9
1   (10, 20]    11
2   (20, 30]    11
3   (30, 40]    9
4   (40, 50]    16
5   (50, 60]    7
6   (60, 70]    10
7   (70, 80]    9
8   (80, 90]    14
9   (90, 100]   4
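
To see why the second rename targets 'price' rather than an index label, the chain can be split into steps. This is only a sketch: the five prices and the two-bin ranges are made-up values so the result is reproducible, and observed=False is an addition here to keep every bin on recent pandas versions.

```python
import pandas as pd

# Hypothetical fixed data (not from the question) so the counts are predictable
df = pd.DataFrame({'price': [5, 15, 15, 25, 95]})
ranges = [0, 50, 100]

# observed=False is an assumption added for this sketch: it keeps every bin
counted = df.groupby(pd.cut(df.price, ranges), observed=False).count()

# Step 1: rename the data column; the index is still named 'price'
step1 = counted.rename(columns={'price': 'counts'})

# Step 2: reset_index turns the index into a column named 'price'
step2 = step1.reset_index()
print(list(step2.columns))

# Step 3: that new 'price' column now holds the bins, so rename it
out = step2.rename(columns={'price': 'bins'})
print(out)
```

The first rename has to come before reset_index. Otherwise both columns would be named 'price' at the same time.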

3
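One idea is to call rename on the Series returned by pd.cut, so that the group key is already named bins. Selecting the price column after groupby makes the result a Series, so Series.reset_index with the name parameter then produces a two-column DataFrame: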
df1 = (df.groupby(pd.cut(df.price, ranges).rename('bins'))['price'].count()
         .reset_index(name='counts'))
print(df1)
        bins  counts
0    (0, 10]      13
1   (10, 20]      13
2   (20, 30]       9
3   (30, 40]       9
4   (40, 50]       7
5   (50, 60]       9
6   (60, 70]       9
7   (70, 80]      12
8   (80, 90]       9
9  (90, 100]       9
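
A self-contained version of the same approach might look like the sketch below. The seeded generator, the seed value, and observed=False are assumptions added here for reproducibility and are not part of the answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
df = pd.DataFrame({'price': rng.integers(0, 101, size=100)})
ranges = list(range(0, 101, 10))

# Naming the cut Series 'bins' gives the group key (and later the column) that name
bins = pd.cut(df['price'], ranges).rename('bins')

df1 = (
    df.groupby(bins, observed=False)['price']  # observed=False keeps empty bins
      .count()                                 # Series indexed by 'bins'
      .reset_index(name='counts')              # value column is named 'counts'
)
print(df1)

# A price of exactly 0 lies outside (0, 10], so it is not counted in any bin
print(df1['counts'].sum(), (df['price'] > 0).sum())
```

Because the intervals are open on the left, the counts can sum to fewer than 100, as they do in the output above (99).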

Thanks, your solution is much better. - ah bon

0
