Pandas中的部分和与小计

5

我正在尝试实现一个带有小计的表格,如此处所示,但是对于多个列而不是一个,这段代码要么无法与最新的pandas版本(0.18.1)兼容,要么示例有误我的代码在这里生成了以下表格

                                                                   2014    2015    2016
project__name person__username activity__name    issue__subject                        
Influenster   employee1        Development                        161.0   122.0   104.0
                                                 Fix bug           22.0     0.0     0.0
                                                 Refactor view      0.0     7.0     0.0
                               Quality assurance                  172.0   158.0   161.0
              employee2        Development                        119.0   137.0   155.0
                               Quality assurance                  193.0   186.0   205.0
              employee3        Development       Refactor view      0.0     0.0     1.0
Profit tools  employee1        Development                        177.0   136.0   216.0
                               Quality assurance                  162.0   122.0   182.0
              employee2        Development                        154.0   168.0   124.0
                               Quality assurance                  130.0   183.0   192.0
                                                 Fix bug           22.0     0.0     0.0
All                                                              1312.0  1219.0  1340.0

and my desired output would be something like:

                                                                   2014    2015    2016
project__name person__username activity__name    issue__subject                        
Influenster   employee1        Development                        161.0   122.0   104.0
                                                 Fix bug           22.0     0.0     0.0
                                                 Refactor view      0.0     7.0     0.0
                                                 Total              xxx     xxx     xxx
                               Quality assurance                  172.0   158.0   161.0
                                                 Total              xxx     xxx     xxx
                               Total                                xxx     xxx     xxx
              employee2        Development                        119.0   137.0   155.0
                                                 Total              xxx     xxx     xxx
                               Quality assurance                  193.0   186.0   205.0
                                                 Total              xxx     xxx     xxx
                               Total                                xxx     xxx     xxx
              employee3        Development       Refactor view      0.0     0.0     1.0
                                                 Total              xxx     xxx     xxx
                               Total                                xxx     xxx     xxx
              Total                                                 xxx     xxx     xxx
Profit tools  employee1        Development                        177.0   136.0   216.0
                                                 Total              xxx     xxx     xxx
                               Quality assurance                  162.0   122.0   182.0
                                                 Total              xxx     xxx     xxx
                               Total                                xxx     xxx     xxx
              employee2        Development                        154.0   168.0   124.0
                                                 Total              xxx     xxx     xxx
                               Quality assurance                  130.0   183.0   192.0
                                                 Fix bug           22.0     0.0     0.0
                                                 Total              xxx     xxx     xxx
                               Total                                xxx     xxx     xxx
              Total                                                 xxx     xxx     xxx
All                                                              1312.0  1219.0  1340.0

需要如何实现这个问题,任何帮助都将不胜感激。


文档中的代码可以工作,但是你需要使用 pt.stack() 并在数据框中包含一个列值 columns=['issue__subject]。但是总计仅会显示每个列值的分组级别,而不是所有分组级别。 - Parfait
谢谢@Parfait。代码有效地工作了,我只是在表上调用stack()函数,认为它会改变对象,而不是返回一个新的对象。 - lufte
2个回答

8

递归的groupbyapply

def append_tot(df):
    if hasattr(df, 'name') and df.name is not None:
        xs = df.xs(df.name)
    else:
        xs = df
    gb = xs.groupby(level=0)
    n = xs.index.nlevels
    name = tuple('Total' if i == 0 else '' for i in range(n))
    tot = gb.sum().sum().rename(name).to_frame().T
    if n > 1:
        sm = gb.apply(append_tot)
    else:
        sm = gb.sum()
    return pd.concat([sm, tot])

fields = ['project__name', 'person__username',
          'activity__name', 'issue__subject']
append_tot(df.set_index(fields))

在这里输入图片描述

(修正递归函数调用中的拼写错误)


在重命名语句中,我收到了TypeError: 'tuple' object is not callable错误。这是否与pandas版本有关? - Prabhat
未找到 append_tot3 - Akhilesh_IN
不错,一点也不错!刚开始用的时候有些问题,但是我成功地解决了。这个概念真的很棒! - Dmytro Bugayev

2

考虑使用三级透视表,结合堆叠和连接操作,生成最终的groupby对象。正如提到的那样,如果在相应的透视表列值上使用.stack()方法,则可以实现文档中所述的功能。

# ISSUE_SUBJECT PIVOT
pt1 = pd.pivot_table(data=df, values=['2014', '2015', '2016'], 
                     columns=['issue__subject'], aggfunc=np.sum, 
                     index=['project__name', 'person__username', 'activity__name'], 
                     margins=True, margins_name = 'Total')    
pt1 = pt1.stack().reset_index()

# ACTIVITY_NAME PIVOT
pt2 = pd.pivot_table(data=df, values=['2014', '2015', '2016'], 
                     columns=['activity__name'], aggfunc=np.sum, 
                     index=['project__name', 'person__username'], 
                     margins=True, margins_name = 'Total' )    
pt2 = pt2.stack().reset_index()

# PERSON_USERNAME PIVOT
pt3 = pd.pivot_table(data=df, values=['2014', '2015', '2016'],        
                     columns=['person__username'], 
                     aggfunc=np.sum, index=['project__name'],
                     margins=True, margins_name = 'Total')    
pt3 = pt3.stack().reset_index()

# CONCATENATE ALL THREE
gdf = pd.concat([pt1, 
                 pt2[(pt2['project__name']=='Total') | 
                     (pt2['activity__name']=='Total')],
                 pt3[(pt3['project__name']=='Total') |
                     (pt3['person__username']=='Total')]]).reset_index(drop=True)

# REPLACE NaNS IN COLUMN
gdf = gdf.apply(lambda x: np.where(pd.isnull(x), '', x), axis=1)

# FINAL GROUPBY (A COUNT USED TO RENDER GROUPBY)
gdf = gdf.groupby(['project__name', 'person__username',
                   'activity__name', 'issue__subject',
                   '2014', '2015', '2016']).agg(len)

输出

project__name  person__username  activity__name     issue__subject  2014    2015    2016  
Influenster    Total                                                667.0   610.0   626.0     1
               employee1         Development                        161.0   122.0   104.0     1
                                                    Fix bug         22.0    0.0     0.0       1
                                                    Refactor view   0.0     7.0     0.0       1
                                                    Total           183.0   129.0   104.0     1
                                 Quality assurance                  172.0   158.0   161.0     1
                                                    Total           172.0   158.0   161.0     1
                                 Total                              355.0   287.0   265.0     1
               employee2         Development                        119.0   137.0   155.0     1
                                                    Total           119.0   137.0   155.0     1
                                 Quality assurance                  193.0   186.0   205.0     1
                                                    Total           193.0   186.0   205.0     1
                                 Total                              312.0   323.0   360.0     1
               employee3         Development        Refactor view   0.0     0.0     1.0       1
                                                    Total           0.0     0.0     1.0       1
                                 Total                              0.0     0.0     1.0       1
Profit tools   Total                                                645.0   609.0   714.0     1
               employee1         Development                        177.0   136.0   216.0     1
                                                    Total           177.0   136.0   216.0     1
                                 Quality assurance                  162.0   122.0   182.0     1
                                                    Total           162.0   122.0   182.0     1
                                 Total                              339.0   258.0   398.0     1
               employee2         Development                        154.0   168.0   124.0     1
                                                    Total           154.0   168.0   124.0     1
                                 Quality assurance                  130.0   183.0   192.0     1
                                                    Fix bug         22.0    0.0     0.0       1
                                                    Total           152.0   183.0   192.0     1
                                 Total                              306.0   351.0   316.0     1
Total                                                               1268.0  1212.0  1339.0    1
                                                    Fix bug         44.0    0.0     0.0       1
                                                    Refactor view   0.0     7.0     1.0       1
                                                    Total           1312.0  1219.0  1340.0    1
                                 Development                        633.0   570.0   600.0     1
                                 Quality assurance                  679.0   649.0   740.0     1
                                 Total                              1312.0  1219.0  1340.0    1
               Total                                                1312.0  1219.0  1340.0    1
               employee1                                            694.0   545.0   663.0     1
               employee2                                            618.0   674.0   676.0     1
               employee3                                            0.0     0.0     1.0       1

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接