Pivot table grouped by month

Question

I have a DataFrame named df that looks like this:

df = pd.DataFrame([["12", "10-01-2022", 'boot', "shoe", 100, 50],
                   ["211", "10-01-2022", 'sandal', "shoe", 210, 20],
                   ["321", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["413", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["15", "10-02-2022", 'dress', "cloth", 155, 95],
                   ["633", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["247", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["8787", "10-04-2022", 'boot', "shoe", 120, 45],
                   ["9232", "10-05-2022", 'shirt', "cloth", 75, 30],
                   ["12340", "10-05-2022", 'dress', "cloth", 175, 95 ]],
                  columns=["count", "date", "name", "category", "price", "revenue"])

I need to aggregate by month to see the sums of count, price, and revenue, for example:

| name   | category | count                        | price                       | revenue                     |
|        |          | Jan | Feb | Mar | Apr  | May | Jan | Feb | Mar | Apr | May | Jan | Feb | Mar | Apr | May |
| boot   | shoe     | 12  | 734 | 880 | 8787 | -   | 100 | 100 | 75  | 120 | -   | 50  | 45  | 30  | 45  | -   |
| sandal | shoe     | 211 | ... | ... | ...  | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| dress  | cloth    | -   | ... | ... | ...  | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| shirt  | cloth    | -   | ... | ... | ...  | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

How can I do this?

- user188439

What if I need to aggregate more than one month into a single column, for example "price Oct to Dec"? - user188439
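
One possible way to handle this follow-up is to group on a coarser label than the month, such as the calendar quarter (Q4 covers Oct to Dec). The sketch below is only an illustration: it assumes the dates are month-day-year, and the names `dates`, `quarter`, and `by_quarter` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame([["12", "10-01-2022", 'boot', "shoe", 100, 50],
                   ["211", "10-01-2022", 'sandal', "shoe", 210, 20],
                   ["321", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["413", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["15", "10-02-2022", 'dress', "cloth", 155, 95],
                   ["633", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["247", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["8787", "10-04-2022", 'boot', "shoe", 120, 45],
                   ["9232", "10-05-2022", 'shirt', "cloth", 75, 30],
                   ["12340", "10-05-2022", 'dress', "cloth", 175, 95]],
                  columns=["count", "date", "name", "category", "price", "revenue"])

# The counts are stored as strings, so convert them before summing.
df["count"] = df["count"].astype(int)

# Assumption: dates are month-day-year, so every sample row falls in October.
dates = pd.to_datetime(df["date"], format="%m-%d-%Y")

# Label each row with its calendar quarter; "Q4" spans Oct to Dec.
df["quarter"] = "Q" + dates.dt.quarter.astype(str)

# Sum per product and quarter, then move the quarter into the columns.
by_quarter = (
    df.groupby(["category", "name", "quarter"])[["count", "revenue", "price"]]
      .sum()
      .unstack(fill_value=0)
)
print(by_quarter)
```

With this sample data every row lands in Q4, so each measure gets a single Q4 column. For ranges that do not line up with quarters, a hand-written mapping from month number to a label could be used in place of `dt.quarter`.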

1 Answer

- Scott Boston · Accepted Answer

Try this:

import pandas as pd

df = pd.DataFrame([["12", "10-01-2022", 'boot', "shoe", 100, 50],
                   ["211", "10-01-2022", 'sandal', "shoe", 210, 20],
                   ["321", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["413", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["15", "10-02-2022", 'dress', "cloth", 155, 95],
                   ["633", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["247", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["8787", "10-04-2022", 'boot', "shoe", 120, 45],
                   ["9232", "10-05-2022", 'shirt', "cloth", 75, 30],
                   ["12340", "10-05-2022", 'dress', "cloth", 175, 95 ]],
                  columns=["count", "date", "name", "category", "price", "revenue"])

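# The counts are stored as strings, so convert them before summing.
df['count'] = df['count'].astype(int)
# pd.to_datetime reads "10-01-2022" month-first, so every row lands in October.
df['month'] = pd.to_datetime(df['date']).dt.strftime('%b')
df.groupby(['category', 'name', 'month'])[['count', 'revenue', 'price']].sum().unstack(fill_value=0)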

Output:

                 count revenue price
month              Oct     Oct   Oct
category name                       
cloth    dress   12355     190   330
         shirt    9232      30    75
shoe     boot    10413     245   570
         sandal    211      20   210