Pandas GroupBy with expanded columns

Question

I have the following table:

	Brand	Product
0	Nike	Shoes
1	Nike	Socks
2	Adidas	Shoes
3	Adidas	Shoes
4	Adidas	Socks
5	Flight	Shorts

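I want to use the GroupBy function in Pandas to produce the table below (with row and column totals), showing how many times each brand-product pair occurs: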

	Shoes	Socks	Shorts	Total
Nike	1	1	0	2
Adidas	2	1	0	3
Flight	0	0	1	1
Total	3	2	1	6

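Then I want to convert the results in the table to percentages:

Each percentage is the cell value divided by its column total (e.g. {Shoes, Adidas} = 2/3 = 67%, or {Total, Adidas} = 3/6 = 50%).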

	Shoes	Socks	Shorts	Total
Nike	33%	50%	0%	33%
Adidas	67%	50%	0%	50%
Flight	0%	0%	100%	17%
Total	100%	100%	100%	100%

Finally, is there a way to multiply all cell values by an adjustment factor (e.g. 0.75)?

- zen

1 Answer

- Andrej Kesely · Accepted Answer

Try pd.crosstab:

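import pandas as pd

# df is the question's table, with columns "Brand" and "Product"
out = pd.crosstab(df["Brand"], df["Product"])  # count each brand/product pair
out["Total"] = out.sum(axis=1)  # row totals
out.index.name, out.columns.name = None, None  # drop axis labels for display
print(out)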

Output:

        Shoes  Shorts  Socks  Total
Adidas      2       0      1      3
Flight      0       1      0      1
Nike        1       0      1      2
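
The output above has a Total column but no Total row. A minimal sketch of one way to get both, assuming the question's data is rebuilt as a DataFrame named df (the names counts, row_order and col_order are only illustrative): pd.crosstab accepts margins=True, and margins_name sets the label of the added row and column. The reindexing step is optional and only restores the order of first appearance used in the question.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed column names).
df = pd.DataFrame({
    "Brand": ["Nike", "Nike", "Adidas", "Adidas", "Adidas", "Flight"],
    "Product": ["Shoes", "Socks", "Shoes", "Shoes", "Socks", "Shorts"],
})

# margins=True appends both a totals row and a totals column.
counts = pd.crosstab(
    df["Brand"], df["Product"], margins=True, margins_name="Total"
)
counts.index.name, counts.columns.name = None, None

# crosstab sorts labels alphabetically; reorder to first appearance in df.
row_order = list(df["Brand"].unique()) + ["Total"]
col_order = list(df["Product"].unique()) + ["Total"]
counts = counts.loc[row_order, col_order]

print(counts)
```

With the sample data this should reproduce the count table from the question, including the grand total of 6 in the bottom-right cell.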

Edit: to get percentages, you can do the following afterwards:

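# Share of each product within a brand: divide each cell by its row total.
cells = out.iloc[:, :-1].div(out["Total"], axis=0)

# Share of each brand in the grand total.
total = out["Total"].div(out["Total"].sum())

# Build the formatted table in one step instead of assigning strings into
# integer columns, which recent pandas versions may warn about.
out = (
    cells.assign(Total=total)
    .mul(100)
    .round(0)
    .astype(int)
    .astype(str)
    + "%"
)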

Output:

       Shoes Shorts Socks Total
Adidas   67%     0%   33%   50%
Flight    0%   100%    0%   17%
Nike     50%     0%   50%   33%
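
Note that the percentages above divide each product cell by its row total, while the question describes dividing by the column total. A hedged sketch of the column-based variant, again assuming the question's data as a DataFrame named df (col_pct, col_pct_text, factor and adjusted are illustrative names): build the counts with margins, then divide every column by its entry in the Total row. The last lines show one reading of the adjustment-factor part of the question, applied here to the counts; the same mul call could be applied to col_pct before formatting instead.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed column names).
df = pd.DataFrame({
    "Brand": ["Nike", "Nike", "Adidas", "Adidas", "Adidas", "Flight"],
    "Product": ["Shoes", "Socks", "Shoes", "Shoes", "Socks", "Shorts"],
})

# Counts with a Total row and a Total column.
counts = pd.crosstab(
    df["Brand"], df["Product"], margins=True, margins_name="Total"
)
counts.index.name, counts.columns.name = None, None

# Divide each column by its own total (the "Total" row), so that every
# column sums to 100%, including the Total column.
col_pct = counts.div(counts.loc["Total"], axis=1).mul(100)

# Format as whole-number percentage strings for display.
col_pct_text = col_pct.round(0).astype(int).astype(str) + "%"
print(col_pct_text)

# Multiply every cell by an adjustment factor (0.75 is the question's example).
factor = 0.75
adjusted = counts.mul(factor)
print(adjusted)
```

Keeping col_pct numeric and formatting only at the end means the scaled or unscaled values stay available for further arithmetic.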