How to group by selected row values of a column in a Pandas DataFrame and assign the result to a new column?

Question

3

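I want a new column, call it C (shown as No_of_6 in the expected output below), that holds, at the Id level, the count of rows where B = 6.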

Jan18.loc[Jan18['Enquiry Purpose']==6].groupby(Jan18['Member Reference']).transform('count')

Id  B   No_of_6
1   6   3
2   13  5
1   6   3
2   6   5
1   6   3
2   6   5
1   10  3
2   6   5
2   6   5
2   6   5

- Abhay kumar

2 Answers

0

A solution using map. Note that it returns NaN for any Id group that contains no 6.

df['No_of_6'] = df.Id.map(df[df.B.eq(6)].groupby('Id').B.count())

Out[113]:
   Id   B  No_of_6
0   1   6        3
1   2  13        5
2   1   6        3
3   2   6        5
4   1   6        3
5   2   6        5
6   1  10        3
7   2   6        5
8   2   6        5
9   2   6        5
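
To illustrate the NaN caveat, here is a minimal sketch. The extra Id 3 row is a hypothetical addition (it is not in the question's data), and filling with 0 via fillna is one possible way to handle the gap, not necessarily what the answer's author intended.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical Id 3 that has no B == 6 rows.
df = pd.DataFrame({
    "Id": [1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 3],
    "B":  [6, 13, 6, 6, 6, 6, 10, 6, 6, 6, 7],
})

# Count the B == 6 rows per Id; Ids without any such row are absent from this Series.
counts = df[df["B"].eq(6)].groupby("Id")["B"].count()

# map looks up each row's Id in counts; an Id missing from counts yields NaN.
with_nan = df["Id"].map(counts)

# Replace the NaN with 0 and restore an integer dtype.
df["No_of_6"] = with_nan.fillna(0).astype(int)
print(df)
```

If a missing count should stay distinguishable from a true zero, skip the fillna step and keep the float column.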

- Andy L.

- jezrael · Accepted Answer

Compare the values with Series.eq, cast the result to integers, and use GroupBy.transform to create the new column, filled with the sum of each group:

df['No_of_6'] = df['B'].eq(6).astype(int).groupby(df['Id']).transform('sum')
#alternative
#df['No_of_6'] = df.assign(B= df['B'].eq(6).astype(int)).groupby('Id')['B'].transform('sum')
print (df)
   Id   B  No_of_6
0   1   6        3
1   2  13        5
2   1   6        3
3   2   6        5
4   1   6        3
5   2   6        5
6   1  10        3
7   2   6        5
8   2   6        5
9   2   6        5
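
As a self-contained sketch of the same idea, the snippet below rebuilds the question's sample data and contrasts a plain per-group sum (one value per Id) with transform, which broadcasts that value back to every row. The variable names is_six and per_id are illustrative, not taken from the answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Id": [1, 2, 1, 2, 1, 2, 1, 2, 2, 2],
    "B":  [6, 13, 6, 6, 6, 6, 10, 6, 6, 6],
})

# 1 where B equals 6, otherwise 0; summing these gives the number of matches.
is_six = df["B"].eq(6).astype(int)

# A plain aggregation returns one value per Id (two rows here).
per_id = is_six.groupby(df["Id"]).sum()

# transform returns a Series aligned with df, so it can be assigned as a column.
df["No_of_6"] = is_six.groupby(df["Id"]).transform("sum")
print(df)
```

Unlike the map approach, a group with no matching rows gets 0 here, because every row contributes either 0 or 1 to the sum.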

More generally, create a boolean mask from the condition(s) and pass it to the following:

mask = df['B'].eq(6)
#alternative
#mask = (df['B'] == 6)
df['No_of_6'] = mask.astype(int).groupby(df['Id']).transform('sum')
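
The mask pattern extends to other conditions. The sketch below wraps it in a small helper; count_where and the two example conditions (a range check and an isin check) are hypothetical illustrations, not part of the original answer.

```python
import pandas as pd

def count_where(mask: pd.Series, keys: pd.Series) -> pd.Series:
    """Hypothetical helper: for each row, count the True mask values in its key group."""
    return mask.astype(int).groupby(keys).transform("sum")

# Sample data from the question.
df = pd.DataFrame({
    "Id": [1, 2, 1, 2, 1, 2, 1, 2, 2, 2],
    "B":  [6, 13, 6, 6, 6, 6, 10, 6, 6, 6],
})

# Combine conditions with & or |; each comparison needs its own parentheses.
in_range = (df["B"] >= 6) & (df["B"] <= 10)
df["No_of_6_to_10"] = count_where(in_range, df["Id"])

# isin covers membership in a set of values.
df["No_of_10_or_13"] = count_where(df["B"].isin([10, 13]), df["Id"])
print(df)
```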