import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})
Output:
     A      B     C
0  foo    one   NaN
1  bar    one  bla2
2  foo    two   NaN
3  bar  three  bla3
4  foo    two   NaN
5  bar    two   NaN
6  foo    one   NaN
7  foo  three   NaN
I want to use groupby to count the number of NaN values in C for each combination of A and B.
Expected output (EDIT):
     A      B     C  D
0  foo    one   NaN  2
1  bar    one  bla2  0
2  foo    two   NaN  2
3  bar  three  bla3  0
4  foo    two   NaN  2
5  bar    two   NaN  1
6  foo    one   NaN  2
7  foo  three   NaN  1
Currently I am trying this:
df['count']=df.groupby(['A'])['B'].isnull().transform('sum')
But this is not working...
Thank you
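A minimal sketch of one way to get the expected D column. The attempt in the question most likely fails because a grouped column (a SeriesGroupBy) has no isnull method, and it also groups by A alone and inspects B rather than C. Assuming the goal is the number of NaN values in C per (A, B) pair, the missing values can be flagged first and the flags grouped afterwards:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan],
})

# Step 1: flag missing values in C (True where C is NaN).
is_missing = df['C'].isna()

# Step 2: group the flags by the (A, B) pairs and sum them per group.
# transform('sum') broadcasts each group's total back onto its own rows,
# so the result lines up with the original index.
df['D'] = is_missing.groupby([df['A'], df['B']]).transform('sum').astype(int)

print(df)
```

Grouping the boolean Series by df['A'] and df['B'] works because the grouping keys share the same index as the flags; no temporary column is needed.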
The name parameter in .reset_index(name='count') does not seem to be supported: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html - jmatsen
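A note on the comment above, offered as a sketch rather than a definitive reading: the linked page documents DataFrame.reset_index, which as far as I know does not take a name argument. Series.reset_index does accept name, so the call should work whenever the groupby result is a Series. The example below assumes the same df as in the question and uses the NaN count in C as the aggregated value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan],
})

# Aggregating a single boolean Series per (A, B) group yields a Series
# indexed by a MultiIndex with levels named A and B.
nan_counts = df['C'].isna().groupby([df['A'], df['B']]).sum()

# Series.reset_index accepts `name`, which labels the values column
# when the index levels are moved back into ordinary columns.
summary = nan_counts.reset_index(name='count')

print(summary)
```

If the aggregation returns a DataFrame instead, renaming the column explicitly (for example with rename(columns=...)) after reset_index() is one alternative.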