You can replace every value other than "ac" and np.nan with Other, group with groupby.value_counts, and then format the result with unstack and add_prefix:

u = df['status'].where(df['status'].eq("ac") | df['status'].isna(), "Other")
out = (u.groupby(df['ID']).value_counts(dropna=False).unstack(fill_value=0)
        .add_prefix("Number_").reset_index().rename_axis(None, axis=1))
Or:
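import numpy as np
import pandas as pd

# Label each row: 'acc' for "ac", NaN for missing values, 'other' for the rest
a = pd.Series(np.select([df['status'].eq("ac"), df['status'].isna()],
                        ['acc', np.nan], 'other'))
out = (a.groupby(df['ID']).value_counts(dropna=True).unstack(fill_value=0)
        .add_prefix("Number_").reset_index())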
print(out)
   ID  Number_nan  Number_Other  Number_ac
0   1           0             3          1
1   2           1             0          2
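
For anyone who wants to run the first approach end to end, here is a minimal sketch. The sample DataFrame is my own assumption (the question's data is not shown here); it is chosen only so that the per-ID counts agree with the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "status": ["ac", "bc", "de", "fg", "ac", "ac", np.nan],
})

# Keep "ac" and NaN as they are; every other value becomes "Other".
u = df["status"].where(df["status"].eq("ac") | df["status"].isna(), "Other")

out = (
    u.groupby(df["ID"])
     .value_counts(dropna=False)   # count NaN as its own category
     .unstack(fill_value=0)        # one column per category, 0 where absent
     .add_prefix("Number_")
     .reset_index()
     .rename_axis(None, axis=1)    # drop the leftover columns name
)
print(out)
```

The where call is what keeps the number of output columns fixed at three, no matter how many distinct status values appear in the data.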
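Or with pd.crosstab:

u = df['status'].where(df['status'].eq("ac") | df['status'].isna(), "Other")
# rename_axis(None, axis=1) clears the columns name so that reset_index keeps the ID label
out = (pd.crosstab(df['ID'], u.fillna("NAN"), dropna=False)
         .add_prefix("Number_").rename_axis(None, axis=1).reset_index())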
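
The crosstab variant can be tried on its own with the sketch below. Again, the sample data is an assumption of mine, and I fill missing values with the string "NaN" first because crosstab does not count missing values by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "status": ["ac", "bc", "de", "fg", "ac", "ac", np.nan],
})

# Keep "ac" and NaN; relabel everything else as "Other".
u = df["status"].where(df["status"].eq("ac") | df["status"].isna(), "Other")

# Turn NaN into an ordinary label so it gets its own column in the table.
ct = (
    pd.crosstab(df["ID"], u.fillna("NaN"))
      .add_prefix("Number_")
      .rename_axis(None, axis=1)   # drop the "status" columns name
      .reset_index()
)
print(ct)
```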
You can create the columns with assign before grouping by "ID" and summing:
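(df.assign(NumberAcc=df.status.eq("ac"),
           NumberNaN=df.status.isna(),
           NumberOther=lambda d: ~(d.NumberAcc | d.NumberNaN))
   # Boolean columns sum to counts; select them so status is not aggregated
   .groupby("ID")[["NumberAcc", "NumberNaN", "NumberOther"]]
   .sum())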
    NumberAcc  NumberNaN  NumberOther
ID
1           1          0            3
2           2          1            0
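
A self-contained sketch of the assign approach follows. The sample data is assumed, not taken from the question. I select the three helper columns before summing so that the original status column does not end up in the result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "status": ["ac", "bc", "de", "fg", "ac", "ac", np.nan],
})

counts = (
    df.assign(
        NumberAcc=df["status"].eq("ac"),
        NumberNaN=df["status"].isna(),
        # The lambda sees the two columns created just above.
        NumberOther=lambda d: ~(d["NumberAcc"] | d["NumberNaN"]),
    )
    .groupby("ID")[["NumberAcc", "NumberNaN", "NumberOther"]]
    .sum()   # True counts as 1, so each sum is a per-ID count
)
print(counts)
```

Compared with the value_counts route, this always produces all three columns, even when a category never occurs in the data.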
Use isin instead of eq to match multiple values: u = df['status'].where(df['status'].isin(["ac","bc"]) | df['status'].isna(), "Other") - anky
crosstab also works, for example pd.crosstab(df['ID'], df['status'].fillna('NaN')). - Shubham Sharma
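
To illustrate the isin suggestion from the comments, here is a small sketch. The sample data and the choice of "bc" as the second kept value are assumptions for demonstration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "status": ["ac", "bc", "de", "fg", "ac", "ac", np.nan],
})

keep = ["ac", "bc"]  # values that keep their own column

# isin matches any value in the list; NaN is kept, the rest become "Other".
u = df["status"].where(df["status"].isin(keep) | df["status"].isna(), "Other")

# Count per ID, with NaN relabeled so it shows up as a column.
table = pd.crosstab(df["ID"], u.fillna("NaN"))
print(table)
```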