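If you want to fill the missing values in the DataFrame df with the mode of certain columns, take the first row of the mode result by position with iloc, which gives a Series, and pass that Series to fillna. See the documentation for fillna and iloc.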
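To illustrate why the first row is selected by position, here is a minimal sketch on a hypothetical toy frame (the names toy, a and b are made up for this example). mode() returns a DataFrame because a column can have several modes, and fillna aligns a DataFrame on row labels as well as column names, whereas a Series is matched on column names only:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, used only for illustration.
toy = pd.DataFrame({"a": ["x", "x", np.nan, "y", np.nan],
                    "b": ["p", np.nan, "q", np.nan, "p"]})

modes = toy.mode()          # DataFrame with one row here: a -> "x", b -> "p"
first_mode = modes.iloc[0]  # Series indexed by column name

# A Series is matched on column names, so every NaN in each column is filled.
filled_by_series = toy.fillna(first_mode)

# A DataFrame is matched on row labels too; `modes` only has row 0,
# so the NaNs in rows 1-4 are left untouched.
filled_by_frame = toy.fillna(modes)

print(filled_by_series)
print(filled_by_frame)
```

If a column has several equally frequent values, iloc[0] picks the first one in sorted order, which may or may not be the value you want.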
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
Or, with mode computed from only those columns (as in the sample below):
df[cols]=df[cols].fillna(mode.iloc[0])
Your solution:
df[cols]=df.filter(cols).fillna(mode.iloc[0])
Sample:
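import numpy as np
import pandas as pd

# Two categorical columns with missing values plus one numeric column
df = pd.DataFrame({'workclass': ['Private', 'Private', np.nan, 'another', np.nan],
                   'native-country': ['United-States', np.nan, 'Canada', np.nan, 'United-States'],
                   'col': [2, 3, 7, 8, 9]})
print(df)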
   col native-country workclass
0    2  United-States   Private
1    3            NaN   Private
2    7         Canada       NaN
3    8            NaN   another
4    9  United-States       NaN
mode = df.filter(["workclass", "native-country"]).mode()
print(mode)
  workclass native-country
0   Private  United-States
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
print(df)
   col native-country workclass
0    2  United-States   Private
1    3  United-States   Private
2    7         Canada   Private
3    8  United-States   another
4    9  United-States   Private
/anaconda3/envs/exts-ml/lib/python3.6/site-packages/pandas/core/frame.py:4024: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
- Mactilda
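Regarding the warning in the comment above: it usually appears when the frame being assigned to was itself produced by slicing or filtering another DataFrame. Whether that is the case here is an assumption, since the comment does not show how df was created. A minimal sketch under that assumption (raw, subset and the filter on col are hypothetical) takes an explicit copy before filling:

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({"workclass": ["Private", "Private", np.nan, "another", np.nan],
                    "native-country": ["United-States", np.nan, "Canada", np.nan, "United-States"],
                    "col": [2, 3, 7, 8, 9]})

# Assumption: the frame to fill comes from filtering a larger one.
# .copy() makes it an independent DataFrame, so the assignment below
# does not target a view or slice of `raw`.
subset = raw[raw["col"] != 7].copy()

cols = ["workclass", "native-country"]
# Compute the mode on the selected columns only, then fill by column name.
subset[cols] = subset[cols].fillna(subset[cols].mode().iloc[0])

print(subset)
```

The original raw frame keeps its missing values; only the copied subset is filled.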