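My pandas DataFrame contains no NaN values, yet the grouped mean below is NaN for every group shown. When I inspect the groups of the groupby, each one holds only an Int64Index and none of the non-groupby columns. I am confused.
What am I missing?
Here is reproducible code: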
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": np.random.rand(1000),
    "b": np.random.rand(1000),
    "c": np.random.rand(1000),
})
# 100 evenly spaced edges on [0, 1] give 99 bins per column
ranges = np.linspace(0, 1, 100)
df["a_bin"] = pd.cut(df.a, ranges)
df["b_bin"] = pd.cut(df.b, ranges)
print(df.groupby(["a_bin", "b_bin"]).c.mean())
Here is the result:
a_bin          b_bin
(0.0, 0.0101]  (0.0, 0.0101]      NaN
               (0.0101, 0.0202]   NaN
               (0.0202, 0.0303]   NaN
               (0.0303, 0.0404]   NaN
               (0.0404, 0.0505]   NaN
                                   ..
(0.99, 1.0]    (0.949, 0.96]      NaN
               (0.96, 0.97]       NaN
               (0.97, 0.98]       NaN
               (0.98, 0.99]       NaN
               (0.99, 1.0]        NaN
Name: c, Length: 9801, dtype: float64
My pandas version is 1.0.1.
df.groupby(['a_bin','b_bin']).c.count()
Check the count of each group. Good answer. +1 - Ch3steR
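To make the count check concrete, here is a minimal sketch of what it is likely to reveal. The bin columns produced by pd.cut are categorical, and grouping on two categoricals with observed=False yields one group per combination of categories (99 * 99 = 9801 here), even though only 1000 rows exist, so most groups are empty and their mean is NaN. The `observed` argument is a real groupby parameter; the seeded generator and the variable names (`rng`, `all_groups`, `observed_mean`) are only illustrative choices, and observed is passed explicitly because its default differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the sketch is reproducible (an assumption,
# the original question used unseeded np.random.rand).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.random(1000),
    "b": rng.random(1000),
    "c": rng.random(1000),
})
ranges = np.linspace(0, 1, 100)  # 100 edges -> 99 bins per column
df["a_bin"] = pd.cut(df.a, ranges)
df["b_bin"] = pd.cut(df.b, ranges)

# observed=False keeps every combination of categories, including empty ones.
all_groups = df.groupby(["a_bin", "b_bin"], observed=False).c
counts = all_groups.count()
print(len(counts))          # 99 * 99 = 9801 groups
print((counts == 0).sum())  # empty groups; these are the ones whose mean is NaN

# observed=True keeps only the bin pairs that actually occur in the data.
observed_mean = df.groupby(["a_bin", "b_bin"], observed=True).c.mean()
print(observed_mean.isna().sum())  # no NaN left among observed groups
```

If the empty combinations are not needed, passing observed=True (or calling dropna() on the result) is one way to get rid of the NaN rows.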