我认为你需要使用dropna
来去掉NaN
值:
incoms=data['int_income'].dropna().unique().tolist()
print (incoms)
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]
如果所有的值仅为整数:
incoms=data['int_income'].dropna().astype(int).unique().tolist()
print (incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]
或者通过选择所有非 NaN 的值,使用 numpy.isnan
来移除 NaN
:
a = data['int_income'].unique()
incoms= a[~np.isnan(a)].tolist()
print (incoms)
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]
a = data['int_income'].unique()
incoms= a[~np.isnan(a)].astype(int).tolist()
print (incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]
纯 Python 解决方案 - 如果数据量大则速度较慢 DataFrame
:
incoms=[x for x in list(set(data['int_income'])) if pd.notnull(x)]
print (incoms)
[0.0, 100000.0, 200000.0, 25000.0, 125000.0, 50000.0, 10000.0, 150000.0, 175000.0, 75000.0]
incoms=[int(x) for x in list(set(data['int_income'])) if pd.notnull(x)]
print (incoms)
[0, 100000, 200000, 25000, 125000, 50000, 10000, 150000, 175000, 75000]
math.isnan
而不是依赖于实现细节,比如str(float("nan")) == "nan"
。 - juanpa.arrivillaga