How can I find which values in my numpy array are NaN, infinity, or too large for dtype('float64')?

Question

I am trying to fit a simple machine learning model with scikit-learn. On this line:

```python
clf.fit(features, labels)
```

I get a familiar error:

 Input contains NaN, infinity or a value too large for dtype('float64').

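I have hit similar problems before, usually because the data contained NaN values. This time I have confirmed that there are none. Both inputs to .fit() (features and labels) are np arrays, but they are generated from pandas dataframes. Before the conversion, I checked for NaN values by printing: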

print(features_df[features_df.isnull().any(axis=1)])
print(labels_df[labels_df.isnull().any(axis=1)])

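This printed empty dataframes, so I know that no row contains a NaN value. After the conversion I also checked the numpy arrays for NaN values and summed them successfully with np.sum(), so neither the features nor the labels array passed to fit contains NaN.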

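That means there must be infinite values or very large values, both of which I find hard to believe. Is there a way to print out which values in the dataframe or np array: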

are NaN, infinity or a value too large for dtype('float64')?

I need to pinpoint exactly where they are, because I cannot spot them by eye and there are no NaN values.

- sometimesiwritecode

Have you tried filtering the values out with something like df = df[df.column_name.notnull()]? df = df[df.notnull()] should also work, where df is the pandas dataframe. - YKY

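I don't know about values too large for float64 (how did you get them into the array in the first place?), but +/-inf and nan can be found with ~np.isfinite; the leading tilde inverts the mask. If you need indices rather than a mask, use np.where on the mask. - Paul Panzer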
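
A brief sketch of why the two checks described in the question can pass while the data still fails validation: by default `isnull()` does not flag infinities, and `np.sum` returns `inf` or `nan` rather than raising. The frame `demo_df` and its column names are made up for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one infinite value and no NaN.
demo_df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, np.inf, 6.0]})

# isnull() only looks for missing values, so the infinite row is not reported.
null_rows = demo_df[demo_df.isnull().any(axis=1)]
print(len(null_rows))

# np.sum does not raise on inf or NaN; it simply propagates them.
total = np.sum(demo_df.to_numpy())
print(total)

# ~np.isfinite flags NaN, +inf and -inf in a single mask.
bad_mask = ~np.isfinite(demo_df.to_numpy())
bad_rows = demo_df[bad_mask.any(axis=1)]
print(bad_rows)
```

Under these assumptions, an empty `isnull()` result and a successful `np.sum` call say nothing about infinities, whereas the `~np.isfinite` mask picks out the offending row.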

1 Answer

- fountainhead · Accepted Answer

Suppose this is a NumPy array of shape (3,3):

import numpy as np

ar = np.array([1, 2, 3, 4, np.nan, 5, np.nan, 6, np.inf]).reshape((3,3))
print(ar)
[[ 1.  2.  3.]
 [ 4. nan  5.]
 [nan  6. inf]]

To test for NaN, positive infinity, negative infinity, or combinations of these, we can use:

np.isnan(ar)     # True wherever nan
np.isposinf(ar)  # True wherever pos-inf
np.isneginf(ar)  # True wherever neg-inf
np.isinf(ar)     # True wherever pos-inf or neg-inf
~np.isfinite(ar) # True wherever pos-inf or neg-inf or nan

Each of these returns a boolean array. Passing a boolean array to np.where() gives us two index arrays, one per dimension:

ar_nan = np.where(np.isnan(ar))
print(ar_nan)

(array([1, 2], dtype=int64), array([1, 0], dtype=int64)) # NaN values at positions (1,1) and (2,0)

ar_inf = np.where(np.isinf(ar))
print(ar_inf)

(array([2], dtype=int64), array([2], dtype=int64)) # infinity at position (2,2)
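
If one list of (row, column) pairs is easier to read than two separate index arrays, `np.argwhere` is an alternative to `np.where`. The following is a minimal sketch on the same example array; the names `bad_mask` and `bad_positions` are only illustrative.

```python
import numpy as np

ar = np.array([1, 2, 3, 4, np.nan, 5, np.nan, 6, np.inf]).reshape((3, 3))

# One mask for everything that is NaN, +inf or -inf.
bad_mask = ~np.isfinite(ar)

# np.argwhere returns one [row, col] pair per True entry, in row-major order.
bad_positions = np.argwhere(bad_mask)

# Print each offending position together with the value found there.
for row, col in bad_positions:
    print(f"({row}, {col}) -> {ar[row, col]}")
```

For this array the loop reports three positions: the two NaN entries and the single infinity.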

Also, to see the limits of float64:

np.finfo(np.float64)

finfo(resolution=1e-15, min=-1.7976931348623157e+308, max=1.7976931348623157e+308, dtype=float64)
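
One possible reading of "a value too large" is a value that is finite as float64 but overflows once it is converted to a narrower type such as float32. Whether such a conversion happens depends on the estimator in use; it is an assumption here, not something the question confirms. A minimal sketch of how such values could be located, using an invented array `x`:

```python
import numpy as np

# Hypothetical data: every value is finite as float64.
x = np.array([[1.0, 1e39], [-1e40, 2.0]], dtype=np.float64)
print(np.isfinite(x).all())

# Largest magnitude representable in float32 (about 3.4e38).
f32_max = np.finfo(np.float32).max

# Positions whose magnitude would not fit in float32.
too_large = np.argwhere(np.abs(x) > f32_max)
print(too_large)

# Casting confirms it: those entries turn into +/-inf.
with np.errstate(over="ignore"):
    x32 = x.astype(np.float32)
print(np.isinf(x32))
```

If this check finds anything, rescaling or clipping the affected columns before fitting is one option to consider.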