Python: How is equality handled for NaN in a list?

Question

4

I just want to understand the logic behind these results:

>>> nan = float('nan')
>>> nan == nan
False
# I understand that this is because the __eq__ method is defined this way
>>> nan in [nan]
True
# This is because the __contains__ method for list is defined to compare the identity first then the content?

But in both cases I think the function called behind the scenes is PyObject_RichCompareBool, right? Why is there a difference? Shouldn't they behave the same way?

- Bob Fang

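The first result is not surprising, because nan behaves this way in every programming language (it comes from the standard). I'm not sure about the second. - simonzack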

__contains__ probably short-circuits because nan is nan evaluates to True. Also, float('nan') in [float('nan')] evaluates to False. - lunixbochs

3 Answers

1

You are right that PyObject_RichCompareBool is called; see the list_contains function in listobject.c.

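The documentation says:

This is the equivalent of the Python expression o1 op o2, where op is the operator corresponding to opid.

However, this does not appear to be entirely accurate.

The CPython source contains this part: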

int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
    PyObject *res;
    int ok;

    /* Quick result when objects are the same.
       Guarantees that identity implies equality. */
    if (v == w) {
        if (op == Py_EQ)
            return 1;
        else if (op == Py_NE)
            return 0;
    }

In this case the two operands are the same object, so the comparison reports them as equal without ever calling __eq__.
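The effect of that shortcut can be modelled in pure Python. The sketch below is only an illustration, not CPython's code: rich_compare_bool_eq and list_contains_model are hypothetical names that mimic an identity check followed by a fallback to ==.

```python
def rich_compare_bool_eq(v, w):
    """Hypothetical model of PyObject_RichCompareBool(v, w, Py_EQ)."""
    if v is w:           # identity shortcut: the same object counts as equal
        return True
    return bool(v == w)  # otherwise use the ordinary == result


def list_contains_model(items, x):
    """Hypothetical model of list.__contains__ built on the helper above."""
    return any(rich_compare_bool_eq(item, x) for item in items)


nan = float('nan')
same_object = list_contains_model([nan], nan)                # identity matches
different_object = list_contains_model([float('nan')], nan)  # falls back to ==
print(same_object, different_object)
```

With the same nan object on both sides the model reports a match; with a freshly created NaN in the list it does not, because only == is left to decide.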

- simonzack

0

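Mathematically, it makes no sense to compare the results of undefined operations such as infinity minus infinity. That is why nan is defined to compare unequal to everything, including itself.

In the nan in [nan] case, both sides refer to the same immutable object. But be careful: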

>>> nan is nan
True

>>> float('nan') is float('nan')
False

In the first case, the same object is referenced twice. In the second case, two distinct float objects are created and compared.
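A short sketch of the consequence for membership tests, assuming CPython (the variable names are only for illustration):

```python
a = float('nan')
b = float('nan')  # a second, separate float object that is also NaN

same_identity = a is a   # one object compared with itself
other_identity = a is b  # two different objects
found_same = a in [a]    # CPython checks identity first, so this matches
found_other = a in [b]   # identity fails, then a == b is False

print(same_identity, other_identity, found_same, found_other)
```

Membership therefore depends on which NaN object happens to be stored in the list, not on the value being NaN.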

- Dietrich

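Wow, so how do you know that a parsed nan really is a nan? You can use neither == nor is to compare it with the default nan object. - blueFast

Answering my own question: the math module has an isnan function. - blueFast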
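A minimal sketch of that check. math.isnan is the standard-library function; is_nan is a hypothetical wrapper added here so that non-float values can be passed safely.

```python
import math


def is_nan(value):
    """Return True if value is a float NaN, whichever object it is."""
    return isinstance(value, float) and math.isnan(value)


parsed = float('nan')  # e.g. the result of parsing the text "nan"
detected = is_nan(parsed)
self_unequal = parsed != parsed  # NaN compares unequal even to itself
any_nan = any(is_nan(x) for x in [1.0, "text", parsed])

print(detected, self_unequal, any_nan)
```

Testing with math.isnan avoids both == and is, so it works no matter how the NaN was created.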

- Ashwini Chaudhary · Accepted Answer

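"But in both cases I think the function called behind the scenes is PyObject_RichCompareBool, right? Why is there a difference? Shouldn't they behave the same way?"

== never calls PyObject_RichCompareBool on float objects directly. Floats have their own rich_compare method (used for __eq__), which may or may not call PyObject_RichCompareBool depending on the arguments passed to it.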

 /* Comparison is pretty much a nightmare.  When comparing float to float,
 * we do it as straightforwardly (and long-windedly) as conceivable, so
 * that, e.g., Python x == y delivers the same result as the platform
 * C x == y when x and/or y is a NaN.
 * When mixing float with an integer type, there's no good *uniform* approach.
 * Converting the double to an integer obviously doesn't work, since we
 * may lose info from fractional bits.  Converting the integer to a double
 * also has two failure modes:  (1) a long int may trigger overflow (too
 * large to fit in the dynamic range of a C double); (2) even a C long may have
 * more bits than fit in a C double (e.g., on a a 64-bit box long may have
 * 63 bits of precision, but a C double probably has only 53), and then
 * we can falsely claim equality when low-order integer bits are lost by
 * coercion to double.  So this part is painful too.
 */

static PyObject*
float_richcompare(PyObject *v, PyObject *w, int op)
{
    double i, j;
    int r = 0;

    assert(PyFloat_Check(v));
    i = PyFloat_AS_DOUBLE(v);

    /* Switch on the type of w.  Set i and j to doubles to be compared,
     * and op to the richcomp to use.
     */
    if (PyFloat_Check(w))
        j = PyFloat_AS_DOUBLE(w);

    else if (!Py_IS_FINITE(i)) {
        if (PyInt_Check(w) || PyLong_Check(w))
            /* If i is an infinity, its magnitude exceeds any
             * finite integer, so it doesn't matter which int we
             * compare i with.  If i is a NaN, similarly.
             */
            j = 0.0;
        else
            goto Unimplemented;
    }
...

list_contains, on the other hand, calls PyObject_RichCompareBool directly on the items, which is why you get True in the second case.
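The difference between the two paths can be observed from Python with a class that counts its __eq__ calls. This is a sketch that assumes CPython; NeverEqual is a hypothetical class written for the demonstration.

```python
class NeverEqual:
    """Hypothetical class whose == always answers False, like NaN."""
    eq_calls = 0

    def __eq__(self, other):
        NeverEqual.eq_calls += 1  # record every time == reaches this method
        return False


obj = NeverEqual()

direct = (obj == obj)  # goes through __eq__
calls_after_direct = NeverEqual.eq_calls

contained = obj in [obj]  # identity shortcut applies before __eq__ is tried
calls_after_in = NeverEqual.eq_calls

print(direct, calls_after_direct, contained, calls_after_in)
```

On CPython the counter does not increase during the in test, which shows that the match comes from the identity check and not from __eq__.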

Note that this applies only to CPython. PyPy's list.__contains__ method appears to compare items solely by calling their __eq__ method.

$~/pypy-2.4.0-linux64/bin# ./pypy
Python 2.7.8 (f5dcc2477b97, Sep 18 2014, 11:33:30)
[PyPy 2.4.0 with GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> nan = float('nan')
>>>> nan == nan
False
>>>> nan is nan
True
>>>> nan in [nan]
False