Python unit test mock. ValueError: The truth value of a DataFrame is ambiguous.

Question

python pandas python-unittest python-unittest.mock

11

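I am writing a unit test case for a method of mine in Python 2.7.

The method under test calls another method that takes a dictionary with string keys and pandas DataFrames as the values.

I want to write an interaction test for this method to check that it calls the inner method with the correct dictionary.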

def MethodUnderTest(self):
    #some code here
    externalMethod(dictionary_of_string_dataframe)
    #some code here

In the unit test, I wrote the following assertion to test this interaction:

mock_externalClass.externalMethod.assert_called_once_with(dictionary_of_string_dataframe)

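I build dictionary_of_string_dataframe exactly the same way the real method does. In fact, I copied the helper method that does this into the test code, just to make sure the two dictionaries are identical. While debugging the test method in the Python console I even printed both dictionaries, and they look exactly the same.

I also patched the external class with the @patch decorator, and that all works fine.

The problem is that the assertion above fails with the following error: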

 mock_externalClass.externalMethod.assert_called_once_with(dictionary_of_string_dataframe)
  File "C:\Python27\lib\site-packages\mock\mock.py", line 948, in assert_called_once_with
    return self.assert_called_with(*args, **kwargs)
  File "C:\Python27\lib\site-packages\mock\mock.py", line 935, in assert_called_with
    if expected != actual:
  File "C:\Python27\lib\site-packages\mock\mock.py", line 2200, in __ne__
    return not self.__eq__(other)
  File "C:\Python27\lib\site-packages\mock\mock.py", line 2196, in __eq__
    return (other_args, other_kwargs) == (self_args, self_kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 953, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I searched for the ValueError but did not find much help. Can someone tell me what is going on here?

I did look at the following question, but it did not help:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

- Hary

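So just to clarify: you have two dictionaries that map strings to pandas DataFrames, and you want to check whether they are equal. You are writing the unit test yourself, and even though both dictionaries are created with the same function and arguments, you still get the error above. - victor

When I run the test case, the method under test calls an external method with a string:dataframe dictionary. In the test case itself, I test this interaction with the statement given in the question, mock_externalClass.externalMethod.assert_called_once_with, and I build the dictionary in the test with the same helper method that the method under test uses. However, I get the error. - Hary

In my opinion, your approach is not quite right. You are essentially forcing your code to pass the test rather than letting the test ensure the desired result. If the external method's signature changes, your test will not detect the breaking change in your code; it will fail only after you fix the code. I suggest decoupling the code that derives the input and calls the external method into its own method and testing that method in isolation, separating your concerns. - Dan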

4 Answers

5

I solved this by creating a class SAME_DF, similar to mock's ANY class:

import pandas as pd

class SAME_DF:
    def __init__(self, df: pd.DataFrame):
        self.df = df

    def __eq__(self, other):
        # Equal to any DataFrame whose content matches the wrapped one
        return isinstance(other, pd.DataFrame) and other.equals(self.df)

def test_called_with_df():
    ...
    mock.method.assert_called_once_with(SAME_DF(pd.DataFrame({
        'name': ['Eric', 'Yoav'],
        'age': [28, 34]
    })))

- ericman

This worked great for my problem, but adding a repr function helps with debugging when a test fails. def __repr__(self): return repr(self.df) - Jeff

Nice solution! Thanks. - Pete C
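
A minimal sketch of the matcher with the suggested __repr__ added, assuming Python 3 and unittest.mock. The `external_method` mock is a hypothetical stand-in for whatever is patched in the real test, and the sketch relies on mock placing the expected argument on the left-hand side of the comparison, which is the same ordering that lets ANY work.

```python
from unittest.mock import Mock

import pandas as pd


class SAME_DF:
    """Matcher that compares equal to any DataFrame with the same content."""

    def __init__(self, df: pd.DataFrame):
        self.df = df

    def __eq__(self, other):
        return isinstance(other, pd.DataFrame) and other.equals(self.df)

    def __repr__(self):
        # Show the wrapped frame in assertion failure messages.
        return repr(self.df)


# Hypothetical mock standing in for the patched external method.
external_method = Mock()
external_method(pd.DataFrame({'name': ['Eric', 'Yoav'], 'age': [28, 34]}))

# The matcher wraps a separate DataFrame object with equal content.
external_method.assert_called_once_with(
    SAME_DF(pd.DataFrame({'name': ['Eric', 'Yoav'], 'age': [28, 34]}))
)
```

Without __repr__, a failing assertion would print something like `<SAME_DF object at 0x...>` for the expected call, which says nothing about the frame that was expected.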

4

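This happens because unittest.mock compares the call arguments with == or !=. pandas DataFrames cannot be compared that way: == on two DataFrames returns an elementwise DataFrame of booleans, and asking for the truth value of that result raises the ValueError. Instead, you have to use the DataFrame's .equals method.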

https://github.com/testing-cabal/mock/blob/master/mock/mock.py
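
The failure can be reproduced in a few lines. This is only an illustrative sketch under Python 3: `df_a`, `df_b` and `external_method` are made-up names, and the mock stands in for the patched external method from the question.

```python
from unittest.mock import Mock

import pandas as pd

df_a = pd.DataFrame({'x': [1, 2]})
df_b = pd.DataFrame({'x': [1, 2]})  # equal content, different object

# == is elementwise: the result is a DataFrame of booleans, not a bool.
elementwise = df_a == df_b

# .equals collapses the comparison into a single boolean.
same_content = df_a.equals(df_b)

# Mock compares expected and recorded arguments with ==, so comparing the
# two dictionaries ends up asking for the truth value of a DataFrame.
external_method = Mock()  # hypothetical stand-in for the patched method
external_method({'key': df_a})
try:
    external_method.assert_called_once_with({'key': df_b})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Note that passing the very same DataFrame object on both sides would not raise, because container comparisons check identity before falling back to ==. That can hide the problem in tests that reuse one object.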

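One possible workaround is to write your own check in the unit test that iterates over the dictionary and compares the DataFrames with the .equals method.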
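
One way this could look, as a sketch rather than a definitive implementation: `assert_same_frames` is a hypothetical helper name, and `external_method` is again a stand-in mock. The recorded dictionary is read back from the mock's call_args and compared key by key.

```python
from unittest.mock import Mock

import pandas as pd


def assert_same_frames(actual, expected):
    """Fail unless both dicts map the same keys to equal DataFrames."""
    assert actual.keys() == expected.keys(), 'keys differ'
    for key, expected_df in expected.items():
        assert actual[key].equals(expected_df), 'frame differs for %r' % key


external_method = Mock()  # hypothetical stand-in for the patched method
external_method({'sales': pd.DataFrame({'x': [1, 2]})})

# Check the call count first, then inspect the recorded arguments.
external_method.assert_called_once()
(actual_dict,), _ = external_method.call_args
assert_same_frames(actual_dict, {'sales': pd.DataFrame({'x': [1, 2]})})
```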

Another approach is to override the pandas DataFrame's __eq__ method, so that the right comparison is used when mock compares them.
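
A hedged sketch of that idea, using a subclass instead of patching pandas itself. `EqualsFrame` is a hypothetical name, and the class is meant only for building expected values in tests, since it changes what == means for those objects.

```python
from unittest.mock import Mock

import pandas as pd


class EqualsFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass whose == returns a single bool."""

    def __eq__(self, other):
        # Whole-frame comparison instead of an elementwise DataFrame.
        return isinstance(other, pd.DataFrame) and self.equals(other)

    def __ne__(self, other):
        return not self.__eq__(other)


external_method = Mock()  # hypothetical stand-in for the patched method
external_method({'key': pd.DataFrame({'x': [1, 2]})})

# The expected dictionary holds the subclass; the recorded one holds a
# plain DataFrame.
external_method.assert_called_once_with({'key': EqualsFrame({'x': [1, 2]})})
```

Because Python gives a subclass's comparison method priority over its parent's, this works whichever side of == the expected frame ends up on.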

- victor

4

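Here is a solution based on the original question. It borrows some ideas from sfogle's answer, but it lets you test whether the function was called with DataFrames or with other arguments.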

import unittest
from unittest.mock import patch
import pandas as pd

def external_method(df):
    return df

def function_to_test(df):
    external_method(df)
    return df

class MyTest(unittest.TestCase):

    def test_the_function_to_test(self):
        my_test_df = pd.DataFrame({"a": [1, 2, 3, 4]})    
        with patch(__name__ + ".external_method") as mock_external_method:        
            function_to_test(my_test_df)
            mock_external_method.assert_called_once()
            args, kwargs = mock_external_method.call_args
            self.assertTrue(args[0].equals(my_test_df))

- Scott Morken

- sfogle · Accepted Answer

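I ran into this too, when testing that a function I wrote was called with a specific preprocessed DataFrame as an argument, and solved it with mock's call_args attribute together with pandas's testing.assert_frame_equal. In my case, I wanted to assert the value of the DataFrame passed as the second argument to a function named score_function, which is called beneath a higher-level function named run_scoring. So first I retrieve the *args portion of the mocked call with [0], then use [1] to get the second positional argument (the one I want to assert on). After that, I can assert the DataFrame's value with pd.testing.assert_frame_equal.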

from unittest.mock import Mock
import pandas as pd

import my_code

...

score_function_mocked = Mock()
my_code.score_function = score_function_mocked
my_code.run_scoring()

pd.testing.assert_frame_equal(
    # [0] = *args, [0][1] = second positional arg to my function
    score_function_mocked.call_args[0][1],
    pd.DataFrame({
        'new_york': [1, 0],
        'chicago': [0, 1],
        'austin': [0, 0]
    })
)
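
Since my_code is not shown, here is a self-contained variant of the same idea as a rough sketch. The mock is called directly instead of through run_scoring, and the first argument `'model'` is an invented placeholder.

```python
from unittest.mock import Mock

import pandas as pd

# Hypothetical stand-in for the mocked score_function.
score_function_mocked = Mock()
score_function_mocked(
    'model',  # placeholder first positional argument
    pd.DataFrame({'new_york': [1, 0], 'chicago': [0, 1]}),
)

# call_args unpacks into (args, kwargs); args[1] is the second positional arg.
args, kwargs = score_function_mocked.call_args
pd.testing.assert_frame_equal(
    args[1],
    pd.DataFrame({'new_york': [1, 0], 'chicago': [0, 1]}),
)

# A frame with different values makes assert_frame_equal raise AssertionError.
try:
    pd.testing.assert_frame_equal(
        args[1],
        pd.DataFrame({'new_york': [1, 1], 'chicago': [0, 1]}),
    )
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

Compared with .equals, assert_frame_equal reports which column and values differ, which tends to make a failing test easier to read.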