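I was working on a small custom data object that needs to be hashable, comparable, and fast when I ran into an odd set of timing results. Some of the comparisons (and the hashing method) for this object simply delegate to an attribute, so I was using something like: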
def __hash__(self):
    return self.foo.__hash__()
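However, when testing, I found that hash(self.foo) is noticeably faster. Curious, I tested __eq__, __ne__, and the other magic comparison methods, only to find that every one of them ran faster when I used the sugary form (==, !=, <, etc.). Why is this? I assumed the sugared form would have to make the same function call under the hood, but perhaps that isn't the case?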
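One way to probe this is to compare what CPython compiles each form to. The sketch below is only an illustration: the function names `sugar`, `explicit`, and `opnames` are hypothetical, and the exact opcode names vary between Python versions. It relies on `dis.get_instructions`, which needs Python 3.4 or later (on 3.3, `dis.dis(func)` prints a similar listing).

```python
import dis


def sugar(a, b):
    # Operator form: compiled to a dedicated comparison opcode.
    return a == b


def explicit(a, b):
    # Explicit form: an attribute lookup followed by an ordinary call.
    return a.__eq__(b)


def opnames(func):
    """Return the opcode names a function compiles to, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


print(opnames(sugar))
print(opnames(explicit))
```

On the CPython versions I am aware of, the operator form contains a `COMPARE_OP` instruction, while the explicit form instead looks up the `__eq__` attribute and then calls the result. The two spellings are also not strictly equivalent: `(1).__eq__(1.0)` returns `NotImplemented`, whereas `1 == 1.0` falls back to the reflected comparison and gives `True`.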
Timeit results
Setups: thin wrappers around an instance attribute that controls all the comparisons.
Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>>
>>> sugar_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...         self._foo = f
...     def __hash__(self):
...         return hash(self._foo)
...     def __eq__(self, other):
...         return self._foo == other._foo
...     def __ne__(self, other):
...         return self._foo != other._foo
...     def __lt__(self, other):
...         return self._foo < other._foo
...     def __gt__(self, other):
...         return self._foo > other._foo
... '''
>>> explicit_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...         self._foo = f
...     def __hash__(self):
...         return self._foo.__hash__()
...     def __eq__(self, other):
...         return self._foo.__eq__(other._foo)
...     def __ne__(self, other):
...         return self._foo.__ne__(other._foo)
...     def __lt__(self, other):
...         return self._foo.__lt__(other._foo)
...     def __gt__(self, other):
...         return self._foo.__gt__(other._foo)
... '''
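The tests
My custom object wraps a datetime, so that's what I used, but it shouldn't make any difference. Yes, I'm creating the datetimes inside the tests, so there is obviously some associated overhead, but that overhead is constant from one test to the next, so it shouldn't affect the comparison. I've omitted the __ne__ and __gt__ tests for brevity, but those results were basically identical to the ones shown here.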
>>> test_hash = '''\
... for i in range(1, 1000):
...     hash(Thin(datetime.datetime.fromordinal(i)))
... '''
>>> test_eq = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a == a # True
...     a == b # False
... '''
>>> test_lt = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a < b # True
...     b < a # False
... '''
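For anyone who wants to take the object-construction overhead out of the picture entirely, a variant like the following might help. This is only a sketch, not what I actually ran: the names `setup`, `stmt`, and `things` are hypothetical, and it moves the creation of the wrappers into the setup string so that the timed statement contains nothing but the hash calls.

```python
import timeit

# Setup runs once per repeat and is not included in the measured time.
setup = '''\
import datetime
class Thin(object):
    def __init__(self, f):
        self._foo = f
    def __hash__(self):
        return hash(self._foo)
# Build the wrappers up front so construction is not timed.
things = [Thin(datetime.datetime.fromordinal(i)) for i in range(1, 1000)]
'''

# Only the hash calls themselves are timed.
stmt = '''\
for t in things:
    hash(t)
'''

best = min(timeit.repeat(stmt, setup, number=100, repeat=5))
print(best)
```

The same pattern should work for the explicit variant by swapping the body of `__hash__` for `return self._foo.__hash__()`, and for the comparison tests by building pairs of objects in the setup.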
The results
>>> min(timeit.repeat(test_hash, explicit_setup, number=1000, repeat=20))
1.0805227295846862
>>> min(timeit.repeat(test_hash, sugar_setup, number=1000, repeat=20))
1.0135617737162192
>>> min(timeit.repeat(test_eq, explicit_setup, number=1000, repeat=20))
2.349765956168767
>>> min(timeit.repeat(test_eq, sugar_setup, number=1000, repeat=20))
2.1486044757355103
>>> min(timeit.repeat(test_lt, explicit_setup, number=500, repeat=20))
1.156479287717275
>>> min(timeit.repeat(test_lt, sugar_setup, number=500, repeat=20))
1.0673696685109917
- Hash:
  - Explicit: 1.0805227295846862
  - Sugar: 1.0135617737162192
- Equal:
  - Explicit: 2.349765956168767
  - Sugar: 2.1486044757355103
- Less than:
  - Explicit: 1.156479287717275
  - Sugar: 1.0673696685109917
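To put the gap in relative terms, the numbers above can be turned into a percentage overhead of the explicit form over the sugared form. The helper name `overhead_percent` and the `timings` dictionary are just illustrative; the values are copied from the results above.

```python
# (explicit, sugar) best-of-20 timings in seconds, copied from the results.
timings = {
    "hash": (1.0805227295846862, 1.0135617737162192),
    "eq": (2.349765956168767, 2.1486044757355103),
    "lt": (1.156479287717275, 1.0673696685109917),
}


def overhead_percent(explicit, sugar):
    """How much slower the explicit form is, as a percentage of sugar."""
    return (explicit - sugar) / sugar * 100.0


for name in ("hash", "eq", "lt"):
    explicit, sugar = timings[name]
    print("%s: %.1f%%" % (name, overhead_percent(explicit, sugar)))
```

That works out to roughly 6.6% for hash, 9.4% for equality, and 8.3% for less-than. Note that the less-than test used number=500 rather than 1000, so its absolute times are not directly comparable with the other two, though the ratio still is.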