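Python can pickle lambda functions, though not with the standard library alone. Because pickle is implemented differently across Python versions, we will cover Python 2 and 3 separately.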
In Python 3 there is no module named cPickle. Instead we have pickle, which by default does not support pickling lambda functions either. Let's look at its dispatch table:
>> import pickle
>> pickle.Pickler.dispatch_table
<member 'dispatch_table' of '_pickle.Pickler' objects>
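Wait. I tried to look up the dispatch_table of pickle, not of _pickle. _pickle is the alternative, faster C implementation of pickle. But we haven't imported it! The explanation is that the pure Python pickle module imports this C implementation automatically at the end of the module, if it is available.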
try:
    from _pickle import (
        PickleError,
        PicklingError,
        UnpicklingError,
        Pickler,
        Unpickler,
        dump,
        dumps,
        load,
        loads
    )
except ImportError:
    Pickler, Unpickler = _Pickler, _Unpickler
    dump, dumps, load, loads = _dump, _dumps, _load, _loads
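To see which implementation you actually ended up with, something like the following should work. This is a minimal sketch assuming CPython 3, where the pure Python classes stay reachable under the private names that appear in the fallback branch above (pickle._Pickler, pickle._dumps). Those names are implementation details, not public API.

```python
import pickle

# pickle.Pickler is rebound to the C class when "from _pickle import ..." succeeds;
# the pure-Python class remains available under the private name pickle._Pickler.
using_c_accelerator = pickle.Pickler is not pickle._Pickler

print("Pickler comes from module:", pickle.Pickler.__module__)
print("C accelerator in use:", using_c_accelerator)

# Both implementations write the same pickle format, so bytes produced by the
# pure-Python _dumps can be read back by the default loads.
payload = {"numbers": [1, 2, 3], "name": "example"}
round_tripped = pickle.loads(pickle._dumps(payload))
print(round_tripped == payload)
```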
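We are still left with the question of how to pickle lambda functions in Python 3. The answer is that you cannot do it with the native pickle or _pickle. You need to import dill or cloudpickle and use it in place of the native pickle module.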
>> import dill
>> dill.loads(dill.dumps(lambda x:x))
<function __main__.<lambda>>
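For contrast, here is a small standard-library-only check of what the native pickle accepts. It is a sketch, and the helper name can_pickle is made up for illustration. It catches several exception types because the exact error raised for an unpicklable function differs between Python versions and between module-level and nested lambdas.

```python
import pickle

def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A builtin is pickled by reference (module name plus attribute name).
builtin_ok = can_pickle(len)

# A lambda has no importable name to refer to, so the same mechanism fails.
lambda_ok = can_pickle(lambda x: x)

print("len picklable:", builtin_ok)
print("lambda picklable:", lambda_ok)
```

On a stock Python 3 interpreter the builtin should be accepted and the lambda rejected, which is the gap dill and cloudpickle fill.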
In Python 2, pickle uses a pickle registry, which is really just a mapping from a type to the function used to serialize (pickle) objects of that type.
You can see the pickle registry as:
>> pickle.Pickler.dispatch
{bool: <function pickle.save_bool>,
instance: <function pickle.save_inst>,
classobj: <function pickle.save_global>,
float: <function pickle.save_float>,
function: <function pickle.save_global>,
int: <function pickle.save_int>,
list: <function pickle.save_list>,
long: <function pickle.save_long>,
dict: <function pickle.save_dict>,
builtin_function_or_method: <function pickle.save_global>,
NoneType: <function pickle.save_none>,
str: <function pickle.save_string>,
tuple: <function pickle.save_tuple>,
type: <function pickle.save_global>,
unicode: <function pickle.save_unicode>}
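To pickle custom types, Python provides the copy_reg module for registering our own functions. You can read more about it in the module's documentation. By default, the copy_reg module supports pickling of the following additional types: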
>> import copy_reg
>> copy_reg.dispatch_table
{code: <function ipykernel.codeutil.reduce_code>,
complex: <function copy_reg.pickle_complex>,
_sre.SRE_Pattern: <function re._pickle>,
posix.statvfs_result: <function os._pickle_statvfs_result>,
posix.stat_result: <function os._pickle_stat_result>}
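As an illustration of how such a registration works, here is a minimal sketch written for Python 3, where copy_reg is spelled copyreg. It registers a reducer for code objects, similar in spirit to the code entry in the table above. The reducer reduce_code below is my own marshal-based stand-in, not the ipykernel implementation.

```python
import copyreg
import marshal
import pickle
import types

def reduce_code(code_obj):
    """Reducer: tell pickle to rebuild a code object by calling marshal.loads."""
    return marshal.loads, (marshal.dumps(code_obj),)

# Add the reducer to the global copyreg.dispatch_table.
copyreg.pickle(types.CodeType, reduce_code)

square = lambda x: x * x

# Without the registration above, pickling a code object raises TypeError.
data = pickle.dumps(square.__code__)
restored_code = pickle.loads(data)

# Wrap the restored code in a new function that has an empty globals dict.
rebuilt = types.FunctionType(restored_code, {})
print(rebuilt(7))
```

Note that this only moves the code object. Globals, default arguments and closures are not carried along, and marshal data is tied to the Python version that produced it, which is part of the extra logic that libraries like dill take care of.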
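Now, the type of a lambda function is types.FunctionType. However, the built-in handler for this type, function: <function pickle.save_global>, cannot serialize lambda functions. Therefore all third-party libraries such as dill, cloudpickle, etc. override the built-in method to serialize lambda functions with some additional logic. Let's import dill and see what it does.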
>> import dill
>> pickle.Pickler.dispatch
{_pyio.BufferedReader: <function dill.dill.save_file>,
_pyio.TextIOWrapper: <function dill.dill.save_file>,
_pyio.BufferedWriter: <function dill.dill.save_file>,
_pyio.BufferedRandom: <function dill.dill.save_file>,
functools.partial: <function dill.dill.save_functor>,
operator.attrgetter: <function dill.dill.save_attrgetter>,
operator.itemgetter: <function dill.dill.save_itemgetter>,
cStringIO.StringI: <function dill.dill.save_stringi>,
cStringIO.StringO: <function dill.dill.save_stringo>,
bool: <function pickle.save_bool>,
cell: <function dill.dill.save_cell>,
instancemethod: <function dill.dill.save_instancemethod0>,
instance: <function pickle.save_inst>,
classobj: <function dill.dill.save_classobj>,
code: <function dill.dill.save_code>,
property: <function dill.dill.save_property>,
method-wrapper: <function dill.dill.save_instancemethod>,
dictproxy: <function dill.dill.save_dictproxy>,
wrapper_descriptor: <function dill.dill.save_wrapper_descriptor>,
getset_descriptor: <function dill.dill.save_wrapper_descriptor>,
member_descriptor: <function dill.dill.save_wrapper_descriptor>,
method_descriptor: <function dill.dill.save_wrapper_descriptor>,
file: <function dill.dill.save_file>,
float: <function pickle.save_float>,
staticmethod: <function dill.dill.save_classmethod>,
classmethod: <function dill.dill.save_classmethod>,
function: <function dill.dill.save_function>,
int: <function pickle.save_int>,
list: <function pickle.save_list>,
long: <function pickle.save_long>,
dict: <function dill.dill.save_module_dict>,
builtin_function_or_method: <function dill.dill.save_builtin_method>,
module: <function dill.dill.save_module>,
NotImplementedType: <function dill.dill.save_singleton>,
NoneType: <function pickle.save_none>,
xrange: <function dill.dill.save_singleton>,
slice: <function dill.dill.save_slice>,
ellipsis: <function dill.dill.save_singleton>,
str: <function pickle.save_string>,
tuple: <function pickle.save_tuple>,
super: <function dill.dill.save_functor>,
type: <function dill.dill.save_type>,
weakcallableproxy: <function dill.dill.save_weakproxy>,
weakproxy: <function dill.dill.save_weakproxy>,
weakref: <function dill.dill.save_weakref>,
unicode: <function pickle.save_unicode>,
thread.lock: <function dill.dill.save_lock>}
Now, let's try to pickle a lambda function.
>> pickle.loads(pickle.dumps(lambda x:x))
<function __main__.<lambda>>
It works!
In Python 2 we have two versions of pickle:
import pickle # pure Python version
pickle.__file__ # <install directory>/python-2.7/lib64/python2.7/pickle.py
import cPickle # C extension
cPickle.__file__ # <install directory>/python-2.7/lib64/python2.7/lib-dynload/cPickle.so
Now, let's try to pickle a lambda function with the C implementation, cPickle.
>> import cPickle
>> cPickle.loads(cPickle.dumps(lambda x:x))
TypeError: can't pickle function objects
What went wrong? Let's look at the dispatch table of cPickle.
>> cPickle.Pickler.dispatch_table
AttributeError: 'builtin_function_or_method' object has no attribute 'dispatch_table'
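pickle and cPickle are implemented differently. Importing dill only makes the Python version of pickle work. The disadvantage of using pickle instead of cPickle is that it can be as much as 1000 times slower than cPickle.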
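If you want to measure the gap yourself, a rough Python 3 analogue is to time the default (normally C-accelerated) pickle.dumps against the pure Python pickle._dumps. This is only a sketch: _dumps is a private name, the payload is arbitrary, and the ratio you get depends heavily on the data and the machine, so do not expect it to match the Python 2 figure quoted above.

```python
import pickle
import timeit

# An arbitrary payload of nested containers, just to give the pickler some work.
payload = [{"id": i, "values": list(range(10))} for i in range(1000)]

# Default dumps (the C implementation when available) vs. the pure-Python one.
fast = timeit.timeit(lambda: pickle.dumps(payload), number=20)
slow = timeit.timeit(lambda: pickle._dumps(payload), number=20)

ratio = slow / fast
print(f"default dumps: {fast:.4f}s")
print(f"pure-Python _dumps: {slow:.4f}s")
print(f"ratio: {ratio:.1f}x")
```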
Hope this clears up all the doubts.
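stackless can generally do everything dill can... the main difference is that stackless replaces the call stack in C, while dill tries to register serialization functions using ctypes so that it works at the C layer as much as possible. Stackless can serialize all objects. - Mike McKerns
cloudpickle is the way to go: https://github.com/cloudpipe/cloudpickle - Ufos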