I ran into strange behavior when using scikit-learn pipelines with np.vectorize and pickling them with dill.
I've managed to narrow it down to this:
When I pickle a simple function that has been vectorized with a non-standard otype such as object or str, dill works fine if the vectorized function has not been called yet, but once it has been called, dill raises a PicklingError.
For example:
import numpy as np
import dill

def f(x):
    return x

vf = np.vectorize(f, otypes=[object])
arr = np.asarray(["a", "b", "c"])
dill.detect.trace(True)
dill.dumps(vf)  # succeeds: vf has not been called yet
out = vf(arr)
print(out)
dill.dumps(vf)  # raises PicklingError after the call

and the output is:
T4: <class 'numpy.vectorize'>
# T4
D2: <dict object at 0x000001BB56AF1F00>
F1: <function f at 0x000001BB56E5A430>
F2: <function _create_function at 0x000001BB67356430>
# F2
Co: <code object f at 0x000001BB56B659D0, file "C:\Users\USER\dill_vect.py", line 4>
F2: <function _create_code at 0x000001BB673564C0>
# F2
# Co
D1: <dict object at 0x000001BB56AF1CC0>
# D1
D2: <dict object at 0x000001BB67255E80>
# D2
# F1
D2: <dict object at 0x000001BB6711CB80>
# D2
# D2
['a' 'b' 'c']
T4: <class 'numpy.vectorize'>
# T4
D2: <dict object at 0x000001BB56AF1F00>
F1: <function f at 0x000001BB56E5A430>
F2: <function _create_function at 0x000001BB67356430>
# F2
Co: <code object f at 0x000001BB56B659D0, file "C:\Users\USER\dill_vect.py", line 4>
F2: <function _create_code at 0x000001BB673564C0>
# F2
# Co
D1: <dict object at 0x000001BB56AF1CC0>
# D1
D2: <dict object at 0x000001BB67255E80>
# D2
# F1
D2: <dict object at 0x000001BB6711CB80>
Traceback (most recent call last):
  File "C:\Users\USER\dill_vect.py", line 17, in <module>
    dill.dumps(vf)
  File "C:\ProgramData\Anaconda3\lib\site-packages\dill\_dill.py", line 304, in dumps
    dump(obj, file, protocol, byref, fmode, recurse, **kwds)#, strictio)
  File "C:\ProgramData\Anaconda3\lib\site-packages\dill\_dill.py", line 276, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "C:\ProgramData\Anaconda3\lib\site-packages\dill\_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 487, in dump
    self.save(obj)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 717, in save_reduce
    save(state)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 560, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Anaconda3\lib\site-packages\dill\_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 560, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Anaconda3\lib\site-packages\dill\_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 1002, in _batch_setitems
    save(v)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 589, in save
    self.save_global(obj, rv)
  File "C:\ProgramData\Anaconda3\lib\pickle.py", line 1070, in save_global
    raise PicklingError(
_pickle.PicklingError: Can't pickle <ufunc 'f (vectorized)'>: it's not found as __main__.f (vectorized)

This test ran on Windows, but I've tested it on Linux as well and the same problem occurs.
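The last frames of the traceback (`save` calling `save_global(obj, rv)`) suggest that the ufunc reduces to a bare name string, which pickle then tries to resolve as a module-level global. Here is a minimal stdlib-only sketch of that failure mode. `NamedByString` and `try_pickle` are hypothetical stand-ins made up for illustration; this is not NumPy's or dill's code.

```python
import pickle


class NamedByString:
    """Hypothetical stand-in for an object that pickles by global name."""

    def __init__(self, name):
        self.name = name

    def __reduce__(self):
        # Returning a string asks pickle to store a reference to a
        # module-level global with this name instead of the object's state.
        return self.name


def try_pickle(obj):
    """Return None on success, or the PicklingError message on failure."""
    try:
        pickle.dumps(obj)
    except pickle.PicklingError as exc:
        return str(exc)
    return None


# No global is called "f (vectorized)", so the lookup by name fails.
error = try_pickle(NamedByString("f (vectorized)"))
print(error)
```

The exact message wording depends on which pickler implementation runs, so it may differ from the one in the traceback.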
Package versions used for the test:
numpy==1.22.4 with dill==0.3.5.1, and also with dill==0.3.4
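A possible workaround, sketched under an assumption the report does not confirm: that the first call stores the generated ufunc in a private `_ufunc` dict on the `np.vectorize` instance, and that this cache is what dill cannot pickle. The helper name `dumps_without_ufunc_cache` is hypothetical, and the private attribute may differ between NumPy versions.

```python
import copy

import dill
import numpy as np


def f(x):
    return x


def dumps_without_ufunc_cache(vfunc):
    """Pickle a shallow copy of vfunc with its ufunc cache emptied.

    Assumption: the cache lives in a private dict attribute `_ufunc`.
    If that attribute is absent or not a dict, the copy is pickled as-is.
    """
    clone = copy.copy(vfunc)
    if isinstance(getattr(clone, "_ufunc", None), dict):
        # Assign a fresh dict; the shallow copy shares the original's cache,
        # so clearing it in place would also clear the cache on vfunc.
        clone._ufunc = {}
    return dill.dumps(clone)


def demo():
    vf = np.vectorize(f, otypes=[object])
    arr = np.asarray(["a", "b", "c"])
    vf(arr)  # first call, after which plain dill.dumps(vf) failed in the report
    restored = dill.loads(dumps_without_ufunc_cache(vf))
    return list(restored(arr))


result = demo()
print(result)
```

The original `vf` keeps its cache; only the copy that gets pickled is reset, and the restored object rebuilds the cache on its first call.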