Multiprocessing in AWS Lambda #1219

@raulsperoni

Hello, I'm hitting an error when running Dedupe in an AWS Lambda function, even with num_cores=0:

[ERROR] OSError: [Errno 38] Function not implemented

Traceback:

    clustered_dupes = deduper.partition(data_d, 0.5)
  File "/var/lang/lib/python3.11/site-packages/dedupe/api.py", line 190, in partition
    pair_scores = self.score(pairs)
  File "/var/lang/lib/python3.11/site-packages/dedupe/api.py", line 115, in score
    matches = core.scoreDuplicates(
  File "/var/lang/lib/python3.11/site-packages/dedupe/core.py", line 129, in scoreDuplicates
    offset = multiprocessing.Value("Q", 0, lock=RLock())
  File "/var/lang/lib/python3.11/multiprocessing/context.py", line 73, in RLock
    return RLock(ctx=self.get_context())
  File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 194, in __init__
    SemLock.__init__(self, RECURSIVE_MUTEX, 1, 1, ctx=ctx)
  File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(
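
A likely explanation, stated as an assumption rather than something confirmed in this thread: multiprocessing locks are backed by POSIX semaphores, and the Lambda runtime reportedly lacks the shared-memory support they need, so creating a SemLock fails with Errno 38. The sketch below probes for that support; `semlock_available` is a hypothetical helper name, not part of dedupe.

```python
import multiprocessing


def semlock_available() -> bool:
    """Return True if this platform can create multiprocessing locks.

    Hypothetical helper: it simply tries to build an RLock, which is the
    call that fails in the traceback above.
    """
    try:
        multiprocessing.RLock()
    except (OSError, ImportError):
        # OSError(38) on platforms without working POSIX semaphores;
        # ImportError where the sem_open implementation is missing entirely.
        return False
    return True


if __name__ == "__main__":
    print("multiprocessing locks available:", semlock_available())
```

Running this inside the Lambda handler would show whether the failure is specific to dedupe or affects any multiprocessing lock in that environment.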

These lines in dedupe/core.py (scoreDuplicates) are not inside the num_cores condition, so they run even when multiprocessing is disabled:

# explicitly defining the lock from the "spawn context" seems to
# be necessary for python 3.7 on mac os.
offset = multiprocessing.Value("Q", 0, lock=RLock())

Can anyone help?
Thanks!
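
One possible direction, as a rough sketch only: create the shared counter through a small factory that falls back to an in-process object when no worker processes are requested. `LocalValue` and `make_offset` are hypothetical names, and this assumes num_cores below 2 means single-process scoring and that the counter is only used through `.value` and `.get_lock()`. It is not dedupe's actual code.

```python
import multiprocessing
import threading


class LocalValue:
    """In-process stand-in for multiprocessing.Value (hypothetical helper).

    Exposes only .value and .get_lock(), and needs no OS semaphores.
    """

    def __init__(self, initial: int = 0) -> None:
        self.value = initial
        self._lock = threading.RLock()

    def get_lock(self):
        return self._lock


def make_offset(num_cores: int):
    """Return a counter suited to the requested level of parallelism."""
    if num_cores < 2:
        # Single-process path: avoid multiprocessing primitives entirely.
        return LocalValue(0)
    # Multi-process path: same construction as in the snippet quoted above.
    return multiprocessing.Value("Q", 0, lock=multiprocessing.RLock())


if __name__ == "__main__":
    offset = make_offset(0)
    with offset.get_lock():
        offset.value += 5
    print(offset.value)
```

With a guard like this, the num_cores=0 path would never touch SemLock, which is the call that raises in the traceback.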
