Asyncio and multiprocess damage the cache #215

Open
@pombredanne

Description

This error is showing up in test runs: https://dev.azure.com/nexB/python-inspector/_build/results?buildId=15849&view=logs&jobId=127eb2c5-1c4a-54f5-86f9-4c9024fa5f36&j=127eb2c5-1c4a-54f5-86f9-4c9024fa5f36&t=eb44e1a8-f951-5ab7-e23d-607c85fad059

This happens because the cache is being written to concurrently, and we have not designed that cache to be thread-safe or process-safe.
Having a global cache with settings, as in #214 or #192, also compounds the issue.

self = <zipfile.ZipFile [closed]>

    def _RealGetContents(self):
        """Read in the table of contents for the ZIP file."""
        fp = self.fp
        try:
            endrec = _EndRecData(fp)
        except OSError:
            raise BadZipFile("File is not a zip file")
        if not endrec:
>           raise BadZipFile("File is not a zip file")
E           zipfile.BadZipFile: File is not a zip file
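
One plausible way to get this traceback is for a reader to open a cached archive while another writer has only partially written it. The sketch below is not taken from python-inspector. It assumes the corruption is a truncated write and uses hypothetical helper names (`make_zip_bytes`, `can_open_as_zip`) to show that a truncated archive no longer opens as a zip.

```python
import io
import zipfile


def make_zip_bytes():
    """Build a small, valid zip archive in memory (stand-in for a cached wheel)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("example.txt", "hello")
    return buf.getvalue()


def can_open_as_zip(data):
    """Return True if the bytes open as a zip archive, False on BadZipFile."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)):
            return True
    except zipfile.BadZipFile:
        return False


complete = make_zip_bytes()
# Simulate a reader seeing a file that a concurrent writer has only half written.
# The end-of-central-directory record sits at the end of the archive, so it is lost.
truncated = complete[: len(complete) // 2]
```

Here `can_open_as_zip(complete)` succeeds while `can_open_as_zip(truncated)` fails, which matches the symptom above under the stated assumption.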

The solution is likely to use process-safe file locks, like those used in ScanCode, when writing wheels, to avoid any corruption of cached files. We should also add some pip-style cache checks.
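
A minimal sketch of that idea, using only the standard library. This is not the ScanCode or pip implementation. The names `file_lock`, `write_cached_file` and `is_valid_cached_wheel` are hypothetical, and the lock is a simple exclusively created lock file that a crashed process could leave behind. A real fix would more likely use a maintained locking library.

```python
import os
import tempfile
import time
import zipfile
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, timeout=10.0, poll=0.05):
    """Hypothetical cross-process lock based on an exclusively created lock file."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another process already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"Could not acquire lock: {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def write_cached_file(path, content):
    """Write bytes to path so that readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    with file_lock(path + ".lock"):
        # Write to a temporary file in the same directory, then rename it into place.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(content)
            os.replace(tmp_path, path)  # atomic within a single filesystem
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise


def is_valid_cached_wheel(path):
    """Cache check: treat a missing or non-zip file as a cache miss."""
    return os.path.isfile(path) and zipfile.is_zipfile(path)
```

A caller would check `is_valid_cached_wheel(path)` before reusing a cached wheel and fall back to downloading and `write_cached_file(path, content)` when the check fails. The temporary-file-plus-rename step is what keeps readers from seeing partial content, and the lock only serializes writers.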
