Remote backend for CacheDir #4394

Open
@aaronborden-rivianvw

Describe the Feature
Support remote backend implementations for SCons using the custom CacheDir class.

Required information

  • Link to SCons Users thread discussing your issue: https://discord.com/channels/571796279483564041/1139589636855955587
  • Version of SCons: 4.5.2
  • Version of Python: 3.8
  • Which python distribution if applicable: python.org
  • How you installed SCons: pypi
  • What Platform are you on? Ubuntu Linux 20.04
  • How to reproduce your issue? Please include a small self contained reproducer. Likely a SConstruct should do for most issues.
  • How you invoke scons: scons ecu=<ecu> config=<config>

Additional context

The existing cache logic in SCons works well for us. This feature is about making that cache available remotely or in a distributed fashion, rather than assuming the cache is a local directory. It doesn't have to be implemented through a custom CacheDir class, but the existing interface is a good fit for a simple remote backend. The docs recommend NFS for a shared/remote cache, which essentially offloads the remote handling to the OS. We've found that NFS doesn't scale across our CI fleet, so we are looking for a remote/distributed cache.

Here's an example interface/implementation that uses the build signature as a unique key for storing objects in, and fetching them from, a remote backend.

import os

import SCons.CacheDir


class CustomCacheDir(SCons.CacheDir.CacheDir):
    # Placeholder for a remote storage client exposing retrieve/store/exists.
    # It is not part of SCons and must be assigned before use.
    backend = None

    @classmethod
    def retrieve(cls, env, bsig, src, dst) -> bool:
        """Retrieve an object from the remote cache. Return False if the operation fails, True otherwise."""
        # Download into the local cache path (src) so that the parent class
        # can then copy it from the local cache to the destination.
        return cls.backend.retrieve(bsig, src)

    @classmethod
    def store(cls, env, bsig, src, dst) -> bool:
        """Store an object in the remote cache."""
        return cls.backend.store(bsig, src)

    @classmethod
    def exists(cls, env, bsig, src, dst) -> bool:
        """Check if an object exists in the remote cache."""
        return cls.backend.exists(bsig)

    @classmethod
    def copy_from_cache(cls, env, bsig, src, dst) -> str:
        """Copy a file from cache."""
        # if object already exists in the local cache, no need to fetch it.
        if not os.path.exists(src):
            cls.retrieve(env, bsig, src, dst)

        # copy the local cached object to the destination
        return super().copy_from_cache(env, bsig, src, dst)

    @classmethod
    def copy_to_cache(cls, env, bsig, src, dst) -> str:
        """Copy a file to cache."""
        # store the object in the remote cache
        cls.store(env, bsig, src, dst)

        # store the object in the local cache
        return super().copy_to_cache(env, bsig, src, dst)
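
For illustration, here is a minimal sketch of what an object behind `cls.backend` might look like. The `DirectoryBackend` name and its directory-based storage are assumptions made for this example only; a real remote backend would presumably talk to an HTTP server, object store, or similar, but it would expose the same three operations keyed by build signature.

```python
import os
import shutil
import tempfile


class DirectoryBackend:
    """Hypothetical backend: one file per build signature under `root`.

    A directory stands in for the remote service so the sketch is runnable.
    """

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, bsig: str) -> str:
        return os.path.join(self.root, bsig)

    def exists(self, bsig: str) -> bool:
        return os.path.isfile(self._path(bsig))

    def store(self, bsig: str, src: str) -> bool:
        # Copy to a temporary name first, then rename, so that readers
        # never observe a partially written object.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        os.close(fd)
        try:
            shutil.copyfile(src, tmp)
            os.replace(tmp, self._path(bsig))
            return True
        except OSError:
            if os.path.exists(tmp):
                os.remove(tmp)
            return False

    def retrieve(self, bsig: str, dst: str) -> bool:
        # Return False on any I/O failure (including a cache miss)
        # instead of raising, matching the bool-returning interface above.
        try:
            shutil.copyfile(self._path(bsig), dst)
            return True
        except OSError:
            return False
```

Returning a bool rather than raising keeps a failed remote operation from failing the build; the caller can simply treat it as a cache miss.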

The behavior is modeled on ccache's remote backend strategy, where by default the local cache acts as a pull-through cache in front of the remote one. If SCons handles the local cache, then the existing logic mostly works as is. SCons would only delegate the question of whether an object exists in the cache to a custom CacheDir class.

Local storage | Remote storage | What happens
------------- | -------------- | ------------
miss          | miss           | Compile, write to local, write to remote [1]
miss          | hit            | Read from remote, write to local
hit           | -              | Read from local, don’t write to remote [2]

[1] Unless remote storage has attribute read-only=true.
[2] Unless local storage is set to share its cache hits with the reshare option.
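
The table and its two footnotes can be read as a small decision function. The sketch below is only an illustration of that reading: `plan_actions` and its parameter names are made up for this example and are not SCons or ccache APIs. It also assumes that a read-only remote is never written to, even when resharing is enabled.

```python
from typing import List


def plan_actions(
    local_hit: bool,
    remote_hit: bool,
    remote_read_only: bool = False,
    reshare: bool = False,
) -> List[str]:
    """Return the steps for one cache lookup (hypothetical helper)."""
    if local_hit:
        # Row 3: the remote is not consulted at all on a local hit.
        actions = ["read local"]
        # Footnote [2]: optionally push local hits to the remote.
        # Assumption: a read-only remote is still never written.
        if reshare and not remote_read_only:
            actions.append("write remote")
        return actions

    if remote_hit:
        # Row 2: pull through the remote and populate the local cache.
        return ["read remote", "write local"]

    # Row 1: miss everywhere, so build and populate both levels.
    actions = ["compile", "write local"]
    # Footnote [1]: skip the remote write when it is read-only.
    if not remote_read_only:
        actions.append("write remote")
    return actions


if __name__ == "__main__":
    for local, remote in [(False, False), (False, True), (True, False)]:
        print(local, remote, plan_actions(local, remote))
```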

A separate check for whether the object is in the cache might be unnecessary. If copy_from_cache returns a negative answer (e.g. False), SCons should assume the object wasn't there, compile it, and then store the object as necessary.
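
A rough sketch of that "try to fetch, fall back to building" flow, without a separate existence check, might look like the following. The `fetch_or_build` function and the callables passed to it are hypothetical stand-ins for this example; they do not reflect how SCons currently drives CacheDir.

```python
from typing import Callable


def fetch_or_build(
    bsig: str,
    copy_from_cache: Callable[[str], bool],
    build: Callable[[], None],
    copy_to_cache: Callable[[str], None],
) -> str:
    """Try the cache first; on a negative answer, build and store.

    Returns "cache" or "built" to show which path was taken.
    """
    # A falsy result is treated as a miss, whatever the underlying reason.
    if copy_from_cache(bsig):
        return "cache"
    build()
    copy_to_cache(bsig)
    return "built"


if __name__ == "__main__":
    # A set of signatures stands in for the cache in this demo.
    cached = set()
    for _ in range(2):
        result = fetch_or_build(
            "abc123",
            copy_from_cache=lambda sig: sig in cached,
            build=lambda: None,
            copy_to_cache=cached.add,
        )
        print(result)
```

The first iteration takes the build path and populates the cache, so the second iteration is served from it.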
