[vcpkg-tool] vcpkg should cache binaries downloaded from AWS locally instead of downloading them every time #38684

Open
microsoft/vcpkg-tool
#1406
@petamas

Description

I have multiple build machines (running the exact same version of tools), set up with two binary sources: the default local binary cache, and a shared AWS S3 bucket.

If a required package binary is missing from both caches, vcpkg builds it, then stores it in both the local cache and the remote (AWS) cache. The next time this machine needs the same package binary, it restores it from the local cache, which takes next to no time. If another machine needs the same package binary, it downloads it from S3, which takes a few seconds but is still much quicker than building it locally. However, the next time this other machine needs the same binary, it downloads it from AWS again, which is less than ideal: if it had stored the binary locally after the first download, it wouldn't need to spend a couple of seconds redownloading it (and it would cost less in AWS egress fees).
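To make the difference concrete, here is a minimal in-memory sketch of the restore flow described above, with and without writing a remote hit through to the local cache. It is not vcpkg's actual code: `FakeCache`, `restore`, and the `write_through` flag are hypothetical names invented for illustration, and the "zip" is just a string.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a binary cache keyed by package ABI hash.
// This does not mirror any real type in binarycaching.cpp.
struct FakeCache {
    std::map<std::string, std::string> archives; // ABI hash -> zip contents
    int reads = 0; // successful restores, to make the cost visible

    std::optional<std::string> try_restore(const std::string& abi) {
        auto it = archives.find(abi);
        if (it == archives.end()) return std::nullopt;
        ++reads;
        return it->second;
    }

    void push(const std::string& abi, const std::string& zip) {
        archives[abi] = zip;
    }
};

// Try the local cache first, then the remote one. When write_through is
// set (the behaviour proposed in this issue), a remote hit is also stored
// locally so the next restore does not touch the remote cache.
std::optional<std::string> restore(FakeCache& local, FakeCache& remote,
                                   const std::string& abi, bool write_through) {
    if (auto hit = local.try_restore(abi)) return hit;
    auto hit = remote.try_restore(abi);
    if (hit && write_through) local.push(abi, *hit);
    return hit;
}
```

With `write_through` set to `false`, two consecutive restores of the same ABI hash both count against `remote.reads`; with `true`, only the first one does and the second is served locally.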

I propose implementing some kind of local caching for .zip files downloaded from any of the "object storage providers", i.e. AWS, GCS and COS. (They share 99% of their code, so they can be treated the same.)

I'm happy to prepare a PR on this (it should be solvable by some gentle massaging of binarycaching.cpp), but I need some guidance on what the final interface should be:

  • Unconditionally push the downloaded .zip files to the default binary cache.
  • Same, but only if the default binary cache is set to write or readwrite.
  • Push to all writable local (i.e. files) binary caches.
  • Push to all writable binary caches, regardless of whether they're local or remote.
  • Manage a separate local binary cache for each "object storage provider", and provide configuration options to set the location of the cache (e.g. through x-aws-config).
  • Something different from any of these.
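The first four options in the list above differ only in which configured caches receive the downloaded archive, so they can be compared as a selection policy. The sketch below is one way to express that; every name in it (`CacheInfo`, `PushPolicy`, `push_targets`) is hypothetical and not taken from vcpkg. It also assumes the cache the archive came from is never a push target, which the list does not state. The "separate cache per provider" option is not modeled here, since it adds a new cache rather than choosing among existing ones.

```cpp
#include <string>
#include <vector>

enum class Access { Read, Write, ReadWrite };

// Hypothetical description of one configured binary cache.
struct CacheInfo {
    std::string name;
    bool is_local;   // a files-style cache on disk
    bool is_default; // the default local binary cache
    Access access;
};

// One enumerator per option, in the order they appear in the list.
enum class PushPolicy {
    DefaultAlways,     // unconditionally push to the default cache
    DefaultIfWritable, // default cache, only if write or readwrite
    AllWritableLocal,  // every writable local cache
    AllWritable        // every writable cache, local or remote
};

bool writable(const CacheInfo& c) { return c.access != Access::Read; }

// Names of the caches that should receive an archive downloaded from
// `origin`. Assumption: the origin cache itself is always skipped.
std::vector<std::string> push_targets(const std::vector<CacheInfo>& caches,
                                      PushPolicy policy,
                                      const std::string& origin) {
    std::vector<std::string> out;
    for (const auto& c : caches) {
        if (c.name == origin) continue;
        bool take = false;
        switch (policy) {
            case PushPolicy::DefaultAlways:     take = c.is_default; break;
            case PushPolicy::DefaultIfWritable: take = c.is_default && writable(c); break;
            case PushPolicy::AllWritableLocal:  take = c.is_local && writable(c); break;
            case PushPolicy::AllWritable:       take = writable(c); break;
        }
        if (take) out.push_back(c.name);
    }
    return out;
}
```

Writing the options this way highlights the main trade-off: the first policy ignores the user's read/write setting on the default cache, while the last one can trigger uploads to other remote caches as a side effect of a download.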

I think I'd be able to implement any of these, but I'd like to know which one would best fit the vision of the vcpkg maintainers. The "separate local binary cache for each provider" option seems the cleanest to me, but I'm fine with any of them; either way, I'd like some help defining how the configuration should work from the user's side.

Labels

category:vcpkg-feature: The issue is a new capability of the tool that doesn’t already exist and we haven’t committed
