Problem Statement
As a mybinder.org federation operator, I want to control the costs of my deployment. One cost that starts small but grows over time is storage for the build cache (an OCI registry). Because this is a cache, it is fine to delete 'unused' images, since they can be rebuilt if they are ever launched again. However, most registries do not preserve the information needed to determine which images are 'unused' and therefore safe to delete.
Proposed Solution
Deploy a registry implementation whose retention rules are sufficient to cap storage size, and therefore cost. The registry should be configurable to delete the least recently used (LRU) images to keep its size in check. How much to delete, and how often, can be tuned per deployment according to cost requirements.
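To make the LRU policy concrete, here is a minimal sketch of how eviction candidates could be chosen. It is an illustration only, not part of any registry's API: `CachedImage`, `select_lru_evictions`, and `target_bytes` are hypothetical names, and it assumes the registry can report a size and a last pull time for each image. Summing per-image sizes also ignores layers shared between images, so the real space reclaimed may be smaller.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CachedImage:
    """Hypothetical record of one cached image in the registry."""
    name: str
    size_bytes: int
    last_pulled: datetime


def select_lru_evictions(images, target_bytes):
    """Pick images to delete, least recently pulled first.

    Stops as soon as the remaining total size is at or below target_bytes.
    """
    total = sum(img.size_bytes for img in images)
    evictions = []
    # Oldest last_pulled first: these are the least recently used images.
    for img in sorted(images, key=lambda i: i.last_pulled):
        if total <= target_bytes:
            break
        evictions.append(img)
        total -= img.size_bytes
    return evictions
```

In this sketch, `target_bytes` is the per-deployment tuning knob: a lower target deletes more images per run, and the frequency of runs is controlled by whatever scheduler invokes the function.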
Proposed Implementation
Deploy Harbor registries backed by local object storage. Harbor records last pull time metadata and has garbage collection rules sufficient to use it as a regular cache.
How will this fit in the ecosystem?
The cost of deploying new members to mybinder.org is lowered, and an example is set for other BinderHub instances that may want to follow suit.
Endorsements