Replies: 1 comment
In theory, I don't think this would be a bad optimization at all. However, there remains the question of the lifecycle of such a cache: without a way to release its content, the gRPC process will hold onto these caches forever. A garbage collector would be one solution, but keeping things in the cache for some fixed amount of time seems like a poor experience compared to having a clear notion of when the content has finished being pulled. If we were to do this, we would want a pretty aggressive garbage collector. For container tasks, memory is generally a very high priority. CPU or disk throttling will simply make the task run slower, but using too much memory will get the task OOM-killed, so we try very hard to be memory sensitive. If we can find a clear boundary for when we can get rid of cached items, I would be a lot more on board. But this is a crucial piece to solve, IMO.
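To make the "clear boundary" idea concrete, here is a minimal sketch of a digest-keyed cache where an entry is freed as soon as its last owner releases it, so nothing lingers waiting for a timer. Everything here (`refCountedCache`, `Acquire`, `Release`, the `fetch` callback) is hypothetical and not part of soci-snapshotter. It assumes each layer calls `Release` at some well-defined point, such as when the layer is fully pulled or unmounted.

```go
package main

import (
	"fmt"
	"sync"
)

// refCountedCache is a hypothetical digest-keyed span cache.
// None of these names exist in soci-snapshotter.
type refCountedCache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

type entry struct {
	data []byte
	refs int // number of layers currently relying on this span
}

func newRefCountedCache() *refCountedCache {
	return &refCountedCache{entries: make(map[string]*entry)}
}

// Acquire returns the span for digest and records one more owner.
// On a miss it calls fetch and stores the result with a single owner.
// For simplicity the lock is held during fetch; a real implementation
// would likely avoid blocking other digests while fetching.
func (c *refCountedCache) Acquire(digest string, fetch func() ([]byte, error)) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[digest]; ok {
		e.refs++
		return e.data, nil
	}
	data, err := fetch()
	if err != nil {
		return nil, err
	}
	c.entries[digest] = &entry{data: data, refs: 1}
	return data, nil
}

// Release drops one owner and frees the span once no owners remain.
func (c *refCountedCache) Release(digest string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[digest]
	if !ok {
		return
	}
	e.refs--
	if e.refs <= 0 {
		delete(c.entries, digest)
	}
}

// Len reports how many spans are currently held in memory.
func (c *refCountedCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.entries)
}

func main() {
	c := newRefCountedCache()
	fetch := func() ([]byte, error) { return []byte("span bytes"), nil }

	// Two layers share the same span digest; only the first call fetches.
	c.Acquire("sha256:example", fetch)
	c.Acquire("sha256:example", fetch)

	c.Release("sha256:example")
	fmt.Println("entries after first release:", c.Len())
	c.Release("sha256:example")
	fmt.Println("entries after last release:", c.Len())
}
```

The trade-off is that a span is dropped the moment nobody holds it, so a later pull of a different image sharing that span would have to fetch it again.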
I was taking a look at https://www.usenix.org/system/files/atc23-brooker.pdf and thought of an idea: instead of the current implementation, which has a span cache per layer, we could have a 'global' span cache that can be referenced across layers and across images.
So basically here -
soci-snapshotter/fs/span-manager/span_manager.go
Lines 269 to 274 in 01a89f7
instead of `spanID`, we would get it using `m.ztoc.SpanDigests[spanID]`, potentially skipping the fetch of a span whose digest already exists from another layer or image. Once a span is fetched, we can put it in the cache, like here -
soci-snapshotter/fs/span-manager/span_manager.go
Lines 362 to 365 in 01a89f7
The other complexity might be how to delete spans from this global cache, as a span could potentially be owned by multiple layers or images. Reference counting might help here, or we could have something like an LRU cache with a size cap.
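As a rough illustration of the LRU option, here is a minimal sketch of a digest-keyed cache with a byte cap, built on the standard library's `container/list`. All names (`lruSpanCache`, `Get`, `Put`, `maxBytes`) are made up for this example and are not soci-snapshotter APIs. It is not safe for concurrent use as written and would need a mutex in practice.

```go
package main

import (
	"container/list"
	"fmt"
)

// lruSpanCache is a hypothetical global span cache keyed by span digest.
// It evicts least recently used spans once the byte cap is exceeded.
type lruSpanCache struct {
	maxBytes int
	curBytes int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // digest -> element in order
}

type lruItem struct {
	digest string
	data   []byte
}

func newLRUSpanCache(maxBytes int) *lruSpanCache {
	return &lruSpanCache{
		maxBytes: maxBytes,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the span for digest and marks it as recently used.
func (c *lruSpanCache) Get(digest string) ([]byte, bool) {
	el, ok := c.items[digest]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*lruItem).data, true
}

// Put stores a span and evicts old spans until the cache fits the cap.
func (c *lruSpanCache) Put(digest string, data []byte) {
	if el, ok := c.items[digest]; ok {
		// Content-addressed: the same digest implies the same bytes.
		c.order.MoveToFront(el)
		return
	}
	if len(data) > c.maxBytes {
		return // a span larger than the whole cap is never cached
	}
	c.items[digest] = c.order.PushFront(&lruItem{digest: digest, data: data})
	c.curBytes += len(data)
	for c.curBytes > c.maxBytes {
		oldest := c.order.Remove(c.order.Back()).(*lruItem)
		delete(c.items, oldest.digest)
		c.curBytes -= len(oldest.data)
	}
}

func main() {
	cache := newLRUSpanCache(8) // tiny cap, just for demonstration

	cache.Put("sha256:a", []byte("aaaa"))
	cache.Put("sha256:b", []byte("bbbb"))
	cache.Get("sha256:a")                 // touch a, so b becomes the oldest
	cache.Put("sha256:c", []byte("cccc")) // exceeds the cap, evicts b

	_, hasA := cache.Get("sha256:a")
	_, hasB := cache.Get("sha256:b")
	fmt.Println("a cached:", hasA, "b cached:", hasB)
}
```

With a size cap, ownership across layers and images stops mattering for deletion: a span is dropped only because of memory pressure, and an evicted span is simply fetched again on the next miss.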
Since SHA256 is the industry standard for digests, we should be fine on the security aspect of sharing the same file across images.
This would definitely help make soci pulls faster: spans shared between layers, and also spans shared with other images, won't need to be fetched again.
Not sure if this idea was explored earlier but would love your opinion on this 😄