Store container image ref and on-disk size in ClickHouse #11920
Open
dan-stowell wants to merge 10 commits into master
Force-pushed from e09bb34 to b5fd10c
vanja-p (Contributor) approved these changes on Apr 21, 2026
This lgtm, but I think Brandon should take a look as well before merging.
bduffany reviewed on Apr 21, 2026
Add container_image and container_image_size_bytes fields to track the
container image used for each execution and its estimated on-disk size
after pulling/extracting.
Data flow:
- Proto: Add fields to ExecutionAuxiliaryMetadata (executor->server) and
  StoredExecution (server->ClickHouse buffer)
- Runner interface: Add ContainerImageInfo() method
- Container interface: Add optional ImageSizer interface with
  ImageSizeBytes() method
- Implementations:
  - OCI: sum of EstimatedDiskUsageBytes across all layers
  - Firecracker: stat the cached ext4 disk image file
  - Docker: ImageInspect API call for image Size
  - Podman: runPodman image inspect --format {{.Size}}
- Executor: capture image ref + size after PrepareForTask, set on
  auxiliary metadata
- Execution server: read from aux metadata, write to StoredExecution
- ClickHouse schema: add ContainerImage and ContainerImageSizeBytes
  columns to Executions table
This enables querying for executions using large container images
(e.g. >6GB yellow zone, >10GB red zone) to help customers optimize
their image sizes.
Co-authored-by: Shelley <shelley@exe.dev>
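As an illustration of the yellow/red zones mentioned in the description, here is a minimal Go sketch that classifies an image size. The helper `sizeZone` is hypothetical and not part of this PR, and treating "GB" as decimal gigabytes (10^9 bytes) is an assumption.

```go
package main

import "fmt"

// Thresholds taken from the PR description. Interpreting "GB" as
// decimal gigabytes is an assumption made for this sketch.
const (
	gb              = int64(1_000_000_000)
	yellowThreshold = 6 * gb
	redThreshold    = 10 * gb
)

// sizeZone is a hypothetical helper (not part of the PR) that maps an
// on-disk image size in bytes to the zone names used in the description.
func sizeZone(sizeBytes int64) string {
	switch {
	case sizeBytes > redThreshold:
		return "red"
	case sizeBytes > yellowThreshold:
		return "yellow"
	default:
		return "ok"
	}
}

func main() {
	for _, size := range []int64{2 * gb, 7 * gb, 12 * gb} {
		fmt.Printf("%d bytes -> %s\n", size, sizeZone(size))
	}
}
```

In practice the same comparison would presumably be expressed as a filter on the ContainerImageSizeBytes column rather than in application code.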
Add assertions across three test files to verify the container image
metadata plumbing:
- container_test: Test TracedCommandContainer.ImageSizeBytes() delegation
  with and without an ImageSizer delegate.
- runner_test: Test ContainerImageInfo() on taskRunner, verifying image
  size is reported when the container implements ImageSizer, and zero
  when it doesn't.
- executor_test: End-to-end test that container image size flows through
  ExecuteTaskAndStreamResults into ExecutionAuxiliaryMetadata.
Also extend rbetest.TestRunnerOverrides to accept runner.PoolOptions,
enabling tests to inject custom ContainerProviders.
Co-authored-by: Shelley <shelley@exe.dev>
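The delegation behavior these tests exercise can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: the types `tracedContainer`, `ociContainer`, and `bareContainer`, the `Name()` method, and the exact `ImageSizeBytes()` signature are assumptions made for illustration.

```go
package main

import "fmt"

// CommandContainer is a stand-in for the real interface, reduced to a
// single placeholder method for this sketch.
type CommandContainer interface {
	Name() string
}

// ImageSizer is the optional interface described in the PR. The
// signature shown here is an assumption.
type ImageSizer interface {
	ImageSizeBytes() (int64, error)
}

// tracedContainer is a hypothetical wrapper that forwards to its
// delegate only when the delegate implements ImageSizer.
type tracedContainer struct {
	Delegate CommandContainer
}

func (t *tracedContainer) ImageSizeBytes() (int64, error) {
	if sizer, ok := t.Delegate.(ImageSizer); ok {
		return sizer.ImageSizeBytes()
	}
	return 0, nil // size unknown: report zero
}

// ociContainer sums per-layer disk usage estimates, mirroring the
// "sum across all layers" approach in the description.
type ociContainer struct {
	layerSizes []int64
}

func (c *ociContainer) Name() string { return "oci" }

func (c *ociContainer) ImageSizeBytes() (int64, error) {
	var total int64
	for _, size := range c.layerSizes {
		total += size
	}
	return total, nil
}

// bareContainer deliberately does not implement ImageSizer.
type bareContainer struct{}

func (bareContainer) Name() string { return "bare" }

func main() {
	withSizer := &tracedContainer{Delegate: &ociContainer{layerSizes: []int64{100, 250, 50}}}
	withoutSizer := &tracedContainer{Delegate: bareContainer{}}

	a, _ := withSizer.ImageSizeBytes()
	b, _ := withoutSizer.ImageSizeBytes()
	fmt.Println(a, b)
}
```

Keeping the type assertion inside the wrapper means callers such as the runner never need to reach through to the delegate.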
- runner.go: Call r.Container.ImageSizeBytes() instead of reaching
  through to r.Container.Delegate for the ImageSizer type assertion.
  TracedCommandContainer already handles this delegation.
- docker.go, firecracker.go: Add comments explaining the intentional use
  of context.TODO() since ImageSizer interface doesn't take a ctx.
Co-authored-by: Shelley <shelley@exe.dev>
Thread context through the full ImageSizeBytes call chain:
executor.go -> Runner.ContainerImageInfo(ctx) ->
TracedCommandContainer.ImageSizeBytes(ctx) ->
ImageSizer.ImageSizeBytes(ctx)
This eliminates three context.TODO() calls in production code (docker,
podman, firecracker) by plumbing the execution context from the executor
down to each container implementation.
Co-authored-by: Shelley <shelley@exe.dev>
…mmandContainer
- Rename proto fields container_image -> container_image_ref in both
  ExecutionAuxiliaryMetadata and StoredExecution, and update all Go
  references and ClickHouse schema accordingly.
- Remove the separate ImageSizer interface and add ImageSizeBytes(ctx)
  directly to the CommandContainer interface. Add trivial return-0
  implementations to bare and sandbox containers.
Co-authored-by: Shelley <shelley@exe.dev>
Force-pushed from 3531d4c to 786066c
Contributor, Author
@luluz66 has asked that I hold off on landing this change until the Executions table is migrated.