feat: add cadvisor as an opt-in additional service #1426
Open
barnabasbusa wants to merge 4 commits into
Opt-in via additional_services: [cadvisor], modeled on disruptoor. Runs privileged with bind mounts for the host docker socket, /sys and /var/lib/docker, and is guarded to the Docker backend. Exposes /metrics on :8080 and is auto-scraped by Prometheus when prometheus_grafana is enabled.
…ist) Kurtosis only permits /var/run/docker.sock as a bind-mount host path; mounting /var/run, /sys, /var/lib/docker is rejected at validation. Mount just the socket and add host_pid_namespace, matching disruptoor.
Bundled in the always-provisioned dashboards dir, so it loads whenever grafana runs. Panels query cadvisor metrics (CPU/memory/network/filesystem per container), so they only populate when the cadvisor service is enabled.
Requires kurtosis-tech/kurtosis#3152 (host_cgroup_namespace) and a Kurtosis release that includes it. Without the host cgroup namespace cAdvisor only sees the root cgroup; with it /sys/fs/cgroup reflects the full host hierarchy so it can report per-container metrics.
Summary
Closes #764.
Adds cAdvisor as an opt-in additional service for per-container resource metrics, plus a bundled Grafana dashboard. Modeled on how `disruptoor` is handled.

Important
Blocked on upstream Kurtosis. Per-container metrics require cAdvisor to run in the host cgroup namespace, which Kurtosis does not expose yet. This PR uses a new `host_cgroup_namespace` `ServiceConfig` field added in kurtosis-tech/kurtosis#3152. It will only work once that PR merges and a Kurtosis release containing it is out; until then, running `cadvisor` will fail interpretation with an unknown-argument error. Do not merge before bumping the minimum Kurtosis version to the release that includes #3152.

Why the cgroup namespace is needed
Without it, cAdvisor's container runs in Docker's default private cgroup namespace, so `/sys/fs/cgroup` only shows its own subtree: cAdvisor reports just the root cgroup, not per-container metrics (verified live: it only resolved `id="/"`). Kurtosis restricts `bind_mounts` to `/var/run/docker.sock`, so mounting the host's `/sys/fs/cgroup` isn't an option. The fix is `host_cgroup_namespace=True` (Docker `--cgroupns=host`), which makes `/sys/fs/cgroup` reflect the full host hierarchy so cAdvisor can see sibling containers' cgroups.

How it works
- Opt-in via `additional_services: [cadvisor]`; off by default.
- Runs with `privileged: True`, `host_pid_namespace: True`, `host_cgroup_namespace: True`, and bind mounts `/var/run/docker.sock` (the same `bind_mounts` mechanism `disruptoor` uses).
- A metrics job (`/metrics` on `:8080`) is appended to `prometheus_additional_metrics_jobs`, so cadvisor is auto-scraped when `prometheus_grafana` is enabled.
- `cadvisor-dashboard.json` is added to the always-provisioned dashboards dir; its panels only query cadvisor metrics, so they're empty unless the service runs.
- Configurable via `cadvisor_params` (image + cpu/mem limits).

Files
- `src/cadvisor/cadvisor_launcher.star`: new launcher.
- `main.star`: import, Docker-backend guard, and the `additional_services` branch.
- `src/package_io/constants.star`: `DEFAULT_CADVISOR_IMAGE` (`gcr.io/cadvisor/cadvisor:v0.52.1`).
- `src/package_io/input_parser.star`: `cadvisor_params` defaults/struct/merge.
- `src/package_io/sanity_check.star`: `cadvisor` service + `cadvisor_params` allowlist.
- `static_files/grafana-config/dashboards/cadvisor-dashboard.json`: Grafana dashboard.
- `.github/tests/cadvisor.yaml`: standalone example (kept out of the per-PR matrix because it needs `--privileged`, same as `disruptoor.yaml`).
- `README.md`: docs for the service and `cadvisor_params`.

Usage
kurtosis run --enclave cadvisor . --args-file .github/tests/cadvisor.yaml --privileged

Tested
Ran with `--privileged` locally: cadvisor RUNNING and scraped (`up{service="cadvisor"}=1`), Grafana provisions the `cAdvisor` dashboard (10 panels). Per-container resolution depends on the host cgroup namespace landing upstream (#3152).
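
Once the upstream change lands, per-container resolution could be checked by looking at the `id` labels cAdvisor exposes on `/metrics`. The following is a minimal sketch and not part of this PR: the helper names (`cgroup_ids`, `sees_per_container_cgroups`) and the sample metric lines are hypothetical, and the sample text is illustrative rather than real scrape output. It assumes that a private cgroup namespace shows up as `id="/"` being the only value, as described above.

```python
import re

# Matches the id="..." label on a Prometheus sample line.
_ID_LABEL = re.compile(r'\bid="((?:[^"\\]|\\.)*)"')


def cgroup_ids(metrics_text):
    """Collect the distinct `id` label values from container_* sample lines."""
    ids = set()
    for line in metrics_text.splitlines():
        # Skip "# HELP" / "# TYPE" comments and any non-container series.
        if not line.startswith("container_"):
            continue
        match = _ID_LABEL.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def sees_per_container_cgroups(metrics_text):
    """Return True if any cgroup other than the root ("/") is reported."""
    return any(cid != "/" for cid in cgroup_ids(metrics_text))


# Illustrative samples only (hypothetical values, not captured from a run).
ROOT_ONLY = '''# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{cpu="total",id="/"} 12.5
'''
WITH_CONTAINERS = ROOT_ONLY + (
    'container_cpu_usage_seconds_total'
    '{cpu="total",id="/system.slice/docker-abc123.scope"} 3.1\n'
)

print(sees_per_container_cgroups(ROOT_ONLY))        # False: only the root cgroup
print(sees_per_container_cgroups(WITH_CONTAINERS))  # True: a sibling cgroup is visible
```

Against a live enclave, the input text would come from the cadvisor service's `/metrics` endpoint on port 8080; the host-side port that Kurtosis maps it to varies per run.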