Add integrated scylla-monitoring stack support by dkropachev · Pull Request #724 · scylladb/scylla-ccm

dkropachev · 2026-02-12T14:43:37Z

Summary

Add first-class monitoring integration to CCM: a Prometheus + Grafana + Alertmanager
stack that runs alongside any Scylla cluster as Docker containers with --net=host.

Two modes of operation:

Automatic (--monitoring flag or CCM_MONITORING=1 env var) — monitoring
starts with the cluster and Prometheus scrape targets are kept in sync on every
topology change (add, remove, start, stop, decommission).
Manual (ccm monitoring start/stop/sync) — on-demand control, no automatic
target updates.
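As a rough illustration of what keeping scrape targets in sync could involve, the sketch below renders a node list into Prometheus's file-based service discovery format. The function name, the node dictionaries, and the choice to drop stopped nodes are assumptions for illustration, not taken from the PR; 9180 is Scylla's default Prometheus metrics port.

```python
import json


def build_scrape_targets(nodes, cluster_name, metrics_port=9180):
    """Return a Prometheus file_sd document for the running nodes.

    Hypothetical helper: the node shape and the policy of skipping
    stopped nodes are assumptions, not the PR's actual behaviour.
    """
    targets = [
        f"{node['address']}:{metrics_port}"
        for node in nodes
        if node["running"]
    ]
    # file_sd expects a list of target groups, each with optional labels.
    return [{"targets": sorted(targets), "labels": {"cluster": cluster_name}}]


nodes = [
    {"address": "127.0.0.1", "running": True},
    {"address": "127.0.0.2", "running": False},
    {"address": "127.0.0.3", "running": True},
]
print(json.dumps(build_scrape_targets(nodes, "demo"), indent=2))
```

In automatic mode, a function like this would be re-run after each topology change; in manual mode it would run only on an explicit sync.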

Key features:

Per-cluster port offsets (based on cluster ID) for running multiple monitored clusters
simultaneously
Auto-clones scylla-monitoring repo for real Grafana dashboards; falls back to a
built-in overview dashboard when unavailable
Atomic target file writes so Prometheus never reads partial state
Monitoring failures in automatic mode are logged as warnings and never block cluster
operations
Configuration persisted in cluster.conf and restored on load
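The atomic target file writes listed above are commonly done by writing to a temporary file in the same directory and renaming it over the destination. The following is a minimal sketch of that general technique, not the PR's code; the helper name and file names are hypothetical.

```python
import json
import os
import tempfile


def write_json_atomically(path, payload):
    """Write payload as JSON so readers see the old or new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one
    # filesystem. The ".tmp" suffix keeps it out of a "*.json" discovery glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".targets-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in a single step, overwriting any old copy.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target_file = os.path.join(workdir, "targets.json")  # hypothetical file name
    write_json_atomically(target_file, [{"targets": ["127.0.0.1:9180"]}])
    with open(target_file) as handle:
        print(handle.read())
```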

New CLI surface:

ccm create ... --monitoring [--monitoring-dir=PATH]
ccm monitoring start|stop|enable|disable|sync|status
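To make the shape of this CLI surface concrete, here is a small argparse sketch that accepts the same flags and subcommands and applies the CCM_MONITORING=1 fallback. It is only an illustration: ccm's real parser has many more options and may be built differently, and `build_parser` and `monitoring_requested` are hypothetical names.

```python
import argparse
import os

MONITORING_ACTIONS = ("start", "stop", "enable", "disable", "sync", "status")


def build_parser():
    """Build a toy parser that mirrors only the monitoring-related options."""
    parser = argparse.ArgumentParser(prog="ccm")
    commands = parser.add_subparsers(dest="command", required=True)

    create = commands.add_parser("create")
    create.add_argument("name")
    create.add_argument("--monitoring", action="store_true")
    create.add_argument("--monitoring-dir", metavar="PATH")

    monitoring = commands.add_parser("monitoring")
    monitoring.add_argument("action", choices=MONITORING_ACTIONS)
    return parser


def monitoring_requested(args, environ=os.environ):
    """True if the flag was passed or CCM_MONITORING is set to "1"."""
    flag = bool(getattr(args, "monitoring", False))
    return flag or environ.get("CCM_MONITORING") == "1"


args = build_parser().parse_args(["create", "demo", "--monitoring-dir=/tmp/mon"])
print(monitoring_requested(args, {"CCM_MONITORING": "1"}))
```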

Changed files

ccmlib/scylla_monitoring.py — new MonitoringStack class
ccmlib/cmds/cluster_cmds.py — ClusterMonitoringCmd + --monitoring flag on create
ccmlib/cluster.py — monitoring fields, _notify_topology_change() hook
ccmlib/scylla_cluster.py — auto-start/stop monitoring, CCM_MONITORING env var
ccmlib/scylla_node.py, ccmlib/node.py — topology change notifications
ccmlib/cluster_factory.py — restore monitoring settings on load
docs/monitoring.md — full reference documentation
README.md — monitoring section, environment variables table
tests/test_scylla_monitoring.py — unit tests for MonitoringStack
tests/test_monitoring_integration.py — integration tests for CLI and hooks
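The `_notify_topology_change()` hook, together with the rule that monitoring failures are only logged as warnings, might look roughly like the sketch below. Only the hook name comes from the file list above; the class, the `sync()` method on the monitoring object, and the logger name are assumptions.

```python
import logging

logger = logging.getLogger("ccm.monitoring")  # hypothetical logger name


class SketchCluster:
    """Toy stand-in for a cluster; not the real ccmlib Cluster class."""

    def __init__(self, monitoring=None):
        # Assumed: an object exposing sync(), or None when monitoring is off.
        self.monitoring = monitoring
        self.nodes = []

    def _notify_topology_change(self):
        if self.monitoring is None:
            return
        try:
            self.monitoring.sync()
        except Exception as exc:
            # A monitoring failure must not abort the cluster operation.
            logger.warning("monitoring sync failed: %s", exc)

    def add_node(self, name):
        self.nodes.append(name)
        self._notify_topology_change()
        return name


class BrokenMonitoring:
    def sync(self):
        raise RuntimeError("docker is not running")


cluster = SketchCluster(monitoring=BrokenMonitoring())
cluster.add_node("node1")  # logs a warning, but the node is still added
print(cluster.nodes)
```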

dkropachev · 2026-02-12T15:47:32Z

@fruch, this is now ready and tested. The only reason ccmlib/scylla_monitoring.py is so big is that the scripts from https://github.com/scylladb/scylla-monitoring cannot be reused here: CCM is a shared environment where many stacks may be running at once, so I had to switch to host networking. As a result, provisioning is now done manually, and the upstream scripts have no CLI to help with that.

Spin up a Prometheus + Grafana + Alertmanager stack alongside any CCM cluster using Docker containers with --net=host.

Automatic mode (--monitoring or CCM_MONITORING=1) keeps Prometheus scrape targets in sync on every topology change. Manual mode (ccm monitoring start/stop/sync) gives on-demand control. Multi-cluster setups are supported via port offsets based on cluster ID. When the scylla-monitoring repo is available, real dashboards are generated; otherwise a built-in fallback overview dashboard is used.

New CLI:

ccm create ... --monitoring [--monitoring-dir=PATH]
ccm monitoring start|stop|enable|disable|sync|status
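Because every stack shares the host network, each monitored cluster needs its own set of ports. The PR does not spell out the offset formula, so the sketch below is purely an assumed scheme: it starts from the upstream default ports (Prometheus 9090, Grafana 3000, Alertmanager 9093) and shifts them by a hypothetical stride of 10 per cluster ID, so that one cluster's Prometheus port can never land on another cluster's Alertmanager port.

```python
# Upstream default ports; the offset scheme below is an assumption.
DEFAULT_PORTS = {"prometheus": 9090, "grafana": 3000, "alertmanager": 9093}


def ports_for_cluster(cluster_id, stride=10):
    """Return a port map shifted by cluster_id * stride (hypothetical scheme)."""
    if cluster_id < 0:
        raise ValueError("cluster_id must be non-negative")
    return {
        service: base + cluster_id * stride
        for service, base in DEFAULT_PORTS.items()
    }


for cluster_id in range(3):
    print(cluster_id, ports_for_cluster(cluster_id))
```

A stride of 1 would not be enough here: cluster 3's Prometheus port (9093) would collide with cluster 0's Alertmanager port.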

github-code-quality bot found potential problems Feb 12, 2026

dkropachev mentioned this pull request Feb 12, 2026

Add integrated scylla-monitoring stack support #720

Closed

5 tasks

dkropachev marked this pull request as draft February 12, 2026 15:05

dkropachev force-pushed the dk/integrated-monitoring-stack branch 2 times, most recently from 9392bad to 1226411, on February 12, 2026 15:16

github-code-quality bot found potential problems Feb 12, 2026

tests/test_monitoring_integration.py: three findings, all marked Fixed

dkropachev force-pushed the dk/integrated-monitoring-stack branch from 1226411 to 5c80573 on February 12, 2026 15:30

dkropachev marked this pull request as ready for review February 12, 2026 15:47

dkropachev force-pushed the dk/integrated-monitoring-stack branch from 5c80573 to 4378b81 on February 19, 2026 13:37

dkropachev wants to merge 1 commit into master from dk/integrated-monitoring-stack
