A comprehensive monitoring solution using Prometheus, Thanos, Grafana, and Node Exporter. This stack provides metrics collection, long-term storage, and visualization capabilities.
- Prometheus: Metrics collection and storage
- Thanos: Long-term metrics storage and global query view
- Grafana: Metrics visualization and dashboarding
- Node Exporter: Host metrics collection
- Sigil: Custom Velocity metrics plugin (external)
- Docker
- Docker Compose
- At least 4GB of RAM
- Linux host (for Node Exporter)
- Sufficient disk space for metrics storage (recommend at least 50GB)
- Clone this repository:
  git clone https://github.com/Landfall-SMP/Landfall-Telemetry.git
  cd Landfall-Telemetry
- Create the data directories:
  mkdir -p data/prometheus data/grafana
- Set up environment variables (optional):
  export GF_SECURITY_ADMIN_PASSWORD=your_secure_password # Default: admin
- Start the stack:
  docker-compose up -d
The stack will automatically:
- Initialize data directories with correct permissions
- Start all services in the correct order
- Begin collecting metrics
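Once the containers are up, one quick way to confirm that each service answers is to probe its HTTP endpoint. The sketch below is not part of the repository: the function name `check_stack` is hypothetical, the ports are the ones listed in this README, and the health paths (`/-/healthy` for Prometheus and Thanos Query, `/api/health` for Grafana) are assumptions to verify against the versions you run.

```shell
# Sketch: probe each service once and report OK/FAIL per endpoint.
# Ports come from this README; the health paths are assumptions.
check_stack() {
  failed=0
  for url in \
    "http://localhost:9090/-/healthy" \
    "http://localhost:19090/-/healthy" \
    "http://localhost:2000/api/health" \
    "http://localhost:9100/metrics"; do
    # -f makes curl fail on HTTP errors; the response body is discarded.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_stack` a few seconds after `docker-compose up -d` should print one line per service; a non-zero exit status means at least one endpoint did not respond.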
This stack uses host filesystem storage for better performance and scalability:
- Prometheus data: ./data/prometheus
  - Contains TSDB data
  - Automatically configured with correct permissions (uid 65534)
  - Recommend monitoring disk usage and growth
- Grafana data: ./data/grafana
  - Contains dashboards, users, and plugins
  - Automatically configured with correct permissions (uid 472)
The prometheus-init service in the Docker Compose stack initializes the data directories and sets their permissions automatically.
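If a service fails to write to its data directory, it can help to confirm that the ownership matches the UIDs listed above. This is a minimal sketch, not something shipped with the stack: `check_owner` is a hypothetical helper, and it assumes GNU `stat`, which fits the Linux host this README requires.

```shell
# Sketch: report whether DIR is owned by the expected numeric uid.
# Assumes GNU stat (the -c flag); check_owner is a hypothetical helper.
check_owner() {
  dir=$1
  expected=$2
  actual=$(stat -c '%u' "$dir") || return 2
  if [ "$actual" = "$expected" ]; then
    echo "OK   $dir owned by uid $actual"
  else
    echo "WARN $dir owned by uid $actual, expected $expected"
    return 1
  fi
}
```

For example, `check_owner data/prometheus 65534` and `check_owner data/grafana 472` cover the two directories. If either reports a mismatch, restarting the stack so the init service runs again is the first thing to try.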
Benefits of filesystem storage:
- Better I/O performance
- Easier backup and restore
- Independent scaling of storage
- Persistence across container rebuilds
- Direct access for maintenance
- Grafana: http://localhost:2000
  - Default credentials:
    - Username: admin
    - Password: admin (or the value of GF_SECURITY_ADMIN_PASSWORD)
- Prometheus: http://localhost:9090
- Thanos Query: http://localhost:19090
- Node Exporter: http://localhost:9100/metrics
- Scrapes metrics from the configured targets
- Configured targets:
  - Prometheus itself
  - Node Exporter
  - Sigil (external service)
- Data stored in host filesystem: ./data/prometheus
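To see whether the configured targets are actually being scraped, the built-in `up` series can be queried through the Prometheus HTTP API: a value of 1 means the most recent scrape of that target succeeded. The helper below is a sketch; `prom_up` is a hypothetical name, and the default base URL assumes the port given in this README.

```shell
# Sketch: query the `up` series from a Prometheus-compatible HTTP API.
# prom_up is a hypothetical helper; the base URL defaults to the port in this README.
prom_up() {
  base=${1:-http://localhost:9090}
  # --get sends the url-encoded data as a query string instead of a POST body.
  curl -fsS --get --data-urlencode 'query=up' "$base/api/v1/query"
}
```

`prom_up` prints the raw JSON response. Since Thanos Query serves a Prometheus-compatible query API, `prom_up http://localhost:19090` should return the same series through the global view, assuming the Query component is reachable on that port.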
Consists of two components:
- Sidecar:
  - Connects to Prometheus
  - Handles long-term storage using the object storage configured in thanos/bucket.yaml
  - Exposes StoreAPI
  - Ports: 19191 (HTTP), 10901 (gRPC)
- Query:
  - Global view of metrics
  - Deduplicates metrics from Prometheus/sidecar
  - Ports: 19090 (HTTP), 19091 (gRPC)
- Collects host metrics
- Access to host filesystem through volume mounts
- Exposes metrics on port 9100
- Modern visualization platform
- Pre-configured with Prometheus/Thanos datasources
- Runs on port 2000
- Data stored in host filesystem: ./data/grafana
All services are connected through a dedicated Docker network named monitoring. The stack uses:
- Internal service discovery (Docker DNS)
- Host machine access via host.docker.internal
- Bridge network for container communication
The stack uses host filesystem storage for better performance and scalability:
- ./data/prometheus/: Prometheus TSDB storage
  - Contains all metrics data
  - Accessed by both Prometheus and Thanos Sidecar
  - Initialized with UID 65534 (nobody user)
- ./data/grafana/: Grafana storage
  - Contains dashboards, users, and plugins
  - Initialized with UID 472 (grafana user)
- prometheus/prometheus.yml: Prometheus scrape configuration
- thanos/bucket.yaml: Thanos object storage configuration
- grafana/: Grafana provisioning and configuration files
Monitor disk usage of data directories:
du -sh data/prometheus
du -sh data/grafana
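For unattended monitoring, the same `du` measurement can be compared against a limit so that a cron job or script can react. This is a sketch only: `check_dir_size` is a hypothetical helper and the limit is whatever suits your disk, not a value taken from the repository.

```shell
# Sketch: warn when a directory grows past a limit given in MiB.
# check_dir_size is a hypothetical helper; pick the limit for your own disk.
check_dir_size() {
  dir=$1
  limit_mib=$2
  # du -sk prints the size in KiB followed by a tab and the path.
  used_kib=$(du -sk "$dir" | cut -f1)
  [ -n "$used_kib" ] || return 2
  used_mib=$((used_kib / 1024))
  if [ "$used_mib" -gt "$limit_mib" ]; then
    echo "WARN $dir uses ${used_mib} MiB (limit ${limit_mib} MiB)"
    return 1
  fi
  echo "OK   $dir uses ${used_mib} MiB (limit ${limit_mib} MiB)"
}
```

For instance, `check_dir_size data/prometheus 40960` would warn once the Prometheus data passes 40 GiB, which leaves some headroom under the 50GB recommended earlier.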
Back up the data directories:
# Stop the stack first
docker-compose down
# Backup data directories
tar czf backup-$(date +%Y%m%d).tar.gz data/
# Restart the stack
docker-compose up -d
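A backup is only useful if the archive can be read back, so it may be worth listing the archive right after creating it. The sketch below wraps the `tar` step from the commands above and adds that check; `backup_data` and its two arguments are hypothetical, and it assumes the stack has already been stopped.

```shell
# Sketch: archive a data directory and verify that the archive is readable.
# backup_data is a hypothetical helper; run it only while the stack is stopped.
backup_data() {
  src=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  archive="$dest_dir/backup-$(date +%Y%m%d).tar.gz"
  # -C keeps paths in the archive relative (data/...), as in the manual command.
  tar czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  # Listing the archive fails if it is truncated or corrupt.
  tar tzf "$archive" >/dev/null || return 1
  echo "$archive"
}
```

Calling `backup_data ./data ./backups` between `docker-compose down` and `docker-compose up -d` prints the path of the verified archive.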
- Prometheus not scraping:
  docker logs prometheus
- Node Exporter issues:
  docker logs node-exporter
- Thanos connectivity:
  docker logs thanos-sidecar
  docker logs thanos-query
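When it is not obvious which service is at fault, it can be quicker to look at the most recent log lines of all of them at once. The following sketch assumes the container names used in the commands above; `tail_logs` is a hypothetical helper.

```shell
# Sketch: print the last N log lines of each container in the stack.
# tail_logs is a hypothetical helper; container names are taken from this README.
tail_logs() {
  lines=${1:-50}
  for name in prometheus node-exporter thanos-sidecar thanos-query; do
    echo "== $name =="
    # Container logs may go to stderr, so merge both streams.
    docker logs --tail "$lines" "$name" 2>&1
  done
}
```

`tail_logs 20` shows the last 20 lines per container; piping the output through `grep -i error` narrows it further.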
- All ports are exposed on localhost only
- Change the Grafana password from the default (admin), for example by setting GF_SECURITY_ADMIN_PASSWORD before starting the stack
- Prometheus admin API is enabled for Thanos integration
- Volume permissions are handled by an init container
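The localhost-only claim can be spot-checked on the host by looking at the listening sockets. The filter below is a rough sketch: `flag_public_listeners` is a hypothetical helper, it expects `ss -ltnH` style output on stdin (local address in the fourth column), and it only knows the ports listed in this README. Depending on how Docker publishes ports on your system, some listeners may not appear in `ss` at all, so treat an empty result as a hint rather than proof.

```shell
# Sketch: flag stack ports that listen on anything other than loopback.
# Reads `ss -ltnH` style lines on stdin; flag_public_listeners is a hypothetical helper.
flag_public_listeners() {
  awk '{
    addr = $4                                # local address:port column
    port = addr; sub(/.*:/, "", port)        # text after the last colon
    host = addr; sub(/:[0-9]+$/, "", host)   # text before the port
    if (port ~ /^(2000|9090|9100|10901|19090|19091|19191)$/ &&
        host != "127.0.0.1" && host != "[::1]")
      print "exposed: " addr
  }'
}
```

Typical usage is `ss -ltnH | flag_public_listeners`; no output means none of the listed ports were found on a non-loopback address.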