influxdb3-ref-network-telemetry

Reference architecture: InfluxDB 3 Enterprise multi-node cluster monitoring a data-center fabric.

5-node cluster (2 ingest + 1 query + 1 compact + 1 process,query), 8×16 Clos topology with ~1024 interfaces, 128 BGP sessions, ~5k flow records/sec — total ~10k pts/sec — runnable in three minutes via docker compose. Two schedule plugins on the process node detect anomalies and roll up fabric health every 5 seconds (writing back through an ingest node via httpx). Two request plugins on the query node serve top-N talkers and source-IP drill-downs to the browser directly. The dashboard demonstrates all three patterns for feeding a panel — SQL via FastAPI, SQL from browser via DVC, and request-plugin from browser — each with its own latency badge.

[architecture diagram]

Quickstart

git clone https://github.com/influxdata/influxdb3-ref-network-telemetry.git
cd influxdb3-ref-network-telemetry
make up   # prompts for INFLUXDB3_ENTERPRISE_EMAIL on first run
# Click the validation link in the email
open http://localhost:8080
make scenario name=congestion_hotspot
make scenario name=east_west_burst

Or run the full scripted demo:

make demo

What's in this repo

| Path | Purpose |
| --- | --- |
| `simulator/` | Python simulator generating fabric telemetry; round-robins writes across the two ingest nodes |
| `plugins/` | Four Processing Engine plugins (Python) plus `_writeback.py`, a shared httpx writer for cross-node write-back |
| `ui/` | FastAPI + HTMX + uPlot dashboard with three teaching patterns |
| `influxdb/init.sh` | Bootstraps the database, 6 tables (via the configure API), an LVC, 2 DVCs, and 4 triggers |
| `docker-compose.yml` | 10-service stack: token-bootstrap + 5 InfluxDB nodes + init + simulator + ui + scenarios |
| `Makefile` | `up` / `down` / `clean` / `scenario` / `cli` / `test` / `demo` |
| `tests/` | Three tiers: unit (no Docker) / scenario (testcontainers) / smoke (full 5-node stack) |
| `ARCHITECTURE.md` | Multi-node compose, plugin write-back path, schema rationale, scaling notes |
| `SCENARIOS.md` | Per-scenario walkthroughs |
| `CLI_EXAMPLES.md` | Curated `influxdb3` CLI commands and curl snippets |

Headline Enterprise features

⬢ Multi-node split — ingest / query / compact / process

Five-node cluster with each role on its own service in compose. The process node runs schedule plugins; the query node hosts request plugins; ingest is split across two nodes. The shared influxdb-data named volume across all five gives the cluster catalog and object-store consistency without explicit coordination.

⬢ Two schedule plugins on the process node — every:5s

Both schedule_fabric_health and schedule_anomaly_detector run every 5 seconds via the every: schedule format (this repo's preferred syntax for short intervals). Each plugin reads via influxdb3_local.query() (local), then writes back via httpx to an ingest node — see plugins/_writeback.py for the round-robin pattern.
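The write-back path can be sketched as follows. The host names, port, and database name are illustrative placeholders; the actual endpoints and logic live in `plugins/_writeback.py`:

```python
import itertools

# Hypothetical ingest endpoints -- the real list lives in plugins/_writeback.py.
INGEST_HOSTS = ["http://ingest-1:8181", "http://ingest-2:8181"]
_targets = itertools.cycle(INGEST_HOSTS)


def next_target() -> str:
    """Return the next ingest node, alternating between the two."""
    return next(_targets)


def write_lines(db: str, lines: list[str], token: str) -> None:
    """POST line-protocol rows to the next ingest node's v3 write endpoint."""
    import httpx  # deferred import: only needed when actually writing

    resp = httpx.post(
        f"{next_target()}/api/v3/write_lp",
        params={"db": db},
        content="\n".join(lines),
        headers={"Authorization": f"Bearer {token}"},
        timeout=5.0,
    )
    resp.raise_for_status()
```

Round-robin keeps the two ingest nodes evenly loaded without any shared state beyond the iterator.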

⬢ Two request plugins on the query node — direct browser fetch

request_top_talkers and request_src_ip_detail are called by the browser directly, with "served by Processing Engine: N ms" latency badges on each panel.
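A request plugin is an ordinary Python function invoked once per HTTP call. The sketch below follows the Processing Engine request-plugin calling convention as we understand it; the query text, column names, and response shape are illustrative, not the actual `request_top_talkers` implementation:

```python
import json
import time


def process_request(influxdb3_local, query_parameters, request_headers,
                    request_body, args=None):
    """Hypothetical top-N talkers handler (query and column names are made up)."""
    start = time.monotonic()
    n = int(query_parameters.get("n", "10"))
    rows = influxdb3_local.query(
        "SELECT src_ip, SUM(bytes) AS total_bytes FROM flow_records "
        "WHERE time > now() - INTERVAL '5 minutes' "
        f"GROUP BY src_ip ORDER BY total_bytes DESC LIMIT {n}"
    )
    elapsed_ms = round((time.monotonic() - start) * 1000, 1)
    # A field like this one drives the per-panel latency badge.
    return json.dumps({"talkers": rows, "plugin_ms": elapsed_ms})
```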

⬢ Source-IP typeahead — DVC SQL from the browser

The search box runs SELECT src_ip FROM distinct_cache('flow_records', 'src_ip_distinct') WHERE src_ip LIKE '...' LIMIT 20 directly from JavaScript against /api/v3/query_sql, with a sub-millisecond latency badge. No Python wrapper between the browser and the cache.
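Assuming the standard `/api/v3/query_sql` JSON body (`db`, `q`, `format`), the browser's fetch is equivalent to this Python sketch; the database name and token are placeholders:

```python
import json
import urllib.request


def build_typeahead_sql(prefix: str) -> str:
    """Build the DVC typeahead query for a given src_ip prefix."""
    return (
        "SELECT src_ip FROM distinct_cache('flow_records', 'src_ip_distinct') "
        f"WHERE src_ip LIKE '{prefix}%' LIMIT 20"
    )


def typeahead(prefix: str, host: str = "http://localhost:8181",
              db: str = "telemetry", token: str = "TOKEN") -> list[str]:
    """Run the typeahead query against /api/v3/query_sql (db/token are placeholders)."""
    req = urllib.request.Request(
        f"{host}/api/v3/query_sql",
        data=json.dumps({"db": db, "q": build_typeahead_sql(prefix),
                         "format": "json"}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return [row["src_ip"] for row in json.loads(resp.read())]
```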

⬢ Per-table retention on fabric_health — 24 hours

Set at table-create time via the configure API. The schedule_fabric_health plugin writes ~17k rows/day at the 5s cadence; retention automatically drops anything older. This is the only repo in the portfolio that demonstrates per-table retention.
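What init.sh does at table-create time can be sketched roughly as below. The endpoint path, the `retention_period` key, and the tag/field names are assumptions inferred from the repo description; `influxdb/init.sh` holds the authoritative call:

```python
import json
import urllib.request


def retention_table_body() -> dict:
    """Request body for creating fabric_health with 24h retention.
    Tag/field names and the 'retention_period' key are illustrative assumptions."""
    return {
        "db": "telemetry",
        "table": "fabric_health",
        "tags": ["pod", "switch"],
        "fields": [{"name": "health_score", "type": "float64"}],
        "retention_period": "24h",
    }


def create_table(host: str = "http://localhost:8181", token: str = "TOKEN") -> None:
    """POST the table definition to the configure API (path is an assumption)."""
    req = urllib.request.Request(
        f"{host}/api/v3/configure/table",
        data=json.dumps(retention_table_body()).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```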

⚡ Three teaching patterns side-by-side. Watch the latency badges. Direct SQL via FastAPI runs at backend speed (10-50 ms typical, network-hop bound). Direct SQL from the browser via the DVC runs at cache speed (1-5 ms). Request plugins run at plugin speed (10-100 ms depending on payload composition). Pick the right pattern for the job — see ARCHITECTURE.md § "Three patterns for feeding a UI panel".

Running the tests

make test            # tier 1 + tier 2 (skip smoke)
make test-unit       # tier 1 only (no Docker)
make test-scenarios  # tier 2 (testcontainers; multi-node)
make test-smoke      # tier 3 (real 5-node stack; ~5 min)

Scaling to production

The 5-node compose is the smallest viable shape for the multi-node split. For larger deployments, see ARCHITECTURE.md § "Scaling to production" and the other portfolio repos.

License

Apache 2.0 — see LICENSE.
