spiceai
diff --git a/METRICS.md b/METRICS.md
Lines changed: 20 additions & 22 deletions
diff --git a/README.md b/README.md
Lines changed: 22 additions & 21 deletions
@@ -13,24 +13,25 @@ SpiceBench (OTel instruments)
 
 ## Metric Checklist
 
-| #   | Metric                               | OTel Instrument                                                                                           | Source                                                                                           | Emitted to telemetry     | Status          |
-| --- | ------------------------------------ | --------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ | ------------------------ | --------------- |
-| 1   | **Data Size** (total bytes ingested) | `ingestion_bytes_total` (Gauge\<u64\>)                                                                    | SUT adapter `metrics` → `ingestion.bytes_ingested`                                               | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 2   | **Ingestion records/s**              | `ingestion_rows_per_sec` (Gauge\<f64\>)                                                                   | SUT adapter `metrics` → `ingestion.rows_per_sec`                                                 | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 3   | **Ingestion rows total**             | `ingestion_rows_total` (Gauge\<u64\>)                                                                     | SUT adapter `metrics` → `ingestion.rows_ingested`                                                | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 4   | **Connections / Clients**            | `active_connections` (Gauge\<u64\>)                                                                       | CLI `--concurrency` + SUT adapter `metrics` → `ingestion.active_connections`                     | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 5   | **Queries/s, Requests/s**            | `queries_per_sec` (Gauge\<f64\>), `queries_total` (Counter\<u64\>)                                        | Computed from total iterations / test duration                                                   | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 6   | **Query Latency (p50)**              | `median_duration_ms` (Gauge\<u64\>)                                                                       | Query driver per-query statistics                                                                | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 7   | **Query Latency (p99)**              | `p99_duration_ms` (Gauge\<u64\>)                                                                          | Query driver per-query statistics                                                                | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 8   | **Efficiency (cores)**               | `efficiency_queries_per_core` (Gauge\<f64\>)                                                              | Computed: `queries_per_sec / cpu_cores`                                                          | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 9   | **Resource Usage – CPU**             | `sut_cpu_usage_percent` (Gauge\<f64\>)                                                                    | SUT adapter `metrics` → `resource.cpu_usage_percent`                                             | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 10  | **Resource Usage – Memory**          | `peak_memory_usage_mb` / `median_memory_usage_mb` (Gauge\<f64\>), `sut_memory_usage_bytes` (Gauge\<u64\>) | Local process via `sysinfo` + SUT adapter `metrics`                                              | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 11  | **Resource Usage – Disk**            | `sut_disk_read_bytes` / `sut_disk_write_bytes` (Gauge\<u64\>)                                             | SUT adapter `metrics` → `resource.disk_read_bytes` / `disk_write_bytes`                          | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 12  | **Resource Usage – IOPS**            | `sut_disk_read_iops` / `sut_disk_write_iops` (Gauge\<u64\>)                                               | SUT adapter `metrics` → `resource.disk_read_iops` / `disk_write_iops`                            | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 13  | **E2E Latency**                      | `e2e_latency_ms` (Histogram\<f64\>)                                                                       | **Instrument defined; not yet recorded** — requires timestamped events + query-back verification | ⚠️ Instrument only        | 🔲 Not yet wired |
-| 14  | **E2E Duration**                     | `test_duration_ms` (Gauge\<u64\>)                                                                         | Wall-clock time of benchmark phase                                                               | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 15  | **Query Queue Length**               | `query_queue_length` (Gauge\<u64\>)                                                                       | Query worker queue depth at query execution start (attributes: `query_name`, `client_id`)        | ✅ via `Telemetry.emit()` | ✅ Implemented   |
-| 16  | **Query Queue Duration**             | `query_queue_duration_ms` (Histogram\<f64\>)                                                              | Query worker queue wait time before execution (attributes: `query_name`, `client_id`)            | ✅ via `Telemetry.emit()` | ✅ Implemented   |
+| #   | Metric                               | OTel Instrument                                                                                           | Source                                                                                                    | Emitted to telemetry     | Status        |
+| --- | ------------------------------------ | --------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- | ------------------------ | ------------- |
+| 1   | **Data Size** (total bytes ingested) | `ingestion_bytes_total` (Gauge\<u64\>)                                                                    | SUT adapter `metrics` → `ingestion.bytes_ingested`                                                        | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 2   | **Ingestion records/s**              | `ingestion_rows_per_sec` (Gauge\<f64\>)                                                                   | SUT adapter `metrics` → `ingestion.rows_per_sec`                                                          | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 3   | **Ingestion rows total**             | `ingestion_rows_total` (Gauge\<u64\>)                                                                     | SUT adapter `metrics` → `ingestion.rows_ingested`                                                         | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 4   | **Connections / Clients**            | `active_connections` (Gauge\<u64\>)                                                                       | CLI `--concurrency` + SUT adapter `metrics` → `ingestion.active_connections`                              | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 5   | **Queries/s, Requests/s**            | `queries_per_sec` (Gauge\<f64\>), `queries_total` (Counter\<u64\>)                                        | Computed from total iterations / test duration                                                            | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 6   | **Query Latency (p50)**              | `median_duration_ms` (Gauge\<u64\>)                                                                       | Query driver per-query statistics                                                                         | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 7   | **Query Latency (p99)**              | `p99_duration_ms` (Gauge\<u64\>)                                                                          | Query driver per-query statistics                                                                         | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 8   | **Efficiency (cores)**               | `efficiency_queries_per_core` (Gauge\<f64\>)                                                              | Computed: `queries_per_sec / cpu_cores`                                                                   | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 9   | **Resource Usage – CPU**             | `sut_cpu_usage_percent` (Gauge\<f64\>)                                                                    | SUT adapter `metrics` → `resource.cpu_usage_percent`                                                      | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 10  | **Resource Usage – Memory**          | `peak_memory_usage_mb` / `median_memory_usage_mb` (Gauge\<f64\>), `sut_memory_usage_bytes` (Gauge\<u64\>) | Local process via `sysinfo` + SUT adapter `metrics`                                                       | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 11  | **Resource Usage – Disk**            | `sut_disk_read_bytes` / `sut_disk_write_bytes` (Gauge\<u64\>)                                             | SUT adapter `metrics` → `resource.disk_read_bytes` / `disk_write_bytes`                                   | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 12  | **Resource Usage – IOPS**            | `sut_disk_read_iops` / `sut_disk_write_iops` (Gauge\<u64\>)                                               | SUT adapter `metrics` → `resource.disk_read_iops` / `disk_write_iops`                                     | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 13  | **E2E Latency**                      | `e2e_latency_ms` (Histogram\<f64\>)                                                                       | Raw freshness scraper samples (`MAX(__created_at)` deltas); percentiles are computed in dashboard queries | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 14  | **E2E Duration**                     | `test_duration_ms` (Gauge\<u64\>)                                                                         | Wall-clock time of benchmark phase                                                                        | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 15  | **Query Queue Length**               | `query_queue_length` (Gauge\<u64\>)                                                                       | Query worker queue depth at query execution start (attributes: `query_name`, `client_id`)                 | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 16  | **Query Queue Duration**             | `query_queue_duration_ms` (Histogram\<f64\>)                                                              | Query worker queue wait time before execution (attributes: `query_name`, `client_id`)                     | ✅ via `Telemetry.emit()` | ✅ Implemented |
+| 17  | **Checkpoint In-flight Queries**     | `checkpoint_in_flight_queries` (Gauge\<u64\>)                                                             | Active in-flight query count while checkpoint validation windows are enabled (attribute: `client_id`)     | ✅ via `Telemetry.emit()` | ✅ Implemented |
 
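Row 13 describes each `e2e_latency_ms` observation as a raw freshness sample derived from `MAX(__created_at)`. A minimal sketch of how one such sample could be computed follows. It assumes the delta is taken between the scrape time and the newest `__created_at` value, both in epoch milliseconds. The function name and parameters are hypothetical and are not taken from the SpiceBench source.

```rust
/// Hypothetical helper: turns one scrape into a freshness sample in milliseconds.
///
/// `scrape_time_ms` is when the scraper ran the query (assumed epoch millis).
/// `max_created_at_ms` is the `MAX(__created_at)` value it read back (assumed epoch millis).
fn freshness_sample_ms(scrape_time_ms: u64, max_created_at_ms: u64) -> f64 {
    // saturating_sub clamps to 0 if clock skew puts the newest row "in the future",
    // so the histogram never receives a negative or wrapped-around value.
    scrape_time_ms.saturating_sub(max_created_at_ms) as f64
}
```

Each returned value would be recorded as one raw histogram observation, leaving percentile computation to the dashboard queries, as the table states.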
 ## Streaming Metrics (real-time, optional)
 
@@ -89,7 +90,4 @@ The default `Handler::metrics()` implementation returns empty metrics, so existi
 
 ## Remaining Work
 
-- [ ] **E2E Latency**: Implement event-creation-to-queryable latency measurement. This requires:
-  1. Timestamping generated events at creation time
-  2. Querying the SUT for those events after ingestion
-  3. Recording the delta as `e2e_latency_ms` histogram observations
+- [ ] **E2E Latency dashboard expansion**: Add optional percentile panels (e.g., p50/p90/p99.9) computed from `e2e_latency_ms` in Flux.
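The remaining-work item leaves percentile computation to dashboard queries over the raw `e2e_latency_ms` samples. To illustrate what such a panel would compute, here is a small Rust sketch of the nearest-rank percentile over a slice of samples. It is an illustration only: the function name is hypothetical, and the dashboard's Flux query may use a different estimation method, so its values can differ slightly from this one.

```rust
/// Hypothetical helper: nearest-rank percentile over raw latency samples (ms).
/// Returns `None` for an empty slice or a percentile outside 0..=100.
fn percentile_nearest_rank(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    // Sort a copy so the caller's sample order is left untouched.
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank: rank = ceil(p / 100 * n), clamped to the range 1..=n.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    let idx = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[idx])
}
```

For example, with samples `[120.0, 80.0, 200.0, 95.0, 150.0]`, the p50 value is `120.0` and the p99 value is `200.0`.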
@@ -160,27 +160,28 @@ Common CLI/workflow usage:
 
 ### Metrics
 
-| Metric                  | OTel Instrument                                  | Description                                           | Status          |
-| ----------------------- | ------------------------------------------------ | ----------------------------------------------------- | --------------- |
-| Iterations              | `iterations` (Gauge)                             | Number of query iterations per query                  | ✅ Implemented   |
-| Query Status            | `query_status` (Gauge)                           | Pass/fail status per query                            | ✅ Implemented   |
-| Query Latency (p50)     | `median_duration_ms` (Gauge)                     | Median duration per query                             | ✅ Implemented   |
-| Query Latency (min/max) | `min_duration_ms`, `max_duration_ms`             | Min and max duration per query                        | ✅ Implemented   |
-| Query Latency (p99)     | `p99_duration_ms` (Gauge)                        | 99th percentile duration per query                    | ✅ Implemented   |
-| Health Latency          | `health_latency_ms` (Histogram)                  | Latency of `/health` and `/v1/ready` probes           | ✅ Implemented   |
-| E2E Duration            | `test_duration_ms` (Gauge)                       | Total wall-clock time for the benchmark phase         | ✅ Implemented   |
-| Peak/Median Memory      | `peak_memory_usage_mb`, `median_memory_usage_mb` | Memory usage of the spiced process                    | ✅ Implemented   |
-| Ingestion Rows/Bytes    | `ingestion_rows_total`, `ingestion_bytes_total`  | Total data ingested (from SUT adapter)                | ✅ Implemented   |
-| Ingestion records/s     | `ingestion_rows_per_sec` (Gauge)                 | Sustained ingestion throughput (from SUT adapter)     | ✅ Implemented   |
-| Queries/s               | `queries_per_sec` (Gauge)                        | Query throughput under load                           | ✅ Implemented   |
-| Total Queries           | `queries_total` (Counter)                        | Total queries executed during the run                 | ✅ Implemented   |
-| Active Connections      | `active_connections` (Gauge)                     | Number of concurrent connections/clients              | ✅ Implemented   |
-| SUT CPU                 | `sut_cpu_usage_percent` (Gauge)                  | SUT CPU utilization (from adapter `metrics`)          | ✅ Implemented   |
-| SUT Memory              | `sut_memory_usage_bytes` (Gauge)                 | SUT memory usage (from adapter `metrics`)             | ✅ Implemented   |
-| SUT Disk I/O            | `sut_disk_{read,write}_bytes` (Gauge)            | SUT disk read/write bytes (from adapter `metrics`)    | ✅ Implemented   |
-| SUT Disk IOPS           | `sut_disk_{read,write}_iops` (Gauge)             | SUT disk IOPS (from adapter `metrics`)                | ✅ Implemented   |
-| Efficiency              | `efficiency_queries_per_core` (Gauge)            | Query throughput normalized by CPU cores              | ✅ Implemented   |
-| E2E Latency             | `e2e_latency_ms` (Histogram)                     | Time from event creation to the event being queryable | 🔲 Not yet wired |
+| Metric                  | OTel Instrument                                  | Description                                                                           | Status        |
+| ----------------------- | ------------------------------------------------ | ------------------------------------------------------------------------------------- | ------------- |
+| Iterations              | `iterations` (Gauge)                             | Number of query iterations per query                                                  | ✅ Implemented |
+| Query Status            | `query_status` (Gauge)                           | Pass/fail status per query                                                            | ✅ Implemented |
+| Query Latency (p50)     | `median_duration_ms` (Gauge)                     | Median duration per query                                                             | ✅ Implemented |
+| Query Latency (min/max) | `min_duration_ms`, `max_duration_ms`             | Min and max duration per query                                                        | ✅ Implemented |
+| Query Latency (p99)     | `p99_duration_ms` (Gauge)                        | 99th percentile duration per query                                                    | ✅ Implemented |
+| Health Latency          | `health_latency_ms` (Histogram)                  | Latency of `/health` and `/v1/ready` probes                                           | ✅ Implemented |
+| E2E Duration            | `test_duration_ms` (Gauge)                       | Total wall-clock time for the benchmark phase                                         | ✅ Implemented |
+| Peak/Median Memory      | `peak_memory_usage_mb`, `median_memory_usage_mb` | Memory usage of the spiced process                                                    | ✅ Implemented |
+| Ingestion Rows/Bytes    | `ingestion_rows_total`, `ingestion_bytes_total`  | Total data ingested (from SUT adapter)                                                | ✅ Implemented |
+| Ingestion records/s     | `ingestion_rows_per_sec` (Gauge)                 | Sustained ingestion throughput (from SUT adapter)                                     | ✅ Implemented |
+| Queries/s               | `queries_per_sec` (Gauge)                        | Query throughput under load                                                           | ✅ Implemented |
+| Total Queries           | `queries_total` (Counter)                        | Total queries executed during the run                                                 | ✅ Implemented |
+| Active Connections      | `active_connections` (Gauge)                     | Number of concurrent connections/clients                                              | ✅ Implemented |
+| SUT CPU                 | `sut_cpu_usage_percent` (Gauge)                  | SUT CPU utilization (from adapter `metrics`)                                          | ✅ Implemented |
+| SUT Memory              | `sut_memory_usage_bytes` (Gauge)                 | SUT memory usage (from adapter `metrics`)                                             | ✅ Implemented |
+| SUT Disk I/O            | `sut_disk_{read,write}_bytes` (Gauge)            | SUT disk read/write bytes (from adapter `metrics`)                                    | ✅ Implemented |
+| SUT Disk IOPS           | `sut_disk_{read,write}_iops` (Gauge)             | SUT disk IOPS (from adapter `metrics`)                                                | ✅ Implemented |
+| Efficiency              | `efficiency_queries_per_core` (Gauge)            | Query throughput normalized by CPU cores                                              | ✅ Implemented |
+| E2E Latency             | `e2e_latency_ms` (Histogram)                     | Raw event-to-queryable freshness samples; percentile is computed in dashboard queries | ✅ Implemented |
+| Checkpoint In-flight    | `checkpoint_in_flight_queries` (Gauge)           | In-flight query count during checkpoint validation                                    | ✅ Implemented |
 
 #### Grafana Dashboard