| 1 | +# Spicebench |
| 2 | + |
| 3 | +A benchmark for data & AI platforms focused on operational data. Unlike static benchmarks such as ClickBench or TPC-H that run queries on pre-created datasets, Spicebench measures end-to-end performance across dynamic real-time data generation, ingestion, indexing/acceleration/materialization, and query execution — all running concurrently. |
| 4 | + |
| 5 | +## Architecture |
| 6 | + |
| 7 | +```mermaid |
| 8 | +flowchart TB |
| 9 | + subgraph GHA["GitHub Actions – Workflow Orchestration"] |
| 10 | + direction TB |
| 11 | + trigger["Trigger\n(schedule / manual / PR)"] |
| 12 | + orchestrator["Benchmark Orchestrator"] |
| 13 | + trigger --> orchestrator |
| 14 | + end |
| 15 | + |
| 16 | + subgraph datagen["Data Generation"] |
| 17 | + generator["Data Generator\n(configurable rate & schema)"] |
| 18 | + end |
| 19 | + |
| 20 | + subgraph adapters["System Adapters"] |
| 21 | + direction TB |
| 22 | + adapter_iface["Adapter Interface\n(setup / teardown / ingest / query)"] |
| 23 | + spice["Spice Cloud Adapter\n(Management API)"] |
| 24 | + databricks["Databricks Adapter\n(REST API)"] |
| 25 | + snowflake["Snowflake Adapter\n(SQL API)"] |
| 26 | + other["... Other Adapters"] |
| 27 | + adapter_iface --- spice |
| 28 | + adapter_iface --- databricks |
| 29 | + adapter_iface --- snowflake |
| 30 | + adapter_iface --- other |
| 31 | + end |
| 32 | + |
| 33 | + subgraph sut["System Under Test"] |
| 34 | + direction TB |
| 35 | + ingest_ep["Ingestion Endpoint"] |
| 36 | + query_ep["Query Endpoint"] |
| 37 | + end |
| 38 | + |
| 39 | + subgraph workload["Concurrent Workload Engine"] |
| 40 | + direction LR |
| 41 | + ingestion_driver["Ingestion Driver\n(continuous writes)"] |
| 42 | + query_driver["Query Driver\n(continuous reads)"] |
| 43 | + end |
| 44 | + |
| 45 | + subgraph metrics["Metrics Collection (OTel)"] |
| 46 | + direction TB |
| 47 | + collector["Metrics Collector\n(OpenTelemetry SDK)"] |
| 48 | + m1["Data Size"] |
| 49 | + m2["Ingestion records/s"] |
| 50 | + m3["Connections / Clients"] |
| 51 | + m4["Queries/s & Requests/s"] |
| 52 | + m5["Query Latency (p50/p95/p99)"] |
| 53 | + m6["Efficiency (cores)"] |
| 54 | + m7["Resource Usage\n(CPU/Mem/Disk/IOPS)"] |
| 55 | + m8["E2E Latency\n(event creation → query)"] |
| 56 | + m9["E2E Duration"] |
| 57 | + collector --- m1 |
| 58 | + collector --- m2 |
| 59 | + collector --- m3 |
| 60 | + collector --- m4 |
| 61 | + collector --- m5 |
| 62 | + collector --- m6 |
| 63 | + collector --- m7 |
| 64 | + collector --- m8 |
| 65 | + collector --- m9 |
| 66 | + end |
| 67 | + |
| 68 | + subgraph telemetry["telemetry.spiceai.io"] |
| 69 | + otel_endpoint["OTel Collector Endpoint"] |
| 70 | + end |
| 71 | + |
| 72 | + subgraph reporting["Reporting"] |
| 73 | + results["Results Store"] |
| 74 | + report["Report Generator\n(comparisons & charts)"] |
| 75 | + results --> report |
| 76 | + end |
| 77 | + |
| 78 | + orchestrator -->|"configure & launch"| datagen |
| 79 | + orchestrator -->|"setup via adapter"| adapters |
| 80 | + orchestrator -->|"start workloads"| workload |
| 81 | + |
| 82 | + generator -->|"raw events"| ingestion_driver |
| 83 | + adapter_iface -->|"provision / configure"| sut |
| 84 | + |
| 85 | + ingestion_driver -->|"write events"| ingest_ep |
| 86 | + query_driver -->|"execute queries"| query_ep |
| 87 | + |
| 88 | + ingestion_driver -->|"write metrics"| collector |
| 89 | + query_driver -->|"query metrics"| collector |
| 90 | + sut -.->|"resource metrics"| collector |
| 91 | + |
| 92 | + collector -->|"OTLP export"| otel_endpoint |
| 93 | + collector --> results |
| 94 | +``` |
| 95 | + |
| 96 | +### Component Overview |
| 97 | + |
| 98 | +| Component | Responsibility | |
| 99 | +| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | |
| 100 | +| **GitHub Actions Orchestrator** | Triggers benchmark runs on schedule, PR, or manual dispatch. Manages the full lifecycle: provision → run → collect → report → teardown. | |
| 101 | +| **Data Generator** | Produces realistic operational data at configurable rates and schemas. Emits timestamped events for E2E latency measurement. | |
| 102 | +| **System Adapters** | Pluggable interface for provisioning and interacting with each platform. Each adapter implements `setup`, `teardown`, `ingest`, and `query` operations using platform-specific APIs. | |
| 103 | +| **Concurrent Workload Engine** | Drives continuous ingestion and query execution in parallel, simulating real operational workloads where reads and writes happen simultaneously. | |
| 104 | +| **Metrics Collector** | Emits all benchmark metrics via OpenTelemetry (OTLP) to `telemetry.spiceai.io`. Captures data from both the workload drivers and the system under test. | |
| 105 | +| **Report Generator** | Aggregates results and produces cross-system comparisons. | |
| 106 | + |
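As a rough illustration of the Concurrent Workload Engine row, the Python sketch below runs an ingestion driver and a query driver in parallel threads against an in-memory stand-in. `InMemoryStore`, `run_workload`, and the event fields are hypothetical names chosen for this example; they are not taken from the Spicebench code.

```python
import threading
import time


class InMemoryStore:
    """Hypothetical stand-in for a system under test (not a real adapter)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rows: list[dict] = []

    def write(self, event: dict) -> None:
        with self._lock:
            self._rows.append(event)

    def count(self) -> int:
        with self._lock:
            return len(self._rows)


def run_workload(store: InMemoryStore, total_events: int) -> dict:
    """Run writes and reads concurrently until all events are ingested."""
    stop = threading.Event()
    # Each counter is updated by exactly one thread, so no extra lock is needed.
    stats = {"written": 0, "queries": 0}

    def ingestion_driver() -> None:
        for i in range(total_events):
            # Timestamp each event so E2E latency could be derived later.
            store.write({"id": i, "created_at": time.time()})
            stats["written"] += 1
        stop.set()

    def query_driver() -> None:
        # Always issue at least one query, then keep going until ingestion ends.
        while True:
            store.count()
            stats["queries"] += 1
            if stop.is_set():
                break

    threads = [
        threading.Thread(target=ingestion_driver),
        threading.Thread(target=query_driver),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats
```

A real engine would presumably run for a fixed duration and use many clients per driver; the two-thread version only shows reads and writes overlapping in time.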
| 107 | +### Metrics |
| 108 | + |
| 109 | +| Metric | Description | |
| 110 | +| --------------------- | ---------------------------------------------------------------------- | |
| 111 | +| Data Size | Total volume of data ingested during the benchmark run | |
| 112 | +| Ingestion records/s | Sustained ingestion throughput | |
| 113 | +| Connections / Clients | Number of concurrent connections maintained | |
| 114 | +| Queries/s, Requests/s | Query throughput under concurrent ingestion load | |
| 115 | +| Query Latency | Per-query latency percentiles (p50, p95, p99) across the query suite | |
| 116 | +| Efficiency (cores) | Performance normalized by CPU core count | |
| 117 | +| Resource Usage | CPU, memory, disk, and IOPS utilization during the run | |
| 118 | +| E2E Latency | Time from event creation to the event being queryable | |
| 119 | +| E2E Duration | Total wall-clock time for the full benchmark run | |
| 120 | + |
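To make the Query Latency and E2E Latency rows concrete, here is a minimal Python sketch of how such values could be derived from raw samples. It assumes the nearest-rank percentile method and millisecond timestamps; the README does not say which method Spicebench uses, and the function names are hypothetical.

```python
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100 (assumed method)."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, which avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def latency_summary(latencies_ms: Sequence[float]) -> dict:
    """Summarize query latencies as p50, p95, and p99."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}


def e2e_latency_ms(created_at_ms: int, queryable_at_ms: int) -> int:
    """Time from event creation to the event first being returned by a query."""
    return queryable_at_ms - created_at_ms
```

For example, `latency_summary(range(1, 101))` yields `{"p50": 50, "p95": 95, "p99": 99}` under the nearest-rank definition.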
| 121 | +### Adding a New System Adapter |
| 122 | + |
| 123 | +To benchmark a new platform, implement the adapter interface: |
| 124 | + |
| 125 | +1. **Setup** — Provision infrastructure and configure the target system (e.g., via [Spice Cloud Management API](https://docs.spice.ai/api/management) or Databricks REST API). |
| 126 | +2. **Ingest** — Write generated events to the system's ingestion endpoint. |
| 127 | +3. **Query** — Execute the benchmark query suite against the system. |
| 128 | +4. **Teardown** — Clean up provisioned resources. |
| 129 | + |
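The four steps above can be sketched as an abstract base class in Python. This is only an illustration under assumptions: the README names the operations (`setup`, `teardown`, `ingest`, `query`) but not their signatures, so the parameter and return types here, along with `SystemAdapter`, `InMemoryAdapter`, and `run_once`, are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable, Mapping, Sequence

Event = Mapping[str, Any]


class SystemAdapter(ABC):
    """Hypothetical adapter contract covering the four operations."""

    @abstractmethod
    def setup(self) -> None:
        """Provision and configure the target system."""

    @abstractmethod
    def ingest(self, events: Iterable[Event]) -> int:
        """Write events and return how many were accepted."""

    @abstractmethod
    def query(self, sql: str) -> Sequence[Event]:
        """Run one query from the benchmark suite."""

    @abstractmethod
    def teardown(self) -> None:
        """Release everything created in setup."""


class InMemoryAdapter(SystemAdapter):
    """Toy adapter for local testing; it does not talk to a real platform."""

    def __init__(self) -> None:
        self.rows: list[Event] | None = None

    def setup(self) -> None:
        self.rows = []

    def ingest(self, events: Iterable[Event]) -> int:
        batch = list(events)
        self.rows.extend(batch)
        return len(batch)

    def query(self, sql: str) -> Sequence[Event]:
        # The SQL text is ignored here; every stored row is returned.
        return list(self.rows)

    def teardown(self) -> None:
        self.rows = None


def run_once(adapter: SystemAdapter, events: Iterable[Event], queries: Sequence[str]):
    """Drive one setup, ingest, query, teardown cycle."""
    adapter.setup()
    try:
        written = adapter.ingest(events)
        results = [adapter.query(q) for q in queries]
    finally:
        # Teardown runs even if ingestion or a query fails.
        adapter.teardown()
    return written, results
```

Wrapping the run in `try`/`finally` is one way to keep provisioned cloud resources from leaking when a benchmark step raises.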
| 130 | +## License |
| 131 | + |
| 132 | +See [LICENSE](LICENSE) for details. |