Commit 6986a0b

docs(otel): document runtime.telemetry.metric_prefix, properties, and DataDog OTLP setup (#1542)
Co-authored-by: lukekim <80174+lukekim@users.noreply.github.com>
1 parent 59d80fa commit 6986a0b

3 files changed

Lines changed: 146 additions & 0 deletions

website/docs/features/observability/index.md

Lines changed: 21 additions & 0 deletions
@@ -162,6 +162,8 @@ export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.us3.datadoghq.com"
162162
export OTEL_EXPORTER_OTLP_HEADERS="DD-API-KEY=${DD_API_KEY}"
163163
```
164164

165+
For a complete Datadog setup including metric prefixing and custom tags via OTLP resource attributes, see the [Datadog monitoring guide](/docs/next/monitoring/datadog#opentelemetry-otlp-export).
166+
165167
#### Grafana Cloud (OTLP/HTTP)
166168

167169
Grafana Cloud's OTLP gateway expects HTTP Basic authentication. Obtain the base64-encoded `instanceID:accessPolicyToken` credential from the Grafana Cloud "OpenTelemetry" connection page and store it in a secret:
@@ -200,6 +202,25 @@ runtime:
200202
api-key: ${secrets:collector_api_key}
201203
```
202204

205+
## Metric Naming and Custom Tags
206+
207+
Two runtime fields control how exported metrics are named and labeled across **all** readers (Prometheus scrape, cluster OTLP reader, and the `otel_exporter` push exporter):
208+
209+
- [`runtime.telemetry.metric_prefix`](/docs/next/reference/spicepod/runtime#runtimetelemetrymetric_prefix) — prepends a string to every metric name (e.g. `spiceai.query_duration_ms`). Useful for namespacing in shared backends.
210+
- [`runtime.telemetry.properties`](/docs/next/reference/spicepod/runtime#runtimetelemetryproperties) — attaches custom key/value attributes as OpenTelemetry resource attributes, which most backends surface as dimensions or tags.
211+
212+
```yaml
runtime:
  telemetry:
    metric_prefix: 'spiceai.'
    properties:
      environment: prod
      region: us-west-2
      team: data-platform
```
221+
222+
Both fields apply to every exporter the runtime has enabled. See the [Datadog monitoring guide](/docs/next/monitoring/datadog#opentelemetry-otlp-export) for backend-specific notes (Datadog requires the `dd-otel-metric-config` request header to map resource attributes to tags).
223+
203224
### Metric Filtering
204225

205226
To export only specific metrics, use the `metrics` parameter:

website/docs/monitoring/datadog/index.md

Lines changed: 94 additions & 0 deletions
@@ -43,3 +43,97 @@ instances:
4343
3. Dashboard is now configured to display Spice.ai OSS key performance metrics
4444

4545
<img width="800" src="/img/datadog/spice_datadog_dashboard.png"/>
46+
47+
## OpenTelemetry OTLP Export
48+
49+
As an alternative to scraping the Prometheus endpoint with the Datadog Agent, Spice can push metrics directly to Datadog's [OTLP Metrics Intake Endpoint](https://docs.datadoghq.com/opentelemetry/setup/intake_endpoint/) over HTTP. This is the recommended approach for agentless deployments (e.g. serverless, ephemeral containers) and for environments where the Datadog API key is managed through Spice's [secret stores](../../components/secret-stores).
50+
51+
### Minimal Configuration
52+
53+
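Set the endpoint host to match the Datadog site of the target account (for example `us3`, `us5`, `eu`, or `ap1`) and store the Datadog API key in a secret. Not every site follows the `otlp.<site>.datadoghq.com` pattern (the EU site uses the `datadoghq.eu` domain), so confirm the exact host in Datadog's intake endpoint documentation: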
54+
55+
```yaml
runtime:
  telemetry:
    otel_exporter:
      endpoint: https://otlp.us3.datadoghq.com/v1/metrics
      headers:
        DD-API-KEY: ${secrets:DD_API_KEY}
```
63+
64+
Metrics begin appearing in the Datadog Metrics Explorer within a minute or two.
65+
66+
### Namespace Spice Metrics with a Prefix
67+
68+
Use [`runtime.telemetry.metric_prefix`](/docs/next/reference/spicepod/runtime#runtimetelemetrymetric_prefix) to prepend a string to every exported metric name. This avoids collisions with metrics from other services in the same Datadog account:
69+
70+
```yaml
runtime:
  telemetry:
    metric_prefix: 'spiceai.'
```
75+
76+
The runtime metric `query_duration_ms` is then exported as `spiceai.query_duration_ms`.
77+
78+
### Add Custom Tags via Resource Attributes
79+
80+
Attach custom key/value pairs to every metric using [`runtime.telemetry.properties`](/docs/next/reference/spicepod/runtime#runtimetelemetryproperties). Spice sends these as OpenTelemetry resource attributes:
81+
82+
```yaml
runtime:
  telemetry:
    properties:
      environment: prod
      region: us-west-2
      team: data-platform
```
90+
91+
For these resource attributes to surface as **tags** in Datadog, the Datadog OTLP intake also requires the `dd-otel-metric-config` header with `resource_attributes_as_tags` enabled (see [Datadog OTLP Metrics Intake Endpoint](https://docs.datadoghq.com/opentelemetry/setup/intake_endpoint/)):
92+
93+
```yaml
runtime:
  telemetry:
    otel_exporter:
      endpoint: https://otlp.us3.datadoghq.com/v1/metrics
      headers:
        DD-API-KEY: ${secrets:DD_API_KEY}
        dd-otel-metric-config: '{"resource_attributes_as_tags": true}'
```
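
The `dd-otel-metric-config` value is a JSON document carried inside a header string, so a stray quote or a Python-style `True` would make it invalid. The following is a minimal sketch, not part of Spice, for generating the header map when a spicepod is templated from a script. The helper name `datadog_otlp_headers` is hypothetical; only the header names and the `resource_attributes_as_tags` key come from the configuration above.

```python
import json
from typing import Dict


def datadog_otlp_headers(api_key_ref: str, resource_attributes_as_tags: bool = True) -> Dict[str, str]:
    """Build the header map shown above (hypothetical helper, for illustration only)."""
    metric_config = {"resource_attributes_as_tags": resource_attributes_as_tags}
    return {
        # Pass the secret reference through unchanged; Spice resolves it at runtime.
        "DD-API-KEY": api_key_ref,
        # json.dumps emits lowercase JSON booleans and double-quoted keys.
        "dd-otel-metric-config": json.dumps(metric_config),
    }


headers = datadog_otlp_headers("${secrets:DD_API_KEY}")
print(headers["dd-otel-metric-config"])
```

Serializing with `json.dumps` rather than hand-writing the string keeps the value valid JSON if further options are added later.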
102+
103+
:::note Tags can lag behind metrics
104+
Datadog typically ingests OTLP metrics within seconds, but the associated tags (from resource attributes) can take noticeably longer to appear in the UI — sometimes several minutes after the first datapoints. The metrics and tags do eventually converge.
105+
:::
106+
107+
:::caution Manage tag cardinality in Datadog
108+
Datadog [bills on custom metric cardinality](https://docs.datadoghq.com/account_management/billing/custom_metrics/), driven by the number of unique tag-value combinations per metric. The custom tags added via `runtime.telemetry.properties` are typically low-cardinality (`environment`, `region`, `team`), but Spice metrics also carry a number of automatically populated dimensions — for example `dataset`, `protocol`, `client`, `client_version`, `client_system`, `user_agent`, `runtime`, `runtime_version`, `runtime_system` (see [Available Metrics](../../features/observability#available-metrics)) — some of which can grow with the size of the deployment.
109+
110+
Datadog's [Metrics without Limits™](https://docs.datadoghq.com/metrics/metrics-without-limits/) decouples ingestion from indexing for exactly this case. With Metrics without Limits™, every tag Spice emits is still ingested, but each metric is configured with one of:
111+
112+
- an **allowlist** that keeps only the tags actually used in dashboards, monitors, and queries (e.g. keep `dataset` and `environment`, drop the rest), or
113+
- a **blocklist** that drops specific auto-populated tags that are not useful for a given metric (e.g. exclude `user_agent` or `client_version`).
114+
115+
Only the indexed (queryable) tag combinations count toward custom metric billing. Configure each metric on the Metrics Summary page or through the Metrics API. Before saving, the in-app UI shows an estimated indexed-metric volume, and it can pre-populate an allowlist from the tags actively queried in dashboards, monitors, and notebooks.
116+
:::
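
To get a feel for how quickly tag combinations multiply, the sketch below computes a rough upper bound on the number of indexed series for one metric, with and without an allowlist. It is an illustration only: the per-tag value counts are made-up numbers, the function name `estimated_series` is hypothetical, and real cardinality is usually lower because not every combination occurs. Datadog's own estimate in the Metrics Summary page is the figure to rely on.

```python
from math import prod
from typing import Dict, Optional, Set


def estimated_series(tag_values: Dict[str, int], indexed: Optional[Set[str]] = None) -> int:
    """Upper bound on unique tag-value combinations for a single metric.

    tag_values maps a tag key to its number of distinct values.
    indexed is an optional allowlist; tags outside it are ignored.
    """
    counts = [n for tag, n in tag_values.items() if indexed is None or tag in indexed]
    return prod(counts)  # prod([]) is 1: a metric with no tags is one series


# Hypothetical distinct-value counts for one Spice metric.
observed = {"dataset": 20, "client_version": 15, "user_agent": 40, "environment": 1}

all_tags = estimated_series(observed)
allowlisted = estimated_series(observed, indexed={"dataset", "environment"})
print(all_tags, allowlisted)
```

In this made-up example, indexing only `dataset` and `environment` reduces the bound from 12000 combinations to 20.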
117+
118+
### Full Example
119+
120+
A complete `runtime.telemetry` block combining metric prefixing, custom tags, and Datadog OTLP export:
121+
122+
```yaml
runtime:
  telemetry:
    metric_prefix: 'spiceai.'
    properties:
      environment: prod
      region: us-west-2
      team: data-platform
    otel_exporter:
      endpoint: https://otlp.us3.datadoghq.com/v1/metrics
      headers:
        DD-API-KEY: ${secrets:DD_API_KEY}
        dd-otel-metric-config: '{"resource_attributes_as_tags": true}'
```
136+
137+
With this configuration, every Spice metric (e.g. `spiceai.query_duration_ms`, `spiceai.query_executions`) arrives in Datadog tagged with `environment:prod`, `region:us-west-2`, and `team:data-platform`.
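
The mapping from configuration to what appears in Datadog can be sketched in a few lines. This is an illustrative model of the behavior described above, not Spice's implementation. The function names are hypothetical, and it assumes the prefix is concatenated literally, which is why the example prefix ends with a dot.

```python
from typing import Dict, List


def exported_name(metric: str, prefix: str = "") -> str:
    """Model of metric_prefix: the prefix is prepended as-is (assumed literal concatenation)."""
    return prefix + metric


def datadog_tags(properties: Dict[str, str]) -> List[str]:
    """Render properties in Datadog's key:value tag notation."""
    return sorted(f"{key}:{value}" for key, value in properties.items())


telemetry = {
    "metric_prefix": "spiceai.",
    "properties": {"environment": "prod", "region": "us-west-2", "team": "data-platform"},
}

name = exported_name("query_duration_ms", telemetry["metric_prefix"])
tags = datadog_tags(telemetry["properties"])
print(name, tags)
```

A helper like this can be handy for generating the metric names and tag filters used in dashboards or monitors from the same source as the spicepod.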
138+
139+
For general OTLP exporter options (push interval, metric filtering, gRPC vs HTTP), see [OpenTelemetry Metrics Exporter](../../features/observability#opentelemetry-metrics-exporter).

website/docs/reference/spicepod/runtime.md

Lines changed: 31 additions & 0 deletions
@@ -345,6 +345,37 @@ runtime:
345345

346346
Enables or disables runtime telemetry collection. Defaults to `true`.
347347

348+
### `runtime.telemetry.metric_prefix` {#runtimetelemetrymetric_prefix}
349+
350+
Optional string prepended to every exported metric name. Useful for namespacing Spice metrics in shared backends (e.g. Datadog, Grafana Cloud, New Relic) so they do not collide with metrics from other services. Defaults to no prefix.
351+
352+
The prefix applies to **all** metric readers — the Prometheus scrape endpoint (`--metrics`), the cluster on-demand OTLP reader, and the `otel_exporter` push exporter — because OpenTelemetry views are configured at the meter-provider level rather than per reader.
353+
354+
```yaml
runtime:
  telemetry:
    metric_prefix: 'spiceai.'
```
359+
360+
With this configuration, the runtime metric `query_duration_ms` is exported as `spiceai.query_duration_ms`.
361+
362+
### `runtime.telemetry.properties` {#runtimetelemetryproperties}
363+
364+
Map of custom key/value attributes attached to telemetry metrics emitted by `spiced`. Applied as OpenTelemetry resource attributes on the runtime's `MeterProvider`, so they appear as dimensions/tags on every metric exported via the Prometheus scrape endpoint, the cluster on-demand OTLP reader, and the `otel_exporter` push exporter. Defaults to empty.
365+
366+
```yaml
runtime:
  telemetry:
    properties:
      environment: prod
      region: us-west-2
      team: data-platform
```
374+
375+
The standard OpenTelemetry environment variables (`OTEL_SERVICE_NAME`, `OTEL_RESOURCE_ATTRIBUTES`) are still honored and act as defaults; explicit `properties` entries take precedence on key conflicts.
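
The precedence rule can be modeled as a simple merge. The sketch below illustrates the documented behavior and is not the runtime's actual code. The function names are hypothetical, `OTEL_SERVICE_NAME` is assumed to map to the `service.name` attribute as in the OpenTelemetry specification, and percent-decoding of values is omitted for brevity.

```python
from typing import Dict


def parse_resource_attributes(raw: str) -> Dict[str, str]:
    """Parse the comma-separated key=value format of OTEL_RESOURCE_ATTRIBUTES."""
    attrs: Dict[str, str] = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():  # skip empty or malformed entries
            attrs[key.strip()] = value.strip()
    return attrs


def effective_attributes(env: Dict[str, str], properties: Dict[str, str]) -> Dict[str, str]:
    """Environment variables provide defaults; explicit properties win on key conflicts."""
    merged = parse_resource_attributes(env.get("OTEL_RESOURCE_ATTRIBUTES", ""))
    if env.get("OTEL_SERVICE_NAME"):
        merged["service.name"] = env["OTEL_SERVICE_NAME"]
    merged.update(properties)  # applied last, so properties take precedence
    return merged


env = {
    "OTEL_RESOURCE_ATTRIBUTES": "environment=staging,region=us-east-1",
    "OTEL_SERVICE_NAME": "spiced",
}
result = effective_attributes(env, {"environment": "prod"})
print(result)
```

Here `environment` resolves to `prod` from `properties`, while `region` and `service.name` fall back to the values from the environment.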
376+
377+
For backends that map OTLP resource attributes to tags through additional configuration (e.g. Datadog), see the [Datadog OTLP guide](/docs/next/monitoring/datadog#opentelemetry-otlp-export).
378+
348379
### `runtime.telemetry.otel_exporter`
349380

350381
Configures an [OpenTelemetry](https://opentelemetry.io/) metrics exporter to push metrics to an OpenTelemetry collector. The exporter automatically infers the protocol (gRPC or HTTP) based on the endpoint configuration.
