A dbt package for transforming OpenTelemetry (OTLP) data into queryable, analytics-ready schemas.
dbt-opentelemetry provides staging models and macros that transform raw OTLP data (logs, spans, metrics) into normalized, queryable tables. Optional domain-specific models are available for AI/LLM agent tracking (specifically Claude Code) and for APM metrics.
It is designed to work with OpenTelemetry data that has been flattened into records with the otlp2records library, which is used by the duckdb-otlp extension and otlp2parquet.
```
Raw OTLP Tables (Iceberg/Parquet)
├── logs
├── spans
├── gauge (metrics)
└── sum (metrics)
        ↓
[Staging Models - always enabled]
├── stg_otel_logs
├── stg_otel_spans
├── stg_otel_metrics_gauge
└── stg_otel_metrics_sum
        ↓
[Core Models - always enabled]
├── dim_otel_resources (resource deduplication)
├── fct_otel_logs
├── fct_otel_spans
├── fct_otel_metrics_gauge
└── fct_otel_metrics_sum
        ↓
[Reporting Models - always enabled]
└── rpt_otel_service_inventory
        ↓
[Domain Models - optional]
├── AI Agents: domain_agent_events, marts_agent_session_summary
└── APM: marts_service_red_5m
```
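Before running the package, it can help to confirm the raw tables are reachable from your warehouse. The sketch below is for DuckDB and assumes the example database/schema names (`my_catalog`, `raw`) and the default table names used in the configuration section further down; adjust them to your own settings.

```sql
-- Row counts over the raw OTLP tables (sketch; catalog/schema names are
-- the example values from the configuration section, not fixed defaults)
select 'logs'  as source_table, count(*) as row_count from my_catalog.raw.logs
union all
select 'spans', count(*) from my_catalog.raw.spans
union all
select 'gauge', count(*) from my_catalog.raw.gauge
union all
select 'sum',   count(*) from my_catalog.raw."sum";
```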
| Warehouse | Status |
|---|---|
| DuckDB | Supported |
Add to your `packages.yml`:

```yaml
packages:
  - package: smithclay/dbt_opentelemetry
    version: [">=0.0.1", "<1.0.0"]
```

Then run:

```bash
dbt deps
```

Configure where your OTLP data lives by setting these variables in your `dbt_project.yml`:
```yaml
vars:
  # Database/catalog containing your OTLP tables
  otel_database: "my_catalog"

  # Schema containing your OTLP tables
  otel_schema: "raw"

  # Table names (if different from defaults)
  otel_logs_table: "logs"
  otel_spans_table: "spans"
  otel_gauge_table: "gauge"
  otel_sum_table: "sum"
```

Domain-specific models are disabled by default. Enable them in your `dbt_project.yml`:
```yaml
vars:
  # AI/LLM agent tracking models
  otel_enable_ai_agents: true

  # APM RED metrics models
  otel_enable_apm: true
```

Staging models:

| Model | Description |
|---|---|
| `stg_otel_logs` | Normalized logs with `event_ts`, `service_name`, `severity_class`, `is_error` |
| `stg_otel_spans` | Normalized spans with `start_ts`, `end_ts`, `duration_ms`, `is_error` |
| `stg_otel_metrics_gauge` | Gauge metrics with `event_ts` and service identity |
| `stg_otel_metrics_sum` | Sum metrics with `event_ts` and service identity |
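As a quick sanity check on the staging layer, the sketch below ranks services by p95 span duration. It assumes `stg_otel_spans` carries a `service_name` column alongside the timing fields listed above (DuckDB syntax).

```sql
-- Slowest services by p95 span duration over the last hour (sketch)
-- Assumes stg_otel_spans exposes service_name in addition to the timing fields
select
    service_name,
    count(*) as span_count,
    quantile_cont(duration_ms, 0.95) as p95_duration_ms
from stg_otel_spans
where start_ts > now() - interval '1 hour'
group by service_name
order by p95_duration_ms desc
limit 10;
```

The core layer then builds facts and a shared resource dimension on top of these staging models: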
| Model | Description |
|---|---|
| `dim_otel_resources` | Deduplicated resource dimension with `resource_hash`, service identity, and resource attributes |
| `fct_otel_logs` | Fact table for logs with foreign key to `dim_otel_resources` |
| `fct_otel_spans` | Fact table for spans with foreign key to `dim_otel_resources` |
| `fct_otel_metrics_gauge` | Fact table for gauge metrics with foreign key to `dim_otel_resources` |
| `fct_otel_metrics_sum` | Fact table for sum metrics with foreign key to `dim_otel_resources` |
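For example, span facts can be joined back to the resource dimension. The sketch below assumes the foreign key column is named `resource_hash` and that `service_name` and `is_error` are available on the dimension and fact respectively.

```sql
-- Error spans per service via the resource dimension (sketch)
-- Assumes resource_hash is the join key and is_error propagates to the fact
select
    r.service_name,
    count(*) as error_spans
from fct_otel_spans as s
join dim_otel_resources as r
    on s.resource_hash = r.resource_hash
where s.is_error
group by r.service_name
order by error_spans desc;
```

The reporting layer builds on these facts: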
| Model | Description |
|---|---|
| `rpt_otel_service_inventory` | Service inventory showing services with instance, pod, node, host, and container counts |
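Browsing the inventory is a plain `select`; the ordering column in this sketch (`pod_count`) is an assumption about how the count columns are named.

```sql
-- Largest services by pod footprint (sketch; pod_count is an assumed column name)
select *
from rpt_otel_service_inventory
order by pod_count desc nulls last
limit 20;
```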
The AI agent domain models transform Claude Code telemetry into analytics-ready tables, filtering for `scope_name = 'com.anthropic.claude_code.events'` and `scope_version = '2.1.*'`.
| Model | Description |
|---|---|
| `domain_agent_events` | Individual Claude Code events with promoted attributes |
| `marts_agent_session_summary` | Session-level aggregations for cost, token, and efficiency analysis |

Captured events:
| Event | Description | Key Fields |
|---|---|---|
| `user_prompt` | User message sent to Claude | `prompt_text`, `prompt_length` |
| `api_request` | Successful API call | `model`, `input_tokens`, `output_tokens`, `cache_read_tokens`, `cost_usd`, `api_duration_ms` |
| `api_error` | Failed API call | `api_error_message`, `api_status_code`, `api_attempt_number` |
| `tool_result` | Tool execution completed | `tool_name`, `tool_success`, `tool_duration_ms`, `tool_result_size_bytes` |
| `tool_decision` | Permission decision for tool | `tool_name`, `tool_decision` (accept/reject), `decision_source` |
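For example, failed API calls can be broken down directly from `domain_agent_events`. This is a sketch in the same style as the example queries further below, and it assumes `api_error` rows are filtered by `event_name` just like the other event types.

```sql
-- API error breakdown by status code (sketch)
select
    api_status_code,
    count(*) as errors,
    max(api_attempt_number) as max_attempts
from domain_agent_events
where event_name = 'api_error'
group by api_status_code
order by errors desc;
```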
The session summary model (`marts_agent_session_summary`) includes:

**Token & Cost Metrics:**
- `input_tokens_total`, `output_tokens_total`, `tokens_total`
- `cache_read_tokens_total`, `cache_creation_tokens_total`
- `cost_usd_total`, `cost_per_prompt`

**Efficiency Metrics:**
- `tool_success_rate` - Ratio of successful tool executions
- `api_error_rate` - Ratio of failed API calls
- `cache_hit_rate` - Ratio of tokens served from cache
- `tokens_per_prompt` - Average tokens per user prompt

**Usage Metrics:**
- `prompt_count`, `api_request_count`, `tool_result_count`
- `unique_tools_used`, `tool_accepts`, `tool_rejects`
- `session_duration_seconds`, `primary_model`
Example queries:

```sql
-- Cost by user over last 7 days
select
    user_email,
    count(*) as sessions,
    sum(cost_usd_total) as total_cost,
    sum(tokens_total) as total_tokens
from marts_agent_session_summary
where session_start_ts > current_date - interval '7 days'
group by user_email
order by total_cost desc;
```

```sql
-- Tool usage breakdown
select
    tool_name,
    count(*) as invocations,
    avg(tool_duration_ms) as avg_duration_ms,
    sum(case when tool_success = 'true' then 1 else 0 end)::float / count(*) as success_rate
from domain_agent_events
where event_name = 'tool_result'
group by tool_name
order by invocations desc;
```

```sql
-- Model efficiency comparison
select
    model,
    count(*) as requests,
    avg(api_duration_ms) as avg_latency_ms,
    sum(cost_usd) as total_cost,
    avg(output_tokens::float / nullif(input_tokens, 0)) as output_input_ratio
from domain_agent_events
where event_name = 'api_request'
group by model;
```

The APM models provide service-level RED metrics:

| Model | Description |
|---|---|
| `marts_service_red_5m` | RED metrics (Rate, Errors, Duration p50/p95/p99) by service, 5-minute buckets |
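A typical use is alert-style triage on recent buckets. The sketch below is hedged: the column names (`bucket_ts`, `request_count`, `error_count`, `p95_ms`) are assumptions and should be checked against the compiled model.

```sql
-- Noisiest services over the last hour of 5-minute buckets
-- (sketch; column names are assumptions)
select
    service_name,
    bucket_ts,
    request_count,
    error_count,
    p95_ms
from marts_service_red_5m
where bucket_ts > now() - interval '1 hour'
order by error_count desc
limit 20;
```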
Use these macros to extract attributes from JSON blobs in your own models:

```sql
-- Extract string attribute from JSON column
-- Handles dotted keys, converts 'undefined'/'null' strings to SQL NULL
{{ dbt_opentelemetry.attr_str('resource_attributes', 'service.name') }}
{{ dbt_opentelemetry.attr_str('attributes', 'event.name') }}

-- Time bucket aggregation (5 minute buckets)
{{ dbt_opentelemetry.bucket_5m('event_ts') }}

-- Generate resource hash for deduplication
{{ dbt_opentelemetry.resource_hash('resource_attributes') }}
```
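A minimal sketch of a custom model built on these macros. The model name is hypothetical, and it assumes `stg_otel_logs` still exposes the raw `attributes` JSON column alongside its promoted fields; if it does not, apply `attr_str` against your raw logs table instead.

```sql
-- models/my_error_events_5m.sql (hypothetical model name)
-- Error log volume per service and event name in 5-minute buckets.
-- Assumes the staging model keeps the raw `attributes` JSON column.
select
    {{ dbt_opentelemetry.bucket_5m('event_ts') }} as bucket_ts,
    service_name,
    {{ dbt_opentelemetry.attr_str('attributes', 'event.name') }} as event_name,
    count(*) as error_logs
from {{ ref('stg_otel_logs') }}
where is_error
group by 1, 2, 3
```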
A complete example configuration:

```yaml
# dbt_project.yml
name: my_observability_project
version: 0.0.1

vars:
  # Point to your OTLP data
  otel_database: "iceberg_catalog"
  otel_schema: "otel_raw"

  # Enable APM models
  otel_enable_apm: true

models:
  my_observability_project:
    +schema: analytics
```

Requirements:

- dbt Core >= 1.6.0, < 3.0.0
- DuckDB adapter (for DuckDB warehouse)
This project uses pre-commit for basic code quality checks.
```bash
# Install and run with uvx (no install required)
uvx pre-commit install
uvx pre-commit run --all-files
```

SQLFluff linting requires the full dbt environment. Run it separately:

```bash
# After dbt compile (generates manifest for templater)
cd integration_tests
dbt deps && dbt compile

# Lint with dbt templater
uvx sqlfluff lint ../models/ --templater dbt --dialect duckdb

# Or quick lint without dbt context
uvx sqlfluff lint ../models/ --dialect duckdb
```

For CI, consider adding SQLFluff after `dbt build` in your workflow.
Contributions are welcome! Please open an issue or pull request on GitHub.
Apache 2.0 - see LICENSE for details.