dbt-opentelemetry


A dbt package for transforming OpenTelemetry (OTLP) data into queryable, analytics-ready schemas.

Overview

dbt-opentelemetry provides staging models and macros that transform raw OTLP data (logs, spans, metrics) into normalized, queryable tables. Optional domain-specific models are available for AI/LLM agent tracking (specifically Claude Code) and APM metrics.

Designed to work with OpenTelemetry data that has been flattened into records with the otlp2records library, which is used by the duckdb-otlp extension and otlp2parquet.

Data Flow

Raw OTLP Tables (Iceberg/Parquet)
├── logs
├── spans
├── gauge (metrics)
└── sum (metrics)
         ↓
    [Staging Models - always enabled]
    ├── stg_otel_logs
    ├── stg_otel_spans
    ├── stg_otel_metrics_gauge
    └── stg_otel_metrics_sum
         ↓
    [Core Models - always enabled]
    ├── dim_otel_resources (resource deduplication)
    ├── fct_otel_logs
    ├── fct_otel_spans
    ├── fct_otel_metrics_gauge
    └── fct_otel_metrics_sum
         ↓
    [Reporting Models - always enabled]
    └── rpt_otel_service_inventory
         ↓
    [Domain Models - optional]
    ├── AI Agents: domain_agent_events, marts_agent_session_summary
    └── APM: marts_service_red_5m

Supported Warehouses

| Warehouse | Status |
| --- | --- |
| DuckDB | Supported |

Installation

Add to your packages.yml:

packages:
  - package: smithclay/dbt_opentelemetry
    version: [">=0.0.1", "<1.0.0"]

Then run:

dbt deps

Configuration

Source Location

Configure where your OTLP data lives by setting these variables in your dbt_project.yml:

vars:
  # Database/catalog containing your OTLP tables
  otel_database: "my_catalog"

  # Schema containing your OTLP tables
  otel_schema: "raw"

  # Table names (if different from defaults)
  otel_logs_table: "logs"
  otel_spans_table: "spans"
  otel_gauge_table: "gauge"
  otel_sum_table: "sum"
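
A quick way to confirm dbt can see the configured tables, using the example values above (swap in your own catalog, schema, and table names):

-- Sanity check against the raw logs table from the example configuration
select count(*) as raw_log_rows
from my_catalog.raw.logs;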

Enabling Domain Models

Domain-specific models are disabled by default. Enable them in your dbt_project.yml:

vars:
  # AI/LLM agent tracking models
  otel_enable_ai_agents: true

  # APM RED metrics models
  otel_enable_apm: true

Models

Staging Models (always enabled)

| Model | Description |
| --- | --- |
| stg_otel_logs | Normalized logs with event_ts, service_name, severity_class, is_error |
| stg_otel_spans | Normalized spans with start_ts, end_ts, duration_ms, is_error |
| stg_otel_metrics_gauge | Gauge metrics with event_ts and service identity |
| stg_otel_metrics_sum | Sum metrics with event_ts and service identity |
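
The staging layer can be queried directly for quick triage. A minimal sketch using the columns listed above (it assumes is_error is a boolean):

-- Error log volume by service over the last hour
select
  service_name,
  count(*) as log_count,
  sum(case when is_error then 1 else 0 end) as error_count
from stg_otel_logs
where event_ts > now() - interval '1 hour'
group by service_name
order by error_count desc;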

Core Models (always enabled)

| Model | Description |
| --- | --- |
| dim_otel_resources | Deduplicated resource dimension with resource_hash, service identity, and resource attributes |
| fct_otel_logs | Fact table for logs with foreign key to dim_otel_resources |
| fct_otel_spans | Fact table for spans with foreign key to dim_otel_resources |
| fct_otel_metrics_gauge | Fact table for gauge metrics with foreign key to dim_otel_resources |
| fct_otel_metrics_sum | Fact table for sum metrics with foreign key to dim_otel_resources |
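
A sketch of joining a fact table back to the resource dimension; it assumes the foreign key column is named resource_hash (matching the dimension key above) and that dim_otel_resources exposes service_name as part of its service identity:

-- Span volume, latency, and error rate per service via the resource dimension
select
  r.service_name,
  count(*) as span_count,
  avg(s.duration_ms) as avg_duration_ms,
  sum(case when s.is_error then 1 else 0 end)::float / count(*) as error_rate
from fct_otel_spans s
join dim_otel_resources r on s.resource_hash = r.resource_hash
group by r.service_name
order by error_rate desc;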

Reporting Models (always enabled)

| Model | Description |
| --- | --- |
| rpt_otel_service_inventory | Service inventory showing services with instance, pod, node, host, and container counts |
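
For example (the count column names below are assumptions based on the description above):

-- Which services run on the most hosts?
select service_name, host_count, pod_count, container_count
from rpt_otel_service_inventory
order by host_count desc;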

AI Agent Models (otel_enable_ai_agents: true)

Transforms Claude Code telemetry into analytics-ready tables. Filters for scope_name = 'com.anthropic.claude_code.events' and scope_version = '2.1.*'.

| Model | Description |
| --- | --- |
| domain_agent_events | Individual Claude Code events with promoted attributes |
| marts_agent_session_summary | Session-level aggregations for cost, token, and efficiency analysis |

Event Types (domain_agent_events)

| Event | Description | Key Fields |
| --- | --- | --- |
| user_prompt | User message sent to Claude | prompt_text, prompt_length |
| api_request | Successful API call | model, input_tokens, output_tokens, cache_read_tokens, cost_usd, api_duration_ms |
| api_error | Failed API call | api_error_message, api_status_code, api_attempt_number |
| tool_result | Tool execution completed | tool_name, tool_success, tool_duration_ms, tool_result_size_bytes |
| tool_decision | Permission decision for tool | tool_name, tool_decision (accept/reject), decision_source |

Session Summary Metrics (marts_agent_session_summary)

Token & Cost Metrics:

  • input_tokens_total, output_tokens_total, tokens_total
  • cache_read_tokens_total, cache_creation_tokens_total
  • cost_usd_total, cost_per_prompt

Efficiency Metrics:

  • tool_success_rate - Ratio of successful tool executions
  • api_error_rate - Ratio of failed API calls
  • cache_hit_rate - Ratio of tokens served from cache
  • tokens_per_prompt - Average tokens per user prompt

Usage Metrics:

  • prompt_count, api_request_count, tool_result_count
  • unique_tools_used, tool_accepts, tool_rejects
  • session_duration_seconds, primary_model

Example Queries

-- Cost by user over last 7 days
select
  user_email,
  count(*) as sessions,
  sum(cost_usd_total) as total_cost,
  sum(tokens_total) as total_tokens
from marts_agent_session_summary
where session_start_ts > current_date - interval '7 days'
group by user_email
order by total_cost desc;

-- Tool usage breakdown
select
  tool_name,
  count(*) as invocations,
  avg(tool_duration_ms) as avg_duration_ms,
  sum(case when tool_success = 'true' then 1 else 0 end)::float / count(*) as success_rate
from domain_agent_events
where event_name = 'tool_result'
group by tool_name
order by invocations desc;

-- Model efficiency comparison
select
  model,
  count(*) as requests,
  avg(api_duration_ms) as avg_latency_ms,
  sum(cost_usd) as total_cost,
  avg(output_tokens::float / nullif(input_tokens, 0)) as output_input_ratio
from domain_agent_events
where event_name = 'api_request'
group by model;

APM Models (otel_enable_apm: true)

| Model | Description |
| --- | --- |
| marts_service_red_5m | RED metrics (Rate, Errors, Duration p50/p95/p99) by service, 5-minute buckets |
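
A hedged example query against the RED mart; its exact column names are not documented here, so bucket_ts, request_count, error_count, and duration_p95_ms below are assumptions:

-- Worst p95 latency and error rate per service over the last hour
select
  service_name,
  sum(request_count) as requests,
  sum(error_count)::float / nullif(sum(request_count), 0) as error_rate,
  max(duration_p95_ms) as worst_p95_ms
from marts_service_red_5m
where bucket_ts > now() - interval '1 hour'
group by service_name
order by worst_p95_ms desc;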

Macros

Use these macros to extract attributes from JSON blobs in your own models:

-- Extract string attribute from JSON column
-- Handles dotted keys, converts 'undefined'/'null' strings to SQL NULL
{{ dbt_opentelemetry.attr_str('resource_attributes', 'service.name') }}
{{ dbt_opentelemetry.attr_str('attributes', 'event.name') }}

-- Time bucket aggregation (5 minute buckets)
{{ dbt_opentelemetry.bucket_5m('event_ts') }}

-- Generate resource hash for deduplication
{{ dbt_opentelemetry.resource_hash('resource_attributes') }}
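
These macros compose naturally in a downstream model. A sketch built on stg_otel_logs (the attributes column and boolean is_error are assumptions; the model name is illustrative):

-- models/error_events_5m.sql: error logs bucketed into 5-minute windows,
-- with an event name promoted out of the attributes JSON blob
select
  {{ dbt_opentelemetry.bucket_5m('event_ts') }} as bucket_5m,
  service_name,
  {{ dbt_opentelemetry.attr_str('attributes', 'event.name') }} as event_name,
  count(*) as error_count
from {{ ref('stg_otel_logs') }}
where is_error
group by 1, 2, 3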

Example Project Configuration

# dbt_project.yml
name: my_observability_project
version: 0.0.1

vars:
  # Point to your OTLP data
  otel_database: "iceberg_catalog"
  otel_schema: "otel_raw"

  # Enable APM models
  otel_enable_apm: true

models:
  my_observability_project:
    +schema: analytics

Requirements

  • dbt Core >= 1.6.0, < 3.0.0
  • DuckDB adapter (for DuckDB warehouse)

Development

Pre-commit Hooks

This project uses pre-commit for basic code quality checks.

# Install and run with uvx (no install required)
uvx pre-commit install
uvx pre-commit run --all-files

SQL Linting with SQLFluff

SQLFluff linting requires the full dbt environment. Run it separately:

# After dbt compile (generates manifest for templater)
cd integration_tests
dbt deps && dbt compile

# Lint with dbt templater
uvx sqlfluff lint ../models/ --templater dbt --dialect duckdb

# Or quick lint without dbt context
uvx sqlfluff lint ../models/ --dialect duckdb

For CI, consider adding SQLFluff after dbt build in your workflow.

Contributing

Contributions are welcome! Please open an issue or pull request on GitHub.

License

Apache 2.0 - see LICENSE for details.
