
Feature Request: Respect Existing OpenTelemetry TracerProvider #1199

@apoliakov

Requested here: https://discord.com/channels/1156433345631232100/1166779411920597002/1480698051319300147

Summary

DBOS SDK unconditionally overwrites the global OpenTelemetry TracerProvider during initialization and couples span creation to its own OTLP export pipeline. This makes it impossible for DBOS to integrate with existing APM solutions (Datadog dd-trace, Grafana Agent, etc.) that register their own TracerProvider via @opentelemetry/api.

We request that DBOS become a well-behaved OpenTelemetry citizen: use an existing global TracerProvider when one is already registered, and only create its own when none exists.

Context: Our Setup

We run a NestJS application on ECS Fargate with dd-trace for APM. The tracing architecture:

  • dd-trace (import 'dd-trace/init') auto-instruments NestJS, Fastify, gRPC, Prisma, Redis, and axios. It registers itself as the global OTel TracerProvider, so any library that creates spans via @opentelemetry/api flows through dd-trace into Datadog.
  • Datadog Agent runs as a sidecar container, receiving spans on port 8126 (Datadog's native protocol).
  • dd-trace provides Datadog-specific features we depend on: continuous profiling, Database Monitoring (DBM) propagation, Application Security Monitoring, runtime metrics, and automatic log-trace correlation (dd.trace_id/dd.span_id injection).

We recently adopted DBOS for durable workflow execution. DBOS workflows and steps are invisible to our APM.

The Problem

Issue 1: DBOS Overwrites the Global TracerProvider

When DBOS.launch() is called with enableOTLP: true, it unconditionally replaces the global TracerProvider:

// @dbos-inc/dbos-sdk/dist/src/telemetry/traces.js — installTraceContextManager()
function installTraceContextManager(appName = 'dbos') {
    if (!utils_1.globalParams.enableOTLP) {
        return;
    }
    const { BasicTracerProvider } = require('@opentelemetry/sdk-trace-base');
    const provider = new BasicTracerProvider({
        resource: { attributes: { 'service.name': appName } },
    });
    trace.setGlobalTracerProvider(provider);  // ← Overwrites dd-trace's provider
}

This is called again in the Tracer constructor:

// @dbos-inc/dbos-sdk/dist/src/telemetry/traces.js — Tracer constructor
constructor(telemetryCollector, appName = 'dbos') {
    // ...
    const tracer = new BasicTracerProvider({
        resource: { attributes: { 'service.name': appName } },
    });
    trace.setGlobalTracerProvider(tracer);  // ← Overwrites again
}

Impact: dd-trace's TracerProvider is replaced. Any OTel-based instrumentation (e.g., Prisma's @prisma/instrumentation) that was registered with dd-trace's provider stops correlating with HTTP request traces. dd-trace's own auto-instrumentation (NestJS, gRPC, etc.) continues working because it uses internal references rather than the global provider, but the OTel interop layer is broken.
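The failure mode can be modeled with a toy global registry — a sketch only, not the real @opentelemetry/api, but it captures the last-write-wins semantics of a shared global slot:

```javascript
// Toy model: one shared global slot that every OTel consumer resolves through.
const globalRegistry = { tracerProvider: null };

function setGlobalTracerProvider(provider) {
  globalRegistry.tracerProvider = provider; // unconditional overwrite
}
function getTracerProvider() {
  return globalRegistry.tracerProvider;
}

// 1. dd-trace registers first (as `import 'dd-trace/init'` does at startup)
setGlobalTracerProvider({ name: 'dd-trace' });
// 2. DBOS.launch() registers its BasicTracerProvider afterwards
setGlobalTracerProvider({ name: 'dbos-basic' });

// Every library resolving the global now gets DBOS's provider, not dd-trace's
console.log(getTracerProvider().name); // 'dbos-basic'
```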

Issue 2: Span Creation is Coupled to OTLP Export

Every span creation point has this guard:

// @dbos-inc/dbos-sdk/dist/src/telemetry/traces.js — startSpan()
startSpan(name, attributes, inputSpan) {
    if (!utils_1.globalParams.enableOTLP) {
        return new StubSpan();  // ← No-op when OTLP is disabled
    }
    // ... create real span
}

The same guard exists in runWithTrace():

function runWithTrace(span, func) {
    if (!utils_1.globalParams.enableOTLP) {
        return func();  // ← No trace context propagation
    }
    const { context, trace } = require('@opentelemetry/api');
    return context.with(trace.setSpan(context.active(), span), func);
}

Impact: If enableOTLP is false (the default unless DBOS__CLOUD=true), DBOS creates StubSpan no-ops for every workflow and step. There is no way to get real spans without also enabling DBOS's OTLP export pipeline. Even if dd-trace's provider were not overwritten, DBOS would still produce no spans.

Issue 3: Custom Export Pipeline Instead of Standard OTel SpanProcessors

DBOS uses a bespoke TelemetryCollector that batches spans on a 100ms interval and exports them directly via OTLPTraceExporter:

// @dbos-inc/dbos-sdk/dist/src/telemetry/collector.js
constructor(exporter) {
    this.exporter = exporter;
    this.signalBufferID = setInterval(() => {
        void this.processAndExportSignals();
    }, this.processAndExportSignalsIntervalMs);
}
// @dbos-inc/dbos-sdk/dist/src/telemetry/exporters.js
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-proto');
this.tracesExporters.push(new OTLPTraceExporter({ url: endpoint }));

Impact: Spans are manually collected and exported outside the standard OTel SDK pipeline (BatchSpanProcessor / SimpleSpanProcessor). This means spans don't flow through whatever SpanProcessor is registered on the global TracerProvider. Even if the global provider is dd-trace's, DBOS's spans bypass it entirely.
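The bypass can be sketched with a toy model (onEnd mirrors the real SpanProcessor interface, but everything here is illustrative):

```javascript
// A SpanProcessor registered on the provider only sees spans that the
// provider routes through it on end.
const processed = [];
const provider = {
  processors: [{ onEnd: (span) => processed.push(span.name) }],
  endSpan(span) { this.processors.forEach((p) => p.onEnd(span)); },
};

// Standard pipeline: span flows through the registered processor
provider.endSpan({ name: 'http.request' });

// DBOS-style side channel: the collector exports directly to its own
// exporter, so the provider's processors never see the span
const collector = { buffer: [], export(span) { this.buffer.push(span.name); } };
collector.export({ name: 'workflow.step' });

console.log(processed);        // ['http.request'] — DBOS span never arrived
console.log(collector.buffer); // ['workflow.step']
```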

Net Result

These three issues combined mean DBOS's telemetry is a closed system. The only supported path is:

DBOS → BasicTracerProvider → TelemetryCollector → OTLPTraceExporter → OTLP endpoint

There is no integration point for dd-trace, Grafana Agent, or any other TracerProvider to capture DBOS spans.

Current Workaround

We must run two parallel trace pipelines into the same Datadog backend:

NestJS / Prisma / gRPC / Redis         DBOS workflows & steps
         │                                       │
    dd-trace                               @dbos-inc/otel
    (Datadog protocol)                     (OTLP HTTP)
         │                                       │
         ▼                                       ▼
    DD Agent :8126                       DD Agent :4318
         │                                       │
         └───────────┐          ┌────────────────┘
                     ▼          ▼
                Datadog Backend

This requires:

  • Enabling a second receiver (OTLP) on our Datadog Agent sidecar
  • Exposing an additional port (4318) on the container
  • Installing @dbos-inc/otel and configuring OTLP endpoints
  • Two independent trace pipelines for a single service

It works, but it's operationally complex and the two trace trees (dd-trace HTTP spans and DBOS workflow spans) are not correlated — a workflow triggered by an HTTP request appears as two separate traces in Datadog.

Proposed Changes

Change 1: Respect Existing Global TracerProvider

function installTraceContextManager(appName = 'dbos') {
    const { context, trace } = require('@opentelemetry/api');
    const { AsyncLocalStorageContextManager } = require('@opentelemetry/context-async-hooks');

    // Always set up context propagation (needed for parent-child span linking)
    if (!context['_getContextManager']()) {
        const contextManager = new AsyncLocalStorageContextManager();
        contextManager.enable();
        context.setGlobalContextManager(contextManager);
    }

    // Only create a TracerProvider if none exists. Probe with a throwaway
    // span: a noop provider yields a NonRecordingSpan, whose isRecording()
    // is false. (Comparing constructor.name breaks under minification.)
    const existing = trace.getTracerProvider();
    const probeSpan = existing.getTracer('dbos-probe').startSpan('dbos-probe');
    const isNoopProvider = !probeSpan.isRecording();
    probeSpan.end();

    if (isNoopProvider) {
        const { BasicTracerProvider } = require('@opentelemetry/sdk-trace-base');
        const provider = new BasicTracerProvider({
            resource: { attributes: { 'service.name': appName } },
        });
        trace.setGlobalTracerProvider(provider);
    }
    // else: someone (dd-trace, etc.) already registered — use theirs
}

Remove the duplicate setGlobalTracerProvider call from the Tracer constructor.

Change 2: Decouple Span Creation from OTLP Export

Introduce a separate flag or auto-detect whether span creation should be active:

// Create real spans when ANY TracerProvider is registered (dd-trace, OTel SDK, etc.)
// Export to OTLP only when enableOTLP is true AND endpoints are configured
startSpan(name, attributes, inputSpan) {
    if (!this.spanCreationEnabled) {
        return new StubSpan();
    }
    // ... create real span using trace.getTracer('dbos-tracer')
}

Where spanCreationEnabled is true when:

  • enableOTLP is explicitly true, OR
  • A non-noop global TracerProvider is already registered (indicating an external APM tool)

With this logic:

  • enableOTLP: false + dd-trace registered → real spans, flowing through dd-trace's provider
  • enableOTLP: false + no provider → StubSpan no-ops (current behavior)
  • enableOTLP: true → DBOS creates its own provider and exporter (current behavior)
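The decision can be sketched as a pure function (spanCreationEnabled and isProviderRegistered are hypothetical names for this illustration, standing in for the flag and the non-noop global provider check):

```javascript
// Sketch of the proposed gate: real spans when DBOS's own OTLP pipeline is
// on, OR when an external APM (dd-trace, OTel SDK) has already registered
// a recording global TracerProvider.
function spanCreationEnabled(enableOTLP, isProviderRegistered) {
  return enableOTLP || isProviderRegistered;
}

console.log(spanCreationEnabled(false, true));  // true  — dd-trace present
console.log(spanCreationEnabled(false, false)); // false — StubSpan, as today
console.log(spanCreationEnabled(true, false));  // true  — DBOS's own pipeline
```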

Change 3: Use Standard OTel SpanProcessors

When DBOS creates its own BasicTracerProvider, register export via standard BatchSpanProcessor:

const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-proto');

const provider = new BasicTracerProvider({ ... });

for (const endpoint of tracesEndpoints) {
    provider.addSpanProcessor(
        new BatchSpanProcessor(new OTLPTraceExporter({ url: endpoint }))
    );
}

trace.setGlobalTracerProvider(provider);

This replaces the custom TelemetryCollector for traces. Benefits:

  • Spans flow through the standard OTel pipeline
  • Compatible with any SpanProcessor (dd-trace's, custom sampling, etc.)
  • When an external provider is used, spans export through that provider's processors automatically — no DBOS-side export needed

Expected Behavior After Changes

Scenario 1: DBOS + dd-trace (our use case)

import 'dd-trace/init';  // Registers as global TracerProvider

DBOS.setConfig({ name: 'merchant-api', systemDatabaseUrl: '...' });
await DBOS.launch();
// DBOS sees dd-trace's provider → creates real spans via dd-trace
// Workflow and step spans appear in Datadog APM under the same trace as HTTP spans
// No OTLP endpoint needed

Scenario 2: DBOS standalone with OTLP (e.g., Jaeger, Grafana)

DBOS.setConfig({
    name: 'my-app',
    enableOTLP: true,
    otlpTracesEndpoints: ['http://localhost:4318/v1/traces'],
});
await DBOS.launch();
// No existing provider → DBOS creates BasicTracerProvider + BatchSpanProcessor
// Exports to OTLP endpoint (current behavior, unchanged)

Scenario 3: DBOS on DBOS Cloud

DBOS__CLOUD=true  →  enableOTLP defaults to true  →  current behavior, unchanged

Impact

This change would make DBOS compatible with the broader OpenTelemetry ecosystem. Any APM vendor that registers a TracerProvider (Datadog, Dynatrace, New Relic, Honeycomb, Grafana, etc.) would automatically capture DBOS workflow and step spans with zero additional configuration.

The key principle: DBOS should be a TracerProvider consumer, not a TracerProvider owner — unless no provider exists.
