feat(claude-agent-sdk): add instrumentation for @anthropic-ai/claude-agent-sdk #7603

Open
mr-lee wants to merge 1 commit into DataDog:master from mr-lee:feat/claude-agent-sdk-integration

Conversation

@mr-lee
@mr-lee mr-lee commented Feb 23, 2026

Summary

Adds automatic instrumentation for the Claude Agent SDK (@anthropic-ai/claude-agent-sdk), providing full visibility into agentic sessions via APM tracing and LLM Observability.

Span hierarchy

agent (session)
  └── agent (turn)
       ├── tool ({tool_name})     — dynamic name from SDK hook
       └── agent (subagent-{type}) — dynamic name from agent_type
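
To make the nesting concrete, here is a minimal sketch of the hierarchy above as an in-memory tree. `makeSpan`, `toolSpan`, `subagentSpan`, and `render` are hypothetical helpers written for this illustration, not the dd-trace span API or the code in this PR, and the tool name and agent type are example values.

```javascript
'use strict'

// Hypothetical in-memory span record; not the dd-trace span API.
function makeSpan (kind, name, parent = null) {
  const span = { kind, name, children: [] }
  if (parent) parent.children.push(span)
  return span
}

// Span names follow the PR description: {toolName} and subagent-{agentType}.
function toolSpan (turn, toolName) {
  return makeSpan('tool', toolName, turn)
}

function subagentSpan (turn, agentType) {
  return makeSpan('agent', `subagent-${agentType}`, turn)
}

// Render the tree as indented "kind (name)" lines, two spaces per level.
function render (span, depth = 0) {
  const line = `${'  '.repeat(depth)}${span.kind} (${span.name})`
  return [line, ...span.children.flatMap(child => render(child, depth + 1))]
}

const session = makeSpan('agent', 'session')
const turn = makeSpan('agent', 'turn', session)
toolSpan(turn, 'Bash') // example tool name, chosen for illustration
subagentSpan(turn, 'general-purpose') // example agent_type, chosen for illustration

console.log(render(session).join('\n'))
```

Session and turn are both `agent` kind here, matching the design decision listed below; only tool calls get the `tool` kind.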

Key design decisions

  • Hooks API: Uses the SDK's first-class hooks API (SessionStart, SessionEnd, Stop, PreToolUse, PostToolUse, SubagentStart, SubagentStop, etc.) — not monkey-patching internals
  • Turn spans are agent kind (not workflow) — turns are complete agentic response cycles
  • Dynamic span names: session, turn, {toolName}, subagent-{agentType} instead of static prefixed names
  • Model name splitting: anthropic/claude-sonnet-4-6 → model_name: claude-sonnet-4-6, model_provider: anthropic
  • Turn output: Uses last_assistant_message from SDK's Stop hook (the final agent text for each turn)
  • Rich metadata: start_trigger, project_dir, exit_reason, transcript_path, agent_type, is_interrupt
  • User hooks preserved via mergeHooks (user matchers placed before tracer matchers)
  • Pure ESM package handled via import-in-the-middle with esmFirst: true
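
For reviewers, the ordering rule for user hooks can be sketched as follows. This is an illustration of "user matchers before tracer matchers" only, not the `mergeHooks` in this PR. It assumes the hooks option is an object keyed by event name whose values are arrays of matcher objects (`{ matcher?, hooks: [callback] }`); treat that shape as an assumption here.

```javascript
'use strict'

// Sketch only: concatenates matcher arrays per event, user entries first.
// Assumed shape: { [eventName]: [{ matcher?, hooks: [callback] }] }.
function mergeHooks (userHooks = {}, tracerHooks = {}) {
  const merged = {}
  const events = new Set([...Object.keys(userHooks), ...Object.keys(tracerHooks)])
  for (const event of events) {
    merged[event] = [...(userHooks[event] ?? []), ...(tracerHooks[event] ?? [])]
  }
  return merged
}

// Example matchers with no-op callbacks, for illustration.
const userMatcher = { matcher: 'Bash', hooks: [async () => ({})] }
const tracerMatcher = { hooks: [async () => ({})] }

const merged = mergeHooks(
  { PreToolUse: [userMatcher] },
  { PreToolUse: [tracerMatcher], Stop: [tracerMatcher] }
)

// The user matcher keeps its position ahead of the tracer matcher.
console.log(merged.PreToolUse.indexOf(userMatcher) < merged.PreToolUse.indexOf(tracerMatcher)) // true
```

Building new arrays rather than pushing into the caller's arrays keeps the user's options object unmodified.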

Known gap

No LLM-level spans — the Agent SDK bundles its own Anthropic client internally (never imports @anthropic-ai/sdk), so the existing Anthropic shimmer doesn't fire. This matches dd-trace-py's approach: agent + tool spans only, with token metrics aggregated on the agent span from ResultMessage.
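
As a rough illustration of aggregating token metrics from the result message, the sketch below reads usage counts from a result-like object. The field names `usage.input_tokens` and `usage.output_tokens`, and the helper `tokenMetricsFromResult`, are assumptions made for this example and are not taken from the PR's code.

```javascript
'use strict'

// Sketch only: field names on the result message are assumed, not confirmed by this PR.
function tokenMetricsFromResult (message) {
  const usage = (message && message.usage) || {}
  const inputTokens = usage.input_tokens ?? 0
  const outputTokens = usage.output_tokens ?? 0
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens }
}

// Example result-like message with made-up counts.
const exampleResult = { type: 'result', usage: { input_tokens: 1200, output_tokens: 350 } }

console.log(tokenMetricsFromResult(exampleResult))
// { inputTokens: 1200, outputTokens: 350, totalTokens: 1550 }
```

Because these counts cover a whole session, they cannot be attributed to individual LLM calls, which is the limitation the known gap describes.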

Performance

~186μs overhead per session (10 spans, 22 hook invocations), ~18μs per span. Negligible vs real query() calls.
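
The per-span figure follows from dividing the session overhead by the span count. A quick check of that arithmetic, using only the numbers quoted above (the per-hook figure is derived here and is not stated in the PR):

```javascript
'use strict'

// Numbers quoted in the PR description.
const sessionOverheadUs = 186
const spans = 10
const hookInvocations = 22

const perSpanUs = sessionOverheadUs / spans
const perHookUs = sessionOverheadUs / hookInvocations

console.log(perSpanUs.toFixed(1)) // 18.6
console.log(perHookUs.toFixed(1)) // 8.5
```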

Test plan

  • 41 APM tracing tests (shimmer unit tests + channel-based span tests + wrapQuery integration)
  • 13 LLM Obs tests (session, turn, tool, subagent with spec-aligned assertions)
  • Lint clean (eslint + actionlint)
  • Rebased on current master (single squashed commit)
  • CI: claude-agent-sdk job passes on Node 18 + latest

@mr-lee mr-lee requested review from a team as code owners February 23, 2026 19:41
@mr-lee mr-lee force-pushed the feat/claude-agent-sdk-integration branch 3 times, most recently from ca06588 to 0b55aab Compare February 23, 2026 22:39
@codecov
codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 97.95082% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 37.81%. Comparing base (abc727b) to head (d65cfaf).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...s/datadog-instrumentations/src/claude-agent-sdk.js | 98.11% | 2 Missing ⚠️ |
| ...ges/datadog-plugin-claude-agent-sdk/src/tracing.js | 96.00% | 2 Missing ⚠️ |
| ...ages/datadog-instrumentations/src/helpers/hooks.js | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #7603       +/-   ##
===========================================
- Coverage   73.82%   37.81%   -36.01%     
===========================================
  Files         773      250      -523     
  Lines       35972    11670    -24302     
===========================================
- Hits        26556     4413    -22143     
+ Misses       9416     7257     -2159     
Flag Coverage Δ
aiguard-macos ?
aiguard-ubuntu ?
aiguard-windows ?
apm-capabilities-tracing-macos ?
apm-capabilities-tracing-ubuntu ?
apm-capabilities-tracing-windows ?
apm-integrations-child-process ?
apm-integrations-couchbase-18 ?
apm-integrations-couchbase-eol ?
apm-integrations-oracledb ?
appsec-express ?
appsec-fastify ?
appsec-graphql ?
appsec-kafka ?
appsec-ldapjs ?
appsec-lodash ?
appsec-macos ?
appsec-mongodb-core ?
appsec-mongoose ?
appsec-mysql ?
appsec-node-serialize ?
appsec-passport ?
appsec-postgres ?
appsec-sourcing ?
appsec-stripe ?
appsec-template ?
appsec-ubuntu ?
appsec-windows ?
instrumentations-instrumentation-bluebird ?
instrumentations-instrumentation-body-parser ?
instrumentations-instrumentation-child_process ?
instrumentations-instrumentation-cookie-parser ?
instrumentations-instrumentation-express ?
instrumentations-instrumentation-express-mongo-sanitize ?
instrumentations-instrumentation-express-session ?
instrumentations-instrumentation-fs ?
instrumentations-instrumentation-generic-pool ?
instrumentations-instrumentation-http ?
instrumentations-instrumentation-knex ?
instrumentations-instrumentation-mongoose ?
instrumentations-instrumentation-multer ?
instrumentations-instrumentation-mysql2 ?
instrumentations-instrumentation-passport ?
instrumentations-instrumentation-passport-http ?
instrumentations-instrumentation-passport-local ?
instrumentations-instrumentation-pg ?
instrumentations-instrumentation-promise ?
instrumentations-instrumentation-promise-js ?
instrumentations-instrumentation-q ?
instrumentations-instrumentation-url ?
instrumentations-instrumentation-when ?
llmobs-ai ?
llmobs-anthropic ?
llmobs-bedrock ?
llmobs-claude-agent-sdk 37.81% <97.95%> (?)
llmobs-google-genai ?
llmobs-langchain ?
llmobs-openai ?
llmobs-vertex-ai ?
platform-core ?
platform-esbuild ?
platform-instrumentations-misc ?
platform-shimmer ?
platform-unit-guardrails ?
platform-webpack ?
plugins-azure-durable-functions ?
plugins-azure-event-hubs ?
plugins-azure-service-bus ?
plugins-bullmq ?
plugins-cassandra ?
plugins-cookie ?
plugins-cookie-parser ?
plugins-crypto ?
plugins-dd-trace-api ?
plugins-express-mongo-sanitize ?
plugins-express-session ?
plugins-fastify ?
plugins-fetch ?
plugins-fs ?
plugins-generic-pool ?
plugins-google-cloud-pubsub ?
plugins-grpc ?
plugins-handlebars ?
plugins-hapi ?
plugins-hono ?
plugins-ioredis ?
plugins-knex ?
plugins-langgraph ?
plugins-ldapjs ?
plugins-light-my-request ?
plugins-limitd-client ?
plugins-lodash ?
plugins-mariadb ?
plugins-memcached ?
plugins-microgateway-core ?
plugins-moleculer ?
plugins-mongodb ?
plugins-mongodb-core ?
plugins-mongoose ?
plugins-multer ?
plugins-mysql ?
plugins-mysql2 ?
plugins-node-serialize ?
plugins-opensearch ?
plugins-passport-http ?
plugins-postgres ?
plugins-process ?
plugins-pug ?
plugins-redis ?
plugins-router ?
plugins-sequelize ?
plugins-test-and-upstream-amqp10 ?
plugins-test-and-upstream-amqplib ?
plugins-test-and-upstream-apollo ?
plugins-test-and-upstream-avsc ?
plugins-test-and-upstream-bunyan ?
plugins-test-and-upstream-connect ?
plugins-test-and-upstream-graphql ?
plugins-test-and-upstream-koa ?
plugins-test-and-upstream-protobufjs ?
plugins-test-and-upstream-rhea ?
plugins-undici ?
plugins-url ?
plugins-valkey ?
plugins-vm ?
plugins-winston ?
plugins-ws ?
profiling-macos ?
profiling-ubuntu ?
profiling-windows ?
serverless-azure-functions-client ?
serverless-azure-functions-eventhubs ?
serverless-azure-functions-servicebus ?

Flags with carried forward coverage won't be shown. Click here to find out more.

@mr-lee mr-lee force-pushed the feat/claude-agent-sdk-integration branch 2 times, most recently from 6e28bea to ddaa3f1 Compare February 24, 2026 02:42
@mr-lee mr-lee requested a review from a team as a code owner February 24, 2026 03:11
@mr-lee mr-lee requested review from ida613 and removed request for a team February 24, 2026 03:11
@wconti27 wconti27 added the AI Generated label Mar 4, 2026
@imsherrill
imsherrill commented Mar 5, 2026

This would be really great, but it is much less useful without individual LLM call spans, because that is where cost estimates come from.

@tlhunter
Member

It seems like this is in your wheelhouse @sabrenner

@sabrenner
Collaborator

@tlhunter yep, will be taking a look at this PR throughout the next week (we're aligning on some things internally)

@BridgeAR
Member

BridgeAR commented Apr 9, 2026

@mr-lee could you please rebase and fix reported lint issues already? :)

@mr-lee mr-lee force-pushed the feat/claude-agent-sdk-integration branch from e0cf971 to 65d163b Compare April 10, 2026 00:44
@mr-lee mr-lee requested a review from a team as a code owner April 10, 2026 00:44
@mr-lee mr-lee force-pushed the feat/claude-agent-sdk-integration branch 2 times, most recently from d65cfaf to f90a168 Compare April 10, 2026 02:08
…agent-sdk

Adds automatic instrumentation for the Claude Agent SDK, providing
full visibility into agentic sessions via APM tracing and LLM Obs.

Span hierarchy aligned with trajectory-spec APPENDIX-DD-LLMOBS-MAPPING:

  agent (session)
    └── agent (turn)
         ├── tool ({tool_name})
         └── agent (subagent-{agent_type})

Key design decisions:
- Uses SDK's first-class hooks API (SessionStart, SessionEnd, Stop,
  PreToolUse, PostToolUse, SubagentStart, SubagentStop, etc.)
- Turn spans are `agent` kind (not workflow) per spec
- Dynamic span names: session, turn, {toolName}, subagent-{type}
- Model name split from provider prefix (anthropic/claude-sonnet-4-6)
- Turn output from Stop hook's last_assistant_message
- Captures cwd, transcript_path, agent_type, is_interrupt, start_trigger
- Pure ESM package handled via import-in-the-middle with esmFirst: true
- User hooks preserved via mergeHooks (user matchers before tracer matchers)
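
The model name split listed above can be sketched as follows. `splitModelName` is a hypothetical helper for illustration, not the function in this commit. Returning an undefined provider when there is no `provider/` prefix is an assumption, since the commit message does not say how unprefixed names are handled.

```javascript
'use strict'

// Sketch only: splits "provider/model" on the first slash.
// Behavior for unprefixed or non-string input is an assumption.
function splitModelName (model) {
  if (typeof model !== 'string' || model === '') {
    return { modelName: undefined, modelProvider: undefined }
  }
  const slash = model.indexOf('/')
  if (slash === -1) {
    return { modelName: model, modelProvider: undefined }
  }
  return { modelName: model.slice(slash + 1), modelProvider: model.slice(0, slash) }
}

console.log(splitModelName('anthropic/claude-sonnet-4-6'))
// { modelName: 'claude-sonnet-4-6', modelProvider: 'anthropic' }
```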

Known gap: No LLM-level spans - the Agent SDK bundles its own Anthropic
client internally, so the existing @anthropic-ai/sdk shimmer doesn't fire.
This matches dd-trace-py's approach (agent + tool spans only).

Tests: 41 APM tracing + 13 LLM Obs tests.
@mr-lee mr-lee force-pushed the feat/claude-agent-sdk-integration branch from f90a168 to c17d0d8 Compare April 10, 2026 02:15
Labels

AI Generated, semver-minor

Projects

None yet

6 participants