Add semantic conventions for GenAI agent planning operation by Krishnachaitanyakc · Pull Request #3594 · open-telemetry/semantic-conventions

Krishnachaitanyakc · 2026-04-02T21:50:52Z

Partially addresses #2664

Summary

Adds `plan` as a new `gen_ai.operation.name` value, following the same pattern as `execute_tool`: planning gets a dedicated operation name because it has independent duration, error status, and parent-child structure, which together justify a distinct span. The PR adds no new attributes, only a new enum member that reuses existing ones.

The problem

An agent produces a wrong answer. Was it bad planning (wrong task decomposition) or bad execution (tool returned stale data)? The remediation is different: bad planning needs prompt changes; bad execution needs tool or retrieval fixes.

Without a plan span, the operator sees this:

invoke_agent "research_agent"          1200ms
├── chat "gpt-4o"                       400ms  ← planning? reasoning? unclear
├── execute_tool "web_search"           350ms
├── execute_tool "summarize"            200ms
└── chat "gpt-4o"                       250ms  ← final response? another plan?

The first chat span might be planning, reasoning, or a direct answer attempt. The operator cannot distinguish these failure modes.

With a plan span:

invoke_agent "research_agent"          1200ms
├── plan "research_agent"               400ms  ← planning duration isolated
│   └── chat "gpt-4o"                  400ms  (LLM generates the plan)
├── execute_tool "web_search"           350ms  (step 1)
├── execute_tool "summarize"            200ms  (step 2)
└── chat "gpt-4o"                       250ms  (final response)

Planning latency, errors, and the LLM call that produced the plan are now isolated under a parent boundary. An operator can filter on gen_ai.operation.name = plan, set sampling rules on planning spans independently, and immediately see whether planning or execution consumed the time budget.
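The filtering described above can be sketched without any tracing SDK. The snippet below is a minimal illustration, assuming the spans have already been exported as plain records. The `Span` dataclass, the `TRACE` list, and the `time_by_operation` helper are hypothetical names and are not part of OpenTelemetry; the durations are copied from the example trace.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical exported-span record; not an OpenTelemetry SDK type."""
    span_id: str
    parent_id: Optional[str]
    operation: str      # value of gen_ai.operation.name
    name: str
    duration_ms: int


# The "with a plan span" trace from the description, flattened into records.
TRACE = [
    Span("a", None, "invoke_agent", "research_agent", 1200),
    Span("b", "a", "plan", "research_agent", 400),
    Span("c", "b", "chat", "gpt-4o", 400),
    Span("d", "a", "execute_tool", "web_search", 350),
    Span("e", "a", "execute_tool", "summarize", 200),
    Span("f", "a", "chat", "gpt-4o", 250),
]


def time_by_operation(spans, root_id):
    """Sum durations of the direct children of root_id, keyed by operation name."""
    totals = defaultdict(int)
    for span in spans:
        if span.parent_id == root_id:
            totals[span.operation] += span.duration_ms
    return dict(totals)


budget = time_by_operation(TRACE, root_id="a")
print(budget)  # {'plan': 400, 'execute_tool': 550, 'chat': 250}
```

Only direct children of the agent span are summed, so the `chat` call nested under `plan` is counted once, as part of planning, rather than being mixed in with the final-response `chat` span.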

Why a span, not an attribute or grouping primitive

execute_tool got its own span -- not an attribute on chat -- because tool execution has independent duration, error status, and parent-child structure. Planning has the same properties. A grouping attribute (#3575) correlates sibling spans; a plan span creates a parent boundary with its own duration. You cannot model a parent as an attribute on its own child.
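As a rough illustration of "independent duration and error status", the sketch below models a plan span whose duration and error differ from those of its child. The scenario is invented for illustration: the `Span` dataclass is a hypothetical in-memory type, the 450 ms figure is made up, and `invalid_plan` is an assumed error value, not one defined by this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical in-memory span; not an OpenTelemetry SDK type."""
    operation: str                      # gen_ai.operation.name
    duration_ms: int
    error_type: Optional[str] = None    # None means the span succeeded
    children: List["Span"] = field(default_factory=list)


# Assumed scenario: the LLM call succeeds, but turning its output into a
# usable plan takes extra time and then fails.
chat = Span("chat", duration_ms=400)
plan = Span("plan", duration_ms=450, error_type="invalid_plan", children=[chat])

# Time spent in planning outside the LLM call.
overhead_ms = plan.duration_ms - sum(c.duration_ms for c in plan.children)

# The failure belongs to the planning phase, not to any child span.
failed_children = [c for c in plan.children if c.error_type is not None]

print(overhead_ms, failed_children)  # 50 []
```

In this sketch the error and the extra 50 ms have nowhere to live except on the parent, which is the case the paragraph above argues an attribute on the child `chat` span cannot represent.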

Relationship to `gen_ai.task` (#2912)

A plan formulates strategy before execution; a task executes assigned work. The plan span is the parent of the planning LLM call and a sibling of the task/tool spans that follow.

Cross-provider evidence

| Framework | Planning Hook | Auto-instrumentable | Source |
| --- | --- | --- | --- |
| CrewAI | `CrewPlanner` -- explicit planning phase before task execution | Yes | planning_handler.py |
| LlamaIndex | `SubQuestionQueryEngine` -- question decomposition before sub-queries | Yes | sub_question_query_engine.py |
| LangChain | `AgentExecutor._take_next_step()` -- deprecated, private API | Partial | -- |
| Google ADK | `planner` agent with `plan()` method | Unverified | -- |

Instrumentation SHOULD emit `plan` only when the framework exposes an explicit planning boundary (see the emission rules in spans.yaml).
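One way to read this rule is as a guard around span creation. The sketch below is a minimal illustration under stated assumptions and does not reproduce the rule text in spans.yaml: `FrameworkInfo`, `has_explicit_planning_phase`, `plan_span`, and the `EMITTED` list are hypothetical stand-ins for an instrumentation library's own types and exporter, the flag values are set by hand, and the use of `gen_ai.agent.name` on the plan span is an assumption based on the span names in the example trace.

```python
from contextlib import contextmanager
from dataclasses import dataclass

EMITTED = []  # stand-in for a span exporter


@dataclass(frozen=True)
class FrameworkInfo:
    """Hypothetical description of the framework being instrumented."""
    name: str
    has_explicit_planning_phase: bool


@contextmanager
def plan_span(framework, agent_name):
    """Record a plan span only when the framework exposes a planning boundary."""
    if not framework.has_explicit_planning_phase:
        # No explicit boundary: emit nothing rather than guess.
        yield None
        return
    span = {
        "gen_ai.operation.name": "plan",
        "gen_ai.agent.name": agent_name,  # assumed attribute choice
    }
    try:
        yield span
    finally:
        # Recorded even if the planning body raises.
        EMITTED.append(span)


explicit = FrameworkInfo("CrewAI", has_explicit_planning_phase=True)
opaque = FrameworkInfo("opaque_framework", has_explicit_planning_phase=False)

with plan_span(explicit, "research_agent"):
    pass  # the framework's planning hook would run here

with plan_span(opaque, "research_agent"):
    pass  # no planning boundary is visible, so no plan span is recorded

print(len(EMITTED))  # 1
```

When no boundary is visible the guard yields `None` instead of inferring a planning phase from the first `chat` call, so any `plan` span that does appear in a trace corresponds to a real framework hook.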

Out of scope

Plan-specific attributes (strategy, step.count), reflection, and delegation are deferred to follow-up PRs.

Reference implementation

AgentTelemetry (PyPI: agenttelemetry)

Add plan operation for GenAI agent spans

ce11579

github-actions bot added the enhancement New feature or request label Apr 2, 2026

Krishnachaitanyakc marked this pull request as ready for review April 2, 2026 21:58

Krishnachaitanyakc requested review from a team as code owners April 2, 2026 21:58

trask added the area:gen-ai label Apr 3, 2026

github-project-automation bot added this to Semantic Conventions Triage Apr 4, 2026

github-project-automation bot moved this to Untriaged in Semantic Conventions Triage Apr 4, 2026

lmolkova moved this from Untriaged to Awaiting codeowners approval in Semantic Conventions Triage Apr 6, 2026

trask added this to GenAI Semantic Conventions and Instrumentation libraries Apr 14, 2026

trask moved this to In Progress in GenAI Semantic Conventions and Instrumentation libraries Apr 14, 2026

Krishnachaitanyakc wants to merge 1 commit into open-telemetry:main from Krishnachaitanyakc:plan-operation
