@Nick-heo-eg commented on Jan 26, 2026:

Summary

This PR adds a small set of attributes to the existing gen_ai.evaluation.result event to improve traceability of decision boundaries where multiple alternative outcomes were evaluated.

The design follows the direction discussed in #3244 and does not introduce new events.


Changes

Added four new attributes to model/gen-ai/registry.yaml:

  • gen_ai.evaluation.judgment.phase
  • gen_ai.evaluation.judgment.selected_path
  • gen_ai.evaluation.judgment.alternatives_evaluated
  • gen_ai.evaluation.judgment.human_in_loop

These attributes are referenced from the gen_ai.evaluation.result event definition in model/gen-ai/events.yaml; a sketch of the intended shape follows.
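For context, here is a minimal, hypothetical sketch of how such declarations might look, following the general shape of the semantic-conventions YAML tooling. The group ids, briefs, enum members, stability values, and requirement levels below are illustrative assumptions, not the actual contents of the PR diff.

```yaml
# model/gen-ai/registry.yaml -- illustrative sketch, not the actual diff
groups:
  - id: registry.gen_ai.evaluation.judgment      # group id is an assumption
    type: attribute_group
    display_name: GenAI Evaluation Judgment Attributes
    brief: Attributes describing judgment boundaries evaluated for a GenAI operation.
    attributes:
      - id: gen_ai.evaluation.judgment.phase
        type:
          members:
            - id: pre_execution
              value: "pre_execution"
              stability: development
            - id: post_execution
              value: "post_execution"
              stability: development
        stability: development
        brief: Phase in which the judgment boundary was evaluated.
      - id: gen_ai.evaluation.judgment.selected_path
        type: string
        stability: development
        brief: The outcome selected at the judgment boundary.
        examples: ["allow", "block"]
      - id: gen_ai.evaluation.judgment.alternatives_evaluated
        type: int
        stability: development
        brief: Number of alternative outcomes evaluated at the judgment boundary.
        examples: [2]
      - id: gen_ai.evaluation.judgment.human_in_loop
        type: boolean
        stability: development
        brief: Whether a human was involved in selecting the outcome.
```

```yaml
# model/gen-ai/events.yaml -- illustrative sketch of attaching the attributes
# to the existing event; the event's own id/brief shown here are assumptions
groups:
  - id: event.gen_ai.evaluation.result
    type: event
    name: gen_ai.evaluation.result
    stability: development
    brief: Result of a GenAI evaluation.
    attributes:
      - ref: gen_ai.evaluation.judgment.phase
        requirement_level: opt_in
      - ref: gen_ai.evaluation.judgment.selected_path
        requirement_level: opt_in
      - ref: gen_ai.evaluation.judgment.alternatives_evaluated
        requirement_level: opt_in
      - ref: gen_ai.evaluation.judgment.human_in_loop
        requirement_level: opt_in
```

The opt_in requirement level in the sketch matches the PR's framing that these attributes apply only when a judgment boundary actually exists.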


Motivation

Current GenAI traces capture execution outcomes but do not provide an explicit signal that alternative paths (e.g. allow vs block) were evaluated.

These attributes allow systems to demonstrate that such evaluations occurred, without exposing internal reasoning or policy logic.


Design Rationale

  • Extends existing event: Uses gen_ai.evaluation.result rather than creating a new event
  • Optional attributes: All attributes are opt-in, applying only when judgment boundaries exist
  • Minimal scope: Records only that alternatives were evaluated and which path was selected
  • No reasoning exposure: Does not capture policy logic, scoring details, or decision rationale

Note

Documentation files under docs/gen-ai/ are autogenerated from the registry and event schemas and are not edited directly in this PR.


Related Issue

open-telemetry#3244

Commits

This commit adds attributes to the gen_ai.evaluation.result event to support traceability of decision boundaries where multiple alternative outcomes were evaluated.

Implements discussion from open-telemetry#3244

…t intent

Adds two concrete JSON examples demonstrating judgment boundary attribute usage:
- Content safety pre-execution check with automatic decision
- Cost boundary evaluation with human escalation

Clarifies that judgment boundary attributes are intended for event-level auditability and post-hoc inspection rather than high-cardinality metric aggregation.
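As a rough illustration of those two scenarios, here is a hypothetical sketch of how such events might be populated. It is shown in YAML for consistency with the snippets above, it does not reproduce the commit's actual JSON examples, and all values (including path names such as "escalate") are invented.

```yaml
# Scenario 1 (hypothetical values): content-safety check decided automatically before execution
event_name: gen_ai.evaluation.result
attributes:
  gen_ai.evaluation.judgment.phase: pre_execution           # judged before the operation ran
  gen_ai.evaluation.judgment.selected_path: allow           # outcome chosen at the boundary
  gen_ai.evaluation.judgment.alternatives_evaluated: 2      # e.g. allow vs. block
  gen_ai.evaluation.judgment.human_in_loop: false           # fully automatic decision
---
# Scenario 2 (hypothetical values): cost-boundary evaluation escalated to a human
event_name: gen_ai.evaluation.result
attributes:
  gen_ai.evaluation.judgment.phase: pre_execution
  gen_ai.evaluation.judgment.selected_path: escalate        # invented path name
  gen_ai.evaluation.judgment.alternatives_evaluated: 3      # e.g. proceed, downgrade, escalate
  gen_ai.evaluation.judgment.human_in_loop: true            # a human confirmed the selection
```

The values stay low-cardinality (a phase, a short path name, a count, a boolean), which fits the stated intent of event-level auditability rather than metric aggregation.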
@Nick-heo-eg (Author) commented:

Note: This PR reintroduces the same changes as #3297, which was closed during repository cleanup.
The contents are unchanged; this PR exists to restore the proposal with a clean fork/PR state.

@lmolkova moved this from Untriaged to Awaiting codeowners approval in Semantic Conventions Triage on Jan 26, 2026.