
[Feature]: Enforce output_model validation in EventLoopNode — NodeSpec field defined but never connected #5929

@Waryjustice

Description


Problem Statement

NodeSpec has two fields that were clearly designed for structured output validation:

core/framework/graph/node.py:237

    output_model: type[BaseModel] | None = Field(
        default=None,
        description="Optional Pydantic model class for validating and parsing LLM output. "
        "When set, the LLM response will be validated against this model.",
    )

core/framework/graph/node.py:244

    max_validation_retries: int = Field(default=2, ...)

test_pydantic_validation.py exists and tests these fields on NodeSpec. But nothing in the runtime ever reads them. event_loop_node.py's set_output handler (lines 1798–1824) writes values directly into shared memory with zero schema validation. A node that declares output_model=OrderSchema today behaves identically to one that doesn't.

This means:

  • Malformed or incomplete LLM outputs silently corrupt downstream node inputs
  • Type mismatches only surface as cryptic KeyError or AttributeError deep in the graph
  • Agents have no feedback loop to self-correct structured output failures
  • The output_model field is dead API — it misleads contributors into thinking validation is active
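The silent-corruption failure mode can be sketched in a few lines. This is a hypothetical illustration, not framework code — the dict shape and the downstream arithmetic are invented for the example:

```python
# Hypothetical illustration of the current failure mode: set_output stores
# whatever the LLM produced, and the type error only surfaces downstream.
collected = {"order_id": None, "amount": "12.50"}  # LLM emitted amount as a str

def downstream_total(outputs: dict) -> float:
    # A downstream node assumes amount is a float; with no validation at the
    # source, the mismatch surfaces here as a cryptic TypeError instead.
    return outputs["amount"] * 1.1

try:
    downstream_total(collected)
except TypeError as exc:
    failure = str(exc)  # raised far from the node that produced the bad value
```

The error message points at the consumer, not at the node that set the malformed output, which is exactly why these bugs are hard to trace.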

Proposed Solution

Wire output_model into the set_output handling path in event_loop_node.py. When a node has output_model set:

  1. After all output_keys are collected, attempt output_model(**collected_outputs)
  2. On ValidationError → trigger a RETRY verdict with structured feedback injected into conversation history, e.g.:
     [Output validation failed]: Field 'order_id' is required but was not set.
     Field 'amount' expected float, got str.
     Restate your output using set_output with the correct types.
  3. Retry up to max_validation_retries times (default: 2), then ESCALATE if still failing
  4. On success → store the validated Pydantic model instance (or .model_dump()) in shared memory as usual

Files to Change

core/framework/graph/event_loop_node.py - Read ctx.node_spec.output_model after set_output collection. Call a _validate_outputs() helper. Inject ValidationError feedback as RETRY.

core/framework/graph/node.py - No change needed — field already defined correctly.

core/tests/test_pydantic_validation.py - Extend existing tests to cover the runtime validation path, not just NodeSpec field existence.

What Does NOT Change

  • Nodes without output_model are completely unaffected — zero behaviour change
  • The set_output tool interface is unchanged
  • No new dependencies — pydantic is already a core dependency

Example Usage (after this fix)

class ExtractedOrder(BaseModel):
    order_id: str
    amount: float
    status: Literal["pending", "complete", "refunded"]

node = NodeSpec(
    node_id="extract_order",
    output_model=ExtractedOrder,
    max_validation_retries=3,
    ...
)
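With that config, the retry loop from step 3 could be sketched as below. run_with_validation and attempt_fn are hypothetical stand-ins for the event-loop machinery, not the actual implementation:

```python
# Sketch of the retry loop: attempt_fn(feedback) stands in for one LLM turn
# that returns the dict collected via set_output; feedback is None initially.
from pydantic import BaseModel, ValidationError


def run_with_validation(output_model: type[BaseModel], attempt_fn,
                        max_validation_retries: int = 2):
    feedback = None
    # One initial attempt plus up to max_validation_retries retries.
    for _ in range(max_validation_retries + 1):
        outputs = attempt_fn(feedback)
        try:
            # On success, the validated model instance is stored as usual.
            return "SUCCESS", output_model(**outputs)
        except ValidationError as exc:
            feedback = f"[Output validation failed]: {exc}"
    # Still failing after all retries -> escalate instead of looping forever.
    return "ESCALATE", None
```

This mirrors the existing judge-verdict RETRY pattern: the error text becomes the feedback for the next attempt, and ESCALATE is the terminal outcome.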

Why This Matters

Every node that processes structured data is currently relying on the LLM to get types right by chance. One field name typo or type mismatch causes silent corruption. This fix closes the feedback loop: the agent gets told exactly
what's wrong and retries — the same mechanism already used for judge verdicts.

The infrastructure is all there. output_model is defined. max_validation_retries is defined. The RETRY injection pattern exists. This is purely a wiring task.
