fix: set additionalProperties: false for pydantic models with extra='ignore' #1458

Open
octo-patch wants to merge 1 commit into guidance-ai:main from octo-patch:fix/issue-1050-pydantic-additional-properties

Fixes #1050
Related: #1125

Problem

When a pydantic BaseModel whose extra setting is 'ignore' (either by default or set explicitly) is used as the schema for guidance.json(), the generated JSON grammar allows the LLM to produce arbitrary additional fields, which pydantic then silently discards during validation.

This behaviour wastes tokens and surprises users who expect only the declared fields to be generated, as reported in #1050 and #1125.

Solution

Override model_schema() in GenerateJsonSchemaSafe to add "additionalProperties": false whenever the model's extra_fields_behavior is:

  • None — the implicit pydantic default, which behaves as extra='ignore'
  • "ignore" — explicitly set

Models with extra='allow' retain flexible schemas (pydantic sets additionalProperties: true).
Models with extra='forbid' already receive additionalProperties: false from pydantic itself.

The fix applies recursively to nested pydantic models within $defs, since each model's model_schema() is called independently.
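
As a rough illustration of the rule above, the sketch below applies it to plain JSON Schema dicts. It is not the code in this PR: `close_object_schema` is a hypothetical helper, and the `extra` argument stands in for the model's extra-fields setting rather than being read from pydantic. It only shows the intended end state of the schema, including a nested model under $defs.

```python
from typing import Any, Dict, Optional


def close_object_schema(json_schema: Dict[str, Any], extra: Optional[str]) -> Dict[str, Any]:
    """Return a copy of an object schema, closed when extra is None or 'ignore'.

    Hypothetical helper for illustration only; the PR does this inside
    GenerateJsonSchemaSafe.model_schema().
    """
    result = dict(json_schema)
    # Leave non-object schemas alone, and keep any value that is already set
    # (for example the true/false that pydantic emits for 'allow'/'forbid').
    if result.get("type") != "object" or "additionalProperties" in result:
        return result
    if extra is None or extra == "ignore":
        result["additionalProperties"] = False
    return result


# Each model's schema is processed independently, so a nested model that is
# stored under $defs is closed in the same way as the outer model.
address = close_object_schema(
    {"type": "object", "properties": {"city": {"type": "string"}}},
    None,  # implicit default
)
person = close_object_schema(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "address": {"$ref": "#/$defs/Address"},
        },
        "$defs": {"Address": address},
    },
    "ignore",  # explicitly set
)
# A schema that pydantic already marked as open for extra='allow' is unchanged.
flexible = close_object_schema({"type": "object", "additionalProperties": True}, "allow")
```

Leaving an existing additionalProperties value untouched is a choice made for this sketch, so that the 'allow' and 'forbid' cases keep whatever pydantic already produced.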

Before / After

from pydantic import BaseModel
from guidance import json as gen_json

class MyModel(BaseModel):
    name: str
    age: int

grammar = gen_json(schema=MyModel)

# Before this fix:
grammar.match('{"name": "Alice", "age": 30, "extra": true}')  # -> Match (allowed!)

# After this fix:
grammar.match('{"name": "Alice", "age": 30, "extra": true}')  # -> None (rejected)
grammar.match('{"name": "Alice", "age": 30}')                 # -> Match (correct)

Testing

Added TestAdditionalProperties to tests/unit/library/test_pydantic.py covering:

  • Default extra rejects additional properties
  • Explicit extra='ignore' rejects additional properties
  • extra='allow' still permits additional properties
  • Nested models also reject additional properties

…gnore

When using pydantic BaseModel (with default or explicit extra='ignore') as a
json() schema, the LLM could previously generate arbitrary extra fields that
would be silently discarded by pydantic validation. This wastes tokens and
surprises users expecting only declared fields.

This patch extends GenerateJsonSchemaSafe to override model_schema() and add
additionalProperties: false whenever the model's extra_fields_behavior is None
(implicit default) or ignore (explicitly set). Models with extra='allow'
retain their flexible schemas; extra='forbid' is already handled by pydantic.

The change applies recursively to nested pydantic models as well.

Closes guidance-ai#1050
Related: guidance-ai#1125

Development

Successfully merging this pull request may close these issues.

Default behavior of json generation is likely more verbose than users expect

1 participant