fix(streaming): use field defaults for missing fields in partial streaming #2067
Open
veeceey wants to merge 1 commit into 567-labs:main from
Author
Manual Test Results

Environment

Test 1: Literal default preserved from first partial chunk

>>> from pydantic import BaseModel
>>> from typing import Literal
>>> from instructor.dsl.partial import Partial
>>>
>>> class Person(BaseModel):
...     type: Literal["Person"] = "Person"
...     name: str
...     age: int
...
>>> PartialModel = Partial[Person]
>>> chunks = ['{"name": "Jo', 'hn", "age": 25}']
>>> results = list(PartialModel.model_from_chunks(iter(chunks)))
>>> results[0].type
'Person'
>>> results[0].name
'Jo'

Before fix:

Test 2: Integer default preserved

>>> class Config(BaseModel):
...     retries: int = 3
...     name: str
...
>>> PartialConfig = Partial[Config]
>>> chunks = ['{"name": "tes', 't_config"}']
>>> results = list(PartialConfig.model_from_chunks(iter(chunks)))
>>> results[0].retries
3

Before fix:

Test 3: default_factory preserved

>>> from pydantic import Field
>>>
>>> class Container(BaseModel):
...     items: list[str] = Field(default_factory=list)
...     label: str
...
>>> PartialContainer = Partial[Container]
>>> chunks = ['{"label": "te', 'st"}']
>>> results = list(PartialContainer.model_from_chunks(iter(chunks)))
>>> results[0].items
[]

Before fix:

Test 4: Fields without defaults still get None (unchanged behavior)

>>> class Simple(BaseModel):
...     name: str
...     age: int
...
>>> PartialSimple = Partial[Simple]
>>> chunks = ['{"name": "Jo', 'hn", "age": 25}']
>>> results = list(PartialSimple.model_from_chunks(iter(chunks)))
>>> results[0].age  # age not yet in first chunk
>>> # None (unchanged behavior for fields without defaults)

Result: PASS - fields without defaults remain None

Test 5: Full test suite

Result: PASS - all new and existing tests pass.
Author
Hi team -- friendly ping! Just checking if anyone has had a chance to look at this PR. Happy to address any feedback or questions. Thanks!
Describe your changes

When streaming partial responses with Partial[Model], fields that have default values (e.g., type: Literal["Person"] = "Person") were set to None in every partial chunk until the LLM actually streamed that field's value. This is a problem for frontend rendering, where discriminator/type fields are needed immediately to decide how to render the content.

Root cause: In _build_partial_object(), a field missing from the streamed JSON data was unconditionally set to None (line 183), ignoring the field's declared default value.

Fix: Check whether the field has a default value (field_info.default) or a default factory (field_info.default_factory) and use that instead of None for missing fields. This means:

- type: Literal["Person"] = "Person" -> "Person" from the very first chunk
- retries: int = 3 -> 3 from the very first chunk
- items: list[str] = Field(default_factory=list) -> [] from the very first chunk
- fields without defaults -> None (unchanged behavior)

Before:

After:
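For illustration only, and separate from the PR's own diff, the default-resolution rule described under "Fix" can be sketched as a standalone helper. This assumes Pydantic v2, where an unset FieldInfo.default is the PydanticUndefined sentinel and an unset default_factory is None. The helper name resolve_missing_fields and the Example model are hypothetical and are not part of instructor or of _build_partial_object().

```python
from typing import Any, Literal

from pydantic import BaseModel, Field
from pydantic_core import PydanticUndefined


def resolve_missing_fields(
    model: type[BaseModel], data: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical helper: fill fields absent from `data` with their
    declared default, their default_factory result, or None."""
    resolved = dict(data)
    for name, field_info in model.model_fields.items():
        if name in resolved:
            continue  # value already streamed; leave it alone
        if field_info.default is not PydanticUndefined:
            resolved[name] = field_info.default
        elif field_info.default_factory is not None:
            # Assumes a zero-argument factory such as `list`.
            resolved[name] = field_info.default_factory()
        else:
            resolved[name] = None  # unchanged behavior for fields without defaults
    return resolved


class Example(BaseModel):  # hypothetical model combining the cases above
    type: Literal["Person"] = "Person"
    name: str
    age: int
    items: list[str] = Field(default_factory=list)


partial = resolve_missing_fields(Example, {"name": "Jo"})
```

Here partial holds the streamed name plus the declared default for type, a fresh list for items, and None for age. The sketch returns the default object itself rather than a copy, so a real implementation with mutable defaults would probably want to copy them.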
Issue ticket number and link
Fixes #2054
Checklist before requesting a review