estuary-cdk: remove default _meta value from document schemas#3211

Merged
Alex-Bair merged 3 commits into main from
bair/estuary-cdk-remove-schematizing-default-meta
Aug 22, 2025
@Alex-Bair Alex-Bair commented Aug 21, 2025

Description:

The CDK's `BaseDocument` was schematizing a default value for the `_meta` field, but Pydantic doesn't emit default values for fields that were never explicitly set when the `exclude_unset` argument is `True` during serialization. This caused issues with Dekaf: the schema said the field would exist, but documents didn't actually contain it.
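The mismatch described above can be reproduced with a small sketch. The `Meta` and `Document` models here are hypothetical stand-ins, not the CDK's actual `BaseDocument` definition; only the `BaseModel`, `Field`, `model_json_schema`, and `model_dump` calls are real Pydantic v2 API.

```python
from pydantic import BaseModel, Field


class Meta(BaseModel):
    # Hypothetical stand-in for the CDK's metadata model.
    op: str = "u"


class Document(BaseModel):
    # Hypothetical stand-in for BaseDocument. Pydantic treats names with a
    # leading underscore as private, so the field is exposed via an alias.
    id: int
    meta_: Meta = Field(default=Meta(), alias="_meta")


# The generated JSON schema advertises a default for `_meta`...
meta_schema = Document.model_json_schema()["properties"]["_meta"]
has_schematized_default = "default" in meta_schema

# ...but a document built without `_meta` omits the field entirely when
# serialized with exclude_unset=True.
doc = Document(id=1)
serialized = doc.model_dump(by_alias=True, exclude_unset=True)

print(has_schematized_default)
print(serialized)
```

In this sketch the schema carries a `default` for `_meta` while the serialized document contains only `id`, which is the kind of disagreement a schema consumer such as Dekaf would trip over.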

Removing the default annotation from the `BaseDocument` schema is the fastest, clearly OK way to fix this. Should the CDK inject a default `_meta` value into each document? Maybe, but doing so would take some more investigation and testing.
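One way to drop a default annotation from a single field's schema in Pydantic v2 is a `json_schema_extra` callable. This is a sketch only and not necessarily how this PR implements the change; `drop_default` and `Document` are hypothetical names, and the field type is simplified to a plain dict.

```python
from typing import Any

from pydantic import BaseModel, Field


def drop_default(field_schema: dict[str, Any]) -> None:
    # Remove the "default" annotation from this field's JSON schema only;
    # the runtime default on the model is left untouched.
    field_schema.pop("default", None)


class Document(BaseModel):
    # Hypothetical stand-in for BaseDocument.
    id: int
    meta_: dict[str, str] = Field(
        default={"op": "u"},
        alias="_meta",
        json_schema_extra=drop_default,
    )


meta_schema = Document.model_json_schema()["properties"]["_meta"]
doc = Document(id=1)
serialized = doc.model_dump(by_alias=True, exclude_unset=True)

print("default" in meta_schema)
print(serialized)
```

With this approach the schema no longer promises a value that `exclude_unset=True` serialization would leave out, while the model still falls back to its in-memory default when `_meta` is read.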

Notes for reviewers:

The PR is best reviewed commit-by-commit, since many snapshot files needed to be updated so that they no longer have a default annotation for `_meta`.

I also updated recently failing snapshots to ensure every connector gets built with the new "don't schematize a default _meta value" behavior.


@Alex-Bair Alex-Bair force-pushed the bair/estuary-cdk-remove-schematizing-default-meta branch from 0acc255 to 37b504d Compare August 22, 2025 13:49
Comment thread estuary-cdk/estuary_cdk/capture/common.py
@Alex-Bair Alex-Bair marked this pull request as ready for review August 22, 2025 13:58
@Alex-Bair Alex-Bair requested a review from a team August 22, 2025 13:59
@jgraettinger jgraettinger left a comment

LGTM

@Alex-Bair Alex-Bair merged commit f5d148f into main Aug 22, 2025
92 of 108 checks passed
@Alex-Bair Alex-Bair deleted the bair/estuary-cdk-remove-schematizing-default-meta branch August 22, 2025 14:48