-
Notifications
You must be signed in to change notification settings - Fork 67
fix: Use Pydantic exclude_none=True to omit null namespace fields from catalog serialization #825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…m catalog serialization Re-applies fix from commit 4c5a6d2 that was previously implemented but lost. When PyAirbyte serializes ConfiguredAirbyteCatalog to JSON for passing to connectors, fields with None values (like namespace) were being serialized as explicit null values. The destination CDK validation is strict and rejects explicit null values, but accepts omitted fields entirely. Using exclude_none=True causes Pydantic to omit null fields from JSON output instead of serializing them as explicit nulls, which resolves the issue. Fixes: https://airbytehq-team.slack.com/archives/C06FZ238P8W/p1759500714217339 Devin run: https://app.devin.ai/sessions/6f299c861622435da6fbd89f3aeba145 Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1759870941-fix-namespace-null-handling' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1759870941-fix-namespace-null-handling' Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
Devin, reply here with context: Was this fixed prior but reverted? Did we not merge the prior fix? Or was the fix applicable to a different codepath or model? |
Hi @aaronsteers, Based on my investigation of the git history: Yes, this was previously fixed in PR #748 but that PR never merged. The previous attempt:
Current state on main branch:
This PR (#825):
The issue is occurring because without Let me know if you need any additional context! Requested by @aaronsteers |
📝 WalkthroughWalkthroughUpdated JSON serialization for configured catalogs in sources and destinations to omit None-valued fields by adding exclude_none=True to model_dump_json() calls. No control flow or error handling changes. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Would you like to add a brief note in release docs that catalog JSON may now omit None fields to inform integrators, wdyt? Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte/sources/base.py (1)
755-755
: Consider applying exclude_none=True to other model_dump_json calls
We found additional calls (state_writers, state_providers, _state_backend, _message_iterators, _executors/base) lackingexclude_none
. Would you addexclude_none=True
for consistency and to omit None fields? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/destinations/base.py
(1 hunks)airbyte/sources/base.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/destinations/base.py (2)
airbyte/shared/sql_processor.py (1)
catalog_provider
(226-242)airbyte/shared/catalog_providers.py (1)
configured_catalog
(70-72)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/destinations/base.py (1)
268-268
: LGTM! Null handling fix looks good.This correctly adds
exclude_none=True
to omit None-valued fields from the catalog JSON, which should resolve the "namespace: Null value is not allowed" error when writing to destination-s3-data-lake. The change aligns with how the destination CDK expects catalog data (omitted fields vs. explicit nulls).
Fix namespace null handling for low-code connectors with S3-data-lake destination
Summary
Fixes PyAirbyte namespace handling issue where low-code connectors (like source-tiktok-marketing and source-snapchat-marketing) fail when writing to destination-s3-data-lake with the error "streams.0.stream.namespace: Null value is not allowed".
The fix adds
exclude_none=True
to catalog serialization in bothdestinations/base.py
andsources/base.py
. This makes Pydantic omit null fields entirely from JSON output instead of serializing them as explicit nulls, which the destination CDK can handle.Root cause: When PyAirbyte serializes
ConfiguredAirbyteCatalog
to JSON for passing to connectors, fields withNone
values (likenamespace
) were being serialized as explicit JSON null values ("namespace": null
). The destination CDK validation rejects explicit null values but accepts omitted fields.Prior work: This exact fix was implemented in commit 4c5a6d2 by AJ Steers but was somehow lost in subsequent changes.
Review & Testing Checklist for Human
This is a moderate risk change that requires end-to-end testing:
"namespace": null
)Notes
Summary by CodeRabbit