feat(taps): add optional sanitization for JSON-serializable documents #49

Open
mporracindie wants to merge 2 commits into MeltanoLabs:main from blueprint-data:main

@mporracindie

This pull request adds optional sanitization of MongoDB documents so that extracted records are JSON-serializable. It adds a sanitize_documents configuration option, implements a sanitize_doc utility function, and updates the record-processing logic to apply that function when the option is enabled.

New Feature: Document Sanitization

  • README.md: Updated documentation to include the new sanitize_documents configuration option, detailing its purpose and behavior.
  • tap_mongodb/tap.py: Added the sanitize_documents configuration option to the tap's schema, with a default value of False.
  • tap_mongodb/utils.py: Introduced the sanitize_doc utility function, which converts MongoDB-specific types (e.g., ObjectId, UUID, datetime, Binary) to JSON-serializable equivalents.
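
The conversion described for sanitize_doc can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the choice of str() for ObjectId and UUID, ISO 8601 strings for datetimes, and base64 for binary data are all assumptions, and the bson import is optional so the sketch runs without pymongo installed.

```python
import base64
import datetime
import json
import uuid
from typing import Any

try:  # bson ships with pymongo; the sketch still runs without it
    from bson import ObjectId
except ImportError:
    ObjectId = None


def sanitize_doc(value: Any) -> Any:
    """Recursively convert a document into JSON-serializable values.

    The target representations below are assumptions made for this sketch.
    """
    if isinstance(value, dict):
        return {str(key): sanitize_doc(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_doc(item) for item in value]
    if ObjectId is not None and isinstance(value, ObjectId):
        return str(value)
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if isinstance(value, (bytes, bytearray)):  # bson.Binary subclasses bytes
        return base64.b64encode(bytes(value)).decode("ascii")
    return value  # str, int, float, bool, None pass through unchanged


if __name__ == "__main__":
    example = {
        "session": uuid.uuid4(),
        "created_at": datetime.datetime.now(datetime.timezone.utc),
        "payload": b"\x00\x01",
    }
    # json.dumps would raise TypeError on the raw document
    print(json.dumps(sanitize_doc(example)))
```

Recursing through dicts and lists matters because MongoDB documents commonly nest these types inside subdocuments and arrays.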

Code Updates for Sanitization Integration

  • tap_mongodb/streams.py: Modified the get_records method to apply the sanitize_doc function to documents when the sanitize_documents configuration is enabled. This ensures that all extracted data is JSON-compatible.
  • tap_mongodb/streams.py: Imported the sanitize_doc function to enable its use in the get_records method.
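
The opt-in behavior in get_records might look something like the sketch below. The function name iter_records and its parameters are hypothetical stand-ins for the stream method; the sanitizer is passed in as an argument so the example stands alone, and only the sanitize_documents key and its False default come from the description above.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Document = Dict[str, Any]


def iter_records(
    documents: Iterable[Document],
    config: Dict[str, Any],
    sanitize: Callable[[Document], Document],
) -> Iterator[Document]:
    """Yield documents, sanitizing each one only when the option is enabled.

    Hypothetical stand-in for the stream's get_records method.
    """
    # Read the flag once; a missing key behaves like the default of False.
    enabled = bool(config.get("sanitize_documents", False))
    for doc in documents:
        yield sanitize(doc) if enabled else doc
```

Keeping the default at False means existing pipelines see unchanged output unless they opt in.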

@mporracindie mporracindie requested a review from menzenski as a code owner July 24, 2025 02:26
@mporracindie mporracindie changed the title from "Allow sanitizing documents to ensure JSON-serializability" to "feat(taps): add optional sanitization for JSON-serializable documents" Jul 24, 2025
@edgarrmondragon edgarrmondragon self-assigned this Jul 24, 2025