vai-airbyte

What

Fixes array column serialization errors in source-iterable streams by properly defining array item types in JSONSchema definitions.

Problem: Array columns (e.g., campaigns.labels, emailListIds, channelIds, categories) were being written as null values to the S3 Data Lake destination with DESTINATION_SERIALIZATION_ERROR in the _airbyte_meta column.

Root Cause: Several stream schemas declared arrays with an empty items: {}. An empty schema accepts any JSON value, so destinations like Iceberg/Glue cannot derive a concrete element type for the column.

Affected Streams:

  • email_unsubscribe (emailListIds, channelIds)
  • email_send (categories)
  • email_send_skip (categories)
  • email_subscribe (emailListIds)
  • campaigns (listIds, suppressionListIds, labels)

How

Updated JSONSchema definitions across affected stream schema files to specify explicit item types for all array fields:

Before:

"emailListIds": {
  "type": ["null", "array"],
  "items": {}  
}

After:

"emailListIds": {
  "type": ["null", "array"],
  "items": {
    "type": "integer" 
  }
}
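
To illustrate why the empty items definition matters, here is a minimal sketch of how a typed destination might derive a column type from a schema. The PRIMITIVES table and the column_type helper are hypothetical and written only for this example; they are not the S3 Data Lake destination's actual mapping code.

```python
from typing import Any, Optional

# Hypothetical mapping from JSON Schema primitives to Iceberg-style type names.
# Illustrative only; the real destination's mapping may differ.
PRIMITIVES = {"integer": "long", "number": "double", "string": "string", "boolean": "boolean"}


def non_null_types(schema: dict[str, Any]) -> list[str]:
    """Return the schema's declared types, ignoring "null"."""
    declared = schema.get("type")
    if isinstance(declared, str):
        declared = [declared]
    if not isinstance(declared, list):
        return []
    return [t for t in declared if t != "null"]


def column_type(schema: Any) -> Optional[str]:
    """Return a column type name, or None when no concrete type can be derived."""
    if not isinstance(schema, dict):
        return None
    types = non_null_types(schema)
    if len(types) != 1:
        return None  # no declared type, or a union: nothing concrete to map to
    kind = types[0]
    if kind == "array":
        # An empty or missing "items" schema leaves the element type unknown.
        element = column_type(schema.get("items"))
        return f"list<{element}>" if element else None
    return PRIMITIVES.get(kind)


before = {"type": ["null", "array"], "items": {}}
after = {"type": ["null", "array"], "items": {"type": "integer"}}

print(column_type(before))  # None
print(column_type(after))   # list<long>
```

Under these assumptions, the "before" definition yields no usable column type, while the "after" definition maps to a list of integers.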

Review guide

source_iterable/schemas/campaigns.json - Check listIds, suppressionListIds, labels
source_iterable/schemas/email_unsubscribe.json - Check emailListIds, channelIds
source_iterable/schemas/email_send.json - Check categories (in transactional data)
source_iterable/schemas/email_send_skip.json - Check categories (in transactional data)
source_iterable/schemas/email_subscribe.json - Check emailListIds

Verify that all "items": {} instances have been replaced with proper type definitions.
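
The check above could be automated with a small script. The following is a sketch under stated assumptions: untyped_arrays and scan are hypothetical helper names, and the directory passed to scan would be the source_iterable/schemas folder from the review guide in your own checkout.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def untyped_arrays(schema: Any, path: str = "$") -> Iterator[str]:
    """Yield the path of every array schema whose "items" is missing or empty."""
    if isinstance(schema, dict):
        declared = schema.get("type")
        if isinstance(declared, str):
            types = [declared]
        elif isinstance(declared, list):
            types = declared
        else:
            types = []
        if "array" in types and not schema.get("items"):
            yield path
        # Recurse into every nested value (properties, items, and so on).
        for key, value in schema.items():
            yield from untyped_arrays(value, f"{path}.{key}")
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            yield from untyped_arrays(value, f"{path}[{index}]")


def scan(schema_dir: Path) -> dict[str, list[str]]:
    """Map each *.json file name in schema_dir to its untyped array paths."""
    report: dict[str, list[str]] = {}
    for file in sorted(schema_dir.glob("*.json")):
        hits = list(untyped_arrays(json.loads(file.read_text())))
        if hits:
            report[file.name] = hits
    return report


# In-memory example mirroring the before/after shapes from this PR.
example = {
    "properties": {
        "emailListIds": {"type": ["null", "array"], "items": {}},
        "channelIds": {"type": ["null", "array"], "items": {"type": "integer"}},
    }
}
print(list(untyped_arrays(example)))  # ['$.properties.emailListIds']
```

Calling scan(Path("source_iterable/schemas")) from the connector directory should return an empty dict once every array has an explicit item type.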

User Impact

None expected - this is a schema clarification that aligns with actual data types.

Can this PR be safely reverted and rolled back?

  • [x] YES 💚
  • [ ] NO ❌

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • /format-fix - Fixes most formatting issues.
  • /bump-version - Bumps connector versions.
    • You can specify a custom changelog by passing changelog. Example: /bump-version changelog="My cool update"
    • Leaving the changelog arg blank will auto-populate the changelog from the PR title.
  • /run-cat-tests - Runs legacy CAT tests (Connector Acceptance Tests)
  • /build-connector-images - Builds and publishes a pre-release docker image for the modified connector(s).
  • JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version type=patch changelog='foo' - Bump the Bulk CDK's version. type can be major/minor/patch.
  • Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.

vai-airbyte commented Oct 10, 2025

/bump-version changelog="Fix array schema definitions"

Bump Version job started... Check job output.

🔴 Job completed successfully (no changes, this is sus).

vai-airbyte commented Oct 10, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (d77f212)

github-actions bot commented Oct 10, 2025

source-iterable Connector Test Results

44 tests in 2 suites (2 files), 4m 27s ⏱️
41 passed ✅, 3 skipped 💤, 0 failed ❌

Results for commit 90c38b7.

♻️ This comment has been updated with latest results.

vai-airbyte changed the title from "Fix schema issues in campaigns, email send, and subscriptions" to "🐛 Source Iterable: Fix schema issues in campaigns, email send, and subscriptions" on Oct 10, 2025
github-actions bot commented Oct 10, 2025

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-a73m3aoaf-airbyte-growth.vercel.app

Built with commit 90c38b7.
This pull request is being automatically deployed with vercel-action

vai-airbyte requested a review from agarctfi on October 10, 2025 13:22
vai-airbyte marked this pull request as ready for review on October 10, 2025 13:23
agarctfi left a comment

LGTM!

vai-airbyte merged commit 8f59f03 into master on Oct 10, 2025
48 of 49 checks passed
vai-airbyte deleted the iterable-fix-schemas-vai branch on October 10, 2025 14:35
matteogp pushed a commit that referenced this pull request Oct 10, 2025
…bscriptions (#67602)

Co-authored-by: Octavia Squidington III <[email protected]>