Skip to content

feat(source-mixpanel): update to latest cdk, set up concurrency #55189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 50 commits into from
Apr 14, 2025

Conversation

darynaishchenko
Copy link
Collaborator

What

resolved: https://github.com/airbytehq/airbyte-internal-issues/issues/11795

How

Review guide

User Impact

Can this PR be safely reverted and rolled back?

  • YES πŸ’š
  • NO ❌

@darynaishchenko darynaishchenko self-assigned this Mar 4, 2025
Copy link

vercel bot commented Mar 4, 2025

The latest updates on your projects. Learn more about Vercel for Git β†—οΈŽ

Name Status Preview Comments Updated (UTC)
airbyte-docs βœ… Ready (Inspect) Visit Preview πŸ’¬ Add feedback Apr 14, 2025 4:07pm

@darynaishchenko
Copy link
Collaborator Author

darynaishchenko commented Mar 5, 2025

/format-fix

Format-fix job started... Check job output.

βœ… Changes applied successfully. (40e00b5)

@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Mar 5, 2025
@darynaishchenko darynaishchenko changed the title feat(source-mixpanel): update to latest cdk, /set up concurrency feat(source-mixpanel): update to latest cdk, set up concurrency Mar 5, 2025
@darynaishchenko
Copy link
Collaborator Author

Regression tests:

test_catalog_are_the_same [failed]
Due to DynamicSchemaLoader fields no longer have description and multipleOffield defined in json schema files in control version.
https://json-schema.org/draft-07/schema#" instead of old "http://json-schema.org/draft-07/schema#

Record count
Export has more records in target(records were added between reads)
Engage has more records in target(records were added between reads)

TestDataIntegrity.test_record_schema_match_with_state
Null fields are no longer in record due to cdk update.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I misunderstood the solution, please ping me. Else, I'm good with this change! Thanks for doing the last mile on this connector ❀️


request_params["from_date"] = from_date_value.format("YYYY-MM-DD")
request_params["to_date"] = to_date_value.format("YYYY-MM-DD")
request_params["where"] = f'properties["$time"]>=datetime({time_value})'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand the current solution: We don't have slices of 30 days anymore. This is fine because it was slow and the very low rate limiting anyway, we can't be concurrent and there is risk is doing one query per 30 days instead of one query per page

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we don't have slices of 30 days, previous query with slices and initial state can't be implemented using concurrent cdk. now we have one slice, it should be faster (we don't need to wait 1 min between requests) and uses less requests(better rate limiting).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defined export as StateDelegatingStream, so for the full refresh read we have 30 days slices, which is more safe for the first read.

@darynaishchenko
Copy link
Collaborator Author

/format-fix

@darynaishchenko
Copy link
Collaborator Author

darynaishchenko commented Apr 14, 2025

/format-fix

Format-fix job started... Check job output.

βœ… Changes applied successfully. (bf3b7a1)

@darynaishchenko
Copy link
Collaborator Author

darynaishchenko commented Apr 14, 2025

/format-fix

Format-fix job started... Check job output.

βœ… Changes applied successfully. (a03683a)

@darynaishchenko darynaishchenko merged commit 7bb7b7f into master Apr 14, 2025
27 checks passed
@darynaishchenko darynaishchenko deleted the daryna/source-mixpanel/set-up-concurrency branch April 14, 2025 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/mixpanel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants