-
Notifications
You must be signed in to change notification settings - Fork 30
feat: pass transformations to retriever instance #703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: pass transformations to retriever instance #703
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/pass_transformations_to_custom_retriever_record_selector#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/pass_transformations_to_custom_retriever_record_selector Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
📝 WalkthroughWalkthroughThe changes update the Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Case
participant Factory as ModelToComponentFactory
participant Retriever as TestingCustomRetriever
participant Stream as DeclarativeStream
Test->>Factory: create_custom_component(model, config, **kwargs)
Factory->>Factory: _create_nested_component(..., **kwargs)
Factory->>Retriever: Instantiate with merged kwargs
Factory->>Stream: Instantiate DeclarativeStream with retriever
Test->>Stream: Assert properties
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested reviewers
Would you like to consider adding more test cases for different types of nested component parameter propagation, or is the current coverage sufficient for your needs? Wdyt? Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
unit_tests/sources/declarative/parsers/testing_components.py (1)
49-51
: Simple and effective testing component - nice work!The
TestingCustomRetriever
follows the established pattern of other testing components in this file. The minimal implementation with justpass
is perfect for testing the factory's kwargs propagation functionality without introducing unnecessary complexity. The dataclass decorator and inheritance fromSimpleRetriever
align well with the testing needs described in the PR objectives.One small question though - would it be worth adding the
__test__: ClassVar[bool] = False
attribute like the other testing components to explicitly tell pytest this isn't a test class, wdyt?@dataclass class TestingCustomRetriever(SimpleRetriever): + __test__: ClassVar[bool] = False # Tell Pytest this is not a Pytest class, despite its name + passunit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
949-999
: Nice test coverage for the custom retriever transformation feature!The test effectively validates that transformations are properly passed to custom retrievers. The manifest structure and assertions look good and follow existing patterns.
One thought - would it be worth making the final assertion more specific? Instead of just checking
assert stream.retriever.record_selector.transformations
, maybe verify the actual transformation type and configuration, wdyt?Something like:
- assert stream.retriever.record_selector.transformations + assert len(stream.retriever.record_selector.transformations) == 1 + transformation = stream.retriever.record_selector.transformations[0] + assert isinstance(transformation, AddFields) + assert transformation.fields[0].path == ["extra"]This would make the test more robust by ensuring the correct transformation was actually applied.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(3 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(1 hunks)unit_tests/sources/declarative/parsers/testing_components.py
(2 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2025-01-13T23:39:15.457Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
📚 Learning: 2024-12-11T16:34:46.319Z
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Applied to files:
unit_tests/sources/declarative/parsers/testing_components.py
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[error] 1707-1711: Ruff formatting error: code needs reformatting in this section. Run 'ruff format' to fix.
[error] 1835-1841: Ruff formatting error: code needs reformatting in this section. Run 'ruff format' to fix.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: source-shopify
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (4)
unit_tests/sources/declarative/parsers/testing_components.py (1)
16-16
: Import looks good for the new test component.The
SimpleRetriever
import is appropriately placed with the other declarative component imports and supports the newTestingCustomRetriever
class below.airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
1812-1812
: Method signature updated correctly for kwargs propagationGreat addition of
**kwargs
to the method signature! This enables the keyword argument propagation that the PR aims to achieve.
1844-1844
: Perfect use of dictionary union for parameter mergingThe use of
matching_parameters | matching_kwargs
is exactly right for this use case - it ensures model parameters take precedence over kwargs while still allowing the propagation of additional arguments. This maintains backward compatibility while enabling the new functionality.
1710-1710
: Formatting issue: Missing comma after kwargsThe pipeline failure indicates a Ruff formatting error. The line needs proper formatting - wdyt about adding a comma after
**kwargs
?- model, model_field, model_value, config, **kwargs, + model, model_field, model_value, config, **kwargs⛔ Skipped due to learnings
Learnt from: ChristoGrab PR: airbytehq/airbyte-python-cdk#58 File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0 Timestamp: 2024-11-18T23:40:06.391Z Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers PR: airbytehq/airbyte-python-cdk#174 File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102 Timestamp: 2025-01-14T00:20:32.310Z Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Learnt from: pnilan PR: airbytehq/airbyte-python-cdk#0 File: :0-0 Timestamp: 2024-12-11T16:34:46.319Z Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers PR: airbytehq/airbyte-python-cdk#58 File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65 Timestamp: 2024-11-15T01:04:21.272Z Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
/autofix
|
Should be covered by 25ca5b8 |
What
Currently if we wants transformations to be passed to the record selector when there is a custom retriever, we need to do this.
How
The ModelToComponentFactory will match function arguments not only on model parameters but also on kwargs being passed. If there is a collision, model parameters will be prioritize to keep backward compatibility
Summary by CodeRabbit
New Features
Tests