Skip to content

fix: (CDK) (Manifest) - Add deprecations support and handle deprecation warnings; deprecate url_base and path for HttpRequester #486

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

bazarnov
Copy link
Contributor

@bazarnov bazarnov commented Apr 16, 2025

What

Resolves:

How

Updated Component Models:

  • Modified the HttpRequester to support the new url field and prioritize it.
  • Updated schema definitions, validation logic, and deprecated fields in models.
  • Added logic to handle warnings during component creation (e.g., get_model_deprecations).
  • Ensure the backward compatibility with the deprecated url_base and path fields.

Handling Deprecation Warnings:

  • Added a new deprecation_warnings method to the DeclarativeSource and ManifestDeclarativeSource class to collect and process deprecation notices.
  • Updated the _categorise_groups() to handle deprecation warnings alongside message groups.
  • Added logic to collect these warnings in components using BaseModelWithDeprecations.
  • The new airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py module is added to cover the deprecation warnings for the fields marked as deprecated. This class is static (not autogenerated) and re-used when the declarative_component_schema.py is being generated. In this way, there is no need to set additional validations on top of the existing component implementations.

User Impact

  • No impact is expected, this is not a breaking change.

Deprecation Warning examples

The deprecation rules applied to the declarative_component_schema.yaml are automatically picked up and propagated to the declarative_component_schema.py (auto-gen models), see the example bellow.

# declarative component schema content

HttpRequester:
  required:
    - url
    # - url_base # removed from the required fields
  # .. the rest of the component content
  properties:
    url_base:
        deprecated: true
        deprecation_message: "Use `url` field instead."
        # .. the rest of the deprecated field content
    url:
       # .. the content of the deprecated field replacement
class HttpRequester(BaseModelWithDeprecations):
    type: Literal["HttpRequester"]
    url_base: Optional[str] = Field(
        None,
        deprecated=True, # deprecation flag is set onto the field
        deprecation_message="Use `url` field instead.", # deprecation message is set onto the field, will be used as a part of the warning.
        description="....",
        examples=[],
        title="API Base URL",
    )
    url: Optional[str] = Field(
        None,
        description="....",
        examples=[],
        title="API URL",
    )

Warning example

When using the deprecated field or creating an instance of the class with the deprecated field.

Python default behavior:

DeprecationWarning: Component type: `HttpRequester`. Field 'url_base' is deprecated. Use `url` field instead.

If multiple fields are marked as deprecated and used - more than 1 deprecation message is shown.

DeprecationWarning: Component type: `HttpRequester`. Field 'url_base' is deprecated. Use `url` field instead.

DeprecationWarning: Component type: `HttpRequester`. Field 'path' is deprecated. Use `url` field instead.

Connector Builder behavior:

Screenshot 2025-04-14 at 21 10 37

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features
    • Deprecation warnings for deprecated fields are now integrated and displayed in the Connector Builder UI.
    • Added support for a new url field in HTTP requesters, replacing the deprecated url_base and path fields.
  • Bug Fixes
    • Improved handling and aggregation of deprecation warnings to ensure clear user notifications.
  • Tests
    • Added coverage for deprecation warning scenarios in source mocks.
  • Chores
    • Enhanced code generation to detect and handle deprecated fields automatically in generated models.

@bazarnov bazarnov self-assigned this Apr 16, 2025
@github-actions github-actions bot added bug Something isn't working security labels Apr 16, 2025
@bazarnov bazarnov changed the title fix: (CDK) (Manifest) - Add deprecations support and deprecation warnings for deprecated manifest fields fix: (CDK) (Manifest) - Add deprecations support and handle deprecation warnings Apr 16, 2025
@bazarnov bazarnov marked this pull request as ready for review April 16, 2025 11:14
@Copilot Copilot AI review requested due to automatic review settings April 16, 2025 11:14
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Copy link
Contributor

coderabbitai bot commented Apr 16, 2025

📝 Walkthrough

Walkthrough

This update introduces a mechanism for handling deprecation warnings within Airbyte's declarative source framework. A new base model, BaseModelWithDeprecations, is added to emit warnings when deprecated fields are used and to collect these warnings as log messages. The component factory and source classes are extended to gather and expose deprecation warnings, which are then integrated into the test reading and result categorization processes. Code generation scripts are updated to replace base models with the new deprecation-aware model when deprecated fields are present. Mock classes in unit tests are also updated to support the new interface. Additionally, the HttpRequester component and related schema are updated to deprecate url_base and path in favor of a new url field, with corresponding changes to URL construction logic.

Changes

File(s) Change Summary
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py Introduced BaseModelWithDeprecations to emit and collect deprecation warnings when deprecated fields are accessed or initialized.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Enhanced ModelToComponentFactory to collect deprecation warnings from models, store them, and provide a method to retrieve these warnings. Adjusted HTTP requester creation to support new url field and dynamic URL retrieval in simple retriever creation.
airbyte_cdk/sources/declarative/declarative_source.py
airbyte_cdk/sources/declarative/manifest_declarative_source.py
Added deprecation_warnings method to both DeclarativeSource and ManifestDeclarativeSource classes, returning deprecation log messages from model deprecations.
airbyte_cdk/connector_builder/test_reader/reader.py Modified test reading logic to retrieve deprecation warnings from sources and integrate them into log message aggregation and categorization. Updated method signatures accordingly.
bin/generate_component_manifest_files.py Added logic to replace base models with BaseModelWithDeprecations for classes containing deprecated fields during code generation, and inserted necessary imports. Updated code generation command to preserve deprecation metadata keys (deprecated, deprecation_message).
unit_tests/connector_builder/test_connector_builder_handler.py Updated mock source class in tests to implement the new deprecation_warnings method, returning an empty list.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml Deprecated url_base and path properties in HttpRequester schema, added new url property as replacement, updated descriptions and deprecation messages accordingly.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py Changed HttpRequester to inherit from BaseModelWithDeprecations. Made url_base and path optional and deprecated with messages, added new optional url field. Simplified types in KeyTransformation and DpathFlattenFields.
airbyte_cdk/sources/declarative/requesters/http_requester.py Added optional url attribute to HttpRequester. Updated URL construction to prioritize url or fallback to url_base and path. Added methods get_url and _get_url for flexible URL retrieval. Modified send_request to use new URL construction logic.
airbyte_cdk/sources/declarative/requesters/requester.py Added abstract method get_url to Requester interface to provide full base URL for requests.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant TestReader
    participant DeclarativeSource
    participant ModelToComponentFactory
    participant BaseModelWithDeprecations

    User->>TestReader: run_test_read()
    TestReader->>DeclarativeSource: deprecation_warnings()
    DeclarativeSource->>ModelToComponentFactory: get_model_deprecations()
    ModelToComponentFactory->>BaseModelWithDeprecations: collect _deprecation_logs
    DeclarativeSource-->>TestReader: List[AirbyteLogMessage]
    TestReader->>TestReader: _categorise_groups(..., deprecation_warnings)
    TestReader-->>User: StreamRead (including deprecation warnings in logs)
Loading

Suggested labels

chore, declarative-component-schema, enhancement

Suggested reviewers

  • maxi297
  • lmossman
  • tolik0

Would you like me to suggest adding some documentation or code comments to guide downstream connector developers on how to use or extend BaseModelWithDeprecations for their own deprecated fields? It might help adoption and clarity, wdyt?


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1)

71-91: Overriding __getattribute__ may impact performance.

Overriding __getattribute__ is a powerful technique for intercepting attribute access, but it gets called for every attribute access which could impact performance. Have you considered using __getattr__ instead, which is only called when an attribute is not found through normal means? It would be less comprehensive but more performant.

Also, the exception handling seems to swallow all attribute errors, which might mask legitimate errors. Perhaps a more targeted exception handling approach would be better?

-    def __getattribute__(self, name: str) -> Any:
-        """
-        Show warnings for deprecated fields during field usage.
-        """
-
-        value = super().__getattribute__(name)
-
-        try:
-            model_fields = super().__getattribute__(FIELDS_TAG)
-            field_info = model_fields.get(name)
-            is_deprecated_field = (
-                field_info.field_info.extra.get(DEPRECATED, False) if field_info else False
-            )
-            if is_deprecated_field:
-                deprecation_message = field_info.extra.get(DEPRECATION_MESSAGE, "")
-                self._deprecated_warning(name, deprecation_message)
-        except (AttributeError, KeyError):
-            pass
-
-        return value
+    def __getattr__(self, name: str) -> Any:
+        """
+        Called only for attributes that don't exist on the instance,
+        so we don't need to do anything here - just raise the standard AttributeError.
+        """
+        raise AttributeError(f"{self.__class__.__name__} has no attribute '{name}'")
+
+    def __getattribute__(self, name: str) -> Any:
+        """
+        Show warnings for deprecated fields during field usage.
+        """
+
+        # Skip our special processing for dunder methods and our internal attributes
+        if name.startswith('__') or name == DEPRECATION_LOGS_TAG:
+            return super().__getattribute__(name)
+
+        value = super().__getattribute__(name)
+
+        # Only check for deprecation if this is a field defined in the model
+        try:
+            model_fields = super().__getattribute__(FIELDS_TAG)
+            if name in model_fields:
+                field_info = model_fields[name]
+                if field_info.field_info.extra.get(DEPRECATED, False):
+                    deprecation_message = field_info.field_info.extra.get(DEPRECATION_MESSAGE, "")
+                    self._deprecated_warning(name, deprecation_message)
+        except AttributeError:
+            # This happens when FIELDS_TAG isn't available, which is fine
+            pass
+
+        return value
bin/generate_component_manifest_files.py (1)

32-87: Solid implementation of base model replacement for deprecated fields.

The function looks great and serves its purpose well. It correctly identifies classes with deprecated fields and replaces their base model with BaseModelWithDeprecations.

Just noticed a tiny typo in the comment on line 62: "warinings" should be "warnings", wdyt?

Overall, great approach using regex to handle the class detection and replacement!

-  # update the imports to include the new base model with deprecation warinings
+  # update the imports to include the new base model with deprecation warnings
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

746-757: Well-implemented deprecation collection methods

These new methods are clearly implemented:

  • get_model_deprecations() provides a public interface to access collected deprecation logs
  • _collect_model_deprecations() properly handles the collection of deprecation warnings with duplicate prevention

The implementation follows good practices with appropriate attribute checking and exclusion of duplicate entries.

Would it be clearer to add a type hint directly to the class attribute declaration rather than just in the initialization? Something like:

_deprecation_logs: List[AirbyteLogMessage] = []  # As a class attribute

This would make the type more visible when reviewing the class, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf998bd and ddfde29.

📒 Files selected for processing (7)
  • airbyte_cdk/connector_builder/test_reader/reader.py (5 hunks)
  • airbyte_cdk/sources/declarative/declarative_source.py (2 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (2 hunks)
  • airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • bin/generate_component_manifest_files.py (4 hunks)
  • unit_tests/connector_builder/test_connector_builder_handler.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (3)
airbyte_cdk/sources/declarative/declarative_source.py (1)
  • deprecation_warnings (41-45)
unit_tests/connector_builder/test_connector_builder_handler.py (1)
  • deprecation_warnings (821-822)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
  • get_model_deprecations (746-750)
airbyte_cdk/sources/declarative/declarative_source.py (2)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
  • deprecation_warnings (127-128)
unit_tests/connector_builder/test_connector_builder_handler.py (1)
  • deprecation_warnings (821-822)
unit_tests/connector_builder/test_connector_builder_handler.py (2)
airbyte_cdk/sources/declarative/declarative_source.py (1)
  • deprecation_warnings (41-45)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
  • deprecation_warnings (127-128)
airbyte_cdk/connector_builder/test_reader/reader.py (4)
airbyte_cdk/sources/declarative/declarative_source.py (1)
  • deprecation_warnings (41-45)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
  • deprecation_warnings (127-128)
unit_tests/connector_builder/test_connector_builder_handler.py (3)
  • deprecation_warnings (821-822)
  • primary_key (802-803)
  • cursor_field (806-807)
airbyte_cdk/connector_builder/models.py (1)
  • LogMessage (25-29)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1)
  • BaseModelWithDeprecations (27-113)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Check: 'source-shopify' (skip=false)
🔇 Additional comments (22)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (6)

1-5: Nice explanatory comment about the purpose of this file.

The comment clearly explains that this is a static class model used for displaying deprecation warnings. Good practice to mention this upfront!


21-24: Clear constant definitions.

Using constants for field names and attributes makes the code more maintainable and less prone to typos. Good practice!


42-47: Config setting to allow extra fields.

Setting the extra = "allow" is a good choice here as it ensures models can have additional fields not explicitly defined, which is important for backward compatibility when dealing with deprecated fields.


51-70: Good implementation of checking for deprecated fields during initialization.

The method properly checks if initialized fields are marked as deprecated and triggers warnings accordingly. I like the use of the field's extra attributes to store deprecation metadata.


92-114: Solid implementation of warning method.

The method emits a warning using the standard Python warning system and also collects it as an Airbyte log message. Good documentation and clear parameter names.


16-19:

❓ Verification inconclusive

Global warning format looks good, but have you considered potential side effects?

Redefining the global warnings.formatwarning affects all warnings in the application, not just deprecation warnings from this model. Is this the intended behavior or would a more targeted approach be better, wdyt?


🏁 Script executed:

#!/bin/bash
# Check if warnings.formatwarning is modified elsewhere in the codebase
rg -A 1 "warnings\.formatwarning\s*=" --type py

Length of output: 299


Global warning format change may affect all warnings globally – can you confirm this behavior is intended, or would a more targeted approach be preferable?

I noticed that the global override of warnings.formatwarning in airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (lines 16–19) is the only instance in the codebase. This means any warning, not just deprecation warnings, will be formatted by this lambda. Could you verify if this global effect is desired, or would you consider scoping it (for example, using a context manager) to avoid unintended side effects? wdyt?

airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)

18-26: Clean import addition for AirbyteLogMessage.

Adding AirbyteLogMessage to the existing imports from airbyte_cdk.models is tidy and follows the project's import style.


127-129: Well-implemented deprecation_warnings method.

The method follows a consistent pattern with other methods in the class, properly delegating to the constructor's get_model_deprecations() method. The fallback to an empty list is a good defensive programming practice.

airbyte_cdk/sources/declarative/declarative_source.py (2)

7-11: Clean import additions.

You've correctly added the necessary imports for the new method while maintaining the project's import style.


41-45: Good base implementation of deprecation_warnings.

Providing a default implementation that returns an empty list is a solid approach for the base class. This ensures that all subclasses have a consistent interface, even if they don't override this method.

unit_tests/connector_builder/test_connector_builder_handler.py (2)

10-10: Updated typing import to include List.

The import update ensures that List is available for the type annotation in the new method.


821-822: Added mock implementation for deprecation_warnings.

Your mock implementation matches the base class behavior, returning an empty list. This ensures tests will continue to work with the updated interface.

bin/generate_component_manifest_files.py (2)

103-105: Integration with post-processing looks good.

The function call to replace_base_model_for_classes_with_deprecated_fields is well-placed in the post-processing pipeline, ensuring that all generated code gets the appropriate base class replacements before being written to disk.


141-146: Great addition of field extra keys for deprecation support.

The command-line arguments for datamodel-codegen are well-documented with clear comments explaining their purpose. This enables the code generator to preserve the deprecation metadata from the YAML schema files.

airbyte_cdk/connector_builder/test_reader/reader.py (4)

116-117: Good addition of deprecation warnings retrieval.

This line properly connects to the new deprecation_warnings() method on the source, supporting the broader feature of collecting and exposing deprecation warnings.


133-133: Passing deprecation warnings to _categorise_groups looks good.

Simple but effective update to pass the newly retrieved deprecation warnings to the method for processing.


246-250: Well-typed parameter addition for deprecation warnings.

The optional parameter is properly typed and has an appropriate default value. This ensures backward compatibility while enabling the new functionality.


311-321: Clean implementation of deprecation warning processing.

The new code follows the existing pattern for processing message groups, which is great for consistency. I like the use of Python's pattern matching feature for type checking.

The error handling for unknown types of deprecation warnings matches the existing error handling for unknown message group types, maintaining consistency in the codebase.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

30-30: Import addition looks good

Adding AirbyteLogMessage to the imports is necessary for handling the new deprecation logging functionality.


111-114: Clean imports for new deprecation functionality

The imports for the new deprecation-related components are clearly organized and properly separated in their own import statement.


591-592: Good initialization of deprecation logs list

Adding _deprecation_logs as a class attribute initialized with an empty list in the constructor is the right approach. The comment provides helpful context.


739-743: Proper handling of model deprecations

The code correctly checks if the model is an instance of BaseModelWithDeprecations before attempting to collect deprecation logs. This maintains compatibility with existing models while adding support for the new deprecation-aware models.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddfde29 and a308674.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (2 hunks)
  • airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (3)

49-70: LGTM! Good initialization handling for deprecated fields

The implementation properly initializes _deprecation_logs as an instance attribute (addressing previous review comments) and correctly checks for deprecated fields during component initialization.


17-19: Clean warning format customization

Nice simplification of the warning format that focuses on what's important - the category and message, without the noisy file/line information.


91-112: The warning message implementation looks good

The method properly emits warnings through Python's built-in system and collects them in the instance's log collection. This dual approach ensures visibility both in standard output and in the Connector Builder UI.

@bazarnov bazarnov changed the title fix: (CDK) (Manifest) - Add deprecations support and handle deprecation warnings fix: (CDK) (Manifest) - Add deprecations support and handle deprecation warnings; deprecate url_base and path for HttpRequester Apr 16, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

4184-4184: Missing newline at end of file.

The static analysis tool detected that there's no newline character at the end of the file. While this won't affect functionality, it's a common convention to have files end with a newline. Would you like to add one, wdyt?

- '{{ "goodbye, cruel world" | regex_search("goodbye,\s(.*)$") }} -> "cruel world"'
+ '{{ "goodbye, cruel world" | regex_search("goodbye,\s(.*)$") }} -> "cruel world"'
+ 
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 4184-4184: no new line character at the end of file

(new-line-at-end-of-file)

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

2173-2285: Consider adding a code example for the new URL approach in the documentation.

The field documentation is clear, but would it be helpful to add a short example in the class docstring that shows how to migrate from url_base + path to the new unified url approach? This could make it even easier for developers to understand the migration path, wdyt?

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

111-114: Minor import consolidation question.

Imports from base_model_with_deprecations look straightforward. Would you consider grouping them in a single line for cleanliness, or are you good with this layout? wdyt?


591-592: Optional data structure improvement.

Storing deprecation logs in a list works but checking for duplicates can be O(n). Would you consider using a set to handle membership checks if these logs grow large? wdyt?


746-750: Refine return type for deprecation retrieval?

Currently returns List[Any]. Would you consider specifying List[AirbyteLogMessage] since you are returning log messages? wdyt?


752-766: Optimize duplicate detection loop if many logs are expected.

Checking if a log is in _collected_deprecation_logs runs in O(n) time with a list. If performance becomes a concern, how about using a set for constant-time membership checks? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f63f195 and 8dd1200.

📒 Files selected for processing (6)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
  • airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (5 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (8 hunks)
  • airbyte_cdk/sources/declarative/requesters/http_requester.py (6 hunks)
  • airbyte_cdk/sources/declarative/requesters/requester.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/sources/declarative/requesters/requester.py (2)
airbyte_cdk/sources/declarative/requesters/http_requester.py (1)
  • get_url (130-143)
airbyte_cdk/sources/types.py (1)
  • StreamSlice (66-160)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1)
  • BaseModelWithDeprecations (27-122)
airbyte_cdk/sources/declarative/requesters/http_requester.py (1)
  • HttpRequester (37-478)
🪛 YAMLlint (1.35.1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml

[error] 4184-4184: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (20)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1)

27-123: Impressive implementation of the deprecation handling system!

The BaseModelWithDeprecations class is well-designed with clear documentation and separation of concerns. The approach of emitting both standard Python warnings and collecting custom Airbyte log messages is elegant and practical.

Just one thought - I notice you're initializing _deprecation_logs as an instance variable in the constructor, which is great. This addresses the shared state concern that was raised in a previous review. The implementation looks solid now!

airbyte_cdk/sources/declarative/requesters/requester.py (1)

37-48: LGTM! Clean addition of the get_url abstract method.

The method signature is well-defined with appropriate type annotations and the docstring clearly explains its purpose. This addition provides a good abstraction for URL handling that can be implemented by concrete requester classes.

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3)

1871-1890: Good approach for deprecating url_base while maintaining backward compatibility.

The deprecation is clearly marked with an informative message pointing users to the new url field. The description update properly informs users that this field is deprecated.


1891-1909: Well-structured replacement url property with matching capabilities.

The new url property maintains all the interpolation contexts and examples from url_base, ensuring a smooth transition for users. Nice implementation!


1911-1914: Consistent approach with the path deprecation.

The deprecation message for path consistently directs users to use the url field instead, which aligns well with the overall approach.

airbyte_cdk/sources/declarative/requesters/http_requester.py (6)

58-60: Looks like a clean approach to field deprecation!

Adding the new optional url field while keeping url_base allows for backward compatibility. This graceful transition will help users to migrate their existing connectors at their own pace.


73-84: Good implementation of the deprecation transition.

The __post_init__ logic handles both new and deprecated fields appropriately, and I appreciate that you added clear comments to indicate which fields are deprecated. This helps other developers understand the code evolution.


130-144: Well-implemented new get_url method.

This method follows the same pattern as other getters in the class and properly handles the interpolation context for the URL template. Good work on maintaining consistency with the existing codebase style.


145-174: Solid implementation of URL resolution logic.

The _get_url method nicely handles the different URL construction scenarios with a clear priority order: using url_base + path if url_base exists, otherwise using url + path if path exists, or just url. This ensures backward compatibility while supporting the new approach.


404-414: Good update to the _join_url method.

Making the path parameter optional is consistent with the overall deprecation approach, and the documentation has been updated to reflect this change.


454-459: Nice refactoring of the URL construction in send_request.

Using the new _get_url method here centralizes the URL construction logic, which makes the code more maintainable and consistent. Good example of the DRY principle.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (7)

11-13: Good addition of the new import.

The import of BaseModelWithDeprecations is properly organized, and it's a good practice to have the deprecation mechanism in a separate module for better code organization.


885-897: Type field added and field optionality improved for KeyTransformation.

The required type field and making prefix and suffix optional with improved type hints will help with model validation. This should provide better flexibility for users.


918-922: Consistent update to the key_transformation field type.

Making this field optional aligns with the changes to the KeyTransformation class. Good attention to detail in ensuring consistency across model relationships.


2173-2174: Great choice of base class for enabling deprecation warnings.

Changing the base class to BaseModelWithDeprecations is the key change that enables the whole deprecation warning mechanism. This is an elegant way to add this functionality without affecting existing code too much.


2175-2187: Properly structured deprecation of url_base.

The field is now optional, correctly marked as deprecated, and includes a helpful message pointing users to the new url field. The description has also been updated to indicate it's deprecated.


2188-2198: Well-documented new url field.

The new field has comprehensive documentation and examples that match the style of the previous url_base field. This will make it clear to users how to migrate to the new field.


2199-2210: Properly structured deprecation of path.

Like with url_base, the path field is correctly marked as deprecated with an informative message directing users to the new approach. This consistency is important for a smooth transition.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

30-30: Imports look good.

The addition of AirbyteLogMessage, FailureType, and Level appears consistent with the rest of the codebase. No issues spotted.


740-742: Conditional call to collect deprecations.

This is a solid approach for automatically collecting deprecation logs. No concerns here.

@@ -880,20 +882,17 @@ class FlattenFields(BaseModel):


class KeyTransformation(BaseModel):
prefix: Optional[Union[str, None]] = Field(
type: Literal["KeyTransformation"]
Copy link
Contributor Author

@bazarnov bazarnov Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These corrects are auto-generated based on the declarative_component_schema.yaml as of this change and are not the part of the PR!

@@ -2171,11 +2170,13 @@ class SessionTokenAuthenticator(BaseModel):
parameters: Optional[Dict[str, Any]] = Field(None, alias="$parameters")


class HttpRequester(BaseModel):
class HttpRequester(BaseModelWithDeprecations):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are related to the deprecation functionality being added by this PR.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (2)

17-19: Consider the implications of globally modifying warning format

This change to warnings.formatwarning affects all warnings in the application, not just the ones from this class. Would it make more sense to scope this to only your deprecation warnings by using a custom warning display function instead of modifying the global one, wdyt?

-warnings.formatwarning = (
-    lambda message, category, *args, **kwargs: f"{category.__name__}: {message}"
-)
+def _display_deprecation_warning(message: str, category: Type[Warning], *args, **kwargs) -> None:
+    formatted_message = f"{category.__name__}: {message}"
+    # Use the warnings module to emit the warning, but with custom formatting
+    warnings._showwarnmsg_impl(warnings.WarningMessage(
+        message=message, category=category, filename=args[0], 
+        lineno=args[1], file=args[2], line=args[3]
+    ))

Or at minimum, add a comment explaining the global modification and its potential impact on other parts of the codebase.


129-131: Consider optimizing duplicate detection

The current approach compares entire AirbyteLogMessage objects to check for duplicates. For large numbers of deprecation warnings, would it be more efficient to compare just the message strings instead, wdyt?

-        if deprecation_log_message not in self._deprecation_logs:
+        # Check if this message text already exists in the logs
+        if not any(log.message == deprecation_log_message.message for log in self._deprecation_logs):
             # Avoid duplicates in the deprecation logs
             self._deprecation_logs.append(deprecation_log_message)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between baf1058 and c94363e.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/models/base_model_with_deprecations.py (2)

49-60: The initialization order looks good

The implementation looks solid - you've addressed the previous concern about the class-level _deprecation_logs attribute by initializing it at the instance level in __init__. The order of operations (initializing Pydantic internals first, then processing fields) makes sense.


1-132: Overall implementation is robust and well-documented

This class provides a solid foundation for handling deprecation warnings in declarative components. I particularly like how thorough the docstrings are and the dual approach of emitting both Python warnings and Airbyte log messages.

One thing to consider: this implementation warns about deprecated fields but doesn't actually handle migration logic. For fields that are deprecated in favor of new ones (like url_base/pathurl), would it make sense to add automatic migration logic to maintain backward compatibility while still encouraging the new field usage, wdyt?

Copy link
Contributor

@lmossman lmossman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a couple small thoughts on the declarative_component_schema - will leave python review to @maxi297 or someone else on the python side

@bazarnov bazarnov requested a review from lmossman April 21, 2025 20:13
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3)

1890-1907: Validate the new url field definition
The url property neatly replaces the url_base+path combo, and the interpolation contexts/examples look consistent. Do we need to update any spec documentation or add it to a required block when both url_base and path are absent? wdyt?


1909-1913: Ensure path deprecation consistency
path is also deprecated in favor of url, matching the url_base pattern. Would it be helpful to include a migration example in the deprecation_message or description to guide users through updating manifests? wdyt?


4182-4182: Add newline at end-of-file
YAMLlint flagged a missing newline at EOF. Could you add a blank line after line 4181 to satisfy the linter? wdyt?

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 4182-4182: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f743f2 and 97c83d9.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
🧰 Additional context used
🪛 YAMLlint (1.35.1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml

[error] 4182-4182: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1871-1874: Confirm url_base deprecation alignment
I see url_base is now marked as deprecated with a clear deprecation_message and updated description—nice work! Should we double‑check that it’s been removed from any required lists or external docs to avoid confusion? wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working security
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants