Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(low-code cdk): add dynamic stream config to check connection #450

Merged

Conversation

lazebnyi
Copy link
Contributor

@lazebnyi lazebnyi commented Mar 28, 2025

Fixed: https://github.com/airbytehq/airbyte-internal-issues/issues/12083

This PR modifies the check connections functionality and adds dynamic stream check configurations, allowing users to specify check settings in a single component for different stream types.

Summary by CodeRabbit

  • New Features

    • Added dynamic stream configuration options for connection checks, allowing users to flexibly specify stream parameters and receive improved feedback during connection validations.
    • Introduced a new configuration class for dynamic streams, enhancing the structure and functionality of connection checks.
  • Bug Fixes

    • Improved error handling during connection checks to provide clearer feedback when issues arise.
  • Tests

    • Expanded automated test coverage to ensure the stability and reliability of the dynamic stream connection checks, including validation for missing fields and handling various scenarios.

@lazebnyi
Copy link
Contributor Author

lazebnyi commented Mar 28, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@github-actions github-actions bot added the enhancement New feature or request label Mar 28, 2025
Copy link
Contributor

coderabbitai bot commented Mar 28, 2025

📝 Walkthrough

Walkthrough

This pull request introduces support for dynamic stream checks by adding a new data class, DynamicStreamCheckConfig. The changes update the public interface in the checks package, modify the CheckStream class to include dynamic stream configurations and error handling, and adjust schema definitions accordingly. Additionally, the component factory and unit tests have been updated to integrate and verify the new dynamic stream functionality.

Changes

File(s) Change Summary
airbyte_cdk/sources/declarative/checks/__init__.py Added the DynamicStreamCheckConfig class and updated the public interface to include it.
airbyte_cdk/sources/declarative/checks/check_stream.py Introduced DynamicStreamCheckConfig class, added dynamic_streams_check_configs attribute to CheckStream, and improved error handling for connection checks.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
Updated schema definitions to remove stream_names property and add dynamic_streams_check_configs. Introduced DynamicStreamCheckConfig schema/class with required properties.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Added create_dynamic_stream_check_config method and updated create_check_stream to process dynamic stream configurations, including error handling when both dynamic and static configurations are missing.
unit_tests/sources/declarative/checks/test_check_stream.py Extended test coverage with new configurations and test cases to validate dynamic stream check functionality and ensure proper exception handling.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant CF as ModelToComponentFactory
  participant CS as CheckStream
  participant Source as Source
  participant Logger as Logger

  Client->>CF: create_check_stream(model, config)
  CF->>CF: Process dynamic_streams_check_configs
  CF->>CS: Return CheckStream object with dynamic config
  Client->>CS: check_connection()
  CS->>Source: source.streams(config)
  alt Exception Occurs
    CS->>Logger: Log error and warning
    CS->>Client: Return error message
  else Successful Response
    CS->>Client: Return connection check result
  end
Loading

Possibly related PRs

  • feat(low-code cdk): add component resolver and http component resolver #88 – The changes in the main PR are related to the introduction of the DynamicStreamCheckConfig class and its integration into the CheckStream class, which aligns with the enhancements made in the retrieved PR that also focuses on dynamic stream capabilities and the introduction of DynamicDeclarativeStream.
  • feat(low-code): add check dynamic stream #223 – The changes in the main PR are related to the addition of the DynamicStreamCheckConfig class and its integration into the CheckStream class, while the retrieved PR introduces the CheckDynamicStream class, which also enhances the handling of dynamic streams.

Suggested reviewers

  • maxi297 – Would you be willing to review these changes, wdyt?
  • darynaishchenko – Could you take a look at the modifications to dynamic stream handling, wdyt?
  • natikgadzhi – Would you mind reviewing the updates to the component factory and tests, wdyt?
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

332-335: Enhancement for dynamic stream checks:
The new property dynamic_streams_check_configs has been added to the CheckStream definition as an array of dynamic stream check configurations. This looks like a great step to support dynamic stream settings in the check operation. I was wondering—should we remove the legacy stream_names property entirely, or keep it (perhaps for backward compatibility) as optional? wdyt?


336-353: New DynamicStreamCheckConfig schema:
The definition for DynamicStreamCheckConfig is clear and well-structured, requiring both type and dynamic_stream_name. I noticed that stream_count is provided with a default value of 0. Is this the intended behavior, or would a default value of 1 be more appropriate to indicate at least one stream should be checked? Additionally, the enum for type is declared as [ DynamicStreamCheckConfig ] with an extra space; perhaps we should remove the extraneous space for consistency (i.e. [DynamicStreamCheckConfig])? wdyt?

airbyte_cdk/sources/declarative/checks/check_stream.py (2)

15-23: Consider refining the docs or parameter naming?

This new data class looks solid. As an optional refinement, would you consider clarifying the role of stream_count further or renaming it to something like max_streams for clarity? wdyt?


35-35: Optional default value for dynamic streams config?

Currently, dynamic_streams_check_configs is a required parameter. Would you like to set a default to an empty list to avoid potential runtime errors if not provided? wdyt?

- dynamic_streams_check_configs: List[DynamicStreamCheckConfig]
+ dynamic_streams_check_configs: List[DynamicStreamCheckConfig] = field(default_factory=list)
unit_tests/sources/declarative/checks/test_check_stream.py (1)

176-361: Manifest clarity

Defining _MANIFEST_WITHOUT_CHECK_COMPONENT with dynamic streams ensures thorough coverage of multiple cases. As these definitions expand, do you think we should add a smaller or more minimal manifest for quick tests? wdyt?

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)

947-953: Validation for dynamic_stream_name.

Currently, the function doesn't check for empty or invalid dynamic_stream_name. Would you like to enforce a non-empty string to help catch misconfiguration earlier? wdyt?


957-961: Guarding against conflicting usage.

Raising an error if both dynamic_streams_check_configs and stream_names are None is good, but do we need an additional check if both are non-empty? It might help avoid conflicting or confusing configurations. wdyt?


972-976: Preserving model parameters.

In the returned CheckStream, the code sets parameters={} unconditionally. Could we pass model.parameters or {} instead for consistency with other factory methods? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01fa881 and 650165f.

📒 Files selected for processing (6)
  • airbyte_cdk/sources/declarative/checks/__init__.py (2 hunks)
  • airbyte_cdk/sources/declarative/checks/check_stream.py (4 hunks)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • unit_tests/sources/declarative/checks/test_check_stream.py (2 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/checks/check_stream.py

[error] 86-86: Need type annotation for 'dynamic_stream_name_to_generated_streams' (hint: 'dynamic_stream_name_to_generated_streams: dict[, ] = ...')


[error] 88-88: Value of type 'Stream' is not indexable


[error] 90-90: Value of type 'Stream' is not indexable


[error] 101-101: Value of type 'Any | None' is not indexable


[error] 112-112: Value of type 'Any | None' is not indexable


[error] 113-113: Argument 1 to 'len' has incompatible type 'Any | None'; expected 'Sized'


[error] 115-115: Incompatible types in assignment (expression has type 'Stream | None', variable has type 'Stream')

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (10)
airbyte_cdk/sources/declarative/checks/__init__.py (2)

10-13: LGTM - Nice update to the import statements!

The addition of DynamicStreamCheckConfig to the imports looks good. This properly brings in the new class that supports dynamic stream check configurations.


27-27: Perfect update to the public interface!

You've correctly exposed the new DynamicStreamCheckConfig class by adding it to the __all__ list, making it available to importers of this package.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)

45-55: Looks good - great job with the new class definition!

The DynamicStreamCheckConfig class is well-defined with appropriate fields:

  • Required type literal field
  • Required dynamic_stream_name field with descriptive help text
  • Optional stream_count field with a sensible default (0)

Documentation is clear and concise. This will help users understand how to configure dynamic stream checks.


1528-1537: Well-implemented update to CheckStream!

The CheckStream class has been properly enhanced to support dynamic stream configurations. Making the new dynamic_streams_check_configs field optional enables backward compatibility with existing configurations.

This implementation nicely achieves the PR objective of enabling "users to specify check settings for various stream types within a single component."

airbyte_cdk/sources/declarative/checks/check_stream.py (1)

44-49: Good error handling for stream creation

Catching exceptions around streams = source.streams(config=config) is a great addition. The broad exception is acceptable here since we log details and safely return. Nicely done!

unit_tests/sources/declarative/checks/test_check_stream.py (2)

168-174: Appreciate the test config introduction

Declaring _CONFIG with start_date and custom items is a great way to drive dynamic streams. No remarks besides ensuring coverage for edge cases, e.g., empty lists. wdyt?


363-490: Thorough dynamic stream test coverage

These added tests look robust, verifying behavior under various dynamic stream configurations. Great job verifying both HTTP-based and config-based dynamic streams. Kudos for checking multiple permutations in parametric tests.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)

57-61: Imports look good.

The new imports for CheckDynamicStream, CheckStream, and DynamicStreamCheckConfig appear consistent with the changes below. wdyt?


225-227: Neat addition of DynamicStreamCheckConfigModel import.

This aligns with the usage of DynamicStreamCheckConfigModel in the code. wdyt?


569-569: DynamicStreamCheckConfigModel mapping looks consistent.

Adding this to PYDANTIC_MODEL_TO_CONSTRUCTOR ensures it’s recognized properly. wdyt?

@lazebnyi
Copy link
Contributor Author

lazebnyi commented Mar 28, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/checks/check_stream.py (3)

15-22: New data class looks good with clear purpose.

The DynamicStreamCheckConfig class is well-structured with a clear docstring that explains its purpose. I'm wondering about the default value of 0 for stream_count though - would we ever want to check 0 streams, or would a default of 1 make more sense? wdyt?


36-41: Good use of optional parameter with initialization.

The optional parameter for dynamic_streams_check_configs with initialization in __post_init__ is a clean approach.

One thing to consider: should there be a validation to ensure that at least one of stream_names or dynamic_streams_check_configs is provided? We're checking non-emptiness for each individually, but not ensuring at least one exists. wdyt?


100-124: Comprehensive dynamic streams availability checking.

The method effectively checks dynamic streams by using the manifest and mapping streams. However, I noticed we're using type: ignore[attr-defined] comments - could we instead add proper type annotations or conditional checks to avoid these? While they work, they might mask real typing issues in the future. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 650165f and 68741ec.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/checks/check_stream.py (2 hunks)
  • unit_tests/sources/declarative/checks/test_check_stream.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
unit_tests/sources/declarative/checks/test_check_stream.py (4)
airbyte_cdk/sources/declarative/checks/check_stream.py (2)
  • CheckStream (26-158)
  • check_connection (49-82)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • CheckStream (1528-1536)
airbyte_cdk/sources/abstract_source.py (2)
  • streams (74-79)
  • check_connection (59-71)
airbyte_cdk/sources/streams/http/http.py (2)
  • state (387-391)
  • state (394-398)
airbyte_cdk/sources/declarative/checks/check_stream.py (7)
airbyte_cdk/sources/declarative/checks/connection_checker.py (2)
  • ConnectionChecker (12-35)
  • check_connection (18-35)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1)
  • streams (179-188)
airbyte_cdk/sources/streams/http/availability_strategy.py (2)
  • HttpAvailabilityStrategy (17-54)
  • check_availability (18-54)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • DynamicStreamCheckConfig (45-54)
airbyte_cdk/sources/streams/core.py (2)
  • logger (128-129)
  • name (139-143)
airbyte_cdk/sources/streams/concurrent/adapters.py (4)
  • name (186-187)
  • stream_name (328-329)
  • check_availability (213-223)
  • check_availability (383-397)
airbyte_cdk/sources/streams/concurrent/default_stream.py (2)
  • name (47-48)
  • check_availability (54-55)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
🔇 Additional comments (11)
airbyte_cdk/sources/declarative/checks/check_stream.py (5)

43-47: Excellent error handling helper method.

This helper method is a great addition that improves code reuse and ensures consistent error handling throughout the class. The inclusion of the traceback in the log is particularly helpful for debugging.


52-82: Improved connection checking with proper error handling.

The enhanced check_connection method now properly handles exceptions when retrieving streams, which is a significant improvement. The logical separation between checking static streams and dynamic streams is clear and well-structured.


84-98: Well-structured stream availability checking.

Extracting this logic into a separate method improves code organization and readability. The method handles exceptions consistently and returns appropriate status and messages.


126-133: Clean implementation of stream mapping.

The helper method for mapping dynamic stream names to their corresponding generated streams is well-implemented. The use of setdefault is a clean approach to build the mapping.


135-157: Well-designed generated streams availability checking.

This method effectively limits the number of streams to check based on max_count and handles errors consistently. The detailed error messages in both log and return value are helpful for troubleshooting.

unit_tests/sources/declarative/checks/test_check_stream.py (6)

169-175: Good test configuration setup.

The _CONFIG dictionary provides a clean test configuration with realistic values for testing dynamic streams.


177-362: Comprehensive manifest for testing.

The _MANIFEST_WITHOUT_CHECK_COMPONENT dictionary provides a detailed and realistic manifest structure for testing. The hierarchical structure with dynamic streams and components is particularly valuable for testing real-world scenarios.


365-583: Extensive parameterized test cases.

The parameterized test cases cover a wide range of scenarios including static streams, different dynamic stream configurations, error conditions, and edge cases. This provides robust test coverage for the feature.


585-624: Well-structured test implementation.

The test_check_stream1 function effectively uses HttpMocker to simulate API requests and responses, enabling comprehensive testing of the connection checking logic with both static and dynamic streams.


626-644: Good validation error testing.

This test ensures that the component properly validates required fields in dynamic stream configurations, which is essential for catching configuration errors early.


646-655: Effective edge case testing.

This test verifies that a ValueError is raised when only the type is provided without necessary stream names, which is important for catching common configuration errors.

@lazebnyi lazebnyi requested a review from maxi297 March 28, 2025 18:18
Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question but this looks good to me

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/checks/check_stream.py (4)

15-22: Dataclass looks good with a minor docstring issue.

The DynamicStreamCheckConfig dataclass is well-defined with immutability (frozen=True), which is perfect for configuration classes. The attribute names are clear and self-explanatory.

There's a tiny grammar issue in the docstring with two spaces between "streams" and "in" on line 18. Would you mind fixing that? wdyt?

-    what dynamic streams  in the stream template should be updated with value, supporting dynamic interpolation
+    what dynamic streams in the stream template should be updated with value, supporting dynamic interpolation

43-47: Useful error logging helper with potential redundancy.

This helper method nicely encapsulates error logging and formatting. However, there appears to be redundancy in the error logging - the traceback is included in the message AND with the exc_info=True parameter.

Would you consider removing the traceback from the message since it's already provided by exc_info=True? wdyt?

-        logger.error(error_message + f"Error traceback: \n {traceback.format_exc()}", exc_info=True)
+        logger.error(error_message, exc_info=True)

100-124: Dynamic streams availability checking with type annotation concerns.

The method is well-structured but has several type ignore comments. Would you consider adding proper type annotation for the source parameter to avoid these type ignores? Maybe create a protocol or type alias?

Also, for the dynamic stream naming fallback (lines 106-107), do you think a more descriptive fallback name than just "dynamic_stream_{i}" would be helpful for debugging? wdyt?


135-158: Availability checking with stream count limit.

The method effectively checks stream availability with a limit on the number of streams to check. However, it doesn't validate that max_count is non-negative. Would you consider adding validation to ensure max_count doesn't cause unexpected behavior with negative values? wdyt?

-    def _check_generated_streams_availability(
-        self,
-        generated_streams: List[Dict[str, Any]],
-        stream_name_to_stream: Dict[str, Any],
-        logger: logging.Logger,
-        max_count: int,
-    ) -> Tuple[bool, Any]:
+    def _check_generated_streams_availability(
+        self,
+        generated_streams: List[Dict[str, Any]],
+        stream_name_to_stream: Dict[str, Any],
+        logger: logging.Logger,
+        max_count: int,
+    ) -> Tuple[bool, Any]:
+        max_count = max(0, max_count)  # Ensure max_count is non-negative
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68741ec and 8f8659d.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/checks/check_stream.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
airbyte_cdk/sources/declarative/checks/check_stream.py (3)
airbyte_cdk/sources/declarative/checks/connection_checker.py (2)
  • ConnectionChecker (12-35)
  • check_connection (18-35)
airbyte_cdk/sources/streams/http/availability_strategy.py (2)
  • HttpAvailabilityStrategy (17-54)
  • check_availability (18-54)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • DynamicStreamCheckConfig (45-54)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (8)
airbyte_cdk/sources/declarative/checks/check_stream.py (8)

8-8: Proper import statement additions.

The additions of Dict and List to the imports are necessary for the type annotations in the new methods and attributes. This ensures proper typing for the codebase.


36-36: Good addition of dynamic streams configuration.

The type annotation is well-defined as an optional list of DynamicStreamCheckConfig instances with appropriate default value.


40-41: Proper initialization in post_init.

Initializing dynamic_streams_check_configs to an empty list if it's not provided is a good practice to avoid None checks throughout the code.


52-59: Good error handling for stream discovery.

The addition of a try-except block for handling exceptions during stream discovery is a solid improvement. I notice you've used "discovering streams" in the error message, which aligns with the previous review suggestion!


67-71: Nice refactoring to a separate method.

Extracting the stream availability checking logic into a separate method improves code organization and readability. The method call includes all necessary parameters.


73-82: Well-structured dynamic streams checking logic.

The conditions for checking dynamic streams are clear and the implementation is appropriate. The code correctly verifies the presence of required attributes before proceeding.


84-98: Effective stream availability checking implementation.

The method properly encapsulates the stream availability checking logic with appropriate error handling. This separates concerns and makes the code more maintainable.


126-133: Efficient mapping implementation.

The method efficiently builds the mapping of dynamic stream names to their generated streams using setdefault, which is a great approach. The docstring clearly explains its purpose.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks for the clarifications

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

332-336: Dynamic Streams Check Configuration Addition in CheckStream
You've introduced the new property dynamic_streams_check_configs as an array that references the DynamicStreamCheckConfig schema. This addition effectively enhances the flexibility of connection checks by supporting dynamic streams. Have you considered how the system should behave if both stream_names and dynamic_streams_check_configs are provided simultaneously? It might be useful to document the intended precedence or validation rules here. wdyt?


337-353: New Schema Definition: DynamicStreamCheckConfig
The new DynamicStreamCheckConfig schema looks well-structured with type and dynamic_stream_name as required properties, and a stream_count property that defaults to 0. Considering previous discussions about the default value being 0, do you intend for a value of 0 to effectively mean "skip the read" for the associated dynamic stream? It might be beneficial to clearly document this behavior in the schema description. wdyt?

unit_tests/sources/declarative/checks/test_check_stream.py (3)

169-175: Consider adding a brief comment explaining the test config.

The _CONFIG dictionary looks well-structured with both regular and custom stream configurations, which is essential for testing dynamic stream functionality. Would it be helpful to add a brief docstring explaining its purpose in the tests?

+# Test configuration with start date and custom streams for dynamic stream resolution
 _CONFIG = {
     "start_date": "2024-07-01T00:00:00.000Z",
     "custom_streams": [
         {"id": 3, "name": "item_3"},
         {"id": 4, "name": "item_4"},
     ],
 }

177-362: Consider adding documentation for the manifest structure.

The _MANIFEST_WITHOUT_CHECK_COMPONENT is a comprehensive test fixture that covers both static and dynamic stream configurations. Given its complexity, would it be helpful to add some high-level comments explaining the major sections? This would make maintenance easier for other developers.

+# Test manifest defining dynamic and static streams without a check component
+# Contains:
+# - Two dynamic stream definitions (HTTP-based and config-based)
+# - One static stream definition
 _MANIFEST_WITHOUT_CHECK_COMPONENT = {
     "version": "6.7.0",
     "type": "DeclarativeSource",
     "dynamic_streams": [
+        # HTTP-based dynamic stream with component mapping
         {
             "type": "DynamicDeclarativeStream",
             "name": "http_dynamic_stream",
             ...
         },
+        # Config-based dynamic stream using custom_streams from config
         {
             "type": "DynamicDeclarativeStream",
             ...
         },
     ],
+    # Static stream definition
     "streams": [
         ...
     ],
 }

618-661: Consider renaming the test function for clarity.

The function test_check_stream1 has a numeric suffix that doesn't clearly indicate its purpose compared to other test functions. Would something more descriptive like test_check_dynamic_streams better reflect what's being tested here?

-def test_check_stream1(
+def test_check_dynamic_streams(
     check_component, expected_result, expectation, response_code, expected_messages, request_count
 ):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8659d and 6fc31f3.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • unit_tests/sources/declarative/checks/test_check_stream.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
unit_tests/sources/declarative/checks/test_check_stream.py (4)
airbyte_cdk/sources/declarative/checks/check_stream.py (2)
  • CheckStream (26-158)
  • check_connection (49-82)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • CheckStream (1528-1536)
airbyte_cdk/sources/abstract_source.py (2)
  • streams (74-79)
  • check_connection (59-71)
airbyte_cdk/sources/streams/http/http.py (2)
  • state (387-391)
  • state (394-398)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (6)
unit_tests/sources/declarative/checks/test_check_stream.py (6)

5-7: Imports nicely expanded for the new functionality.

Good job adding the necessary imports for the new features. The addition of json, deepcopy, and the HTTP mocking tools make sense for testing dynamic stream configurations.

Also applies to: 13-21


365-617: Well-structured parameterized tests covering key scenarios.

Great job on designing comprehensive test cases that cover various combinations of static streams, dynamic streams, and error conditions. The parameterization approach allows for testing multiple scenarios with clean code.


623-647: Good use of HttpMocker for testing HTTP interactions.

The HTTP mocking setup is thorough and covers all the necessary endpoints. You've established a good pattern for testing both static and dynamic streams without relying on actual HTTP calls.


659-660: Good validation of HTTP request count.

Checking the number of HTTP calls is an excellent practice to ensure the code is making exactly the expected number of network requests.


663-681: Good negative test for validation errors.

Testing that validation errors are properly raised when required fields are missing is important for ensuring robust error handling.


683-692: Well-structured test for ValueError condition.

Good job testing that a ValueError is raised when only the type is provided without stream names or dynamic stream configs. This ensures proper validation logic in the CheckStream component.

@lazebnyi lazebnyi merged commit caa1e7d into main Apr 1, 2025
24 of 25 checks passed
@lazebnyi lazebnyi deleted the lazebnyi/add-dynamic-stream-config-to-check-connection branch April 1, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants