chore: incremental without partition router as defaultstream #697

maxi297 · 2025-08-05T13:50:35Z

What

Following #691, this PR tackles streams that are incremental but do not have partition routing.

Summary by CodeRabbit

New Features
- Concurrency-aware declarative streams for improved incremental sync performance and scalability.
- Enhanced stop-condition and client-side filtering behavior to reduce unnecessary requests.
Refactor
- Declarative streams now use a unified default stream model and updated slicer/cursor wiring for more consistent behavior.
- Pagination and cursor handling reworked to derive behavior from active concurrency constructs.
Tests
- Test suite updated with helpers and assertions to reflect the new stream model and retriever-based checks.


github-actions · 2025-08-05T13:50:51Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/incremental_without_partition_router_as_defaultstream#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/incremental_without_partition_router_as_defaultstream

Helpful Resources

CDK API Reference

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/autofix - Fixes most formatting and linting issues
/poetry-lock - Updates poetry.lock file
/test - Runs connector tests with the updated CDK
/poe <command> - Runs any poe command in the CDK environment


github-actions · 2025-08-05T13:58:04Z

PyTest Results (Fast)

3 694 tests ±0 | 3 683 ✅ ±0 | 6m 37s ⏱️ +7s
1 suites ±0 | 11 💤 ±0
1 files ±0 | 0 ❌ ±0

Results for commit d01497d. ± Comparison against base commit 2ccff10.

♻️ This comment has been updated with latest results.

github-actions · 2025-08-05T14:00:56Z

PyTest Results (Full)

3 697 tests ±0 | 3 686 ✅ ±0 | 9m 33s ⏱️ -8s
1 suites ±0 | 11 💤 ±0
1 files ±0 | 0 ❌ ±0

Results for commit d01497d. ± Comparison against base commit 2ccff10.

♻️ This comment has been updated with latest results.


maxi297 · 2025-08-11T15:49:20Z

@brianjlai is there a reason why

airbyte-python-cdk/unit_tests/connector_builder/test_property_chunking.py

Lines 208 to 210 in addd443

    source = ManifestDeclarativeSource(
        source_config=MANIFEST, emit_connector_builder_messages=True
    )

has not been updated to use the ConcurrentDeclarativeSource?

brianjlai · 2025-08-19T22:21:23Z

Probably an oversight, but this is addressed in https://github.com/airbytehq/airbyte-python-cdk/pull/704/files#diff-a0cf4a1d832d23490722285fffb5964b6ea08d67941348be673a8fa3dec97768R221, which moves all the relevant tests to ConcurrentDeclarativeSource.


coderabbitai · 2025-08-20T13:01:55Z

📝 Walkthrough


Migrates declarative stream construction to be concurrency-aware: removes legacy incremental cursor imports, introduces a ConcurrentStreamSlicer alias, builds combined slicers and a concurrent cursor, wires retrievers and DefaultStream with cursor-derived fields, and updates tests to use DefaultStream and new access helpers.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Parser factory concurrency migration `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` | Removes legacy incremental cursor imports and unused public imports; adds `ConcurrentStreamSlicer` alias; merges slicers, builds `concurrent_cursor`, and routes stream wiring (retriever, slicer, stop-condition, client-side filtering) through concurrency-aware helpers; constructs `DefaultStream` using active cursor-derived `cursor_field`; adds `_is_stop_condition_on_cursor` and `_is_client_side_filtering_enabled`. |
| Unit test updates for DefaultStream and helpers `unit_tests/sources/declarative/parsers/test_model_to_component_factory.py` | Replaces assertions to expect `DefaultStream` and cursor-derived fields; introduces `get_schema_loader` and `get_retriever` helpers; adjusts tests to inspect retriever (paginator, requester, selectors, cursor behavior) and to validate concurrent-cursor/partition-router scenarios. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Config as DeclarativeModel
  participant Factory as ModelToComponentFactory
  participant Slicer as ConcurrentStreamSlicer
  participant Cursor as ConcurrentCursor
  participant Retriever as SimpleRetriever
  participant Stream as DefaultStream

  Config->>Factory: create_declarative_stream(model)
  Note right of Factory: merge slicers & build partition router
  Factory->>Slicer: _merge_stream_slicers(...) -> combined_slicers
  Factory->>Factory: _build_stream_slicer_from_partition_router(...)
  Factory->>Factory: _build_concurrent_cursor(model, partition_router)
  Factory->>Cursor: instantiate concurrent cursor (if applicable)

  Factory->>Retriever: create(..., stream_slicer=combined_slicers, stop_condition_cursor=Cursor?)
  Note right of Retriever #lightblue: Retriever uses cursor for stop-condition/pagination

  Factory->>Stream: DefaultStream(retriever, cursor=Cursor?)
  Stream->>Stream: derive cursor_field from Cursor (if present)

  alt client-side filtering enabled
    Stream->>Cursor: client-side filter operations
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

chore: migrate client side filtering to concurrent cursor #679 — Implements similar migration from legacy declarative cursors to concurrent cursor implementations and overlaps in concurrent cursor construction.
fix(cdk): determine state from manager if not received a state in per partition router #544 — Related changes to concurrency-aware cursor construction and how per-partition cursors incorporate stream/partition state.
fix: revert connector builder limitation wrongly applied to all the streams #716 — Changes slicer wiring in model_to_component_factory.py and may conflict with combined-slicers/concurrency wiring.

Suggested reviewers

tolik0
brianjlai

Want me to expand the related-PR list or surface specific conflicting files to preempt merge conflicts, wdyt?


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (9)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (7)
613-615: Alias for concurrent StreamSlicer is fine, but watch downstream typing.

You use ConcurrentStreamSlicer later as a variable annotation that sometimes holds a declarative PartitionRouter. This will trip static typing unless we adjust the annotation/casts there. See my suggestion on Lines 2070–2088 and 2099 to make mypy happy, wdyt?

2000-2009: Build partition_router once (reuse when possible) to avoid divergence.

You compute combined_slicers (which may already embed the partition router) and then build partition_router again for the concurrent cursor. Reusing the already-built instance avoids creating multiple routers with potentially distinct state/parameters and reduces overhead. Would you consider reusing combined_slicers when it’s a PartitionRouter and only building if needed, wdyt?

Proposed minimal change:
-        combined_slicers = self._merge_stream_slicers(model=model, config=config)
-        partition_router = self._build_stream_slicer_from_partition_router(
-            model.retriever, config, stream_name=model.name
-        )
-        concurrent_cursor = self._build_concurrent_cursor(model, partition_router, config)
+        combined_slicers = self._merge_stream_slicers(model=model, config=config)
+        partition_router = (
+            combined_slicers
+            if isinstance(combined_slicers, PartitionRouter)
+            else self._build_stream_slicer_from_partition_router(
+                model.retriever, config, stream_name=model.name
+            )
+        )
+        concurrent_cursor = self._build_concurrent_cursor(
+            model, cast(PartitionRouter, partition_router), config
+        )
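The reuse idea in this suggestion can be illustrated with a minimal, self-contained sketch. The classes and helper functions below are stand-ins invented for illustration, not the CDK's real `PartitionRouter` or factory methods; the only point is that the router is built once and the same instance is handed on.

```python
class PartitionRouter:
    """Stand-in for the declarative PartitionRouter (hypothetical, not the CDK class)."""


class DatetimeSlicer:
    """Stand-in for a merged slicer that is not a partition router."""


build_count = 0


def build_partition_router() -> PartitionRouter:
    """Hypothetical builder; counts calls so duplicate construction is visible."""
    global build_count
    build_count += 1
    return PartitionRouter()


def resolve_partition_router(combined_slicers: object) -> PartitionRouter:
    # Reuse the already-built router when the merged slicer is one;
    # only build a new instance when it is not.
    if isinstance(combined_slicers, PartitionRouter):
        return combined_slicers
    return build_partition_router()


existing = build_partition_router()                  # built once while merging slicers
reused = resolve_partition_router(existing)          # same instance, no second build
fresh = resolve_partition_router(DatetimeSlicer())   # falls back to building
```

With this shape, any state held by the router lives in a single object rather than in two independently constructed copies.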
2016-2021: Good wiring for stop-condition and client-side incremental.

Passing the concurrent cursor down to paginator (stop condition) and record filtering is the right abstraction. Consider a tiny typing polish to avoid mypy friction (optional):
-            stop_condition_cursor=concurrent_cursor
+            stop_condition_cursor=cast(Cursor, concurrent_cursor)
             if self._is_stop_condition_on_cursor(model)
             else None,
This matches the SimpleRetriever signature expecting Optional[Cursor]. Wdyt?
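As a rough illustration of what a cursor-driven stop condition buys for a data feed (records returned newest first), the sketch below stops requesting pages once it meets a record at or before the saved cursor value. `fetch_until_seen`, `page_source`, and the record layout are assumptions made for this example; this is not the CDK's `StopConditionPaginationStrategyDecorator`, and the exact comparison the CDK uses may differ.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, int]


def fetch_until_seen(pages: Iterable[List[Record]], cursor_value: int) -> List[Record]:
    """Collect records newer than cursor_value, assuming newest-first ordering."""
    fresh: List[Record] = []
    for page in pages:
        for record in page:
            if record["updated_at"] <= cursor_value:
                # Everything from here on was already synced: stop paginating.
                return fresh
            fresh.append(record)
    return fresh


fetched_pages: List[List[Record]] = []


def page_source() -> Iterator[List[Record]]:
    """Hypothetical lazy page source; logs each page that is actually requested."""
    all_pages = [
        [{"updated_at": 50}, {"updated_at": 40}],
        [{"updated_at": 30}, {"updated_at": 20}],
        [{"updated_at": 10}],
    ]
    for page in all_pages:
        fetched_pages.append(page)
        yield page


records = fetch_until_seen(page_source(), cursor_value=30)
```

Because the page source is lazy, the third page is never requested once the second page reveals an already-synced record.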

2054-2065: Broaden the cursor check to include all concurrent cursors.

The condition only checks for ConcurrentCursor, but _build_concurrent_cursor can also return ConcurrentPerPartitionCursor (still a Cursor). Would you switch to Cursor to be future-proof and clearer, wdyt?
-                isinstance(combined_slicers, PartitionRouter)
-                or isinstance(concurrent_cursor, ConcurrentCursor)
+                isinstance(combined_slicers, PartitionRouter)
+                or isinstance(concurrent_cursor, Cursor)
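A small sketch of why the broader check matters, using stand-in classes that only mirror the inheritance described in this comment (both concurrent cursor types deriving from a common `Cursor` base). These are not the CDK classes themselves.

```python
class Cursor:
    """Stand-in base class (hypothetical)."""


class ConcurrentCursor(Cursor):
    """Stand-in for the global concurrent cursor."""


class ConcurrentPerPartitionCursor(Cursor):
    """Stand-in for the per-partition concurrent cursor."""


def narrow_check(cursor: object) -> bool:
    # Matches only ConcurrentCursor and its subclasses.
    return isinstance(cursor, ConcurrentCursor)


def broad_check(cursor: object) -> bool:
    # Matches every cursor type that derives from the shared base.
    return isinstance(cursor, Cursor)


per_partition = ConcurrentPerPartitionCursor()
narrow_result = narrow_check(per_partition)
broad_result = broad_check(per_partition)
```

Under these assumptions the narrow check misses the per-partition cursor while the broad check accepts it.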
2070-2088: Fix mixed-type annotation and decorator cast to satisfy static typing.

stream_slicer is sometimes a concurrent cursor and sometimes a declarative PartitionRouter. Annotating it as ConcurrentStreamSlicer alone is inaccurate and will upset mypy. Shall we annotate as a union and cast at the decorator call site, wdyt?
-            stream_slicer: ConcurrentStreamSlicer = (
+            stream_slicer: Union[PartitionRouter, ConcurrentStreamSlicer] = (
                 concurrent_cursor if concurrent_cursor else SinglePartitionRouter(parameters={})
             )
And when wrapping for test reads, cast the wrapped_slicer too:
-                stream_slicer=cast(
-                    StreamSlicer,
-                    StreamSlicerTestReadDecorator(
-                        wrapped_slicer=stream_slicer,
-                        maximum_number_of_slices=self._limit_slices_fetched or 5,
-                    ),
-                ),
+                stream_slicer=cast(
+                    StreamSlicer,
+                    StreamSlicerTestReadDecorator(
+                        wrapped_slicer=cast(StreamSlicer, stream_slicer),
+                        maximum_number_of_slices=self._limit_slices_fetched or 5,
+                    ),
+                ),
Also applies to: 2099-2102
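The typing pattern suggested here can be sketched in isolation. The classes below are deliberately unrelated stand-ins (an assumption for this example; the real CDK class hierarchy may relate these types differently), and `pick_slicer` is a hypothetical helper. The sketch shows a `Union` annotation for a variable that can hold either type, and that `cast` only informs the type checker without changing the object at runtime.

```python
from typing import Optional, Union, cast


class ConcurrentStreamSlicer:
    """Stand-in for the concurrent StreamSlicer interface (hypothetical)."""


class PartitionRouter:
    """Stand-in for the declarative PartitionRouter (hypothetical)."""


class SinglePartitionRouter(PartitionRouter):
    """Stand-in default router used when no cursor is available."""


def pick_slicer(
    concurrent_cursor: Optional[ConcurrentStreamSlicer],
) -> Union[PartitionRouter, ConcurrentStreamSlicer]:
    # The result may be either type, so the annotation says so explicitly.
    return concurrent_cursor if concurrent_cursor else SinglePartitionRouter()


slicer = pick_slicer(None)
# cast() is a no-op at runtime: it returns the very same object.
narrowed = cast(PartitionRouter, slicer)
```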

2110-2115: Preserve cursor_field when FinalStateCursor is used (fallback to model).

When the active cursor doesn’t expose cursor_field (e.g., FinalStateCursor), we currently pass "". To keep behavior closer to pre-migration DeclarativeStream (and to aid consumers/UI), would you fallback to the model’s cursor_field if available, wdyt?
-                cursor_field=cursor.cursor_field.cursor_field_key
-                if hasattr(cursor, "cursor_field")
-                else "",  # FIXME we should have the cursor field has part of the interface of cursor,
+                cursor_field=(
+                    getattr(getattr(cursor, "cursor_field", None), "cursor_field_key", None)
+                    or (
+                        model.incremental_sync.cursor_field
+                        if model.incremental_sync
+                        and hasattr(model.incremental_sync, "cursor_field")
+                        else ""
+                    )
+                ),
Also, the logger name change is noted as a breaking change. Do we want a follow-up todo or release note to call this out explicitly, wdyt?
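The nested `getattr` fallback proposed above can be tried out on plain objects. In this sketch `resolve_cursor_field` is a hypothetical helper, and `SimpleNamespace` instances stand in for the cursor and the `incremental_sync` model; only the attribute names come from the diff.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_cursor_field(cursor: Any, incremental_sync: Optional[Any]) -> str:
    """Prefer the cursor's own field key, then the model's cursor_field, then ""."""
    # getattr with a default tolerates cursors that expose no cursor_field at all.
    key = getattr(getattr(cursor, "cursor_field", None), "cursor_field_key", None)
    if key:
        return key
    return getattr(incremental_sync, "cursor_field", None) or ""


cursor_with_field = SimpleNamespace(
    cursor_field=SimpleNamespace(cursor_field_key="updated_at")
)
cursor_without_field = SimpleNamespace()  # e.g. a cursor that exposes no field
model_sync = SimpleNamespace(cursor_field="created_at")

from_cursor = resolve_cursor_field(cursor_with_field, model_sync)
from_model = resolve_cursor_field(cursor_without_field, model_sync)
from_nothing = resolve_cursor_field(cursor_without_field, None)
```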

2292-2310: Prefer isinstance over type equality for model checks.

Using isinstance is more idiomatic and resilient to any future subclassing. Would you consider this small cleanup, wdyt?
-        elif model.incremental_sync:
-            if type(model.incremental_sync) == IncrementingCountCursorModel:
+        elif model.incremental_sync:
+            if isinstance(model.incremental_sync, IncrementingCountCursorModel):
                 return self.create_concurrent_cursor_from_incrementing_count_cursor(
                     model_type=IncrementingCountCursorModel,
                     component_definition=model.incremental_sync.__dict__,
                     stream_name=model.name or "",
                     stream_namespace=None,
                     config=config or {},
                     stream_state_migrations=state_transformations,
                 )
-            elif type(model.incremental_sync) == DatetimeBasedCursorModel:
+            elif isinstance(model.incremental_sync, DatetimeBasedCursorModel):
                 return self.create_concurrent_cursor_from_datetime_based_cursor(
                     model_type=type(model.incremental_sync),
                     component_definition=model.incremental_sync.__dict__,
                     stream_name=model.name or "",
                     stream_namespace=None,
                     config=config or {},
                     stream_state_migrations=state_transformations,
                     attempt_to_create_cursor_if_not_provided=True,
                 )
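The behavioral difference between the two checks only shows up with subclasses. A minimal sketch, where `CustomDatetimeCursorModel` is a hypothetical subclass invented for the example and `DatetimeBasedCursorModel` is a stand-in rather than the generated CDK model:

```python
class DatetimeBasedCursorModel:
    """Stand-in for the generated model class (hypothetical)."""


class CustomDatetimeCursorModel(DatetimeBasedCursorModel):
    """Hypothetical subclass, e.g. added by a future refactor."""


model = CustomDatetimeCursorModel()

# Exact type comparison ignores inheritance.
matches_by_type = type(model) == DatetimeBasedCursorModel
# isinstance follows the class hierarchy.
matches_by_isinstance = isinstance(model, DatetimeBasedCursorModel)
```

One thing to keep in mind when switching: with `isinstance`, branch order starts to matter if the checked model classes ever inherit from one another.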
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
353-356: Shift to DefaultStream + accessor helpers is correct; consider consolidating helpers to avoid private-attr coupling

The assertions reflect the new wiring: DefaultStream for incremental-without-partition-router, stream.cursor_field exposure, and accessing retriever/paginator/requester via the new helpers. This all aligns with the PR intent and the concurrent model.

Would you consider moving get_schema_loader/get_retriever into a shared test utility (e.g., unit_tests/test_utils/streams.py) so multiple test modules don’t re-encode long private attribute chains, reducing fragility if internals change again, wdyt? You could then import those helpers here and in connector_builder tests to keep things DRY.

Also applies to: 357-360, 361-371, 372-383, 386-405, 406-411, 412-433, 436-439
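A shared helper along the lines suggested here might look like the sketch below. The private attribute chain is an assumption for illustration and may not match the CDK's actual internals; the point is that tests depend on one function instead of repeating the chain.

```python
from types import SimpleNamespace
from typing import Any


def get_retriever(stream: Any) -> Any:
    """Return the retriever behind a stream, hiding the private attribute chain.

    The chain used in the fallback branch is hypothetical; if the internals
    change, only this function needs updating.
    """
    if hasattr(stream, "retriever"):
        # Legacy-style stream exposing the retriever directly.
        return stream.retriever
    return stream._stream_partition_generator._partition_factory._retriever


# Fake objects standing in for real streams in this example.
legacy_stream = SimpleNamespace(retriever="legacy-retriever")
default_stream = SimpleNamespace(
    _stream_partition_generator=SimpleNamespace(
        _partition_factory=SimpleNamespace(_retriever="concurrent-retriever")
    )
)

legacy_result = get_retriever(legacy_stream)
default_result = get_retriever(default_stream)
```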

1192-1198: Use identity for boolean assertion

Minor nit: prefer identity for booleans. Would you switch to is True for clarity, wdyt?
-    assert get_retriever(stream).record_selector.transform_before_filtering == True
+    assert get_retriever(stream).record_selector.transform_before_filtering is True
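The practical difference between the two assertions: equality accepts any value that compares equal to `True` (such as the integer `1`), while identity accepts only the boolean itself. A short sketch with hypothetical helper names:

```python
def equals_true(value: object) -> bool:
    # Loose check: 1 and 1.0 also compare equal to True.
    return value == True  # noqa: E712 (intentional, for comparison)


def is_true(value: object) -> bool:
    # Strict check: only the bool singleton True passes.
    return value is True


loose_on_int = equals_true(1)
strict_on_int = is_true(1)
strict_on_bool = is_true(True)
```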

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 02246dc and 59c1fd8.

📒 Files selected for processing (2)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (7 hunks)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (8 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-01-14T00:20:32.310Z

Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Applied to files:

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

🧬 Code Graph Analysis (2)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)

airbyte_cdk/sources/declarative/requesters/requester.py (1)

HttpMethod (18-26)

airbyte_cdk/sources/streams/concurrent/partitions/stream_slicer.py (1)

StreamSlicer (27-39)

airbyte_cdk/sources/declarative/partition_routers/partition_router.py (1)

PartitionRouter (14-62)

airbyte_cdk/sources/streams/concurrent/cursor.py (4)

ConcurrentCursor (134-502)

Cursor (50-86)

FinalStateCursor (89-131)

cursor_field (184-185)

airbyte_cdk/sources/declarative/partition_routers/single_partition_router.py (1)

SinglePartitionRouter (13-63)

unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (4)

airbyte_cdk/sources/streams/concurrent/default_stream.py (3)

DefaultStream (17-123)

name (44-45)

cursor_field (52-53)

unit_tests/connector_builder/test_connector_builder_handler.py (4)

name (835-836)

cursor_field (831-832)

get_retriever (447-452)

primary_key (827-828)

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5)

name (106-114)

name (117-119)

SimpleRetriever (53-602)

primary_key (290-292)

primary_key (295-297)

airbyte_cdk/sources/declarative/schema/json_file_schema_loader.py (1)

_get_json_filepath (69-70)


🔇 Additional comments (6)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

504-504: Import narrowing looks good.

Switching to import only HttpMethod is correct; no usages of Requester remain in this file. LGTM.

2138-2151: Helper predicates read cleanly.

The intent behind stop-condition-on-cursor and client-side filtering reads well and centralizes the logic. LGTM.

unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (4)

1109-1112: LGTM: Data feed pagination strategy check is accurate

Asserting StopConditionPaginationStrategyDecorator via get_retriever(stream) matches the intended behavior for data-feed incremental streams. Looks good to me.

4125-4153: LGTM: QueryProperties correctly surfaced via retriever and excluded from request options provider

Good assertions around:

retriever.additional_query_properties construction

property_chunking and merge strategy types/values

query_properties_key presence while excluding the “fields” param from the RequestOptionsProvider

This tracks the new responsibility split cleanly.

4226-4251: LGTM: PropertiesFromEndpoint wiring through retriever is validated end-to-end

Verifying the nested retriever/requester under PropertiesFromEndpoint and the absence of always_include_properties is on point. Nice coverage of both construction and values.

4315-4335: LGTM: Requester-level PropertiesFromEndpoint path validated without chunking

This covers the alternative placement (requester.query_properties) well and confirms no property_chunking is attached while still validating property list resolution.


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)

613-615: Two StreamSlicer classes in scope — consider clarifying the alias or downstream variable names.

Having both declarative StreamSlicer and concurrent StreamSlicer in scope can be confusing. Would you consider either:

renaming the alias to ConcurrentPartitionSlicer, or
renaming the local variable used later (e.g., stream_slicer at Lines 2070–2088) to effective_stream_slicer for readability, wdyt?

2054-2065: DefaultStream gating omits the “both partition router and cursor” case.

This block enables DefaultStream when either a partition router or a concurrent cursor exists, but the initialization below prefers one or the other due to an elif chain. For streams that have both, do you want to pass both the partition router (as slicer) and the concurrent cursor to DefaultStream now (instead of only the slicer), wdyt?

2070-2109: Initialize DefaultStream with both slicer and cursor when available; prefer None for missing cursor_field.

Right now the elif chain prioritizes the slicer over the cursor; this prevents passing the concurrent cursor when both exist. Also, the cursor_field fallback uses an empty string; DefaultStream accepts Optional[str], so None better communicates absence.

Would you consider this refactor to always set both when available and improve readability (also renames the local variable to avoid confusion with the two StreamSlicer types), wdyt?

-            stream_name = model.name or ""
-            stream_slicer: ConcurrentStreamSlicer = (
-                concurrent_cursor if concurrent_cursor else SinglePartitionRouter(parameters={})
-            )
-            cursor: Cursor = FinalStateCursor(stream_name, None, self._message_repository)
-            if isinstance(retriever, AsyncRetriever):
-                # The AsyncRetriever only ever worked with a cursor from the concurrent package. Hence, the method
-                # `_build_incremental_cursor` which we would usually think would return only declarative stuff has a
-                # special clause and return a concurrent cursor. This stream slicer is passed to AsyncRetriever when
-                # built because the async retriever has a specific partition router which relies on this stream slicer.
-                # We can't re-use `concurrent_cursor` because it is a different instance than the one passed in
-                # AsyncJobPartitionRouter.
-                stream_slicer = retriever.stream_slicer
-                if isinstance(combined_slicers, Cursor):
-                    cursor = combined_slicers
-            elif isinstance(combined_slicers, PartitionRouter):
-                stream_slicer = combined_slicers
-            elif concurrent_cursor:
-                cursor = concurrent_cursor
+            stream_name = model.name or ""
+            # Use a dedicated name to avoid confusion with the two StreamSlicer types in scope.
+            effective_stream_slicer: ConcurrentStreamSlicer = SinglePartitionRouter(parameters={})
+            cursor: Cursor = FinalStateCursor(stream_name, None, self._message_repository)
+
+            if isinstance(retriever, AsyncRetriever):
+                # AsyncRetriever has its own slicer wired via AsyncJobPartitionRouter
+                effective_stream_slicer = retriever.stream_slicer
+                # If combined_slicers produced a concurrent cursor (e.g., datetime-based), use it
+                if isinstance(combined_slicers, Cursor):
+                    cursor = combined_slicers
+            elif isinstance(combined_slicers, PartitionRouter):
+                # When a partition router exists, prefer it for slicing.
+                effective_stream_slicer = combined_slicers
+
+            # If a concurrent cursor exists, pass it irrespective of slicer selection.
+            if concurrent_cursor:
+                cursor = concurrent_cursor
@@
-                stream_slicer=stream_slicer,
+                stream_slicer=effective_stream_slicer,
@@
-                cursor_field=cursor.cursor_field.cursor_field_key
-                if hasattr(cursor, "cursor_field")
-                else "",  # FIXME we should have the cursor field has part of the interface of cursor,
+                cursor_field=(
+                    cursor.cursor_field.cursor_field_key if hasattr(cursor, "cursor_field") else None
+                ),  # Prefer None over empty string
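The effect of the `elif` chain versus independent checks can be seen in a simplified sketch. Strings stand in for the slicer and cursor objects, the defaults are placeholders, and the AsyncRetriever branch is omitted, so this is a reduced model of the selection logic rather than the factory code itself.

```python
from typing import Optional, Tuple


def select_with_elif(
    partition_router: Optional[str], concurrent_cursor: Optional[str]
) -> Tuple[str, str]:
    slicer, cursor = "single_partition", "final_state"
    if partition_router:
        slicer = partition_router
    elif concurrent_cursor:
        # Never reached when a partition router is also present.
        cursor = concurrent_cursor
    return slicer, cursor


def select_independently(
    partition_router: Optional[str], concurrent_cursor: Optional[str]
) -> Tuple[str, str]:
    slicer, cursor = "single_partition", "final_state"
    if partition_router:
        slicer = partition_router
    if concurrent_cursor:
        # Applied regardless of which slicer was selected.
        cursor = concurrent_cursor
    return slicer, cursor


chained = select_with_elif("router", "cursor")
independent = select_independently("router", "cursor")
```

In this reduced model the chained version drops the cursor whenever both inputs are present, which is the case the comment asks about.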

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 59c1fd8 and d01497d.

📒 Files selected for processing (1)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2024-11-15T01:04:21.272Z

Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

📚 Learning: 2025-01-14T00:20:32.310Z

Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Applied to files:

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

🧬 Code Graph Analysis (1)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

airbyte_cdk/sources/streams/concurrent/default_stream.py (4)

name (44-45)

cursor (92-93)

DefaultStream (17-123)

cursor_field (52-53)

airbyte_cdk/sources/streams/concurrent/cursor.py (4)

ConcurrentCursor (134-502)

Cursor (50-86)

FinalStateCursor (89-131)

cursor_field (184-185)

airbyte_cdk/sources/declarative/partition_routers/single_partition_router.py (1)

SinglePartitionRouter (13-63)

airbyte_cdk/sources/streams/concurrent/helpers.py (1)

get_primary_key_from_stream (8-28)


🔇 Additional comments (4)

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

2000-2009: Nice: centralizing concurrent cursor and slicer assembly.

The split into combined_slicers, partition_router, and concurrent_cursor makes the wiring easier to reason about and extend. This aligns with the PR goal for incremental streams without partition routers. LGTM.

2016-2021: Stop condition and client-side filtering correctly hook into the concurrent cursor; verify expected runtime behavior.

Plumbing stop_condition_cursor and client_side_incremental_sync={"cursor": concurrent_cursor} looks right for data-feeds and client-side incremental filtering. Do we have test coverage that exercises both branches (is_data_feed and is_client_side_incremental) using DefaultStream paths, wdyt?

If helpful, I can sketch tests asserting:

paginator wrapping with StopConditionPaginationStrategyDecorator when is_data_feed is true,

ClientSideIncrementalRecordFilterDecorator is used when is_client_side_incremental is true.

2132-2145: Good separation of concerns with helpers.

The *_is_* helpers make the decision logic explicit and testable. Nice.

504-504: Verified: no lingering Requester in model_to_component_factory.py

I searched the updated file and can confirm there are no remaining Requester references after swapping in HttpMethod. Downstream modules still import and use Requester where needed, but this change is isolated and safe. wdyt?

maxi297 and others added 25 commits July 31, 2025 09:37

remove

1d5b468

Auto-fix lint and format issues

76ac6f7

remove unused file

2d1e2f4

have declarative availability check support AbstractStream

b4a5fec

Auto-fix lint and format issues

fc6c6b6

mypy

5fe2e02

Auto-fix lint and format issues

1e8e968

Remove RFR stuff

689e792

have bland stream be instantiated as DefaultStream

5399436

fix test

dff2559

fix test, format, lint and a bit of mypy

7dc2164

mypy

0bfbdfe

format

0b454bb

remove unused line

13c17f4

Merge branch 'main' into maxi297/remove-availability-strategy-except-for-filebased

0f36dc5

Merge branch 'maxi297/remove-availability-strategy-except-for-filebased' into maxi297/availability_strategy_to_support_abstract_stream

6f95ebb

Merge branch 'maxi297/availability_strategy_to_support_abstract_stream' into maxi297/bland_stream_instantiated_as_defaultstream

c94892a

fix test

fb75765

lint

c078395

format

decc557

code review

b8daf64

Merge branch 'main' into maxi297/availability_strategy_to_support_abstract_stream

e8edc4b

code review

2bc4b30

Merge branch 'maxi297/availability_strategy_to_support_abstract_stream' into maxi297/bland_stream_instantiated_as_defaultstream

98e2227

incremental without partition router as DefaultStream

d9d09f0

github-actions bot added the chore label Aug 5, 2025

refactor regarding async stuff

1af2264

maxi297 added 4 commits August 6, 2025 09:52

supports_file_transfer

1079629

Merge branch 'maxi297/bland_stream_instantiated_as_defaultstream' into maxi297/incremental_without_partition_router_as_defaultstream

f196ea7

format

7f643e4

Merge branch 'maxi297/bland_stream_instantiated_as_defaultstream' into maxi297/incremental_without_partition_router_as_defaultstream

8566607

tolik0 reviewed Aug 7, 2025

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

maxi297 and others added 8 commits August 11, 2025 11:00

Merge branch 'main' into maxi297/bland_stream_instantiated_as_defaultstream

96f15c3

Merge branch 'maxi297/bland_stream_instantiated_as_defaultstream' into maxi297/incremental_without_partition_router_as_defaultstream

6cb012b

format

11e3a35

Merge branch 'maxi297/bland_stream_instantiated_as_defaultstream' into maxi297/incremental_without_partition_router_as_defaultstream

86909df

more fixes for DefaultStream in Connector Builder

ebb4b28

mypy and format

6fef39b

format broke mypy

e31fed9

Merge branch 'maxi297/bland_stream_instantiated_as_defaultstream' into maxi297/incremental_without_partition_router_as_defaultstream

e996805

Base automatically changed from maxi297/bland_stream_instantiated_as_defaultstream to main August 19, 2025 18:09

maxi297 added 2 commits August 20, 2025 08:58

Merge branch 'main' into maxi297/incremental_without_partition_router_as_defaultstream

90eeaa6

format

1be518b

lint

59c1fd8

coderabbitai bot requested changes Aug 20, 2025

unit_tests/sources/declarative/parsers/test_model_to_component_factory.py

tolik0 approved these changes Aug 20, 2025


Merge branch 'main' into maxi297/incremental_without_partition_router_as_defaultstream

d01497d

coderabbitai bot reviewed Aug 20, 2025


coderabbitai bot approved these changes Aug 20, 2025


maxi297 merged commit cd48741 into main Aug 20, 2025
26 checks passed

maxi297 deleted the maxi297/incremental_without_partition_router_as_defaultstream branch August 20, 2025 20:35

coderabbitai bot mentioned this pull request Aug 27, 2025

chore: remove declarative stream #707

Merged

This was referenced Sep 3, 2025

chore: clean up ConcurrentDeclarativeSource only processing DefaultStream and streamline inheritance to Source interface #743

Merged

feat: support CustomRetriever with partition router #753

Merged


