@aaronsteers aaronsteers commented Oct 11, 2025

feat: Add YAML custom connector publishing (Docker stubbed)

Summary

This PR adds PyAirbyte support for publishing YAML (declarative) custom connector definitions to Airbyte Cloud, leveraging new public API endpoints to bypass Docker image builds. The implementation includes:

  • Complete YAML support: Publish, list, get, update, and delete custom YAML source definitions
  • New dataclass: CloudCustomSourceDefinition with lazy-loading pattern following existing conventions
  • 3 MCP tools: publish_custom_source_definition, list_custom_source_definitions, update_custom_source_definition
  • Docker namespace reservation: Docker functionality is stubbed with NotImplementedError to preserve generic API structure for future support
  • Integration tests: Full CRUD test coverage for YAML connectors

The changes enable users to publish declarative connectors from Connector Builder or local YAML files without Docker builds, significantly simplifying the custom connector workflow.

Review & Testing Checklist for Human

Risk Level: 🟡 Medium - New feature with partial implementation

  • API design consistency: Review the custom_connector_type parameter requirement - does this feel intuitive when only "yaml" works, or should we have separate methods?
  • Error messaging clarity: Test Docker code paths to ensure NotImplementedError messages clearly communicate what's supported vs. planned
  • End-to-end workflow: Manually test the full publish → list → update → delete cycle with a real manifest file to verify the integration works as expected
  • Manifest compatibility: Test with manifests from Connector Builder or existing declarative sources to ensure our validation logic is sufficient
  • Documentation accuracy: Verify all docstrings accurately reflect YAML-only support and parameter requirements
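The publish → list → update → delete cycle from the checklist can be rehearsed against an in-memory stand-in before pointing at a real workspace. The sketch below is not PyAirbyte code: `FakeWorkspace` and `FakeDefinition` are hypothetical names, the manifest contents are placeholders, and only the method names and the `manifest_yaml`, `docker_image`, and `custom_connector_type` arguments mirror what this PR describes.

```python
from __future__ import annotations

import uuid
from typing import Any, Literal

DOCKER_MESSAGE = "Docker custom source definitions are not yet supported."


class FakeDefinition:
    """Hypothetical stand-in for a published YAML definition."""

    def __init__(self, name: str, manifest: dict[str, Any]) -> None:
        self.definition_id = str(uuid.uuid4())
        self.name = name
        self.manifest = manifest

    def update_definition(self, *, manifest_yaml: dict[str, Any]) -> FakeDefinition:
        # Replace the stored manifest; a real client would call the API here.
        self.manifest = manifest_yaml
        return self


class FakeWorkspace:
    """Hypothetical in-memory workspace; nothing here talks to Airbyte Cloud."""

    def __init__(self) -> None:
        self._store: dict[str, FakeDefinition] = {}

    def publish_custom_source_definition(
        self,
        name: str,
        *,
        manifest_yaml: dict[str, Any] | None = None,
        docker_image: str | None = None,
    ) -> FakeDefinition:
        if docker_image is not None:
            # Mirrors the PR: the Docker path is stubbed.
            raise NotImplementedError(DOCKER_MESSAGE)
        if manifest_yaml is None:
            raise ValueError("manifest_yaml is required for YAML connectors")
        definition = FakeDefinition(name, manifest_yaml)
        self._store[definition.definition_id] = definition
        return definition

    def list_custom_source_definitions(
        self, *, custom_connector_type: Literal["yaml", "docker"]
    ) -> list[FakeDefinition]:
        if custom_connector_type == "docker":
            raise NotImplementedError(DOCKER_MESSAGE)
        return list(self._store.values())

    def permanently_delete_custom_source_definition(
        self, definition_id: str, *, custom_connector_type: Literal["yaml", "docker"]
    ) -> None:
        if custom_connector_type == "docker":
            raise NotImplementedError(DOCKER_MESSAGE)
        del self._store[definition_id]


# Walk the full cycle once.
workspace = FakeWorkspace()
published = workspace.publish_custom_source_definition(
    "demo-source", manifest_yaml={"version": "0.1.0"}
)
published.update_definition(manifest_yaml={"version": "0.2.0"})
names = [d.name for d in workspace.list_custom_source_definitions(custom_connector_type="yaml")]
workspace.permanently_delete_custom_source_definition(
    published.definition_id, custom_connector_type="yaml"
)
remaining = workspace.list_custom_source_definitions(custom_connector_type="yaml")
```

A manual test against a real workspace would follow the same order of calls, with a real manifest file in place of the placeholder dict.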

Notes

  • Docker custom definitions are explicitly not supported in this PR but API structure is preserved for future implementation
  • Integration tests pass with a simplified test manifest, but real-world manifests may have additional complexity
  • Client-side validation is basic - server-side validation will catch additional issues

Requested by @aaronsteers
Devin Session: https://app.devin.ai/sessions/7733e25275f44008ab6cb765d4ef5106

Summary by CodeRabbit

  • New Features

    • Publish, list, retrieve, update, and permanently delete YAML-based custom source definitions via Cloud workspace APIs and CLI/tools.
    • Client-side YAML manifest validation with clear error feedback to prevent invalid submissions.
    • Docker-based custom source definitions acknowledged in APIs but not yet supported (NotImplemented).
  • Tests

    • Integration tests covering publish, list, fetch, update, delete flows and validation error scenarios.

devin-ai-integration bot and others added 7 commits October 2, 2025 22:21
- Add api_util functions for custom YAML/Docker source/destination definition CRUD operations
- Add CloudWorkspace methods for publishing all 3 definition types with validation
- Add 9 MCP tools (publish/list/update for YAML sources, Docker sources, Docker destinations)
- Add integration tests for all definition types
- Add client-side manifest validation for YAML with pre_validate option

Supports all 3 custom definition types from Airbyte 1.6:
- custom_yaml_source_definition (YAML manifests, no Docker build needed)
- custom_docker_source_definition (custom Docker source images)
- custom_docker_destination_definition (custom Docker destination images)

Uses airbyte-api 0.53.0 SDK with declarative_source_definitions, source_definitions, destination_definitions
Relates to Airbyte 1.6 release: https://docs.airbyte.com/release_notes/v-1.6
API docs: https://reference.airbyte.com/reference/createdeclarativesourcedefinition
Requested by: @aaronsteers
Devin session: https://app.devin.ai/sessions/7733e25275f44008ab6cb765d4ef5106

Co-Authored-By: AJ Steers <[email protected]>
- Replace 15 CloudWorkspace methods with 10 consolidated methods
- Add CloudCustomSourceDefinition and CloudCustomDestinationDefinition dataclasses
- Implement lazy-loading pattern for efficient data retrieval
- Replace 9 MCP tools with 6 consolidated tools with shortened names
- Update integration tests to use new dataclass returns
- Fix all parameter passing to use api_root, client_id, client_secret

All methods now return proper dataclasses following the lazy-loading pattern
from CloudSource/CloudConnection. Public API is consolidated to accept either
manifest_yaml or docker_image parameters.

Co-Authored-By: AJ Steers <[email protected]>
- Remove duplicate imports 'api as airbyte_api_api' and 'models as airbyte_api_models'
- Update all references to use shorter import names (api, models)
- Addresses PR feedback from @aaronsteers

All references updated and verified with poe fix-and-check passing.

Co-Authored-By: AJ Steers <[email protected]>
…date

- Make custom_connector_type required (not optional) in list/get/delete methods
- Add separate rename_custom_source_definition() method for Docker connectors
- Refactor update_custom_source_definition() to determine type from parameters
- Remove name parameter from update (use rename method instead)
- Update all callers: CloudWorkspace methods, MCP tools, dataclass, tests
- Add new rename_custom_source_definition MCP tool

Addresses PR feedback from @aaronsteers:
- Explicit type requirement prevents ambiguity between YAML/Docker domains
- Separate rename method clarifies intent vs generic updates
- Type determination from parameters simplifies update API

API changes:
- list_custom_source_definitions: custom_connector_type now required
- get_custom_source_definition: custom_connector_type now required
- permanently_delete_custom_source_definition: custom_connector_type now required
- update_custom_source_definition: removed name parameter, determines type from manifest_yaml vs docker_tag
- NEW: rename_custom_source_definition: Docker-only rename operation

Breaking changes: All public methods now require explicit type parameter

Co-Authored-By: AJ Steers <[email protected]>
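As a rough illustration of "determines type from manifest_yaml vs docker_tag", the check could look like the sketch below. `infer_connector_type` is a hypothetical helper written for this note, not a function from the PR, and `ValueError` stands in for whatever input error the real code raises.

```python
from __future__ import annotations

from typing import Any, Literal


def infer_connector_type(
    *,
    manifest_yaml: Any | None = None,
    docker_tag: str | None = None,
) -> Literal["yaml", "docker"]:
    """Infer the definition type from whichever argument was supplied."""
    # Exactly one of the two arguments must be given; both or neither is ambiguous.
    if (manifest_yaml is None) == (docker_tag is None):
        raise ValueError("Provide exactly one of manifest_yaml or docker_tag.")
    return "yaml" if manifest_yaml is not None else "docker"
```

Comparing against `None` rather than truthiness means an empty manifest dict still counts as a YAML update, which leaves rejecting it to validation rather than to type inference.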
- Move update_custom_source_definition to CloudCustomSourceDefinition.update_definition()
- Move rename_custom_source_definition to CloudCustomSourceDefinition.rename()
- Move update_custom_destination_definition to CloudCustomDestinationDefinition.update_definition()
- Update all callers: MCP tools and integration tests
- MCP tools now accept manifest_yaml as str | Path (not dict)
- Keep publish/get/list/permanently_delete in CloudWorkspace

This reduces clutter in CloudWorkspace by moving update/rename operations
to the narrower dataclasses where they belong. Methods use shorter names
since they're now in type-specific classes.

Addresses PR feedback from @aaronsteers

Co-Authored-By: AJ Steers <[email protected]>
- Replace TEST_YAML_MANIFEST with complete structure including spec section
- Add definitions section following real manifest patterns
- Skip Docker custom definition tests pending API support confirmation
- The missing spec section was causing 500 error 'get(...) must not be null'

Co-Authored-By: AJ Steers <[email protected]>
- Add support for publishing, listing, updating, and deleting custom YAML source definitions
- Implement CloudCustomSourceDefinition dataclass with lazy-loading pattern
- Add 3 MCP tools: publish_custom_source_definition, list_custom_source_definitions, update_custom_source_definition
- Stub out Docker custom definitions with NotImplementedError to preserve API structure for future support
- Add integration tests for YAML custom source definitions
- Remove Docker custom destination definitions (not in scope for this PR)

This enables PyAirbyte users to publish YAML (declarative) connectors to Airbyte Cloud
without needing to build Docker images, leveraging new Airbyte API endpoints.

Co-Authored-By: AJ Steers <[email protected]>

Original prompt from AJ Steers
Received message in Slack channel #dev-pyairbyte:

@Devin - Can you look into adding PyAirbyte support for a publish action for YAML connectors? Specifically, I believe there might be a new public Airbyte API endpoint that we can plug into. Basically, this would allow us to support YAML connector publishing without needing to build/publish Docker images.
Thread URL: https://airbytehq-team.slack.com/archives/C065V6XFWNQ/p1759278027099799?thread_ts=1759278027.099799


🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760140221-yaml-connector-publish' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760140221-yaml-connector-publish'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.



coderabbitai bot commented Oct 11, 2025

📝 Walkthrough

Adds client-side YAML manifest validation and full CRUD helpers for Declarative (YAML) custom source definitions: api_util wrappers, CloudCustomSourceDefinition model, CloudWorkspace publish/list/get/delete methods, MCP tools to publish/list/update, and integration tests. Docker-based paths remain unimplemented.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API utilities for YAML custom sources: airbyte/_util/api_util.py | Adds validate_yaml_manifest and CRUD wrappers for Declarative (YAML) source definitions: create, list, get, update, delete. Validates manifest shape and raises/returns errors for invalid or missing API responses. |
| Cloud custom source entity: airbyte/cloud/connectors.py | Adds CloudCustomSourceDefinition class with YAML-centric properties and operations (update, rename, delete, factory _from_yaml_response). Docker-related accessors/flows raise NotImplementedError. |
| Workspace-level operations: airbyte/cloud/workspaces.py | Adds publish_custom_source_definition, list_custom_source_definitions, get_custom_source_definition, permanently_delete_custom_source_definition to manage YAML custom sources; supports Path/str/dict manifests, optional pre-validation, and uniqueness checks. Docker routes are placeholders. |
| MCP CLI/tools integration: airbyte/mcp/cloud_ops.py | Adds CLI/ops helpers publish_custom_source_definition, list_custom_source_definitions, update_custom_source_definition and registers them as tools; handles manifest parsing and pre-validation. |
| Integration tests: tests/integration_tests/cloud/test_custom_definitions.py | Adds end-to-end tests for publish, list, get, update, and permanent delete of YAML custom source definitions and for manifest validation error handling. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant WS as CloudWorkspace
  participant API as api_util (YAML CRUD)
  participant AB as Airbyte API

  rect rgb(245,248,255)
    note right of User: Publish YAML custom source
    User->>WS: publish_custom_source_definition(name, manifest_yaml, pre_validate)
    WS->>WS: parse manifest (Path/str/dict)
    WS->>WS: optional validate_yaml_manifest
    WS->>API: create_custom_yaml_source_definition(...)
    API->>AB: POST /declarative_source_definitions/create
    AB-->>API: DeclarativeSourceDefinitionResponse
    API-->>WS: response
    WS-->>User: CloudCustomSourceDefinition
  end
sequenceDiagram
  autonumber
  actor User
  participant WS as CloudWorkspace
  participant API as api_util (YAML CRUD)
  participant AB as Airbyte API

  rect rgb(245,255,245)
    note right of User: List/Get/Delete YAML custom sources
    User->>WS: list_custom_source_definitions(type="yaml")
    WS->>API: list_custom_yaml_source_definitions(...)
    API->>AB: POST /declarative_source_definitions/list
    AB-->>API: List<Definition>
    API-->>WS: list
    WS-->>User: [CloudCustomSourceDefinition]
    User->>WS: get_custom_source_definition(id, type="yaml")
    WS->>API: get_custom_yaml_source_definition(...)
    API->>AB: POST /declarative_source_definitions/get
    AB-->>API: Definition
    API-->>WS: response
    WS-->>User: CloudCustomSourceDefinition
    User->>WS: permanently_delete_custom_source_definition(id, type="yaml")
    WS->>API: delete_custom_yaml_source_definition(...)
    API->>AB: POST /declarative_source_definitions/delete
    AB-->>API: 204/ok
    API-->>WS: None
    WS-->>User: None
  end
sequenceDiagram
  autonumber
  actor User
  participant Def as CloudCustomSourceDefinition
  participant API as api_util (YAML CRUD)
  participant AB as Airbyte API

  rect rgb(255,248,240)
    note right of User: Update YAML manifest
    User->>Def: update(manifest_yaml, pre_validate)
    Def->>Def: parse & optional validate_yaml_manifest
    Def->>API: update_custom_yaml_source_definition(...)
    API->>AB: POST /declarative_source_definitions/update
    AB-->>API: Updated Definition
    API-->>Def: response
    Def-->>User: CloudCustomSourceDefinition (updated)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • chore: bump airbyte-api to 0.53.0 #815 — Adds Declarative (YAML) custom source definition CRUD and workspace/connectors integration that aligns with these api_util and CloudWorkspace changes.

Suggested reviewers

  • bnchrch
  • maxi297

Want a short checklist for manual testing of the YAML publish/update flows, wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly summarizes the primary feature added (YAML custom connector publishing) and notes that Docker support is stubbed, accurately reflecting the changeset’s main focus. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fea8293 and 63fcbbd.

📒 Files selected for processing (1)
  • tests/integration_tests/cloud/test_custom_definitions.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integration_tests/cloud/test_custom_definitions.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
airbyte/cloud/connectors.py (2)

478-483: Avoid network calls in __repr__

__repr__ accesses name, triggering an API fetch when _definition_info is not cached. That’s surprising in logs/debugging. Shall we use the cached name if present, else fall back to definition_id without fetching, wdyt?

-    def __repr__(self) -> str:
-        """String representation."""
-        return (
-            f"CloudCustomSourceDefinition(definition_id={self.definition_id}, "
-            f"name={self.name}, connector_type={self.connector_type})"
-        )
+    def __repr__(self) -> str:
+        """String representation without triggering network calls."""
+        name = self._definition_info.name if self._definition_info else "<unloaded>"
+        return (
+            f"CloudCustomSourceDefinition(definition_id={self.definition_id}, "
+            f"name={name}, connector_type={self.connector_type})"
+        )
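To make the concern concrete, here is a self-contained sketch of the same idea. `LazyDefinition` and its `fetch` callable are hypothetical stand-ins for the lazy-loading class and its API call; `fetch_count` exists only so the behaviour can be observed.

```python
from __future__ import annotations

from typing import Callable


class LazyDefinition:
    """Hypothetical lazy-loading object; `fetch` stands in for an API call."""

    def __init__(self, definition_id: str, fetch: Callable[[str], dict[str, str]]) -> None:
        self.definition_id = definition_id
        self._fetch = fetch
        self._info: dict[str, str] | None = None
        self.fetch_count = 0  # instrumentation for this example only

    @property
    def name(self) -> str:
        # First access triggers the (simulated) network call and caches the result.
        if self._info is None:
            self.fetch_count += 1
            self._info = self._fetch(self.definition_id)
        return self._info["name"]

    def __repr__(self) -> str:
        # Read only the cache, so printing or logging never fetches.
        name = self._info["name"] if self._info else "<unloaded>"
        return f"LazyDefinition(definition_id={self.definition_id}, name={name})"


definition = LazyDefinition("abc123", lambda _id: {"name": "demo-source"})
before = repr(definition)   # no fetch happens here
loaded_name = definition.name  # first real access fetches once
after = repr(definition)    # uses the cached value
```

After the three calls, `fetch_count` is 1: only the explicit `name` access reached the simulated API.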

421-442: Ensure parsed manifest is a dict even when pre_validate=False

If manifest_yaml is a YAML string/path that parses to a non-dict (e.g., list/None), and pre_validate=False, we’ll send an invalid payload to the API. Should we guard that the parsed result is a dict regardless of pre_validate, wdyt?

         if is_yaml:
             manifest_dict: dict[str, Any]
             if isinstance(manifest_yaml, Path):
-                manifest_dict = yaml.safe_load(manifest_yaml.read_text())
+                manifest_dict = yaml.safe_load(manifest_yaml.read_text())
             elif isinstance(manifest_yaml, str):
-                manifest_dict = yaml.safe_load(manifest_yaml)
+                manifest_dict = yaml.safe_load(manifest_yaml)
             else:
                 manifest_dict = manifest_yaml  # type: ignore[assignment]
 
-            if pre_validate:
-                api_util.validate_yaml_manifest(manifest_dict, raise_on_error=True)
+            # Always ensure dict shape; validate fields only if requested.
+            if not isinstance(manifest_dict, dict):
+                raise exc.PyAirbyteInputError(
+                    message="Manifest must be a dictionary",
+                    context={"manifest": manifest_dict},
+                )
+            if pre_validate:
+                api_util.validate_yaml_manifest(manifest_dict, raise_on_error=True)

Optionally, also refresh this instance’s cache after a successful update to avoid stale reads:

-            return CloudCustomSourceDefinition._from_yaml_response(self.workspace, result)
+            self._definition_info = result
+            return CloudCustomSourceDefinition._from_yaml_response(self.workspace, result)
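A standalone sketch of the shape guard, for illustration. `load_manifest` is a hypothetical helper, `ValueError` stands in for exc.PyAirbyteInputError, and the default parser is `json.loads` only so the example runs on the standard library; with PyYAML installed one would pass `parse=yaml.safe_load`, as the diff above does.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Callable


def load_manifest(
    manifest: str | Path | dict[str, Any],
    *,
    parse: Callable[[str], Any] = json.loads,  # assumption: swap in yaml.safe_load for YAML
) -> dict[str, Any]:
    """Normalize a manifest input to a dict, rejecting any other parsed shape."""
    if isinstance(manifest, Path):
        parsed = parse(manifest.read_text())
    elif isinstance(manifest, str):
        parsed = parse(manifest)
    else:
        parsed = manifest
    # This check runs unconditionally, independent of any pre_validate flag.
    if not isinstance(parsed, dict):
        raise ValueError(f"Manifest must be a dictionary, got {type(parsed).__name__}")
    return parsed
```

A document that parses to a list or to `None` is rejected locally instead of being sent to the API as an invalid payload.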
tests/integration_tests/cloud/test_custom_definitions.py (1)

88-94: Prefer deepcopy for manifest mutations

Shallow copy works since you only change version, but future edits to nested structures could mutate the shared constant. Use deepcopy to be safe, wdyt?

-        updated_manifest = TEST_YAML_MANIFEST.copy()
+        import copy
+        updated_manifest = copy.deepcopy(TEST_YAML_MANIFEST)
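The difference is easy to see with a nested structure. The manifest below is a made-up placeholder, not TEST_YAML_MANIFEST from the test file.

```python
import copy

# Placeholder manifest with one nested list of dicts.
BASE_MANIFEST = {"version": "0.1.0", "streams": [{"name": "users"}]}

shallow = BASE_MANIFEST.copy()
shallow["version"] = "0.2.0"            # top-level rebind: the original is untouched
shallow["streams"][0]["name"] = "oops"  # nested mutation: leaks into the original

assert BASE_MANIFEST["version"] == "0.1.0"
assert BASE_MANIFEST["streams"][0]["name"] == "oops"

deep = copy.deepcopy(BASE_MANIFEST)
deep["streams"][0]["name"] = "orders"   # nested mutation stays local to the copy

assert BASE_MANIFEST["streams"][0]["name"] == "oops"
```

The shallow copy shares the inner list and dicts with the constant, so only top-level reassignment is safe; `copy.deepcopy` duplicates the nested objects as well.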
airbyte/mcp/cloud_ops.py (3)

505-556: Allow dict input for manifest_yaml

For programmatic callers, accepting dict avoids re-serialization. The workspace method already supports dict. Shall we extend the type hint and docstring, wdyt?

-    manifest_yaml: Annotated[
-        str | Path | None,
+    manifest_yaml: Annotated[
+        str | Path | dict | None,
         Field(
             description=(
-                "The Low-code CDK manifest as a YAML string or file path. "
+                "The Low-code CDK manifest as a dict, YAML string, or file path. "
                 "Required for YAML connectors."
             ),
             default=None,
         ),
     ] = None,

558-578: Make manifest inclusion optional in list results

Returning full manifests can be heavy and may expose details. Add a flag to include/exclude manifests (default False), wdyt?

-def list_custom_source_definitions() -> list[dict[str, Any]]:
+def list_custom_source_definitions(include_manifest: bool = False) -> list[dict[str, Any]]:
@@
-    return [
-        {
+    return [
+        {
             "definition_id": d.definition_id,
             "name": d.name,
             "connector_type": d.connector_type,
-            "manifest": d.manifest,
+            **({"manifest": d.manifest} if include_manifest else {}),
             "version": d.version,
         }
         for d in definitions
     ]
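For reference, the opt-in behaviour can be sketched without the MCP wiring. `summarize` is a hypothetical helper that takes a plain dict rather than a CloudCustomSourceDefinition; the field names follow the diff above.

```python
from __future__ import annotations

from typing import Any


def summarize(definition: dict[str, Any], *, include_manifest: bool = False) -> dict[str, Any]:
    """Build a listing entry, attaching the full manifest only on request."""
    summary: dict[str, Any] = {
        "definition_id": definition["definition_id"],
        "name": definition["name"],
        "connector_type": definition["connector_type"],
        "version": definition["version"],
    }
    if include_manifest:
        summary["manifest"] = definition["manifest"]
    return summary
```

An explicit `if` does the same job as the `**({...} if include_manifest else {})` form in the suggestion; which reads better is a matter of taste.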

581-622: Also accept dict for update manifests

Same rationale as publish: accept dict for manifest_yaml to avoid forcing YAML strings/paths, wdyt?

-    manifest_yaml: Annotated[
-        str | Path,
+    manifest_yaml: Annotated[
+        str | Path | dict,
         Field(
-            description="New manifest as YAML string or file path.",
+            description="New manifest as dict, YAML string, or file path.",
         ),
     ],
airbyte/cloud/workspaces.py (2)

523-549: Guard manifest shape regardless of pre_validate

If manifest_yaml parses to a non-dict and pre_validate=False, we’ll send an invalid payload to the API. Should we ensure dict shape always and only skip field-level validation when pre_validate=False, wdyt?

         if is_yaml:
             manifest_dict: dict[str, Any]
             if isinstance(manifest_yaml, Path):
                 manifest_dict = yaml.safe_load(manifest_yaml.read_text())
             elif isinstance(manifest_yaml, str):
                 manifest_dict = yaml.safe_load(manifest_yaml)
             elif manifest_yaml is not None:
                 manifest_dict = manifest_yaml
             else:
                 raise exc.PyAirbyteInputError(
                     message="manifest_yaml is required for YAML connectors",
                     context={"name": name},
                 )
 
-            if pre_validate:
-                api_util.validate_yaml_manifest(manifest_dict, raise_on_error=True)
+            if not isinstance(manifest_dict, dict):
+                raise exc.PyAirbyteInputError(
+                    message="Manifest must be a dictionary",
+                    context={"manifest": manifest_dict},
+                )
+            if pre_validate:
+                api_util.validate_yaml_manifest(manifest_dict, raise_on_error=True)

This keeps yaml.safe_load usage consistent with PyYAML best practices.


512-522: Short-circuit Docker path before uniqueness check

With docker_image provided, unique=True triggers list_custom_source_definitions("docker") which raises NotImplementedError. Would we prefer an earlier, clearer NotImplementedError before the list call, wdyt?

-        if unique:
+        if is_docker:
+            raise NotImplementedError(
+                "Docker custom source definitions are not yet supported. "
+                "Only YAML manifest-based custom sources are currently available."
+            )
+
+        if unique:
             existing = self.list_custom_source_definitions(
                 name=name,
-                custom_connector_type="yaml" if is_yaml else "docker",
+                custom_connector_type="yaml",
             )
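The ordering can be shown in isolation. In this sketch `check_can_publish` and `list_yaml_names` are hypothetical, and `ValueError` stands in for exc.AirbyteDuplicateResourcesError; the point is only that the Docker check runs before any listing call.

```python
from __future__ import annotations

from typing import Callable, Iterable


def check_can_publish(
    name: str,
    *,
    is_docker: bool,
    unique: bool,
    list_yaml_names: Callable[[], Iterable[str]],
) -> None:
    """Raise before any listing call if the Docker path was requested."""
    if is_docker:
        raise NotImplementedError(
            "Docker custom source definitions are not yet supported. "
            "Only YAML manifest-based custom sources are currently available."
        )
    # Only YAML definitions reach this point, so the listing is always the YAML one.
    if unique and name in set(list_yaml_names()):
        raise ValueError(f"A custom source definition named {name!r} already exists.")
```

With this order, a caller who passes `docker_image` together with `unique=True` gets the error from the publish call itself rather than from the nested list call.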
airbyte/_util/api_util.py (1)

938-974: Consider expanding validation (optional)

Current checks (dict, required fields, type) are solid. Shall we optionally extend with basic structural checks (e.g., presence of streams/spec/check keys) behind a stricter flag in the future, wdyt?
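For readers without the diff open, checks of this kind (dict, required fields, type) might look roughly like the sketch below. This is not the PR's validate_yaml_manifest: the required field names, the expected type value, and the return shape are all assumptions made for illustration, and `ValueError` stands in for the library's input error.

```python
from __future__ import annotations

from typing import Any

REQUIRED_FIELDS = ("version", "type")  # assumed field names, for illustration only
EXPECTED_TYPE = "DeclarativeSource"    # assumed top-level type value


def validate_manifest(
    manifest: Any, *, raise_on_error: bool = True
) -> tuple[bool, str | None]:
    """Return (is_valid, error_message); optionally raise on the first problem."""
    error: str | None = None
    if not isinstance(manifest, dict):
        error = "Manifest must be a dictionary"
    else:
        missing = [f for f in REQUIRED_FIELDS if f not in manifest]
        if missing:
            error = f"Manifest missing required fields: {', '.join(missing)}"
        elif manifest["type"] != EXPECTED_TYPE:
            error = f"Manifest type must be {EXPECTED_TYPE!r}, got {manifest['type']!r}"
    if error is not None and raise_on_error:
        raise ValueError(error)
    return (error is None, error)
```

A stricter mode, as floated above, could extend the same function with further structural checks behind an extra flag while leaving the default behaviour unchanged.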

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7313e98 and a014f2a.

📒 Files selected for processing (5)
  • airbyte/_util/api_util.py (1 hunks)
  • airbyte/cloud/connectors.py (2 hunks)
  • airbyte/cloud/workspaces.py (2 hunks)
  • airbyte/mcp/cloud_ops.py (3 hunks)
  • tests/integration_tests/cloud/test_custom_definitions.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
airbyte/_util/api_util.py (3)
airbyte/cloud/connectors.py (3)
  • manifest (308-314)
  • name (113-121)
  • name (301-305)
airbyte/exceptions.py (2)
  • PyAirbyteInputError (201-210)
  • AirbyteError (432-447)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/cloud/workspaces.py (4)
airbyte/_util/api_util.py (5)
  • validate_yaml_manifest (938-974)
  • create_custom_yaml_source_definition (977-1009)
  • list_custom_yaml_source_definitions (1012-1037)
  • get_custom_yaml_source_definition (1040-1067)
  • delete_custom_yaml_source_definition (1105-1124)
airbyte/cloud/connectors.py (5)
  • CloudCustomSourceDefinition (261-498)
  • name (113-121)
  • name (301-305)
  • manifest (308-314)
  • _from_yaml_response (486-498)
airbyte/mcp/cloud_ops.py (2)
  • publish_custom_source_definition (505-555)
  • list_custom_source_definitions (558-578)
airbyte/exceptions.py (2)
  • PyAirbyteInputError (201-210)
  • AirbyteDuplicateResourcesError (513-517)
airbyte/cloud/connectors.py (3)
airbyte/cloud/workspaces.py (3)
  • CloudWorkspace (64-642)
  • workspace_url (82-84)
  • permanently_delete_custom_source_definition (618-642)
airbyte/_util/api_util.py (3)
  • get_custom_yaml_source_definition (1040-1067)
  • validate_yaml_manifest (938-974)
  • update_custom_yaml_source_definition (1070-1102)
airbyte/exceptions.py (2)
  • workspace_url (442-447)
  • PyAirbyteInputError (201-210)
tests/integration_tests/cloud/test_custom_definitions.py (5)
airbyte/cloud/workspaces.py (5)
  • CloudWorkspace (64-642)
  • publish_custom_source_definition (461-553)
  • list_custom_source_definitions (555-586)
  • get_custom_source_definition (588-616)
  • permanently_delete_custom_source_definition (618-642)
tests/integration_tests/cloud/conftest.py (1)
  • cloud_workspace (91-102)
airbyte/cloud/connectors.py (5)
  • name (113-121)
  • name (301-305)
  • manifest (308-314)
  • version (317-323)
  • update_definition (379-446)
airbyte/_util/text_util.py (1)
  • generate_random_suffix (14-22)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/mcp/cloud_ops.py (2)
airbyte/cloud/workspaces.py (4)
  • publish_custom_source_definition (461-553)
  • CloudWorkspace (64-642)
  • list_custom_source_definitions (555-586)
  • get_custom_source_definition (588-616)
airbyte/cloud/connectors.py (5)
  • name (113-121)
  • name (301-305)
  • version (317-323)
  • manifest (308-314)
  • update_definition (379-446)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
airbyte/mcp/cloud_ops.py (1)

641-643: Tool registrations look good

Registering the three new tools completes the surface area. LGTM.

airbyte/_util/api_util.py (1)

1070-1103: LGTM on update helper

The update flow aligns with the other YAML helpers and error handling is consistent. Nice.


github-actions bot commented Oct 11, 2025

PyTest Results (Fast Tests Only, No Creds)

304 tests  ±0   304 ✅ ±0   4m 23s ⏱️ ±0s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 63fcbbd. ± Comparison against base commit 7313e98.

♻️ This comment has been updated with latest results.


github-actions bot commented Oct 11, 2025

PyTest Results (Full)

370 tests  +2   354 ✅ +2   19m 9s ⏱️ - 1m 28s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 63fcbbd. ± Comparison against base commit 7313e98.

♻️ This comment has been updated with latest results.

    def list_custom_source_definitions(
        self,
        *,
        name: str | None = None,

Devin, delete the name filter implementation in this method only.

- Remove name parameter from list_custom_source_definitions method
- Update uniqueness check in publish_custom_source_definition to filter results after fetching
- As requested in PR #827 review feedback

Co-Authored-By: AJ Steers <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
airbyte/cloud/workspaces.py (3)

528-534: Unreachable else clause, wdyt about removing it?

Since is_yaml = manifest_yaml is not None (line 491), if we're inside the if is_yaml: block, manifest_yaml cannot be None. The else clause at lines 531-534 is unreachable.

Consider simplifying to:

 if is_yaml:
     manifest_dict: dict[str, Any]
     if isinstance(manifest_yaml, Path):
         manifest_dict = yaml.safe_load(manifest_yaml.read_text())
     elif isinstance(manifest_yaml, str):
         manifest_dict = yaml.safe_load(manifest_yaml)
-    elif manifest_yaml is not None:
+    else:
         manifest_dict = manifest_yaml
-    else:
-        raise exc.PyAirbyteInputError(
-            message="manifest_yaml is required for YAML connectors",
-            context={"name": name},
-        )

524-527: Consider wrapping file I/O and YAML parsing errors for better UX, wdyt?

When reading from a Path or parsing YAML strings, several exceptions can occur (FileNotFoundError, PermissionError, yaml.YAMLError) that aren't wrapped in PyAirbyteInputError. This could lead to less user-friendly error messages.

Consider adding error handling:

 if isinstance(manifest_yaml, Path):
-    manifest_dict = yaml.safe_load(manifest_yaml.read_text())
+    try:
+        manifest_dict = yaml.safe_load(manifest_yaml.read_text())
+    except FileNotFoundError:
+        raise exc.PyAirbyteInputError(
+            message=f"Manifest file not found: {manifest_yaml}",
+            context={"path": str(manifest_yaml)},
+        ) from None
+    except (OSError, PermissionError) as e:
+        raise exc.PyAirbyteInputError(
+            message=f"Failed to read manifest file: {e}",
+            context={"path": str(manifest_yaml)},
+        ) from e
+    except yaml.YAMLError as e:
+        raise exc.PyAirbyteInputError(
+            message=f"Invalid YAML in manifest file: {e}",
+            context={"path": str(manifest_yaml)},
+        ) from e
 elif isinstance(manifest_yaml, str):
-    manifest_dict = yaml.safe_load(manifest_yaml)
+    try:
+        manifest_dict = yaml.safe_load(manifest_yaml)
+    except yaml.YAMLError as e:
+        raise exc.PyAirbyteInputError(
+            message=f"Invalid YAML in manifest string: {e}",
+            context={"manifest_preview": manifest_yaml[:100]},
+        ) from e

614-638: Minor style inconsistency with else clause, wdyt?

The implementation correctly deletes YAML custom source definitions. However, this method uses else: (line 634) to handle the Docker case, while the other methods explicitly check custom_connector_type == "docker".

For consistency, consider:

-    else:
+    elif custom_connector_type == "docker":
         raise NotImplementedError(
             "Docker custom source definitions are not yet supported. "
             "Only YAML manifest-based custom sources are currently available."
         )
+    else:
+        raise exc.PyAirbyteInputError(
+            message=f"Unknown custom_connector_type: {custom_connector_type}",
+            context={"custom_connector_type": custom_connector_type},
+        )

This also adds validation for unexpected values. The Literal type hint should already prevent them, but a defensive runtime check costs little.
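A minimal standalone version of the three-way dispatch, for illustration. `delete_definition` is a hypothetical function, the returned string is a placeholder for the real API call, and `ValueError` stands in for exc.PyAirbyteInputError.

```python
from __future__ import annotations


def delete_definition(definition_id: str, *, custom_connector_type: str) -> str:
    """Dispatch on connector type, with an explicit branch for unknown values."""
    if custom_connector_type == "yaml":
        # Placeholder for the real delete call.
        return f"deleted {definition_id}"
    if custom_connector_type == "docker":
        raise NotImplementedError(
            "Docker custom source definitions are not yet supported. "
            "Only YAML manifest-based custom sources are currently available."
        )
    # Reached only when a caller bypasses the Literal type hint at runtime.
    raise ValueError(f"Unknown custom_connector_type: {custom_connector_type!r}")
```

Type hints are not enforced at runtime, so a typo such as "yml" would otherwise fall into the Docker branch and produce a misleading "not yet supported" message.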

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between a014f2a and fea8293.

📒 Files selected for processing (1)
  • airbyte/cloud/workspaces.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (3)
airbyte/_util/api_util.py (5)
  • validate_yaml_manifest (938-974)
  • create_custom_yaml_source_definition (977-1009)
  • list_custom_yaml_source_definitions (1012-1037)
  • get_custom_yaml_source_definition (1040-1067)
  • delete_custom_yaml_source_definition (1105-1124)
airbyte/cloud/connectors.py (5)
  • CloudCustomSourceDefinition (261-498)
  • name (113-121)
  • name (301-305)
  • manifest (308-314)
  • _from_yaml_response (486-498)
airbyte/mcp/cloud_ops.py (2)
  • publish_custom_source_definition (505-555)
  • list_custom_source_definitions (558-578)
🔇 Additional comments (2)
airbyte/cloud/workspaces.py (2)

554-582: LGTM!

The implementation correctly handles listing YAML custom source definitions and clearly communicates that Docker support is not yet available. The use of custom_connector_type as a required parameter makes the API explicit and future-proof.


584-612: LGTM!

The implementation correctly retrieves a YAML custom source definition by ID and follows the same pattern as the list method. Error handling and messaging are clear.

Comment on lines +512 to +520
    if unique:
        existing = self.list_custom_source_definitions(
            custom_connector_type="yaml" if is_yaml else "docker",
        )
        if any(d.name == name for d in existing):
            raise exc.AirbyteDuplicateResourcesError(
                resource_type="custom_source_definition",
                resource_name=name,
            )

⚠️ Potential issue | 🟡 Minor

Docker uniqueness check will fail with confusing error, wdyt?

When unique=True and a Docker connector is specified, the code calls list_custom_source_definitions(custom_connector_type="docker"), which immediately raises NotImplementedError. Users will see "Docker custom source definitions are not yet supported" instead of the intended validation logic.

Consider skipping the uniqueness check for Docker connectors since they're not yet supported:

 if unique:
+    if is_docker:
+        # Skip uniqueness check for Docker connectors since they're not yet supported
+        pass
+    else:
         existing = self.list_custom_source_definitions(
             custom_connector_type="yaml" if is_yaml else "docker",
         )
         if any(d.name == name for d in existing):
             raise exc.AirbyteDuplicateResourcesError(
                 resource_type="custom_source_definition",
                 resource_name=name,
             )

Alternatively, raise the NotImplementedError earlier before attempting the uniqueness check.
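
The fail-fast alternative could look roughly like the sketch below. `FakeWorkspace`, `list_names`, `publish`, and `DuplicateResourceError` are hypothetical names invented for illustration and do not mirror the real `CloudWorkspace` signatures. The point is only the ordering: the Docker check runs before the uniqueness check, so the listing call is never reached for Docker.

```python
from dataclasses import dataclass, field


class DuplicateResourceError(Exception):
    """Hypothetical stand-in for exc.AirbyteDuplicateResourcesError."""


@dataclass
class FakeWorkspace:
    """Hypothetical in-memory workspace, used only for illustration."""

    yaml_names: list[str] = field(default_factory=list)

    def list_names(self, custom_connector_type: str) -> list[str]:
        if custom_connector_type == "docker":
            raise NotImplementedError("Docker listing is not yet supported.")
        return list(self.yaml_names)

    def publish(self, name: str, *, is_docker: bool = False, unique: bool = True) -> str:
        # Fail fast: reject Docker before any other validation runs, so the
        # error clearly refers to publishing rather than to listing.
        if is_docker:
            raise NotImplementedError(
                "Publishing Docker custom source definitions is not yet supported."
            )
        # Only YAML definitions reach the uniqueness check.
        if unique and name in self.list_names("yaml"):
            raise DuplicateResourceError(name)
        self.yaml_names.append(name)
        return name
```

Compared with skipping the check via `pass`, this ordering keeps the `unique` block free of Docker-specific branching.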

🤖 Prompt for AI Agents
In airbyte/cloud/workspaces.py around lines 512-520, the uniqueness check calls
list_custom_source_definitions(custom_connector_type="docker"), which raises
NotImplementedError. The user then sees a confusing "Docker custom source
definitions are not yet supported" error from the listing call instead of the
intended validation. Change the logic so that, when unique is True, the
existing-name check runs only for YAML connectors (is_yaml True). Implement the
simpler fix: guard the list_custom_source_definitions call behind
`if unique and is_yaml`. Docker paths should either skip uniqueness validation
or raise a clear NotImplementedError before this block.

…_definitions

- Remove name parameter from list_custom_source_definitions call in test
- Add manual filtering to find definition by name
- Fixes test failure after removing name filter from list method

Co-Authored-By: AJ Steers <[email protected]>
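
The manual filtering described in this commit can be sketched as follows. `Definition` and `find_by_name` are hypothetical illustrations rather than the actual test code; the assumption is only that each listed definition exposes a `name` attribute, as the `d.name == name` check in the review comment above suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Definition:
    """Hypothetical stand-in for CloudCustomSourceDefinition."""

    definition_id: str
    name: str


def find_by_name(definitions: list[Definition], name: str) -> Optional[Definition]:
    """Return the first definition whose name matches, or None if absent."""
    return next((d for d in definitions if d.name == name), None)
```

A test can then list all definitions, call `find_by_name` on the result, and assert that the return value is not `None`.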