Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Aug 26, 2025

Summary by CodeRabbit

  • New Features

    • Per-call execution-mode override (auto, docker, python, yaml) for multiple connector tooling calls; auto selects Docker when available.
    • Centralized source resolver ensures connector executors are prepared before use and calls ensure-installation.
  • Bug Fixes

    • Uninstall now removes the referenced Docker image and suppresses removal errors.
    • Install for managed Docker executors is now a no-op.
  • Breaking Changes

    • Executor construction now requires an explicit Docker image identifier.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@feat/mcp/allow-override-execution-mode' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@feat/mcp/allow-override-execution-mode'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

Copy link
Contributor

coderabbitai bot commented Aug 26, 2025

📝 Walkthrough

Walkthrough

DockerExecutor now requires explicit name and image_name_full; install() is a no-op, uninstall() removes the Docker image via docker rmi <image_name_full> (errors suppressed), and ensure_installation drops the keyword-only constraint. get_connector_executor forwards image_name_full. MCP functions gain a centralized _get_mcp_source and an override_execution_mode parameter.

Changes

Cohort / File(s) Summary
Docker executor API and behavior
airbyte/_executors/docker.py
DockerExecutor.__init__ now requires name and image_name_full and stores self.image_name_full; imports updated (subprocess, suppress); install() is now a no-op (returns None); uninstall() runs docker rmi <image_name_full> while suppressing subprocess.CalledProcessError; ensure_installation(auto_fix: bool = True) is no longer keyword-only.
Executor util wiring
airbyte/_executors/util.py
get_connector_executor now passes image_name_full=docker_image to DockerExecutor when constructing a Docker executor.
MCP source resolver and mode override
airbyte/mcp/_local_ops.py
Adds _get_mcp_source(...) to select and instantiate a Source based on override_execution_mode (auto/docker/python/yaml) with Docker auto-detect; the resolver calls source.executor.ensure_installation(); multiple public MCP functions gain an override_execution_mode parameter (default "auto") and now call _get_mcp_source(...) instead of get_source(...); adds Source typing/imports.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant LocalOps as mcp/_local_ops.py
    participant Resolver as _get_mcp_source
    participant DockerEnv as Docker
    participant Source

    Caller->>LocalOps: call function(..., override_execution_mode="auto")
    LocalOps->>Resolver: _get_mcp_source(connector_name, override_execution_mode)
    alt override_execution_mode == "auto"
        Resolver->>DockerEnv: check docker availability
        alt Docker available
            Resolver->>Source: get_source(mode="docker", image_name_full=...)
        else
            Resolver->>Source: get_source(mode="python" or "yaml")
        end
    else mode in {"docker","python","yaml"}
        Resolver->>Source: get_source(mode=override_execution_mode)
    else unknown
        Resolver-->>LocalOps: raise ValueError
    end
    Resolver->>Source: source.executor.ensure_installation(auto_fix=True)
    Resolver-->>LocalOps: return Source
    LocalOps->>Source: perform operation (validate/list/read/sync)
    Source-->>LocalOps: result
    LocalOps-->>Caller: return result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/mcp/allow-override-execution-mode

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbit or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (9)
airbyte/_executors/util.py (2)

306-311: Pass target_version to DockerExecutor for clearer reporting and parity with Python installs

Since docker_image is normalized to always include a tag (Lines 279-281), could we also pass the parsed tag as target_version so downstream logging/telemetry reflects the intended version, wdyt?

Apply this diff within the return block:

         return DockerExecutor(
             name=name,
+            target_version=(docker_image.split(":", 1)[1] if ":" in docker_image else None),
             image_name_full=docker_image,
             executable=docker_cmd,
             volumes=volumes,
         )

282-291: Ensure host_temp_dir is always a Path

TEMP_DIR_OVERRIDE may be a str; volumes expects Path keys and map_cli_args checks Path(...).is_relative_to(...). To avoid accidental str keys, would you wrap the override in Path(...) when set, wdyt?

Example outside this hunk:

host_temp_dir = Path(TEMP_DIR_OVERRIDE) if TEMP_DIR_OVERRIDE else Path(tempfile.gettempdir())
airbyte/_executors/docker.py (3)

36-55: Consider distinguishing “Docker not found” from “spec failed” for better UX

Currently any failure during spec execution becomes AirbyteConnectorExecutableNotFoundError, which can be misleading if docker exists but the spec command fails for connector reasons. Would you catch FileNotFoundError to raise the “executable not found” case, and otherwise surface a more specific installation/execution error, wdyt?

Possible refinement sketch:

-        try:
-            assert (shutil.which("docker") is not None), "Docker couldn't be found on your system. Please Install it."
-            self.execute(["spec"])
-        except Exception as e:
-            raise exc.AirbyteConnectorExecutableNotFoundError(
-                connector_name=self.name,
-            ) from e
+        if shutil.which("docker") is None:
+            raise exc.AirbyteConnectorExecutableNotFoundError(connector_name=self.name)
+        try:
+            self.execute(["spec"])
+        except Exception as e:
+            raise exc.AirbyteConnectorInstallationError(
+                message="Failed to execute 'spec' with Docker image.",
+                connector_name=self.name,
+                context={"image": self.image_name_full},
+            ) from e

56-63: Optionally pull the image during install()

Since install() is invoked by higher-level APIs, would you consider best-effort docker pull to prime the image cache (guarded and non-fatal), wdyt?

 def install(self) -> None:
     """Install the connector.
 
     For docker images, for now this is a no-op. In the future we might
     pull the Docker image in this step.
     """
-    pass
+    with suppress(subprocess.CalledProcessError, FileNotFoundError):
+        subprocess.check_output(["docker", "pull", self.image_name_full])

64-75: Also suppress FileNotFoundError and add debug log on uninstall

If Docker isn’t installed, uninstall will raise FileNotFoundError. Shall we suppress it as well and log at debug to keep behavior quiet but traceable, wdyt?

     def uninstall(self) -> None:
         """Uninstall the connector.
 
         For docker images, this operation runs an `docker rmi` command to remove the image.
 
         We suppress any errors that occur during the removal process.
         """
-        with suppress(subprocess.CalledProcessError):
-            subprocess.check_output(
-                ["docker", "rmi", self.image_name_full],
-            )
+        with suppress(subprocess.CalledProcessError, FileNotFoundError):
+            logger.debug("Attempting to remove Docker image: %s", self.image_name_full)
+            subprocess.check_output(["docker", "rmi", self.image_name_full])
airbyte/mcp/_local_ops.py (4)

19-21: Avoid importing Source at runtime; keep it under TYPE_CHECKING to reduce import-time coupling

You import Source both at runtime (Line 19) and under TYPE_CHECKING (Lines 24-26). Keeping only the TYPE_CHECKING import avoids potential circulars and speeds import. Shall we remove the runtime import, wdyt?

-from airbyte.sources.base import Source

Optionally add at file top (outside this hunk) to ease forward refs:

from __future__ import annotations

174-178: Signature update LGTM; consider documenting the “auto” behavior

Would you add a short note in the docstring that “auto” prefers Docker when available (or whatever heuristic we choose), so tool users aren’t surprised, wdyt?


390-393: Signature update LGTM

Same note as earlier: brief docstring note about the “auto” behavior might help MCP consumers, wdyt?


428-431: Minor: consider including execution mode in the sync summary

Would including the effective execution mode in the summary help with diagnostics (e.g., “Synced via Docker image …” or “via Python venv …”), wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between be50b70 and 93da9b4.

📒 Files selected for processing (3)
  • airbyte/_executors/docker.py (3 hunks)
  • airbyte/_executors/util.py (1 hunks)
  • airbyte/mcp/_local_ops.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte/mcp/_local_ops.py (4)
airbyte/sources/base.py (1)
  • Source (67-969)
airbyte/_util/meta.py (1)
  • is_docker_installed (168-169)
airbyte/_executors/docker.py (1)
  • install (56-62)
airbyte/_connector_base.py (1)
  • install (375-381)
airbyte/_executors/docker.py (5)
airbyte/_connector_base.py (3)
  • name (84-86)
  • install (375-381)
  • uninstall (383-389)
airbyte/_executors/python.py (2)
  • install (112-190)
  • uninstall (99-103)
airbyte/_executors/base.py (2)
  • install (231-232)
  • uninstall (235-236)
airbyte/_executors/declarative.py (2)
  • install (122-124)
  • uninstall (126-128)
airbyte/_executors/local.py (2)
  • install (48-52)
  • uninstall (54-58)
🪛 GitHub Actions: Run Linters
airbyte/mcp/_local_ops.py

[error] 63-63: poetry run mypy .: Name "source" already defined on line 57 [no-redef]

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (11)
airbyte/_executors/docker.py (1)

22-35: Constructor change looks good; explicit image_name_full improves clarity

Requiring image_name_full and storing it for lifecycle ops is a solid improvement. No issues spotted here. 👍

airbyte/mcp/_local_ops.py (10)

99-103: New override_execution_mode parameter: API shape looks good

The addition of override_execution_mode with a safe default of "auto" reads well for MCP callers. No issues from a typing perspective.


109-112: Centralizing source resolution via _get_mcp_source is a nice cleanup

Shifting responsibility here simplifies the tool. Once the mypy fix lands in _get_mcp_source, this callsite should be good. 👍


183-186: Use of _get_mcp_source here is consistent and reduces duplication

Looks good and keeps selection logic centralized.


214-218: Signature update LGTM

No concerns; the override parameter is consistent across tools.


220-223: Centralized resolver usage LGTM

Good reuse here.


256-260: Signature update LGTM

Consistent parameterization across tools is appreciated.


263-266: Centralized resolver usage LGTM

All good here.


316-320: Signature update LGTM

Uniform override parameter across the API is great for discoverability.


327-330: Centralized resolver usage LGTM

No issues.


396-398: Centralized resolver usage LGTM

Looks consistent.

Copy link

github-actions bot commented Aug 26, 2025

PyTest Results (Fast Tests Only, No Creds)

301 tests  ±0   301 ✅ ±0   4m 20s ⏱️ +3s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 4ea78c0. ± Comparison against base commit 721f26c.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/mcp/_local_ops.py (1)

113-115: Avoid leaking secrets or internals in user-visible error payloads

Several error paths return ex!r, ex!s, and full tracebacks directly to the caller. Connector errors can sometimes include config echoes or environment values, which risks secret exposure. Could we sanitize user-facing messages and send details to stderr/logs only, wdyt?

Here’s a minimal redaction-first approach:

@@
-    except Exception as ex:
-        return False, f"Failed to get connector '{connector_name}': {ex}"
+    except Exception:
+        tb_str = traceback.format_exc()
+        print(tb_str, file=sys.stderr)
+        return False, f"Failed to get connector '{connector_name}'. See server logs for details."
@@
-    except Exception as ex:
-        return False, f"Failed to resolve configuration for {connector_name}: {ex}"
+    except Exception:
+        tb_str = traceback.format_exc()
+        print(tb_str, file=sys.stderr)
+        return False, f"Failed to resolve configuration for {connector_name}. See server logs for details."
@@
-    except Exception as ex:
-        return False, f"Configuration for {connector_name} is invalid: {ex}"
+    except Exception:
+        tb_str = traceback.format_exc()
+        print(tb_str, file=sys.stderr)
+        return False, f"Configuration for {connector_name} is invalid. See server logs for details."
@@
-    except Exception as ex:
-        tb_str = traceback.format_exc()
-        # If any error occurs, we print the error message to stderr and return an empty list.
-        return (
-            f"Error reading records from source '{source_connector_name}': {ex!r}, {ex!s}\n{tb_str}"
-        )
+    except Exception:
+        tb_str = traceback.format_exc()
+        print(tb_str, file=sys.stderr)
+        return f"Error reading records from source '{source_connector_name}'. See server logs for details."
@@
-    except Exception as ex:
-        tb_str = traceback.format_exc()
-        return {
-            "ERROR": f"Error getting stream previews from source '{source_name}': "
-            f"{ex!r}, {ex!s}\n{tb_str}"
-        }
+    except Exception:
+        tb_str = traceback.format_exc()
+        print(tb_str, file=sys.stderr)
+        return {"ERROR": f"Error getting stream previews from source '{source_name}'. See server logs for details."}

If a redaction helper exists (e.g., to mask values that match the current config), we could use that to still include limited context safely.

Also applies to: 123-125, 127-130, 280-285, 355-359

♻️ Duplicate comments (1)
airbyte/mcp/_local_ops.py (1)

47-54: Heuristics parity for 'auto' across entry points?

Do we want auto here to mirror get_connector_executor’s registry-based ordering (prefer local/YAML/Python first, then Docker) for parity, or keep “prefer Docker if present”? If parity is preferred, I can sketch a small helper to centralize the heuristic, wdyt?

🧹 Nitpick comments (5)
airbyte/mcp/_local_ops.py (5)

80-82: Consider disabling auto-fix installs to avoid heavy side effects in MCP context

Unconditionally calling ensure_installation() may trigger installs/pulls (depending on executor semantics). For MCP tools, would you prefer to avoid mutations and surface a friendly error instead? If so, we could pass auto_fix=False. Docker’s implementation ignores this flag anyway, so it’s safe there.

-    source.executor.ensure_installation()
+    source.executor.ensure_installation(auto_fix=False)

Alternatively, we could make this opt-in via an env var (e.g., AB_MCP_AUTO_FIX=1) or a function param, wdyt?


220-231: Optional: make get_source_stream_json_schema resilient to bad input

This path raises directly if the stream doesn’t exist or config is invalid, whereas other tools return user-friendly messages. Would you like to wrap this with a try/except and return an error dict/string for consistency across MCP tools, or keep the current behavior, wdyt?


272-277: Nit: prefer explicit keyword for readability when calling get_records

get_records’s first arg is the stream name, but using the keyword aids readability, especially with neighbors like limit and stop_event. Want to switch to get_records(stream=stream_name)?

-        record_generator = source.get_records(stream_name)
+        record_generator = source.get_records(stream=stream_name)

Also applies to: 274-274


460-465: Optional: avoid assuming DuckDB-specific attributes in describe_default_cache

This method always reads db_path, which is DuckDB-specific. If a different default cache is configured, this will error. Shall we gate or gracefully omit fields when not present?

     return {
         "cache_type": type(cache).__name__,
         "cache_dir": str(cache.cache_dir),
-        "cache_db_path": str(Path(cache.db_path).absolute()),
+        "cache_db_path": str(Path(cache.db_path).absolute()) if getattr(cache, "db_path", None) else None,
         "cached_streams": list(cache.streams.keys()),
     }

161-166: Nit: unify parameter naming for the connector name

We use connector_name, source_connector_name, and source_name across tools. Would standardizing on one (e.g., connector_name) help API ergonomics for MCP clients, or do you prefer the current explicitness per tool, wdyt?

Also applies to: 292-299, 86-92

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a42cebe and fc0b862.

📒 Files selected for processing (1)
  • airbyte/mcp/_local_ops.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_local_ops.py (3)
airbyte/sources/base.py (1)
  • Source (67-969)
airbyte/_util/meta.py (1)
  • is_docker_installed (168-169)
airbyte/_executors/docker.py (1)
  • ensure_installation (36-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/mcp/_local_ops.py (3)

55-68: Nice: single annotation for source avoids mypy no-redef

Declaring source: Source once and assigning in branches resolves the prior mypy complaint and keeps the code clean. LGTM. Do we also want to add a return type comment in the docstring (e.g., “Returns: Source”) for quick readers, wdyt?


99-103: LGTM: Consistent override_execution_mode plumbed through all MCP tools

The new parameter is documented consistently and correctly forwarded to _get_mcp_source across tools. This should make execution-mode overrides straightforward for callers. Nice touch keeping the default as "auto".

Also applies to: 183-186, 214-218, 220-223, 256-260, 263-266, 316-320, 327-330, 389-393, 395-398


19-19: Import of Source for return typing looks good

This keeps type clarity without pulling heavy deps at runtime. Looks good to me.

Copy link

github-actions bot commented Aug 26, 2025

PyTest Results (Full)

364 tests  ±0   348 ✅ ±0   20m 52s ⏱️ + 2m 20s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 4ea78c0. ± Comparison against base commit 721f26c.

♻️ This comment has been updated with latest results.

@aaronsteers aaronsteers enabled auto-merge (squash) August 28, 2025 02:05
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte/mcp/_local_ops.py (1)

56-62: Heuristics parity: should 'auto' mirror get_connector_executor ordering?

Currently, 'auto' prefers Docker-when-installed, else defers to get_source defaults. For parity across entry points, do we want 'auto' to mirror get_connector_executor’s registry-first ordering (e.g., local/registry → YAML → Python → Docker) rather than privileging Docker, or is the current bias intentional, wdyt?

🧹 Nitpick comments (3)
airbyte/mcp/_local_ops.py (3)

86-88: Avoid duplicate/expensive spec calls caused by ensure_installation()

DockerExecutor.ensure_installation() runs execute(["spec"]) (see airbyte/_executors/docker.py), and most call sites here immediately access source.config_spec, which runs spec again. This doubles startup latency. Since every public tool below reads spec soon after, can we drop this call here and rely on the subsequent spec access, wdyt?

Apply this diff to remove the redundant call:

-    # Ensure installed:
-    source.executor.ensure_installation()
+    # Defer installation verification to the first spec/check/read call to avoid duplicate work.

105-109: Deduplicate execution-mode typing and keep Literal order consistent

We repeat Literal["docker", "python", "yaml", "auto"] across many signatures (with a different order in _get_mcp_source). Shall we introduce a shared alias to reduce drift and enforce consistency, wdyt?

Example alias (place near imports):

from typing import Literal
ExecutionMode = Literal["auto", "docker", "python", "yaml"]

Then update each signature, e.g.:

-override_execution_mode: Annotated[Literal["docker", "python", "yaml", "auto"], Field(...)] = "auto",
+override_execution_mode: Annotated[ExecutionMode, Field(...)] = "auto",

Also applies to: 180-184, 220-224, 262-266, 322-326, 395-399


19-19: Optional: move Source import under TYPE_CHECKING to trim import-time cost

If we enable postponed annotations (or use forward references), we can avoid importing airbyte.sources.base at runtime. Interested in this micro-optimization, wdyt?

Option A (preferred): add from __future__ import annotations and:

if TYPE_CHECKING:
    from airbyte.sources.base import Source

Option B: use forward refs "Source" where annotated.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 50e5ea4 and 117ea1f.

📒 Files selected for processing (1)
  • airbyte/mcp/_local_ops.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_local_ops.py (3)
airbyte/sources/base.py (1)
  • Source (67-969)
airbyte/_util/meta.py (1)
  • is_docker_installed (168-169)
airbyte/_executors/docker.py (1)
  • ensure_installation (36-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
airbyte/mcp/_local_ops.py (2)

47-54: Centralized source resolver + safe 'auto' fallback look solid

Nice consolidation. The mypy no-redef fix is clean, and the 'auto' path now falls back to library heuristics when Docker isn’t present — great DX improvement, wdyt?

Also applies to: 56-62, 68-85


115-118: Call-site cleanup reads well

Switching all direct get_source(...) usages to _get_mcp_source(...) keeps mode logic centralized and easier to reason about. Nice touch, wdyt?

Also applies to: 189-192, 226-229, 269-272, 333-336, 401-404

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte/mcp/_local_ops.py (1)

55-62: Nice: explicit 'auto' branch avoids falling through to ValueError when Docker is absent

This addresses the earlier “auto errors without Docker” concern. LGTM.

🧹 Nitpick comments (4)
airbyte/mcp/_local_ops.py (4)

19-19: Consider postponing type evaluation to avoid a runtime import of Source

Would we prefer to avoid importing Source at runtime (reducing import-time work and potential cycles) and keep it under TYPE_CHECKING by turning on postponed annotations, wdyt? Example:

+from __future__ import annotations
@@
-from airbyte.sources.base import Source
+if TYPE_CHECKING:
+    from airbyte.sources.base import Source

52-54: Auto mode currently forces Docker when present — consider delegating to library heuristics for parity

Right now, "auto" becomes "docker" whenever Docker is installed. For parity with other entry points (and to avoid picking Docker for connectors without Docker images), would you consider letting get_source(...) heuristics decide in auto mode without forcing Docker, wdyt?

-    if override_execution_mode == "auto" and is_docker_installed():
-        override_execution_mode = "docker"
+    # Let library heuristics resolve the best executor in 'auto' mode without forcing Docker.

86-88: Double-check the cost/necessity of always calling ensure_installation()

DockerExecutor.ensure_installation() runs a spec which can be non-trivial; many calls below will soon run discover anyway. Do we want to:

  • call ensure_installation() only when we need a fail-fast check (e.g., Docker path), or
  • document the extra round-trip so users aren’t surprised by the additional latency, wdyt?

109-113: Unify Literal order and clarify ‘auto’ semantics in param descriptions

For consistency/readability, would you align the Literal[...] order across functions (e.g., Literal["auto", "docker", "python", "yaml"]) and add a brief note that “auto uses Docker when available, otherwise falls back to library heuristics,” wdyt? Example:

-override_execution_mode: Annotated[
-    Literal["docker", "python", "yaml, "auto"],
-    Field(description="Optionally override the execution method to use for the connector."),
-] = "auto",
+override_execution_mode: Annotated[
+    Literal["auto", "docker", "python", "yaml"],
+    Field(description="Optionally override the execution method. In 'auto', Docker is used when available; otherwise, library heuristics choose the best available mode."),
+] = "auto",

Also applies to: 189-193, 234-238, 281-285, 346-350, 424-428

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 117ea1f and 4ea78c0.

📒 Files selected for processing (1)
  • airbyte/mcp/_local_ops.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_local_ops.py (3)
airbyte/sources/base.py (1)
  • Source (67-969)
airbyte/_util/meta.py (1)
  • is_docker_installed (168-169)
airbyte/_executors/docker.py (1)
  • ensure_installation (36-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/mcp/_local_ops.py (1)

119-122: Call sites correctly consolidate on _get_mcp_source and thread through override

The refactor to a single resolver looks good and simplifies maintenance across tools.

Also applies to: 198-201, 240-243, 288-291, 357-360, 430-433

@aaronsteers aaronsteers merged commit 43e5af7 into main Aug 29, 2025
22 checks passed
@aaronsteers aaronsteers deleted the feat/mcp/allow-override-execution-mode branch August 29, 2025 06:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant