Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Sep 16, 2025

fix(mcp): Auto-register .envrc files for secret management

Summary

Fixes an issue where MCP tools were prompting for credentials manually instead of automatically loading them from .envrc files. The problem was that PyAirbyte's default secret managers only looked for .env files, not .envrc files.

The fix adds automatic detection and registration of .envrc files in the MCP initialize_secrets() function. When a .envrc file exists in the current working directory, it gets registered as a DotenvSecretManager and loaded into the environment.

Before: MCP tools prompted for AIRBYTE_CLOUD_WORKSPACE_ID, AIRBYTE_CLOUD_CLIENT_ID, and AIRBYTE_CLOUD_CLIENT_SECRET even when they were defined in .envrc

After: MCP tools automatically load credentials from .envrc without prompting

Review & Testing Checklist for Human

  • Test with different working directories - Verify .envrc detection works when MCP tools are invoked from different directories
  • Test edge cases - Verify behavior with missing .envrc, malformed .envrc, or .envrc with syntax errors
  • Verify secret manager priority - Confirm that .envrc secrets have the intended precedence over other secret sources
  • Security review - Consider security implications of automatically loading .envrc files from the current directory
  • Test broader MCP functionality - Verify other MCP tools still work correctly (I only tested check_airbyte_cloud_workspace and list_deployed_cloud_connections)

Test Plan

  1. Run poe mcp-tool-test check_airbyte_cloud_workspace '{}' - should work without prompting
  2. Run poe mcp-tool-test list_deployed_cloud_connections '{}' - should work without prompting
  3. Test from different directories to verify path resolution
  4. Test with other MCP tools to ensure no regressions

Notes

  • This change affects all MCP operations since it modifies the core secret initialization
  • The fix registers the .envrc secret manager after custom dotenv managers but before GCP GSM
  • Both secret registration AND environment loading occur for .envrc files

Link to Devin run: https://app.devin.ai/sessions/3dbb42ea20ff42e59cb1a635753c2407
Requested by: @aaronsteers

Summary by CodeRabbit

  • New Features

    • Automatically loads secrets from a .envrc file in the current directory.
    • Adds MCP mode, enabling tailored behavior and telemetry when running the MCP server.
  • Improvements

    • More accurate detection of interactive environments.
    • Suppresses secret prompts in non-interactive runs to prevent hangs.
  • Telemetry

    • Includes an MCP mode flag in environment metadata.
  • Chores

    • Adjusted server startup to enable mode and initialize secrets in the correct order (no public API changes).

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

- Add automatic detection and registration of .envrc files in MCP initialize_secrets()
- Fixes issue where .envrc environment variables weren't being loaded in MCP mode
- Now MCP tools can access credentials from .envrc without manual prompting

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

Original prompt from AJ Steers
@Devin - see if your new creds are working for PyAirbyte cloud operations MCP tools. Use the mcp-tool-test Poe command to confirm you can get workspace info and list connections. The secrets should be loaded automatically via .envrc - don't do anything, just try to run the tools with that poe task and let me know if it works.
Thread URL: https://airbytehq-team.slack.com/archives/D089P0UPVT4/p1758055021736529?thread_ts=1758055021.736529

Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1758055597-fix-envrc-loading-mcp' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1758055597-fix-envrc-loading-mcp'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

Copy link

github-actions bot commented Sep 16, 2025

PyTest Results (Fast Tests Only, No Creds)

301 tests  ±0   300 ✅ ±0   4m 38s ⏱️ +4s
  1 suites ±0     1 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit fe29937. ± Comparison against base commit 2295dd2.

♻️ This comment has been updated with latest results.

Copy link
Contributor

coderabbitai bot commented Sep 16, 2025

📝 Walkthrough

Walkthrough

Adds MCP mode plumbing and integrates it into startup, interactivity detection, and telemetry. Secret initialization now loads a .envrc in the CWD before other dotenv sources and disables PROMPT secrets when non-interactive. MCP server now enables MCP mode then initializes secrets. Telemetry records MCP mode.

Changes

Cohort / File(s) Summary
MCP secret initialization
airbyte/mcp/_util.py
Adds early .envrc loading via DotenvSecretManager in CWD, registers it before other dotenv handling; imports is_interactive, SecretSourceEnum, disable_secret_source; disables PROMPT source when non-interactive; no public API change to initialize_secrets.
Meta: MCP mode and interactivity
airbyte/_util/meta.py
Introduces _MCP_MODE_ENABLED flag with set_mcp_mode() and is_mcp_mode(); revises is_interactive() (no caching, CI/MCP short-circuit false; Colab/Jupyter true; TTY fallback; exception-safe); updates docstrings.
Telemetry: MCP flag
airbyte/_util/telemetry.py
Adds MCP environment flag in get_env_flags() using meta.is_mcp_mode(); included only when truthy.
MCP server bootstrap
airbyte/mcp/server.py
Moves initialize_secrets() from module scope into main(); calls set_mcp_mode() then initialize_secrets() before async run; imports set_mcp_mode; no signature changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI/User
  participant Server as mcp/server.py
  participant Meta as _util/meta.py
  participant Secrets as mcp/_util.py
  participant Dotenv as DotenvSecretManager
  participant Telemetry as _util/telemetry.py

  CLI->>Server: main()
  Server->>Meta: set_mcp_mode()
  Server->>Secrets: initialize_secrets()
  Secrets->>Secrets: check CWD for .envrc
  alt .envrc present
    Secrets->>Dotenv: create manager for CWD/.envrc
    Secrets->>Dotenv: _load_dotenv_file()
    Secrets->>Secrets: register secret manager
  end
  Secrets->>Meta: is_interactive()
  alt non-interactive
    Secrets->>Secrets: disable_secret_source(PROMPT)
  end
  Note over Server,Secrets: Other dotenv/env sources initialized as before

  Server-->>CLI: start async server

  CLI->>Telemetry: get_env_flags()
  Telemetry->>Meta: is_mcp_mode()
  Telemetry-->>CLI: flags (includes mcp=true)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately and concisely summarizes the primary changes in the changeset: adding MCP mode flags to meta/telemetry and auto-registering .envrc files as secret managers. It is directly related to the diff, specific about the main intent, and avoids vague phrasing or irrelevant details, so a reviewer scanning history will understand the primary change.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch devin/1758055597-fix-envrc-loading-mcp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
airbyte/mcp/_util.py (2)

41-47: Optional: search parent directories for .envrc and add a debug log

Many teams keep .envrc at the repo root and invoke tools from subdirs. Do we want to walk up to the git root (or first parent) to find the nearest .envrc and log which one was loaded for debuggability, wdyt?


20-21: Docstring clarity: not just “.env” files

We now also load .envrc. Shall we broaden the docstring to “dotenv-format (.env/.envrc) file” for clarity, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2295dd2 and 284f09a.

📒 Files selected for processing (1)
  • airbyte/mcp/_util.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (2)
airbyte/secrets/env_vars.py (1)
  • DotenvSecretManager (31-64)
airbyte/secrets/config.py (1)
  • register_secret_manager (45-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)

Copy link

github-actions bot commented Sep 16, 2025

PyTest Results (Full)

364 tests  ±0   348 ✅ ±0   21m 10s ⏱️ - 1m 38s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit fe29937. ± Comparison against base commit 2295dd2.

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 2 commits September 16, 2025 14:53
…track in telemetry

- Add is_mcp() function with multiple detection strategies (environment vars, FastMCP module, script name)
- Add enable_mcp_mode() helper function to set private _MCP_MODE_ENABLED constant upstream
- Modify is_interactive() to return False when in MCP mode, preventing SecretsPrompt from being added
- Add MCP flag to telemetry tracking in get_env_flags()
- Auto-register .envrc files for secret management in MCP mode
- Remove @lru_cache from is_interactive() to allow dynamic MCP detection
- Call enable_mcp_mode() early in MCP server initialization for proper timing

Co-Authored-By: AJ Steers <[email protected]>
…mplify logic

- Rename is_mcp() to is_mcp_mode() for clarity
- Rename enable_mcp_mode() to set_mcp_mode() for consistency
- Remove private _set_mcp_mode() function and inline logic
- Simplify is_mcp_mode() conditions to avoid false positives
- Update all references in telemetry.py and mcp/server.py

Addresses GitHub comments from aaronsteers on PR #781

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
airbyte/_util/meta.py (1)

35-59: MCP detection via fastmcp import may cause false positives; tighten or confirm intent?

is_mcp_mode() returns True if "fastmcp" is in sys.modules. Any code importing FastMCP (e.g., tests, helpers) would flip the library into MCP mode globally. Is that intentional, or should MCP mode be enabled only via set_mcp_mode()/server entrypoint/script-name? If we want stricter semantics, would you prefer to drop or further gate the fastmcp heuristic (e.g., require script name match or an env flag) — wdyt?

airbyte/_util/telemetry.py (1)

182-199: Avoid caching get_env_flags() to prevent stale MCP flag.

With @lru_cache, if get_env_flags() is called before set_mcp_mode(), the cached value will omit "MCP": True for the rest of the process. Since flags are cheap to compute and correctness matters more than micro‑perf here, would you drop the cache — wdyt?

Apply this diff:

-@lru_cache
 def get_env_flags() -> dict[str, Any]:
@@
-    # Drop these flags if value is False or None
+    # Drop these flags if value is False or None
     return {k: v for k, v in flags.items() if v is not None and v is not False}

Optionally also remove the unused import:

-from functools import lru_cache
airbyte/mcp/server.py (1)

9-23: Reduce import‑time side effects: move MCP enable + init into main().

Calling set_mcp_mode() and initialize_secrets() at import time flips global behavior for any importer (e.g., tests importing this module). Would you move these into main() and construct/register the app there to avoid global side effects — wdyt?

Apply this diff:

-from airbyte._util.meta import set_mcp_mode
+from airbyte._util.meta import set_mcp_mode
@@
-
-set_mcp_mode()
-initialize_secrets()
-
-app: FastMCP = FastMCP("airbyte-mcp")
-register_connector_registry_tools(app)
-register_local_ops_tools(app)
-register_cloud_ops_tools(app)
+def _build_app() -> FastMCP:
+    app = FastMCP("airbyte-mcp")
+    register_connector_registry_tools(app)
+    register_local_ops_tools(app)
+    register_cloud_ops_tools(app)
+    return app
@@
 def main() -> None:
     """Main entry point for the MCP server."""
     print("Starting Airbyte MCP server.", file=sys.stderr)
     try:
-        asyncio.run(app.run_stdio_async())
+        set_mcp_mode()
+        initialize_secrets()
+        app = _build_app()
+        asyncio.run(app.run_stdio_async())
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 284f09a and 18dfaf8.

📒 Files selected for processing (3)
  • airbyte/_util/meta.py (2 hunks)
  • airbyte/_util/telemetry.py (1 hunks)
  • airbyte/mcp/server.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte/_util/telemetry.py (1)
airbyte/_util/meta.py (1)
  • is_mcp_mode (48-57)
airbyte/mcp/server.py (2)
airbyte/_util/meta.py (1)
  • set_mcp_mode (38-45)
airbyte/mcp/_util.py (1)
  • initialize_secrets (30-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/_util/meta.py (1)

88-91: Good call: treat MCP/CI as non-interactive and remove caching.

Removing the cache avoids stale interactivity when MCP is enabled later, and short-circuiting on MCP/CI prevents prompts. Looks solid; just confirm no hot path relied on the old caching, wdyt?

@aaronsteers aaronsteers changed the title fix(mcp): Auto-register .envrc files for secret management fix(mcp): Add MCP mode flags to meta and telemetry modules, auto-register .envrc files as secret managers Sep 16, 2025
@aaronsteers aaronsteers changed the title fix(mcp): Add MCP mode flags to meta and telemetry modules, auto-register .envrc files as secret managers feat: add MCP mode flags to meta and telemetry modules, auto-register .envrc files as secret managers Sep 16, 2025
@aaronsteers aaronsteers enabled auto-merge (squash) September 16, 2025 22:12
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
airbyte/mcp/_util.py (2)

48-55: Apply intended precedence: custom dotenv > .envrc > GSM.

Suggest adding .envrc here and appending as backup to preserve explicit‑config priority. Also, consider explicitly checking is_file().

     if AIRBYTE_MCP_DOTENV_PATH_ENVVAR in os.environ:
         dotenv_path = Path(os.environ[AIRBYTE_MCP_DOTENV_PATH_ENVVAR]).absolute()
         custom_dotenv_secret_mgr = DotenvSecretManager(dotenv_path)
         _load_dotenv_file(dotenv_path)
         register_secret_manager(
             custom_dotenv_secret_mgr,
         )
+
+    # Load a local .envrc, if present, as a fallback after explicit dotenv paths.
+    envrc_path = Path.cwd() / ".envrc"
+    if envrc_path.is_file():
+        envrc_secret_mgr = DotenvSecretManager(envrc_path)
+        _load_dotenv_file(envrc_path)
+        register_secret_manager(
+            envrc_secret_mgr,
+            as_backup=True,  # ensure AIRBYTE_MCP_ENV_FILE retains precedence
+        )

56-63: Make Google GSM a fallback to keep local dotenvs authoritative.

Right now GSM is inserted at the head, which would override dotenvs. Shall we append as backup to achieve the intended chain (custom dotenv > .envrc > GSM), wdyt?

-        register_secret_manager(
-            GoogleGSMSecretManager(
-                project=get_secret("GCP_GSM_PROJECT_ID"),
-                credentials_json=get_secret("GCP_GSM_CREDENTIALS"),
-            )
-        )
+        register_secret_manager(
+            GoogleGSMSecretManager(
+                project=get_secret("GCP_GSM_PROJECT_ID"),
+                credentials_json=get_secret("GCP_GSM_CREDENTIALS"),
+            ),
+            as_backup=True,
+        )
♻️ Duplicate comments (1)
airbyte/mcp/_util.py (1)

40-47: .envrc precedence and PR summary mismatch.

Current code registers .envrc before the custom dotenv file and inserts it at the head, which risks precedence surprises and contradicts the PR summary (“after custom dotenv managers but before GCP GSM”). Could we move .envrc registration after the custom dotenv block and append it as a backup so explicit AIRBYTE_MCP_ENV_FILE wins?

-    envrc_path = Path.cwd() / ".envrc"
-    if envrc_path.exists():
-        envrc_secret_mgr = DotenvSecretManager(envrc_path)
-        _load_dotenv_file(envrc_path)
-        register_secret_manager(
-            envrc_secret_mgr,
-        )
+    # (moved below custom dotenv registration, appended as backup)

And add the block below (see next comment for full placement).

🧹 Nitpick comments (2)
airbyte/_util/meta.py (1)

34-37: Do we want is_ci() cached?

If CI can change during process lifetime (rare but possible in tests), caching could stale. Keeping it uncached is also cheap; want to drop @lru_cache here, or keep as‑is for perf, wdyt?

airbyte/mcp/_util.py (1)

27-35: Explicitly set override=False when loading dotenv files?

dotenv.load_dotenv defaults to override=False, but being explicit avoids accidental overrides if defaults change. Add override=False (and maybe encoding="utf-8") for clarity, wdyt?

-    dotenv.load_dotenv(dotenv_path=dotenv_path)
+    dotenv.load_dotenv(dotenv_path=dotenv_path, override=False, encoding="utf-8")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2501a6 and eeff879.

📒 Files selected for processing (3)
  • airbyte/_util/meta.py (3 hunks)
  • airbyte/mcp/_util.py (3 hunks)
  • airbyte/mcp/server.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/mcp/server.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-09T21:11:11.706Z
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: airbyte/cli.py:0-0
Timestamp: 2024-10-09T21:11:11.706Z
Learning: In `_resolve_source_job`, returning `None` when `config` is falsy distinguishes between empty config and not-yet-provided config, allowing `_resolve_config()` to handle non-null inputs effectively.

Applied to files:

  • airbyte/mcp/_util.py
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (4)
airbyte/_util/meta.py (1)
  • is_interactive (85-99)
airbyte/secrets/env_vars.py (1)
  • DotenvSecretManager (31-64)
airbyte/secrets/base.py (1)
  • SecretSourceEnum (23-31)
airbyte/secrets/config.py (2)
  • register_secret_manager (45-60)
  • disable_secret_source (68-81)
🪛 GitHub Actions: Run Linters
airbyte/_util/meta.py

[error] 55-55: Ruff SIM103: Return the condition directly. Replace the if statement with 'return script_name and "airbyte-mcp" in script_name'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
🔇 Additional comments (4)
airbyte/_util/meta.py (2)

39-47: MCP mode flag initializer looks good.

Idempotent, simple, and early‑init friendly. Nice.


85-99: Interactive detection order is sensible.

Short‑circuiting CI/MCP to non‑interactive prevents accidental prompts. Thanks for removing caching here.

airbyte/mcp/_util.py (2)

12-19: Imports and PROMPT disabling hook look right.

Good call importing is_interactive and wiring disable_secret_source into init flow.


65-67: Nice: disable PROMPT in non‑interactive runs.

This prevents blocking prompts under MCP/CI. LGTM.

- Fix SIM103 lint error in is_mcp_mode() by simplifying conditional logic
- Fix test_ci_environment_b_progress_style by clearing is_ci() cache
- Update docstring to clarify secret manager precedence

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/unit_tests/test_progress.py (2)

164-164: Optional: prefer module-scope import for readability

Since we’re explicitly clearing the cache, importing meta at module scope would be simpler and avoids function-local imports. Would you consider moving the import to the top of the file and keeping the cache_clear calls as you’ve added, or do you prefer the local import to emphasize test isolation? Wdyt?


164-166: Add a finalizer to clear meta.is_ci cache after the test to prevent cross-test leakage

is_ci is decorated with @lru_cache in airbyte/_util/meta.py, so meta.is_ci.cache_clear exists — could we register a finalizer to always clear the cache after the test? Wdyt?

File: tests/unit_tests/test_progress.py
Lines: 164-166

 def test_ci_environment_b_progress_style(monkeypatch):
     """Test the style in a CI environment."""
     monkeypatch.setenv("CI", "1")
     from airbyte._util import meta

     meta.is_ci.cache_clear()
+    # Ensure cache is cleared after the test to avoid leakage across tests
+    monkeypatch.addfinalizer(meta.is_ci.cache_clear)

     progress = ProgressTracker(source=None, cache=None, destination=None)
     assert progress.style == ProgressStyle.PLAIN
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eeff879 and e509c21.

📒 Files selected for processing (3)
  • airbyte/_util/meta.py (3 hunks)
  • airbyte/mcp/_util.py (3 hunks)
  • tests/unit_tests/test_progress.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte/mcp/_util.py
  • airbyte/_util/meta.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit_tests/test_progress.py (1)
airbyte/_util/meta.py (1)
  • is_ci (35-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

…licts

- Enhanced remove_postgres_container fixture to handle all container states
- Added dynamic port allocation when default port 5432 is in use
- Improved error handling for Docker container management
- Fixes test_sync_with_merge_to_postgres CI failure

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
tests/conftest.py (5)

117-124: Bind to loopback and drop listen() to satisfy CodeQL and reduce exposure

Binding to "" listens on all interfaces and triggers the CodeQL warning; also listen() isn’t needed just to reserve an ephemeral port. Wdyt?

-def find_free_port():
-    """Find a free port to use for PostgreSQL testing."""
-    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
-        s.bind(("", 0))
-        s.listen(1)
-        port = s.getsockname()[1]
-    return port
+def find_free_port():
+    """Find a free loopback port to use for PostgreSQL testing."""
+    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+        s.bind(("127.0.0.1", 0))
+        port = s.getsockname()[1]
+    return port

128-147: Simplify container cleanup; avoid redundant scan and log when Docker is unavailable

After get(...).remove(), scanning all containers by name is redundant (names are unique). Also, can we log the DockerException at debug for easier diagnostics, wdyt?

-        for container in client.containers.list(all=True):
-            if container.name == PYTEST_POSTGRES_CONTAINER:
-                if container.status in ["running", "paused"]:
-                    container.stop(timeout=10)
-                container.remove(force=True)
-
-    except docker.errors.DockerException:
-        pass  # Docker not running, nothing to do.
+        # No further scan needed: names are unique and get()+remove() above is sufficient.
+
+    except docker.errors.DockerException as e:
+        logger.debug("Docker not running; skipping postgres container cleanup: %s", e)

149-151: Use a short connect timeout to make readiness polling snappier

A small connect_timeout keeps each failed attempt bounded and speeds up the overall wait loop. Wdyt?

-    pg_url = f"postgresql://postgres:postgres@{host}:{port}/postgres"
+    pg_url = f"postgresql://postgres:postgres@{host}:{port}/postgres?connect_timeout=1"

202-208: Avoid TOCTOU port race; let Docker pick an ephemeral loopback port and then read it

Choosing a “free” port and then starting the container has a race. Could we ask Docker to allocate a random host port on 127.0.0.1 and then read it from the container’s bindings, wdyt?

-    postgres_port = PYTEST_POSTGRES_PORT
-    if is_port_in_use(postgres_port):
-        postgres_port = find_free_port()
-        logger.info(
-            f"Port {PYTEST_POSTGRES_PORT} is in use, using port {postgres_port} instead"
-        )
+    # Let Docker allocate a free ephemeral port on loopback to avoid port races.
+    postgres_port = None  # will be populated from container bindings after start
-        ports={"5432/tcp": postgres_port},
+        ports={"5432/tcp": ("127.0.0.1", 0)},
         detach=True,
         remove=False,

Then, right after creating the container (before host detection), set postgres_port from the binding:

postgres.reload()
bindings = (postgres.attrs["NetworkSettings"]["Ports"] or {}).get("5432/tcp") or []
if not bindings:
    raise Exception("No 5432/tcp host binding found on postgres container.")
postgres_port = int(bindings[0]["HostPort"])

Also applies to: 218-221


246-249: Clarify error message: ‘host’ may be stale/ambiguous

On failure, ‘host’ could be the last candidate tried; the message may mislead. Prefer referencing “any tested host” and include the port, wdyt?

-    if final_host is None:
-        raise Exception(f"Failed to connect to the PostgreSQL database on host {host}.")
+    if final_host is None:
+        raise Exception(f"Failed to connect to the PostgreSQL database on any tested host (port {postgres_port}).")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e509c21 and 7aa4a42.

📒 Files selected for processing (1)
  • tests/conftest.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/conftest.py (1)
airbyte/caches/postgres.py (1)
  • PostgresCache (37-61)
🪛 GitHub Check: CodeQL
tests/conftest.py

[warning] 120-120: Binding a socket to all network interfaces
'' binds a socket to all interfaces.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
🔇 Additional comments (2)
tests/conftest.py (2)

196-199: LGTM on deterministic cleanup

Stopping if running/paused and then removing the prior container looks good.


257-257: LGTM — quick check: verify test usage of new_postgres_cache

rg found one hit: tests/integration_tests/test_source_test_fixture.py:823 — confirm that the test is receiving the fixture value via pytest injection (i.e., new_postgres_cache is a test parameter) or update that call site to use the fixture’s return (host, port) appropriately. wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/_util/meta.py (1)

81-95: Swap CI/MCP check before Jupyter/Colab to prevent prompting in CI, wdyt?

Current order returns True for Jupyter/Colab before checking CI/MCP; flip the checks so CI/MCP always disable interactivity.

     try:
-        if is_colab() or is_jupyter():
-            return True
-
-        if is_ci() or is_mcp_mode():
-            return False
+        if is_ci() or is_mcp_mode():
+            return False
+
+        if is_colab() or is_jupyter():
+            return True
♻️ Duplicate comments (1)
airbyte/_util/meta.py (1)

48-55: Collapse to a single boolean return to satisfy Ruff SIM103.

We can return the condition directly and include the flag check in one expression. This also aligns with the prior lint feedback. Shall we apply the one-liner, wdyt?

 def is_mcp_mode() -> bool:
     """Return True if running in MCP (Model Context Protocol) mode."""
-    if _MCP_MODE_ENABLED:
-        return True
-
-    script_name = get_python_script_name()
-    return bool(script_name and "airbyte-mcp" in script_name)
+    script_name = get_python_script_name()
+    return bool(_MCP_MODE_ENABLED or (script_name and "airbyte-mcp" in script_name))
🧹 Nitpick comments (2)
airbyte/_util/meta.py (2)

20-22: LGTM on introducing the MCP mode flag.

Module-level flag and docstring look good. Given this is mutable global state, do we want a lightweight way to toggle it in tests (see next comment), wdyt?


38-46: Make set_mcp_mode toggleable for tests and multi-run scenarios.

Allowing enable/disable via a parameter makes this easier to use in tests and long-lived processes without needing a separate “reset” API. Shall we switch to this form, wdyt?

-def set_mcp_mode() -> None:
-    """Set flag indicating we are running in MCP (Model Context Protocol) mode.
+def set_mcp_mode(enabled: bool = True) -> None:
+    """Enable or disable MCP (Model Context Protocol) mode.
 
-    This should be called early in MCP server initialization to ensure
-    proper detection and prevent interactive prompts.
+    Call early in MCP server initialization to ensure proper detection and
+    prevent interactive prompts. Pass enabled=False to undo for tests.
     """
     global _MCP_MODE_ENABLED
-    _MCP_MODE_ENABLED = True
+    _MCP_MODE_ENABLED = bool(enabled)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7aa4a42 and c5e8641.

📒 Files selected for processing (1)
  • airbyte/_util/meta.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
airbyte/_util/meta.py (3)

20-21: Global MCP flag: add a reset/context manager for tests and nested flows?

The global works, but tests and nested server spins will benefit from a reversible toggle to avoid state leakage.

You could add a context manager and a reset API outside this range:

from contextlib import contextmanager

def reset_mcp_mode() -> None:
    global _MCP_MODE_ENABLED
    _MCP_MODE_ENABLED = False

@contextmanager
def mcp_mode(enabled: bool = True):
    global _MCP_MODE_ENABLED
    prev = _MCP_MODE_ENABLED
    try:
        _MCP_MODE_ENABLED = bool(enabled)
        yield
    finally:
        _MCP_MODE_ENABLED = prev

Would this help in tests and CLIs, wdyt?


38-46: Generalize setter to accept a bool; keep API idempotent.

This enables toggling in tests and aligns with the proposed context manager.

Apply:

-def set_mcp_mode() -> None:
-    """Set flag indicating we are running in MCP (Model Context Protocol) mode.
+def set_mcp_mode(enabled: bool = True) -> None:
+    """Set flag indicating we are running in MCP (Model Context Protocol) mode.
@@
-    global _MCP_MODE_ENABLED
-    _MCP_MODE_ENABLED = True
+    global _MCP_MODE_ENABLED
+    _MCP_MODE_ENABLED = bool(enabled)

Shall we do this for better testability, wdyt?


76-91: Interactivity check: prefer sys.stdin/stdout with fallback to stdin/stdout?

Relying only on stdin/stdout can mis-detect interactivity under runners that swap stdio (e.g., uvicorn, pytest captures). Using sys.* first, then falling back is more accurate.

Apply:

-        # No special modes detected. Return result based on whether stdin and stdout are ttys.
-        return bool(
-            sys.__stdin__ and sys.__stdin__.isatty() and sys.__stdout__ and sys.__stdout__.isatty()
-        )
+        # No special modes detected. Check TTY on current streams, with fallback to original streams.
+        stdin = getattr(sys, "stdin", None) or getattr(sys, "__stdin__", None)
+        stdout = getattr(sys, "stdout", None) or getattr(sys, "__stdout__", None)
+        return bool(stdin and stdin.isatty() and stdout and stdout.isatty())

Does this match your intended behavior across app servers and test runners, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5e8641 and fe29937.

📒 Files selected for processing (1)
  • airbyte/_util/meta.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
airbyte/_util/meta.py (1)

48-50: LGTM: simple, strict-bool source of truth.

This addresses the earlier SIM103-style concerns by returning a boolean directly. No issues.

@aaronsteers aaronsteers merged commit 10125ff into main Sep 17, 2025
22 checks passed
@aaronsteers aaronsteers deleted the devin/1758055597-fix-envrc-loading-mcp branch September 17, 2025 06:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant