Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Aug 27, 2025

feat: split config parameter into config and config_file for better MCP client interoperability

Summary

This PR addresses MCP client interoperability issues by splitting the config parameter into two separate parameters:

  • config: Now accepts dict | str | None (dict objects OR JSON strings)
  • config_file: New parameter accepting str | Path | None for file paths

The changes expand the resolve_config() utility function to parse JSON strings and handle the new parameter structure, then update all 6 MCP tools (validate_connector_config, list_source_streams, get_source_stream_json_schema, read_source_stream_records, get_stream_previews, sync_source_to_cache) to use the new signature.

Key motivation: Different MCP clients (like Claude Desktop) may serialize JSON objects as strings, causing type validation errors with the previous dict | Path | None typing. This change allows both programmatic clients (passing dict objects) and user-facing clients (passing JSON strings) to work seamlessly.

Review & Testing Checklist for Human

⚠️ CRITICAL TESTING NEEDED - Environment issues prevented full end-to-end testing during development.

  • Test JSON string parsing: Verify config parameter correctly handles JSON strings like '{"count": 100}' across all MCP tools
  • Test backward compatibility: Ensure existing usage with dict objects continues to work without changes
  • Test parameter priority logic: Verify behavior when multiple config sources are provided (config + config_file + config_secret_name) - config should override config_file, secrets should have highest priority
  • Test file-based configs: Confirm config_file parameter works correctly with YAML/JSON file paths
  • Test error handling: Verify graceful failures for malformed JSON strings and invalid file paths

Notes

  • Breaking change risk: The config parameter no longer accepts Path objects directly - these should now use config_file instead
  • Type safety: Added comprehensive JSON parsing with proper error messages for malformed input
  • Merge priority: config_file < config < config_secret_name (highest priority)
  • Requested by: @aaronsteers from session https://app.devin.ai/sessions/8a5a613809a34709967ffb7f8a130c90

Summary by CodeRabbit

  • New Features

    • Accept connector config as a JSON string or via a YAML/JSON config file (new config_file) and added .env loading via AIRBYTE_MCP_DOTENV_PATH_ENVVAR.
  • Bug Fixes / Validation

    • Improved errors for missing/unreadable/malformed/non-dict config inputs; changed behavior to return empty config when no input provided; enforces clear precedence: file → config/string → secrets; detects hardcoded secrets and surfaces guidance.
  • Documentation

    • Updated descriptions and error messages to reflect new input forms, precedence, and exceptions.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

…CP client interoperability

- Expand resolve_config() to accept config as dict or JSON string
- Add separate config_file parameter for file paths
- Update all MCP tools to use new parameter structure
- Maintain backward compatibility with existing usage
- Improve interoperability across different MCP clients

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

Original prompt from AJ Steers
@Devin - I need you to test the PyAirbyte MCP server for a specific use case. I commonly get errors from the LLM if it tries to pass config directly instead of using a config secret. Try this on a connector that does not need auth first, like source faker. If you can pass config to it properly, then I want you to try passing Config that needs hydrated secrets. There are three ways that an LLM can send this input arg: as a string containing JSON or as a JSON structure directly. Remind me before we start how this field is typed in the function signature. It might be that fast MCP is casting it incorrectly because we provided two wide of an input option set

Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1735261427-split-config-parameter' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1735261427-split-config-parameter'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

Addresses GitHub comment from @aaronsteers to move import statement
to the top of the file for better code organization.

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers aaronsteers enabled auto-merge (squash) August 27, 2025 01:30
Copy link
Contributor

coderabbitai bot commented Aug 27, 2025

📝 Walkthrough

Walkthrough

Public MCP operation signatures now accept dict or JSON string configs and add a config_file parameter. resolve_config was extended to load configs from file (YAML/JSON), parse JSON strings, merge file -> config -> secret, detect hardcoded secrets, and optionally load a .env via an env var (AIRBYTE_MCP_DOTENV_PATH_ENVVAR). Operations pass config_file through to resolve_config.

Changes

Cohort / File(s) Summary of changes
MCP operations API updates
airbyte/mcp/_local_ops.py
Updated function signatures: config now accepts dict or JSON str; added config_file (str or Path). Updated docstrings and propagated config_file to resolve_config calls in validate_connector_config, list_source_streams, get_source_stream_json_schema, read_source_stream_records, get_stream_previews, and sync_source_to_cache.
Config resolution & helpers
airbyte/mcp/_util.py
resolve_config signature changed to accept config (dict or JSON str) and config_file (str or Path). Added file loading (YAML/JSON) with existence and parse checks, JSON-string parsing, merge order (file -> config -> secret), explicit errors (FileNotFoundError, ValueError, TypeError) for file/parse/content issues, hardcoded-secret detection against config_spec_jsonschema, dotenv loading via a new helper and env var (AIRBYTE_MCP_DOTENV_PATH_ENVVAR), a _raise_invalid_file_config_type helper, and imported json/dotenv. Docstrings updated accordingly.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Ops as _local_ops.py<br/>MCP Operation
  participant Util as _util.py<br/>resolve_config
  participant FS as File System
  participant Secrets as Secret Store
  participant Env as .env loader

  Caller->>Ops: call op(config, config_file, config_secret_name, ...)
  Ops->>Util: resolve_config(config, config_file, config_secret_name, schema)

  rect rgba(220,240,255,0.35)
    alt config_file provided
      Util->>FS: read config_file (YAML/JSON)
      FS-->>Util: file_config (dict) or error
      Util->>Util: validate file yields dict or raise TypeError
    else no config_file
      Note over Util: skip file load
    end
  end

  alt dotenv env var set
    Util->>Env: _load_dotenv_file(path)
    Env-->>Util: env vars loaded or error
  else no dotenv
    Note over Util: skip dotenv
  end

  alt config is JSON string
    Util->>Util: parse JSON -> dict or raise ValueError
  else config is dict / None
    Note over Util: use as-is
  end

  Util->>Util: merge file_config -> config

  opt config_secret_name provided
    Util->>Secrets: load secret config
    Secrets-->>Util: secret_config (dict)
    Util->>Util: deep_update(main, secret_config)
  end

  Util->>Util: check hardcoded secrets (if schema) -> raise ValueError if found
  Util-->>Ops: resolved_config (dict)
  Ops-->>Caller: result using resolved_config
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Would you like adding unit tests for JSON-string parsing errors, file-vs-config-vs-secret precedence, and dotenv loading behavior, wdyt?

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch devin/1735261427-split-config-parameter

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
airbyte/mcp/_local_ops.py (3)

27-44: Update CONFIG_HELP to reflect new config vs config_file split and precedence

The help text still says “config as JSON or a Path”, which conflicts with the new API. Shall we update and note precedence, wdyt?

-You can provide `config` as JSON or a Path to a YAML/JSON file.
-If a `dict` is provided, it must not contain hardcoded secrets.
+You can provide `config` as a dict or JSON string. For file paths, use `config_file` (YAML/JSON).
+If a dict is provided, it must not contain hardcoded secrets.
@@
-If both `config` and `config_secret_name` are provided, the
-`config` will be loaded first and then the referenced secret config
-will be layered on top of the non-secret config.
+If multiple sources are provided, precedence is:
+`config_file` < `config` < `config_secret_name`.
+The non-secret config is loaded first and then the referenced secret config
+is layered on top.

233-236: Bug: config_file not passed to resolve_config (also causes ARG001)

This breaks the new API and triggers the linter. Shall we pass it through, wdyt?

-        config_dict = resolve_config(
-            config=config,
-            config_secret_name=config_secret_name,
-            config_spec_jsonschema=source.config_spec,
-        )
+        config_dict = resolve_config(
+            config=config,
+            config_file=config_file,
+            config_secret_name=config_secret_name,
+            config_spec_jsonschema=source.config_spec,
+        )

1-539: Ensure config_file is passed to resolve_config in read_source_stream_records

I ran a quick search across all resolve_config call sites and every tool correctly includes config_file—except the one in read_source_stream_records, which only passes config and config_secret_name. This means users can’t supply a --config-file when invoking that tool.

• File: airbyte/mcp/_local_ops.py
Function: read_source_stream_records
Call site (~lines 232–234):

config_dict = resolve_config(
    config=config,
    config_secret_name=config_secret_name,
)

• Suggestion: Add both config_file=config_file (and, for completeness, config_spec_jsonschema=source.config_spec) so it’s consistent with the other tools. For example:

 config_dict = resolve_config(
     config=config,
+    config_file=config_file,
     config_secret_name=config_secret_name,
+    config_spec_jsonschema=source.config_spec,
 )

Would you mind updating that call site so --config-file works as expected? wdyt?

🧹 Nitpick comments (3)
airbyte/mcp/_util.py (2)

46-51: Docstring: make precedence explicit and align exceptions with code

Can we clarify precedence (config_file < config < config_secret_name) and list both ValueError/TypeError in Raises to match behavior, wdyt?

-"""Resolve a configuration dictionary, JSON string, or file path to a dictionary.
+"""Resolve configuration from a dict, JSON string, or file into a dictionary.
@@
-    Returns:
-        Resolved configuration dictionary
+    Precedence:
+        config_file < config < config_secret_name
+
+    Returns:
+        Resolved configuration dictionary.
@@
-    Raises:
-        ValueError: If no configuration provided or if JSON parsing fails
+    Raises:
+        ValueError: If no configuration provided, parsing fails, or file/secret content is invalid.
+        TypeError: If parameters are of an invalid type.
 """

Also applies to: 52-65


46-46: Reduce branching to satisfy PLR0912

Ruff flags “Too many branches (17 > 15)”. Would you consider extracting helpers to cut branches, wdyt?

Example (outside selected range):

def _load_file_config(config_file: Path) -> dict[str, Any]: ...
def _parse_config_param(config: dict | str) -> dict[str, Any]: ...
def _merge_secret(config_dict: dict[str, Any], secret_name: str) -> dict[str, Any]: ...
airbyte/mcp/_local_ops.py (1)

16-44: Reduce duplication for param descriptions across tools

Would you like to factor the repeated description strings into shared constants to keep things DRY and avoid future E501 regressions, wdyt?

Example (outside selected ranges):

CONFIG_DESC = "The configuration for the source connector as a dict object or JSON string."
CONFIG_FILE_DESC = "Path to a YAML or JSON file containing the source connector configuration."

Then reference with Field(description=CONFIG_DESC) / Field(description=CONFIG_FILE_DESC).

Also applies to: 520-539

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between be50b70 and 3bf53e4.

📒 Files selected for processing (2)
  • airbyte/mcp/_local_ops.py (11 hunks)
  • airbyte/mcp/_util.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (1)
airbyte/secrets/hydration.py (1)
  • detect_hardcoded_secrets (140-171)
🪛 GitHub Actions: Run Linters
airbyte/mcp/_util.py

[error] 46-46: Ruff: PLR0912 Too many branches (17 > 15) in function 'resolve_config'.


[error] 68-68: Ruff: PLC0415 import should be at the top-level of a file.


[error] 69-69: Ruff: W293 Blank line contains whitespace.


[error] 71-71: Ruff: W293 Blank line contains whitespace.


[error] 81-81: Ruff: W293 Blank line contains whitespace.


[error] 89-89: Ruff: W293 Blank line contains whitespace.


[error] 93-93: Ruff: TRY004 Prefer TypeError exception for invalid type.


[error] 93-93: Ruff: TRY301 Abstract raise to an inner function.


[error] 99-99: Ruff: B904 Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling.


[error] 106-106: Ruff: TRY004 Prefer TypeError exception for invalid type.


[error] 109-109: Ruff: E501 Line too long (105 > 100).


[error] 113-113: Ruff: B904 Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling.


[error] 113-113: Ruff: TRY301 Abstract raise to an inner function.

airbyte/mcp/_local_ops.py

[error] 131-131: Ruff: E501 Line too long (105 > 100).


[error] 135-135: Ruff: E501 Line too long (104 > 100).


[error] 172-172: Ruff: E501 Line too long (105 > 100).


[error] 176-176: Ruff: E501 Line too long (104 > 100).


[error] 206-206: Ruff: E501 Line too long (105 > 100).


[error] 208-208: Ruff: ARG001 Unused function argument: config_file.


[error] 210-210: Ruff: E501 Line too long (104 > 100).


[error] 264-264: Ruff: E501 Line too long (105 > 100).


[error] 268-268: Ruff: E501 Line too long (104 > 100).


[error] 345-345: Ruff: E501 Line too long (105 > 100).


[error] 349-349: Ruff: E501 Line too long (104 > 100).


[error] 351-351: Ruff: E501 Line too long (105 > 100).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (6)
airbyte/mcp/_util.py (1)

119-136: Hardcoded secret detection: looks solid

The check runs only when schema is provided and before secret overlay—nice layering.

airbyte/mcp/_local_ops.py (5)

78-85: Propagation of config_file looks correct here

Thanks for threading it through to resolve_config.


150-156: resolve_config propagation looks good

config_file is passed and schema provided—LGTM.


188-193: resolve_config propagation looks good

Good consistency with other ops.


296-301: resolve_config propagation looks good

Consistent and correct.


366-370: resolve_config propagation looks good

Matches new API expectations.

Copy link

github-actions bot commented Aug 27, 2025

PyTest Results (Fast Tests Only, No Creds)

301 tests  ±0   301 ✅ ±0   4m 19s ⏱️ -3s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 00bbeb1. ± Comparison against base commit 56748f4.

♻️ This comment has been updated with latest results.

- Shortened Field descriptions to fix E501 line length violations
- Added missing config_file parameter to resolve_config call in read_source_stream_records
- Fixed unused function argument ARG001 error

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
airbyte/mcp/_util.py (3)

4-4: Thanks for hoisting json to module scope.

This resolves the prior PLC0415 warning. LGTM.


77-99: File handling: use TypeError for type mismatches, specify UTF-8, narrow exceptions, chain errors, and deep-merge

This addresses Ruff TRY004/B904 and aligns merge semantics with deep_update. Also prefer is_file() to avoid directories. Wdyt?

-    if config_file is not None:
-        if isinstance(config_file, str):
-            config_file = Path(config_file)
-
-        if not isinstance(config_file, Path):
-            raise ValueError(
-                f"config_file must be a string or Path object, got: {type(config_file).__name__}"
-            )
-
-        if not config_file.exists():
-            raise FileNotFoundError(f"Configuration file not found: {config_file}")
-
-        try:
-            file_config = yaml.safe_load(config_file.read_text())
-            if not isinstance(file_config, dict):
-                raise ValueError(
-                    f"Configuration file must contain a valid JSON/YAML object, "
-                    f"got: {type(file_config).__name__}"
-                )
-            config_dict.update(file_config)
-        except Exception as e:
-            raise ValueError(f"Error reading configuration file {config_file}: {e}")
+    if config_file is not None:
+        config_path = Path(config_file) if isinstance(config_file, (str, Path)) else None
+        if config_path is None:
+            raise TypeError(f"config_file must be str or Path, got: {type(config_file).__name__}")
+        if not (config_path.exists() and config_path.is_file()):
+            raise FileNotFoundError(f"Configuration file not found: {config_path}")
+        try:
+            text = config_path.read_text(encoding="utf-8")
+            file_config = yaml.safe_load(text)
+            if not isinstance(file_config, dict):
+                raise TypeError(
+                    f"Configuration file must contain a JSON/YAML object, got: {type(file_config).__name__}"
+                )
+            deep_update(config_dict, file_config)
+        except (OSError, UnicodeDecodeError, yaml.YAMLError) as e:
+            raise ValueError(f"Error reading configuration file {config_path}") from e

100-116: Inline config: deep-merge dicts; TypeError for type issues; chain JSON errors

This fixes TRY004/B904 and unifies merge behavior with secrets. Wdyt?

-    if config is not None:
-        if isinstance(config, dict):
-            config_dict.update(config)
-        elif isinstance(config, str):
-            try:
-                parsed_config = json.loads(config)
-                if not isinstance(parsed_config, dict):
-                    raise ValueError(
-                        f"Parsed JSON config must be an object/dict, "
-                        f"got: {type(parsed_config).__name__}"
-                    )
-                config_dict.update(parsed_config)
-            except json.JSONDecodeError as e:
-                raise ValueError(f"Invalid JSON in config parameter: {e}")
-        else:
-            raise ValueError(f"Config must be a dict or JSON string, got: {type(config).__name__}")
+    if config is not None:
+        if isinstance(config, dict):
+            deep_update(config_dict, config)
+        elif isinstance(config, str):
+            try:
+                parsed_config = json.loads(config)
+            except json.JSONDecodeError as e:
+                raise ValueError("Invalid JSON in config parameter") from e
+            if not isinstance(parsed_config, dict):
+                raise TypeError(f"Parsed JSON config must be a dict, got: {type(parsed_config).__name__}")
+            deep_update(config_dict, parsed_config)
+        else:
+            raise TypeError(f"config must be dict or JSON string, got: {type(config).__name__}")
🧹 Nitpick comments (3)
airbyte/mcp/_util.py (3)

117-133: Hardcoded-secrets check should run after secrets merge to avoid false positives

If a hardcoded value is later overridden by config_secret_name, the current ordering still raises. Shall we move this block to after the secret merge, wdyt?

-    if config_dict and config_spec_jsonschema is not None:
-        hardcoded_secrets: list[list[str]] = detect_hardcoded_secrets(
-            config=config_dict,
-            spec_json_schema=config_spec_jsonschema,
-        )
-        if hardcoded_secrets:
-            error_msg = "Configuration contains hardcoded secrets in fields: "
-            error_msg += ", ".join(
-                [".".join(hardcoded_secret) for hardcoded_secret in hardcoded_secrets]
-            )
-
-            error_msg += (
-                "Please use environment variables instead. For example:\n"
-                "To set a secret via reference, set its value to "
-                "`secret_reference::ENV_VAR_NAME`.\n"
-            )
-            raise ValueError(error_msg)

Add this after merging secrets (just before return):

# Run after merging secrets so overrides can suppress false positives.
if config_dict and config_spec_jsonschema is not None:
    hardcoded = detect_hardcoded_secrets(
        config=config_dict,
        spec_json_schema=config_spec_jsonschema,
    )
    if hardcoded:
        fields = ", ".join(".".join(p) for p in hardcoded)
        raise ValueError(
            "Configuration contains hardcoded secrets in fields: "
            f"{fields}. Use secret_reference::ENV_VAR_NAME or environment variables."
        )

135-149: Secret parsing: narrow exceptions, chain errors, and prefer TypeError for non-dict

This improves diagnostics and consistency with the rest of the function. Wdyt?

-    if config_secret_name is not None:
-        # Assume this is a secret name that points to a JSON/YAML config.
-        secret_config = yaml.safe_load(str(get_secret(config_secret_name)))
-        if not isinstance(secret_config, dict):
-            raise ValueError(
-                f"Secret '{config_secret_name}' must contain a valid JSON or YAML object, "
-                f"but got: {type(secret_config).__name__}"
-            )
+    if config_secret_name is not None:
+        # Assume this is a secret name that points to a JSON/YAML config.
+        try:
+            secret_text = str(get_secret(config_secret_name))
+            secret_config = yaml.safe_load(secret_text)
+        except yaml.YAMLError as e:
+            raise ValueError(f"Error parsing secret '{config_secret_name}' as YAML/JSON") from e
+        if not isinstance(secret_config, dict):
+            raise TypeError(
+                f"Secret '{config_secret_name}' must contain a JSON/YAML object, "
+                f"got: {type(secret_config).__name__}"
+            )

53-66: Docstring: reflect TypeError cases and priority semantics

Since we’ll raise TypeError for type mismatches, and the merge priority is config_file < config < config_secret_name, shall we update the docstring “Raises” and “Args” sections accordingly, wdyt?

-    """Resolve a configuration dictionary, JSON string, or file path to a dictionary.
+    """Resolve a configuration dictionary, JSON string, or file path to a dictionary.
@@
-    Raises:
-        ValueError: If no configuration provided or if JSON parsing fails
+    Raises:
+        ValueError: If no configuration is provided, or file/JSON parsing fails.
+        TypeError: If inputs are of invalid types (e.g., non-dict JSON object).
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3bf53e4 and 03caa6d.

📒 Files selected for processing (2)
  • airbyte/mcp/_local_ops.py (12 hunks)
  • airbyte/mcp/_util.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/mcp/_local_ops.py
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (1)
airbyte/secrets/hydration.py (1)
  • detect_hardcoded_secrets (140-171)
🪛 GitHub Actions: Run Linters
airbyte/mcp/_util.py

[error] 47-47: Ruff: PLR0912 Too many branches (17 > 15).


[error] 92-92: Ruff: TRY004 Prefer TypeError exception for invalid type.


[error] 92-92: Ruff: TRY301 Abstract raise to an inner function.


[error] 98-98: Ruff: B904 Within an except clause, raise exceptions with raise ... from err or raise ... from None.


[error] 107-107: Ruff: TRY004 Prefer TypeError exception for invalid type.


[error] 113-113: Ruff: B904 Within an except clause, raise exceptions with raise ... from err or raise ... from None.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (2)
airbyte/mcp/_util.py (2)

71-75: Early guard is solid.

Clear error when no config source is provided. LGTM.


48-66: Please verify all resolve_config call-sites and update the docstring accordingly

I ran a scan for any direct Path(...) instances being passed into the config parameter but didn’t spot any matches—could you double-check that every call-site now routes file paths through config_file instead of config? In particular, take a look at these invocations in airbyte/mcp/_local_ops.py around lines 79, 150, 188, 232, 297, and 366.

Also, since the implementation now raises a TypeError when inputs are of the wrong kind, should we tweak the docstring’s “Raises” section to reflect that? For example:

 Raises:
     ValueError: If no configuration provided or if JSON parsing fails
+    TypeError: If the provided config or config_file is of an unexpected type

Quick linter thought: would you like to locally suppress PLR0912 on resolve_config to unblock CI while we refactor the branching structure? For example:

def resolve_config(  # noqa: PLR0912
    config: dict | str | None = None,
    config_file: str | Path | None = None,
    config_secret_name: str | None = None,
    config_spec_jsonschema: dict[str, Any] | None = None,
) -> dict[str, Any]:
    ...

Wdyt?

Copy link

github-actions bot commented Aug 27, 2025

PyTest Results (Full)

364 tests  ±0   348 ✅ ±0   20m 18s ⏱️ + 2m 13s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 00bbeb1. ± Comparison against base commit 56748f4.

♻️ This comment has been updated with latest results.

- Use TypeError instead of ValueError for type validation (TRY004)
- Add proper exception chaining with 'from err' (B904)
- Add noqa comment for function complexity (PLR0912)
- Abstract raise to inner function to reduce complexity (TRY301)
- Use object type instead of Any to avoid ANN401

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
airbyte/mcp/_util.py (4)

4-4: Import hoisted to module scope — lint issue resolved.

This addresses the prior PLC0415 note; thanks for moving it.


85-93: Use TypeError for invalid config_file type and tighten message.

Invalid-type guards typically raise TypeError; switch from ValueError and keep the message concise, wdyt?

-        if not isinstance(config_file, Path):
-            raise ValueError(
-                f"config_file must be a string or Path object, got: {type(config_file).__name__}"
-            )
+        if not isinstance(config_file, Path):
+            raise TypeError(f"config_file must be str or Path, got: {type(config_file).__name__}")

94-104: File load: specify UTF-8, catch narrow exceptions, and deep-merge.

  • Use read_text(encoding="utf-8") to avoid locale surprises.
  • Catch (OSError, UnicodeError, yaml.YAMLError) instead of bare Exception.
  • Prefer deep_update to avoid clobbering nested dicts, wdyt?
-        try:
-            file_config = yaml.safe_load(config_file.read_text())
+        try:
+            text = config_file.read_text(encoding="utf-8")
+            file_config = yaml.safe_load(text)
             if not isinstance(file_config, dict):
-                _raise_invalid_file_config_type(file_config)
-            config_dict.update(file_config)
-        except Exception as e:
-            raise ValueError(f"Error reading configuration file {config_file}: {e}") from e
+                _raise_invalid_obj_type(file_config, "Configuration file content")
+            deep_update(config_dict, file_config)
+        except (OSError, UnicodeError, yaml.YAMLError) as e:
+            raise ValueError(f"Error reading configuration file {config_file}") from e

105-121: Merge semantics and type errors for config.

  • Use deep_update for dict merges to preserve nested keys.
  • Raise TypeError for invalid top-level type.
  • Keep JSON error concise and chained, wdyt?
     if config is not None:
         if isinstance(config, dict):
-            config_dict.update(config)
+            deep_update(config_dict, config)
         elif isinstance(config, str):
             try:
                 parsed_config = json.loads(config)
                 if not isinstance(parsed_config, dict):
                     raise TypeError(
                         f"Parsed JSON config must be an object/dict, "
                         f"got: {type(parsed_config).__name__}"
                     )
-                config_dict.update(parsed_config)
+                deep_update(config_dict, parsed_config)
             except json.JSONDecodeError as e:
-                raise ValueError(f"Invalid JSON in config parameter: {e}") from e
+                raise ValueError("Invalid JSON in config parameter") from e
         else:
-            raise ValueError(f"Config must be a dict or JSON string, got: {type(config).__name__}")
+            raise TypeError(f"config must be dict or JSON string, got: {type(config).__name__}")
🧹 Nitpick comments (5)
airbyte/mcp/_util.py (5)

17-23: Broaden helper to cover non-file contexts or adjust message.

Since this raises TypeError for “file config” shape, the message reads oddly when reused conceptually for non-file contexts; would you generalize it to accept a context string (e.g., "file", "json") or tighten usage to file-only and keep other sites bespoke, wdyt?

-def _raise_invalid_file_config_type(file_config: object) -> None:
-    """Raise TypeError for invalid file config type."""
-    raise TypeError(
-        f"Configuration file must contain a valid JSON/YAML object, "
-        f"got: {type(file_config).__name__}"
-    )
+def _raise_invalid_obj_type(obj: object, where: str) -> None:
+    """Raise TypeError for invalid object type."""
+    raise TypeError(f"{where} must be a JSON/YAML object (dict), got: {type(obj).__name__}")

38-44: Comment mismatches behavior: no CWD .env load.

The comment states loading from CWD, but code only loads when AIRBYTE_MCP_ENV_FILE is set. Update the comment or add a CWD fallback (e.g., load ".env" if present), wdyt?

-    # Load the .env file from the current working directory.
+    # Load environment variables from a .env file if AIRBYTE_MCP_ENV_FILE is set.

55-75: Docstring: enumerate all raised exceptions.

resolve_config can raise FileNotFoundError and TypeError in addition to ValueError. Shall we document these for callers, wdyt?

-    Raises:
-        ValueError: If no configuration provided or if JSON parsing fails
+    Raises:
+        ValueError: If no configuration provided, file parse fails, or JSON parsing fails
+        FileNotFoundError: If the provided config_file path does not exist
+        TypeError: If provided types are invalid or parsed content is not a dict

Would you like me to add a brief “Exceptions” section to the README for MCP tool authors?


122-139: Consider checking for hardcoded secrets even without a schema.

detect_hardcoded_secrets falls back to a global mask when schema is None; would you run the check whenever config_dict is non-empty to keep protection consistent, wdyt?

-    if config_dict and config_spec_jsonschema is not None:
+    if config_dict:
         hardcoded_secrets: list[list[str]] = detect_hardcoded_secrets(
             config=config_dict,
             spec_json_schema=config_spec_jsonschema,
         )

55-71: Add focused tests for new inputs and precedence?

Shall we add unit tests covering: JSON string parsing (valid/invalid), deep merge behavior (file < config < secret), type guards (TypeError cases), and hardcoded-secret detection with/without schema, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 03caa6d and 75f4c94.

📒 Files selected for processing (1)
  • airbyte/mcp/_util.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (1)
airbyte/secrets/hydration.py (1)
  • detect_hardcoded_secrets (140-171)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/mcp/_util.py (1)

149-153: Secret merge priority LGTM.

deep_update for secrets at highest precedence matches the stated priority.

aaronsteers and others added 2 commits August 26, 2025 20:40
… docstring

- Remove _raise_invalid_file_config_type helper function and inline the logic
- Remove Args block from resolve_config function docstring
- Change behavior to return empty dict instead of raising error when no config provided
- Addresses comments from @aaronsteers on PR #762

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte/mcp/_util.py (1)

17-23: Inline single-use helper to reduce indirection?

_raise_invalid_file_config_type() is only used once; inlining trims a hop and keeps the error local. Wdyt?

- def _raise_invalid_file_config_type(file_config: object) -> None:
-     """Raise TypeError for invalid file config type."""
-     raise TypeError(
-         f"Configuration file must contain a valid JSON/YAML object, "
-         f"got: {type(file_config).__name__}"
-     )
+ # (Helper removed — inline where used)

@@
-            if not isinstance(file_config, dict):
-                _raise_invalid_file_config_type(file_config)
+            if not isinstance(file_config, dict):
+                raise TypeError(
+                    f"Configuration file must contain a JSON/YAML object, got: {type(file_config).__name__}"
+                )

Also applies to: 93-95

🧹 Nitpick comments (6)
airbyte/mcp/_util.py (6)

28-36: Use Path.is_file() for a tighter existence check

exists() allows directories; is_file() is stricter and improves diagnostics. Wdyt?

-    if not dotenv_path.exists():
-        raise FileNotFoundError(f".env file not found: {dotenv_path}")
+    if not dotenv_path.is_file():
+        raise FileNotFoundError(f".env file not found or not a file: {dotenv_path}")

38-44: Docstring/comment mismatch with behavior

This only loads a .env when AIRBYTE_MCP_ENV_FILE is set (not the CWD). Shall we align the docstring/comments (or add a fallback dotenv.load_dotenv()), wdyt?

-def initialize_secrets() -> None:
-    """Initialize dotenv to load environment variables from .env files."""
-    # Load the .env file from the current working directory.
+def initialize_secrets() -> None:
+    """Initialize dotenv from AIRBYTE_MCP_ENV_FILE (if set), then register secret managers."""
+    # Load the .env file from AIRBYTE_MCP_ENV_FILE if present.

55-60: Expand docstring to enumerate raised exceptions

Callers will benefit from the full contract (TypeError, FileNotFoundError, etc.). Add-on below, wdyt?

-"""Resolve a configuration dictionary, JSON string, or file path to a dictionary.
+"""Resolve configuration from dict, JSON string, or file path into a dict.
@@
-Returns:
-    Resolved configuration dictionary
+Returns:
+    Resolved configuration dictionary.
@@
-Raises:
-    ValueError: If no configuration provided or if JSON parsing fails
+Raises:
+    ValueError: If no configuration provided or JSON/file parsing fails.
+    FileNotFoundError: If config_file does not exist or is not a file.
+    TypeError: If inputs are wrong types or parsed content is not a dict.

Also applies to: 61-70


79-87: Type validation: prefer TypeError; simplify Path handling

Minor polish: validate type, then coerce once. Wdyt?

-    if config_file is not None:
-        if isinstance(config_file, str):
-            config_file = Path(config_file)
-
-        if not isinstance(config_file, Path):
-            raise ValueError(
-                f"config_file must be a string or Path object, got: {type(config_file).__name__}"
-            )
+    if config_file is not None:
+        if not isinstance(config_file, (str, Path)):
+            raise TypeError(f"config_file must be str or Path, got: {type(config_file).__name__}")
+        config_file = Path(config_file)

88-98: File load: specify UTF-8, narrow exceptions, deep-merge

  • Use encoding="utf-8".
  • Catch OSError/UnicodeDecodeError/yaml.YAMLError instead of a blanket Exception.
  • Prefer deep_update to avoid clobbering nested keys. Wdyt?
-        if not config_file.exists():
-            raise FileNotFoundError(f"Configuration file not found: {config_file}")
+        if not config_file.is_file():
+            raise FileNotFoundError(f"Configuration file not found or not a file: {config_file}")
@@
-        try:
-            file_config = yaml.safe_load(config_file.read_text())
+        try:
+            text = config_file.read_text(encoding="utf-8")
+            file_config = yaml.safe_load(text)
             if not isinstance(file_config, dict):
-                _raise_invalid_file_config_type(file_config)
-            config_dict.update(file_config)
-        except Exception as e:
-            raise ValueError(f"Error reading configuration file {config_file}: {e}") from e
+                raise TypeError(
+                    f"Configuration file must contain a JSON/YAML object, got: {type(file_config).__name__}"
+                )
+            deep_update(config_dict, file_config)
+        except (OSError, UnicodeDecodeError, yaml.YAMLError) as e:
+            raise ValueError(f"Error reading configuration file {config_file}") from e

99-115: Config merging: deep-merge dicts; chain JSON errors; TypeError for wrong types

This avoids nested key loss and tightens error types. Wdyt?

-    if config is not None:
-        if isinstance(config, dict):
-            config_dict.update(config)
-        elif isinstance(config, str):
-            try:
-                parsed_config = json.loads(config)
-                if not isinstance(parsed_config, dict):
-                    raise TypeError(
-                        f"Parsed JSON config must be an object/dict, "
-                        f"got: {type(parsed_config).__name__}"
-                    )
-                config_dict.update(parsed_config)
-            except json.JSONDecodeError as e:
-                raise ValueError(f"Invalid JSON in config parameter: {e}") from e
-        else:
-            raise ValueError(f"Config must be a dict or JSON string, got: {type(config).__name__}")
+    if config is not None:
+        if isinstance(config, dict):
+            deep_update(config_dict, config)
+        elif isinstance(config, str):
+            try:
+                parsed_config = json.loads(config)
+            except json.JSONDecodeError as e:
+                raise ValueError("Invalid JSON in config parameter") from e
+            if not isinstance(parsed_config, dict):
+                raise TypeError(f"Parsed JSON config must be a dict, got: {type(parsed_config).__name__}")
+            deep_update(config_dict, parsed_config)
+        else:
+            raise TypeError(f"config must be dict or JSON string, got: {type(config).__name__}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 75f4c94 and e8aad90.

📒 Files selected for processing (1)
  • airbyte/mcp/_util.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (1)
airbyte/secrets/hydration.py (1)
  • detect_hardcoded_secrets (140-171)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (4)
airbyte/mcp/_util.py (4)

4-4: Hoisting json import — LGTM

Thanks for moving this to module scope.


116-133: Hardcoded-secrets detection UX — LGTM

Clear guidance and field paths are helpful. Nice.


143-147: Priority order preserved via deep merge — LGTM

Merging secret config last honors config_file < config < secret. Good.


55-60: All resolve_config usages & signatures are up-to-date
I’ve checked the code in airbyte/mcp/_local_ops.py and confirmed that:

  • Every call to resolve_config now passes both config and config_file arguments.
  • None of the callers use a Path in the config parameter.
  • All MCP tool function signatures (validate_connector_config, list_source_streams, etc.) include the new config_file parameter.

Looks like there’s no signature or usage drift here—thanks for handling the update! Anything else you’d like me to double-check? wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte/mcp/_util.py (1)

80-85: Inline the single-use helper to reduce noise

This inner helper is only used once; inlining keeps things compact (and you’ll still satisfy TRY301 elsewhere). Shall we remove it, wdyt?

-def _raise_invalid_type(file_config: object) -> None:
-    raise TypeError(
-        f"Configuration file must contain a valid JSON/YAML object, "
-        f"got: {type(file_config).__name__}"
-    )
🧹 Nitpick comments (6)
airbyte/mcp/_util.py (6)

47-52: Signature/types: tighten config typing and docstring mention of config_file

Would you switch config: dict | str | None to dict[str, Any] | str | None for parity with the return type and update the docstring to explicitly mention config_file and its accepted types, wdyt?

-def resolve_config(  # noqa: PLR0912
-    config: dict | str | None = None,
-    config_file: str | Path | None = None,
+def resolve_config(  # noqa: PLR0912
+    config: dict[str, Any] | str | None = None,
+    config_file: str | Path | None = None,

53-60: Docstring mismatch with behavior and exceptions

The function returns {} when nothing is provided (Line 66) but the docstring says it raises on “no configuration provided”. Also, wrong-type errors below use ValueError in a few places. Could we align the docstring to current behavior and note TypeError for invalid types, wdyt?

-"""Resolve a configuration dictionary, JSON string, or file path to a dictionary.
+"""Resolve configuration from a dict, JSON string, and/or file path into a dictionary.
@@
-Returns:
-    Resolved configuration dictionary
+Returns:
+    Resolved configuration dictionary. Returns `{}` if no inputs are provided.
@@
-Raises:
-    ValueError: If no configuration provided or if JSON parsing fails
+Raises:
+    ValueError: If file read/parse fails or JSON parsing fails.
+    TypeError: If inputs are of invalid types (e.g., non-dict JSON content or wrong param types).

68-76: Type validation: prefer TypeError and simplify normalization

For wrong types, TypeError is more appropriate (TRY004). Also, Path(config_file) covers both str and Path. Shall we streamline like this, wdyt?

-if config_file is not None:
-    if isinstance(config_file, str):
-        config_file = Path(config_file)
-
-    if not isinstance(config_file, Path):
-        raise ValueError(
-            f"config_file must be a string or Path object, got: {type(config_file).__name__}"
-        )
+if config_file is not None:
+    try:
+        config_file = Path(config_file)  # accepts str or Path
+    except TypeError as e:
+        raise TypeError(f"config_file must be str or Path, got: {type(config_file).__name__}") from e

86-93: File load: specify UTF-8, catch specific exceptions, deep-merge

  • Use encoding="utf-8" for deterministic decoding.
  • Catch OSError|UnicodeError|yaml.YAMLError instead of a blanket Exception.
  • Prefer deep_update over shallow dict.update to avoid clobbering nested keys (aligns with secret merge semantics).

Would you apply the below, wdyt?

-try:
-    file_config = yaml.safe_load(config_file.read_text())
-    if not isinstance(file_config, dict):
-        _raise_invalid_type(file_config)
-    config_dict.update(file_config)
-except Exception as e:
-    raise ValueError(f"Error reading configuration file {config_file}: {e}") from e
+try:
+    text = config_file.read_text(encoding="utf-8")
+    file_config = yaml.safe_load(text)
+    if not isinstance(file_config, dict):
+        raise TypeError(f"Configuration file must contain a JSON/YAML object, got: {type(file_config).__name__}")
+    deep_update(config_dict, file_config)
+except (OSError, UnicodeError, yaml.YAMLError) as e:
+    raise ValueError(f"Error reading configuration file {config_file}: {e}") from e

94-110: Config merging: deep-merge dicts and adjust exception types/messages

  • Use deep_update when config is a dict or JSON-parses to a dict.
  • Raise TypeError for invalid types (TRY004).
  • Keep JSON error chained; message can be concise to avoid E501.

Shall we update as below, wdyt?

-if isinstance(config, dict):
-    config_dict.update(config)
+if isinstance(config, dict):
+    deep_update(config_dict, config)
 elif isinstance(config, str):
     try:
         parsed_config = json.loads(config)
         if not isinstance(parsed_config, dict):
-            raise TypeError(
-                f"Parsed JSON config must be an object/dict, "
-                f"got: {type(parsed_config).__name__}"
-            )
-        config_dict.update(parsed_config)
+            raise TypeError(f"Parsed JSON config must be a dict, got: {type(parsed_config).__name__}")
+        deep_update(config_dict, parsed_config)
     except json.JSONDecodeError as e:
-        raise ValueError(f"Invalid JSON in config parameter: {e}") from e
+        raise ValueError("Invalid JSON in config parameter") from e
 else:
-    raise ValueError(f"Config must be a dict or JSON string, got: {type(config).__name__}")
+    raise TypeError(f"config must be dict or JSON string, got: {type(config).__name__}")

122-127: Minor copy tweak in guidance (optional)

Would you add an example env var name to make this more concrete (e.g., secret_reference::MY_API_KEY), wdyt?

- "To set a secret via reference, set its value to "
- "`secret_reference::ENV_VAR_NAME`.\n"
+ "To set a secret via reference, set its value to "
+ "`secret_reference::ENV_VAR_NAME` (e.g., `secret_reference::MY_API_KEY`).\n"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e8aad90 and 587022c.

📒 Files selected for processing (1)
  • airbyte/mcp/_util.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/_util.py (1)
airbyte/secrets/hydration.py (1)
  • detect_hardcoded_secrets (140-171)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (6)
airbyte/mcp/_util.py (6)

4-4: Good fix: moved json import to module scope

This addresses prior Ruff PLC0415; nice cleanup.


65-67: Graceful empty-config behavior looks good

Early return {} on no inputs matches the PR intent for backward-safe defaults.


77-79: Existence check is fine

FileNotFoundError on missing path is clear and actionable.


111-121: Hardcoded secrets check: timing and message look good

Running after merge of file/config but before secrets gives the right signal to users; LGTM.


129-144: Secret merge via deep_update is correct

This preserves “secrets override everything” precedence. Nice.


47-110: No file-path config usages detected

I ran the ripgrep scans against the provided patterns and didn’t find:

  • Any calls to resolve_config passing a Path(...) or .yaml/.json string via its config= parameter
  • Any definitions or invocations of the old MCP ops (e.g. validate_connector_config, list_source_streams, etc.) still accepting only a single config argument

It looks like all consumers have migrated to the new config_file parameter (or pass a dict/JSON string to config). Let me know if you spot any positional calls or other edge cases we should verify, wdyt?

@aaronsteers aaronsteers merged commit 721f26c into main Aug 28, 2025
21 checks passed
@aaronsteers aaronsteers deleted the devin/1735261427-split-config-parameter branch August 28, 2025 02:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant