-
Notifications
You must be signed in to change notification settings - Fork 67
docs(mcp): improve MCP docs including auto-generated pdocs #806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@docs/mcp-getting-started' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@docs/mcp-getting-started' Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves the documentation for the PyAirbyte MCP (Model Context Protocol) server, including auto-generated pdocs support. The changes reorganize and enhance the documentation structure to provide clearer setup instructions and better developer guidance.
Key changes:
- Restructured MCP documentation with improved setup instructions and user guide
- Added comprehensive contributing guide section for MCP server development
- Reorganized module imports and added pdoc visibility annotations
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
docs/CONTRIBUTING.md | Added new section for MCP server development with testing instructions and configuration examples |
airbyte/mcp/server.py | Updated imports and added docstring annotations for better documentation generation |
airbyte/mcp/local_ops.py | Added pdoc visibility annotations and improved function documentation |
airbyte/mcp/connector_registry.py | Added pdoc visibility annotations for internal functions |
airbyte/mcp/cloud_ops.py | Added pdoc visibility annotations for internal registration function |
airbyte/mcp/init.py | Major restructuring of module documentation with detailed setup guide and improved organization |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <[email protected]>
Warning Rate limit exceeded@aaronsteers has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 12 minutes and 6 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (2)
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThe MCP package’s public API was expanded by exporting additional submodules and updating imports. Several docstrings were revised to mark internals, and CONTRIBUTING docs gained an MCP server contribution/testing section. local_ops introduced new internal (deferred) tools and renamed a public constant to a private one. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🧪 Early access (Sonnet 4.5): enabledWe are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience. Note:
Comment |
…hq/PyAirbyte into docs/mcp-getting-started
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/CONTRIBUTING.md (1)
68-133
: Well-structured documentation for MCP contributors!The new section provides clear, actionable guidance for testing and contributing to the MCP server. The examples are comprehensive and cover both local and cloud operations. The JSON config example is syntactically correct and the poetry-based workflow is well-explained.
A few thoughts to consider:
- The JSON config example uses absolute paths (
/path/to/repos/PyAirbyte
,/path/to/.mcp/airbyte_mcp.env
) – might be worth adding a note that users should replace these with their actual paths?- The hot-reload note (line 105) is valuable – it might be even more prominent if it appeared earlier in the section, perhaps right after the config example?
wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
airbyte/mcp/__init__.py
(4 hunks)airbyte/mcp/cloud_ops.py
(1 hunks)airbyte/mcp/connector_registry.py
(3 hunks)airbyte/mcp/local_ops.py
(6 hunks)airbyte/mcp/server.py
(1 hunks)docs/CONTRIBUTING.md
(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/server.py (5)
airbyte/mcp/cloud_ops.py (1)
register_cloud_ops_tools
(504-519)airbyte/mcp/connector_registry.py (1)
register_connector_registry_tools
(154-160)airbyte/mcp/local_ops.py (1)
register_local_ops_tools
(763-785)airbyte/_util/meta.py (1)
set_mcp_mode
(40-51)airbyte/mcp/_util.py (1)
initialize_secrets
(37-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (16)
airbyte/mcp/connector_registry.py (3)
104-104
: LGTM! Appropriate @Private marking for internal model.The
ConnectorInfo
class is correctly marked as private since it's an internal data structure used by the MCP tools. This will help keep the public API documentation focused on user-facing components.
113-113
: Good documentation of deferred registration pattern.The comment clearly indicates that tool registration happens later via the
register_connector_registry_tools
function, which helps maintainers understand why the decorator isn't applied directly. This is consistent with the pattern used throughout the MCP modules.
154-158
: Consistent @Private marking for internal registration function.The docstring update correctly marks this as an internal function. This is consistent with the pattern used in
cloud_ops.py
andlocal_ops.py
, helping maintain a clean public API surface by hiding implementation details of the tool registration lifecycle.airbyte/mcp/cloud_ops.py (1)
504-508
: Consistent @Private marking for internal registration function.The docstring update maintains consistency with the registration functions in
connector_registry.py
andlocal_ops.py
. This helps ensure that implementation details of the tool registration lifecycle remain internal while keeping the public API clean and focused on user-facing functionality.airbyte/mcp/server.py (3)
11-13
: LGTM! Import paths updated to reflect new public API structure.The import paths have been updated from the underscore-prefixed internal modules to the new public submodule structure. This aligns with the expanded MCP package API mentioned in the PR objectives and maintains backward compatibility since these imports are used internally for server initialization.
19-21
: Good addition of app object documentation.Adding a docstring for the module-level
app
variable improves code documentation and helps users understand the purpose of this FastMCP instance. Clear and concise!
27-34
: Enhanced main() documentation with appropriate @Private marking.The expanded docstring provides clear guidance about the function's purpose and usage, correctly marking it as private since it's invoked via CLI entry point rather than direct user calls. The note about consulting MCP client documentation is particularly helpful for users trying to understand how to connect to the server.
airbyte/mcp/local_ops.py (7)
28-28
: Good practice: marking internal constant as private.Renaming
CONFIG_HELP
to_CONFIG_HELP
follows Python convention for private module-level variables. This correctly signals that the constant is for internal use within the module (for tool descriptions) and shouldn't be relied upon by external code.
646-659
: Nice addition: cache introspection tool.The
list_cached_streams()
function provides a clean way to inspect what's in the default cache. The implementation correctly:
- Uses the
CachedDatasetInfo
model for structured returns- Properly closes the cache with
del cache
- Accounts for table prefixes and schema names
This will be helpful for users exploring their cached data. Have you tested this with various cache configurations (with/without table prefix, different schema names)? wdyt?
662-671
: Useful cache metadata tool.The
describe_default_cache()
function provides valuable metadata about cache configuration. The information returned (cache type, paths, and cached streams) will help users understand their cache setup, especially when working with SQL queries viarun_sql_query()
.
674-706
: Solid security validation for SQL queries.The
_is_safe_sql()
function implements good security practices:
- Disallows multi-statement queries (semicolon check)
- Whitelist approach for read-only operations
- Proper normalization for case-insensitive prefix matching
One thought: the semicolon check on line 693 scans the original query, which correctly catches semicolons in comments or strings. However, have you considered edge cases where a semicolon might appear in a string literal within a valid SELECT query? This might be overly restrictive, but it's also safer. wdyt about the trade-off?
709-758
: Well-implemented SQL query tool with good security.The
run_sql_query()
function is thoughtfully designed:
- Security validation before execution
- Clear error messages with context (query, traceback)
- Proper resource cleanup with
finally
block- Good defaults (max_records=1000)
- Helpful docstring with examples
The implementation correctly returns error information as a list with a single dict (matching the return type signature), which maintains consistency with the successful case.
763-767
: Consistent @Private marking for registration function.The docstring update maintains consistency with the registration functions in
cloud_ops.py
andconnector_registry.py
, correctly marking this internal function to keep it out of public API documentation.
768-785
: Proper registration of new tools with consistent description pattern.The registration logic correctly:
- Includes the three new tools (describe_default_cache, list_cached_streams, run_sql_query)
- Appends
_CONFIG_HELP
to tool descriptions for consistency- Keeps
list_connector_config_secrets
separate (without config help, which makes sense since it doesn't use connector config)Nice organization!
airbyte/mcp/__init__.py (2)
5-7
: Clear experimental status warning.The NOTE block appropriately sets expectations about API stability. This is helpful for users adopting the MCP integration.
139-139
: Expose all MCP submodules as intended:cloud_ops
,connector_registry
, andlocal_ops
each define public tool APIs (e.g.,deploy_source_to_cloud
,list_connectors
,validate_connector_config
), so including them alongsideserver
in__init__.py
correctly surfaces the full MCP API. wdyt?
Important
Auto-merge enabled.
This PR is set to merge automatically when all requirements are met.
Summary by CodeRabbit
New Features
Documentation
Style