Feature/deep wiki mcp #12
Pull request overview
This PR adds DeepWiki MCP integration as the primary source for GitHub repository documentation, with automatic fallback to the repocards library for robustness. It introduces async/await patterns for the repository info tool, updates dependencies, and adds utility functions for URL normalization and text clipping to manage token budgets.
Key changes:
- Integrated DeepWiki MCP server with 60-second timeout and automatic fallback to repocards
- Made `tool_repo_summary` async and added source tracking in responses
- Added URL parsing utilities and a text clipping function to manage LLM token limits
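The timeout-plus-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `fetch_from_deepwiki`, `fetch_from_repocards`, and `repo_summary` are hypothetical stand-ins, and only the 60-second timeout and the `"deepwiki"` / `"repocards"` source labels come from the PR text.

```python
import asyncio

DEEPWIKI_TIMEOUT_S = 60  # the PR states a 60-second timeout


async def fetch_from_deepwiki(repo_url: str) -> str:
    """Hypothetical stand-in for the DeepWiki MCP call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"DeepWiki docs for {repo_url}"


def fetch_from_repocards(repo_url: str) -> str:
    """Hypothetical stand-in for the synchronous repocards fallback."""
    return f"repocards summary for {repo_url}"


async def repo_summary(repo_url, primary=fetch_from_deepwiki,
                       timeout=DEEPWIKI_TIMEOUT_S) -> dict:
    """Try the primary source first; on timeout or error, use the fallback."""
    try:
        text = await asyncio.wait_for(primary(repo_url), timeout=timeout)
        return {"source": "deepwiki", "summary": text}
    except Exception:
        # Timeouts and connection errors both end up here.
        # Run the blocking fallback in a worker thread so the event loop stays free.
        text = await asyncio.to_thread(fetch_from_repocards, repo_url)
        return {"source": "repocards", "summary": text}
```

Returning the `source` label alongside the summary is what lets the caller log which backend actually answered.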
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/ai_agent/agent/tools/deepwiki_tool.py | New module implementing DeepWiki MCP integration with timeout handling and fallback logic |
| src/ai_agent/agent/tools/repo_info_tool.py | Refactored to use async DeepWiki lookup with repocards fallback, simplified from previous GitHub API implementation |
| src/ai_agent/agent/tools/utils.py | Added _clip utility function to truncate long text for token budget management |
| src/ai_agent/agent/utils.py | Added GitHub URL parsing and normalization functions (_coerce_owner_repo_ref, coerce_github_url_or_none) |
| src/ai_agent/agent/agent.py | Updated to use async repo_info tool, increased tool call limit from 3 to 6, added source tracking to tool calls |
| pyproject.toml | Added pydantic-ai[mcp] and repocards dependencies |
| CHANGELOG.md | Documented DeepWiki integration, fallback mechanism, and schema changes |
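To illustrate the kind of GitHub URL normalization the table attributes to `src/ai_agent/agent/utils.py`, here is a rough sketch. The function name `normalize_github_url` and both regular expressions are assumptions made for this example; the real `_coerce_owner_repo_ref` and `coerce_github_url_or_none` may accept different inputs and behave differently.

```python
import re
from typing import Optional

# Assumed patterns: a full GitHub URL (scheme optional) or an "owner/repo" shorthand.
_GITHUB_URL_RE = re.compile(
    r"^(?:https?://)?(?:www\.)?github\.com/"
    r"(?P<owner>[A-Za-z0-9-]+)/(?P<repo>[A-Za-z0-9._-]+?)"
    r"(?:\.git)?(?:[/?#].*)?$",
    re.IGNORECASE,
)
_SHORTHAND_RE = re.compile(
    r"^(?P<owner>[A-Za-z0-9-]+)/(?P<repo>[A-Za-z0-9._-]+?)(?:\.git)?$"
)


def normalize_github_url(value: str) -> Optional[str]:
    """Return a canonical https://github.com/owner/repo URL, or None."""
    value = value.strip()
    match = _GITHUB_URL_RE.match(value) or _SHORTHAND_RE.match(value)
    if match is None:
        return None  # not recognisable as a GitHub repository (e.g. a GitLab URL)
    return f"https://github.com/{match['owner']}/{match['repo']}"
```

Under these assumptions a GitLab URL simply yields `None`, which is the case the thread below discusses.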
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: qchapp <74377782+qchapp@users.noreply.github.com>
[WIP] Update deep wiki MCP implementation based on feedback
…ging-plaza-ai-agent into feature/deep-wiki-mcp
This looks clean now but I'm waiting for @rmfranken to confirm before merging to
Hi Quentin, I see you have fully embraced the AI workflow! Looks pretty cool. That being said, and perhaps especially with the above in mind, have you written some tests for this code with expected in/output? Or, at the least, have you run some tests manually? I've gone through the code and it looks good to me. I am wondering if we are still doing GitLab support now, especially looking at the regex section in the utils.py. To be honest, I have not tried using a GitLab repo with our tool in a while... @caviri is that still planned to be supported, or are we just abandoning GitLab altogether? Or does this code fall back to OG GitLab-supported gimie if the DeepWiki stuff returns a no-can-do? edit: I just realized this is not for our git-meta-extractor repo lol. But the question soooort of stands: as DeepWiki does not support GitLab repos, does that mean the recommendation tool is going to be heavily biased to GitHub tools because there is more information available for those?
Indeed, for now we don't have any information for GitLab repos. Maybe we should try using this: https://github.com/AsyncFuncAI/deepwiki-open. I don't know what you guys think? @rmfranken @caviri For the tests part, I did some manual tests checking if DeepWiki returned a response but I can try to use
I counted only 5 software tools out of 73 that are not hosted on GitHub. What I can try is to make … Otherwise I'm writing some unit tests, but there isn't much to learn from them since for now we only support GitHub URLs.
So now
After discussing with Carlos last Tuesday, we don't think that tests are really necessary for now, although I wrote some basic tests that all passed (I will commit them along with the PR in a second). The issue with GitLab is not a big deal as there is almost no GitLab software. I'll let you review @caviri before we merge the PR with the
Thanks @qchapp I'll review it now |
This pull request introduces DeepWiki MCP integration as the primary source for GitHub repository documentation in the repository info tool, with automatic fallback to the repocards library for robustness. It also updates dependencies, enhances observability, and improves code structure for repo info handling and string clipping.
DeepWiki Integration and Fallback:
- Added `deepwiki_tool.py`, which uses the DeepWiki MCP server (https://mcp.deepwiki.com/sse) to fetch repository documentation, with a 60-second timeout and automatic fallback to `repocards` if DeepWiki is unavailable. This provides fast, pre-indexed documentation access without API rate limits.
- Responses now record the data source (`source` field: "deepwiki" or "repocards") for observability and debugging.
Dependency and Schema Updates:
- Updated the `pydantic-ai` dependency to include MCP support (`pydantic-ai[mcp]`) and added `repocards` as a dependency.
- Updated the `RepoSummaryOutput` schema to include a `source` field indicating the data source.
Agent and Tool Improvements:
- The agent now uses `await` for async DeepWiki calls and logs tool call details more thoroughly, including the data source.
Utility Enhancements:
- Added a `_clip` utility function to safely truncate long documentation strings, ensuring LLM token usage stays within budget.
Changelog:
- Updated `CHANGELOG.md` to document the DeepWiki integration, fallback logic, dependency updates, and schema changes.
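As a rough idea of what a clipping helper like `_clip` might look like, here is a minimal sketch. The name `clip`, the default limit, and the truncation marker are all assumptions for illustration; the sketch also counts characters rather than tokens, which is only a coarse proxy for an LLM token budget.

```python
def clip(text: str, max_chars: int = 4000, marker: str = " ...[truncated]") -> str:
    """Truncate text to at most max_chars characters, marking the cut.

    Hypothetical helper: the limit and marker are illustrative defaults.
    """
    if len(text) <= max_chars:
        return text  # already within budget
    if max_chars <= len(marker):
        return text[:max_chars]  # no room for the marker, so do a hard cut
    # Reserve space for the marker so the result never exceeds max_chars.
    return text[: max_chars - len(marker)].rstrip() + marker
```

Appending a visible marker lets the model (and anyone reading the logs) tell that the documentation was cut rather than naturally short.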