Feature/deep wiki mcp #12

Merged
qchapp merged 15 commits into develop from feature/deep-wiki-mcp on Jan 22, 2026

Conversation

@qchapp
Member

@qchapp qchapp commented Dec 9, 2025

This pull request introduces DeepWiki MCP integration as the primary source for GitHub repository documentation in the repository info tool, with automatic fallback to repocards when DeepWiki is unavailable. It also updates dependencies, improves observability, and restructures the code for repo info handling and string clipping.

DeepWiki Integration and Fallback:

  • Added a new deepwiki_tool.py that uses the DeepWiki MCP server (https://mcp.deepwiki.com/sse) to fetch repository documentation, with a 60-second timeout and automatic fallback to repocards if DeepWiki is unavailable. This provides fast, pre-indexed documentation access without API rate limits.
  • The repo info tool now logs and returns the data source (source field: "deepwiki" or "repocards") for observability and debugging.
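
The timeout-and-fallback flow described above could be sketched roughly as follows. This is a minimal illustration, not the code in deepwiki_tool.py: `fetch_repo_docs` and its `primary` and `fallback` parameters are hypothetical names, and the two sources are passed in as plain async callables so that no MCP or repocards API has to be guessed at.

```python
import asyncio
from typing import Awaitable, Callable, Dict

DEEPWIKI_TIMEOUT_S = 60.0  # the 60-second timeout stated in the PR description

# Hypothetical type for an async documentation fetcher: URL in, text out.
Fetcher = Callable[[str], Awaitable[str]]


async def fetch_repo_docs(
    repo_url: str,
    primary: Fetcher,
    fallback: Fetcher,
    timeout_s: float = DEEPWIKI_TIMEOUT_S,
) -> Dict[str, str]:
    """Try the primary source first; use the fallback on timeout, error, or empty result."""
    try:
        # wait_for cancels the primary call if it exceeds the timeout.
        text = await asyncio.wait_for(primary(repo_url), timeout=timeout_s)
        if text:
            return {"source": "deepwiki", "text": text}
    except Exception:
        # Any failure of the primary source (including a timeout) triggers
        # the fallback. Task cancellation is not an Exception subclass,
        # so it still propagates.
        pass
    text = await fallback(repo_url)
    return {"source": "repocards", "text": text}
```

Returning the `source` value next to the text is what lets the caller log which path served a given request.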

Dependency and Schema Updates:

  • Updated the pydantic-ai dependency to include MCP support (pydantic-ai[mcp]) and added repocards as a dependency.
  • Enhanced the RepoSummaryOutput schema to include a source field indicating the data source.
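
As an illustration of a source field restricted to the two values named in this PR, here is a small sketch. The PR does not show the definition of RepoSummaryOutput, so `RepoSummarySketch` and its `url` and `summary` fields are hypothetical, and a standard-library dataclass is used only to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The two data sources named in the PR description.
DataSource = Literal["deepwiki", "repocards"]


@dataclass(frozen=True)
class RepoSummarySketch:
    """Hypothetical stand-in for RepoSummaryOutput; only `source` comes from the PR."""

    url: str  # assumed field
    summary: str  # assumed field
    source: DataSource

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime, so check the value explicitly.
        if self.source not in get_args(DataSource):
            raise ValueError(f"unknown source: {self.source!r}")
```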

Agent and Tool Improvements:

  • Increased the maximum allowed repo info tool calls per run from 3 to 6 to support more robust querying. This is not the final number; it was raised mainly for testing purposes.
  • Refactored repo info tool code to use await for async DeepWiki calls and improved logging of tool call details, including the data source.
  • Improved GitHub repository URL parsing and normalization logic for better robustness and edge case handling.
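
To make the URL normalization idea concrete, here is a rough sketch of one way to reduce common GitHub URL variants to a canonical form. `normalize_github_url` is a hypothetical helper, not the PR's `coerce_github_url_or_none`, whose signature and rules are not shown here; the accepted variants are assumptions chosen for illustration.

```python
import re
from typing import Optional

# Accepts optional scheme and "www.", then owner/repo, an optional ".git"
# suffix, and any trailing path, query, or fragment.
_GITHUB_RE = re.compile(
    r"^(?:https?://)?(?:www\.)?github\.com[/:]"
    r"(?P<owner>[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)/"
    r"(?P<repo>[A-Za-z0-9._-]+?)"
    r"(?:\.git)?(?:[/?#].*)?$",
    re.IGNORECASE,
)


def normalize_github_url(raw: str) -> Optional[str]:
    """Return https://github.com/<owner>/<repo>, or None for non-GitHub input."""
    candidate = raw.strip()
    # Treat SSH remotes such as git@github.com:owner/repo.git like the rest.
    if candidate.startswith("git@"):
        candidate = candidate[len("git@"):]
    match = _GITHUB_RE.match(candidate)
    if match is None:
        return None
    return f"https://github.com/{match.group('owner')}/{match.group('repo')}"
```

Returning None for anything that is not a GitHub repository URL (a GitLab URL, for instance) lets the caller decide whether to skip DeepWiki and go straight to a fallback.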

Utility Enhancements:

  • Added a _clip utility function to safely truncate long documentation strings, ensuring LLM token usage stays within budget.
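
A clipping helper of this kind might look like the sketch below. It is an assumption-laden illustration rather than the PR's `_clip`: the name `clip_text`, the default limit, and the truncation marker are all hypothetical, and it counts characters, which only approximates a token budget.

```python
from typing import Optional


def clip_text(
    text: Optional[str],
    max_chars: int = 4000,  # assumed default, not taken from the PR
    marker: str = " [truncated]",  # assumed marker, not taken from the PR
) -> str:
    """Truncate text so the result, marker included, never exceeds max_chars."""
    if not text:
        return ""
    if len(text) <= max_chars:
        return text
    if max_chars <= len(marker):
        # No room for the marker: hard-cut instead.
        return text[: max(max_chars, 0)]
    keep = max_chars - len(marker)
    return text[:keep].rstrip() + marker
```

Reserving space for the marker keeps the output within the limit while still signalling to the model that the documentation was cut.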

Changelog:

  • Updated CHANGELOG.md to document the DeepWiki integration, fallback logic, dependency updates, and schema changes.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds DeepWiki MCP integration as the primary source for GitHub repository documentation, with automatic fallback to the repocards library for robustness. It introduces async/await patterns for the repository info tool, updates dependencies, and adds utility functions for URL normalization and text clipping to manage token budgets.

Key changes:

  • Integrated DeepWiki MCP server with 60-second timeout and automatic fallback to repocards
  • Made tool_repo_summary async and added source tracking in responses
  • Added URL parsing utilities and text clipping function to manage LLM token limits

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/ai_agent/agent/tools/deepwiki_tool.py | New module implementing DeepWiki MCP integration with timeout handling and fallback logic |
| src/ai_agent/agent/tools/repo_info_tool.py | Refactored to use async DeepWiki lookup with repocards fallback, simplified from previous GitHub API implementation |
| src/ai_agent/agent/tools/utils.py | Added _clip utility function to truncate long text for token budget management |
| src/ai_agent/agent/utils.py | Added GitHub URL parsing and normalization functions (_coerce_owner_repo_ref, coerce_github_url_or_none) |
| src/ai_agent/agent/agent.py | Updated to use async repo_info tool, increased tool call limit from 3 to 6, added source tracking to tool calls |
| pyproject.toml | Added pydantic-ai[mcp] and repocards dependencies |
| CHANGELOG.md | Documented DeepWiki integration, fallback mechanism, and schema changes |

Comment thread src/ai_agent/agent/tools/deepwiki_tool.py Outdated
Comment thread src/ai_agent/agent/tools/deepwiki_tool.py
Comment thread src/ai_agent/agent/utils.py Outdated
Comment thread CHANGELOG.md Outdated
Comment thread src/ai_agent/agent/tools/utils.py Outdated
Comment thread src/ai_agent/agent/tools/utils.py Outdated
Comment thread src/ai_agent/agent/utils.py Outdated
qchapp and others added 3 commits December 9, 2025 17:29
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI commented Dec 9, 2025

@qchapp I've opened a new pull request, #13, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits December 9, 2025 16:35
Co-authored-by: qchapp <74377782+qchapp@users.noreply.github.com>
[WIP] Update deep wiki MCP implementation based on feedback
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment thread src/ai_agent/agent/tools/repo_info_tool.py Outdated
Comment thread src/ai_agent/agent/tools/repo_info_tool.py
Comment thread src/ai_agent/agent/tools/deepwiki_tool.py
Comment thread src/ai_agent/agent/tools/utils.py Outdated
qchapp and others added 4 commits December 9, 2025 17:47
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@qchapp
Member Author

qchapp commented Dec 9, 2025

This looks clean now but I'm waiting for @rmfranken to confirm before merging to develop.

@rmfranken

rmfranken commented Dec 10, 2025

Hi Quentin, I see you have fully embraced the AI workflow! Looks pretty cool.

That being said, and perhaps especially with the above in mind, have you written some tests for this code with expected in/output? Or, at the least, have you run some tests manually?

I've gone through the code and it looks good to me, I am wondering if we are still doing gitlab support now, especially looking at the regex section in the utils.py - to be honest I have not tried using a gitlab repo with our tool in a while... @caviri is that still planned to be supported, or are we just abandoning gitlab altogether? Or does this code fallback to OG gitlab-supported gimie if the deepwiki stuff returns a no-can-do?

edit: I just realized this is not for our git-meta-extractor repo lol. But the question soooort of stands - as deepwiki does not support gitlab repos, does that mean the recommendation tool is going to be heavily biased to github tools because there is more information available for those?

@rmfranken rmfranken left a comment

See comments above:

  • Tests!

@qchapp
Member Author

qchapp commented Dec 11, 2025

> as deepwiki does not support gitlab repos, does that mean the recommendation tool is going to be heavily biased to github tools because there is more information available for those?

Indeed, for now we don't have any information for GitLab repos. Maybe we should try using this: https://github.com/AsyncFuncAI/deepwiki-open. I don't know what you guys think. @rmfranken @caviri

For the tests part, I did some manual tests checking whether DeepWiki returned a response, but I can try to use pytest to cover the different cases (DeepWiki, fallback to repocards, GitLab repo, and non-GitHub/non-GitLab URL). Thanks for the review.

@qchapp
Member Author

qchapp commented Dec 12, 2025

I counted only 5 software packages out of 73 that are not hosted on GitHub. I can try to make repocards usable for GitLab as well, but the information it returns is not very relevant compared to DeepWiki. I think the best solution would be to run our own instance of https://github.com/AsyncFuncAI/deepwiki-open and query it, but I don't know if that is really necessary for only roughly 7% of the software.

Otherwise, I'm writing some unit tests, but there isn't much to learn from them since for now we only support GitHub URLs.

@qchapp
Member Author

qchapp commented Dec 22, 2025

So now repocards supports GitLab as well! But again, the information retrieved is still far from what DeepWiki provides, which could indeed introduce a bias.

@qchapp qchapp requested a review from caviri January 13, 2026 13:49
@qchapp
Member Author

qchapp commented Jan 22, 2026

After discussing with Carlos last Tuesday, we don't think that tests are really necessary for now, although I wrote some basic tests that all passed (I will commit them along with the PR in a second). The GitLab issue is not a big deal, as there is almost no GitLab software. I'll let you review, @caviri, before we merge the PR into the develop branch. Thanks!

@caviri
Member

caviri commented Jan 22, 2026

Thanks @qchapp, I'll review it now.

Member

@caviri caviri left a comment

Hi @qchapp, all clear. Thanks, you can proceed with the merge.

@qchapp qchapp merged commit dbd3bb7 into develop Jan 22, 2026
2 checks passed
@qchapp qchapp deleted the feature/deep-wiki-mcp branch January 22, 2026 17:05
5 participants