A lightweight MCP server that searches and retrieves relevant documentation content from popular AI libraries like LangChain, LlamaIndex, and OpenAI using a combination of web search and content parsing.
This project allows Language Models to query and fetch up-to-date documentation content dynamically, acting as a bridge between LLMs and external doc sources.
The Model Context Protocol is an open standard that enables developers to build secure, two-way connections between their data sources and AI-powered tools. The architecture is straightforward: developers can either expose their data through MCP servers or build AI applications (MCP clients) that connect to these servers.
LLMs alone are limited — their true potential is unlocked when integrated with tools and services via frameworks like MCP.
- Without tools, LLMs are static and have limited utility.
- With tools, they become interactive, but orchestration can be messy.
- With MCP, LLMs gain a scalable, plug-and-play interface to real-world services, making them far more practical and powerful in production environments.
The MCP Server acts as the translator/interface between LLMs and services.
MCP (Model Context Protocol) standardizes how LLMs interact with external tools and services, promoting interoperability, modularity, and cleaner interfaces.
This structure decentralizes responsibility:
- Tool providers build and maintain their own MCP server implementations.
- LLMs just need to speak the MCP protocol.
Purpose and Vision:
- Standardize communication between LLMs and external tools
- Avoid bespoke integrations
- Encourage a scalable ecosystem of services (like a plugin architecture)
**Web Search Integration**
Uses the Serper API to query Google and retrieve the top documentation pages for a given search query.
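For reference, the search call might look roughly like the sketch below, using httpx (one of the project's dependencies) against Serper's Google Search endpoint. The function name, result count, and response handling here are illustrative, not the project's exact code:

```python
import os

import httpx

SERPER_URL = "https://google.serper.dev/search"

async def search_web(query: str) -> dict:
    """POST the query to Serper's Google Search endpoint and return the JSON response."""
    payload = {"q": query, "num": 2}
    headers = {
        "X-API-KEY": os.environ["SERPER_API_KEY"],
        "Content-Type": "application/json",
    }
    async with httpx.AsyncClient() as client:
        resp = await client.post(SERPER_URL, headers=headers, json=payload, timeout=30.0)
        resp.raise_for_status()
        return resp.json()  # search hits are listed under the "organic" key
```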
**Clean Content Extraction**
Parses HTML with BeautifulSoup to extract clean, human-readable text, stripping away unnecessary tags, ads, and navigation content.
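A minimal sketch of that extraction step, assuming a hypothetical `fetch_url` helper that downloads a page and reduces it to plain text:

```python
import httpx
from bs4 import BeautifulSoup

async def fetch_url(url: str) -> str:
    """Download a docs page and reduce it to readable text."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(url, timeout=30.0)
    soup = BeautifulSoup(resp.text, "html.parser")
    # Drop elements that carry no documentation content.
    for tag in soup(["script", "style", "nav", "header", "footer"]):
        tag.decompose()
    return soup.get_text(separator=" ", strip=True)
```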
**Seamless LLM Tooling**
Exposes a structured `get_docs` tool that LLM agents (e.g., Claude, GPT) can call to query specific libraries in real time.
`get_docs(query: str, library: str)`

This is the core tool provided by the MCP server. It accepts:
- `query`: The search term or phrase.
- `library`: One of `langchain`, `llama-index`, or `openai`.
The tool then:
- Searches for relevant documentation pages
- Fetches and parses clean text content
- Sends the result back to the LLM for further reasoning and responses
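A minimal sketch of how such a tool can be wired up with the MCP Python SDK's `FastMCP` helper. The `docs_urls` values and the `search_web`/`fetch_url` helpers (sketched above) are illustrative assumptions, not the project's exact implementation:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("documentation")

# Hypothetical mapping of supported libraries to their docs sites.
docs_urls = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
}

@mcp.tool()
async def get_docs(query: str, library: str) -> str:
    """Search the given library's docs for the query and return clean page text."""
    if library not in docs_urls:
        raise ValueError(f"Library not supported; choose one of {list(docs_urls)}")
    # Scope the Google search to the library's documentation domain.
    results = await search_web(f"site:{docs_urls[library]} {query}")
    if not results.get("organic"):
        return "No results found"
    # Concatenate the parsed text of each hit for the LLM to reason over.
    return "\n\n".join([await fetch_url(r["link"]) for r in results["organic"]])

if __name__ == "__main__":
    mcp.run(transport="stdio")
```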
- Clone the repository

```bash
git clone https://github.com/your-username/mcp-docs-search.git
cd mcp-docs-search
```
- Create a virtual environment using uv and activate it

```bash
uv venv .venv
.\.venv\Scripts\activate   # on macOS/Linux: source .venv/bin/activate
```
- Install dependencies

```bash
uv add "mcp[cli]" httpx
uv pip install beautifulsoup4
```
- Set your environment variables: create a `.env` file and add your Serper API key:

```
SERPER_API_KEY=your_serper_api_key
```
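The server needs to read this key at startup. A minimal sketch, assuming python-dotenv is used to load the `.env` file (it is not in the dependency list above, so add it with `uv add python-dotenv` if you follow this approach; exporting the variable in your shell works too):

```python
import os

from dotenv import load_dotenv  # assumed dependency for loading .env files

load_dotenv()  # reads .env from the working directory

SERPER_API_KEY = os.getenv("SERPER_API_KEY")
if not SERPER_API_KEY:
    raise RuntimeError("SERPER_API_KEY is not set; add it to .env")
```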
To integrate this server as a tool within Claude Desktop:
Open Claude Desktop → File > Settings > Developer > Edit Config.
Update your claude_desktop_config.json to include the following:
```json
{
  "mcpServers": {
    "documentation": {
      "command": "uv",
      "args": [
        "--directory",
        "path_to_the_directory_where_the_repo_exists",
        "run",
        "main.py"
      ]
    }
  }
}
```
🔁 Important: Restart Claude Desktop after saving the config to load the new tool.
Once integrated successfully, you'll see your custom MCP tool appear within the Claude UI:
Use it to query docs in real time:
You can also debug the tool using the MCP Inspector (requires Node.js 18+):

```bash
npx @modelcontextprotocol/inspector uv run main.py
```

Then open the local port where the Inspector connection is set up.
More libraries can easily be added by updating the `docs_urls` dictionary.
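For instance, assuming a `docs_urls` mapping like the one sketched earlier (the exact URLs are illustrative), adding a library is a one-line change:

```python
docs_urls = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
    # A new entry only needs a name and a docs domain to scope the search:
    "huggingface": "huggingface.co/docs",
}
```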
- Add support for additional libraries like HuggingFace, PyTorch, TensorFlow, etc.
- Implement caching to reduce redundant fetches and improve performance.
- Introduce a scoring/ranking mechanism based on relevance or token quality.
- Add unit tests and better exception handling for production readiness.