Version 3.0.0 - Enhanced research synthesis with intelligent source quality assessment and deduplication.
An advanced Model Context Protocol (MCP) server that provides comprehensive Google search capabilities, webpage content extraction, and AI-powered research synthesis. Built for Claude Code, Claude Desktop, and other MCP-compatible clients.
This MCP server turns Google search into a research tool with:
- Intelligent Source Ranking - Automatically scores sources by authority, recency, and credibility
- Deduplication - Removes duplicate URLs and similar content across search results
- Agent-Based Synthesis - Leverages your existing Claude session to synthesize research findings
- Focus Area Analysis - Provides dedicated analysis for specific aspects of your research topic
- Quality Metrics - Tracks source diversity, authority, and content freshness
- Node.js 18 or higher
- Google Cloud Platform account with Custom Search API enabled
- Google Custom Search Engine ID
# Clone the repository
git clone https://github.com/mixelpixx/Google-Search-MCP-Server
cd Google-Search-MCP-Server
# Install dependencies
npm install
# Build the project
npm run build
Create a .env file in the project root:
GOOGLE_API_KEY=your_google_api_key
GOOGLE_SEARCH_ENGINE_ID=your_custom_search_engine_id
Note: No Anthropic API key is required. The server uses agent-based synthesis that leverages your existing Claude session.
# Start v3 server (recommended)
npm run start:v3
# For HTTP mode
npm run start:v3:http
Expected output:
============================================================
Google Research MCP Server v3.0.0 (Enhanced)
============================================================
✓ Source quality assessment
✓ Deduplication
✓ AI synthesis: AGENT MODE (Claude will launch agents)
└─ No API key needed - uses your existing Claude session
✓ Focus area analysis
✓ Enhanced error handling
✓ Cache metadata
============================================================
Server running on STDIO
- Full-text search with quality scoring
- Domain filtering and date restrictions
- Result categorization (academic, official docs, news, forums, etc.)
- Automatic deduplication of results
- Source authority ranking
- Clean content extraction from web pages
- Multiple output formats (Markdown, HTML, plain text)
- Configurable preview lengths
- Batch extraction support (up to 5 URLs)
- Automatic content summarization
- Agent-based research analysis
- Comprehensive source synthesis
- Focus area breakdowns
- Contradiction detection
- Actionable recommendations
- Quality metrics reporting
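As a rough illustration of the deduplication listed among the search features, the sketch below normalizes URLs before comparing them. The `SearchResult` shape, the `normalizeUrl` and `dedupeByUrl` names, and the list of tracking parameters are assumptions made for this example, not the server's actual implementation.

```typescript
// Hypothetical result shape; the real server's types may differ.
interface SearchResult {
  url: string;
  title: string;
}

// Query parameters treated as tracking noise (an assumed list).
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "ref"];

// Reduce a URL to a canonical key so trivially different links compare equal.
// The scheme is dropped, so http and https variants are treated as the same page.
function normalizeUrl(raw: string): string {
  const u = new URL(raw);
  for (const p of TRACKING_PARAMS) u.searchParams.delete(p);
  u.searchParams.sort();
  const host = u.hostname.toLowerCase().replace(/^www\./, "");
  const path = u.pathname.replace(/\/+$/, "") || "/";
  const query = u.searchParams.toString();
  return `${host}${path}${query ? "?" + query : ""}`;
}

// Keep the first result seen for each normalized URL and count what was dropped.
function dedupeByUrl(results: SearchResult[]): { unique: SearchResult[]; removed: number } {
  const seen = new Set<string>();
  const unique: SearchResult[] = [];
  for (const r of results) {
    const key = normalizeUrl(r.url);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(r);
  }
  return { unique, removed: results.length - unique.length };
}
```

This only covers duplicate URLs. Detecting similar content across different URLs would need a separate text-similarity check.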
| Depth | Sources | Analysis | Use Case |
|---|---|---|---|
| basic | 3 | Quick overview, 3-5 findings | Fast comparisons, initial research |
| intermediate | 5 | Comprehensive analysis, 5-7 findings | Standard research tasks |
| advanced | 8-10 | In-depth analysis, 7-10 findings, contradictions | Decision-making, comprehensive reviews |
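One way to read the depth table is as a set of presets. The sketch below encodes it that way. The `DepthPreset` shape and `resolveSourceCount` helper are hypothetical, and picking 8 as the default for the "8-10" advanced range is an assumption.

```typescript
type Depth = "basic" | "intermediate" | "advanced";

// Hypothetical preset shape derived from the table columns.
interface DepthPreset {
  defaultSources: number;
  minFindings: number;
  maxFindings: number;
}

// Values copied from the table; "advanced" lists 8-10 sources, 8 is assumed here.
const DEPTH_PRESETS: Record<Depth, DepthPreset> = {
  basic: { defaultSources: 3, minFindings: 3, maxFindings: 5 },
  intermediate: { defaultSources: 5, minFindings: 5, maxFindings: 7 },
  advanced: { defaultSources: 8, minFindings: 7, maxFindings: 10 },
};

// An explicit num_sources wins; otherwise fall back to the preset for the depth.
function resolveSourceCount(depth: Depth = "intermediate", numSources?: number): number {
  return numSources ?? DEPTH_PRESETS[depth].defaultSources;
}
```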
research_topic({
topic: "WebAssembly performance optimization",
depth: "basic"
})
Returns:
- 3 high-quality sources
- Brief overview (2-3 paragraphs)
- 3-5 key findings
- Quality metrics
research_topic({
topic: "Kubernetes security",
depth: "advanced",
focus_areas: ["RBAC", "network policies", "pod security"],
num_sources: 8
})
Returns:
- 8 authoritative sources
- In-depth executive summary
- 7-10 detailed findings
- Common themes across sources
- Dedicated analysis for each focus area
- Contradictions between sources
- Actionable recommendations
- Comprehensive quality metrics
google_search({
query: "docker container security best practices",
num_results: 10,
dateRestrict: "y1", // Last year only
site: "github.com" // Limit to GitHub
})
Returns:
- Quality-scored results
- Duplicate removal report
- Source type classification
- Authority ratings
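The dateRestrict value in the example follows the Google Custom Search pattern of a unit letter (d, w, m, or y) followed by a count. A client could validate such values before sending a request, roughly as sketched here. `parseDateRestrict` is a hypothetical helper, not part of this server's API.

```typescript
type DateUnit = "d" | "w" | "m" | "y";

interface DateRestrict {
  unit: DateUnit;
  count: number;
}

// Accepts values such as "y1" or "m6"; returns null for anything else.
function parseDateRestrict(value: string): DateRestrict | null {
  const match = /^([dwmy])([1-9]\d*)$/.exec(value);
  if (!match) return null;
  return { unit: match[1] as DateUnit, count: Number(match[2]) };
}
```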
extract_webpage_content({
url: "https://kubernetes.io/docs/concepts/security/",
format: "markdown",
max_length: 5000,
preview_length: 300
})
Returns:
- Clean extracted content
- Metadata (title, description, author)
- Word count and statistics
- Configurable preview
- Cache information
Agent Mode is the default synthesis method. Instead of requiring a separate Anthropic API key, it uses your existing Claude session:
- Research Gathering - MCP server searches, deduplicates, and ranks sources
- Content Extraction - Full content extracted from top sources
- Agent Prompt Generation - All research data packaged into structured prompt
- Agent Launch - Claude Code automatically launches agent with research data
- Synthesis - Agent analyzes sources and generates comprehensive report
- No Additional API Key - Uses your existing Claude subscription
- Full Context - Agent has access to conversation history
- Transparent Process - See agent analysis in real-time
- Same Quality - Uses same Claude model you're already using
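To make the prompt-generation step more concrete, here is a minimal sketch of packaging research data into a single prompt. Only the `[AGENT_SYNTHESIS_REQUIRED]` marker comes from this README. The `ResearchSource` shape, the `buildAgentPrompt` name, the prompt layout, and the per-source character cap are all assumptions.

```typescript
// Hypothetical source shape for illustration.
interface ResearchSource {
  title: string;
  url: string;
  content: string;
}

// Marker named in the troubleshooting notes; everything around it is assumed.
const AGENT_MARKER = "[AGENT_SYNTHESIS_REQUIRED]";

// Build one plain-text prompt: marker, topic, optional focus areas, then each source.
function buildAgentPrompt(
  topic: string,
  sources: ResearchSource[],
  focusAreas: string[] = [],
  maxCharsPerSource = 2000,
): string {
  const lines: string[] = [AGENT_MARKER, `Topic: ${topic}`];
  if (focusAreas.length > 0) {
    lines.push(`Focus areas: ${focusAreas.join(", ")}`);
  }
  sources.forEach((s, i) => {
    // Truncate long pages so the prompt stays a manageable size.
    lines.push("", `Source ${i + 1}: ${s.title}`, `URL: ${s.url}`, s.content.slice(0, maxCharsPerSource));
  });
  lines.push("", "Synthesize the sources above into key findings with citations.");
  return lines.join("\n");
}
```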
For automated workflows or scripts, you can use Direct API mode:
# .env
ANTHROPIC_API_KEY=your_anthropic_api_key
USE_DIRECT_API=true
This bypasses agent mode and calls the Anthropic API directly from the MCP server.
src/
├── google-search-v3.ts # Main MCP server (v3)
├── services/
│ ├── google-search.service.ts # Google Custom Search integration
│ ├── content-extractor.service.ts # Web content extraction
│ ├── source-quality.service.ts # Source ranking and scoring
│ ├── deduplication.service.ts # Duplicate detection
│ └── research-synthesis.service.ts # Agent-based synthesis
└── types.ts # TypeScript interfaces
Search Query → Google API → Results
↓
Deduplication
↓
Quality Scoring
↓
Content Extraction
↓
Agent Synthesis
↓
Comprehensive Research Report
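The flow above can be read as a chain of stages. The sketch below wires hypothetical stage functions together in that order. The `Stages` interface and `runResearch` function are assumptions for illustration and do not mirror the service classes in src/services.

```typescript
// Hypothetical minimal result shape.
interface PipelineResult {
  url: string;
  score?: number;
  content?: string;
}

// Each stage is injected so the flow can be exercised without network access.
interface Stages {
  search: (query: string) => Promise<PipelineResult[]>;
  dedupe: (results: PipelineResult[]) => PipelineResult[];
  score: (result: PipelineResult) => number;
  extract: (url: string) => Promise<string>;
  synthesize: (query: string, sources: PipelineResult[]) => Promise<string>;
}

// Search, deduplicate, score, keep the top sources, extract content, then synthesize.
async function runResearch(query: string, stages: Stages, numSources = 5): Promise<string> {
  const found = await stages.search(query);
  const unique = stages.dedupe(found);
  const ranked = unique
    .map((r) => ({ ...r, score: stages.score(r) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, numSources);
  // Promise.all rejects as soon as any extraction fails; a real server
  // would likely tolerate individual failures instead.
  const withContent = await Promise.all(
    ranked.map(async (r) => ({ ...r, content: await stages.extract(r.url) })),
  );
  return stages.synthesize(query, withContent);
}
```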
Search Google with advanced filtering and quality scoring.
Parameters:
- query (string, required) - Search query
- num_results (number, optional) - Number of results (default: 5, max: 10)
- site (string, optional) - Limit to specific domain
- language (string, optional) - ISO 639-1 language code
- dateRestrict (string, optional) - Date filter (e.g., "m6" for last 6 months)
- exactTerms (string, optional) - Exact phrase matching
- resultType (string, optional) - Filter by type (image, news, video)
- page (number, optional) - Pagination
- sort (string, optional) - Sort by relevance or date
Returns:
- Ranked search results with quality scores
- Deduplication statistics
- Source categorization
- Pagination info
- Cache metadata
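The google_search parameters can be written down as a TypeScript interface, with the documented default (5) and maximum (10) for num_results applied up front. This is a sketch: `normalizeSearchParams` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) and defaulting page to 1 are assumptions.

```typescript
// Parameter names and types follow the list above.
interface GoogleSearchParams {
  query: string;
  num_results?: number;
  site?: string;
  language?: string;
  dateRestrict?: string;
  exactTerms?: string;
  resultType?: "image" | "news" | "video";
  page?: number;
  sort?: "relevance" | "date";
}

// Trim the query, apply the num_results default, and clamp it to the 1-10 range.
function normalizeSearchParams(
  params: GoogleSearchParams,
): GoogleSearchParams & { num_results: number; page: number } {
  const query = params.query.trim();
  if (query.length === 0) {
    throw new Error("query must not be empty");
  }
  const requested = Math.trunc(params.num_results ?? 5);
  const num_results = Math.min(10, Math.max(1, requested));
  const page = Math.max(1, Math.trunc(params.page ?? 1)); // assumed default
  return { ...params, query, num_results, page };
}
```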
Extract clean content from a webpage.
Parameters:
- url (string, required) - Target URL
- format (enum, optional) - Output format: markdown, html, text (default: markdown)
- full_content (boolean, optional) - Return full content (default: false)
- max_length (number, optional) - Maximum content length
- preview_length (number, optional) - Preview length (default: 500)
Returns:
- Extracted content
- Metadata (title, description, author)
- Statistics (word count, character count)
- Content summary
- Cache information
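For a sense of how preview_length might be applied, the sketch below trims extracted text to a preview without cutting a word in half. `makePreview` is a hypothetical helper. Only the default of 500 comes from the parameter list above, and the word-boundary behaviour is an assumption.

```typescript
// Collapse whitespace, then cut back to the last space within the limit.
function makePreview(content: string, previewLength = 500): string {
  const text = content.replace(/\s+/g, " ").trim();
  if (text.length <= previewLength) return text;
  const cut = text.slice(0, previewLength);
  const lastSpace = cut.lastIndexOf(" ");
  // If there is no space to cut at, fall back to a hard cut.
  const head = lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
  return head + "...";
}
```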
Batch extract content from multiple URLs (max 5).
Parameters:
- urls (array, required) - Array of URLs (max 5)
- format (enum, optional) - Output format
Returns:
- Extracted content per URL
- Error details for failed extractions
- Cache metadata
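Batch extraction reports failures per URL instead of failing the whole request. One way to get that behaviour is to catch errors inside each task, as in this sketch. The `Extractor` type, `BatchItem` shape, and `extractBatch` name are hypothetical. Only the five-URL limit comes from the description above.

```typescript
const MAX_BATCH_URLS = 5; // limit stated in the tool description

// Injected extractor so the batching logic can be tested without network access.
type Extractor = (url: string) => Promise<string>;

interface BatchItem {
  url: string;
  content?: string;
  error?: string;
}

// Run all extractions concurrently; each task converts its own failure into an error entry.
async function extractBatch(urls: string[], extract: Extractor): Promise<BatchItem[]> {
  if (urls.length === 0 || urls.length > MAX_BATCH_URLS) {
    throw new Error(`Expected 1-${MAX_BATCH_URLS} URLs, got ${urls.length}`);
  }
  return Promise.all(
    urls.map(async (url): Promise<BatchItem> => {
      try {
        return { url, content: await extract(url) };
      } catch (err) {
        return { url, error: err instanceof Error ? err.message : String(err) };
      }
    }),
  );
}
```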
Comprehensive research with AI synthesis.
Parameters:
- topic (string, required) - Research topic
- depth (enum, optional) - Analysis depth: basic, intermediate, advanced (default: intermediate)
- num_sources (number, optional) - Number of sources (default: varies by depth)
- focus_areas (array, optional) - Specific aspects to analyze
Returns:
- Executive summary
- Key findings with citations
- Common themes
- Focus area analysis (if specified)
- Contradictions between sources
- Recommendations
- Quality metrics (source diversity, authority, freshness)
- Source list with quality scores
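The quality scores attached to each source could be computed in many ways. The sketch below blends an authority weight per source type with a recency score. All weights, the five-year decay window, and the `SourceType` categories are assumptions for illustration, not the values used by source-quality.service.ts.

```typescript
// Assumed categories, loosely based on the result categorization described earlier.
type SourceType = "academic" | "official_docs" | "news" | "forum" | "other";

interface ScoredInput {
  sourceType: SourceType;
  publishedAt?: Date; // unknown dates get a neutral recency score
}

// Illustrative authority weights in [0, 1].
const AUTHORITY: Record<SourceType, number> = {
  academic: 0.9,
  official_docs: 1.0,
  news: 0.7,
  forum: 0.4,
  other: 0.5,
};

const MS_PER_YEAR = 365.25 * 24 * 3600 * 1000;

// Recency decays linearly from 1 (published now) to 0 (five or more years old).
function recencyScore(publishedAt: Date | undefined, now: Date): number {
  if (!publishedAt) return 0.5;
  const ageYears = (now.getTime() - publishedAt.getTime()) / MS_PER_YEAR;
  return Math.min(1, Math.max(0, 1 - ageYears / 5));
}

// Weighted blend: 70% authority, 30% recency (assumed weights).
function qualityScore(input: ScoredInput, now: Date = new Date()): number {
  return 0.7 * AUTHORITY[input.sourceType] + 0.3 * recencyScore(input.publishedAt, now);
}

// Sort from highest to lowest score without mutating the input array.
function rankByQuality<T extends ScoredInput>(items: T[], now: Date = new Date()): T[] {
  return [...items].sort((a, b) => qualityScore(b, now) - qualityScore(a, now));
}
```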
| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_API_KEY | Yes | - | Google Custom Search API key |
| GOOGLE_SEARCH_ENGINE_ID | Yes | - | Custom Search Engine ID |
| ANTHROPIC_API_KEY | No | - | For Direct API mode only |
| USE_DIRECT_API | No | false | Enable Direct API mode |
| MCP_TRANSPORT | No | stdio | Transport mode: stdio or http |
| PORT | No | 3000 | Port for HTTP mode |
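A sketch of how these variables might be read and validated at startup follows. The `ServerConfig` shape and `loadConfig` function are hypothetical, and requiring ANTHROPIC_API_KEY whenever USE_DIRECT_API is true is an assumption inferred from the descriptions above.

```typescript
// Hypothetical config shape mirroring the environment variables.
interface ServerConfig {
  googleApiKey: string;
  searchEngineId: string;
  anthropicApiKey?: string;
  useDirectApi: boolean;
  transport: "stdio" | "http";
  port: number;
}

// Takes the environment as a plain object so it can be tested without touching the real one.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const googleApiKey = env.GOOGLE_API_KEY;
  const searchEngineId = env.GOOGLE_SEARCH_ENGINE_ID;
  if (!googleApiKey || !searchEngineId) {
    throw new Error("GOOGLE_API_KEY and GOOGLE_SEARCH_ENGINE_ID are required");
  }
  const useDirectApi = env.USE_DIRECT_API === "true";
  if (useDirectApi && !env.ANTHROPIC_API_KEY) {
    // Assumed rule: Direct API mode cannot work without a key.
    throw new Error("USE_DIRECT_API=true requires ANTHROPIC_API_KEY");
  }
  const transport = env.MCP_TRANSPORT === "http" ? "http" : "stdio";
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    googleApiKey,
    searchEngineId,
    anthropicApiKey: env.ANTHROPIC_API_KEY,
    useDirectApi,
    transport,
    port,
  };
}
```

In a Node.js entry point this would typically be called as `loadConfig(process.env)`.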
| Operation | Typical Duration | Notes |
|---|---|---|
| google_search | 1-2s | Includes quality scoring and deduplication |
| extract_webpage_content | 2-3s | Per URL |
| research_topic (basic) | 8-10s | 3 sources with agent synthesis |
| research_topic (intermediate) | 12-15s | 5 sources with comprehensive analysis |
| research_topic (advanced) | 18-25s | 8-10 sources with deep analysis |
| Metric | v2 | v3 | Improvement |
|---|---|---|---|
| Summary Quality | 2/10 | 9/10 | 350% |
| Source Diversity | Not tracked | Optimized | New |
| Duplicate Removal | 0% | ~30% | New |
| Source Ranking | Random | By quality | New |
| Focus Area Support | Generic | Dedicated | New |
| Error Helpfulness | 3/10 | 9/10 | 200% |
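The percentages in the Improvement column are consistent with a relative change over the v2 score. A small sketch of that arithmetic, using a hypothetical `percentImprovement` helper:

```typescript
// Relative improvement of `after` over `before`, expressed as a percentage.
function percentImprovement(before: number, after: number): number {
  if (before === 0) {
    throw new Error("Relative improvement is undefined for a zero baseline");
  }
  return ((after - before) / before) * 100;
}
```

With the table's scores, `percentImprovement(2, 9)` gives 350 and `percentImprovement(3, 9)` gives 200.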
Symptoms: Research returns basic concatenation instead of synthesis
Solutions:
- Verify server shows "AGENT MODE" on startup
- Check for [AGENT_SYNTHESIS_REQUIRED] in response
- Ensure using v3: npm run start:v3
- Rebuild: npm run build
Symptoms: Search results don't show quality scores
Solutions:
- Confirm running v3, not v2
- Check server startup output
- Verify no TypeScript compilation errors
Solutions:
- Verify Google API key is valid
- Check Custom Search Engine ID
- Ensure search engine has indexing enabled
- Try broader search terms
- QUICK-START.md - Fast setup guide (2 minutes)
- AGENT-MODE.md - Comprehensive agent mode documentation
- SETUP-V3.md - Detailed setup and testing guide
- README-V3.md - Feature documentation and comparisons
- tool-evaluation-report.md - Detailed analysis of improvements
- implementation-guide.md - Code implementation details
- Agent-based synthesis (no API key required)
- Source quality assessment and ranking
- Comprehensive deduplication
- Focus area analysis
- Enhanced error handling with suggestions
- Cache metadata transparency
- Consistent preview lengths
- Research depth differentiation
- HTTP transport support
- Batch webpage extraction
- Basic research synthesis
- Content categorization
- Initial release
- Google Custom Search integration
- Basic content extraction
Contributions are welcome. Please ensure:
- Code follows existing style conventions
- All tests pass: npm run build
- Documentation is updated
- Commit messages are descriptive
See LICENSE file for details.
For issues, questions, or feature requests, please open an issue on GitHub.
- Google Custom Search API - Search functionality
- Anthropic Claude - AI-powered research synthesis
- Mozilla Readability - Content extraction
- MCP SDK - Model Context Protocol integration
Status: Production Ready | Version: 3.0.0 | Last Updated: 2025-11-07