Google Research MCP Server

Version 3.0.0 - Enhanced research synthesis with intelligent source quality assessment and deduplication.

An advanced Model Context Protocol (MCP) server that provides comprehensive Google search capabilities, webpage content extraction, and AI-powered research synthesis. Built for Claude Code, Claude Desktop, and other MCP-compatible clients.

Overview

This MCP server transforms Google search into a powerful research tool by:

Intelligent Source Ranking - Automatically scores sources by authority, recency, and credibility
Deduplication - Removes duplicate URLs and similar content across search results
Agent-Based Synthesis - Leverages your existing Claude session to synthesize research findings
Focus Area Analysis - Provides dedicated analysis for specific aspects of your research topic
Quality Metrics - Tracks source diversity, authority, and content freshness
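
To make the ranking idea concrete, here is a minimal sketch of scoring a source by authority and recency. The `Source` shape, the domain-suffix weights, the ten-year decay, and the 60/40 weighting are all assumptions chosen for illustration; the server's actual scoring in source-quality.service.ts may work quite differently.

```typescript
// Hypothetical shape of a search result; the real server's types may differ.
interface Source {
  url: string;
  publishedYear?: number; // undefined when the page exposes no date
}

// Assumed authority weights by domain suffix (illustrative values only).
const AUTHORITY_BY_SUFFIX: Array<[string, number]> = [
  [".gov", 0.9],
  [".edu", 0.9],
  [".org", 0.7],
];

// Extract the host: strip the scheme, keep everything before "/", "?" or "#".
function hostOf(url: string): string {
  const withoutScheme = url.replace(/^[a-z]+:\/\//i, "");
  return (withoutScheme.split(/[\/?#]/)[0] ?? "").toLowerCase();
}

function authorityScore(url: string): number {
  const host = hostOf(url);
  for (const [suffix, score] of AUTHORITY_BY_SUFFIX) {
    if (host.endsWith(suffix)) return score;
  }
  return 0.5; // neutral default for unrecognized domains
}

// Newer sources score higher; the score drops 0.1 per year of age, floor 0.
function recencyScore(publishedYear: number | undefined, currentYear: number): number {
  if (publishedYear === undefined) return 0.5;
  const age = Math.max(0, currentYear - publishedYear);
  return Math.max(0, 1 - age / 10);
}

// Weighted blend of the two signals (assumed 60% authority, 40% recency).
function qualityScore(source: Source, currentYear: number): number {
  return (
    0.6 * authorityScore(source.url) +
    0.4 * recencyScore(source.publishedYear, currentYear)
  );
}

// Return a new array sorted from highest to lowest quality score.
function rankSources(sources: Source[], currentYear: number): Source[] {
  return [...sources].sort(
    (a, b) => qualityScore(b, currentYear) - qualityScore(a, currentYear)
  );
}
```

Passing the current year in as a parameter keeps the functions deterministic, which makes the ranking easy to test.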

Quick Start

Prerequisites

Node.js 18 or higher
Google Cloud Platform account with Custom Search API enabled
Google Custom Search Engine ID

Installation

# Clone the repository
git clone https://github.com/mixelpixx/Google-Search-MCP-Server
cd Google-Search-MCP-Server

# Install dependencies
npm install

# Build the project
npm run build

Configuration

Create a .env file in the project root:

GOOGLE_API_KEY=your_google_api_key
GOOGLE_SEARCH_ENGINE_ID=your_custom_search_engine_id

Note: No Anthropic API key is required. The server uses agent-based synthesis that leverages your existing Claude session.

Running the Server

# Start v3 server (recommended)
npm run start:v3

# For HTTP mode
npm run start:v3:http

Expected output:

============================================================
Google Research MCP Server v3.0.0 (Enhanced)
============================================================
✓ Source quality assessment
✓ Deduplication
✓ AI synthesis: AGENT MODE (Claude will launch agents)
  └─ No API key needed - uses your existing Claude session
✓ Focus area analysis
✓ Enhanced error handling
✓ Cache metadata
============================================================
Server running on STDIO

Features

Core Capabilities

1. Advanced Google Search

Full-text search with quality scoring
Domain filtering and date restrictions
Result categorization (academic, official docs, news, forums, etc.)
Automatic deduplication of results
Source authority ranking
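
As a rough illustration of the deduplication step, the sketch below treats two results as duplicates when their URLs match after normalization. The `SearchResult` shape and the normalization rules (ignore scheme, `www.`, fragment, and trailing slashes) are assumptions; the server also removes similar content, which this sketch does not attempt.

```typescript
// Hypothetical result shape; the real server's types may differ.
interface SearchResult {
  title: string;
  url: string;
}

// Reduce a URL to a comparison key. Only the host is lowercased,
// because URL paths can be case-sensitive.
function normalizeUrl(url: string): string {
  const stripped = url
    .trim()
    .replace(/#.*$/, "") // drop the fragment
    .replace(/^https?:\/\//i, "") // ignore the scheme
    .replace(/^www\./i, ""); // ignore a leading "www."
  const slash = stripped.indexOf("/");
  const host = slash === -1 ? stripped : stripped.slice(0, slash);
  const path = slash === -1 ? "" : stripped.slice(slash).replace(/\/+$/, "");
  return host.toLowerCase() + path;
}

// Keep the first occurrence of each normalized URL and count what was dropped.
function dedupeResults(results: SearchResult[]): {
  unique: SearchResult[];
  removed: number;
} {
  const seen = new Set<string>();
  const unique: SearchResult[] = [];
  for (const result of results) {
    const key = normalizeUrl(result.url);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(result);
  }
  return { unique, removed: results.length - unique.length };
}
```

Returning the `removed` count alongside the unique list is one way to feed the duplicate removal report mentioned under Targeted Search.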

2. Content Extraction

Clean content extraction from web pages
Multiple output formats (Markdown, HTML, plain text)
Configurable preview lengths
Batch extraction support (up to 5 URLs)
Automatic content summarization

3. Research Synthesis

Agent-based research analysis
Comprehensive source synthesis
Focus area breakdowns
Contradiction detection
Actionable recommendations
Quality metrics reporting

Research Depth Levels

Depth	Sources	Analysis	Use Case
basic	3	Quick overview, 3-5 findings	Fast comparisons, initial research
intermediate	5	Comprehensive analysis, 5-7 findings	Standard research tasks
advanced	8-10	In-depth analysis, 7-10 findings, contradictions	Decision-making, comprehensive reviews
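
One way to picture the table above is as a lookup of presets keyed by depth. This is a sketch only: the `DEPTH_PRESETS` name, the field names, and the choice to default to the lower bound of the source range are assumptions, not the server's confirmed behavior.

```typescript
type Depth = "basic" | "intermediate" | "advanced";

interface DepthPreset {
  sources: [number, number]; // min and max sources from the table
  findings: [number, number]; // min and max key findings from the table
}

// Values copied from the Research Depth Levels table.
const DEPTH_PRESETS: Record<Depth, DepthPreset> = {
  basic: { sources: [3, 3], findings: [3, 5] },
  intermediate: { sources: [5, 5], findings: [5, 7] },
  advanced: { sources: [8, 10], findings: [7, 10] },
};

// Resolve how many sources to fetch. An explicit num_sources wins;
// otherwise fall back to the lower bound for the depth (an assumption).
function resolveNumSources(depth: Depth = "intermediate", requested?: number): number {
  if (requested !== undefined) {
    if (!Number.isInteger(requested) || requested < 1) {
      throw new Error("num_sources must be a positive integer");
    }
    return requested;
  }
  return DEPTH_PRESETS[depth].sources[0];
}
```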

Usage Examples

Basic Research

research_topic({
  topic: "WebAssembly performance optimization",
  depth: "basic"
})

Returns:

3 high-quality sources
Brief overview (2-3 paragraphs)
3-5 key findings
Quality metrics

Comprehensive Research with Focus Areas

research_topic({
  topic: "Kubernetes security",
  depth: "advanced",
  focus_areas: ["RBAC", "network policies", "pod security"],
  num_sources: 8
})

Returns:

8 authoritative sources
In-depth executive summary
7-10 detailed findings
Common themes across sources
Dedicated analysis for each focus area
Contradictions between sources
Actionable recommendations
Comprehensive quality metrics

Targeted Search

google_search({
  query: "docker container security best practices",
  num_results: 10,
  dateRestrict: "y1",  // Last year only
  site: "github.com"   // Limit to GitHub
})

Returns:

Quality-scored results
Duplicate removal report
Source type classification
Authority ratings

Content Extraction

extract_webpage_content({
  url: "https://kubernetes.io/docs/concepts/security/",
  format: "markdown",
  max_length: 5000,
  preview_length: 300
})

Returns:

Clean extracted content
Metadata (title, description, author)
Word count and statistics
Configurable preview
Cache information

Agent Mode

How It Works

Agent Mode is the default synthesis method. Instead of requiring a separate Anthropic API key, it uses your existing Claude session:

Research Gathering - MCP server searches, deduplicates, and ranks sources
Content Extraction - Full content extracted from top sources
Agent Prompt Generation - All research data packaged into structured prompt
Agent Launch - Claude Code automatically launches agent with research data
Synthesis - Agent analyzes sources and generates comprehensive report
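
The prompt-generation step can be sketched as follows. The `[AGENT_SYNTHESIS_REQUIRED]` marker is the one named under Troubleshooting; everything else here (the `ExtractedSource` shape, the `buildAgentPrompt` name, and the prompt layout) is a hypothetical illustration rather than the server's actual format.

```typescript
// Hypothetical shape for a source after content extraction.
interface ExtractedSource {
  title: string;
  url: string;
  content: string;
}

// Marker the troubleshooting section says to look for in responses.
const AGENT_MARKER = "[AGENT_SYNTHESIS_REQUIRED]";

// Package the topic, optional focus areas, and all sources into one prompt.
function buildAgentPrompt(
  topic: string,
  sources: ExtractedSource[],
  focusAreas: string[] = []
): string {
  const lines: string[] = [AGENT_MARKER, `Topic: ${topic}`];
  if (focusAreas.length > 0) {
    lines.push(`Focus areas: ${focusAreas.join(", ")}`);
  }
  sources.forEach((source, index) => {
    // Number sources from 1 so the agent can cite them as "Source N".
    lines.push("", `Source ${index + 1}: ${source.title}`, `URL: ${source.url}`, source.content);
  });
  lines.push("", "Synthesize the sources above into a report with cited key findings.");
  return lines.join("\n");
}
```

Numbering the sources gives the agent a stable way to cite them in its findings.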

Benefits

No Additional API Key - Uses your existing Claude subscription
Full Context - Agent has access to conversation history
Transparent Process - See agent analysis in real-time
Same Quality - Uses same Claude model you're already using

Alternative: Direct API Mode

For automated workflows or scripts, you can use Direct API mode:

# .env
ANTHROPIC_API_KEY=your_anthropic_api_key
USE_DIRECT_API=true

This bypasses agent mode and calls the Anthropic API directly from the MCP server.

Architecture

Services

src/
├── google-search-v3.ts              # Main MCP server (v3)
├── services/
│   ├── google-search.service.ts     # Google Custom Search integration
│   ├── content-extractor.service.ts # Web content extraction
│   ├── source-quality.service.ts    # Source ranking and scoring
│   ├── deduplication.service.ts     # Duplicate detection
│   └── research-synthesis.service.ts # Agent-based synthesis
└── types.ts                          # TypeScript interfaces

Data Flow

Search Query → Google API → Results
                              ↓
                         Deduplication
                              ↓
                         Quality Scoring
                              ↓
                         Content Extraction
                              ↓
                         Agent Synthesis
                              ↓
                    Comprehensive Research Report
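
The flow above can be sketched as a small pipeline that takes each stage as a function. The `Stages` interface and `runResearch` name are assumptions made for this example; the real services are classes under src/services/ and may be wired together differently.

```typescript
// Each stage is injected, so the pipeline itself stays independent of
// Google, the extractor, or the synthesis backend. R is a search result
// type and E is an extracted-content type.
interface Stages<R, E> {
  search: (query: string) => Promise<R[]>;
  dedupe: (results: R[]) => R[];
  rank: (results: R[]) => R[];
  extract: (result: R) => Promise<E>;
  synthesize: (topic: string, extracted: E[]) => Promise<string>;
}

// Run the stages in the order shown in the diagram:
// search, deduplicate, score/rank, extract, synthesize.
async function runResearch<R, E>(
  topic: string,
  limit: number,
  stages: Stages<R, E>
): Promise<string> {
  const found = await stages.search(topic);
  const ranked = stages.rank(stages.dedupe(found));
  const top = ranked.slice(0, limit);
  // Extract the top sources concurrently.
  const extracted = await Promise.all(top.map((result) => stages.extract(result)));
  return stages.synthesize(topic, extracted);
}
```

Deduplicating before ranking and extraction means no time is spent fetching pages that would be discarded anyway.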

API Reference

Tools

google_search

Search Google with advanced filtering and quality scoring.

Parameters:

query (string, required) - Search query
num_results (number, optional) - Number of results (default: 5, max: 10)
site (string, optional) - Limit to specific domain
language (string, optional) - ISO 639-1 language code
dateRestrict (string, optional) - Date filter (e.g., "m6" for last 6 months)
exactTerms (string, optional) - Exact phrase matching
resultType (string, optional) - Filter by type (image, news, video)
page (number, optional) - Pagination
sort (string, optional) - Sort by relevance or date

Returns:

Ranked search results with quality scores
Deduplication statistics
Source categorization
Pagination info
Cache metadata
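
For orientation, here is a sketch of how these parameters could translate into a Google Custom Search JSON API request. The query parameter names (`key`, `cx`, `q`, `num`, `start`, `siteSearch`, `lr`, `dateRestrict`, `exactTerms`) belong to that API, but the mapping from this tool's arguments to them is an assumption, as are the `buildSearchUrl` name and the clamping behavior. `resultType` and `sort` are left out because their mapping is not described here.

```typescript
// Subset of the google_search tool arguments listed above.
interface GoogleSearchArgs {
  query: string;
  num_results?: number;
  site?: string;
  language?: string;
  dateRestrict?: string;
  exactTerms?: string;
  page?: number;
}

const ENDPOINT = "https://www.googleapis.com/customsearch/v1";

function buildSearchUrl(args: GoogleSearchArgs, apiKey: string, engineId: string): string {
  // Default 5, maximum 10, as documented for num_results.
  const num = Math.min(Math.max(args.num_results ?? 5, 1), 10);
  const page = Math.max(args.page ?? 1, 1);
  const params: Array<[string, string]> = [
    ["key", apiKey],
    ["cx", engineId],
    ["q", args.query],
    ["num", String(num)],
    // The API's start index is 1-based, so page 2 of 10 starts at 11.
    ["start", String((page - 1) * num + 1)],
  ];
  if (args.site) params.push(["siteSearch", args.site]);
  // The API expects language restrictions in the form "lang_xx".
  if (args.language) params.push(["lr", `lang_${args.language}`]);
  if (args.dateRestrict) params.push(["dateRestrict", args.dateRestrict]);
  if (args.exactTerms) params.push(["exactTerms", args.exactTerms]);
  const query = params.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
  return `${ENDPOINT}?${query}`;
}
```

With the Targeted Search example from earlier (`num_results: 10`, `dateRestrict: "y1"`, `site: "github.com"`), this produces a URL containing `num=10&start=1&siteSearch=github.com&dateRestrict=y1`.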

extract_webpage_content

Extract clean content from a webpage.

Parameters:

url (string, required) - Target URL
format (enum, optional) - Output format: markdown, html, text (default: markdown)
full_content (boolean, optional) - Return full content (default: false)
max_length (number, optional) - Maximum content length
preview_length (number, optional) - Preview length (default: 500)

Returns:

Extracted content
Metadata (title, description, author)
Statistics (word count, character count)
Content summary
Cache information
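
The preview and statistics fields can be illustrated with a short sketch. The 500-character default comes from the parameter list above; cutting at a word boundary, appending "...", and the `makePreview` and `contentStats` names are assumptions for this example.

```typescript
// Build a preview of at most previewLength characters (plus an ellipsis),
// cutting at the last space so words are not split.
function makePreview(content: string, previewLength = 500): string {
  if (content.length <= previewLength) return content;
  const cut = content.slice(0, previewLength);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "...";
}

// Word and character counts for the extracted text.
function contentStats(content: string): { wordCount: number; characterCount: number } {
  const words = content.trim().split(/\s+/).filter((word) => word.length > 0);
  return { wordCount: words.length, characterCount: content.length };
}
```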

extract_multiple_webpages

Batch extract content from multiple URLs (max 5).

Parameters:

urls (array, required) - Array of URLs (max 5)
format (enum, optional) - Output format

Returns:

Extracted content per URL
Error details for failed extractions
Cache metadata
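
A caller may want to check a batch before sending it. The sketch below enforces the documented limit of 5 URLs and separates out entries that are not absolute http(s) URLs. Whether the server rejects or truncates oversized batches is not stated here, so rejecting is an assumption, as are the `checkBatch` name and the result shape.

```typescript
const MAX_BATCH_URLS = 5; // limit documented for extract_multiple_webpages

interface BatchCheck {
  accepted: string[];
  rejected: Array<{ url: string; reason: string }>;
}

// Validate a batch up front so one bad URL does not waste the whole call.
function checkBatch(urls: string[]): BatchCheck {
  if (urls.length === 0) {
    throw new Error("urls must contain at least one URL");
  }
  if (urls.length > MAX_BATCH_URLS) {
    throw new Error(`At most ${MAX_BATCH_URLS} URLs per call, got ${urls.length}`);
  }
  const accepted: string[] = [];
  const rejected: Array<{ url: string; reason: string }> = [];
  for (const url of urls) {
    // Require an http(s) scheme followed by a non-empty host.
    if (/^https?:\/\/[^\s\/]+/i.test(url)) {
      accepted.push(url);
    } else {
      rejected.push({ url, reason: "not an absolute http(s) URL" });
    }
  }
  return { accepted, rejected };
}
```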

research_topic

Comprehensive research with AI synthesis.

Parameters:

topic (string, required) - Research topic
depth (enum, optional) - Analysis depth: basic, intermediate, advanced (default: intermediate)
num_sources (number, optional) - Number of sources (default: varies by depth)
focus_areas (array, optional) - Specific aspects to analyze

Returns:

Executive summary
Key findings with citations
Common themes
Focus area analysis (if specified)
Contradictions between sources
Recommendations
Quality metrics (source diversity, authority, freshness)
Source list with quality scores

Configuration Options

Environment Variables

Variable	Required	Default	Description
`GOOGLE_API_KEY`	Yes	-	Google Custom Search API key
`GOOGLE_SEARCH_ENGINE_ID`	Yes	-	Custom Search Engine ID
`ANTHROPIC_API_KEY`	No	-	For Direct API mode only
`USE_DIRECT_API`	No	false	Enable Direct API mode
`MCP_TRANSPORT`	No	stdio	Transport mode: stdio or http
`PORT`	No	3000	Port for HTTP mode
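
The table above could be read into a typed configuration roughly like this. It is a sketch under assumptions: the `ServerConfig` shape, the `loadConfig` name, and the validation rules are illustrative, and the environment is passed in as a plain object (for example `process.env` in Node.js) so the function stays easy to test.

```typescript
interface ServerConfig {
  googleApiKey: string;
  searchEngineId: string;
  anthropicApiKey: string | undefined; // only needed in Direct API mode
  useDirectApi: boolean;
  transport: "stdio" | "http";
  port: number;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServerConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  const searchEngineId = env["GOOGLE_SEARCH_ENGINE_ID"];
  if (!googleApiKey || !searchEngineId) {
    throw new Error("GOOGLE_API_KEY and GOOGLE_SEARCH_ENGINE_ID are required");
  }

  // Defaults to false, matching the table.
  const useDirectApi = env["USE_DIRECT_API"] === "true";
  const anthropicApiKey = env["ANTHROPIC_API_KEY"];
  if (useDirectApi && !anthropicApiKey) {
    throw new Error("USE_DIRECT_API=true requires ANTHROPIC_API_KEY");
  }

  // Defaults to stdio, matching the table.
  const rawTransport = env["MCP_TRANSPORT"] ?? "stdio";
  let transport: "stdio" | "http";
  if (rawTransport === "stdio" || rawTransport === "http") {
    transport = rawTransport;
  } else {
    throw new Error(`MCP_TRANSPORT must be "stdio" or "http", got "${rawTransport}"`);
  }

  // Defaults to 3000, matching the table.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  return { googleApiKey, searchEngineId, anthropicApiKey, useDirectApi, transport, port };
}
```

Failing fast on a missing key gives a clearer error at startup than an authentication failure on the first search.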

Performance

Response Times

Operation	Typical Duration	Notes
google_search	1-2s	Includes quality scoring and deduplication
extract_webpage_content	2-3s	Per URL
research_topic (basic)	8-10s	3 sources with agent synthesis
research_topic (intermediate)	12-15s	5 sources with comprehensive analysis
research_topic (advanced)	18-25s	8-10 sources with deep analysis

Quality Improvements Over v2

Metric	v2	v3	Improvement
Summary Quality	2/10	9/10	350%
Source Diversity	Not tracked	Optimized	New
Duplicate Removal	0%	~30%	New
Source Ranking	Random	By quality	New
Focus Area Support	Generic	Dedicated	New
Error Helpfulness	3/10	9/10	200%

Troubleshooting

Agent Mode Not Working

Symptoms: Research returns basic concatenation instead of synthesis

Solutions:

Verify server shows "AGENT MODE" on startup
Check for [AGENT_SYNTHESIS_REQUIRED] in response
Ensure using v3: npm run start:v3
Rebuild: npm run build

Quality Scores Missing

Symptoms: Search results don't show quality scores

Solutions:

Confirm running v3, not v2
Check server startup output
Verify no TypeScript compilation errors

No Results Found

Solutions:

Verify Google API key is valid
Check Custom Search Engine ID
Ensure search engine has indexing enabled
Try broader search terms

Documentation

QUICK-START.md - Fast setup guide (2 minutes)
AGENT-MODE.md - Comprehensive agent mode documentation
SETUP-V3.md - Detailed setup and testing guide
README-V3.md - Feature documentation and comparisons
tool-evaluation-report.md - Detailed analysis of improvements
implementation-guide.md - Code implementation details

Version History

v3.0.0 (Current)

Agent-based synthesis (no API key required)
Source quality assessment and ranking
Comprehensive deduplication
Focus area analysis
Enhanced error handling with suggestions
Cache metadata transparency
Consistent preview lengths
Research depth differentiation

v2.0.0

HTTP transport support
Batch webpage extraction
Basic research synthesis
Content categorization

v1.0.0

Initial release
Google Custom Search integration
Basic content extraction

Contributing

Contributions are welcome. Please ensure:

Code follows existing style conventions
The build succeeds: npm run build
Documentation is updated
Commit messages are descriptive

License

See LICENSE file for details.

Support

For issues, questions, or feature requests, please open an issue on GitHub.

Credits

Google Custom Search API - Search functionality
Anthropic Claude - AI-powered research synthesis
Mozilla Readability - Content extraction
MCP SDK - Model Context Protocol integration

Status: Production Ready
Version: 3.0.0
Last Updated: 2025-11-07
