Skip to content

MCP server to search across NVIDIA blogs and releases to empower LLMs to better answer NVIDIA specific queries

License

Notifications You must be signed in to change notification settings

bharatr21/mcp-nvidia

mcp-nvidia

MCP server to search across NVIDIA blogs and releases to empower LLMs to better answer NVIDIA-specific queries.

Overview

This Model Context Protocol (MCP) server enables Large Language Models (LLMs) to search across multiple NVIDIA domains to find relevant information about NVIDIA technologies, products, and services.

Supported Domains (15 default, customizable)

The server searches across the following NVIDIA domains by default. You can customize this list using the MCP_NVIDIA_DOMAINS environment variable or the domains parameter in search queries:

  • blogs.nvidia.com - NVIDIA blog posts and articles
  • build.nvidia.com - NVIDIA AI Foundation models and services
  • catalog.ngc.nvidia.com - NGC catalog of GPU-accelerated software
  • developer.download.nvidia.com - Developer downloads and resources
  • developer.nvidia.com - Developer resources, SDKs, and technical documentation
  • docs.api.nvidia.com - API documentation
  • docs.nvidia.com - Comprehensive technical documentation
  • docs.omniverse.nvidia.com - Omniverse documentation
  • forums.developer.nvidia.com - Developer forums
  • forums.nvidia.com - Community forums
  • ngc.nvidia.com - NVIDIA GPU Cloud
  • nvidia.github.io - NVIDIA GitHub Pages documentation
  • nvidianews.nvidia.com - Official NVIDIA news and press releases
  • research.nvidia.com - NVIDIA research publications
  • resources.nvidia.com - NVIDIA resources and whitepapers

Note: For security, only nvidia.com domains and nvidia.github.io are allowed.

Key Features

Search & Ranking

  • Advanced relevance scoring combining multiple signals
  • Intelligent keyword extraction
  • Context-aware search with enhanced snippets and highlighting
  • Flexible sorting by relevance, publication date, or domain
  • Content type detection (tutorials, announcements, guides, forums, etc.)
  • Date extraction from HTML metadata and page content

Rich Metadata

  • Publication dates extracted from HTML metadata and text patterns
  • Content metadata including author, word count, code/video/image detection
  • Structured content types for easy filtering and categorization

Connectivity

  • stdio mode (default) - for local MCP clients like Claude Desktop
  • HTTP/SSE mode - for remote access and web-based clients
  • Structured JSON output compatible with AI agents and LLMs

Customization

  • Domain-specific filtering for targeted searches
  • Configurable domains via environment variable
  • Adjustable relevance thresholds and result limits
  • Multiple sort options (relevance, date, domain)

Installation

Via npx (Easiest - recommended for MCP clients)

npx @bharatr21/mcp-nvidia

Or add to your Claude Desktop config:

{
  "mcpServers": {
    "nvidia": {
      "command": "npx",
      "args": ["-y", "@bharatr21/mcp-nvidia"]
    }
  }
}

Note: This requires Python 3.10+ to be installed on your system. The package will automatically use the Python backend.

Via pip

pip install mcp-nvidia

From source

git clone https://github.com/bharatr21/mcp-nvidia.git
cd mcp-nvidia
pip install -e .

Usage

Running the Server

The MCP server can be run directly from the command line:

mcp-nvidia

Configuration

The server can be configured using environment variables:

  • MCP_NVIDIA_DOMAINS: Comma-separated list of custom NVIDIA domains to search (overrides defaults)
    • Security: Only nvidia.com domains and subdomains are allowed. Invalid domains are automatically filtered out.
    • Example: "https://developer.nvidia.com/,https://docs.nvidia.com/"
  • MCP_NVIDIA_LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)

Example:

export MCP_NVIDIA_DOMAINS="https://developer.nvidia.com/,https://docs.nvidia.com/"
export MCP_NVIDIA_LOG_LEVEL="DEBUG"
mcp-nvidia

Configuring with Claude Desktop

Add the following to your Claude Desktop configuration file:

MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "nvidia": {
      "command": "mcp-nvidia"
    }
  }
}

With custom configuration (environment variables):

{
  "mcpServers": {
    "nvidia": {
      "command": "mcp-nvidia",
      "env": {
        "MCP_NVIDIA_LOG_LEVEL": "DEBUG",
        "MCP_NVIDIA_DOMAINS": "https://developer.nvidia.com/,https://docs.nvidia.com/"
      }
    }
  }
}

If you installed from source, you may need to use the full path to the Python executable:

{
  "mcpServers": {
    "nvidia": {
      "command": "/path/to/python",
      "args": ["-m", "mcp_nvidia"]
    }
  }
}

HTTP Server Mode (Remote Access)

For remote access or public deployment, use HTTP/SSE mode:

# Run in HTTP mode (default port 8000)
mcp-nvidia http

# Custom host and port
mcp-nvidia http --host 0.0.0.0 --port 3000

# Or explicitly use stdio mode (default)
mcp-nvidia stdio
mcp-nvidia  # same as stdio

Configure Claude AI for remote server:

{
  "mcpServers": {
    "nvidia": {
      "url": "https://your-server.com/sse",
      "transport": "sse"
    }
  }
}

SDK Resources (Code Execution Mode)

In addition to the standard MCP tool-calling interface, this server exposes TypeScript and Python SDKs as MCP Resources. This enables code execution workflows where AI agents can:

  1. Discover SDK files via list_resources()
  2. Read type definitions via read_resource(uri)
  3. Write code that uses fully-typed interfaces instead of raw tool calls

This approach provides the benefits described in Anthropic's Code Execution with MCP and Cloudflare's Code Mode:

  • Type safety - Full TypeScript/Python type hints for inputs and outputs
  • Data processing in code - Filter and transform results before returning to the model
  • Leverages LLM strengths - LLMs excel at writing code against APIs

Available SDK Resources

The server exposes the following resources via the MCP Resources protocol:

TypeScript SDK:

  • mcp-nvidia://sdk/typescript/search_nvidia.ts - TypeScript types and interface for search_nvidia
  • mcp-nvidia://sdk/typescript/discover_nvidia_content.ts - TypeScript types for discover_nvidia_content
  • mcp-nvidia://sdk/typescript/index.ts - Barrel export file
  • mcp-nvidia://sdk/typescript/README.md - TypeScript SDK documentation

Python SDK:

  • mcp-nvidia://sdk/python/search_nvidia.py - Python types and interface for search_nvidia
  • mcp-nvidia://sdk/python/discover_nvidia_content.py - Python types for discover_nvidia_content
  • mcp-nvidia://sdk/python/__init__.py - Package exports
  • mcp-nvidia://sdk/python/README.md - Python SDK documentation

Example: Code Execution Workflow

Instead of calling tools directly:

# Traditional MCP tool call
result = call_tool("search_nvidia", {"query": "CUDA", "max_results_per_domain": 5})

An agent can discover and use typed SDKs:

// 1. Agent lists resources to discover SDK files
const resources = await client.list_resources();
// Returns: mcp-nvidia://sdk/typescript/search_nvidia.ts, etc.

// 2. Agent reads the TypeScript definitions
const sdkContent = await client.read_resource("mcp-nvidia://sdk/typescript/search_nvidia.ts");

// 3. Agent writes code using the typed interface
import { searchNvidia, SearchNvidiaInput } from "./search_nvidia";

const input: SearchNvidiaInput = {
  query: "CUDA programming",
  max_results_per_domain: 5,
  sort_by: "relevance"
};

const result = await searchNvidia(input);

// 4. Process results in code
const tutorials = result.results.filter(r => r.content_type === "tutorial");
const recent = result.results.filter(r => r.published_date > "2024-01-01");

The generated SDK files include:

  • TypeScript SDK: Full interfaces with JSDoc + functional wrappers that call MCP tools via client
  • Python SDK: TypedDict definitions + direct implementation calls (no MCP overhead!)
  • Type-safe input/output schemas
  • Example usage code
  • Comprehensive documentation

Key Difference:

  • Python SDK: Directly calls mcp_nvidia.lib functions - no MCP protocol overhead
  • TypeScript SDK: Calls via MCP client (since implementation is in Python)

How to Use SDK Resources

Via MCP Client:

Any MCP client that supports resources can access these SDK files:

# List available SDK resources
resources = await mcp_client.list_resources()

# Read a specific SDK file
typescript_sdk = await mcp_client.read_resource("mcp-nvidia://sdk/typescript/search_nvidia.ts")
python_sdk = await mcp_client.read_resource("mcp-nvidia://sdk/python/search_nvidia.py")

Using the Python SDK (Direct Calls):

# Import the generated SDK
from mcp_nvidia_sdk import search_nvidia

# Call directly - no MCP overhead!
result = await search_nvidia({
    "query": "CUDA optimization",
    "max_results_per_domain": 5,
    "sort_by": "date"
})

# Process with full type safety
for item in result["results"]:
    print(f"{item['title']}: {item['url']}")

Using the TypeScript SDK (via MCP Client):

import { searchNvidia, MCPClient } from "./search_nvidia";

// Get MCP client instance
const client: MCPClient = getMCPClient();

// Call via MCP protocol
const result = await searchNvidia({
  query: "CUDA optimization",
  max_results_per_domain: 5,
  sort_by: "date"
}, client);

// Process with full type safety
result.results.forEach(item => {
  console.log(`${item.title}: ${item.url}`);
});

Generated SDK Structure:

Each SDK includes:

  • Type Definitions: Full input/output type definitions automatically generated from tool schemas
  • Async Functions: Properly typed async function signatures for each tool
  • Documentation: Inline comments and docstrings explaining parameters and return types
  • Examples: Usage examples in the generated code
  • Index/Init Files: Convenient imports for all tools

SDK Generation & Testing

The SDK files are automatically generated from the MCP tool schemas when the server starts. They are cached in memory for performance.

Run SDK tests:

# Test SDK generation
pytest tests/test_sdk_generation.py -v

# Test MCP Resources
pytest tests/test_resources.py -v

# Run all tests
pytest tests/ -v

Backward Compatibility

The SDK Resources feature is fully backward compatible. Existing MCP clients can continue using traditional tool calling, while clients that support code execution can use the SDK Resources. Both methods work simultaneously.

Available Tools

search_nvidia

Search across NVIDIA domains for specific information. Results include citations with URLs for easy reference.

Parameters:

  • query (required): The search query to find information across NVIDIA domains
  • domains (optional): List of specific NVIDIA domains to search. If not provided, searches all default domains
  • max_results_per_domain (optional): Maximum number of results to return per domain (default: 3)
  • min_relevance_score (optional): Minimum relevance score threshold (0-100) to filter results (default: 17)
  • sort_by (optional): Sort order for results - one of:
    • relevance (default): Sort by relevance score (highest first)
    • date: Sort by publication date (newest first)
    • domain: Sort alphabetically by domain name

Example queries:

  • "CUDA programming best practices"
  • "RTX 4090 specifications"
  • "TensorRT optimization techniques"
  • "Latest AI announcements"
  • "Omniverse development tutorials"

Example usage with sorting:

# Sort by publication date (newest first) - great for finding latest announcements
search_nvidia(query="NIM microservices", sort_by="date")

# Sort by relevance (default) - best for finding most relevant content
search_nvidia(query="CUDA optimization", sort_by="relevance")

# Sort by domain - useful for browsing content by source
search_nvidia(query="TensorRT", sort_by="domain")

Features:

  • Enhanced search using ddgs package for reliable DuckDuckGo integration with domain filtering
  • Structured JSON output: Returns data in a structured format with both machine-readable and human-readable fields
  • Context-aware snippets: Automatically fetches surrounding text from source URLs and highlights the relevant snippet with **bold** formatting
  • Relevance scoring (0-100 scale): Each result includes a relevance score based on query term matches in title, snippet, and URL
    • Results are sorted by relevance score (highest first) when using default sorting
    • Results below the threshold are automatically filtered out
    • Score displayed as "Score: X/100" in formatted text
  • Flexible sorting options:
    • relevance: Sort by relevance score (default)
    • date: Sort by publication date, newest first
    • domain: Sort alphabetically by domain name
  • Publication date extraction: Automatically extracts dates from HTML metadata (meta tags, time elements) and page content using multiple strategies
  • Content type detection: Identifies content as announcements, tutorials, guides, forum discussions, blog posts, documentation, research papers, news, videos, courses, or articles
  • Rich metadata extraction:
    • Author name from various HTML sources
    • Approximate word count
    • Code snippet detection
    • Video and image detection with counts
  • Security controls: Input validation and limits (customizable via code)
    • Query/topic length: 500 characters max
    • Results per domain: 10 max
    • Domain whitelist: nvidia.com and nvidia.github.io only
    • Rate limiting: 200ms between search API calls (~5 searches/sec)
    • Concurrent searches: 5 max simultaneous searches
  • Concurrent search across multiple domains for fast results
  • Formatted results with titles, URLs, enhanced snippets with context, and source domains
  • Dedicated citations section with numbered references for easy copying

Output Format: Results are returned as structured JSON with the following schema:

{
  "success": true,
  "results": [
    {
      "title": "Page Title",
      "url": "https://example.nvidia.com/page",
      "snippet": "Enhanced snippet with **highlighted** keywords",
      "domain": "example.nvidia.com",
      "relevance_score": 85,
      "published_date": "2025-01-15",
      "content_type": "tutorial",
      "metadata": {
        "author": "John Doe",
        "word_count": 1234,
        "has_code": true,
        "has_video": false,
        "has_images": true,
        "image_count": 5
      }
    }
  ],
  "metadata": {
    "domains_searched": 16,
    "search_time_ms": 1234
  }
}

Result Fields:

  • title: Page title
  • url: Full URL to the resource
  • snippet: Enhanced snippet with bold highlighting for matched keywords
  • domain: Source domain (e.g., "docs.nvidia.com")
  • relevance_score: Relevance score from 0-100 based on keyword matches
  • published_date (optional): Publication date in YYYY-MM-DD format, extracted from HTML metadata or page content
  • content_type: Detected content type, one of:
    • announcement: Product/feature announcements
    • tutorial: Step-by-step tutorials
    • guide: How-to guides and best practices
    • forum_discussion: Forum posts and discussions
    • blog_post: Blog articles
    • documentation: Technical documentation
    • research_paper: Research publications
    • news: News articles and press releases
    • video: Video content
    • course: Training courses
    • article: General articles (default)
  • metadata (optional): Additional extracted metadata:
    • author: Content author name (if available)
    • word_count: Approximate word count
    • has_code: Whether the page contains code snippets
    • has_video: Whether the page contains embedded videos
    • has_images: Whether the page contains images
    • image_count: Number of images on the page

Note: The published_date and metadata fields are optional and only appear when successfully extracted from the page.

discover_nvidia_content

Discover specific types of NVIDIA educational and learning content such as videos, courses, tutorials, webinars, or blog posts.

Parameters:

  • content_type (required): Type of content to find - one of:
    • video: Video tutorials and demonstrations
    • course: Training courses and certifications (NVIDIA DLI)
    • tutorial: Step-by-step guides and how-tos
    • webinar: Webinars and live sessions
    • blog: Blog posts and articles
  • topic (required): The topic or technology to find content about (e.g., "CUDA", "Omniverse", "AI")
  • max_results (optional): Maximum number of content items to return (default: 5)

Example queries:

  • Find video tutorials: discover_nvidia_content(content_type="video", topic="CUDA programming")
  • Find training courses: discover_nvidia_content(content_type="course", topic="Deep Learning")
  • Find webinars: discover_nvidia_content(content_type="webinar", topic="AI in Healthcare")

Features:

  • Content-specific search strategies optimized for each type
  • Relevance scoring on 0-100 scale to highlight best matches
    • Score displayed as "Score: X/100" for transparency
  • Direct links to videos, courses, tutorials, and other resources
  • Resource links section for easy access to all discovered content

Development

Setting up a development environment

# Clone the repository
git clone https://github.com/bharatr21/mcp-nvidia.git
cd mcp-nvidia

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode with dev dependencies
pip install -e ".[dev]"

Running tests

# Install test dependencies
pip install -e ".[test]"

# Run all tests
pytest tests/ -v

# Run tests excluding slow tests (rate limiting)
pytest tests/ -v -m "not slow"

Architecture

The server uses the Model Context Protocol (MCP) to expose search functionality to LLMs.

Search Flow

sequenceDiagram
    participant LLM as LLM/AI Agent
    participant MCP as MCP Server
    participant Validator as Input Validator
    participant RateLimit as Rate Limiter
    participant DDGS as DuckDuckGo Search
    participant Fetcher as URL Fetcher

    LLM->>MCP: search_nvidia(query, domains)
    MCP->>Validator: Validate inputs
    Validator-->>MCP: ✓ Valid (or ✗ Error)

    loop For each domain (max 5 concurrent)
        MCP->>RateLimit: Check rate limit
        RateLimit-->>MCP: ✓ Proceed (wait 200ms)
        MCP->>DDGS: Search "site:domain query"
        DDGS-->>MCP: Search results

        loop For each result
            MCP->>Fetcher: Fetch URL context
            Fetcher-->>MCP: Enhanced snippet
        end
    end

    MCP->>MCP: Calculate relevance scores
    MCP->>MCP: Sort & format JSON
    MCP-->>LLM: Structured JSON response
Loading

System Architecture

flowchart TD
    A[MCP Client<br/>Claude/LLMs] -->|JSON-RPC| B[MCP Server]
    B --> C{Tool Router}

    C -->|search_nvidia| D[Search Handler]
    C -->|discover_nvidia_content| E[Discovery Handler]

    D --> F[Security Layer]
    F --> G[Input Validator]
    F --> H[Rate Limiter]
    F --> I[Concurrency Control]

    G --> J[Domain Searcher]
    J --> K[DuckDuckGo API]

    J --> L[URL Context Fetcher]
    L --> M[BeautifulSoup Parser]

    J --> N[Relevance Scorer]
    N --> O[Response Builder]
    O --> P[Structured JSON]

    P -->|CallToolResult| A

    style F fill:#ffcccc
    style P fill:#ccffcc
    style K fill:#cce5ff
Loading

Key Components

  1. Input Validation: Validates query length, domain whitelist, and parameter limits
  2. Rate Limiting: Enforces 200ms minimum interval between DuckDuckGo API calls
  3. Concurrent Search: Searches up to 5 domains simultaneously with semaphore control
  4. Context Enhancement: Fetches actual page content and highlights relevant snippets
  5. Relevance Scoring: Calculates 0-100 scores based on keyword matches
  6. JSON Output: Returns structured data compatible with both AI agents and humans
  7. SDK Generator (New): Automatically generates TypeScript and Python SDKs from tool schemas
  8. MCP Resources: Exposes generated SDKs as virtual filesystem resources for code execution workflows

Extending Domain Coverage

The list of searchable domains is configured in src/mcp_nvidia/server.py in the DEFAULT_DOMAINS constant. To add more NVIDIA domains:

  1. Edit src/mcp_nvidia/server.py
  2. Add new domain URLs to the DEFAULT_DOMAINS list
  3. Reinstall the package

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues, questions, or contributions, please visit the GitHub repository.

About

MCP server to search across NVIDIA blogs and releases to empower LLMs to better answer NVIDIA specific queries

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •