Universal MCP Server for dbt Project Analysis - Works with Any GitHub Repository
A production-ready Model Context Protocol (MCP) server that provides comprehensive dbt project quality assessment for any GitHub repository. Powered by GitHub App authentication for secure, scalable access to public and private repositories. Purpose-built for AI agents and modern data workflows.
Data Product Hub transforms any dbt project on GitHub into an agent-accessible data quality platform that:
- Analyzes ANY GitHub dbt repository with AI-powered suggestions and best practices
- Works with public and private repos via secure GitHub App authentication
- Supports subdirectory dbt projects (detects dbt/, transform/, analytics/ folders)
- Checks metadata coverage across your entire data product portfolio
- Maps data lineage and dependency relationships
- Integrates with Git for enhanced context and change analysis
- Exposes MCP tools for seamless AI agent integration
- Deploys anywhere - FastMCP Cloud (recommended), Docker, Kubernetes
- `analyze_dbt_model(model_name, repo_url)` - Basic dbt model analysis
- `analyze_dbt_model_with_ai(model_name, repo_url)` - NEW: AI-powered analysis with user's OpenAI key
- `check_metadata_coverage(repo_url)` - Project-wide metadata assessment
- `get_project_lineage(repo_url)` - Data dependency mapping
- `assess_data_product_quality(model_name, repo_url)` - Comprehensive quality scoring
- `validate_github_repository(repo_url)` - Validate repo access and dbt structure
- `analyze_dbt_model_with_git_context(model_name, repo_url)` - dbt analysis + Git history
- `get_composite_server_status()` - Server capabilities and GitHub integration status
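For example, an agent can validate repository access first and then request a full quality assessment. This is a minimal sketch using the FastMCP Python client; the repository URL and model name are placeholders:

```python
import asyncio

from fastmcp import Client


async def main():
    client = Client("https://data-product-hub.fastmcp.app/mcp")
    async with client:
        # Confirm the repository is accessible and contains a dbt project
        validation = await client.call_tool(
            "validate_github_repository",
            {"repo_url": "https://github.com/company/analytics-dbt"},
        )

        # Then request a comprehensive quality score for a specific model
        quality = await client.call_tool(
            "assess_data_product_quality",
            {
                "model_name": "customer_summary",
                "repo_url": "https://github.com/company/analytics-dbt",
            },
        )
        print(validation, quality)


asyncio.run(main())
```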
- Local CLI - `dph -f ./project`
- Hostable MCP Server - `dph serve --mcp-host 0.0.0.0`
- Container Deployment - Docker + Kubernetes + Helm charts
- FastMCP Cloud - One-click cloud deployment
- Compatible with Claude Code, Cursor, and any MCP-enabled AI agent
- JSON-first output for automation and CI/CD pipelines
- Structured responses for programmatic consumption
1. Install the GitHub App on your dbt repositories:
- Visit: https://github.com/apps/data-product-hub/installations/new
- Select repositories containing dbt projects
- Grant read permissions
2. (Optional) Enable AI features by adding your OpenAI API key:
- Go to Repository Settings → Environments
- Create or use any of these environment names: `production`, `prod`, `data-analysis`, `main`, or `staging`
- Add `OPENAI_API_KEY` as an Environment Secret
- Set the value to your OpenAI API key (`sk-proj-...`)
- This enables the `analyze_dbt_model_with_ai` tool
- Note: All other tools work without an API key - only AI-powered analysis requires it
3. Use via Claude Desktop:
// Add to ~/.claude_desktop_config.json
{
"mcpServers": {
"data-product-hub": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-fetch", "https://data-product-hub.fastmcp.app/mcp"]
}
}
}
4. Ask Claude to analyze any dbt repository:
"Analyze the customer_metrics model in https://github.com/company/analytics-dbt"
"Get AI-powered suggestions for the user_events model in github.com/company/dbt-models"
"Check metadata coverage for github.com/myorg/data-warehouse"
"Get project lineage for github.com/startup/dbt-models"
# Install package
pip install data-product-hub
# CLI analysis
dph -f ./my-dbt-project --metadata-only
# Start local MCP server
dph --mcp-server -f ./my-dbt-project
import asyncio

from fastmcp import Client


async def main():
    # Connect to the universal MCP server
    client = Client("https://data-product-hub.fastmcp.app/mcp")

    async with client:
        # Basic analysis of any GitHub repository
        analysis = await client.call_tool(
            "analyze_dbt_model",
            {
                "model_name": "customer_summary",
                "repo_url": "https://github.com/company/analytics-dbt"
            }
        )

        # AI-powered analysis (requires OpenAI API key in environment secrets)
        ai_analysis = await client.call_tool(
            "analyze_dbt_model_with_ai",
            {
                "model_name": "customer_summary",
                "repo_url": "https://github.com/company/analytics-dbt"
            }
        )

        # Check metadata coverage across any project
        coverage = await client.call_tool(
            "check_metadata_coverage",
            {"repo_url": "github.com/myorg/data-warehouse"}
        )


asyncio.run(main())
Ready to use immediately:
- MCP Server: https://data-product-hub.fastmcp.app/mcp
- GitHub App: https://github.com/apps/data-product-hub/installations/new
Quick Setup:
- Install the GitHub App on your dbt repositories
- Add the MCP server to Claude Desktop configuration
- Start analyzing any dbt repository via Claude
For organizations wanting their own instance:
Prerequisites:
- Fork this repository
- Create your own GitHub App with read permissions
- Get GitHub App ID and base64-encoded private key
Deployment:
- Deploy to FastMCP Cloud with entry point: `server.py`
- Set your GitHub App credentials as environment variables
- Share your GitHub App installation URL with users
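The GitHub App private key is supplied base64-encoded. A small script can produce the value; note that the key file name and the environment variable names below are illustrative assumptions, not fixed by this project:

```python
import base64
from pathlib import Path

# Path to the .pem private key downloaded from your GitHub App settings (example name)
pem = Path("data-product-hub.private-key.pem").read_bytes()

# Base64-encode the key so it fits in a single-line environment variable
encoded = base64.b64encode(pem).decode("ascii")

# The variable names below are illustrative; use the names your deployment expects
app_id = "123456"  # your GitHub App ID
print(f"GITHUB_APP_ID={app_id}")
print(f"GITHUB_APP_PRIVATE_KEY_BASE64={encoded}")
```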
Complete Deployment Guide
# Using Docker Compose
docker-compose up
# Custom container
docker run -p 8080:8080 \
-v ./my-dbt-project:/dbt-project \
data-product-hub:latest
# Deploy with Helm
helm install data-product-hub ./charts/data-product-hub \
--set persistence.hostPath="/path/to/dbt-project" \
--set dbtAi.database="snowflake"
The Data Product Hub MCP server is ready to use - no configuration required for end users! Just install the GitHub App and start analyzing.
# Database configuration (local CLI only)
DATABASE=snowflake # snowflake, postgres, redshift, bigquery
# OpenAI API (optional - for AI features in local CLI)
OPENAI_API_KEY=your-openai-api-key
DBT_AI_BASIC_MODEL=gpt-4o-mini
DBT_AI_ADVANCED_MODEL=gpt-4o
- Snowflake (default)
- PostgreSQL
- Amazon Redshift
- Google BigQuery
Data Product Hub implements a composite MCP architecture:
Your Data Product Hub Server
├── Core dbt Analysis
├── Git Integration (via Git MCP server)
├── Future: Monte Carlo Integration
├── Future: DataHub Integration
└── Future: Snowflake Performance Integration
This allows AI agents to get comprehensive data product insights from a single MCP endpoint.
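For example, a single client session can check which integrations the composite server exposes and then run a Git-aware model analysis through the same endpoint. A minimal sketch (repository URL and model name are placeholders):

```python
import asyncio

from fastmcp import Client


async def main():
    client = Client("https://data-product-hub.fastmcp.app/mcp")
    async with client:
        # Report the composite server's capabilities and GitHub integration status
        status = await client.call_tool("get_composite_server_status", {})

        # Combine dbt analysis with Git history from the same endpoint
        enriched = await client.call_tool(
            "analyze_dbt_model_with_git_context",
            {
                "model_name": "customer_summary",
                "repo_url": "https://github.com/company/analytics-dbt",
            },
        )
        print(status, enriched)


asyncio.run(main())
```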
- Automated quality checks in CI/CD pipelines
- Documentation coverage monitoring
- Lineage analysis for impact assessment
- Agent-driven data workflows
- Data product understanding before making changes
- Quality assessment as part of automated reviews
- Context-aware suggestions with Git history
- Comprehensive data product insights
- Centralized data quality hub
- Production-ready MCP server deployment
- Multi-tool integration platform
- Kubernetes-native scaling
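As an illustration of the CI/CD use case above, a pipeline step could call `check_metadata_coverage` and fail the build when coverage drops below a threshold. This is a hedged sketch: the repository URL, the threshold, the `coverage_percentage` field name, and the way the payload is extracted from the client result are all assumptions to adapt to your setup and fastmcp version.

```python
import asyncio
import json
import sys

from fastmcp import Client

REPO_URL = "https://github.com/myorg/data-warehouse"  # repository under test (example)
MIN_COVERAGE = 80  # minimum acceptable metadata coverage, in percent (example threshold)


async def main() -> int:
    client = Client("https://data-product-hub.fastmcp.app/mcp")
    async with client:
        result = await client.call_tool(
            "check_metadata_coverage",
            {"repo_url": REPO_URL},
        )

    # Extract the JSON payload. Recent fastmcp clients return an object with a
    # .content list of text blocks; older versions return the list directly.
    blocks = getattr(result, "content", result)
    report = json.loads(blocks[0].text)

    # "coverage_percentage" is an assumed field name for illustration; inspect a
    # real response from your deployment before relying on it.
    coverage = report.get("coverage_percentage", 0)
    if coverage < MIN_COVERAGE:
        print(f"Metadata coverage {coverage}% is below the required {MIN_COVERAGE}%")
        return 1

    print(f"Metadata coverage {coverage}% meets the required {MIN_COVERAGE}%")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
```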
If you're upgrading from the legacy `dbt-ai` package:
# Old command
dbt-ai -f ./project --metadata-only
# New command (identical functionality) - use the short dph command!
dph -f ./project --metadata-only
All CLI functionality is 100% backwards compatible.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Data Product Hub - Transforming dbt projects into agent-accessible data quality platforms.