This directory contains sample configuration files demonstrating automatic registry synchronization from different data sources.
Choose your data source:
| Source | Use Case | Config File | Sync Interval |
|---|---|---|---|
| Git | Official registries, version control | config-git.yaml | 30m |
| API | Upstream aggregation, federation | config-api.yaml | 1h |
| File | Local development, testing | config-file.yaml | 5m |
Start the server with sync:
# Git source (recommended for getting started)
thv-registry-api serve --config examples/config-git.yaml
# API source (upstream MCP registry)
thv-registry-api serve --config examples/config-api.yaml
# File source (local development)
thv-registry-api serve --config examples/config-file.yaml
Verify sync is working:
# Query the API
curl http://localhost:8080/registry/v0.1/servers | jq
# Check health
curl http://localhost:8080/health
File: config-git.yaml
Syncs from the official ToolHive Git repository.
Configuration:
registryName: toolhive
registries:
  - name: toolhive
    format: toolhive
    git:
      repository: https://github.com/stacklok/toolhive.git
      branch: main
      path: pkg/registry/data/registry.json
    syncPolicy:
      interval: "30m"
What happens when you start:
- Background sync coordinator starts immediately
- Clones https://github.com/stacklok/toolhive.git (shallow, depth=1)
- Extracts pkg/registry/data/registry.json from the main branch
- Stores synced data in the PostgreSQL database
- Repeats every 30 minutes
Best for:
- Using official ToolHive registry data
- Version-controlled registry sources
- Multi-environment deployments (use different branches)
- Pinning to specific tags/commits for stability
Options:
- Use tag: v1.0.0 instead of branch to pin to a release
- Use commit: abc123 to pin to an exact commit
- Change interval to control sync frequency
File: config-api.yaml
Syncs from another MCP Registry API endpoint (like the official upstream registry).
Configuration:
registryName: mcp-registry
registries:
  - name: mcp-upstream
    format: upstream
    api:
      endpoint: https://registry.modelcontextprotocol.io
    syncPolicy:
      interval: "1h"
What happens when you start:
- Makes HTTP GET to https://registry.modelcontextprotocol.io/registry/v0.1/servers
- Converts from upstream MCP format to ToolHive format
- Stores synced data in the PostgreSQL database
- Repeats every hour (less frequent to be respectful of external APIs)
Best for:
- Aggregating multiple registry sources
- Consuming official MCP registry data
- Creating curated/filtered subsets
- Registry federation scenarios
Format Structure:
The upstream format uses a wrapper structure with version, meta, and data sections (see examples/upstream-registry.json).
Each server follows the MCP 2025-10-17 schema with $schema, packages[], and _meta extensions for ToolHive-specific metadata (tier, status, tools, etc.).
File: config-file.yaml
Reads registry data from a local file on the filesystem.
Configuration:
registryName: toolhive
registries:
  - name: local-file
    format: toolhive
    file:
      path: ./data/registry.json
    syncPolicy:
      interval: "5m"
What happens when you start:
- Reads registry data from the specified file path
- Validates the JSON data structure
- Stores synced data in the PostgreSQL database
- Repeats every 5 minutes to detect file changes
Best for:
- Local development and testing
- Reading from mounted volumes in containers
- Using pre-generated registry files
- Quick prototyping without external dependencies
Note: The sync manager detects if the file has changed by comparing content hashes, so unchanged files won't trigger a database write.
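The hash comparison described in the note can be illustrated with a short shell sketch. It is not the sync manager's actual code: the hash algorithm (SHA-256 via GNU coreutils `sha256sum`), the `file_changed` function, and the `state_file` used to remember the previous hash are all assumptions made for illustration. The real service keeps its sync state in PostgreSQL.

```shell
# Sketch of hash-based change detection (assumes sha256sum is available).
# state_file is a hypothetical place to remember the previous hash.
file_changed() {
  data_file=$1
  state_file=$2
  # Hash of the current file content (first field of sha256sum output)
  new_hash=$(sha256sum "$data_file" | cut -d ' ' -f 1)
  # Previously recorded hash, empty on the first run
  old_hash=$(cat "$state_file" 2>/dev/null || true)
  if [ "$new_hash" = "$old_hash" ]; then
    return 1   # unchanged: nothing to write
  fi
  printf '%s\n' "$new_hash" > "$state_file"
  return 0     # changed (or first run): new hash recorded
}
```

For example, `file_changed ./data/registry.json /tmp/registry.hash && echo "would sync"` prints the message only when the file content differs from the last recorded run.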
File: config-complete.yaml
Comprehensive example showing all available configuration options with detailed comments.
Use this as a reference when you need to:
- Understand all available options
- Configure advanced filtering
- See examples of every source type
- Learn about data formats
All config files follow this structure:
# Registry name/identifier (optional, defaults to "default")
registryName: <name>
# Registries configuration (can have multiple registries)
registries:
  - name: <registry-name>
    # Data format: toolhive (native) or upstream (MCP registry format)
    format: <toolhive|upstream>
    # Source-specific config (one of: git, api, file, managed)
    git:
      repository: <url>
      branch: <name> # OR tag: <name> OR commit: <sha>
      path: <file-path>
    api:
      endpoint: <base-url>
    file:
      path: <file-path>
    managed: {} # For API-managed registries (no sync)
    # Per-registry sync policy (required except for managed registries)
    syncPolicy:
      interval: <duration> # e.g., "30m", "1h", "24h"
    # Optional: Per-registry filter
    filter:
      names:
        include: [<glob-patterns>]
        exclude: [<glob-patterns>]
      tags:
        include: [<tag-names>]
        exclude: [<tag-names>]
Choose based on your source and needs:
syncPolicy:
  # Set exactly one interval; the alternatives are commented out
  interval: "5m"    # Development/testing - very frequent
  # interval: "30m" # Git sources - balance freshness vs load
  # interval: "1h"  # API sources - respectful rate limiting
  # interval: "6h"  # Stable sources - infrequent updates
Include/exclude specific servers:
filter:
  # Name-based filtering (glob patterns)
  names:
    include:
      - "official/*" # Only official namespace
      - "myorg/*" # Your organization
    exclude:
      - "*/deprecated" # Skip deprecated
      - "*/internal" # Skip internal-only
      - "*/test" # Skip test servers
  # Tag-based filtering (exact matches)
  tags:
    include:
      - "production" # Only production-ready
      - "verified" # Only verified servers
    exclude:
      - "experimental" # Skip experimental
      - "beta" # Skip beta versions
Filter logic:
- Name include patterns (empty = include all)
- Name exclude patterns
- Tag include (empty = include all)
- Tag exclude
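The name half of this logic can be sketched in shell. This is only an illustration under stated assumptions, not the server's implementation: it assumes that a name must match at least one include pattern (unless the include list is empty) and must match no exclude pattern. The function names and the two pattern variables are hypothetical, and shell `case` globbing lets `*` match `/`, which may differ from the glob rules the server uses.

```shell
# Hypothetical pattern lists (space-separated), taken from the names filter example.
INCLUDE_PATTERNS='official/* myorg/*'
EXCLUDE_PATTERNS='*/deprecated */internal */test'

# matches_any NAME PATTERN...: succeed if NAME matches any of the glob patterns.
matches_any() {
  name=$1
  shift
  for pattern in "$@"; do
    # An unquoted $pattern in a case label is interpreted as a glob
    case $name in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

# name_allowed NAME: include check first (empty list = include all), then exclude.
name_allowed() {
  set -f   # keep the shell from expanding the patterns against local files
  if [ -n "$INCLUDE_PATTERNS" ] && ! matches_any "$1" $INCLUDE_PATTERNS; then
    set +f
    return 1
  fi
  if matches_any "$1" $EXCLUDE_PATTERNS; then
    set +f
    return 1
  fi
  set +f
  return 0
}
```

With these lists, `name_allowed "official/fetch"` succeeds, while `name_allowed "official/deprecated"` fails because the exclude pattern applies even though the name matched an include pattern.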
See the example files for format reference:
- ToolHive format (examples/toolhive-registry.json): Flat structure with servers as an object/map
- Upstream format (examples/upstream-registry.json): Wrapper structure with meta/data sections and MCP 2025-10-17 schema
registries:
  - name: my-registry
    format: toolhive # or "upstream"
    git:
      repository: https://github.com/stacklok/toolhive.git
      # Set exactly one of branch, tag, or commit
      # Option 1: Track latest on a branch
      branch: main
      # Option 2: Pin to a release tag
      # tag: v1.2.3
      # Option 3: Pin to exact commit
      # commit: abc123def456
      path: pkg/registry/data/registry.json
    syncPolicy:
      interval: "30m"
Sync status is stored in the PostgreSQL database and exposed via server logs.
Status phases:
- Syncing: Sync operation in progress
- Complete: Last sync successful
- Failed: Last sync failed (will auto-retry at next sync interval)
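These phases can also be read off the server log. The following is a minimal sketch, assuming the log contains the message strings quoted in this README ("Starting sync operation", "Sync completed successfully", "Sync failed"). The `last_sync_phase` function is a hypothetical helper, not part of the CLI, and the mapping from log line to phase is an inference rather than documented behavior.

```shell
# Read log lines on stdin and print the phase implied by the last sync message.
# Prints "Unknown" if no sync message was seen.
last_sync_phase() {
  phase=Unknown
  while IFS= read -r line; do
    case $line in
      *"Starting sync operation"*)     phase=Syncing ;;
      *"Sync completed successfully"*) phase=Complete ;;
      *"Sync failed"*)                 phase=Failed ;;
    esac
  done
  echo "$phase"
}
```

Usage, assuming logs are written to the path used elsewhere in this README: `last_sync_phase < /var/log/thv-registry-api.log`.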
Look for these log messages:
# Successful initialization
"Initializing sync manager for automatic registry synchronization"
"Loaded sync status: last sync at 2024-11-05T12:00:00Z, 42 servers"
"Starting background sync coordinator"
# Successful sync
"Starting sync operation (attempt 1)"
"Registry data fetched successfully from source"
"Sync completed successfully: 42 servers, hash=abc123de"
# Sync failures
"Sync failed: Fetch failed: ..."
Symptom: Health endpoint returns unhealthy
Solution:
- Verify --config flag is provided: thv-registry-api serve --config examples/config-git.yaml
- Check logs for "Loaded configuration from..."
- Ensure syncPolicy is defined in config
- Ensure database is configured and reachable
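While working through these checks, it can help to poll the health endpoint instead of re-running curl by hand. This is a sketch only: `wait_for_health` is a hypothetical helper, and it assumes that an unhealthy server answers with a non-2xx HTTP status (or does not answer at all), which this README does not confirm.

```shell
# Poll a health URL until curl reports success or the attempts run out.
# Arguments: URL, number of attempts (default 10), delay in seconds (default 2).
wait_for_health() {
  url=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -sS hides progress but keeps errors
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8080/health 10 2` tries up to ten times, two seconds apart, and returns a non-zero status if the endpoint never responds successfully.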
Symptom: Status shows Failed with git error
Solutions:
- Check repository URL is accessible: git ls-remote <url>
- For private repos, configure git credentials
- Verify branch/tag/commit exists
- Check network connectivity
Symptom: Status shows Failed with connection error
Solutions:
- Verify endpoint URL: curl <endpoint>/v0.1/servers
- Check network connectivity
- Look for rate limiting (increase interval)
- Verify API is MCP-compatible
All synced data is stored in PostgreSQL. Database configuration is required in the config file.
Fast updates for local development:
registries:
  - name: dev-registry
    format: toolhive
    git:
      repository: https://github.com/stacklok/toolhive.git
      branch: develop # Use dev branch
      path: pkg/registry/data/registry.json
    syncPolicy:
      interval: "1m" # Very frequent for testing
Conservative config with filtering:
registries:
  - name: prod-registry
    format: toolhive
    git:
      repository: https://github.com/your-org/registry.git
      branch: production
      path: registry.json
    syncPolicy:
      interval: "30m"
    filter:
      tags:
        include: ["production", "stable"]
        exclude: ["experimental", "deprecated"]
Run multiple instances and aggregate at the application level:
# Instance 1: Official ToolHive (port 8081)
thv-registry-api serve \
--config examples/config-git.yaml \
--address :8081 &
# Instance 2: Upstream MCP (port 8082)
thv-registry-api serve \
--config examples/config-api.yaml \
--address :8082 &
# Instance 3: Local file (port 8083)
thv-registry-api serve \
--config examples/config-file.yaml \
--address :8083 &
Note: Each instance should be configured with its own database or registry name to avoid conflicts.
# Start with Git sync
thv-registry-api serve --config examples/config-git.yaml
# Start with API sync
thv-registry-api serve --config examples/config-api.yaml
# Start with File sync (local development)
thv-registry-api serve --config examples/config-file.yaml
# Start with custom address
thv-registry-api serve --config examples/config-git.yaml --address :9090
# Test API endpoint
curl http://localhost:8080/registry/v0.1/servers | jq
# Check health
curl http://localhost:8080/health
# Watch logs
tail -f /var/log/thv-registry-api.log | grep -i sync
# Note: Manual sync triggering is not currently supported
# Sync happens automatically based on configured intervals
- Main README - Full project documentation
- Architecture - System design
- API Reference - REST endpoints
- CLAUDE.md - Development guide