Deriva MCP Server

A Model Context Protocol (MCP) server that exposes Deriva catalog operations and DerivaML workflow tools to LLM applications.

Overview

This MCP server provides an interface to Deriva catalogs and DerivaML, enabling AI assistants like Claude to:

Connect to and manage Deriva catalogs
Create and manage datasets with versioning
Work with controlled vocabularies
Define and execute ML workflows
Create and manage features for ML experiments

For full ML workflow management, this server is designed to work alongside the GitHub MCP Server to enable:

Storing and versioning hydra-zen configurations in GitHub repositories
Managing workflow code and model implementations
Collaborative development of ML experiments

Prerequisites

Deriva Authentication

DerivaML uses Globus for authentication. Before using the MCP server, you must authenticate with your Deriva server:

# Install deriva-ml if not already installed
pip install deriva-ml

# Authenticate with your Deriva server
python -c "from deriva_ml import DerivaML; DerivaML.globus_login('your-server.org')"

This opens a browser window for Globus authentication. Credentials are cached locally and persist across sessions.

Alternatively, use the Deriva Auth Agent for browser-based authentication:

Install the Deriva Auth Agent from deriva-py
Run deriva-globus-auth-utils login --host your-server.org

GitHub Authentication (for configuration management)

Create a GitHub Personal Access Token (PAT) for the GitHub MCP Server:

1. Go to GitHub Settings > Personal Access Tokens
2. Create a fine-grained token with these permissions:
   - Repository access: Select repositories containing your ML configurations
   - Permissions:
     - Contents: Read and write (for pushing configs)
     - Pull requests: Read and write (optional, for PR workflows)
     - Issues: Read (optional, for tracking)
3. Copy the token securely; you'll need it for configuration

Installation

Using Docker (Recommended)

Docker provides the simplest setup with no Python environment management. The image is automatically built and published to GitHub Container Registry on every commit to main.

# Pull the latest image from GitHub Container Registry
docker pull ghcr.io/informatics-isi-edu/deriva-mcp:latest

# Run the server (for testing)
docker run --rm -it ghcr.io/informatics-isi-edu/deriva-mcp:latest --help

To build locally instead:

git clone https://github.com/informatics-isi-edu/deriva-mcp.git
cd deriva-mcp
./scripts/docker-build.sh

Using uv

uv pip install deriva-mcp

Using pip

pip install deriva-mcp

From source

git clone https://github.com/informatics-isi-edu/deriva-mcp.git
cd deriva-mcp
uv sync

Configuration

Claude Desktop - Full Setup with GitHub Integration

For the complete ML workflow experience, configure both DerivaML and GitHub MCP servers together.

Configuration file locations:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
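
The locations above can be resolved programmatically, for example in a setup script. The following is a minimal sketch, not part of deriva-mcp; the function name `claude_desktop_config_path` is hypothetical, and the platform names follow what Python's `platform.system()` returns.

```python
from pathlib import PurePosixPath, PureWindowsPath


def claude_desktop_config_path(system: str, home: str, appdata: str = "") -> str:
    """Return the Claude Desktop config path for a platform.

    `system` is "Darwin", "Windows", or "Linux" (as from platform.system()).
    `home` is the user's home directory; `appdata` is %APPDATA% on Windows.
    """
    filename = "claude_desktop_config.json"
    if system == "Darwin":
        base = PurePosixPath(home) / "Library" / "Application Support" / "Claude"
        return str(base / filename)
    if system == "Windows":
        return str(PureWindowsPath(appdata) / "Claude" / filename)
    if system == "Linux":
        return str(PurePosixPath(home) / ".config" / "Claude" / filename)
    raise ValueError(f"unsupported platform: {system}")
```

Passing the platform and directories as arguments, rather than reading them inside the function, keeps the lookup easy to check on any machine.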

Option 1: Both Servers with Docker (Recommended)

Runs both MCP servers in Docker, which gives the most consistent setup:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "/bin/sh",
      "args": [
        "-c",
        "docker run -i --rm --add-host localhost:host-gateway -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml ghcr.io/informatics-isi-edu/deriva-mcp:latest"
      ],
      "env": {}
    },
    "github": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
      }
    }
  }
}

Docker arguments explained:

--add-host localhost:host-gateway - Allows connecting to a Deriva server running on localhost
-e HOME=$HOME - Passes your home directory path into the container so mounted paths are found correctly

For localhost with self-signed certificates, the image defaults to using ~/.deriva/allCAbundle-with-local.pem as the CA bundle. See Troubleshooting for how to create this file.

Volume mounts explained:

$HOME/.deriva:$HOME/.deriva:ro - Mounts your Deriva credentials (read-only)
$HOME/.bdbag:$HOME/.bdbag - Mounts bdbag keychain for dataset download authentication (writable)
$HOME/.deriva-ml:$HOME/.deriva-ml - Working directory for execution outputs (writable)

Note: Create the workspace directory before first use:

mkdir -p ~/.deriva-ml

If the directory doesn't exist, Docker creates it as root, causing permission issues.
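
To see how the flags and mounts fit together, the sketch below assembles the same `docker run` command from its parts and creates the writable mount directories first. It is an illustration only: the helper names `prepare_mounts` and `docker_command` are hypothetical, and creating `~/.bdbag` alongside `~/.deriva-ml` is an assumption based on both being writable bind mounts.

```python
import shlex
from pathlib import Path

IMAGE = "ghcr.io/informatics-isi-edu/deriva-mcp:latest"

# (directory under the home directory, mounted read-only?)
MOUNTS = [(".deriva", True), (".bdbag", False), (".deriva-ml", False)]


def prepare_mounts(home: Path) -> None:
    """Create writable mount directories so Docker does not create them as root."""
    for name, read_only in MOUNTS:
        if not read_only:
            (home / name).mkdir(parents=True, exist_ok=True)


def docker_command(home: Path) -> str:
    """Build the docker run command line used in the configuration above."""
    parts = [
        "docker", "run", "-i", "--rm",
        "--add-host", "localhost:host-gateway",
        "-e", f"HOME={home}",
    ]
    for name, read_only in MOUNTS:
        # Mount each directory at the same path inside the container.
        spec = f"{home / name}:{home / name}" + (":ro" if read_only else "")
        parts += ["-v", spec]
    parts.append(IMAGE)
    return shlex.join(parts)
```

Each directory is mounted at the same path inside the container as on the host, which is why `HOME` is passed through with `-e`.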

Option 2: Direct Install with GitHub Remote

Uses the pip-installed Deriva MCP server together with a GitHub MCP server launched through npx:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "deriva-mcp",
      "env": {}
    },
    "github": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
      }
    }
  }
}

Option 3: From Source (Development)

For development or customization:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/deriva-mcp",
        "run",
        "deriva-mcp"
      ],
      "env": {}
    },
    "github": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
      }
    }
  }
}

Option 4: DerivaML Only (No GitHub)

If you don't need GitHub integration:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "/bin/sh",
      "args": [
        "-c",
        "docker run -i --rm --add-host localhost:host-gateway -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml ghcr.io/informatics-isi-edu/deriva-mcp:latest"
      ],
      "env": {}
    }
  }
}

Or with direct install:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "deriva-mcp",
      "env": {}
    }
  }
}

Claude Code

Add to ~/.mcp.json (global) or your project's .mcp.json file:

With Docker:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "/bin/sh",
      "args": [
        "-c",
        "docker run -i --rm --add-host localhost:host-gateway -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml ghcr.io/informatics-isi-edu/deriva-mcp:latest"
      ],
      "env": {}
    },
    "github": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}"
      }
    }
  }
}

With direct install:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "deriva-mcp",
      "env": {}
    },
    "github": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
      }
    }
  }
}

Then enable in .claude/settings.local.json:

{
  "enableAllProjectMcpServers": true,
  "enabledMcpjsonServers": ["deriva-ml", "github"]
}

VS Code with Continue or Cline

Add to your MCP configuration (typically .vscode/mcp.json):

{
  "mcp": {
    "servers": {
      "deriva-ml": {
        "type": "stdio",
        "command": "/bin/sh",
        "args": [
          "-c",
          "docker run -i --rm --add-host localhost:host-gateway -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml ghcr.io/informatics-isi-edu/deriva-mcp:latest"
        ],
        "env": {}
      },
      "github": {
        "type": "stdio",
        "command": "docker",
        "args": [
          "run", "-i", "--rm",
          "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
          "ghcr.io/github/github-mcp-server"
        ],
        "env": {
          "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
        }
      }
    }
  }
}

Environment Variables

For security, store tokens in environment variables instead of config files:

# Add to ~/.bashrc, ~/.zshrc, or equivalent
export GITHUB_PERSONAL_ACCESS_TOKEN="ghp_your_token_here"

Then reference in config:

{
  "mcpServers": {
    "github": {
      "type": "stdio",
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "GITHUB_PERSONAL_ACCESS_TOKEN", "ghcr.io/github/github-mcp-server"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}"
      }
    }
  }
}
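
A quick way to confirm that no token is left in a config file is to scan the `env` blocks for values that look like literal GitHub tokens. The sketch below is a hypothetical helper, not part of either server; it assumes the `mcpServers` layout shown above and relies on GitHub's token prefixes (`ghp_` for classic tokens, `github_pat_` for fine-grained ones).

```python
import json
import re

# Classic PATs start with "ghp_"; fine-grained PATs start with "github_pat_".
TOKEN_PATTERN = re.compile(r"^(ghp_|github_pat_)")


def find_hardcoded_tokens(config_text: str) -> list[str]:
    """Return "server.VARIABLE" for each env value that looks like a literal token."""
    config = json.loads(config_text)
    findings = []
    for server, spec in config.get("mcpServers", {}).items():
        for variable, value in spec.get("env", {}).items():
            if isinstance(value, str) and TOKEN_PATTERN.match(value):
                findings.append(f"{server}.{variable}")
    return findings
```

A value such as `${GITHUB_PERSONAL_ACCESS_TOKEN}` is a reference rather than a token, so it is not reported.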

Verifying Your Setup

After configuration, verify both servers are working:

User: What MCP servers are available?

Claude: I have access to two MCP servers:
1. deriva-ml - For managing ML workflows in Deriva catalogs
2. github - For managing GitHub repositories and configurations

User: Connect to the deriva catalog at example.org with ID 42

Claude: [Uses connect_catalog tool]
Connected to example.org, catalog 42. The domain schema is 'my_project'.

User: List the hydra-zen configs in the my-ml-project repo

Claude: [Uses GitHub get_file_contents tool]
Found configuration files in configs/:
- deriva.py - DerivaML connection settings
- datasets.py - Dataset specifications
- model.py - Model hyperparameters

Available Tools

Catalog Management

Tool	Description
`connect_catalog`	Connect to a DerivaML catalog
`disconnect_catalog`	Disconnect from the active catalog
`list_connections`	List all active connections
`set_active_catalog`	Set which connection is active
`get_catalog_info`	Get information about the active catalog
`list_users`	List users with catalog access
`get_chaise_url`	Get web interface URL for a table
`resolve_rid`	Find which table a RID belongs to
`list_catalog_registry`	List all catalogs and aliases on a server
`create_catalog`	Create a new DerivaML catalog (with optional alias)
`delete_catalog`	Permanently delete a catalog
`clone_catalog`	Clone a catalog to create a copy

Catalog Alias Management

Tool	Description
`create_catalog_alias`	Create an alias for a catalog
`get_catalog_alias`	Get alias metadata (target, owner)
`update_catalog_alias`	Update alias target or owner
`delete_catalog_alias`	Delete an alias (catalog not affected)

Dataset Management

Tool	Description
`find_datasets`	Find all datasets in the catalog
`lookup_dataset`	Look up detailed information about a dataset
`create_dataset`	Create a new dataset
`list_dataset_members`	List members of a dataset
`add_dataset_members`	Add members to a dataset
`get_dataset_version_history`	Get version history
`increment_dataset_version`	Update dataset version
`delete_dataset`	Delete a dataset
`list_dataset_element_types`	List valid element types
`add_dataset_element_type`	Enable a table as element type

Vocabulary Management

Tool	Description
`list_vocabularies`	List all vocabulary tables
`list_vocabulary_terms`	List terms in a vocabulary
`lookup_term`	Find a term by name or synonym
`add_term`	Add a term to a vocabulary
`create_vocabulary`	Create a new vocabulary table

Workflow Management

Tool	Description
`find_workflows`	Find all workflows
`lookup_workflow`	Find a workflow by URL/checksum
`create_workflow`	Create and register a workflow
`list_workflow_types`	List available workflow types
`add_workflow_type`	Add a new workflow type

Feature Management

Tool	Description
`find_features`	Find features for a table
`lookup_feature`	Get feature details
`list_feature_values`	Get all values for a feature
`create_feature`	Create a feature definition
`delete_feature`	Delete a feature
`list_feature_names`	List all feature names

Schema Management

Tool	Description
`create_table`	Create a new table in the domain schema
`create_asset_table`	Create an asset table for file management
`list_assets`	List all assets in an asset table
`list_tables`	List all tables in the domain schema
`get_table_schema`	Get column and key definitions for a table
`list_asset_types`	List available asset type terms
`add_asset_type`	Add a new asset type to the vocabulary

Execution Management

Tool	Description
`create_execution`	Create a new execution for ML workflows
`start_execution`	Start the active execution
`stop_execution`	Stop and complete the active execution
`update_execution_status`	Update execution status and message
`get_execution_info`	Get details about the active execution
`restore_execution`	Restore a previous execution by RID
`asset_file_path`	Register a file for upload as an execution output
`upload_execution_outputs`	Upload all registered outputs to the catalog
`list_executions`	List recent executions
`create_execution_dataset`	Create a dataset within an execution
`download_execution_dataset`	Download a dataset for processing
`get_execution_working_dir`	Get the working directory path

Execution Workflow

In Python, the typical execution workflow uses a context manager:

with execution.execute() as exe:
    # Do your work here
    exe.asset_file_path(asset_name="Image", file_name="output.png")
    # ... more processing ...

# After context exits, upload outputs
execution.upload_execution_outputs()

Using MCP tools, the equivalent workflow is:

1. create_execution() - Create the execution record with workflow info
2. start_execution() - Mark execution as running, begin timing
3. asset_file_path() - Register output files (repeat as needed)
4. stop_execution() - Mark execution as complete
5. upload_execution_outputs() - Upload all registered files to the catalog (required)

Important: You must call upload_execution_outputs() after completing your work to upload any registered assets to the catalog. This is not automatic.
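
The ordering of these calls can be expressed as a small checker that reports a missing or misplaced step, such as a forgotten upload. This is an illustrative sketch only; `check_execution_calls` is a hypothetical helper, not a tool provided by the server, and the rule that `asset_file_path` belongs between start and stop is an assumption taken from the order listed above.

```python
REQUIRED_ORDER = [
    "create_execution",
    "start_execution",
    "stop_execution",
    "upload_execution_outputs",
]


def check_execution_calls(calls: list[str]) -> list[str]:
    """Return a list of problems found in a sequence of tool names."""
    problems = []
    position = 0  # index in REQUIRED_ORDER of the next lifecycle step expected
    for name in calls:
        if name == "asset_file_path":
            # Assumed rule: files are registered after start and before stop.
            if position != 2:
                problems.append("asset_file_path called outside start/stop")
            continue
        if name in REQUIRED_ORDER:
            if position < len(REQUIRED_ORDER) and name == REQUIRED_ORDER[position]:
                position += 1
            else:
                problems.append(f"{name} called out of order")
    if position < len(REQUIRED_ORDER):
        problems.append(f"missing {REQUIRED_ORDER[position]}")
    return problems
```

A sequence that stops after `stop_execution` yields `["missing upload_execution_outputs"]`, which is the mistake the note above warns about.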

Available Resources

MCP resources provide read-only access to catalog information and configuration templates.

Static Resources - Configuration Templates

These resources provide code templates for configuring DerivaML with hydra-zen:

Resource URI	Description
`deriva-ml://config/deriva-ml-template`	Hydra-zen configuration template for DerivaML connection
`deriva-ml://config/dataset-spec-template`	Configuration template for dataset specifications
`deriva-ml://config/execution-template`	Configuration template for ML executions
`deriva-ml://config/model-template`	Configuration template for ML models with zen_partial

Dynamic Resources - Catalog Information

These resources return current catalog state (requires active connection):

Resource URI	Description
`deriva-ml://catalog/schema`	Current catalog schema structure in JSON
`deriva-ml://catalog/vocabularies`	All vocabulary tables and their terms
`deriva-ml://catalog/datasets`	All datasets in the current catalog
`deriva-ml://catalog/workflows`	All registered workflows
`deriva-ml://catalog/features`	All feature names defined in the catalog

Template Resources - Parameterized

These resources accept parameters to return specific information:

Resource URI	Description
`deriva-ml://dataset/{dataset_rid}`	Detailed information about a specific dataset
`deriva-ml://table/{table_name}/features`	Features defined for a specific table
`deriva-ml://vocabulary/{vocab_name}`	Terms in a specific vocabulary table
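
A client fills in the braces to obtain a concrete URI. The sketch below shows one way to do that; the `resource_uri` helper and the short template keys are hypothetical, and percent-encoding the parameter values is an assumption about what the server accepts rather than documented behavior.

```python
from urllib.parse import quote

TEMPLATES = {
    "dataset": "deriva-ml://dataset/{dataset_rid}",
    "table_features": "deriva-ml://table/{table_name}/features",
    "vocabulary": "deriva-ml://vocabulary/{vocab_name}",
}


def resource_uri(kind: str, **params: str) -> str:
    """Fill one of the templates, percent-encoding each parameter value."""
    encoded = {key: quote(value, safe="") for key, value in params.items()}
    return TEMPLATES[kind].format(**encoded)
```

For example, `resource_uri("dataset", dataset_rid="1-ABC")` produces `deriva-ml://dataset/1-ABC`.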

Documentation Resources

Documentation is fetched on demand from GitHub repositories and cached for one hour:

Resource URI	Description
`deriva-ml://docs/overview`	DerivaML overview and architecture
`deriva-ml://docs/datasets`	Guide to creating and managing datasets
`deriva-ml://docs/features`	Guide to defining and using features
`deriva-ml://docs/execution-configuration`	Guide to configuring ML executions
`deriva-ml://docs/hydra-zen`	Guide to hydra-zen configuration
`deriva-ml://docs/file-assets`	Guide to managing file assets
`deriva-ml://docs/notebooks`	Guide to Jupyter notebook integration
`deriva-ml://docs/identifiers`	Guide to RIDs, MINIDs, and identifiers
`deriva-ml://docs/install`	Installation instructions
`deriva-ml://docs/ermrest/*`	ERMrest API documentation
`deriva-ml://docs/chaise/*`	Chaise UI documentation
`deriva-ml://docs/deriva-py/*`	Deriva Python SDK documentation
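
One-hour caching of fetched documents can be pictured as a time-to-live cache. The sketch below is a generic illustration of that technique under stated assumptions; it is not the server's implementation, and the `TTLCache` class and its parameters are hypothetical.

```python
import time
from typing import Callable


class TTLCache:
    """Cache fetched documents for a fixed number of seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: reuse the cached text
        text = self._fetch(uri)  # missing or expired: fetch again
        self._entries[uri] = (now, text)
        return text
```

Injecting the clock makes the expiry behavior easy to exercise without waiting an hour.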

Using Resources

Resources are read rather than called like tools: they return static or semi-static data and have no side effects:

User: Show me the DerivaML configuration template

Claude: [Reads deriva-ml://config/deriva-ml-template resource]
Here's a hydra-zen configuration template for DerivaML...

User: What datasets are in the catalog?

Claude: [Reads deriva-ml://catalog/datasets resource]
Found the following datasets in your catalog...

Usage Examples

Discovering and Connecting to Catalogs

User: What catalogs are available on example.org?

Claude: [Uses list_catalog_registry tool]
Found 3 catalogs on example.org:
- ID: 21, Name: "ML Project", Persistent: true
- ID: 45, Name: "Test Environment", Persistent: true
- ID: 50, Name: "Clone of ML Project", Persistent: true

And 2 aliases:
- "ml-project" -> catalog 21
- "test-env" -> catalog 45

User: Connect to ml-project

Claude: [Uses connect_catalog tool with catalog_id="ml-project"]
Connected to example.org using alias "ml-project" (catalog 21).
The domain schema is 'my_domain'.

Cloning and Managing Catalogs

User: Clone the production catalog to create a test environment

Claude: [Uses clone_catalog tool]
Cloned catalog 21 to new catalog 52 with all data and policies.

User: Create an alias "dev-test" for the new catalog

Claude: [Uses create_catalog_alias tool]
Created alias "dev-test" pointing to catalog 52.
You can now connect using: connect_catalog("example.org", "dev-test")

Basic Catalog Operations

User: Connect to the deriva catalog at example.org with ID 123

Claude: I'll connect to that catalog for you.
[Uses connect_catalog tool]
Connected to example.org, catalog 123. The domain schema is 'my_project'.

User: What datasets are available?

Claude: Let me check what datasets exist.
[Uses find_datasets tool]
Found 5 datasets:
1. Training Images (v1.2.0) - 1500 images for model training
2. Validation Set (v1.0.0) - 300 images for validation
...

Configuration Workflow with GitHub

User: Create a new hydra-zen config for training on the cell-images dataset

Claude: I'll create a configuration file for you.
[Reads deriva-ml://config/model-template resource for template]
[Uses GitHub create_or_update_file to save configs/cell_training.py]

Created configs/cell_training.py with:
- DerivaML connection to example.org
- Dataset spec for cell-images v2.0.0
- Model config with learning_rate=1e-3, epochs=50

User: Commit and push that config

Claude: [Uses GitHub create_or_update_file with commit message]
Committed and pushed to main branch.

Hydra-zen Configuration

DerivaML integrates with hydra-zen for configuration management, enabling reproducible ML workflows with structured configuration.

Basic Configuration

from hydra_zen import builds, instantiate
from deriva_ml import DerivaML
from deriva_ml.core.config import DerivaMLConfig

# Create a structured config using hydra-zen
DerivaMLConf = builds(DerivaMLConfig, populate_full_signature=True)

# Configure for your environment
conf = DerivaMLConf(
    hostname='deriva.example.org',
    catalog_id='42',
    domain_schema='my_domain',
)

# Instantiate to get a DerivaMLConfig object, then create DerivaML
config = instantiate(conf)
ml = DerivaML.instantiate(config)

Working Directory Configuration

DerivaML automatically configures Hydra's output directory based on your working_dir setting:

conf = DerivaMLConf(
    hostname='deriva.example.org',
    working_dir='/shared/ml_workspace',  # Custom working directory
)

Hydra outputs will be organized under: {working_dir}/{username}/deriva-ml/hydra/{timestamp}/
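
As a worked example of that layout, the sketch below builds the directory for a given user and time. The helper name `hydra_output_dir` is hypothetical and the timestamp format is an assumption; the format DerivaML actually uses may differ.

```python
from datetime import datetime
from pathlib import PurePosixPath


def hydra_output_dir(working_dir: str, username: str, when: datetime) -> str:
    """Build {working_dir}/{username}/deriva-ml/hydra/{timestamp}."""
    # Assumed timestamp format, for illustration only.
    timestamp = when.strftime("%Y-%m-%d_%H-%M-%S")
    path = PurePosixPath(working_dir) / username / "deriva-ml" / "hydra" / timestamp
    return str(path)
```

With `working_dir='/shared/ml_workspace'`, each user's runs land in a separate subtree, so a shared workspace does not mix outputs from different people.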

Configuration Composition

Create environment-specific configurations using hydra-zen's store:

from hydra_zen import store

# Development configuration
store(DerivaMLConf(
    hostname='dev.example.org',
    catalog_id='1',
), name='dev')

# Production configuration
store(DerivaMLConf(
    hostname='prod.example.org',
    catalog_id='100',
), name='prod')

Dataset Specification Configuration

Use DatasetSpecConfig to declare which datasets an execution needs and whether their asset files should be downloaded:

from deriva_ml.dataset import DatasetSpecConfig

# Create dataset specs (hydra-zen compatible)
training_data = DatasetSpecConfig(
    rid="1ABC",
    version="1.0.0",
    materialize=True,       # Download asset files
    description="Training images"
)

metadata_only = DatasetSpecConfig(
    rid="2DEF",
    version="2.0.0",
    materialize=False,      # Only download table data
)

# Use in hydra-zen store
from hydra_zen import store
datasets_store = store(group="datasets")
datasets_store([training_data], name="training")
datasets_store([metadata_only], name="metadata_only")

Asset Configuration

Use AssetRIDConfig for input assets (model weights, config files):

from hydra_zen import store
from deriva_ml.execution import AssetRIDConfig

# Define input assets
model_weights = AssetRIDConfig(rid="WXYZ", description="Pretrained model")
config_file = AssetRIDConfig(rid="ABCD", description="Hyperparameters")

# Store asset collections
assets_store = store(group="assets")
assets_store([model_weights, config_file], name="default_assets")

Execution Configuration

Configure ML executions with ExecutionConfiguration:

from hydra_zen import builds, instantiate
from deriva_ml.execution import ExecutionConfiguration
from deriva_ml.dataset import DatasetSpecConfig

# Build execution config
ExecConf = builds(ExecutionConfiguration, populate_full_signature=True)

# Configure execution with datasets and assets
conf = ExecConf(
    description="Training run",
    datasets=[
        DatasetSpecConfig(rid="1ABC", version="1.0.0", materialize=True),
    ],
    assets=["WXYZ", "ABCD"],  # Asset RIDs
)

exec_config = instantiate(conf)

Configuration Summary

Class	Module	Purpose
`DerivaMLConfig`	`deriva_ml.core.config`	Main DerivaML connection config
`DatasetSpecConfig`	`deriva_ml.dataset`	Dataset specification for executions
`AssetRIDConfig`	`deriva_ml.execution`	Input asset specification
`ExecutionConfiguration`	`deriva_ml.execution`	Full execution configuration
`Workflow`	`deriva_ml.execution`	Workflow definition

See the DerivaML Hydra-zen Guide for complete documentation.

Troubleshooting

Docker with Localhost Deriva Server

When running the MCP server in Docker and connecting to a Deriva server on your local machine, you need additional configuration depending on how Deriva is running.

Option A: Deriva Running Directly on Host (not in Docker)

If your Deriva server is running directly on the host machine (not in Docker), use host-gateway:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "/bin/sh",
      "args": [
        "-c",
        "docker run -i --rm --add-host localhost:host-gateway -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml ghcr.io/informatics-isi-edu/deriva-mcp:latest"
      ],
      "env": {}
    }
  }
}

Option B: Deriva Running in Docker (deriva-localhost)

If your Deriva server is running in Docker (e.g., using deriva-localhost), the MCP container must join the same Docker network and map localhost to the webserver container's IP:

{
  "mcpServers": {
    "deriva-ml": {
      "type": "stdio",
      "command": "/bin/sh",
      "args": [
        "-c",
        "docker run -i --rm --network deriva-localhost_internal_network --add-host localhost:172.28.3.15 -e HOME=$HOME -v $HOME/.deriva:$HOME/.deriva:ro -v $HOME/.bdbag:$HOME/.bdbag -v $HOME/.deriva-ml:$HOME/.deriva-ml deriva-mcp:latest"
      ],
      "env": {}
    }
  }
}

Finding the webserver IP (use the result in place of 172.28.3.15 in the --add-host option above):

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' deriva-webserver

Why this is needed: The MCP container needs to download dataset assets from the Deriva server. When Deriva runs in Docker, URLs in the dataset bags reference localhost, which must resolve to the Deriva webserver container. The entrypoint script in the MCP image automatically adjusts /etc/hosts so that the --add-host mapping takes effect.

SSL Certificate Configuration

If your localhost Deriva server uses a self-signed certificate (common for development), the container won't trust it by default. The image automatically sets REQUESTS_CA_BUNDLE to $HOME/.deriva/allCAbundle-with-local.pem, so you just need to create this file:

Creating the CA bundle with your local certificate (macOS):

# Export the local CA certificate from System Keychain
security find-certificate -a -c "DERIVA Dev Local CA" -p /Library/Keychains/System.keychain > /tmp/deriva-local-ca.pem

# Combine with existing CA bundle (if you have one)
cat ~/.deriva/allCAbundle.pem /tmp/deriva-local-ca.pem > ~/.deriva/allCAbundle-with-local.pem

# Or just use the local CA alone
cp /tmp/deriva-local-ca.pem ~/.deriva/allCAbundle-with-local.pem

To use a different CA bundle path, override with -e REQUESTS_CA_BUNDLE=/path/to/bundle.pem.
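
On platforms without the `security` and `cat` commands, the combining step can be done in Python. The sketch below is a minimal, hypothetical helper that concatenates whichever PEM files exist into one bundle; it assumes you have already exported the local CA certificate to a file.

```python
from pathlib import Path


def build_ca_bundle(sources: list[Path], dest: Path) -> int:
    """Concatenate the PEM files that exist into dest; return how many were used."""
    chunks = []
    for source in sources:
        if source.is_file():
            text = source.read_text()
            # End each file with a newline so certificates do not run together.
            chunks.append(text if text.endswith("\n") else text + "\n")
    if not chunks:
        raise FileNotFoundError("no CA certificate files found")
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text("".join(chunks))
    return len(chunks)
```

Missing sources are skipped, so the same call works whether or not an existing `allCAbundle.pem` is present.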

Deriva Authentication Issues

Error: "No credentials found"

# Re-authenticate with Deriva
python -c "from deriva_ml import DerivaML; DerivaML.globus_login('your-server.org')"

Error: "Token expired"

# Force re-authentication
python -c "from deriva_ml import DerivaML; DerivaML.globus_login('your-server.org', force=True)"

GitHub MCP Issues

Error: "Bad credentials"

Verify your PAT hasn't expired
Check the token has required permissions (Contents: Read/Write)
Ensure the token is correctly set in your config

Docker not found

Install Docker Desktop or use the npx method instead
On Linux, ensure your user is in the docker group

MCP Server Connection Issues

Server not responding

Check the server is installed: which deriva-mcp
Test manually: deriva-mcp (should start without errors)
Check Claude Desktop logs for errors

Multiple server conflicts

Ensure each server has a unique name in the config
Restart Claude Desktop after config changes
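
Duplicate server names are easy to miss because many JSON parsers, including Python's `json` module, silently keep only the last value for a repeated key. The sketch below is a hypothetical check that reports any key repeated within the same JSON object; how a given MCP client treats duplicates is not covered here.

```python
import json


def duplicate_keys(config_text: str) -> list[str]:
    """Return keys that appear more than once within the same JSON object."""
    duplicates: list[str] = []

    def record_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(config_text, object_pairs_hook=record_pairs)
    return duplicates
```

Running it on a config that defines `"github"` twice under `mcpServers` returns `["github"]`.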

Development

Running Tests

uv run pytest

Code Quality

uv run ruff check src/
uv run ruff format src/

Requirements

Python 3.10+
MCP SDK 1.2.0+
DerivaML 0.1.0+
Docker (optional; needed for the Docker-based setup and for running the GitHub MCP server locally)

License

Apache 2.0

Related Projects

DerivaML - Core library for ML workflows on Deriva
Deriva - Python SDK for Deriva scientific data management
GitHub MCP Server - Official GitHub MCP server
MCP Python SDK - Official Python SDK for Model Context Protocol
