Contributor

@h2zh h2zh commented Nov 18, 2025

This PR comes with Pelican Docs updates. Check out how to use this feature there (spoiler: only one command required!)

Benefits

This Model Context Protocol (MCP) integration enables researchers to access scientific datasets using natural language through AI assistants like Claude Code and VS Code Copilot. This positions Pelican at the forefront of the AI-powered scientific computing revolution and helps it reach a wider user base.

MCP is the emerging standard for connecting AI to the real world. Think of it as "USB for AI assistants": a standardized protocol that allows any MCP-compatible AI assistant (Claude, VS Code Copilot, Cline, etc.) to interact with external tools and data sources. As a data federation CLI tool, the Pelican Client is a natural fit to be exposed as an MCP server and plugged into the agentic AI world.

Users no longer need to memorize sophisticated commands, because the AI can understand most download requests. What's more, it can suggest the next step, or even carry out the entire workflow from data download to data analysis and then produce a report on its own.

System design in a nutshell

The AI assistant (e.g. VS Code Copilot) spawns an MCP server process, which is just a thin JSON-RPC wrapper around the existing client API. The MCP server does NOT execute pelican object get as a subprocess. Instead, it imports and calls the client library functions (e.g. client.DoGet) directly.
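
To make the "thin wrapper" idea concrete, here is a minimal sketch of a stdio loop that decodes one JSON-RPC request per line and calls a library function in-process. It is not the code in this PR: `fetchObject` is a hypothetical stand-in for a client call such as client.DoGet (whose real signature is not shown here), and the flat `params` shape is simplified compared with a real MCP `tools/call` request.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// request is the subset of a JSON-RPC 2.0 request used in this sketch.
type request struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
}

// response is a JSON-RPC 2.0 success response.
type response struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  any             `json:"result"`
}

// fetchObject is a HYPOTHETICAL stand-in for a client library function.
// It only echoes its input so the sketch stays self-contained.
func fetchObject(url string) (string, error) {
	return "fetched " + url, nil
}

// handle decodes one request, calls the library function directly
// (no subprocess is spawned), and returns the encoded response.
func handle(line []byte) ([]byte, error) {
	var req request
	if err := json.Unmarshal(line, &req); err != nil {
		return nil, err
	}
	// Simplified, assumed parameter shape: {"url": "..."}.
	var p struct {
		URL string `json:"url"`
	}
	if err := json.Unmarshal(req.Params, &p); err != nil {
		return nil, err
	}
	out, err := fetchObject(p.URL)
	if err != nil {
		return nil, err
	}
	return json.Marshal(response{JSONRPC: "2.0", ID: req.ID, Result: out})
}

func main() {
	// The assistant writes requests to stdin and reads replies from stdout.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		resp, err := handle(scanner.Bytes())
		if err != nil {
			fmt.Fprintln(os.Stderr, "error:", err) // diagnostics go to stderr only
			continue
		}
		fmt.Println(string(resp))
	}
}
```

Calling the library in-process avoids parsing CLI output and keeps stdout reserved for protocol messages.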

Limitation

This MCP server only supports public namespaces, because it cannot open an external browser to complete the OAuth flow. But since a large portion of Pelican usage is public data, this documented feature is ready for production.

h2zh and others added 3 commits November 18, 2025 02:17
  1. mcp/types.go
  - All MCP protocol message structures (JSONRPCRequest, JSONRPCResponse, RPCError)
  - MCP-specific structures (InitializeParams, Tool, CallToolResult, etc.)

  2. mcp/server.go
  - Server struct and core server logic
  - Request handling (initialize, list tools, call tools)
  - Response/error sending functions

  3. mcp/tools.go
  - getToolsList() - Returns tool definitions
  - handleDownload() - Pelican download implementation
  - handleStat() - File metadata retrieval
  - handleList() - Directory listing
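
For readers unfamiliar with MCP tool definitions, the following is a rough sketch of what a function in the role of getToolsList() might return. The `name`, `description`, and `inputSchema` members follow the MCP tools/list result format; the tool names and the single `url` argument are assumptions made for illustration and are not taken from mcp/tools.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tool is one entry in a tools/list result: a name, a description,
// and a JSON Schema describing the accepted arguments.
type Tool struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"inputSchema"`
}

// urlOnlySchema builds a JSON Schema for a tool that takes one
// required string argument named "url" (an assumed argument name).
func urlOnlySchema() map[string]any {
	return map[string]any{
		"type": "object",
		"properties": map[string]any{
			"url": map[string]any{
				"type":        "string",
				"description": "pelican:// URL of the object or collection",
			},
		},
		"required": []string{"url"},
	}
}

// toolsList plays the role of getToolsList(). The tool names below
// are HYPOTHETICAL placeholders.
func toolsList() []Tool {
	return []Tool{
		{Name: "pelican_download", Description: "Download an object", InputSchema: urlOnlySchema()},
		{Name: "pelican_stat", Description: "Show object metadata", InputSchema: urlOnlySchema()},
		{Name: "pelican_list", Description: "List a collection", InputSchema: urlOnlySchema()},
	}
}

func main() {
	// Print the payload a tools/list handler would place in "result".
	out, err := json.MarshalIndent(map[string]any{"tools": toolsList()}, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```
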
- Move config.InitClient() from server startup to lazy initialization
- Initialize Pelican client only when first tool is called
- Prevents corrupting JSON-RPC stream with startup errors
- Fixes 'Invalid input' error in Claude Desktop
- All logs go to stderr, stdout is clean JSON-RPC only

(cherry picked from commit e8df70f)
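
The lazy-initialization change can be pictured with `sync.Once`. This is a sketch under assumptions, not the committed code: `initClient` is a hypothetical stand-in for config.InitClient(), and whether the PR uses `sync.Once` or another guard is not shown in this conversation.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

var (
	initOnce sync.Once
	initErr  error
	initRuns int // counts how often the setup actually ran
)

// initClient is a HYPOTHETICAL stand-in for config.InitClient().
func initClient() error {
	initRuns++
	// Anything printed during setup goes to stderr, never stdout.
	fmt.Fprintln(os.Stderr, "initializing client")
	return nil
}

// ensureClient runs initClient on first use only and returns the
// cached outcome on every later call.
func ensureClient() error {
	initOnce.Do(func() { initErr = initClient() })
	return initErr
}

// callTool is a placeholder tool handler. An initialization failure is
// returned to the caller so it can be reported as a JSON-RPC error
// instead of appearing as stray text on stdout.
func callTool(name string) (string, error) {
	if err := ensureClient(); err != nil {
		return "", fmt.Errorf("client init failed: %w", err)
	}
	return "ran " + name, nil
}

func main() {
	for _, name := range []string{"stat", "list"} {
		out, err := callTool(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, "error:", err)
			continue
		}
		fmt.Fprintln(os.Stderr, out)
	}
}
```

With this shape, the initialize handshake can complete before any federation configuration is touched, and setup problems surface only when a tool is actually invoked.
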
- Handle 'initialized' notification (sent after initialize)
- Don't respond to notifications (JSON-RPC requests without ID)
- Remove omitempty from response ID field for spec compliance

This fixes the 'Invalid input' error in Claude Desktop caused by
responding to the 'initialized' notification when we shouldn't.

(cherry picked from commit 4babe40)
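
The two protocol rules in this commit can be sketched as follows. The struct and function names are illustrative and are not copied from mcp/types.go or mcp/server.go. A message without an `id` is treated as a notification and gets no reply, and the response `id` field has no `omitempty`, so a parse error is still answered with `"id": null`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest covers both requests and notifications. ID stays empty
// when the incoming message has no "id" member.
type rpcRequest struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Method  string          `json:"method"`
}

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// rpcResponse always emits "id" (no omitempty); an empty ID encodes as null.
type rpcResponse struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  any             `json:"result,omitempty"`
	Error   *rpcError       `json:"error,omitempty"`
}

// handleMessage returns the encoded reply and true, or nil and false
// when the message is a notification that must not be answered.
func handleMessage(line []byte) ([]byte, bool) {
	var req rpcRequest
	if err := json.Unmarshal(line, &req); err != nil {
		// -32700 is the JSON-RPC 2.0 "Parse error" code; the id is null.
		out, _ := json.Marshal(rpcResponse{
			JSONRPC: "2.0",
			Error:   &rpcError{Code: -32700, Message: "Parse error"},
		})
		return out, true
	}
	if len(req.ID) == 0 {
		return nil, false // notification: stay silent
	}
	// Placeholder result that just echoes the method name.
	out, err := json.Marshal(rpcResponse{
		JSONRPC: "2.0",
		ID:      req.ID,
		Result:  map[string]string{"method": req.Method},
	})
	if err != nil {
		return nil, false
	}
	return out, true
}

func main() {
	for _, msg := range []string{
		`{"jsonrpc":"2.0","method":"notifications/initialized"}`,
		`{"jsonrpc":"2.0","id":7,"method":"ping"}`,
	} {
		if out, ok := handleMessage([]byte(msg)); ok {
			fmt.Println(string(out))
		}
	}
}
```
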
@h2zh h2zh added the client Issue affecting the OSDF client label Nov 18, 2025
Member

@jhiemstrawisc jhiemstrawisc left a comment

I was mostly poking around the PR out of curiosity -- these comments were small things I noticed while nosing around.

@@ -0,0 +1,333 @@
# Using Pelican with AI Assistants via MCP

The Pelican Model Context Protocol (MCP) server enables AI assistants like Claude Code, VS Code Copilot, and other MCP-compatible tools to download files, inspect metadata, and list directories from Pelican federations. This allows you to use natural language to interact with Pelican data directly from your AI-powered development environment.
Member

FWIW, we've been trying to switch to a one sentence per line markdown format elsewhere -- it's easier to review Markdown PRs that use this formatting.

Also, since Pelican is an object store, we should refer to "collections" instead of "directories".

### Downloading Research Data

```
User: I need to download the LIGO gravitational wave data from
Member

It's a bit strange to associate this test file with LIGO.

Also, let's prefer the /pelicanplatform/test/hello-world.txt test file wherever possible.

Comment on lines +76 to +78
if debugFlag, _ := cmd.Flags().GetBool("debug"); debugFlag {
	log.SetLevel(log.DebugLevel)
}
Member

Rather than set the log level here, passing the debug flag should behave the way it does with other Pelican commands so that our internal helpers handle setting all the levels.
