Version: 0.2.0 (RAG-Primary Architecture)
Planning documentation for the ACCESS-CI intelligent documentation agent.
An AI-powered question-answering system for ACCESS (Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support). ACCESS allocates computing resources from Resource Providers—supercomputers, cloud platforms, and storage systems—to researchers across the US.
Users ask questions like:
- "What GPUs does Delta have?"
- "How do I request an allocation?"
- "Is Expanse currently down?"
This tool answers those questions accurately, with citations to source data.
The ACCESS QA system is an intelligent agent that answers researcher questions about computing resources, allocations, and system status. It classifies each query and routes it to the appropriate handler: factual questions about resource specs and documentation are answered from a database of human-verified Q&A pairs, while questions about real-time data like outages or user allocations are answered via live API calls to MCP servers. All responses include citations linking back to source data.
The system is built on three main components: the access-agent (LangGraph) handles query classification and response synthesis, access-qa-service provides RAG retrieval from curated Q&A pairs stored in PostgreSQL with pgvector, and 10 MCP servers provide real-time access to ACCESS APIs. Human reviewers curate Q&A pairs through Argilla before they enter the system. Future phases will add authenticated actions, allowing users to create announcements and events conversationally.
┌─────────────────────────────────────────────────────────────────────────────┐
│                            USER ASKS A QUESTION                             │
└─────────────────────────────────────┬───────────────────────────────────────┘
                                      │
                 ┌────────────────────┼────────────────────┐
                 │                    │                    │
                 ▼                    ▼                    ▼
        ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
        │ "What GPUs does  │ │ "What GPUs does  │ │ "Is Delta down?" │
        │  Delta have?"    │ │  Delta have and  │ │                  │
        │                  │ │  is it running?" │ │     DYNAMIC      │
        │      STATIC      │ │     COMBINED     │ │                  │
        └────────┬─────────┘ └────────┬─────────┘ └────────┬─────────┘
                 │                    │                    │
                 ▼                    ▼                    ▼
        ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
        │ RAG retrieval    │ │ RAG + live MCP   │ │ Live MCP call    │
        │ (verified Q&A)   │ │ (comprehensive)  │ │ (real-time)      │
        └────────┬─────────┘ └────────┬─────────┘ └────────┬─────────┘
                 │                    │                    │
                 └────────────────────┼────────────────────┘
                                      │
                                      ▼
       ┌─────────────────────────────────────────────────────────────┐
       │                   RESPONSE WITH CITATIONS                   │
       │   "Delta has 4x NVIDIA A100 GPUs per node [source link]"    │
       └─────────────────────────────────────────────────────────────┘
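The three-way routing shown above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the access-agent implementation: the keyword lists and the `QueryType`, `classify_query`, and `route` names are hypothetical, and the real agent performs classification inside its LangGraph graph by means this document does not specify.

```python
from enum import Enum


class QueryType(Enum):
    STATIC = "static"      # answered from verified Q&A pairs (RAG)
    DYNAMIC = "dynamic"    # answered from live MCP calls
    COMBINED = "combined"  # needs both sources


# Hypothetical keyword hints, for illustration only.
DYNAMIC_HINTS = ("down", "outage", "running", "status", "currently")
STATIC_HINTS = ("what", "how do i", "gpus", "allocation", "request")


def classify_query(question: str) -> QueryType:
    """Classify a question by which data sources it appears to need."""
    text = question.lower()
    needs_live = any(hint in text for hint in DYNAMIC_HINTS)
    needs_docs = any(hint in text for hint in STATIC_HINTS)
    if needs_live and needs_docs:
        return QueryType.COMBINED
    if needs_live:
        return QueryType.DYNAMIC
    # Default to the verified Q&A path when nothing suggests live data.
    return QueryType.STATIC


def route(question: str) -> list[str]:
    """Return the handlers to call, in order, for a question."""
    handlers = {
        QueryType.STATIC: ["rag"],
        QueryType.DYNAMIC: ["mcp"],
        QueryType.COMBINED: ["rag", "mcp"],
    }
    return handlers[classify_query(question)]
```

Substring matching is deliberately naive (for example, "download" would match "down"); it is only meant to make the static/dynamic/combined split concrete.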
┌─────────────────┐
│  access-agent   │  LangGraph orchestration
│    (Python)     │  Query classification → routing → synthesis
└────────┬────────┘
         │ HTTP
         ▼
┌─────────────────┐     ┌─────────────────┐
│   QA Service    │     │   MCP Servers   │
│    (FastAPI)    │     │  (TypeScript)   │
│  pgvector RAG   │     │   10 servers    │
└────────┬────────┘     └────────┬────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│   PostgreSQL    │     │   ACCESS APIs   │
│   + pgvector    │     │   (live data)   │
└─────────────────┘     └─────────────────┘
         ▲
         │ sync
┌─────────────────┐
│     Argilla     │  Human review
│ (Q&A curation)  │
└─────────────────┘
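To make the retrieval step in the QA Service concrete, here is a minimal sketch of similarity-ranked lookup over Q&A pairs. It assumes an in-memory list and hand-made three-dimensional vectors as a stand-in for PostgreSQL with pgvector, where ordering by the cosine distance operator (`<=>`) would do the same job. The `QAPair` fields, the `retrieve` function, the `min_similarity` threshold, and the example URLs are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str
    source_url: str          # kept so the response can cite its source
    embedding: list[float]   # toy vector; real embeddings are much larger


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, pairs, top_k=3, min_similarity=0.5):
    """Return up to top_k pairs ranked by similarity, dropping weak matches."""
    scored = [(cosine_similarity(query_embedding, p.embedding), p) for p in pairs]
    scored = [(s, p) for s, p in scored if s >= min_similarity]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:top_k]]


# Toy data: placeholder answers, placeholder URLs, invented vectors.
PAIRS = [
    QAPair("What GPUs does Delta have?", "placeholder answer A",
           "https://example.org/delta", [1.0, 0.0, 0.0]),
    QAPair("How do I request an allocation?", "placeholder answer B",
           "https://example.org/allocations", [0.0, 1.0, 0.0]),
]
```

A similarity floor like `min_similarity` is one way to let the caller return "no verified answer found" instead of a weak match; whether the service uses such a threshold is an assumption here.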
| # | Document | Purpose | Key Sections |
|---|---|---|---|
| 01 | 01-agent-architecture.md | System design + roadmap | Architecture, phases, success metrics, data governance |
| 02 | 02-qa-data.md | Q&A data preparation | MCP extraction, Q&A templates, deduplication |
| 03 | 03-review-system.md | Human review (Argilla) | Pre-deployment approval, post-deployment feedback, domain reviewers |
| 04 | 04-model-training.md | Model training (deprecated) | Historical reference for fine-tuning approach |
| 05 | 05-events-actions.md | MCP action tools | Announcements (Phase 1), Events (Phase 2) |
| 06 | 06-mcp-authentication.md | Authentication architecture | OAuth 2.1, CILogon proxy, token strategy |
| 07 | 07-backend-integration-spec.md | Backend API contract | Service tokens, X-Acting-User, authorization patterns |
| 08 | 08-observability.md | Distributed tracing & monitoring | Honeycomb, OpenTelemetry, dashboards |
| 09 | 09-researcher-profiles.md | User personalization | AI profile storage, Drupal integration, privacy controls |
| 10 | 10-analytics-and-domain-agents.md | Analytics reporting & domain agents | GA4+DB reports, Mailgun delivery, domain agent routing |
| 11 | 11-capability-registry.md | Capability discovery & ratings | Dynamic UI, personalized context, contextual ratings |
| Document | Purpose |
|---|---|
| mcp-extraction-impl.md | MCP Q&A extraction pipeline implementation |
| drupal-announcements-api-spec.md | Drupal API spec for Announcements (Phase 1 pilot) |
| jsm-mcp-server-plan.md | JSM MCP server plan for ticket creation/retrieval |
| jsm-my-tickets-api-spec.md | JSM ticket lookup endpoint specification |
| uky-resource-scoped-rag-spec.md | Resource-scoped RAG with UKY endpoint integration |
Start with 01-agent-architecture.md, which covers the full system design and implementation phases.
The data pipeline docs describe a continuous flow:
- 02-qa-data.md - Sources, extraction, Q&A generation
- 03-review-system.md - Human review via Argilla
- Q&A pairs sync to access-qa-service for RAG retrieval
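The last step of this flow, deciding which reviewed pairs are synced, might look like the sketch below. It assumes each candidate is a dict with a `status` field set by reviewers and that only `"approved"` pairs are synced, with duplicates detected by normalized question text. These field names and values are hypothetical; the actual Argilla record schema and deduplication rules live in the pipeline docs.

```python
def normalize(question: str) -> str:
    """Collapse case and whitespace so near-identical questions match."""
    return " ".join(question.lower().split())


def pairs_to_sync(candidates: list[dict]) -> list[dict]:
    """Keep approved Q&A pairs, dropping repeats of the same question."""
    seen: set[str] = set()
    result = []
    for pair in candidates:
        if pair.get("status") != "approved":  # hypothetical status field
            continue
        key = normalize(pair["question"])
        if key in seen:
            continue  # first approved version of a question wins
        seen.add(key)
        result.append(pair)
    return result
```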
For AI agents to take actions on behalf of users:
- 05-events-actions.md - Overview: phased approach, key patterns
- 06-mcp-authentication.md - OAuth 2.1 authentication with CILogon
- 07-backend-integration-spec.md - Contract for backend API teams
- drupal-announcements-api-spec.md - Phase 1: Drupal developer spec
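As a rough illustration of the service-token plus `X-Acting-User` pattern named in 07-backend-integration-spec.md, the sketch below builds the headers an agent might send to a backend when acting for a user. The `build_backend_headers` helper is hypothetical, and the `Bearer` scheme for the service token is an assumption; the spec itself is the authority on the real contract.

```python
def build_backend_headers(service_token: str, acting_user: str) -> dict[str, str]:
    """Build request headers for a backend call made on a user's behalf."""
    if not service_token:
        raise ValueError("service token is required")
    # Reject empty values and line breaks to avoid header injection.
    if not acting_user or any(ch in acting_user for ch in "\r\n"):
        raise ValueError("acting user must be a non-empty single-line value")
    return {
        "Authorization": f"Bearer {service_token}",  # assumed scheme
        "X-Acting-User": acting_user,                # the user being acted for
        "Content-Type": "application/json",
    }
```

Separating the two identities lets the backend authenticate the calling service while still attributing the action to, and authorizing it against, the individual user.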
- Phase: Production / Continuous Improvement
- Completed:
- RAG-primary architecture implemented and deployed
- access-qa-service running with pgvector + HNSW indexing
- access-agent with query classification (static/dynamic/combined)
- 10 MCP servers deployed for live data access
- Argilla integration for human review
- Weekly analytics reports (GA4 + PostgreSQL → Mailgun email) — deployed and scheduled
- Domain agent routing architecture (announcements + JSM) — code complete, not yet committed
- JWT cookie authentication (ES256 + JWKS)
- Chatbot UI analytics: core events (`chatbot_open`, `chatbot_question_sent`, etc.) and ACCESS layer events (tickets, security, menu) tracked via GTM → GA4
- Key Learnings:
- Fine-tuned models didn't reliably retain facts
- Worse, they hallucinated details around what they did memorize
- RAG retrieves human-verified answers, so responses stay grounded in reviewed content rather than in what a model memorized
- In Progress:
- JWT cookie authentication working end-to-end (Drupal → agent → MCP servers)
- Domain agent routing deployed for announcements + JSM
- Announcements CRUD working with authenticated user attribution
- Drupal content assist API (`/api/suggest-tags`, `/api/suggest-summary`)
- Capability registry design spec complete (11-capability-registry)
- Next Steps:
- Resource-scoped RAG — pass resource context (e.g., Anvil) from Drupal embedding through the agent to UKY RAG endpoints, with fallback to general RAG when out of scope
- Implement capability registry and dynamic chatbot UI
- Deploy remaining ACCESS sites with JWT cookie support (allocations, access-ci.org, metrics)
- Wire production chatbot UI to agent endpoint
- Register remaining GA4 custom dimensions (`isEmbedded`, `chatbot_env`)
- MCP server OpenTelemetry instrumentation
- Observability dashboards and alerting
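The resource-scoped RAG item in the next steps describes a fallback pattern that can be sketched as follows. This is one possible shape, not the planned design: `answer_with_scope`, the `Retriever` type, and the rule that an empty scoped result means "out of scope" are all assumptions made for illustration.

```python
from typing import Callable, Optional

# A retriever takes a question and returns matching answers (possibly none).
Retriever = Callable[[str], list[str]]


def answer_with_scope(question: str,
                      resource: Optional[str],
                      scoped_retrievers: dict[str, Retriever],
                      general_retriever: Retriever) -> tuple[str, list[str]]:
    """Try the resource-scoped retriever first, then fall back to general RAG.

    Returns (path, results) so the caller can see which path answered.
    """
    retriever = scoped_retrievers.get(resource.lower()) if resource else None
    if retriever is not None:
        results = retriever(question)
        if results:  # an empty result is treated as out of scope
            return "scoped", results
    return "general", general_retriever(question)
```

Returning the path alongside the results makes it easy to log how often scoped retrieval falls through to the general index.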
| Repository | Description |
|---|---|
| access-agent | LangGraph agent with RAG + MCP integration |
| access-qa-service | FastAPI service for Q&A retrieval (pgvector) |
| access-mcp | MCP servers for ACCESS data (10 servers) |
| access-qa-extraction | Q&A pair extraction from MCP servers |
| access-qa-training | DEPRECATED - Fine-tuning pipeline (archived) |
This is a planning repository. To propose changes:
- Create a branch
- Edit the relevant document(s)
- Open a PR with a description of what changed and why