ACCESS Q&A Service

RAG-based Q&A retrieval service for ACCESS-CI. Provides semantic search over human-verified Q&A pairs from Argilla.

Architecture

┌─────────────────┐     ┌─────────────────┐
│  access-agent   │     │   MCP Server    │
│  (LangGraph)    │     │  (TypeScript)   │
└────────┬────────┘     └────────┬────────┘
         │                       │
         │    HTTP/REST API      │
         └───────────┬───────────┘
                     ▼
         ┌───────────────────────┐
         │   QA Service (this)   │
         │      (FastAPI)        │
         └───────────┬───────────┘
                     │
    ┌────────────────┼────────────────┐
    ▼                ▼                ▼
┌─────────┐    ┌────────────┐    ┌─────────┐
│ pgvector│    │   Redis    │    │ Argilla │
│ (Q&A)   │    │ (citations)│    │ (source)│
└─────────┘    └────────────┘    └─────────┘

Features

- Semantic search over Q&A pairs using sentence-transformers embeddings
- Performance optimized:
  - HNSW index (15x faster than IVFFlat)
  - Query-level caching (90%+ reduction for repeated queries)
  - Pre-loaded embedding model (no cold start)
  - Batch embedding generation
- Citation validation via Redis registry
- Argilla integration for syncing human-verified Q&A pairs
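
The query-level caching listed above can be pictured with a small in-memory sketch. This is an illustration only: `QueryCache`, `normalize_query`, and `cache_key` are hypothetical names, and the service itself may cache elsewhere (for example in Redis) or build its keys differently. The default TTL mirrors `CACHE_TTL_SECONDS`.

```python
import hashlib
import time
from typing import Any, Callable, Optional


def normalize_query(query: str) -> str:
    """Lowercase and collapse whitespace so trivially different queries share a key."""
    return " ".join(query.lower().split())


def cache_key(query: str, top_k: int) -> str:
    """Derive a stable key from the normalized query and the search parameters."""
    raw = f"{normalize_query(query)}|{top_k}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class QueryCache:
    """Hypothetical TTL cache; not the service's actual implementation."""

    def __init__(
        self,
        ttl_seconds: float = 3600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry expired: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def clear(self) -> None:
        """Drop every entry, similar in spirit to POST /admin/clear-cache."""
        self._entries.clear()
```

Including `top_k` in the key is a design choice in this sketch: two requests with the same text but different result limits should not share a cached response.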

Quick Start

1. Start Dependencies

docker-compose up -d

This starts:

- PostgreSQL with pgvector extension (port 5433)
- Redis (port 6380)

2. Install Python Dependencies

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install package
pip install -e ".[dev]"

3. Configure Environment

cp .env.example .env
# Edit .env with your settings

4. Run the Service

# Development mode
uvicorn src.access_qa_service.main:app --reload --port 8001

# Or use the CLI
python -m access_qa_service.main

5. Test It

# Health check
curl http://localhost:8001/health

# Search (after loading data)
curl -X POST http://localhost:8001/search \
  -H "Content-Type: application/json" \
  -d '{"query": "What GPUs does Delta have?"}'

# Get stats
curl http://localhost:8001/admin/stats
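
The search call above can also be made from Python. The following is a minimal sketch using only the standard library; `build_search_request` and `search` are hypothetical helper names, and because the response schema is not documented here, the decoded JSON is returned unchanged.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "http://localhost:8001"  # port taken from the Quick Start steps


def build_search_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST /search request as the curl example."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, base_url: str = BASE_URL, timeout: float = 10.0) -> Any:
    """Send the request and return the decoded JSON body as-is."""
    request = build_search_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running and data loaded, `search("What GPUs does Delta have?")` should return whatever JSON the `/search` endpoint produces.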

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check |
| `POST` | `/search` | Semantic search for matching Q&A |
| `POST` | `/search/by-domain` | Search filtered by domain |
| `POST` | `/citations/validate` | Batch validate citation markers |
| `GET` | `/admin/stats` | Service health and stats |
| `POST` | `/admin/sync` | Trigger Argilla sync (auth required) |
| `POST` | `/admin/bulk-load` | Direct Q&A upload (auth required) |
| `POST` | `/admin/load-jsonl` | Load from JSONL file (auth required) |
| `POST` | `/admin/clear-cache` | Clear query cache (auth required) |

Loading Data

From Argilla (Production)

curl -X POST http://localhost:8001/admin/sync \
  -H "Authorization: Bearer $ADMIN_TOKEN"

From JSONL (Development/Testing)

Create a file qa_pairs.jsonl:

{"question": "What GPUs does Delta have?", "answer": "Delta has NVIDIA A100 GPUs. <<SRC:compute-resources:delta.ncsa.access-ci.org>>"}
{"question": "What is the network fabric on PNRP?", "answer": "PNRP uses GigaIO SuperNODE fabric. <<SRC:compute-resources:pnrp.access-ci.org>>"}

Upload it:

curl -X POST http://localhost:8001/admin/load-jsonl \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -F "file=@qa_pairs.jsonl"
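
Before uploading, it can help to sanity-check the file locally. The sketch below assumes each line is a JSON object with non-empty `question` and `answer` strings, as in the example above, and that citation markers follow the `<<SRC:...:...>>` shape shown there. Reading the two marker fields as "domain" and "source id" is a guess from the examples, and `check_jsonl` and `extract_citations` are hypothetical helpers, not part of the service.

```python
import json
import re
from typing import Iterable

# Matches markers shaped like <<SRC:compute-resources:delta.ncsa.access-ci.org>>.
# Interpreting the fields as (domain, source id) is an assumption.
CITATION_RE = re.compile(r"<<SRC:([^:>]+):([^>]+)>>")


def extract_citations(answer: str) -> list:
    """Return (domain, source_id) tuples for every citation marker in an answer."""
    return CITATION_RE.findall(answer)


def check_jsonl(lines: Iterable[str], require_citation: bool = False) -> list:
    """Return human-readable problems; an empty list means the input looks fine."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        for field in ("question", "answer"):
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"line {number}: missing or empty '{field}'")
        answer = record.get("answer")
        if require_citation and isinstance(answer, str) and not extract_citations(answer):
            problems.append(f"line {number}: answer has no <<SRC:...>> marker")
    return problems
```

A typical use would be `check_jsonl(open("qa_pairs.jsonl", encoding="utf-8"))`, uploading only when the returned list is empty.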

Performance

Expected latency after optimizations:

| Metric | Value |
|--------|-------|
| DB Query (HNSW) | ~25ms |
| Cache Hit | ~5ms |
| P50 End-to-End | ~50ms |
| P95 End-to-End | ~150ms |
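
To compare a local deployment against these figures, P50 and P95 can be computed from timed samples. This is a rough sketch, not the method used to produce the table: `measure_latencies_ms` and `p50_p95` are hypothetical helpers, and the percentile definition (inclusive interpolation over the observed samples) is one of several reasonable choices.

```python
import statistics
import time
from typing import Callable


def measure_latencies_ms(call: Callable[[], object], runs: int = 100) -> list:
    """Time `call` repeatedly and return the wall-clock latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p50_p95(samples_ms: list) -> tuple:
    """Return (P50, P95); needs at least two samples."""
    # quantiles(n=100) yields 99 cut points: index 49 is P50, index 94 is P95.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]
```

Passing a function that issues one `/search` request as `call` gives end-to-end numbers; repeating the same query will mostly measure the cache-hit path.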

Development

# Run tests
pytest

# Type checking
mypy src

# Linting
ruff check src
ruff format src

Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `DATABASE_URL` | `postgres://localhost:5433/qa_service` | PostgreSQL connection |
| `REDIS_URL` | `redis://localhost:6380/0` | Redis connection |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Embedding model |
| `RAG_SIMILARITY_THRESHOLD` | `0.85` | Minimum similarity for matches |
| `RAG_TOP_K` | `3` | Max results to return |
| `CACHE_TTL_SECONDS` | `3600` | Query cache TTL |
| `ADMIN_TOKEN` | (empty) | Admin endpoint auth token |
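
As an illustration of how these variables and defaults fit together, here is a minimal sketch that reads them from the environment and applies the threshold and top-k limits to scored candidates. `Settings`, `load_settings`, and `select_matches` are hypothetical names; the service may load its configuration differently (for example with a settings library) and may apply the limits inside the database query instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    embedding_model: str
    rag_similarity_threshold: float
    rag_top_k: int
    cache_ttl_seconds: int
    admin_token: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (default: os.environ), using the defaults in the table."""
    env = os.environ if env is None else env
    settings = Settings(
        database_url=env.get("DATABASE_URL", "postgres://localhost:5433/qa_service"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380/0"),
        embedding_model=env.get("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        rag_similarity_threshold=float(env.get("RAG_SIMILARITY_THRESHOLD", "0.85")),
        rag_top_k=int(env.get("RAG_TOP_K", "3")),
        cache_ttl_seconds=int(env.get("CACHE_TTL_SECONDS", "3600")),
        admin_token=env.get("ADMIN_TOKEN", ""),
    )
    if settings.rag_top_k < 1:
        raise ValueError("RAG_TOP_K must be at least 1")
    return settings


def select_matches(scored: Sequence[tuple], settings: Settings) -> list:
    """Keep (item, similarity) pairs at or above the threshold, best first, capped at top-k."""
    kept = [pair for pair in scored if pair[1] >= settings.rag_similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[: settings.rag_top_k]
```

With the defaults, a candidate scoring 0.80 is dropped, and at most three of the remaining candidates are returned.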
