necyberteam/access-qa-service

ACCESS Q&A Service

A RAG-based Q&A retrieval service for ACCESS-CI. It provides semantic search over human-verified Q&A pairs synced from Argilla.

Architecture

┌─────────────────┐     ┌─────────────────┐
│  access-agent   │     │   MCP Server    │
│  (LangGraph)    │     │  (TypeScript)   │
└────────┬────────┘     └────────┬────────┘
         │                       │
         │    HTTP/REST API      │
         └───────────┬───────────┘
                     ▼
         ┌───────────────────────┐
         │   QA Service (this)   │
         │      (FastAPI)        │
         └───────────┬───────────┘
                     │
    ┌────────────────┼────────────────┐
    ▼                ▼                ▼
┌─────────┐    ┌────────────┐    ┌─────────┐
│ pgvector│    │   Redis    │    │ Argilla │
│ (Q&A)   │    │ (citations)│    │ (source)│
└─────────┘    └────────────┘    └─────────┘

Features

  • Semantic search over Q&A pairs using sentence-transformers embeddings
  • Performance optimizations:
    • HNSW index (15x faster than IVFFlat)
    • Query-level caching (90%+ reduction for repeated queries)
    • Pre-loaded embedding model (no cold start)
    • Batch embedding generation
  • Citation validation via Redis registry
  • Argilla integration for syncing human-verified Q&A pairs
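
The matching step behind semantic search can be illustrated with a small in-memory sketch. This is not the service's implementation: the service is described as searching pgvector through an HNSW index, whereas this example scores every candidate directly and uses toy two-dimensional vectors in place of sentence-transformers embeddings. The function names are hypothetical; the defaults mirror RAG_SIMILARITY_THRESHOLD and RAG_TOP_K from the Configuration section.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def top_matches(query_vec, candidates, threshold=0.85, top_k=3):
    """Return up to top_k (score, question) pairs scoring at least threshold.

    candidates is a list of (question, vector) pairs. The names and the
    brute-force scan are illustrative assumptions, not the service's code.
    """
    scored = [(cosine_similarity(query_vec, vec), q) for q, vec in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy vectors standing in for real embeddings.
candidates = [
    ("exact match", [1.0, 0.0]),
    ("unrelated", [0.0, 1.0]),
    ("near match", [1.0, 0.1]),
]
results = top_matches([1.0, 0.0], candidates)
```

Only candidates at or above the threshold survive, so a query with no sufficiently close Q&A pair yields an empty list rather than a weak match.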

Quick Start

1. Start Dependencies

docker-compose up -d

This starts:

  • PostgreSQL with pgvector extension (port 5433)
  • Redis (port 6380)

2. Install Python Dependencies

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install package
pip install -e ".[dev]"

3. Configure Environment

cp .env.example .env
# Edit .env with your settings

4. Run the Service

# Development mode (auto-reload); uses the installed package path
uvicorn access_qa_service.main:app --reload --port 8001

# Or run the module directly
python -m access_qa_service.main

5. Test It

# Health check
curl http://localhost:8001/health

# Search (after loading data)
curl -X POST http://localhost:8001/search \
  -H "Content-Type: application/json" \
  -d '{"query": "What GPUs does Delta have?"}'

# Get stats
curl http://localhost:8001/admin/stats
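
The same search call can be made from Python. The sketch below uses only the standard library and mirrors the curl example above: a POST to /search with a JSON body containing `query`. The helper names are hypothetical, and because the response schema is not documented here, the parsed JSON is returned as-is.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # port used in the Quick Start above


def build_search_request(query, base_url=BASE_URL):
    """Build the POST /search request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, base_url=BASE_URL, timeout=10.0):
    """Send the request and return the decoded JSON response.

    Requires the service to be running; the response shape is not assumed.
    """
    request = build_search_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_search_request("What GPUs does Delta have?")
```

Calling `search("What GPUs does Delta have?")` against a running instance with loaded data should return whatever the /search endpoint produces.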

API Endpoints

Method  Path                 Description
GET     /health              Health check
POST    /search              Semantic search for matching Q&A
POST    /search/by-domain    Search filtered by domain
POST    /citations/validate  Batch validate citation markers
GET     /admin/stats         Service health and stats
POST    /admin/sync          Trigger Argilla sync (auth required)
POST    /admin/bulk-load     Direct Q&A upload (auth required)
POST    /admin/load-jsonl    Load from JSONL file (auth required)
POST    /admin/clear-cache   Clear query cache (auth required)

Loading Data

From Argilla (Production)

curl -X POST http://localhost:8001/admin/sync \
  -H "Authorization: Bearer $ADMIN_TOKEN"

From JSONL (Development/Testing)

Create a file qa_pairs.jsonl:

{"question": "What GPUs does Delta have?", "answer": "Delta has NVIDIA A100 GPUs. <<SRC:compute-resources:delta.ncsa.access-ci.org>>"}
{"question": "What is the network fabric on PNRP?", "answer": "PNRP uses GigaIO SuperNODE fabric. <<SRC:compute-resources:pnrp.access-ci.org>>"}

Upload it:

curl -X POST http://localhost:8001/admin/load-jsonl \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -F "file=@qa_pairs.jsonl"
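
Before uploading, it can help to check the file locally. The sketch below parses JSONL lines, verifies that each record has non-empty `question` and `answer` strings, and extracts citation markers. The marker pattern `<<SRC:domain:entity>>` is an assumption inferred from the two sample records above, and the function name is hypothetical; the service's own validation rules may differ.

```python
import json
import re

# Assumed marker shape, inferred from the sample records: <<SRC:domain:entity>>
CITATION_RE = re.compile(r"<<SRC:([^:>]+):([^>]+)>>")


def parse_qa_lines(lines):
    """Parse JSONL lines into records, adding a 'citations' list to each.

    Raises ValueError when a record lacks a non-empty question or answer.
    """
    records = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for field in ("question", "answer"):
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"line {lineno}: missing or empty {field!r}")
        # Each citation becomes a (domain, entity) tuple.
        record["citations"] = CITATION_RE.findall(record["answer"])
        records.append(record)
    return records


sample = [
    '{"question": "What GPUs does Delta have?", "answer": "Delta has NVIDIA A100 GPUs. <<SRC:compute-resources:delta.ncsa.access-ci.org>>"}',
    '{"question": "What is the network fabric on PNRP?", "answer": "PNRP uses GigaIO SuperNODE fabric. <<SRC:compute-resources:pnrp.access-ci.org>>"}',
]
records = parse_qa_lines(sample)
```

To check a file on disk, pass an open file handle, for example `parse_qa_lines(open("qa_pairs.jsonl", encoding="utf-8"))`.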

Performance

Expected latency after optimizations:

Metric            Value
DB Query (HNSW)   ~25 ms
Cache Hit         ~5 ms
P50 End-to-End    ~50 ms
P95 End-to-End    ~150 ms
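
To compare a local deployment against these figures, one option is to time repeated calls and compute the percentiles. The sketch below is a generic measurement helper, not part of the service; `call` is a hypothetical zero-argument function that you supply, for example one that issues a single search request.

```python
import statistics
import time


def measure_latencies_ms(call, runs=50):
    """Time `call` repeatedly and return per-run latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p50_p95(samples_ms):
    """Return the (P50, P95) latencies from a list of samples.

    statistics.quantiles with n=100 yields 99 cut points, so index 49 is
    the 50th percentile and index 94 is the 95th. Needs at least two samples.
    """
    cuts = statistics.quantiles(samples_ms, n=100)
    return cuts[49], cuts[94]


# Synthetic samples (1 ms to 100 ms) so the example runs without the service.
p50, p95 = p50_p95([float(ms) for ms in range(1, 101)])
```

Repeating the same query will mostly exercise the cache path, so use varied queries when measuring end-to-end latency.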

Development

# Run tests
pytest

# Type checking
mypy src

# Linting
ruff check src
ruff format src

Configuration

Variable                  Default                                 Description
DATABASE_URL              postgres://localhost:5433/qa_service    PostgreSQL connection
REDIS_URL                 redis://localhost:6380/0                Redis connection
EMBEDDING_MODEL           sentence-transformers/all-MiniLM-L6-v2  Embedding model
RAG_SIMILARITY_THRESHOLD  0.85                                    Minimum similarity for matches
RAG_TOP_K                 3                                       Max results to return
CACHE_TTL_SECONDS         3600                                    Query cache TTL
ADMIN_TOKEN               (empty)                                 Admin endpoint auth token
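
As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical and written with the standard library only; the service's actual configuration code may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    database_url: str
    redis_url: str
    embedding_model: str
    rag_similarity_threshold: float
    rag_top_k: int
    cache_ttl_seconds: int
    admin_token: str


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("DATABASE_URL", "postgres://localhost:5433/qa_service"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6380/0"),
        embedding_model=env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        # Numeric values arrive as strings and are converted explicitly.
        rag_similarity_threshold=float(env.get("RAG_SIMILARITY_THRESHOLD", "0.85")),
        rag_top_k=int(env.get("RAG_TOP_K", "3")),
        cache_ttl_seconds=int(env.get("CACHE_TTL_SECONDS", "3600")),
        admin_token=env.get("ADMIN_TOKEN", ""),
    )


defaults = load_settings({})
overridden = load_settings({"RAG_TOP_K": "5", "RAG_SIMILARITY_THRESHOLD": "0.9"})
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to exercise in tests.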
