Document Version: 1.0
Created: 2025-10-22
Status: Planning Phase
Target: Connectome Server Deployment Integration
This document outlines the complete integration plan for connecting the Phase 2 RAG Benchmark System (Performance Tracking, Cost Optimization, A/B Testing) with the existing Literature Monitoring System deployed on the Connectome server.
✅ Completed Systems:
- Phase 1: RAGAS Quality Metrics (Days 1-10)
  - EvaluationDataset with Pydantic validation
  - RAGASEvaluator (faithfulness, answer_relevancy, context_precision, context_recall)
  - BaselineEvaluator with aggregation
  - PrometheusMetricsExporter + Grafana dashboard
- Phase 2: Performance Benchmark System (Days 1-10)
  - PerformanceTracker (latency, token usage, cost estimation)
  - CostOptimizer (budget tracking, optimization suggestions)
  - ABTest framework (variant testing, statistical analysis)
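To illustrate the kind of statistical analysis the ABTest framework is responsible for, here is a minimal sketch rather than the project's actual implementation. It assumes a two-sided z-test on per-variant mean scores (a normal approximation that is reasonable at the 100-sample minimum set by RAG_AB_TEST_MIN_SAMPLES) and assumes higher scores are better. The function names `confidence` and `pick_winner` are hypothetical.

```python
import math
from statistics import mean, variance


def confidence(a: list[float], b: list[float]) -> float:
    """Two-sided confidence (1 - p) that the two variant means differ."""
    # Standard error of the difference between the two sample means
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 0.0 if mean(a) == mean(b) else 1.0
    z = abs(mean(a) - mean(b)) / se
    p_value = math.erfc(z / math.sqrt(2))  # two-sided normal tail probability
    return 1.0 - p_value


def pick_winner(results: dict[str, list[float]],
                min_confidence: float = 0.95,
                min_samples: int = 100):
    """Return the better variant's name, or None if the test is inconclusive."""
    (name_a, a), (name_b, b) = results.items()  # exactly two variants assumed
    if min(len(a), len(b)) < min_samples:
        return None  # not enough data yet
    if confidence(a, b) <= min_confidence:
        return None  # difference is not significant
    return name_a if mean(a) > mean(b) else name_b
```

The defaults mirror RAG_AB_TEST_MIN_CONFIDENCE and RAG_AB_TEST_MIN_SAMPLES from this plan, so a test with fewer than 100 results per variant never declares a winner.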
✅ Existing Connectome Deployment:
- Literature monitoring (8 ArXiv + 4 PubMed sources)
- Strategic paper collection (400-600 papers/month)
- Docker-based deployment (deploy_to_connectome.sh)
- PostgreSQL + Redis + Celery infrastructure
❌ Gap: Phase 2 systems are standalone - not integrated into Connectome deployment or FastAPI application.
- Expose Phase 2 via REST API - Add evaluation endpoints to existing FastAPI app
- Deploy Monitoring Stack - Add Prometheus + Grafana to docker-compose
- Automate Benchmarks - Create Celery tasks for periodic RAG evaluation
- Unified Dashboard - Combine literature monitoring + RAG performance metrics
- Production Ready - Health checks, logging, alerts, documentation
- ✅ All Phase 2 components accessible via /api/v1/rag-evaluation/* endpoints
- ✅ Grafana dashboard live at http://connectome:3000 with real-time metrics
- ✅ Automated daily RAG benchmarks via Celery Beat
- ✅ Deployment script updated with Phase 2 setup steps
- ✅ Zero downtime deployment (existing literature monitoring unaffected)
┌─────────────────────────────────────────────────────────────────┐
│ Connectome Server │
├─────────────────────────────────────────────────────────────────┤
│ Docker Containers │
│ ├─ api (FastAPI) │
│ │ └─ /api/v1/monitoring/* (literature sources/alerts) │
│ ├─ postgres (literature metadata) │
│ ├─ redis (cache + broker) │
│ ├─ celery-worker (paper ingestion tasks) │
│ └─ celery-beat (scheduled syncs) │
├─────────────────────────────────────────────────────────────────┤
│ Phase 2 Systems (STANDALONE - Not Integrated) │
│ ├─ src/services/rag/performance_tracker.py │
│ ├─ src/services/rag/cost_optimizer.py │
│ ├─ src/services/rag/ab_testing.py │
│ ├─ src/services/rag/evaluation_dataset.py │
│ ├─ src/services/rag/ragas_evaluator.py │
│ └─ src/services/rag/metrics_exporter.py │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Connectome Server (Enhanced) │
├─────────────────────────────────────────────────────────────────┤
│ Docker Containers (Existing + New) │
│ ├─ api (FastAPI) ✨ │
│ │ ├─ /api/v1/monitoring/* (literature) │
│ │ └─ /api/v1/rag-evaluation/* (NEW - Phase 2 endpoints) │
│ ├─ postgres (extended schema) ✨ │
│ ├─ redis (cache + broker) │
│ ├─ celery-worker (extended tasks) ✨ │
│ ├─ celery-beat (extended schedule) ✨ │
│ ├─ prometheus (NEW - metrics collection) 🆕 │
│ └─ grafana (NEW - visualization) 🆕 │
├─────────────────────────────────────────────────────────────────┤
│ Phase 2 Integration Layer (NEW) │
│ ├─ src/api/v1/rag_evaluation.py (REST endpoints) │
│ ├─ src/tasks/rag_benchmark.py (Celery tasks) │
│ ├─ src/services/rag_manager.py (orchestration) │
│ └─ grafana/dashboards/* (monitoring UI) │
└─────────────────────────────────────────────────────────────────┘
NEW Endpoints:
# Performance Tracking
POST /api/v1/rag-evaluation/performance/track
GET /api/v1/rag-evaluation/performance/metrics
POST /api/v1/rag-evaluation/performance/reset
# Cost Optimization
POST /api/v1/rag-evaluation/cost/budget/create
GET /api/v1/rag-evaluation/cost/budget/{budget_id}
POST /api/v1/rag-evaluation/cost/optimize
GET /api/v1/rag-evaluation/cost/suggestions
# A/B Testing
POST /api/v1/rag-evaluation/ab-test/create
POST /api/v1/rag-evaluation/ab-test/{test_id}/add-result
GET /api/v1/rag-evaluation/ab-test/{test_id}/analyze
GET /api/v1/rag-evaluation/ab-test/{test_id}/winner
# RAGAS Evaluation
POST /api/v1/rag-evaluation/ragas/evaluate
GET /api/v1/rag-evaluation/ragas/baseline/{dataset_id}
GET /api/v1/rag-evaluation/ragas/metrics
# Prometheus Metrics Export
GET /metrics # Standard Prometheus endpoint
Integration Points:
- Uses existing FastAPI app (src/main.py)
- Shares database connection pool
- Uses Redis for caching evaluation results
- Authentication via existing JWT middleware
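A client call against the new endpoints might look like the sketch below. It only builds the request with the standard library and does not send it. The payload fields mirror the ones used in this plan's endpoint test. The Bearer header format and the `build_track_request` helper name are assumptions, since the plan only states that the existing JWT middleware handles authentication.

```python
import json
import urllib.request


def build_track_request(base_url: str, token: str, operation: str,
                        latency: float, prompt_tokens: int,
                        completion_tokens: int, model: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the performance tracking endpoint."""
    payload = {
        "operation": operation,
        "latency": latency,
        "tokens": {"prompt": prompt_tokens, "completion": completion_tokens},
        "model": model,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/rag-evaluation/performance/track",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed header format
        },
        method="POST",
    )


req = build_track_request("http://localhost:8000", "example-token",
                          "retrieval", 0.5, 100, 50, "gpt-4")
# urllib.request.urlopen(req) would send it to a running API instance.
```

Keeping request construction separate from sending makes the payload shape easy to check in unit tests without a live server.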
NEW Tables:
-- RAG Evaluation Results
CREATE TABLE rag_evaluations (
id UUID PRIMARY KEY,
dataset_id UUID,
evaluation_type VARCHAR(50), -- 'ragas', 'baseline', 'ab_test'
metrics JSONB,
metadata JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- Performance Metrics History
CREATE TABLE rag_performance_metrics (
id UUID PRIMARY KEY,
operation VARCHAR(100),
latency FLOAT,
token_usage JSONB,
cost FLOAT,
timestamp TIMESTAMP DEFAULT NOW()
);
-- Cost Budgets
CREATE TABLE rag_cost_budgets (
id UUID PRIMARY KEY,
name VARCHAR(200),
total_budget FLOAT,
spent FLOAT DEFAULT 0.0,
warning_threshold FLOAT DEFAULT 0.8,
critical_threshold FLOAT DEFAULT 0.95,
expenses JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- A/B Test Configurations
CREATE TABLE rag_ab_tests (
id UUID PRIMARY KEY,
name VARCHAR(200),
config JSONB, -- variants, traffic_split
status VARCHAR(20), -- 'active', 'completed', 'cancelled'
created_at TIMESTAMP DEFAULT NOW()
);
-- A/B Test Results
CREATE TABLE rag_ab_test_results (
id UUID PRIMARY KEY,
test_id UUID REFERENCES rag_ab_tests(id),
variant_name VARCHAR(100),
metrics JSONB,
cost FLOAT,
timestamp TIMESTAMP DEFAULT NOW()
);
Migration:
alembic revision --autogenerate -m "Add Phase 2 RAG evaluation tables"
alembic upgrade head
NEW Tasks:
from celery import shared_task


@shared_task(name="rag.daily_benchmark")
def run_daily_rag_benchmark():
    """Run comprehensive RAG evaluation daily."""
    # 1. Generate test dataset
    # 2. Run RAGAS evaluation
    # 3. Track performance metrics
    # 4. Calculate costs
    # 5. Export to Prometheus
    # 6. Send alerts if quality degraded


@shared_task(name="rag.hourly_performance_snapshot")
def capture_performance_snapshot():
    """Capture RAG performance metrics hourly."""
    # 1. Get current latency statistics
    # 2. Calculate token usage
    # 3. Export to Prometheus
    # 4. Store in database


@shared_task(name="rag.weekly_cost_analysis")
def analyze_weekly_costs():
    """Weekly cost analysis and optimization suggestions."""
    # 1. Aggregate week's costs
    # 2. Generate optimization suggestions
    # 3. Check budget alerts
    # 4. Send report email


@shared_task(name="rag.ab_test_evaluation")
def evaluate_ab_test(test_id: str):
    """Evaluate A/B test and declare winner if ready."""
    # 1. Get test configuration
    # 2. Analyze results
    # 3. Calculate statistical significance
    # 4. Declare winner if confidence > 95%
Celery Beat Schedule:
# src/core/celery_config.py
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'daily-rag-benchmark': {
        'task': 'rag.daily_benchmark',
        'schedule': crontab(hour=2, minute=0),  # 2 AM daily
    },
    'hourly-performance-snapshot': {
        'task': 'rag.hourly_performance_snapshot',
        'schedule': crontab(minute=0),  # Every hour
    },
    'weekly-cost-analysis': {
        'task': 'rag.weekly_cost_analysis',
        'schedule': crontab(day_of_week=1, hour=9, minute=0),  # Monday 9 AM
    },
}
NEW Services:
# docker-compose.yml additions (under the existing services: key)
  # Prometheus Metrics Collection
  prometheus:
    image: prom/prometheus:latest
    container_name: ai-coscientist-prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - coscientist-network
    restart: unless-stopped
  # Grafana Visualization
  grafana:
    image: grafana/grafana:latest
    container_name: ai-coscientist-grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD:-admin}
      - GF_INSTALL_PLUGINS=
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
    networks:
      - coscientist-network
    restart: unless-stopped
    depends_on:
      - prometheus
# Top-level named volumes
volumes:
  prometheus_data:
  grafana_data:
Prometheus Configuration (prometheus/prometheus.yml):
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'ai-coscientist-api'
    static_configs:
      - targets: ['api:8000']
    metrics_path: '/metrics'
Dashboard JSON (grafana/dashboards/rag_comprehensive_dashboard.json):
{
  "title": "AI-CoScientist RAG Comprehensive Dashboard",
  "panels": [
    {
      "id": 1,
      "title": "RAGAS Metrics - Faithfulness",
      "type": "timeseries",
      "targets": [
        {"expr": "rag_faithfulness{env=\"production\"}"}
      ]
    },
    {
      "id": 2,
      "title": "Performance - Latency (p95)",
      "type": "timeseries",
      "targets": [
        {"expr": "histogram_quantile(0.95, rate(rag_latency_bucket[5m]))"}
      ]
    },
    {
      "id": 3,
      "title": "Cost - Daily Spending",
      "type": "timeseries",
      "targets": [
        {"expr": "sum(increase(rag_cost_total[1d]))"}
      ]
    },
    {
      "id": 4,
      "title": "A/B Tests - Active Experiments",
      "type": "stat",
      "targets": [
        {"expr": "count(rag_ab_test_status{status=\"active\"})"}
      ]
    }
  ]
}
Provisioning (grafana/provisioning/dashboards/dashboards.yml):
apiVersion: 1
providers:
  - name: 'AI-CoScientist RAG'
    orgId: 1
    folder: 'RAG Evaluation'
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /var/lib/grafana/dashboards
      # Must stay false while 'folder' is set; the two options are mutually exclusive
      foldersFromFilesStructure: false
Day 1-2: REST API Endpoints
# Create new API router
touch src/api/v1/rag_evaluation.py
# Implement endpoints:
# - Performance tracking endpoints
# - Cost optimization endpoints
# - A/B testing endpoints
# - RAGAS evaluation endpoints
# Add to main router
# src/api/v1/__init__.py: include rag_evaluation_router
Day 3-4: Database Schema
# Create migration
alembic revision --autogenerate -m "Add Phase 2 RAG tables"
# Review and adjust migration
vim alembic/versions/[hash]_add_phase_2_rag_tables.py
# Apply migration
alembic upgrade head
# Verify schema
psql -U postgres -d ai_coscientist -c "\dt"
Day 5: Testing
# Write API endpoint tests
pytest tests/api/test_rag_evaluation.py -v
# Integration tests
pytest tests/integration/test_rag_phase2.py -v
Day 1-2: Prometheus Setup
# Create Prometheus config
mkdir -p prometheus
vim prometheus/prometheus.yml
# Add Prometheus to docker-compose
vim docker-compose.yml
# Test locally
docker-compose up -d prometheus
curl http://localhost:9090/-/healthy
Day 3-4: Grafana Setup
# Create Grafana provisioning
mkdir -p grafana/provisioning/dashboards grafana/dashboards
# Import Phase 1 dashboard (already exists)
cp grafana/dashboards/rag_evaluation_dashboard.json \
grafana/dashboards/rag_phase1_metrics.json
# Create comprehensive dashboard
vim grafana/dashboards/rag_comprehensive_dashboard.json
# Add Grafana to docker-compose
vim docker-compose.yml
# Test locally
docker-compose up -d grafana
open http://localhost:3000
Day 5: Integration Testing
# Start full stack
docker-compose up -d
# Verify Prometheus scraping
curl http://localhost:9090/api/v1/targets
# Verify Grafana data source
curl -u admin:admin http://localhost:3000/api/datasources
# Run end-to-end test
pytest tests/e2e/test_monitoring_stack.py -v
Day 1-2: Celery Tasks
# Create task module
touch src/tasks/rag_benchmark.py
# Implement tasks:
# - run_daily_rag_benchmark()
# - capture_performance_snapshot()
# - analyze_weekly_costs()
# - evaluate_ab_test()
Day 3: Celery Beat Schedule
# Update Celery config
vim src/core/celery_config.py
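# Add beat schedule for RAG tasks
# Confirm the workers have registered the RAG tasks
celery -A src.core.celery_app inspect registered
# Run Beat in the foreground with debug logging to see the loaded schedule
# ("inspect scheduled" only lists ETA/countdown tasks, not Beat entries)
celery -A src.core.celery_app beat --loglevel=debug
Day 4-5: Testing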
# Test individual tasks
pytest tests/tasks/test_rag_benchmark.py -v
# Test beat scheduling
# (requires Redis + Celery Beat running)
Day 1-2: Deployment Script
# Update deploy_to_connectome.sh
vim scripts/deploy_to_connectome.sh
# Add new sections:
# - setup_rag_evaluation()
# - configure_prometheus()
# - configure_grafana()
# - verify_rag_deployment()
Day 3: Documentation
# Update deployment guide
vim claudedocs/DEPLOYMENT_GUIDE.md
# Add RAG evaluation sections:
# - Environment variables
# - Grafana access
# - Prometheus endpoints
# - Troubleshooting
Day 4-5: Integration Testing
# Test deployment script locally
./scripts/deploy_to_connectome.sh
# Verify all services
docker-compose ps
# Check health endpoints
curl http://localhost:8000/api/v1/health
curl http://localhost:9090/-/healthy
curl http://localhost:3000/api/health
# Run comprehensive test suite
pytest tests/ -v --cov=src
NEW Variables (.env.production):
# RAG Evaluation Settings
RAG_EVALUATION_ENABLED=true
RAG_BENCHMARK_SCHEDULE="0 2 * * *" # Daily at 2 AM
RAG_PERFORMANCE_SNAPSHOT_INTERVAL=3600 # Hourly
# Prometheus
PROMETHEUS_PORT=9090
PROMETHEUS_RETENTION_DAYS=15
# Grafana
GRAFANA_PORT=3000
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
# Cost Budgets
RAG_MONTHLY_BUDGET=100.0
RAG_BUDGET_WARNING_THRESHOLD=0.8
RAG_BUDGET_CRITICAL_THRESHOLD=0.95
# A/B Testing
RAG_AB_TEST_MIN_CONFIDENCE=0.95
RAG_AB_TEST_MIN_SAMPLES=100
# src/core/config.py
class Settings(BaseSettings):
    # ... existing settings ...
    # Phase 2 RAG Evaluation
    rag_evaluation_enabled: bool = True
    rag_benchmark_schedule: str = "0 2 * * *"
    rag_performance_snapshot_interval: int = 3600
    # Prometheus
    prometheus_enabled: bool = True
    prometheus_port: int = 9090
    # Grafana
    grafana_enabled: bool = True
    grafana_port: int = 3000
    # Cost Management
    rag_monthly_budget: float = 100.0
    rag_budget_warning_threshold: float = 0.8
    rag_budget_critical_threshold: float = 0.95
    # A/B Testing
    rag_ab_test_min_confidence: float = 0.95
    rag_ab_test_min_samples: int = 100
Extended Health Endpoint (/api/v1/health/detailed):
@router.get("/health/detailed")
async def detailed_health():
    return {
        "status": "healthy",
        "components": {
            "database": await check_database(),
            "redis": await check_redis(),
            "prometheus": await check_prometheus(),
            "rag_evaluation": await check_rag_evaluation(),
            "celery_workers": await check_celery_workers()
        },
        "metrics": {
            "rag_evaluations_24h": await count_evaluations_24h(),
            "active_ab_tests": await count_active_ab_tests(),
            "current_budget_usage": await get_budget_usage()
        }
    }
Prometheus Alerts (prometheus/alerts.yml):
groups:
  - name: rag_quality_alerts
    interval: 5m
    rules:
      - alert: RAGFaithfulnessLow
        expr: rag_faithfulness{env="production"} < 0.7
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "RAG faithfulness score is low"
          description: "Faithfulness score {{ $value }} is below threshold (0.7)"
      - alert: RAGLatencyHigh
        expr: histogram_quantile(0.95, rate(rag_latency_bucket[5m])) > 5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "RAG latency is high"
          description: "p95 latency {{ $value }}s exceeds threshold (5s)"
      - alert: RAGBudgetCritical
        expr: rag_budget_usage_ratio > 0.95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "RAG budget nearly exhausted"
          description: "Budget usage {{ $value | humanizePercentage }} exceeds critical threshold (95%)"
# src/services/rag_manager.py
import structlog

logger = structlog.get_logger(__name__)

# evaluator: module-level RAGASEvaluator instance (construction not shown)


async def run_rag_evaluation(dataset_id: str):
    logger.info(
        "rag_evaluation_started",
        dataset_id=dataset_id,
        evaluation_type="ragas"
    )
    try:
        results = await evaluator.run_evaluation(dataset_id)
        logger.info(
            "rag_evaluation_completed",
            dataset_id=dataset_id,
            metrics=results['metrics'],
            duration_seconds=results['duration']
        )
        return results
    except Exception as e:
        logger.error(
            "rag_evaluation_failed",
            dataset_id=dataset_id,
            error=str(e),
            exc_info=True
        )
        raise
# tests/api/test_rag_evaluation.py
async def test_performance_tracking_endpoint():
    """Test POST /api/v1/rag-evaluation/performance/track"""
    response = await client.post(
        "/api/v1/rag-evaluation/performance/track",
        json={
            "operation": "retrieval",
            "latency": 0.5,
            "tokens": {"prompt": 100, "completion": 50},
            "model": "gpt-4"
        }
    )
    assert response.status_code == 200
    assert "metric_id" in response.json()

# tests/tasks/test_rag_benchmark.py
async def test_daily_benchmark_task():
    """Test daily RAG benchmark Celery task"""
    result = run_daily_rag_benchmark.delay()
    result.get(timeout=60)  # Wait max 60s; re-raises if the task failed
    assert result.successful()
    # Verify results stored in database
    evaluations = await get_recent_evaluations(hours=1)
    assert len(evaluations) > 0
# tests/integration/test_monitoring_stack.py
import asyncio

import requests


async def test_prometheus_metrics_export():
    """Test that Prometheus can scrape metrics from API"""
    # Trigger some RAG operations
    await run_rag_evaluation(test_dataset_id)
    # Wait for metrics export
    await asyncio.sleep(2)
    # Query Prometheus
    response = requests.get(
        "http://localhost:9090/api/v1/query",
        params={"query": "rag_faithfulness"}
    )
    assert response.status_code == 200
    data = response.json()
    assert len(data['data']['result']) > 0


async def test_grafana_dashboard_data():
    """Test that Grafana can display RAG metrics"""
    # Query Grafana API
    response = requests.get(
        "http://localhost:3000/api/datasources/proxy/1/api/v1/query",
        params={"query": "rag_faithfulness"},
        auth=("admin", "admin")
    )
    assert response.status_code == 200
    assert "data" in response.json()
# tests/e2e/test_full_rag_workflow.py
async def test_complete_rag_evaluation_workflow():
    """
    End-to-end test of complete RAG evaluation workflow:
    1. Upload test data
    2. Run evaluation
    3. Track performance
    4. Calculate costs
    5. Export metrics
    6. Verify dashboard
    """
    # 1. Upload test dataset
    dataset = await upload_test_dataset()
    # 2. Run RAGAS evaluation
    evaluation = await run_evaluation(dataset.id)
    assert evaluation.metrics['faithfulness'] > 0.7
    # 3. Verify performance tracked
    metrics = await get_performance_metrics(operation="evaluation")
    assert metrics['count'] > 0
    # 4. Verify cost calculated
    cost = await get_evaluation_cost(evaluation.id)
    assert cost > 0
    # 5. Verify Prometheus metrics
    prom_metrics = await query_prometheus("rag_faithfulness")
    assert len(prom_metrics) > 0
    # 6. Verify Grafana dashboard
    dashboard = await get_grafana_dashboard("rag_comprehensive_dashboard")
    assert dashboard['meta']['slug'] == "rag_comprehensive_dashboard"

| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Prometheus data loss | High | Low | Regular backups, retention policies |
| Grafana configuration drift | Medium | Medium | Version-controlled dashboards, provisioning |
| Celery task failures | High | Medium | Retry policies, dead letter queue, monitoring |
| Database migration issues | High | Low | Test migrations on staging, backup before production |
| API performance degradation | Medium | Medium | Load testing, caching, async operations |

| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Deployment downtime | High | Low | Blue-green deployment, health checks |
| Monitoring stack overhead | Medium | Medium | Resource limits, selective metrics |
| Cost budget exceeded | Medium | Medium | Alerts at 80%, automatic throttling |
| A/B test data loss | Medium | Low | Database backups, test snapshots |
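The "Alerts at 80%, automatic throttling" mitigation can be sketched as a simple threshold check. This is a hedged illustration: the default thresholds come from RAG_BUDGET_WARNING_THRESHOLD and RAG_BUDGET_CRITICAL_THRESHOLD in this plan, while the `BudgetStatus` enum, the `budget_status` and `should_throttle` functions, and the choice to throttle only at the critical level are assumptions.

```python
from enum import Enum


class BudgetStatus(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def budget_status(spent: float, total_budget: float,
                  warning: float = 0.8, critical: float = 0.95) -> BudgetStatus:
    """Classify budget usage against the warning and critical thresholds."""
    if total_budget <= 0:
        raise ValueError("total_budget must be positive")
    usage = spent / total_budget  # fraction of the budget already spent
    if usage >= critical:
        return BudgetStatus.CRITICAL
    if usage >= warning:
        return BudgetStatus.WARNING
    return BudgetStatus.OK


def should_throttle(spent: float, total_budget: float) -> bool:
    """Assumed policy: throttle RAG calls only once usage is critical."""
    return budget_status(spent, total_budget) is BudgetStatus.CRITICAL
```

With the planned monthly budget of 100.0, spending 85.0 would raise a warning and spending 96.0 would trigger throttling under this policy.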
1. Gradual Rollout:
# Deploy to staging first
./scripts/deploy_to_connectome.sh --env staging
# Monitor for 48 hours
watch -n 60 'curl -s http://staging:8000/api/v1/health/detailed | jq .status'
# Deploy to production with canary
./scripts/deploy_to_connectome.sh --env production --canary
2. Rollback Plan:
# Tag current production state
git tag -a v2.0-pre-integration -m "Before Phase 2 integration"
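# If issues detected, rollback
# Downgrade first, while the database is up and the migration's revision file is still checked out
alembic downgrade -1 # Rollback migration
docker-compose down
git checkout v2.0-pre-integration
docker-compose up -d
3. Monitoring Safeguards: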
# Resource limits in docker-compose.yml
prometheus:
  deploy:
    resources:
      limits:
        cpus: '0.5'
        memory: 512M
      reservations:
        cpus: '0.25'
        memory: 256M
grafana:
  deploy:
    resources:
      limits:
        cpus: '0.5'
        memory: 512M
- ✅ All services healthy after deployment
- ✅ Zero errors in logs for first 24 hours
- ✅ Prometheus successfully scraping metrics
- ✅ Grafana dashboards displaying data
- ✅ Celery tasks executing on schedule
- ✅ API response times < 500ms p95
- ✅ Literature monitoring unaffected (existing system continues)
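The "all services healthy" criterion can be checked with a small script against the health endpoints used in the deployment verification step. This is a minimal sketch, assuming each endpoint returns HTTP 200 when healthy. The `check_services` helper and its injectable `fetch` argument are hypothetical and exist so the logic can be exercised without a running stack.

```python
import urllib.request
from typing import Callable

HEALTH_URLS = {
    "api": "http://localhost:8000/api/v1/health",
    "prometheus": "http://localhost:9090/-/healthy",
    "grafana": "http://localhost:3000/api/health",
}


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def check_services(urls: dict[str, str],
                   fetch: Callable[[str], int] = http_status) -> dict[str, bool]:
    """Map each service name to True if its health endpoint answers 200."""
    results = {}
    for name, url in urls.items():
        try:
            results[name] = fetch(url) == 200
        except OSError:  # URLError, HTTPError and socket errors derive from OSError
            results[name] = False
    return results
```

Calling `check_services(HEALTH_URLS)` after a deployment returns one boolean per service, which a deploy script could use to decide whether to proceed or roll back.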
API Latency:
- /api/v1/rag-evaluation/* endpoints: p95 < 500ms
- Prometheus /metrics endpoint: p95 < 100ms
- Database queries: p95 < 50ms
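To make the p95 targets concrete, the sketch below computes a nearest-rank 95th percentile from raw latency samples and compares it with the 500 ms endpoint target. It is illustrative only: in production the value would come from the Prometheus histogram via histogram_quantile, and the `p95` and `meets_target` names are hypothetical.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) using integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]  # rank is 1-based


def meets_target(samples_ms: list[float], target_ms: float = 500.0) -> bool:
    """True if the p95 latency is strictly below the target."""
    return p95(samples_ms) < target_ms
```

A p95 target tolerates a small share of slow requests: with 20 samples, one outlier does not break the target, but two do.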
Resource Usage:
- Prometheus memory: < 512 MB
- Grafana memory: < 512 MB
- API container CPU: < 50% (with Phase 2 active)
Quality Metrics:
- RAGAS evaluation time: < 60 seconds per dataset
- A/B test analysis time: < 5 seconds
- Cost calculation time: < 1 second
Operational Efficiency:
- Automated RAG benchmarks: 1x daily
- Performance snapshots: 1x hourly
- Cost analysis: 1x weekly
- Alert response time: < 5 minutes
Cost Savings:
- Budget alerts prevent overruns: >90% effectiveness
- Optimization suggestions implemented: >50%
- A/B tests inform model selection: measurable cost reduction
- API Documentation (docs/API_RAG_EVALUATION.md)
  - Complete endpoint reference
  - Request/response schemas
  - Authentication requirements
  - Example cURL commands
- Deployment Guide (claudedocs/DEPLOYMENT_GUIDE_PHASE2.md)
  - Step-by-step deployment instructions
  - Environment variable configuration
  - Docker compose updates
  - Migration procedures
- Operations Manual (claudedocs/OPERATIONS_RAG_EVALUATION.md)
  - Monitoring procedures
  - Alert response playbooks
  - Backup and recovery
  - Troubleshooting guide
- Developer Guide (claudedocs/DEVELOPER_GUIDE_PHASE2.md)
  - Local development setup
  - Testing procedures
  - Code contribution guidelines
  - Architecture decisions
- Grafana Dashboard Guide (grafana/README_DASHBOARDS.md)
  - Dashboard navigation
  - Metric interpretations
  - Alert configurations
  - Custom queries
- Cost Management Guide (docs/COST_MANAGEMENT.md)
  - Budget setup
  - Optimization workflows
  - Report interpretation
  - Cost reduction strategies
- A/B Testing Guide (docs/AB_TESTING_GUIDE.md)
  - Test setup
  - Result interpretation
  - Statistical significance
  - Best practices
- Review & Approve Plan - Stakeholder sign-off on integration approach
- Environment Setup - Prepare staging environment for testing
- Resource Allocation - Assign developers to implementation tasks
- Timeline Confirmation - Finalize 4-week implementation schedule
- API Development - Begin REST endpoint implementation
- Database Migration - Create and test schema extensions
- Code Review - Establish review process for Phase 2 code
- CI/CD Updates - Configure automated testing for new endpoints
- Daily Standups - Progress tracking and blocker resolution
- Weekly Demos - Stakeholder demonstrations of completed features
- Bi-weekly Retrospectives - Process improvement and lessons learned
- Milestone Reports - End-of-phase summaries and metrics
Technical Lead: [Name]
DevOps Engineer: [Name]
Product Owner: [Name]
Communication Channels:
- Slack: #ai-coscientist-phase2
- Email: team@transconnectome.org
- GitHub Issues: https://github.com/Transconnectome/AI-CoScientist/issues
Emergency Contact: [On-call rotation]
- Phase 1 Implementation: claudedocs/PHASE1_COMPLETE.md
- Phase 2 Development: claudedocs/PHASE2_COMPLETE.md
- Existing Deployment: scripts/deploy_to_connectome.sh
- Monitoring Strategy: claudedocs/MONITORING_STRATEGY.md
- Prometheus Documentation: https://prometheus.io/docs/
- Grafana Provisioning: https://grafana.com/docs/grafana/latest/administration/provisioning/
- FastAPI Best Practices: https://fastapi.tiangolo.com/
- Celery Documentation: https://docs.celeryq.dev/
Document Status: ✅ Ready for Review
Next Review Date: 2025-10-29
Version History:
- v1.0 (2025-10-22): Initial integration plan created