Your South African medical aid questions answered with accurate, up-to-date information.
A modern AI-powered medical aid assistant that helps South Africans understand and compare medical aid plans from the top 3 providers. Built with quality data, semantic search, and RAG (Retrieval-Augmented Generation).
✅ Phase 3 Complete: Production-ready medical aid assistant with modern UI
- ✅ Phase 1: High-quality scraping from top 3 providers (23/26 documents, 88% quality)
- ✅ Phase 2: PostgreSQL + pgvector database, semantic search, cloud LLM integration
- ✅ Phase 3: Public platform with provider selector, enhanced citations, responsive design
- 🚧 Phase 4 (Current): Deployment preparation and optimization
The application is functional and ready for testing!
- Top 3 SA Providers - Discovery Health, Bonitas Medical Fund, Momentum Health
- AI-Powered Chat - Ask questions in natural language, get accurate answers
- Semantic Search - Query expansion and intent detection for better results
- Source Citations - Every answer backed by official documentation with links
- Modern UI - Next.js/Vercel-inspired design with dark mode
- Responsive - Optimised for desktop, tablet, and mobile
- Provider Filter - Search across all providers or focus on one
- Real-time Streaming - Answers stream in as they're generated
- SA Context - Rands (R), SA English spelling, and medical aid terminology
- Vector Database - PostgreSQL with pgvector for semantic search
- RAG System - Retrieval-Augmented Generation with cloud LLMs
- Fallback Strategy - Multiple LLM models (Gemini, Llama, Mistral)
- Smart Re-ranking - Intent-based result boosting
- Query Expansion - Automatic synonym and related term expansion
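The query-expansion feature above can be sketched in TypeScript. The synonym table and the `expandQuery` helper below are illustrative assumptions, not the project's actual implementation:

```typescript
// Hypothetical sketch of query expansion: map common SA medical aid terms
// to synonyms before embedding, so semantically related documents still match.
const SYNONYMS: Record<string, string[]> = {
  "medical aid": ["medical scheme", "health insurance"],
  "chronic": ["chronic illness benefit", "chronic medication"],
  "pmb": ["prescribed minimum benefits"],
  "co-payment": ["copayment", "deductible", "upfront payment"],
};

// Append known synonyms to the user's query; the embedding is then computed
// over the expanded text, broadening recall without changing user intent.
export function expandQuery(query: string): string {
  const lower = query.toLowerCase();
  const extras: string[] = [];
  for (const [term, synonyms] of Object.entries(SYNONYMS)) {
    if (lower.includes(term)) extras.push(...synonyms);
  }
  return extras.length > 0 ? `${query} ${extras.join(" ")}` : query;
}
```

The expanded text, rather than the raw query, would then be embedded, while the user's original wording stays first in the string.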
- Node.js 18+
- npm or yarn
- PostgreSQL with pgvector extension (for Phase 2+)
- Ollama (for local embeddings)
# Clone the repository
git clone https://github.com/ThandoSomacele/covercheck.git
cd covercheck
# Install dependencies
npm install
# Set up environment variables
cp .env.example .env
# Edit .env and add your API keys and database connection string

Create a .env file in the project root with the following variables:
# Database Configuration
DB_CONNECTION_STRING=postgresql://user:password@host:port/database
# OpenRouter API Configuration
OPENROUTER_API_KEY=sk-or-v1-your_api_key_here

- NEVER commit your .env file to git
- NEVER hardcode API keys or secrets in source code
- The .env file is already in .gitignore for your protection
- Use .env.example as a template (it contains no real secrets)
# Start the development server
npm run dev
# Open http://localhost:5173 in your browser

# Set up the database schema
psql $DB_CONNECTION_STRING -f scripts/db-setup.sql
# Load scraped documents into the database
npx tsx scripts/load-documents.ts
# Verify data loaded correctly
npx tsx scripts/check-db-stats.ts
# Optimize database performance
psql $DB_CONNECTION_STRING -f scripts/optimize-db.sql

# Scrape all 3 providers
npm run scrape
# Scrape individual providers
npm run scrape:discovery
npm run scrape:bonitas
npm run scrape:momentum

covercheck/
├── src/
│   ├── lib/
│   │   ├── insurance/                 # Static insurance data
│   │   │   ├── documents-sa.ts        # SA medical aid documents
│   │   │   └── insurance-*-glossary.ts
│   │   └── server/
│   │       ├── rag.ts                 # RAG logic (future)
│   │       └── scrapers/              # Web scraping system
│   │           ├── BaseScraper.ts
│   │           ├── DiscoveryHealthScraper.ts
│   │           ├── BonitasScraper.ts
│   │           ├── MomentumHealthScraper.ts
│   │           └── ScraperOrchestrator.ts
│   └── routes/
│       ├── +page.svelte               # Chat UI (future)
│       └── api/chat/+server.ts        # API endpoint (future)
├── scripts/                           # Utility scripts
│   ├── scrape.ts                      # Main scraping CLI
│   ├── validate-content.ts            # Quality validation
│   └── analyze-scraped-data.ts        # Data analysis
├── docs/                              # Complete documentation
│   ├── README.md                      # Documentation index
│   ├── SCRAPING.md                    # Scraping system guide
│   ├── VERIFIED_URLS.md               # Verified provider URLs
│   └── SCRAPER_FIX_PLAN.md            # Quality improvement process
├── scraped-data/                      # JSON output (gitignored)
├── legacy/                            # Old implementations
└── COVERCHECK_ROADMAP.md              # Development roadmap
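As context for `scripts/load-documents.ts`: loading typically splits each scraped document into overlapping chunks before embedding, so each chunk fits the embedding model's context window. A minimal sketch — the chunk size, overlap, and `chunkText` helper are assumptions, not the script's actual values:

```typescript
// Hypothetical sketch of document chunking: fixed-size windows with overlap
// so sentences cut at a boundary still appear whole in the adjacent chunk.
export function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
    start += chunkSize - overlap; // step forward, keeping `overlap` chars shared
  }
  return chunks;
}
```

Each chunk would then be embedded (e.g. via Ollama) and stored in the pgvector table alongside its source URL for citations.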
| Provider | Documents | Quality Rate | Avg. Content |
|---|---|---|---|
| Discovery Health | 13/14 | 93% | 10,000 chars |
| Momentum Health | 9/10 | 90% | 10,500 chars |
| Bonitas Medical Fund | 1/2 | 50% | 193,754 chars |
| Total | 23/26 | 88% | ~10,000 chars |
✅ Plan Information
- All major plans from each provider
- Plan benefits and exclusions
- Coverage details
✅ Benefits Documentation
- Hospital benefits
- Day-to-day benefits
- Chronic illness benefits
✅ Support Information
- Claims processes
- Comparison tools
- Contact information
- Research and verify provider URLs
- Build scraping system with Playwright
- Implement quality validation
- Scrape top 3 SA providers
- Achieve 20+ quality documents
- Set up PostgreSQL + pgvector
- Design database schema
- Process and chunk documents
- Generate embeddings with Ollama
- Implement semantic search with query expansion
- Build RAG pipeline with cloud LLMs
- Implement streaming responses
- Add source citations
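The multi-model fallback in the pipeline above can be sketched as a chain that tries each model in order; the model IDs and the `callLLM` signature are assumptions for illustration:

```typescript
// Hypothetical sketch of the LLM fallback strategy: attempt each model in
// order and return the first successful answer.
type LLMCall = (model: string, prompt: string) => Promise<string>;

// Illustrative model chain (Gemini first, then Llama, then Mistral).
const MODEL_CHAIN = ["google/gemini-flash", "meta-llama/llama-3", "mistralai/mistral-7b"];

export async function answerWithFallback(
  prompt: string,
  callLLM: LLMCall,
  models: string[] = MODEL_CHAIN,
): Promise<{ model: string; answer: string }> {
  let lastError: unknown;
  for (const model of models) {
    try {
      const answer = await callLLM(model, prompt);
      return { model, answer };
    } catch (err) {
      lastError = err; // this model failed; try the next one in the chain
    }
  }
  throw new Error(`All models failed: ${String(lastError)}`);
}
```

The real pipeline additionally streams tokens as they arrive and attaches source citations to the final answer.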
- Modern SvelteKit UI with dark mode
- Provider selector component
- Enhanced citation display with relevance scores
- Responsive design for mobile/tablet
- CoverCheck branding and logo
- Production environment configuration
- Database optimization and indexes
- Error handling improvements
- Deployment documentation
- Analytics tracking
- Final testing and QA
- Deploy to Vercel + Railway/Supabase
- Monitor performance and errors
- Collect user feedback
- Content update automation
See CLAUDE.md and DEPLOYMENT.md for detailed information.
All documentation is in the docs/ directory:
- docs/README.md - Documentation index
- docs/SCRAPING.md - Complete scraping guide
- docs/VERIFIED_URLS.md - Working provider URLs
- docs/SCRAPER_FIX_PLAN.md - Quality improvement process
- docs/PROJECT_OVERVIEW.md - Project goals and architecture
- docs/SETUP_SA.md - SA-specific setup guide
Current:
- SvelteKit (Svelte 5) - Full-stack framework and reactive UI
- TypeScript - Type safety
- Playwright - Web scraping
- Cheerio - HTML parsing
- PostgreSQL + pgvector - Vector database
- Ollama - Local embedding model runner
- OpenRouter - Cloud LLM API
Want to add another medical aid provider? See the Adding New Scrapers guide.
Quick overview:
- Create a new scraper class extending BaseScraper
- Define target URLs and selectors
- Register in ScraperOrchestrator
- Test and validate quality
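A sketch of those steps, assuming a hypothetical shape for `BaseScraper` (the actual abstract members live in `src/lib/server/scrapers/BaseScraper.ts`) and an invented "Fedhealth" provider purely for illustration:

```typescript
// Illustrative document shape — match the project's real types when implementing.
interface ScrapedDocument {
  url: string;
  title: string;
  content: string;
}

// Hypothetical stand-in for the project's BaseScraper abstract class.
abstract class BaseScraperSketch {
  abstract readonly providerName: string;
  abstract readonly targetUrls: string[];
  abstract parsePage(html: string, url: string): ScrapedDocument;
}

// Example new provider — "Fedhealth" here is only an illustration.
class FedhealthScraper extends BaseScraperSketch {
  readonly providerName = "Fedhealth";
  readonly targetUrls = ["https://www.fedhealth.co.za/medical-aid-plans/"];

  parsePage(html: string, url: string): ScrapedDocument {
    // In the real scraper, use Cheerio selectors to extract the main content.
    const title = /<title>(.*?)<\/title>/.exec(html)?.[1] ?? "Untitled";
    return { url, title, content: html };
  }
}
```

After registering the new class in `ScraperOrchestrator`, run it through the quality validation script before trusting its output.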
This is a learning project documenting the journey of building a production RAG system. Contributions, suggestions, and feedback are welcome!
See LICENSE file for details.
Built with SvelteKit, TypeScript, PostgreSQL + pgvector, and Ollama.
Data sourced from official Discovery Health, Bonitas Medical Fund, and Momentum Health documentation.
Made with ❤️ for South Africans who deserve simple, accurate medical aid information.
Current Version: Phase 3 Complete (Production-Ready) Last Updated: 2025-11-20