
ParibusAI

Self-hosted Socratic Mentor for Economics Education

ParibusAI is a locally run AI teaching assistant that uses the Socratic method to guide economics students through concepts, exam preparation, and conversational quizzes. Originally developed as "Eve" for ECON 3180 Health Economics at Western Michigan University.

Quick Start (Mac Mini M4)

# 1. Install LM Studio from https://lmstudio.ai/
# 2. Download a model (Llama 3.1 8B Instruct recommended) and start server
# 3. Install ParibusAI
git clone <repo-url>
cd paribusAI
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

# 4. Configure
cp .env.example .env
# Edit .env with your settings

# 5. Test it!
paribus --version
paribus status
paribus chat

See detailed installation instructions below.

Features

✅ Currently Available

  • Socratic Mentoring Mode: Interactive chat that guides students through concepts using Socratic questioning

    • Maintains conversation history (10 turns)
    • Never directly states if you're wrong
    • Helps students discover answers through guided questions
  • Conversational Quiz Mode: Structured 10-minute quiz with automatic grading

    • 4-phase workflow: Warmup → Core → Applications → Synthesis
    • Timer enforcement with adaptive pacing
    • Pass/Try Again evaluation with quiz codes
    • Tracks concepts demonstrated and examples provided
  • Course Assistant (RAG): Semantic search over course materials

    • Index PDF, DOCX, and TXT documents
    • Retrieve relevant context for questions
    • Cite sources with similarity scores
    • Filter by document type (syllabus/slides/notes/textbook)
  • Document Indexing: Automated pipeline for course materials

    • Chunks documents with overlap for better retrieval
    • Generates semantic embeddings
    • Stores in ChromaDB vector database
    • Preserves metadata for filtering
  • Fully Self-hosted: Runs locally on Mac Mini M4 with LM Studio

    • No data leaves your machine
    • FERPA compliant
    • No external API calls
    • Complete data privacy control

🧪 Tested & Production Ready

  • 43 comprehensive integration tests (all passing)
  • RAG pipeline tested with real documents
  • Quiz state machine validated
  • Error handling and graceful degradation

Requirements

  • Hardware: Mac Mini M4 or similar Apple Silicon Mac (M1/M2/M3 also work)
  • RAM: 16GB+ recommended (8GB minimum)
  • Storage: 10GB+ free space for models
  • macOS: 12.0 (Monterey) or later

Installation Guide for macOS

Step 1: Install Homebrew (if not already installed)

Open Terminal and run:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Follow the on-screen instructions to add Homebrew to your PATH.

Step 2: Install Python 3.11

Option A: Using Homebrew (Recommended)

brew install python@3.11

Option B: Using asdf (For version management)

# Install asdf
brew install asdf

# Add asdf to your shell (choose your shell)
echo '. /opt/homebrew/opt/asdf/libexec/asdf.sh' >> ~/.zshrc  # For zsh
# or
echo '. /opt/homebrew/opt/asdf/libexec/asdf.sh' >> ~/.bash_profile  # For bash

# Restart your terminal, then:
asdf plugin add python
asdf install python 3.11.0
asdf global python 3.11.0

Verify Python installation:

python3 --version  # Should show Python 3.11.x

Step 3: Install LM Studio

Download and Install

  1. Visit https://lmstudio.ai/
  2. Click "Download for macOS"
  3. Open the downloaded .dmg file
  4. Drag LM Studio to your Applications folder
  5. Open LM Studio from Applications (you may need to allow it in System Preferences → Security & Privacy)

Download a Model

  1. In LM Studio, click on the Search tab (🔍 icon on the left)

  2. Search for one of these recommended models:

    • Llama 3.1 8B Instruct (recommended, 4.7GB)
    • Mistral 7B Instruct v0.2 (good alternative, 4.1GB)
    • Phi-3 Medium Instruct (larger, more capable, 7.6GB)
  3. Click Download next to your chosen model

    • For best performance on 16GB RAM, choose the Q4_K_M quantization
    • This will take 5-15 minutes depending on your internet speed

Load the Model and Start Server

  1. Click on the Local Server tab (💬 icon on the left)
  2. Click Select a model to load
  3. Choose the model you just downloaded
  4. Click Start Server
  5. The server should start at http://127.0.0.1:1234
  6. You should see a green "Server Running" indicator

Important Server Settings:

  • Keep "CORS" enabled (default)
  • Keep "OpenAI Compatible" enabled (default)
  • Context Length: 8192 or higher
  • GPU Offload: Max (for Apple Silicon)
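
Before running paribus status, you can sanity-check the server yourself by querying the OpenAI-compatible /v1/models endpoint. A minimal sketch (the helper names here are ours, not part of ParibusAI):

```python
import json
from urllib.request import urlopen

def extract_model_ids(payload: dict) -> list[str]:
    """Pull model ids out of an OpenAI-style /v1/models response."""
    return [m["id"] for m in payload.get("data", [])]

def loaded_model_ids(base_url: str = "http://127.0.0.1:1234/v1") -> list[str]:
    """Query LM Studio and return the ids of the models it exposes."""
    with urlopen(f"{base_url}/models", timeout=5) as resp:
        return extract_model_ids(json.load(resp))
```

If the server is running with a model loaded, loaded_model_ids() should return a non-empty list containing the name you will put in MODEL_NAME.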

Step 4: Clone and Set Up ParibusAI

# Clone the repository (or navigate to your existing clone)
cd /path/to/paribusAI

# Create and activate a virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate  # On macOS/Linux

# Install ParibusAI
pip install -e .

# For development with all tools:
pip install -e ".[dev]"

Step 5: Configure Environment Variables

Copy the example environment file:

cp .env.example .env

Edit .env with your preferred text editor:

nano .env  # or use: code .env, vim .env, etc.

Minimum required settings:

# LM Studio Configuration
LM_STUDIO_URL=http://127.0.0.1:1234/v1
MODEL_NAME=llama-3.1-8b-instruct  # Match your downloaded model
TEMPERATURE=0.0

# Course Information (customize for your course)
COURSE_CODE=ECON3180
COURSE_NAME=Health Economics
PROFESSOR_EMAIL=professor@university.edu
INSTITUTION_NAME=Western Michigan University

Step 6: Verify Installation

Check that ParibusAI CLI is installed:

paribus --version

Check LM Studio connection:

paribus status

If successful, you should see:

  • ✓ LM Studio server is running
  • ✓ Model loaded: llama-3.1-8b-instruct
  • ✓ Connection successful

Troubleshooting

"Command not found: paribus"

  • Make sure you activated the virtual environment: source .venv/bin/activate
  • Or reinstall: pip install -e .

"Connection refused" when running paribus status

  • Make sure LM Studio is running
  • Check that the server is started (green indicator in LM Studio)
  • Verify URL in .env matches: http://127.0.0.1:1234/v1

LM Studio model download stuck

  • Check your internet connection
  • Try a smaller model first (e.g., Mistral 7B)
  • Restart LM Studio and try again

Out of memory errors

  • Use a smaller model (7B or smaller)
  • Reduce context length in LM Studio settings
  • Close other applications to free up RAM

"Python version mismatch"

  • Verify: python3 --version shows 3.11.x
  • Recreate virtual environment with correct Python version

Usage

CLI Commands

Check Installation

Verify ParibusAI is installed correctly:

paribus --version

Check LM Studio connection and available models:

paribus status

Display current configuration:

paribus config

Socratic Chat Mode

Start an interactive Socratic mentoring session with conversation history:

paribus chat

The chat mode:

  • Maintains conversation history (last 10 turns)
  • Uses Socratic questioning to guide learning
  • Never directly states if you're wrong - asks questions instead
  • Type exit, quit, or q to end the session

Example session:

$ paribus chat

Connected to LM Studio. Ready to start!

You: I'm studying adverse selection. Can you help me understand it?
Mentor: What do you already know about information asymmetry in markets?
You: One party has more information than the other?
Mentor: Good start! Can you think of a real-world example where this happens?
...
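
The 10-turn memory can be pictured as a simple trim on the message list before each request. A sketch, not the shipped implementation (the constant and function name are ours):

```python
MAX_TURNS = 10  # one turn = one user message plus one assistant reply

def trim_history(messages: list[dict], max_turns: int = MAX_TURNS) -> list[dict]:
    """Keep the system prompt plus only the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # Each turn contributes two messages, so keep the last 2 * max_turns.
    return system + dialogue[-2 * max_turns:]
```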

Customize model settings:

paribus chat --model mistral-7b-instruct --temperature 0.0

Conversational Quiz Mode

Start a 10-minute structured quiz with automatic grading:

paribus quiz --topic "adverse selection"

The quiz mode:

  • 4-phase structure: Warmup → Core Concepts → Applications → Synthesis
  • 10-minute timer: Automatically enforced
  • Adaptive difficulty: Provides hints if you struggle
  • Automatic grading: Pass/Try Again decision with quiz code
  • Progress tracking: Tracks concepts demonstrated and examples provided
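
The 4-phase flow amounts to a small state machine driven by the timer. A minimal sketch (the phase time budgets below are illustrative, not the shipped values):

```python
from enum import Enum

class Phase(Enum):
    WARMUP = 0
    CORE = 1
    APPLICATIONS = 2
    SYNTHESIS = 3
    DONE = 4

# Fraction of the quiz budget consumed before advancing (illustrative).
PHASE_CUTOFFS = {
    Phase.WARMUP: 0.15,
    Phase.CORE: 0.55,
    Phase.APPLICATIONS: 0.85,
    Phase.SYNTHESIS: 1.0,
}

def phase_for(elapsed: float, total: float = 600.0) -> Phase:
    """Map elapsed seconds to the current quiz phase."""
    frac = elapsed / total
    for phase, cutoff in PHASE_CUTOFFS.items():
        if frac < cutoff:
            return phase
    return Phase.DONE
```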

Example quiz:

$ paribus quiz --topic "moral hazard"

┌─────────────────────────────────────────────┐
│ 10-Minute Conversational Quiz               │
│                                             │
│ Topic: moral hazard                         │
│                                             │
│ This quiz will assess your understanding... │
└─────────────────────────────────────────────┘

Quiz starting now...

Quiz: Let's start with the basics. In your own words, what is moral hazard?
You: [Your answer]
...

============================================================
QUIZ COMPLETE
============================================================

Topic              moral hazard
Duration           9.2 minutes
Questions          8
Concepts           3/4
Examples           2

┌─────────────────────────────────────────────┐
│ PASS                                        │
│                                             │
│ You demonstrated solid understanding of     │
│ moral hazard.                               │
│ Quiz Code: 7823                             │
└─────────────────────────────────────────────┘

Change quiz duration (in minutes):

paribus quiz --topic "insurance markets" --duration 15

Course Assistant (RAG-based Q&A)

First, index your course materials:

# Index individual files
paribus index data/course_materials/syllabus.txt --doc-type syllabus

# Index multiple files at once
paribus index data/course_materials/**/*.txt

# Index with specific document type
paribus index slides/*.pdf --doc-type slides

Then ask questions about your course:

paribus ask "When is the final exam?"

Example with output:

$ paribus ask "What topics does the course cover?"

Question: What topics does the course cover?

Retrieved Sources:
#  Source              Type      Similarity
1  syllabus.txt        syllabus  87.3%
2  week1_intro.txt     slides    72.1%

Generating answer...

┌─────────────────────────────────────────────┐
│ Answer                                      │
│                                             │
│ According to the syllabus, the course       │
│ covers market failures in healthcare,       │
│ insurance economics, adverse selection,     │
│ moral hazard, provider behavior, healthcare │
│ spending, comparative health systems, and   │
│ healthcare reform proposals.                │
└─────────────────────────────────────────────┘

Advanced options:

Filter by document type:

paribus ask "What are the grading criteria?" --doc-type syllabus

Retrieve more documents:

paribus ask "Explain adverse selection" --top-k 5

Filter by multiple document types:

paribus ask "Find examples of moral hazard" --doc-type slides --doc-type notes

Document Indexing

Index course materials for semantic search:

# Index single document
paribus index path/to/document.pdf

# Index multiple documents
paribus index syllabus.pdf lecture1.pdf notes.txt

# Index with document type metadata
paribus index syllabus.pdf --doc-type syllabus
paribus index week*.pdf --doc-type slides

# Index into a specific collection
paribus index documents/*.pdf --collection econ3180_spring2024

Supported formats:

  • PDF (.pdf) - Lecture slides, textbooks, papers
  • Word Documents (.docx) - Syllabi, notes
  • Plain Text (.txt) - Notes, transcripts

Document types for metadata:

  • syllabus - Course syllabi
  • slides - Lecture slides and presentations
  • notes - TA notes, class summaries
  • textbook - Textbook chapters and readings
  • other - Other materials

The indexer will:

  • Extract text from documents
  • Chunk text into 500-word segments with 50-word overlap
  • Generate semantic embeddings
  • Store in ChromaDB vector database
  • Preserve metadata for filtering

Development Commands

Run tests:

pytest

Run tests with coverage:

pytest --cov=paribusai --cov-report=html

Format code:

black src/ tests/

Lint code:

ruff check src/ tests/

Type check:

mypy src/

Project Structure

paribusAI/
├── src/
│   └── paribusai/
│       ├── __init__.py
│       ├── cli_module.py        # CLI entry point with all commands
│       ├── config.py            # Configuration management
│       ├── api/
│       │   ├── __init__.py
│       │   └── lm_studio.py     # LM Studio API client
│       ├── prompts/
│       │   ├── __init__.py
│       │   └── socratic.py      # Socratic mentoring prompts
│       ├── rag.py               # RAG: indexing, retrieval, embeddings
│       └── quiz.py              # Quiz state machine and grading
├── tests/
│   └── integration/             # Integration tests
│       ├── test_rag.py          # RAG tests (18 tests)
│       └── test_quiz.py         # Quiz mode tests (25 tests)
├── data/
│   ├── course_materials/        # Course documents (indexed)
│   ├── vector_db/               # ChromaDB storage
│   └── logs/                    # Conversation logs
├── .beads/                      # Issue tracking
│   ├── issues.jsonl
│   └── formulas/                # Molecule workflow templates
├── pyproject.toml
├── .env.example
├── .env                         # Your configuration
└── README.md

Development Phases

  • ✅ Phase 1 (Complete): Basic CLI, LM Studio connection, configuration system
  • ✅ Phase 2 (Complete): Conversation memory, multi-turn dialogue, quiz mode with state machine
  • ✅ Phase 3 (Complete): RAG integration, document indexing, semantic search, course assistant
  • Phase 4 (Planned): Web interface (Streamlit/Gradio), advanced analytics, usage tracking

Current Status: Phase 1-3 complete with 43 passing integration tests. Ready for production use in educational settings.

Technical Stack

  • Backend: Python 3.11+, FastAPI
  • LLM Server: LM Studio (OpenAI-compatible API)
  • Vector DB: ChromaDB
  • CLI: Click with Rich for terminal UI
  • Embeddings: sentence-transformers
  • Document Processing: pypdf, python-docx
  • UI: Rich (terminal), Streamlit/Gradio (web, planned for Phase 4)
  • Deployment: Local (Mac Mini M4)

Configuration

ParibusAI uses environment variables for configuration. Key settings:

LM Studio Settings

  • LM_STUDIO_URL: API endpoint (default: http://127.0.0.1:1234/v1)
  • MODEL_NAME: Model name loaded in LM Studio
  • TEMPERATURE: Response temperature (0.0 for deterministic)
  • MAX_TOKENS: Maximum response length
  • API_TIMEOUT: Request timeout in seconds
  • MAX_RETRIES: Number of retry attempts

Application Settings

  • ENVIRONMENT: development/staging/production
  • APP_HOST: Application host (default: 127.0.0.1)
  • APP_PORT: Application port (default: 8000)
  • DEBUG: Enable debug mode

Quiz Settings

  • QUIZ_DURATION_SECONDS: Quiz duration (default: 600 = 10 minutes)
  • MAX_WORDS_PER_TURN: Maximum words per response (default: 80)
  • THINKING_PAUSE_SECONDS: Pause duration (default: 5)
  • MIN_CONCEPTS_TO_PASS: Required concepts to pass (default: 2)

RAG Settings

  • VECTOR_DB_TYPE: chromadb or faiss
  • VECTOR_DB_PATH: Path to vector database
  • COURSE_MATERIALS_PATH: Path to course materials
  • RAG_TOP_K: Number of documents to retrieve
  • RAG_MIN_SIMILARITY: Minimum similarity threshold
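
RAG_MIN_SIMILARITY is compared against a similarity score like the percentages shown in the retrieval table. One common conversion from ChromaDB's cosine distance (an assumption about the metric, not confirmed here):

```python
def similarity_from_distance(distance: float) -> float:
    """Cosine distance -> similarity percentage, e.g. 0.127 -> 87.3."""
    return round((1.0 - distance) * 100, 1)
```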

Course Settings

  • COURSE_CODE: Course code (e.g., ECON3180)
  • COURSE_NAME: Course name
  • PROFESSOR_EMAIL: Fallback contact email
  • INSTITUTION_NAME: Institution name

Privacy & Data Management

ParibusAI is designed with privacy as a core principle:

  • Fully self-hosted: All LLM inference runs locally via LM Studio
  • No external API calls: No data leaves your machine
  • FERPA compliant: No PII transmitted to third parties
  • Conversation logs: Stored locally with configurable retention
  • Optional encryption: Logs can be encrypted at rest

Contributing

This project is under active development; Phases 1-3 are complete. Contributions are welcome.

License

MIT

Author

Gonzalo Maldonado (gonz@sanscourier.com)

Acknowledgments

Originally developed as "Eve: The Health Economics AI Companion" using GPT-4 for ECON 3180 at Western Michigan University. This version is a fully self-hosted alternative for academic institutions prioritizing data privacy and cost control.
