jagonmoy/VocaBuilder

Vocabuilder - AI-Powered Vocabulary Builder

A modern web application for building vocabulary with AI-powered word analysis, learning features, and story generation. It is built on LangChain with OpenAI and Google Gemini models, a FastAPI backend, and a React frontend.

🚀 Features

Core Vocabulary Management

  • Word Management: Add, organize, and track vocabulary words with learning progress
  • List Organization: Create custom lists to categorize words by topic, difficulty, or learning goals
  • Learning Progress: Mark words as learned/unlearned with visual progress tracking
  • User Authentication: Secure JWT-based registration and login system

AI-Powered Learning Features

  • Multi-Model AI Analysis: Choose between OpenAI GPT-4o-mini and Google Gemini 2.0 Flash for word information
  • Intelligent Word Validation: AI-powered validation to ensure only real English words are added
  • GRE-Focused Learning: Prioritizes advanced, GRE-relevant meanings and contexts
  • Contextual Story Generation: Create engaging stories using your vocabulary words in proper context
  • Semantic Similarity: Find similar words using vector embeddings and semantic search

Advanced AI Capabilities

  • LangChain Integration: Modern prompt engineering with structured output parsing
  • Vector Database: ChromaDB-powered semantic search for finding related words
  • Embedding Generation: OpenAI embeddings for advanced word similarity analysis
  • Prompt Engineering: Specialized prompts for vocabulary learning and story generation

🧠 LLM Concepts & AI Architecture

Language Model Integration

  • OpenAI GPT-4o-mini: High-quality text generation with JSON output parsing
  • Google Gemini 2.0 Flash: Fast, efficient AI model for vocabulary analysis
  • Model Selection: Dynamic switching between AI models based on user preference
  • Temperature Control: Temperature set to 0.7 to balance consistent answers with creative variety
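
The model selection described above could be wired up roughly as follows. This is a minimal sketch, not the project's code: `ModelConfig`, `MODEL_REGISTRY`, and `resolve_model` are hypothetical names, and the model identifier strings are assumed from the model names listed in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    provider: str             # "openai" or "google"
    model_name: str           # identifier handed to the provider SDK (assumed)
    temperature: float = 0.7  # value quoted in this README


# Hypothetical registry keyed by the option the user picks in the UI.
MODEL_REGISTRY = {
    "openai": ModelConfig("openai", "gpt-4o-mini"),
    "gemini": ModelConfig("google", "gemini-2.0-flash"),
}


def resolve_model(choice: str, default: str = "openai") -> ModelConfig:
    """Return the config for the requested model, falling back to the default."""
    return MODEL_REGISTRY.get(choice.strip().lower(), MODEL_REGISTRY[default])
```

Keeping the choice in a small lookup table means an unknown or misspelled preference falls back to a default instead of raising an error in the request path.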

Prompt Engineering & Chain Architecture

  • Structured Prompts: ChatPromptTemplate-based prompts for consistent AI responses
  • GRE-Focused Prompts: Specialized prompts that prioritize advanced vocabulary meanings
  • Context-Aware Story Generation: Prompts that ensure vocabulary words are used in proper GRE context
  • LangChain Chains: Efficient prompt → LLM → parser pipelines for structured outputs
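
To show the prompt, LLM, parser flow without calling a real model, the sketch below chains three plain functions. `fake_llm` is a stand-in that returns canned JSON, and the field names (`is_valid`, `meanings`) are assumptions for illustration, not the project's actual response schema.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in


def build_prompt(word: str) -> str:
    """Fill a (much shortened) prompt template with the word to analyse."""
    return f"Return JSON describing the GRE-relevant meaning of '{word}'."


def fake_llm(prompt: str) -> str:
    """Stand-in for a chat model: returns a canned, fenced JSON string."""
    return FENCE + 'json\n{"is_valid": true, "meanings": ["short-lived"]}\n' + FENCE


def parse_json_output(raw: str) -> dict:
    """Strip an optional Markdown code fence, then parse the JSON payload."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1]     # drop the opening fence line
        text = text.rsplit(FENCE, 1)[0]   # drop the closing fence
    return json.loads(text)


def run_chain(word: str) -> dict:
    """prompt -> LLM -> parser, mirroring the pipeline described above."""
    return parse_json_output(fake_llm(build_prompt(word)))
```

In the real application the middle step would be a LangChain chat model and the last step a LangChain output parser; the point here is only the shape of the pipeline.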

Vector Search & Semantic Understanding

  • Word Embeddings: OpenAI text-embedding-3-small for semantic word representation
  • ChromaDB Integration: Vector database for similarity search and word relationships
  • Semantic Similarity: Find related words based on meaning, not just spelling
  • Context-Aware Search: Search within specific word lists for targeted learning
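
Semantic search ranks stored vectors by how close they are to a query vector. As a rough illustration, the sketch below uses cosine similarity over toy three-dimensional vectors. The function names and the vectors are made up for the example, and the distance metric used by the project's ChromaDB collection is not stated in this README.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], vectors: dict[str, list[float]], top_n: int = 5):
    """Return the top_n (word, score) pairs, highest similarity first."""
    scored = [(word, cosine_similarity(query, vec)) for word, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors; real embeddings have far more dimensions.
TOY_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "joyful": [0.8, 0.2, 0.1],
    "gloomy": [-0.7, 0.3, 0.2],
}
```

Calling `most_similar(TOY_VECTORS["happy"], TOY_VECTORS, top_n=2)` ranks "happy" first and "joyful" second, because their vectors point in nearly the same direction.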

🛠️ Tech Stack

Frontend

  • React 18 - Modern UI framework with hooks and concurrent features
  • TypeScript - Full type safety and enhanced developer experience
  • Vite - Lightning-fast build tool and development server
  • Tailwind CSS - Utility-first CSS framework for responsive design
  • ShadCN UI - Beautiful, accessible component library
  • TanStack Query - Efficient data fetching, caching, and synchronization
  • React Router - Client-side routing with protected routes
  • Lucide React - Beautiful, consistent icon library

Backend

  • FastAPI - Modern, fast Python web framework with automatic API documentation
  • SQLModel - SQL database integration with Pydantic models
  • PostgreSQL - Robust, scalable relational database
  • Alembic - Database migration management
  • JWT Authentication - Secure token-based authentication system
  • CORS Middleware - Cross-origin resource sharing support
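
For readers unfamiliar with how token-based authentication works, here is a minimal standard-library sketch of signing and verifying a JWT. It assumes the HS256 algorithm and a shared secret such as `JWT_SECRET`; the README does not say which algorithm or library the backend uses, and `create_token` / `verify_token` are hypothetical names. Production code would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(subject: str, secret: str, ttl_seconds: int = 3600, now=None) -> str:
    """Build header.payload.signature with an expiry claim."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": subject, "exp": now + ttl_seconds}).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def verify_token(token: str, secret: str, now=None) -> dict:
    """Return the claims if the signature matches and the token has not expired."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, _sign(header + "." + payload, secret)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload))
    now = int(time.time()) if now is None else now
    if claims["exp"] < now:
        raise ValueError("token expired")
    return claims
```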

AI & Machine Learning

  • LangChain Core - Framework for building LLM applications
  • OpenAI API - GPT-4o-mini for text generation and text-embedding-3-small for embeddings
  • Google Generative AI - Gemini 2.0 Flash integration
  • ChromaDB - Vector database for semantic search
  • NumPy - Numerical computing for vector operations

Development & Deployment

  • UV - Fast Python package manager and environment management
  • Docker - Containerization for PostgreSQL and development
  • Render - Cloud deployment platform
  • Git - Version control with GitHub integration

📦 Quick Start

Prerequisites

  • Python 3.9+ - Backend runtime
  • Node.js 18+ - Frontend runtime
  • PostgreSQL 12+ - Database server
  • Docker - Optional, for easy database setup

1. Database Setup

Option A: Docker (Recommended)

# Create PostgreSQL container
docker run --name vocabuilder-postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=vocabuilder \
  -p 15432:5432 -d postgres:15

# Optional: Add Adminer for database management
docker run --name adminer \
  --network host -d adminer
# Access at http://localhost:8080

Option B: Local PostgreSQL

# Install PostgreSQL (Ubuntu/Debian)
sudo apt install postgresql postgresql-contrib

# Create database
sudo -u postgres createdb vocabuilder

# Or on macOS with Homebrew
brew install postgresql
brew services start postgresql
createdb vocabuilder

2. Backend Setup

# Navigate to backend directory
cd Backend

# Install UV (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Create environment file
cp .env.example .env
# Edit .env with your configuration

# Apply database migrations
./scripts/migrations/db.sh migrate apply

# Start development server
uv run uvicorn main:app --reload

3. Frontend Setup

# Navigate to frontend directory (from the repository root)
cd Frontend

# Install dependencies
npm install

# Create environment file
cp .env.example .env
# Edit .env with your configuration

# Start development server
npm run dev

4. Access Your Application

  • Frontend: the local URL printed by npm run dev (Vite's default is http://localhost:5173)
  • Backend API: http://localhost:8000 (the PORT value in the backend .env)
  • API documentation: http://localhost:8000/docs (FastAPI's default interactive docs path)

🔧 Environment Variables

Backend (.env)

# Database (port 15432 matches the Docker setup above; a local PostgreSQL install listens on 5432 by default)
DATABASE_URL=postgresql://postgres:postgres@localhost:15432/vocabuilder

# Authentication
JWT_SECRET=your-super-secret-jwt-key-here

# AI APIs
OPENAI_API_KEY=your-openai-api-key
GEMINI_API_KEY=your-google-gemini-api-key

# Server
PORT=8000
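
A backend typically reads these values once at startup and fails fast when one is missing. The following is a minimal sketch of that idea using only the standard library. `Settings` and `load_settings` are hypothetical names, and treating both API keys as required is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    jwt_secret: str
    openai_api_key: str
    gemini_api_key: str
    port: int = 8000


# Assumption: every variable except PORT must be present.
REQUIRED = ("DATABASE_URL", "JWT_SECRET", "OPENAI_API_KEY", "GEMINI_API_KEY")


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), validating required keys."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        jwt_secret=env["JWT_SECRET"],
        openai_api_key=env["OPENAI_API_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        port=int(env.get("PORT", "8000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.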

Frontend (.env)

# API Configuration
VITE_API_URL=http://localhost:8000

# Google OAuth (optional)
VITE_GOOGLE_CLIENT_ID=your-google-oauth-client-id

🗄️ Database Schema

Core Tables & Relationships

The Vocabuilder database uses a normalized structure with four main tables and their relationships:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│      User       │    │      List       │    │      Word       │
│                 │    │                 │    │                 │
│ id (PK)         │◄──── user_id (FK)    │    │ id (PK)         │
│ username        │    │ id (PK)         │◄──── list_id (FK)    │
│ email           │    │ name            │    │ user_id (FK)    │
│ hashed_password │    │ description     │    │ dictionary_id   │
│ google_id       │    │ created_at      │    │ learned         │
│ is_active       │    │ updated_at      │    │                 │
│ created_at      │    └─────────────────┘    └─────────────────┘
│ updated_at      │                                    │
└─────────────────┘                                    │
         │                                             │
         │                                             │
         └─────────────────────────────────────────────┤
                                                       │
                                                       ▼
                                              ┌─────────────────┐
                                              │   Dictionary    │
                                              │                 │
                                              │ id (PK)         │
                                              │ word            │
                                              │ synonyms        │
                                              │ antonyms        │
                                              │ meanings        │
                                              │ examples        │
                                              │ embeddings      │
                                              │ created_at      │
                                              │ updated_at      │
                                              └─────────────────┘

  • User: Authentication and user management
  • List: Custom word categories for users
  • Word: Links users to dictionary entries with learning status
  • Dictionary: Shared word information and AI-generated content
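
The relationships between the four tables can be tried out with an in-memory SQLite database. This is only a sketch: the project itself uses PostgreSQL with SQLModel, the column types and constraints below are guesses, and only a subset of the columns is included.

```python
import sqlite3

# Simplified schema; types and constraints are assumptions for illustration.
SCHEMA = """
CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
CREATE TABLE list (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),
    name TEXT
);
CREATE TABLE dictionary (id INTEGER PRIMARY KEY, word TEXT UNIQUE, meanings TEXT);
CREATE TABLE word (
    id INTEGER PRIMARY KEY,
    list_id INTEGER REFERENCES list(id),
    user_id INTEGER NOT NULL REFERENCES user(id),
    dictionary_id INTEGER NOT NULL REFERENCES dictionary(id),
    learned INTEGER NOT NULL DEFAULT 0
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the tables and insert one user, list, dictionary entry, and word."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO user (id, username) VALUES (1, 'ada')")
    conn.execute("INSERT INTO list (id, user_id, name) VALUES (1, 1, 'GRE set 1')")
    conn.execute(
        "INSERT INTO dictionary (id, word, meanings) VALUES (1, 'laconic', 'using few words')"
    )
    conn.execute(
        "INSERT INTO word (list_id, user_id, dictionary_id, learned) VALUES (1, 1, 1, 1)"
    )
    return conn


def words_for_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Join a user's Word rows to the shared Dictionary entries."""
    rows = conn.execute(
        "SELECT d.word, w.learned FROM word w "
        "JOIN dictionary d ON d.id = w.dictionary_id WHERE w.user_id = ?",
        (user_id,),
    )
    return [(word, bool(learned)) for word, learned in rows]
```

The join shows why Dictionary is shared: the AI-generated content is stored once per word, while each user's Word row only records ownership, list membership, and learning status.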

📱 Application Pages & Features

1. Home Dashboard (/)

  • Quick Add Word: Instant word addition with AI validation
  • Learning Progress: Visual overview of learned vs. unlearned words
  • Recent Activity: Track your vocabulary building journey
  • Quick Actions: Access to word generator and story creator

2. Word Generator (/generator)

  • AI-Powered Analysis: Generate comprehensive word information
  • Model Selection: Choose between OpenAI GPT and Google Gemini
  • GRE-Focused Results: Prioritizes advanced, test-relevant meanings
  • Real-time Generation: Live AI responses with loading states

3. Story Generator (/story)

  • Interactive Word Selection: Choose multiple words from your vocabulary
  • Context-Aware Stories: AI generates stories using words in proper GRE context
  • Learning-Focused Content: Simple language with sophisticated vocabulary usage
  • Word Meaning Explanations: Detailed breakdown of how each word was used

4. Word Lists (/lists)

  • Custom Organization: Create themed lists for focused learning
  • Progress Tracking: Monitor learning progress within each list
  • Bulk Operations: Manage multiple words efficiently
  • Similar Word Discovery: Find related words using semantic search
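
Progress tracking within a list boils down to counting learned words. A minimal sketch follows, assuming each word is a dictionary with a boolean `learned` field as in the Word table; `learning_progress` is a hypothetical helper name.

```python
def learning_progress(words: list[dict]) -> dict:
    """Summarise how many words in a list are marked as learned."""
    total = len(words)
    learned = sum(1 for word in words if word.get("learned"))
    # Avoid dividing by zero for an empty list.
    percent = round(100 * learned / total, 1) if total else 0.0
    return {"total": total, "learned": learned, "percent": percent}
```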

5. Word Management (/)

  • Comprehensive View: All your vocabulary words in one place
  • Learning Status: Mark words as learned/unlearned
  • List Assignment: Organize words into custom categories
  • Search & Filter: Find words quickly with advanced filtering

🔍 AI Features Deep Dive

Word Information Generation

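# LangChain-based prompt engineering
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

WORD_INFO_PROMPT = ChatPromptTemplate.from_messages([
    ("system", "You are a specialized GRE vocabulary assistant..."),
    ("user", "VALIDATION: First, determine if '{word}' is a real English word...")
])

# Structured output parsing: prompt -> LLM -> JSON parser
# (llm is the chat model selected by the user; run this inside an async function)
chain = WORD_INFO_PROMPT | llm | JsonOutputParser()
result = await chain.ainvoke({"word": word})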

Story Generation with Context

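# Context-aware story creation
from langchain_core.prompts import ChatPromptTemplate

STORY_PROMPT = ChatPromptTemplate.from_messages([
    ("system", "You are a creative storyteller who helps people learn GRE vocabulary..."),
    ("user", "Create an engaging story using these vocabulary words: {words}...")
])

# No parser needed for creative text: the chain returns the model's message,
# and the story text is in result.content (run this inside an async function)
chain = STORY_PROMPT | llm
result = await chain.ainvoke({"words": words})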
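
Because the story is free text, it can be useful to check afterwards that every requested word actually appears in it. The helper below is a hypothetical post-check, not something the README describes; it matches case-insensitively and tolerates simple suffixes such as "laconically" for "laconic".

```python
import re


def missing_words(story: str, words: list[str]) -> list[str]:
    """Return the vocabulary words that never appear in the story."""
    missing = []
    for word in words:
        # Word boundary at the start, optional trailing letters for inflections.
        pattern = r"\b" + re.escape(word) + r"\w*"
        if not re.search(pattern, story, flags=re.IGNORECASE):
            missing.append(word)
    return missing
```

If the returned list is not empty, the application could regenerate the story or tell the user which words were skipped.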

Semantic Similarity Search

# Vector-based word similarity
class VectorService:
    def find_similar_words(self, query_string: str, top_n: int = 5):
        query_embedding = self.get_embedding(query_string)
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=top_n,
            include=["metadatas", "distances"]
        )
        return self.format_results(results)
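
The `format_results` helper is not shown in this README. A possible shape is sketched below, assuming the query result is a dictionary with one inner list per query embedding under "ids", "metadatas", and "distances", and that each metadata entry stores the word under a "word" key (that key is an assumption).

```python
def format_results(results: dict) -> list[dict]:
    """Flatten the hits for the first query into a list of dicts, nearest first."""
    ids = results["ids"][0]
    metadatas = results["metadatas"][0]
    distances = results["distances"][0]
    hits = [
        {"id": item_id, "word": (meta or {}).get("word"), "distance": dist}
        for item_id, meta, dist in zip(ids, metadatas, distances)
    ]
    return sorted(hits, key=lambda hit: hit["distance"])


# Hand-written sample in the assumed result shape, for trying the function out.
SAMPLE = {
    "ids": [["w2", "w1"]],
    "metadatas": [[{"word": "joyful"}, {"word": "happy"}]],
    "distances": [[0.31, 0.12]],
}
```

With `SAMPLE`, the entry for "happy" comes first because it has the smaller distance.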

🚀 Development Workflow

Backend Development

# Database migrations
./scripts/migrations/db.sh migrate generate "add new feature"
./scripts/migrations/db.sh migrate apply

# Code quality
uv run black .
uv run isort .
uv run flake8 .

# Testing
uv run pytest

# Development server
uv run uvicorn main:app --reload

Frontend Development

# Development server
npm run dev

# Build for production
npm run build

# Preview production build
npm run preview

# Code quality
npm run lint
npm run format

Database Management

# Open database shell
./scripts/migrations/db.sh shell

# Check migration status
./scripts/migrations/db.sh migrate status

# Reset database (development only)
./scripts/migrations/db.sh reset

🔄 Database Migrations & Schema Changes

Migration Workflow

When you make changes to database models, you need to generate and apply migrations:

cd Backend

# 1. Generate migration after changing models
./scripts/migrations/db.sh migrate generate "describe your changes"

# 2. Apply migration locally
./scripts/migrations/db.sh migrate apply

# 3. Test your changes
uv run uvicorn main:app --reload
