A modern web application for building vocabulary with AI-powered word analysis, learning features, and intelligent story generation. Built with cutting-edge LLM technology and modern web frameworks.
- Word Management: Add, organize, and track vocabulary words with learning progress
- List Organization: Create custom lists to categorize words by topic, difficulty, or learning goals
- Learning Progress: Mark words as learned/unlearned with visual progress tracking
- User Authentication: Secure JWT-based registration and login system
- Multi-Model AI Analysis: Choose between OpenAI GPT-4o-mini and Google Gemini 2.0 Flash for word information
- Intelligent Word Validation: AI-powered validation to ensure only real English words are added
- GRE-Focused Learning: Prioritizes advanced, GRE-relevant meanings and contexts
- Contextual Story Generation: Create engaging stories using your vocabulary words in proper context
- Semantic Similarity: Find similar words using vector embeddings and semantic search
- LangChain Integration: Modern prompt engineering with structured output parsing
- Vector Database: ChromaDB-powered semantic search for finding related words
- Embedding Generation: OpenAI embeddings for advanced word similarity analysis
- Prompt Engineering: Specialized prompts for vocabulary learning and story generation
- OpenAI GPT-4o-mini: High-quality text generation with JSON output parsing
- Google Gemini 2.0 Flash: Fast, efficient AI model for vocabulary analysis
- Model Selection: Dynamic switching between AI models based on user preference (a sketch of the switching follows this feature list)
- Temperature Control: Optimized temperature (0.7) for consistent, creative outputs
- Structured Prompts: ChatPromptTemplate-based prompts for consistent AI responses
- GRE-Focused Prompts: Specialized prompts that prioritize advanced vocabulary meanings
- Context-Aware Story Generation: Prompts that ensure vocabulary words are used in proper GRE context
- LangChain Chains: Efficient prompt → LLM → parser pipelines for structured outputs
- Word Embeddings: OpenAI text-embedding-3-small for semantic word representation
- ChromaDB Integration: Vector database for similarity search and word relationships
- Semantic Similarity: Find related words based on meaning, not just spelling
- Context-Aware Search: Search within specific word lists for targeted learning
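The model switching mentioned above boils down to constructing the right LangChain chat model for the user's choice. Below is a minimal sketch assuming the `langchain-openai` and `langchain-google-genai` packages; the helper name and plumbing are illustrative, not taken from the codebase:

```python
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI

def get_llm(provider: str):
    """Build the chat model for the user's preference (temperature 0.7, as noted above)."""
    if provider == "gemini":
        return ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0.7)
    # Default to OpenAI's GPT-4o-mini
    return ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
```

Either model can then be dropped into the same prompt | llm | parser chains shown later in this document.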
- React 18 - Modern UI framework with hooks and concurrent features
- TypeScript - Full type safety and enhanced developer experience
- Vite - Lightning-fast build tool and development server
- Tailwind CSS - Utility-first CSS framework for responsive design
- ShadCN UI - Beautiful, accessible component library
- TanStack Query - Efficient data fetching, caching, and synchronization
- React Router - Client-side routing with protected routes
- Lucide React - Beautiful, consistent icon library
- FastAPI - Modern, fast Python web framework with automatic API documentation
- SQLModel - SQL database integration with Pydantic models
- PostgreSQL - Robust, scalable relational database
- Alembic - Database migration management
- JWT Authentication - Secure token-based authentication system
- CORS Middleware - Cross-origin resource sharing support
- LangChain Core - Framework for building LLM applications
- OpenAI API - GPT-4o-mini for text generation, text-embedding-3-small for embeddings
- Google Generative AI - Gemini 2.0 Flash integration
- ChromaDB - Vector database for semantic search
- NumPy - Numerical computing for vector operations
- UV - Fast Python package and environment manager
- Docker - Containerization for PostgreSQL and development
- Render - Cloud deployment platform
- Git - Version control with GitHub integration
- Python 3.9+ - Backend runtime
- Node.js 18+ - Frontend runtime
- PostgreSQL 12+ - Database server
- Docker - Optional, for easy database setup
# Create PostgreSQL container
docker run --name vocabuilder-postgres \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=vocabuilder \
-p 15432:5432 -d postgres:15
# Optional: Add Adminer for database management
docker run --name adminer \
--network host -d adminer
# Access at http://localhost:8080

# Install PostgreSQL (Ubuntu/Debian)
sudo apt install postgresql postgresql-contrib
# Create database
sudo -u postgres createdb vocabuilder
# Or on macOS with Homebrew
brew install postgresql
brew services start postgresql
createdb vocabuilder

# Navigate to backend directory
cd Backend
# Install UV (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync
# Create environment file
cp .env.example .env
# Edit .env with your configuration
# Apply database migrations
./scripts/migrations/db.sh migrate apply
# Start development server
uv run uvicorn main:app --reload

# Navigate to frontend directory
cd Frontend
# Install dependencies
npm install
# Create environment file
cp .env.example .env
# Edit .env with your configuration
# Start development server
npm run dev

- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Database UI: http://localhost:8080 (if using Adminer)
# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:15432/vocabuilder
# Authentication
JWT_SECRET=your-super-secret-jwt-key-here
# AI APIs
OPENAI_API_KEY=your-openai-api-key
GEMINI_API_KEY=your-google-gemini-api-key
# Server
PORT=8000
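A minimal sketch of how the backend might load these variables, assuming `pydantic-settings` (the configuration module and field names are not confirmed by the codebase):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Illustrative only: fields mirror the variables above (matched case-insensitively)."""
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    jwt_secret: str
    openai_api_key: str
    gemini_api_key: str
    port: int = 8000

settings = Settings()
```

# API Configuration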
VITE_API_URL=http://localhost:8000
# Google OAuth (optional)
VITE_GOOGLE_CLIENT_ID=your-google-oauth-client-id

The Vocabuilder database uses a normalized structure with four main tables and their relationships:
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│       User       │     │       List       │     │       Word       │
│                  │     │                  │     │                  │
│ id (PK)          ├─────┤ user_id (FK)     │     │ id (PK)          │
│ username         │     │ id (PK)          ├─────┤ list_id (FK)     │
│ email            │     │ name             │     │ user_id (FK)     │
│ hashed_password  │     │ description      │     │ dictionary_id    │
│ google_id        │     │ created_at       │     │ learned          │
│ is_active        │     │ updated_at       │     │                  │
│ created_at       │     └──────────────────┘     └────────┬─────────┘
│ updated_at       │                                       │
└──────────────────┘                                       ▼
                                                  ┌──────────────────┐
                                                  │    Dictionary    │
                                                  │                  │
                                                  │ id (PK)          │
                                                  │ word             │
                                                  │ synonyms         │
                                                  │ antonyms         │
                                                  │ meanings         │
                                                  │ examples         │
                                                  │ embeddings       │
                                                  │ created_at       │
                                                  │ updated_at       │
                                                  └──────────────────┘
- User: Authentication and user management
- List: Custom word categories for users
- Word: Links users to dictionary entries with learning status
- Dictionary: Shared word information and AI-generated content
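As a rough sketch, these tables map onto SQLModel classes along the following lines. The field types are simplified and not copied from the codebase; the real models likely include timestamps, JSON columns, and relationship definitions:

```python
from typing import Optional
from sqlmodel import Field, SQLModel

class User(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    username: str
    email: str
    hashed_password: str

class List(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    user_id: int = Field(foreign_key="user.id")
    name: str
    description: Optional[str] = None

class Dictionary(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    word: str
    synonyms: Optional[str] = None
    antonyms: Optional[str] = None
    meanings: Optional[str] = None
    examples: Optional[str] = None

class Word(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    list_id: Optional[int] = Field(default=None, foreign_key="list.id")
    user_id: int = Field(foreign_key="user.id")
    dictionary_id: int = Field(foreign_key="dictionary.id")
    learned: bool = False
```

Keeping Dictionary separate lets many users' Word rows share one AI-generated entry per word, which is what the "normalized structure" above refers to.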
- Quick Add Word: Instant word addition with AI validation
- Learning Progress: Visual overview of learned vs. unlearned words
- Recent Activity: Track your vocabulary building journey
- Quick Actions: Access to word generator and story creator
- AI-Powered Analysis: Generate comprehensive word information
- Model Selection: Choose between OpenAI GPT and Google Gemini
- GRE-Focused Results: Prioritizes advanced, test-relevant meanings
- Real-time Generation: Live AI responses with loading states
- Interactive Word Selection: Choose multiple words from your vocabulary
- Context-Aware Stories: AI generates stories using words in proper GRE context
- Learning-Focused Content: Simple language with sophisticated vocabulary usage
- Word Meaning Explanations: Detailed breakdown of how each word was used
- Custom Organization: Create themed lists for focused learning
- Progress Tracking: Monitor learning progress within each list
- Bulk Operations: Manage multiple words efficiently
- Similar Word Discovery: Find related words using semantic search
- Comprehensive View: All your vocabulary words in one place
- Learning Status: Mark words as learned/unlearned
- List Assignment: Organize words into custom categories
- Search & Filter: Find words quickly with advanced filtering
# LangChain-based prompt engineering
WORD_INFO_PROMPT = ChatPromptTemplate.from_messages([
    ("system", "You are a specialized GRE vocabulary assistant..."),
    ("user", "VALIDATION: First, determine if '{word}' is a real English word...")
])

# Structured output parsing (prompt -> LLM -> JSON parser)
chain = WORD_INFO_PROMPT | llm | JsonOutputParser()
result = await chain.ainvoke({"word": word})

# Context-aware story creation
STORY_PROMPT = ChatPromptTemplate.from_messages([
    ("system", "You are a creative storyteller who helps people learn GRE vocabulary..."),
    ("user", "Create an engaging story using these vocabulary words: {words}...")
])

# No parser needed for creative text
chain = STORY_PROMPT | llm
result = await chain.ainvoke({"words": words})

# Vector-based word similarity
class VectorService:
    def find_similar_words(self, query_string: str, top_n: int = 5):
        # Embed the query and look up the closest entries in ChromaDB
        query_embedding = self.get_embedding(query_string)
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=top_n,
            include=["metadatas", "distances"]
        )
        return self.format_results(results)
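For context, the indexing side that find_similar_words relies on might look like the sketch below, assuming OpenAI's text-embedding-3-small and a persistent ChromaDB collection as described above. The client setup and collection name are assumptions, not taken from the codebase:

```python
import chromadb
from openai import OpenAI

# Illustrative wiring for the embedding + indexing side.
openai_client = OpenAI()
chroma_client = chromadb.PersistentClient(path="./chroma")
collection = chroma_client.get_or_create_collection("words")

def get_embedding(text: str) -> list[float]:
    # text-embedding-3-small, as listed in the AI/ML stack above
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

# Index a word so it can later be found by find_similar_words
collection.add(
    ids=["ephemeral"],
    embeddings=[get_embedding("ephemeral")],
    metadatas=[{"word": "ephemeral"}],
)
```

# Database migrations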
./scripts/migrations/db.sh migrate generate "add new feature"
./scripts/migrations/db.sh migrate apply
# Code quality
uv run black .
uv run isort .
uv run flake8 .
# Testing
uv run pytest
# Development server
uv run uvicorn main:app --reload

# Development server
npm run dev
# Build for production
npm run build
# Preview production build
npm run preview
# Code quality
npm run lint
npm run format

# Open database shell
./scripts/migrations/db.sh shell
# Check migration status
./scripts/migrations/db.sh migrate status
# Reset database (development only)
./scripts/migrations/db.sh reset

When you make changes to database models, you need to generate and apply migrations:
cd Backend
# 1. Generate migration after changing models
./scripts/migrations/db.sh migrate generate "describe your changes"
# 2. Apply migration locally
./scripts/migrations/db.sh migrate apply
# 3. Test your changes
uv run uvicorn main:app --reload
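For example, a hypothetical model change that would trigger this workflow (the new column is purely illustrative):

```python
from typing import Optional
from sqlmodel import Field, SQLModel

class Dictionary(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    word: str
    # ... existing columns ...
    # Hypothetical new column; adding it is the kind of change that
    # requires generating and applying a migration with the commands above.
    pronunciation: Optional[str] = None
```

The generate step diffs the updated models against the current database schema and writes a migration script, and the apply step runs it; the db.sh helper presumably wraps Alembic, which is listed in the backend stack.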