A complete implementation of a Shakespeare-style chatbot using nanoGPT, trained on Shakespeare's complete works. This project includes Docker containerization, a web-based chat interface, and automated setup via Makefile.
```bash
# Clone and setup
git clone https://github.com/kumogire/shakespeare-chatbot.git
cd shakespeare-chatbot

# Build and run with Docker
make docker-build
make docker-run

# Access the chat interface at http://localhost:8080
```

```bash
# Setup virtual environment and dependencies
make setup

# Download and prepare data
make prepare-data

# Train the model (takes ~30 min with GPU, 2-3 hours with CPU)
make train

# Start the chat interface
make chat
```

```
shakespeare-chatbot/
├── README.md               # This file
├── Makefile                # Automation commands
├── Dockerfile              # Container configuration
├── docker-compose.yml      # Multi-service orchestration
├── requirements.txt        # Python dependencies
├── src/
│   ├── train.py            # Model training script
│   ├── chat_interface.py   # Web-based chat interface
│   ├── model_utils.py      # Model loading/generation utilities
│   └── prepare_data.py     # Data preparation script
├── config/
│   └── train_config.py     # Training configuration
├── data/
│   └── shakespeare/        # Training data (auto-downloaded)
├── models/
│   └── shakespeare_model/  # Saved model checkpoints
├── static/
│   ├── style.css           # Web interface styling
│   └── script.js           # Frontend JavaScript
└── templates/
    └── chat.html           # Chat interface template
```
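For a feel of what the generation utilities in `src/model_utils.py` do, here is a rough, self-contained sketch of temperature and top-k sampling in pure Python. The function name and defaults are illustrative assumptions, not the project's actual code:

```python
import math
import random

def sample_next(logits, temperature=0.8, top_k=5, rng=None):
    """Pick the next token id from raw logits using temperature + top-k sampling.

    Hypothetical helper: scales logits by temperature, keeps only the top_k
    candidates, and samples from the resulting softmax distribution.
    """
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    # Indices of the top_k highest-scoring tokens
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    weights = [math.exp(scaled[i]) for i in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(top, weights=probs, k=1)[0]

# With top_k=2, only the two highest-logit tokens can ever be chosen
print(sample_next([2.0, 0.5, -1.0, 0.1], top_k=2))
```

Lowering `temperature` sharpens the distribution toward the argmax; raising it makes output more varied (and, for Shakespeare, more chaotic).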
- Python 3.8+
- pip
- (Optional) CUDA-capable GPU for faster training
- Docker
- Docker Compose (optional)
| Command | Description |
|---|---|
| `make setup` | Create virtual environment and install dependencies |
| `make prepare-data` | Download and prepare Shakespeare dataset |
| `make train` | Train the nanoGPT model on Shakespeare |
| `make train-cpu` | Train using CPU only (slower but works everywhere) |
| `make chat` | Start the web-based chat interface |
| `make sample` | Generate sample text from trained model |
| `make clean` | Clean up generated files and cache |
| `make docker-build` | Build Docker container |
| `make docker-run` | Run the complete application in Docker |
| `make docker-stop` | Stop Docker containers |
| `make test` | Run basic functionality tests |
```bash
git clone https://github.com/kumogire/shakespeare-chatbot.git
cd shakespeare-chatbot

# Build the container (includes model training)
make docker-build

# Run the application
make docker-run

# Open your browser to http://localhost:8080
```

The Docker build will automatically:
- Set up the Python environment
- Download Shakespeare's works
- Train the model
- Start the web interface
```bash
# 1. Setup environment
make setup

# 2. Prepare training data
make prepare-data

# 3. Train the model
make train  # or 'make train-cpu' for CPU-only

# 4. Start chat interface
make chat
```

- Open your browser to http://localhost:8080
- Type messages in Shakespearean style
- See the AI respond in Shakespeare's voice!
Example conversation:

```
You: "How fares thee this day?"
Bot: "Fair sir, mine spirits are lifted high, as doth the lark at break of day sing sweetest melodies..."
```
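Under the hood, the chat interface presumably wraps each message in a dialogue-style prompt so the model continues it, then trims the raw completion down to a single reply. A minimal sketch of that flow (function names are hypothetical, not the project's actual API):

```python
def build_prompt(user_message: str, speaker: str = "FIRST CITIZEN") -> str:
    """Frame the user's message as a line of dialogue for the model to continue."""
    return f"{speaker}:\n{user_message.strip()}\n\n"

def extract_reply(completion: str) -> str:
    """Keep only the first paragraph of the completion as the bot's reply."""
    return completion.strip().split("\n\n")[0]

prompt = build_prompt("How fares thee this day?")
# A raw completion typically runs on into the next speaker's lines:
raw = ("Fair sir, mine spirits are lifted high,\n"
       "as doth the lark at break of day.\n\n"
       "SECOND CITIZEN:\nSpeak on.")
print(extract_reply(raw))
```

Cutting at the first blank line keeps the bot from speaking other characters' parts, a common trick when sampling from play-formatted training text.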
Edit `config/train_config.py` to customize:

```python
# Model size
n_layer = 6          # Number of transformer layers
n_head = 6           # Number of attention heads
n_embd = 384         # Embedding dimension

# Training
max_iters = 5000     # Training iterations
batch_size = 64      # Batch size
learning_rate = 1e-3

# Hardware
device = 'cuda'      # 'cuda' for GPU, 'cpu' for CPU only
```

- Port: Default 8080 (change in `src/chat_interface.py`)
- Styling: Modify `static/style.css`
- Behavior: Edit `static/script.js`
```bash
# Test data preparation
make test-data

# Test model loading
make test-model

# Generate sample text
make sample
```

Approximate training times:

- GPU (RTX 3080): ~20-30 minutes
- GPU (GTX 1060): ~45-60 minutes
- CPU (Modern i7): ~2-3 hours
- CPU (Older hardware): ~4-6 hours
Watch for decreasing loss values:

```
step 0:    train loss 4.278, val loss 4.277  # Random text
step 1000: train loss 1.892, val loss 1.945  # Learning words
step 3000: train loss 1.234, val loss 1.456  # Learning grammar
step 5000: train loss 0.823, val loss 1.234  # Coherent Shakespeare!
```
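These losses are average cross-entropy in nats per character, so `exp(loss)` gives per-character perplexity. Note that the starting loss sits near ln(65) ≈ 4.17, the expected value for uniform random guessing over a 65-character vocabulary (65 is an assumption based on nanoGPT's character-level Shakespeare config):

```python
import math

# Convert the validation losses from the run above into per-character perplexity
for step, val_loss in [(0, 4.277), (1000, 1.945), (3000, 1.456), (5000, 1.234)]:
    print(f"step {step}: perplexity {math.exp(val_loss):.1f}")

# A loss of ln(vocab_size) corresponds to random guessing over the vocabulary
print(f"random baseline for 65 chars: {math.log(65):.2f}")
```

Perplexity falling from ~72 toward ~3.4 means the model goes from guessing among all characters to being confident among only a handful at each step.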
- Parameters: ~10M (much smaller than GPT-3's 175B!)
- File Size: ~40MB
- RAM Usage: ~500MB during inference
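The ~10M parameter figure can be sanity-checked from the config values (`n_layer = 6`, `n_embd = 384`). The arithmetic below is a back-of-the-envelope estimate; the vocabulary size (65) and context length (256) are assumptions based on nanoGPT's character-level Shakespeare setup, and the exact total depends on details like weight tying and biases:

```python
n_layer, n_embd = 6, 384
vocab_size, block_size = 65, 256  # assumed from nanoGPT's shakespeare_char config

# Each transformer block: ~4*n_embd^2 for attention + ~8*n_embd^2 for the MLP
per_block = 12 * n_embd ** 2
embeddings = (vocab_size + block_size) * n_embd  # token + position embeddings

total = n_layer * per_block + embeddings
print(f"~{total / 1e6:.1f}M parameters")
```

At 4 bytes per float32 parameter, ~10.7M parameters is ~43MB on disk, consistent with the ~40MB file size above.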
- Base: Python 3.9 slim
- Auto-training: Model trains during container build
- Volume mounting: Persist models between container runs
- Port mapping: 8080 (host) → 8080 (container)
```bash
# Build with custom config
docker build --build-arg TRAINING_ITERS=3000 .

# Run with different port
docker run -p 5000:8080 shakespeare-chatbot

# Mount local model directory
docker run -v $(pwd)/models:/app/models shakespeare-chatbot
```

**"CUDA out of memory"**

```bash
# Use CPU training instead
make train-cpu
```

**"Port 8080 already in use"**

```bash
# Change port in src/chat_interface.py or use a different host port
docker run -p 5000:8080 shakespeare-chatbot
```

**"Model not found"**

```bash
# Ensure training completed successfully
make train

# Check for model files
ls models/shakespeare_model/
```

**Slow training on CPU**

```python
# Reduce model size in config/train_config.py
n_layer = 4       # instead of 6
max_iters = 2000  # instead of 5000
```

- Different datasets: Replace Shakespeare with your own text
- Model architecture: Adjust layers, attention heads, embedding size
- Training duration: Longer training = better quality
- Interface improvements: Add conversation history, user profiles
- Deployment: Deploy to cloud platforms (AWS, Google Cloud, etc.)
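For the "different datasets" idea, the character-level recipe nanoGPT uses for Shakespeare generalizes directly: build a character vocabulary from your corpus, encode the text to integer IDs, and split into train/val. A minimal sketch (the function name and 90/10 split are illustrative choices):

```python
def prepare_char_dataset(text: str, val_fraction: float = 0.1):
    """Encode text as integer IDs over its own character vocabulary."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
    itos = {i: ch for ch, i in stoi.items()}      # id -> char
    ids = [stoi[ch] for ch in text]
    split = int(len(ids) * (1 - val_fraction))
    return ids[:split], ids[split:], stoi, itos

train_ids, val_ids, stoi, itos = prepare_char_dataset("To be, or not to be")
print(len(stoi), len(train_ids), len(val_ids))  # vocab size, train/val lengths
```

In nanoGPT proper, the ID arrays are then written out as binary files (e.g. `train.bin`/`val.bin`) that the training script memory-maps; any plain-text corpus can be dropped into the same pipeline.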
- Conversation memory/context
- Multiple character personalities
- Fine-tuning on specific plays
- REST API endpoints
- Real-time training updates
MIT License - Feel free to use this project for learning and experimentation!
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request