Modernize NLP course with updated syllabus, slides, and assignments #1
Conversation
Major Updates:
=============

📚 Syllabus Enhancements:
- Added 40+ primary source research papers from top venues
- Integrated cognitive science and linguistics papers (Fedorenko, Schrimpf, etc.)
- Added comprehensive HuggingFace course chapter references throughout
- Updated all weeks with modern 2023-2025 papers (Self-RAG, Mixtral, Llama 3, etc.)
- Enhanced distributional semantics, neuroscience, and dialogue sections

📊 Complete Beamer Slide Decks (7 weeks):
- Week 1: Introduction, String Manipulation, ELIZA
- Week 2: Computational Linguistics, Tokenization, POS Tagging
- Weeks 3-4: Text Embeddings, LSA, LDA, Word2Vec, Modern Topic Modeling
- Weeks 5-6: Transformers, Attention, BERT, Neuroscience Perspectives
- Week 7: Models of Conversation, Pragmatics, Dialogue, Common Ground
- Week 8: GPT Evolution (GPT-1 to GPT-4), LLMs and the Brain
- Week 9: RAG, Mixture of Experts, Self-Supervised Learning, Future Directions

All slides include:
- Engaging emojis and modern design (Metropolis theme)
- Code examples with syntax highlighting
- TikZ diagrams and visualizations
- Discussion questions for undergraduates
- References to primary sources and HuggingFace resources
- Interactive learning elements

🤖 GitHub Actions Automation:
- Automatic LaTeX compilation on push
- PDF generation for all slides
- Beautiful web interface with gradient design
- GitHub Pages deployment
- One-click access to all course materials

📝 Comprehensive Assignment Updates:
- Assignment 4: Context-Aware Customer Service Chatbot
  - RAG-based implementation with semantic search
  - BERT/sentence-transformers + FAISS
  - Multi-turn conversation handling
  - Comprehensive evaluation metrics
- Assignment 5: Build and Train GPT Model
  - Implement a transformer from scratch
  - Multiple implementation paths (scratch/nanoGPT/HF)
  - Training on custom datasets
  - Attention visualization and analysis
  - Comparison with GPT-2
- Final Project: Capstone Research Project
  - 22 diverse project ideas across 8 categories
  - Clear timeline with milestones
  - Detailed grading rubric
  - Team-based (2-3 students)
  - Leverages GenAI for ambitious scope

All assignments designed for sophisticated undergraduate work with GenAI coding assistance.

Technical Details:
- All slides compilable with pdflatex/xelatex
- Automated CI/CD pipeline for slides
- Comprehensive documentation and resources
- Modern tools: HuggingFace, PyTorch, Colab-compatible
- Cutting-edge topics: RAG, MoE, multimodal learning, agents
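The Assignment 4 retrieval step (sentence-transformers + FAISS semantic search) can be sketched in miniature. This is a hedged illustration, not the assignment's actual code: the `embed` function is a hypothetical bag-of-words stand-in for a BERT encoder, the documents are made up, and plain cosine similarity stands in for a FAISS index.

```python
import math
import re

# Hypothetical stand-in for a sentence-transformers encoder: counts over a
# tiny fixed vocabulary. The real assignment uses dense BERT embeddings.
VOCAB = ["refund", "shipping", "password", "order", "reset"]

def embed(text: str) -> list[float]:
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query (the FAISS role)."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "To reset your password, click the reset link in your email.",
    "Refund requests are processed within 5 business days.",
    "Shipping takes 3-7 days depending on your region.",
]
print(retrieve("how do I reset my password", docs))
```

In the full RAG pipeline, the retrieved passage would then be prepended to the chatbot's prompt so the generated answer is grounded in it.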
Assignment Updates:
==================

📝 Assignment 1: ELIZA (113 → 520 lines)
- Added historical context and ELIZA effect psychology
- New Part 2: Analysis and Exploration (conversation testing, pattern analysis)
- New Part 3: Reflection on conversation, understanding, and ethics
- Optional extensions: advanced pattern matching, emotion tracking, hybrid systems
- Comprehensive grading rubric and resources
- Emphasis on critical thinking about AI

📊 Assignment 2: SPAM Classifier (145 → 589 lines)
- Expanded to require multiple methods: traditional ML + neural + ensemble
- Comprehensive evaluation with 6+ metrics, not just AUC
- Systematic error analysis (20+ failure cases)
- NEW: Adversarial testing (create spam that evades detection)
- Real-world deployment considerations
- Statistical rigor requirements (cross-validation, significance tests)
- Performance benchmarks: AUC > 0.85 (minimum), > 0.96 (excellent)

📈 Assignment 3: Wikipedia Embeddings (95 → 856 lines)
- Expanded from 4 to 10+ embedding methods across 5 categories
  - Classical: LSA, LDA
  - Static: Word2Vec, GloVe, FastText
  - Contextualized: BERT, GPT-2
  - Modern: Sentence-BERT, Llama 3 8B
  - Topic Models: BERTopic, Top2Vec
- Sophisticated clustering with multiple algorithms
- Quantitative metrics (silhouette, Davies-Bouldin, coherence)
- Cognitive science connection (distributional hypothesis, human judgments)
- Advanced visualization (interactive Plotly, UMAP + t-SNE)
- Optional extensions: cross-lingual, temporal analysis, applications
- 40-60 hour multi-week project

All Assignments Now Include:
- Clear learning objectives
- Detailed grading rubrics (100 points + bonus)
- Tips for success and common pitfalls
- Comprehensive resources and references
- Submission guidelines and checklists
- Academic integrity policies
- FAQ sections
- Professional structure matching Assignments 4-5

Philosophy: All assignments now balance hands-on implementation with deep analytical thinking, leveraging GenAI for ambitious scope while ensuring genuine understanding.
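As a concrete example of the quantitative clustering metrics Assignment 3 asks for, the silhouette score can be computed by hand. The sketch below is a pure-Python stand-in for `sklearn.metrics.silhouette_score`, using made-up 2-D points in place of real document embeddings.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points: s(i) = (b - a) / max(a, b), where
    a = mean distance to points in the same cluster and b = mean distance
    to the nearest other cluster. Values near 1 mean tight, well-separated
    clusters; values near 0 or below mean overlapping clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    scores = []
    for p, l in zip(points, labels):
        own = [q for q in clusters[l] if q is not p]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / len(own)
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != l
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two well-separated toy "embedding" clusters in 2-D.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels = [0, 0, 0, 1, 1, 1]
print(round(silhouette_score(points, labels), 3))
```

Students comparing LSA, Word2Vec, and BERT clusterings would run the library version of this metric on each method's embeddings and report the scores side by side.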
Pull request overview
This PR modernizes an NLP course with comprehensive updates, including new Beamer slide decks covering seven weeks of content that span from introductory ELIZA chatbots through advanced topics such as Transformers, GPT evolution, retrieval-augmented generation (RAG), and Mixture of Experts. The materials include engaging emojis, code examples with syntax highlighting, TikZ diagrams, discussion questions for undergraduates, and references to primary sources and HuggingFace resources.
Key Changes:
- Complete slide decks for weeks 1, 5-6, 7, 8, and 9
- Modern design with Metropolis/Madrid themes and extensive visualizations
- Integration of cognitive neuroscience perspectives on language
- Comprehensive code examples in Python using HuggingFace transformers
- Assignment descriptions and pedagogical materials
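Since the weeks 5-6 decks center on attention mechanisms and transformers, the core computation can be sketched in a few lines of NumPy. This is an illustrative implementation of scaled dot-product attention with random toy matrices, not code taken from the course materials.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 query vectors, d_k = 4
K = rng.normal(size=(5, 4))   # 5 key vectors
V = rng.normal(size=(5, 4))   # 5 value vectors
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)              # one d_k-dimensional output per query
```

The attention-weight matrix `w` is exactly what Assignment 5's visualization exercise plots as a heatmap.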
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| slides/week9/lecture.tex | Week 9 slides covering RAG, MoE, self-supervised learning, CLIP, and future directions with final project guidelines |
| slides/week8/lecture.tex | Week 8 slides on GPT evolution (GPT-1 through GPT-4), open-source LLMs, brain-language model convergence, and Turing test discussion |
| slides/week7/lecture.tex | Week 7 slides on conversation models, pragmatics, dialogue, common ground, and embodied language with ConvoKit demo |
| slides/week5-6/lecture.tex | Weeks 5-6 slides covering sequence-to-sequence models, attention mechanisms, transformers, BERT, and cognitive neuroscience perspectives |
| slides/week1/lecture.tex | Week 1 introduction slides covering consciousness, language vs. thought, pattern matching, ELIZA, and Assignment 1 |
| slides/week1/README.md | Documentation for week 1 slides with compilation instructions and presentation tips |
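Week 9's Mixture of Experts material can be illustrated with a minimal sparse-routing sketch. Everything here is hypothetical: the experts are simple linear maps standing in for feed-forward blocks, and top-k gating with renormalized softmax weights follows the general Mixtral-style recipe rather than any specific course code.

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Sparse MoE layer: route input x to its top-k experts and combine
    their outputs, weighted by softmax over the selected gate logits."""
    logits = x @ gate_w                    # (n_experts,) gate scores
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    g = np.exp(logits[top] - logits[top].max())
    g /= g.sum()                           # renormalize over chosen experts
    return sum(w * experts[i](x) for w, i in zip(g, top))

rng = np.random.default_rng(1)
d, n_experts = 4, 8
# Hypothetical experts: random linear maps in place of trained FFN blocks.
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, M=M: x @ M for M in mats]
gate_w = rng.normal(size=(d, n_experts))

x = rng.normal(size=d)
y = moe_forward(x, experts, gate_w, k=2)
print(y.shape)
```

The design point the slides make is visible here: only k of the n_experts blocks run per input, so capacity grows with the expert count while per-token compute stays roughly constant.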
- Updated all assignments to 1-week timelines (except Assignment 3: 2 weeks; Final Project: 4 weeks)
- Added comprehensive day-by-day syllabus schedule with:
  * MWF 10:00-11:05 lecture times
  * X-hours (Tu 12:00-12:50) in the first 3 weeks
  * MLK Day and instructor absence noted
  * Topics, readings, slide links, and assignment deadlines for each class
- Updated slides README with detailed lecture-by-lecture breakdown
- All assignments now include daily schedules optimized for GenAI assistance
Major updates:

Schedule corrections:
- Update meeting times to MWF 10:10-11:15 (was 10:00-11:05)
- Update X-hour to Thursday 12:15-1:05 (was Tuesday 12:00-12:50)
- Update syllabus with corrected Thursday X-hour dates

Individual lecture slides (24 total):
- Split combined lecture files into individual decks for each lecture
- Week 1: Lectures 1-3 (Introduction, Pattern Matching, ELIZA)
- Week 2: Lectures 4-6 (Data Cleaning, Tokenization, POS/Sentiment)
- Week 3: Lectures 7-8 (Classic Embeddings, Word Embeddings)
- Week 4: Lectures 9-11 (Contextual Embeddings, Dimensionality Reduction, Cognitive Models)
- Week 5: Lectures 12-14 (Attention, Transformers, Training)
- Week 6: Lectures 15-17 (BERT Deep Dive, Variants, Applications)
- Week 7: Lectures 18-20 (GPT Architecture, Scaling, Implementation)
- Week 9: Lectures 21-23 (RAG, MoE, Ethics)
- Week 10: Lecture 24 (Final Project Work Session)
- All slides use the Metropolis theme with emojis, TikZ diagrams, and discussion questions

X-hour demo notebooks (3 total):
- Week 1: ELIZA implementation and debugging workshop
- Week 2: Text classification with multiple methods
- Week 3: Embeddings comparison (LSA, LDA, Word2Vec, visualization)
- All notebooks are Google Colab-ready with hands-on exercises

Additional materials:
- Week 10 presentation guidelines (comprehensive Markdown guide)
- Removed Week 8 lecture slides (no classes during instructor absence)

All materials ready for Winter 2026 term.
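The Week 1 X-hour ELIZA workshop centers on regex pattern matching with pronoun reflection. A minimal sketch of that technique follows; the rules and reflection table are hypothetical examples in the spirit of Weizenbaum's original, not the course's actual implementation.

```python
import re

# Hypothetical rules: each pattern captures part of the user's input
# and the template echoes it back as a question.
RULES = [
    (r"i need (.*)", "Why do you need {0}?"),
    (r"i am (.*)", "How long have you been {0}?"),
    (r"my (.*)", "Tell me more about your {0}."),
]

# Flip first/second person in the captured text before echoing it back.
REFLECTIONS = {"my": "your", "your": "my", "i": "you", "me": "you", "am": "are"}

def reflect(text: str) -> str:
    return " ".join(REFLECTIONS.get(w, w) for w in text.lower().split())

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        m = re.match(pattern, utterance.lower().strip())
        if m:
            return template.format(reflect(m.group(1)))
    return "Please go on."  # default when no rule fires

print(respond("I am worried about my exam"))
```

A debugging exercise in this style might ask students to find inputs where naive reflection fails, which leads naturally into the reflection questions in Assignment 1.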
- Update classroom from TBD to Moore 302
- Correct course end date to March 9, 2026 (last day of classes)
- Update Week 10 schedule: presentations on March 9
- Final project materials due March 13 (final exam period)