# Chapter 0 Slides

## jxnl.co

@jxnlco

## Systematically Improving RAG Applications

**Session 0:** Beyond Implementation to Improvement: A Product Mindset for RAG

Jason Liu

---

## Welcome to the Course

**Instructor:** Jason Liu - AI/ML Consultant & Educator

**Mission:** Dismantle guesswork in AI development and replace it with structured, measurable, and repeatable processes.

**Your Commitment:**
- Stick with the material
- Have conversations with teammates
- Make time to look at your data
- Instrument your systems
- Ask yourself: "What work am I trying to do?"

---

## Who Am I?

**Background:** Computer Vision, Computational Mathematics, Mathematical Physics (University of Waterloo)

**Facebook:** Content Policy, Moderation, Public Risk & Safety
- Built dashboards and RAG applications to identify harmful content
- Computational social science applications

**Stitch Fix:** Computer Vision, Multimodal Retrieval
- Variational autoencoders and GANs for GenAI
- **$50M revenue impact** from recommendation systems
- $400K annual data curation budget
- Hundreds of millions of recommendations/week

---

## Current Focus

**Why Consulting vs Building?**
- Hand injury in 2021-2022 limited typing
- Highest leverage: advising startups and education
- Helping others build while my hands recover

**Client Experience:**
- HubSpot, Zapier, Limitless, and many others
- Personal assistants, construction AI, research tools
- Query understanding, prompt optimization, embedding search
- Fine-tuning, MLOps, and observability

---

## Who Are You?

**Cohort Composition:**
- **30%** Founders and CTOs
- **20%** Senior Engineers
- **50%** Software Engineers, Data Scientists, PMs, Solution Engineers, Consultants

**Companies Represented:**
- OpenAI, Amazon, Microsoft, Google
- Anthropic, NVIDIA, and many others

**Excited to hear about your challenges and experiences!**

---
## Course Structure: 6-Week Journey

### Week 1: Synthetic Data Generation
- Create precision/recall evaluations (see the sketch after this list)
- Start with text chunks → synthetic questions
- Build baseline evaluation suite

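To make Week 1 concrete, here is a minimal sketch of the two metrics, assuming each synthetic question records the ID of the chunk it was generated from. The function and variable names are illustrative, not course code.

```python
def recall_at_k(results: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant chunk IDs that appear in the top-k results."""
    hits = len(set(results[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0


def precision_at_k(results: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return len(set(results[:k]) & relevant) / k


# Toy usage: a question generated from chunk "c7"; the retriever returned these IDs.
retrieved = ["c3", "c7", "c12", "c9", "c1"]
print(recall_at_k(retrieved, {"c7"}, k=5))     # 1.0 (the source chunk was found)
print(precision_at_k(retrieved, {"c7"}, k=5))  # 0.2
```
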
### Week 2: Fine-Tuning and Few-Shot Examples
- Convert evals to few-shot examples (sketched below)
- Fine-tune models for better performance
- Evaluate rerankers and methodologies
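
As a rough illustration of the Week 2 idea, the sketch below turns eval records that scored well into few-shot examples. The record fields (`question`, `chunk`, `answer`) are assumptions for illustration.

```python
def build_few_shot_prompt(examples: list[dict], query: str) -> str:
    """Prepend known-good (question, context, answer) triples to a new query."""
    parts = []
    for ex in examples:
        parts.append(
            f"Question: {ex['question']}\n"
            f"Context: {ex['chunk']}\n"
            f"Answer: {ex['answer']}\n"
        )
    parts.append(f"Question: {query}\nAnswer:")
    return "\n".join(parts)


# Toy usage with a single invented eval record.
examples = [{
    "question": "What is the refund window?",
    "chunk": "Refunds are accepted within 30 days of purchase.",
    "answer": "30 days.",
}]
print(build_few_shot_prompt(examples, "How long do I have to return an item?"))
```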

### Week 3: Deploy and Collect Feedback
- Deploy the system to real users
- Collect ratings and feedback
- Improve evals with real user data

---

## Course Structure (continued)

### Week 4: Topic Modeling and Segmentation
- Use clustering to identify valuable topics
- Decide what to double down on vs. what to abandon
- Focus resources on economically valuable work

### Week 5: Multimodal RAG Improvements
- Incorporate images, tables, and code search
- Contextual retrieval and summarization
- Target specific query segments

### Week 6: Function Calling and Query Understanding
- Combine all systems with intelligent routing (see the routing sketch below)
- Query → Path selection → Multimodal RAG → Final answer
- Complete end-to-end orchestration

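A toy sketch of the Week 6 routing step. In practice the path would be chosen by an LLM via function calling; the keyword rules below are only a placeholder so the example runs offline, and the path names are assumptions.

```python
from enum import Enum


class Path(Enum):
    DOCUMENT = "document"
    TABLE = "table"
    CODE = "code"
    IMAGE = "image"


def route(query: str) -> Path:
    """Pick a retrieval path for the query (stand-in for an LLM function call)."""
    q = query.lower()
    if "function" in q or "error" in q:
        return Path.CODE
    if "revenue" in q or "table" in q:
        return Path.TABLE
    if "diagram" in q or "screenshot" in q:
        return Path.IMAGE
    return Path.DOCUMENT


print(route("Show me the revenue table for Q3"))  # Path.TABLE
```
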
---

## Learning Format

**Asynchronous Lectures (Fridays)**
- Watch videos on your schedule
- Take notes and prepare questions

**Office Hours (Tuesdays & Thursdays)**
- Multiple time zones supported
- Active learning and discussion
- Question-driven sessions

**Guest Lectures (Wednesdays)**
- Industry experts and practitioners
- Q&A with speakers
- Real-world case studies

**Slack Community**
- Ongoing discussions
- Peer support and collaboration

---

## The Critical Mindset Shift

### ❌ Implementation Mindset
- "We need to implement RAG"
- Obsessing over embedding dimensions
- Success = works in demo
- Big upfront architecture decisions
- Focus on picking the "best" model

### ✅ Product Mindset
- "We need to help users find answers faster"
- Tracking answer relevance and task completion
- Success = users keep coming back
- Architecture that can evolve
- Focus on learning from user behavior

**Launching your RAG system is just the beginning!**

---

## Why Most RAG Implementations Fail

**The Problem:** Treating RAG as a technical project, not a product

**What Happens:**
1. Focus on technical components (embeddings, vector DB, LLM)
2. Consider it "complete" when deployed
3. Works for demos, struggles with real complexity
4. Users lose trust as limitations surface
5. No clear metrics or improvement process
6. Resort to ad-hoc tweaking based on anecdotes

**The Solution:** A product mindset with continuous improvement

---

## The Key Insight: RAG as Recommendation Engine

**Stop thinking:** Retrieval → Augmentation → Generation

**Start thinking:** Recommendation Engine + Language Model

```
User Query → Query Understanding → Multiple Retrieval Paths
                      ↓
       [Document]  [Image]  [Table]  [Code]
                      ↓
              Filtering & Ranking
                      ↓
               Context Assembly
                      ↓
                  Generation
                      ↓
                 User Response
                      ↓
                 Feedback Loop
```
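
Here is the same flow as a minimal Python sketch. Every retriever stub and score below is invented for illustration; the point is the shape of the system, not a real implementation.

```python
# Hypothetical retrieval paths; each returns (text, relevance_score) pairs.
def retrieve_documents(q): return [("doc: onboarding guide", 0.71)]
def retrieve_tables(q):    return [("table: pricing tiers", 0.64)]
def retrieve_code(q):      return [("code: auth_middleware.py", 0.32)]


def answer(query: str, k: int = 2) -> str:
    # Multiple retrieval paths run against the same query
    candidates = retrieve_documents(query) + retrieve_tables(query) + retrieve_code(query)
    # Filtering & ranking: keep the top-k by score
    top = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    # Context assembly: format retrieved items for the prompt
    context = "\n".join(text for text, _ in top)
    # Generation would call an LLM here; we return the prompt for inspection
    return f"Answer using only this context:\n{context}\n\nQ: {query}"


print(answer("How much does the pro plan cost?"))
```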

---

## What This Means

### 1. Generation Quality = Retrieval Quality
- World's best prompt + garbage context = garbage answers
- Focus on getting the right information to the LLM

### 2. Different Questions Need Different Strategies
- Amazon doesn't recommend books the same way it recommends electronics
- Your RAG system shouldn't use the same approach for every query

### 3. Feedback Drives Improvement
- User interactions reveal what works (a logging sketch follows)
- Continuous learning from real usage patterns
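
A minimal sketch of what "learning from usage" might look like at the logging layer; the event fields and file format are assumptions, not a prescribed schema.

```python
import json
import time
from dataclasses import dataclass, asdict


@dataclass
class FeedbackEvent:
    query: str
    retrieved_ids: list[str]  # which chunks were shown to the LLM
    answer: str
    rating: int               # e.g. +1 thumbs up, -1 thumbs down
    ts: float


def log_feedback(event: FeedbackEvent, path: str = "feedback.jsonl") -> None:
    """Append one event per line; these become new eval cases later."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(event)) + "\n")


log_feedback(FeedbackEvent(
    query="How do I reset my password?",
    retrieved_ids=["c12", "c40"],
    answer="Use the 'Forgot password' link on the login page.",
    rating=1,
    ts=time.time(),
))
```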

---

## What Does Success Look Like?

### Feeling of Success
- **Less anxiety** when hearing "just make the AI better"
- **Less overwhelmed** when told to "look at your data"
- **Confidence** in making data-driven decisions

### Tangible Outcomes
- Identify high-impact tasks systematically
- Prioritize and make informed trade-offs
- Choose metrics that correlate with business outcomes
- Drive user satisfaction, retention, and usage

---

## The System Approach

**What is a System?**
- A structured approach to solving problems
- A framework for evaluating technologies
- A decision-making process for prioritization
- A methodology for diagnosing performance
- Standard metrics and benchmarks

**Why Systems Matter:**
- Frees mental energy for innovation
- Replaces guesswork with testing
- Enables quantitative assessments instead of "it feels better"
- Secures resources through data-driven arguments

---

## RAG vs Recommendation Systems

**The Reality:** RAG is a 4-step recommendation system

1. **Multiple Retrieval Indices** (multimodal: images, tables, text)
2. **Filtering** (top-k selection)
3. **Scoring/Ranking** (rerankers, relevance)
4. **Context Assembly** (prepare for generation)

**The Problem:** Engineers focus on generation without knowing whether the right information is being retrieved

**The Solution:** Improve search to improve retrieval to improve generation (a scoring sketch follows)
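
To ground steps 2 and 3, here is an illustrative sketch: a cheap vector-similarity filter down to top-k, followed by a rescoring pass. The "reranker" is a trivial lexical-overlap stand-in for a real cross-encoder, and all data is made up.

```python
import numpy as np


def top_k(query_vec, chunk_vecs, k=3):
    """Step 2: filter by cosine similarity against every chunk embedding."""
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return np.argsort(-sims)[:k]


def rerank(query: str, chunks: list[str]) -> list[str]:
    """Step 3: reorder the survivors with a stand-in lexical overlap score."""
    q_terms = set(query.lower().split())
    return sorted(chunks, key=lambda c: -len(q_terms & set(c.lower().split())))


chunks = ["refunds within 30 days", "shipping takes 5 days", "login help"]
vecs = np.random.default_rng(0).normal(size=(3, 8))  # fake chunk embeddings
idx = top_k(vecs[0], vecs, k=2)                      # fake query vector
print(rerank("how many days for refunds", [chunks[i] for i in idx]))
```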

---

## Experimentation Over Implementation

**Instead of:** "Make the AI better"

**Ask:**
- Why am I looking at this data?
- What's the goal and hypothesis?
- What signals am I looking for?
- Is the juice worth the squeeze?
- How can I use this to improve?

**Success Formula:** Flywheel in place + Consistent effort = Continuous improvement

Like building muscle: track calories and workouts, don't just weigh yourself daily.

---

## Course Commitments

### My Commitment to You
- Be online and answer questions
- Provide extensive office hours support
- Share real-world experience and case studies
- Connect you with industry experts

### Your Commitment
- Engage with the material actively
- Look at your own data and systems
- Participate in discussions and office hours
- Apply learnings to your real projects

**Together, we'll transform your RAG system from demo to production-ready product**

---

## Key Takeaway

> **Successful RAG systems aren't projects that ship once; they're products that improve continuously.**

The difference between success and failure isn't the embedding model or vector database you choose.

It's whether you treat RAG as:
- **❌ Static implementation** that slowly decays
- **✅ Living product** that learns from every interaction

**Let's build systems that get better every week! 🚀**

---

## Next Week

**Week 1: Kickstart the Data Flywheel**

- Synthetic data generation strategies (sketched below)
- Building precision/recall evaluations
- Creating your evaluation foundation
- "Fake it till you make it" with synthetic data
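
As a preview, a hedged sketch of the synthetic-question flywheel: `generate_question` is a placeholder for an LLM call, and the prompt wording and record shape are assumptions rather than course code.

```python
PROMPT = "Write one question a user might ask that this passage answers:\n\n{chunk}"


def generate_question(chunk: str) -> str:
    """Stand-in for an LLM completion over PROMPT.format(chunk=chunk)."""
    return f"What does the documentation say about: {chunk[:40]}...?"


def synthetic_eval_set(chunks: dict[str, str]) -> list[dict]:
    """Each pair records which chunk the question came from, so retrieval
    can be scored with a recall@k function like the Week 1 sketch."""
    return [
        {"question": generate_question(text), "relevant_chunk_id": cid}
        for cid, text in chunks.items()
    ]


chunks = {"c1": "Refunds are accepted within 30 days of purchase."}
print(synthetic_eval_set(chunks))
```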

**Come prepared to look at your data!**