Week 1: How did we get here? And where are we, really?
Similar to 266 - the first 4 weeks cover transformers
In this class we will
- Develop intuition for computational linguistics
- Understand how GenAI models work
- We will talk about transformers
- We will talk about diffusion models (images)
- Learn about generative AI architectures
- Learn about generative AI tasks and their problem structures
- Can use these models to process sound, video, text; seemingly endless possibilities
- Emphasize practical understanding
- Want to build a foundation for continued learning
Final project:
- Implement a RAG system evaluated against gold answers
- Constraints on what technologies you can use
- Make a model that is performant

Goals:
- Remind ourselves of how 'AI' used to be approached
- Contrast this with the modern LLM approach
- Modern LLMs are very powerful. What does this really mean, and how can you go about measuring it?
Grading
- 5 Assignments
- 5 late days in the "bank"; can use up to 2 days on one assignment
- Don't need permission
- Use Ed Discussion instead of Slack
- Posts can be anonymous or visible only to the instructor if you want
Resources
- Compute resources
- Google Colab Pro ($10/mo) highly recommended
- V100 GPUs and high-RAM runtimes
- A variety of commercial options (some lower cost than others, some free) are available and may be used throughout the class
- Many open source models/tools are available to run in the google colab environments
Older AI approach
- Encoding (all) knowledge for the computer
- Computer uses a knowledge base plus an inference engine to understand directives, make plans, and carry them out
- Cyc: a knowledge base of common-sense knowledge containing millions of rules written in a formal representation language
- Aims to capture the knowledge of a 7-year-old with tons of rules
- Didn't work out
- Takes more than just the rules
- Expert systems encoded rules that helped humans make decisions and reach conclusions
- Understand what it DOES and what it DOES NOT DO
- Move away from the hype so we don't get another AI winter
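The "knowledge base plus inference engine" idea above can be sketched as a toy forward-chaining loop. The facts, rules, and names below are made up for illustration; real systems like Cyc use far richer formal languages.

```python
# Toy facts and rules, written as strings for simplicity.
facts = {"has_fur(rex)", "barks(rex)"}

# Each rule: (set of premises, conclusion). All names are illustrative.
rules = [
    ({"has_fur(rex)", "barks(rex)"}, "dog(rex)"),
    ({"dog(rex)"}, "mammal(rex)"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose premises are all known, until nothing changes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain(facts, rules)
print(derived)  # derived facts include dog(rex) and mammal(rex)
```

The "didn't work out" point shows up even here: every fact and rule must be hand-encoded, and coverage of everyday knowledge never gets close to complete.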


- "Fruit flies like a banana": classic example of syntactic ambiguity that rule-based parsing struggles with
LLM Approach
- No rules!
- Predict the next token using probabilities
- Trained on 15+ trillion tokens
- A bigger model does not necessarily mean a better model
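The "predict the next token using probabilities" step can be sketched with a toy softmax. The vocabulary and logit values below are invented for illustration; a real LLM produces logits over tens of thousands of tokens from a transformer.

```python
import math
import random

# Hypothetical model outputs (logits) for one prediction step.
vocab  = ["the", "cat", "sat", "mat"]
logits = [2.0, 0.5, 1.0, -1.0]

def softmax(xs):
    """Turn raw scores into a probability distribution that sums to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)

# Sample the next token in proportion to its probability.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(dict(zip(vocab, [round(p, 3) for p in probs])), "->", next_token)
```

Training on 15+ trillion tokens is what shapes those logits; the prediction step itself stays this simple.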
