This repository contains educational, step-by-step Jupyter notebooks that demonstrate foundational concepts in Natural Language Processing (NLP) and automatic differentiation. The goal is to provide clear, well-documented code and explanations for core representation techniques and the basics of building neural networks from scratch.
- micrograd/
  - derivative.ipynb: Introduction to automatic differentiation and computational graphs, inspired by micrograd. Includes a minimal implementation of the Value class and visualization of computation graphs.
  - image.png: Visualization asset for computation graphs.
- Representations/
  - BagOfWords.ipynb: Implements the Bag of Words (BoW) model from scratch, including tokenization, vocabulary building, encoding, and visualization.
  - word2vec.ipynb: Step-by-step implementation of the Word2Vec skip-gram model with negative sampling, including training loop, loss computation, and similarity evaluation.
- Bag of Words (BagOfWords.ipynb)
  - Goal: Represent text documents as fixed-length vectors based on word frequency.
  - Steps:
    - Tokenize and preprocess a sample corpus.
    - Build a vocabulary and encode each document as a frequency vector.
    - Visualize the resulting BoW matrix as a heatmap.
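
The Bag of Words steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's code: the function names (`tokenize`, `build_vocabulary`, `encode`), the regex-based tokenizer, and the two-sentence corpus are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed preprocessing)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: idx for idx, tok in enumerate(sorted(tokens))}


def encode(document, vocabulary):
    """Return a frequency vector with one count per vocabulary entry."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(document)).items():
        if tok in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical sample corpus, used only for illustration.
corpus = ["The cat sat on the mat", "The dog sat on the log"]
vocabulary = build_vocabulary(corpus)
bow_matrix = [encode(doc, vocabulary) for doc in corpus]

print(sorted(vocabulary, key=vocabulary.get))
for row in bow_matrix:
    print(row)
```

Every row has the same length as the vocabulary, which is what makes the representation fixed-length. For the heatmap step, a matrix like `bow_matrix` could be passed to `matplotlib.pyplot.imshow`, with the vocabulary as column labels.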
- Word2Vec (word2vec.ipynb)
  - Goal: Learn dense vector representations (embeddings) for words such that similar words are close in the embedding space.
  - Steps:
    - Prepare skip-gram training data from a corpus.
    - Define and initialize embedding matrices.
    - Implement the training loop with negative sampling and categorical cross-entropy loss.
    - Evaluate embeddings using cosine similarity.
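
The Word2Vec steps above can be sketched roughly as follows, in dependency-free Python so the arithmetic stays visible. Several things here are assumptions rather than the notebook's actual choices: the function names, the hyperparameters, uniform sampling of negatives, and the loss. The sketch uses the sigmoid-based (binary logistic) objective commonly paired with negative sampling, which may differ from the loss used in the notebook.

```python
import math
import random


def skipgram_pairs(token_ids, window=2):
    """Return (center, context) pairs for every token within the window."""
    pairs = []
    for i, center in enumerate(token_ids):
        start = max(0, i - window)
        stop = min(len(token_ids), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, token_ids[j]))
    return pairs


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-x))


def cosine_similarity(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def train(pairs, vocab_size, dim=8, negatives=3, lr=0.05, epochs=50, seed=0):
    """Train skip-gram embeddings with negative sampling (needs vocab_size >= 2)."""
    rng = random.Random(seed)
    # Center-word embeddings start small and random; context-word ones at zero.
    w_in = [[rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
            for _ in range(vocab_size)]
    w_out = [[0.0] * dim for _ in range(vocab_size)]
    for _ in range(epochs):
        for center, context in pairs:
            # One positive target plus uniformly sampled negatives (assumption).
            targets = [(context, 1.0)]
            while len(targets) < negatives + 1:
                neg = rng.randrange(vocab_size)
                if neg != context:
                    targets.append((neg, 0.0))
            center_update = [0.0] * dim
            for target, label in targets:
                score = sigmoid(dot(w_in[center], w_out[target]))
                step = lr * (label - score)  # gradient step on the logistic loss
                for k in range(dim):
                    center_update[k] += step * w_out[target][k]
                    w_out[target][k] += step * w_in[center][k]
            for k in range(dim):
                w_in[center][k] += center_update[k]
    return w_in


# Hypothetical toy corpus, used only for illustration.
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = {word: idx for idx, word in enumerate(sorted(set(corpus)))}
token_ids = [vocab[word] for word in corpus]
pairs = skipgram_pairs(token_ids, window=2)
embeddings = train(pairs, vocab_size=len(vocab))
similarity = cosine_similarity(embeddings[vocab["quick"]], embeddings[vocab["brown"]])
print(f"cosine(quick, brown) = {similarity:.3f}")
```

A corpus this small will not produce meaningful embeddings; the point is only to show how the pairs, the two embedding matrices, the update rule, and the similarity check fit together.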
- Automatic differentiation (derivative.ipynb)
  - Goal: Illustrate the basics of automatic differentiation and computational graphs.
  - Steps:
    - Define a simple Value class to track operations and gradients.
    - Visualize computation graphs.
    - Demonstrate forward and backward passes for simple functions.
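
The automatic differentiation steps above can be sketched with a small `Value` class in the style of micrograd. This is an illustrative reduction, not the notebook's implementation: it supports only addition and multiplication, and the attribute names (`_prev`, `_op`, `_backward`) are assumptions modelled on micrograd's conventions.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, _children=(), _op=""):
        self.data = data
        self.grad = 0.0
        self._prev = set(_children)      # nodes this value was computed from
        self._op = _op                   # operation label, useful when drawing the graph
        self._backward = lambda: None    # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), "+")

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), "*")

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Run the backward pass over the graph in reverse topological order."""
        order, visited = [], set()

        def visit(node):
            if node not in visited:
                visited.add(node)
                for child in node._prev:
                    visit(child)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# Forward pass: L = (a * b + c) * a
a, b, c = Value(2.0), Value(-3.0), Value(10.0)
L = (a * b + c) * a

# Backward pass: fills in dL/da, dL/db, dL/dc
L.backward()
print(L.data, a.grad, b.grad, c.grad)  # 8.0 -2.0 4.0 2.0
```

Because `a` appears twice in the expression, its gradient is accumulated with `+=` rather than assigned; analytically, dL/da = 2ab + c = -2, which matches the printed value. The `_prev` and `_op` fields hold the information a graph-drawing helper would need to render nodes and edges.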
- Clone the repository:

      git clone <raghulchandramouli/Spelled-out-intro-to-Neural-Networks>
      cd LLM101
- Install dependencies:
  - Python 3.7+
  - Jupyter Notebook
  - numpy, matplotlib, graphviz

  You can install requirements with:

      pip install numpy matplotlib graphviz
- Run the notebooks:

      jupyter notebook

  Open any notebook in the browser and run the cells step by step.
- All code is written from scratch for clarity and learning.
- Each notebook contains detailed explanations and visualizations.
- Suitable for beginners and those seeking to understand the internals of NLP representations and autodiff.
MIT License
Author: [Your