Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement MAML for adaptive regression with documentation and dependencies #34

Merged
merged 7 commits into from
Nov 8, 2024
16 changes: 16 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,22 @@ This PoC demonstrates a practical application of the Narrative Field System in a

This implementation showcases how the Narrative Field System can be applied to analyze and track narrative dynamics in a specific context.

### model_agnostic_meta_learning.py

This PoC implements a Model-Agnostic Meta-Learning (MAML) approach for rapid adaptation of narrative models to new contexts. Key features include:

- `MetaModelGenerator`: Core meta-learning architecture with:
- Skip connections for improved gradient flow
- Adaptive learning rate scheduling
- Enhanced visualization capabilities
- Robust error handling and metrics tracking
- Task generation system for synthetic narrative scenarios
- Multi-step adaptation process with gradient clipping
- Comprehensive visualization suite for adaptation analysis
- Built-in feature importance analysis and learning curve tracking

This implementation enables the system to quickly adapt to new narrative contexts with minimal data, making it particularly valuable for modeling emerging story dynamics.

## Development Guidelines

- Follow PEP 8 style guide and use Black for code formatting.
Expand Down
2 changes: 2 additions & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -13,3 +13,5 @@ transformers
PyWavelets
scikit-learn
sentence-transformers
wandb
matplotlib
129 changes: 129 additions & 0 deletions src/maml.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
# Model-Agnostic Meta-Learning (MAML) for Adaptive Regression

## Abstract

This paper presents an implementation and analysis of Model-Agnostic Meta-Learning (MAML) applied to adaptive regression tasks. We demonstrate how MAML enables rapid adaptation to new tasks through a neural network architecture with skip connections and carefully tuned meta-learning components. Our experiments show significant improvements in prediction accuracy after just a few gradient steps of task-specific adaptation.

## 1. Introduction

Meta-learning, or "learning to learn", aims to create models that can quickly adapt to new tasks with minimal training data. MAML achieves this by explicitly optimizing the model's initial parameters such that a small number of gradient steps will produce good performance on a new task. Our implementation focuses on regression problems with the following key features:

- Multi-layer neural network with skip connections for improved gradient flow
- Controlled synthetic task generation for systematic evaluation
- Comprehensive visualization and analysis tools
- Robust training with gradient clipping and early stopping

## 2. Architecture

### 2.1 Model Design

The core architecture consists of:

- Input layer: Linear transformation to first hidden layer
- Hidden layers: Multiple layers with skip connections and ReLU activation
- Output layer: Linear transformation to prediction space

Skip connections help maintain gradient flow during both meta-training and adaptation. Weight initialization uses small normal distributions (σ=0.01) to prevent initial predictions from being too extreme.

### 2.2 Meta-Learning Components

Key meta-learning elements include:

- Inner loop optimization: Task-specific adaptation using SGD
- Outer loop optimization: Meta-parameter updates using SGD with momentum
- Learning rate scheduling: ReduceLROnPlateau for automatic adjustment
- Gradient clipping: Both inner and outer loops for stability

## 3. Task Generation

Synthetic tasks are generated with controlled complexity:

1. Normalized input features
2. Task-specific random transformations
3. Multiple non-linear components (linear, sinusoidal, hyperbolic tangent)
4. Adaptive noise based on signal magnitude
5. Support/query set splitting for evaluation

## 4. Training Process

The training pipeline includes:

- Batch processing of multiple tasks
- Multiple gradient steps for task adaptation
- Comprehensive metric tracking
- Early stopping based on validation performance
- Learning rate adjustment based on loss plateaus

## 5. Results and Analysis

Our implementation demonstrates:

- Rapid adaptation to new tasks (3-5 gradient steps)
- Robust performance across varying task complexities
- Effective feature importance identification
- Clear visualization of adaptation progress

### 5.1 Visualization Components

We provide detailed visualizations including:

1. Pre/post adaptation predictions
2. Feature importance analysis
3. Error distribution changes
4. Adaptation learning curves

## 6. Implementation Details

Key technical features:

- PyTorch implementation with GPU support
- Type hints for improved code clarity
- Comprehensive error handling
- Modular design for easy extension
- Logging and optional experiment tracking

## 7. Conclusion

Our MAML implementation successfully demonstrates rapid adaptation capabilities for regression tasks. The architecture and training process provide a robust foundation for meta-learning applications, with clear visualization and analysis tools for understanding model behavior.

## References

1. Finn, C., Abbeel, P., & Levine, S. (2017). [Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.](https://arxiv.org/abs/1703.03400)
2. Antoniou, A., Edwards, H., & Storkey, A. (2019). [How to train your MAML](https://arxiv.org/abs/1810.09502).

## Appendix: Code Structure

The implementation is organized into key components:

1. MetaModelGenerator: Core meta-learning model
2. Task generation utilities
3. Training and evaluation loops
4. Visualization and analysis tools

For detailed implementation, see the accompanying source code.

## Citations

```bibtex
@misc{finn2017modelagnosticmetalearningfastadaptation,
title={Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks},
author={Chelsea Finn and Pieter Abbeel and Sergey Levine},
year={2017},
eprint={1703.03400},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1703.03400},
}
```

```bibtex
@misc{antoniou2019trainmaml,
title={How to train your MAML},
author={Antreas Antoniou and Harrison Edwards and Amos Storkey},
year={2019},
eprint={1810.09502},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1810.09502},
}
```
Loading
Loading