A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!

HarleyCoops/self-adaptive-llms

 
 

Transformer²: Self-Adaptive LLMs with Docker Support

The Revolution in AI Adaptation

Transformer² is a self-adaptation framework for Large Language Models (LLMs). A conventional LLM keeps its weights fixed after training; Transformer² instead adjusts them at inference time to suit the task at hand. It does so through the following components:

🧠 Dynamic Neural Architecture

At its core, Transformer² manipulates model weights through Singular Value Decomposition (SVD). Rather than treating each weight matrix as a fixed block, it decomposes the matrix into independent components, each capturing a different aspect of the model's knowledge. The system can then selectively amplify or suppress individual components at inference time, similar to how biological neural networks reconfigure themselves for different tasks.

🔄 Two-Pass Adaptive Processing

The system employs a revolutionary two-pass mechanism that mimics biological cognitive processes:

  1. Task Analysis Phase:

    • The system first analyzes and identifies task properties
    • Employs sophisticated pattern recognition to understand task requirements
    • Uses one of three increasingly powerful adaptation methods
  2. Dynamic Adaptation Phase:

    • Combines specialized "expert" vectors trained through reinforcement learning
    • Optimizes neural pathways specifically for the current task
    • Achieves real-time weight matrix modification without retraining
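The two passes above can be sketched in a few lines of plain Python. This is a toy illustration only: the `EXPERTS` table, the keyword classifier, and the four-element "singular value" list are hypothetical stand-ins, not the repository's actual dispatch code or model weights.

```python
from typing import Callable, Dict, List

# Hypothetical expert z-vectors: one scale factor per singular value (toy size 4).
EXPERTS: Dict[str, List[float]] = {
    "math": [1.2, 0.9, 1.0, 1.1],
    "code": [0.8, 1.3, 1.0, 0.9],
}
IDENTITY_Z: List[float] = [1.0, 1.0, 1.0, 1.0]  # leaves the base model unchanged


def keyword_classifier(prompt: str) -> str:
    """Stand-in for the dispatch step; a real system would query the LLM itself."""
    lowered = prompt.lower()
    if any(token in lowered for token in ("solve", "integral", "sum")):
        return "math"
    if "def " in prompt or "function" in lowered:
        return "code"
    return "other"


def first_pass(prompt: str, classify: Callable[[str], str]) -> List[float]:
    """Pass 1: identify the task and pick the matching z-vector."""
    task = classify(prompt)
    return EXPERTS.get(task, IDENTITY_Z)


def second_pass(singular_values: List[float], z: List[float]) -> List[float]:
    """Pass 2: scale each singular value by its z entry before answering."""
    return [s * zi for s, zi in zip(singular_values, z)]


base_sigma = [3.0, 2.0, 1.0, 0.5]
z = first_pass("Solve 2x + 3 = 7", keyword_classifier)
adapted_sigma = second_pass(base_sigma, z)
```

Prompts that match no known task fall back to `IDENTITY_Z`, so the base model's behavior is preserved when the dispatcher is unsure.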

🎯 Singular Value Finetuning (SVF)

SVF represents a breakthrough in parameter-efficient training:

  • Uses reinforcement learning to develop task-specific expertise
  • Creates compact "expert" z-vectors for different domains
  • Requires orders of magnitude fewer parameters than traditional methods
  • Enables natural compositionality for complex task adaptation
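To make the parameter-count claim concrete, the sketch below compares the trainable parameters for a single weight matrix under SVF (one scale per singular value) and under LoRA (two low-rank factors). The matrix shape and LoRA rank are illustrative assumptions, not values taken from the repository's configuration.

```python
def svf_params(rows: int, cols: int) -> int:
    """SVF learns one scale per singular value: min(rows, cols) parameters."""
    return min(rows, cols)


def lora_params(rows: int, cols: int, rank: int) -> int:
    """LoRA learns B (rows x rank) and A (rank x cols): rank * (rows + cols)."""
    return rank * (rows + cols)


# Illustrative values only: a square 4096 x 4096 projection and LoRA rank 16.
rows, cols, rank = 4096, 4096, 16
svf_count = svf_params(rows, cols)
lora_count = lora_params(rows, cols, rank)
ratio = lora_count / svf_count
```

The gap grows linearly with the LoRA rank, since the SVF count depends only on the matrix shape.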

🔬 Three-Tier Adaptation Framework

The system implements three increasingly sophisticated adaptation strategies:

  1. Prompt Engineering:

    • Uses carefully crafted prompts for task classification
    • Dynamically selects appropriate pre-trained expertise
    • Provides efficient baseline adaptation
  2. Classification Expert:

    • Employs a specialized SVF-tuned classifier
    • Offers more nuanced task identification
    • Enables more precise adaptation selection
  3. Few-shot Adaptation:

    • Represents the most advanced adaptation strategy
    • Combines multiple expert vectors through weighted interpolation
    • Uses Cross-Entropy Method (CEM) for optimal weight discovery
    • Achieves superior performance through sophisticated blending of expertise
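The few-shot strategy searches for interpolation weights over the expert vectors. Below is a minimal Cross-Entropy Method sketch in NumPy under stated assumptions: the expert z-vectors are made up, and `toy_score` is a synthetic objective standing in for real few-shot evaluation accuracy. It is not the repository's optimizer.

```python
import numpy as np


def cem_search(score, dim, iters=40, pop=64, elite_frac=0.25, seed=0):
    """Maximize score(alpha) over alpha in R^dim with the Cross-Entropy Method."""
    rng = np.random.default_rng(seed)
    mean = np.full(dim, 1.0 / dim)  # start from uniform mixing weights
    std = np.full(dim, 0.5)
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iters):
        # Sample candidate weight vectors around the current estimate.
        samples = rng.normal(mean, std, size=(pop, dim))
        scores = np.array([score(s) for s in samples])
        # Keep the best-scoring candidates and refit the sampling distribution.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor keeps sampling alive
    return mean


# Hypothetical expert z-vectors (rows) for three tasks, toy dimension 4.
experts = np.array([
    [2.0, 0.5, 1.0, 1.0],
    [0.5, 2.0, 1.0, 1.0],
    [1.0, 1.0, 2.0, 0.5],
])
true_alpha = np.array([0.7, 0.3, 0.0])
target_z = true_alpha @ experts


def toy_score(alpha):
    """Synthetic objective: closeness of the blended z-vector to a known target."""
    return -float(np.sum((alpha @ experts - target_z) ** 2))


alpha = cem_search(toy_score, dim=len(experts))
blended_z = alpha @ experts  # weighted interpolation of the expert vectors
```

In a real run, each call to the score function would require evaluating the adapted model on the few-shot examples, so the population size and iteration count directly set the adaptation cost.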

🌟 Technical Breakthroughs

The system achieves several technical innovations:

  • Real-time Adaptation: Modifies behavior during inference without retraining
  • Compositionality: Combines different types of expertise for novel tasks
  • Efficiency: Maintains high performance with minimal parameter overhead
  • Cross-Model Transfer: Enables knowledge sharing between different models
  • Biological Inspiration: Mirrors natural adaptive systems

💡 Implementation Excellence

The technical implementation showcases several architectural innovations:

  • SVD-based decomposition of each weight matrix (W = U⋅Σ⋅V^T)
  • Selective modification of singular values through z-vectors
  • Retention of full-rank information, unlike low-rank approaches such as LoRA
  • Built-in regularization, since only the scale of existing components is modified
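Written out, the points above amount to the following, where W is an m × n weight matrix, σ is its vector of singular values, and z is the learned scaling vector. The LoRA line is included only for contrast, with k denoting the LoRA rank.

```latex
W = U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top},
\qquad r = \min(m, n)

W' = U \,\mathrm{diag}(\sigma \odot z)\, V^{\top}
   = \sum_{i=1}^{r} z_i \, \sigma_i \, u_i v_i^{\top},
\qquad z \in \mathbb{R}^{r}

\text{LoRA: } W' = W + BA,\quad B \in \mathbb{R}^{m \times k},\;
A \in \mathbb{R}^{k \times n},\quad \operatorname{rank}(BA) \le k
```

Setting every z_i to 1 recovers W exactly. Because z rescales all r components, the update is not restricted to a rank-k subspace, which is the sense in which full-rank information is kept.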

🎯 Practical Advantages

The system delivers numerous practical benefits:

  • Outperforms LoRA on the tasks evaluated in the original paper
  • Functions effectively with limited training data
  • Avoids catastrophic forgetting in continuous learning
  • Enables efficient knowledge transfer between models
  • Supports sustainable AI development practices

🚀 Future Implications

Transformer² points toward a future of truly adaptive AI:

  • Enables continuous, lifelong learning capabilities
  • Supports dynamic task adaptation without retraining
  • Provides a framework for self-organizing AI systems
  • Opens new possibilities for efficient model development
  • Paves the way for more sustainable AI scaling

This implementation provides a Docker-based deployment of Transformer², specifically designed for Windows environments while maintaining full GPU support through NVIDIA Container Toolkit.

Introduction

Transformer² (Transformer-squared) represents a paradigm shift in how Large Language Models (LLMs) adapt to diverse tasks. Traditional fine-tuning approaches often struggle with computational intensity and static behavior across varied tasks. This implementation introduces dynamic, real-time adaptation by selectively modifying singular components of weight matrices, enabling LLMs to optimize their behavior for specific tasks without extensive retraining.

Core Innovation: Two-Pass Adaptation Mechanism

The framework employs a sophisticated two-pass mechanism during inference:

  1. Task Analysis: A dispatch system identifies task properties and requirements
  2. Dynamic Adaptation: Task-specific "expert" vectors, trained through reinforcement learning, are combined to optimize model behavior for the incoming prompt

Technical Architecture

At its core, Transformer² leverages Singular Value Decomposition (SVD) to decompose the LLM's weight matrices into independent components. This decomposition allows for:

  • Identification of principal components in the model's knowledge representation
  • Selective enhancement/suppression of specific components for task optimization
  • Minimal parameter overhead while maintaining adaptability

The framework introduces Singular Value Finetuning (SVF), which uses reinforcement learning to learn task-specific z-vectors. These vectors act as "amplifiers" or "dampeners" for different components of the weight matrices, enabling precise task-specific adaptations.
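As a rough illustration of how a z-vector amplifies or dampens components, the NumPy sketch below decomposes a random matrix and rescales its singular values. The function name `apply_z` and the example vectors are hypothetical; in the real framework the z-vectors are learned with reinforcement learning rather than written by hand.

```python
import numpy as np


def apply_z(weight: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Return U @ diag(sigma * z) @ V^T for the thin SVD of `weight`."""
    u, sigma, vt = np.linalg.svd(weight, full_matrices=False)
    # Multiplying the columns of U by (sigma * z) equals U @ diag(sigma * z).
    return (u * (sigma * z)) @ vt


rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))  # toy weight matrix, not a real model layer

z_identity = np.ones(4)                      # leaves W unchanged
z_expert = np.array([1.5, 1.0, 0.5, 0.0])    # amplify, keep, dampen, suppress

W_same = apply_z(W, z_identity)
W_adapted = apply_z(W, z_expert)
```

An entry above 1 strengthens its component, an entry between 0 and 1 weakens it, and 0 removes it entirely, which lowers the rank of the adapted matrix by one.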

This Implementation

This repository provides my Docker-based implementation of Transformer², specifically designed for Windows environments. The containerized approach ensures consistent behavior across different systems while maintaining full GPU support through NVIDIA Container Toolkit.

Key Features

  • Containerized Linux environment for Windows compatibility
  • CUDA-enabled runtime for GPU acceleration
  • Persistent model caching
  • Streamlined deployment process
  • Support for all original Transformer² evaluation methods

Prerequisites

  1. Windows 10/11 with WSL2 enabled
  2. Docker Desktop for Windows
  3. NVIDIA Container Toolkit
  4. NVIDIA GPU with CUDA support
  5. At least 16 GB of RAM recommended
  6. Hugging Face account with access to the Llama model family
  7. Hugging Face API token with read access

Quick Start

  1. Clone the repository:

    git clone https://github.com/HarleyCoops/self-adaptive-llms.git
    cd self-adaptive-llms
  2. Set up Hugging Face authentication:

    • Create a .env file in the root directory
    • Add your Hugging Face token: HUGGING_FACE_TOKEN=your_token_here
  3. Build and run the container:

    # Build the container
    docker-compose build

    # Start an interactive shell
    docker-compose run --rm self-adaptive-llm
  4. Run evaluations:

    # Few-shot evaluation
    ./run.sh bash scripts/eval_few_shot.sh

    # Prompt-based evaluation
    ./run.sh bash scripts/eval_prompt_based.sh
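Before starting a long evaluation, it can help to confirm that the token from the .env step is actually visible. The helper below is a hedged sketch using only the standard library; `parse_env` and `find_token` are hypothetical names and are not part of the repository, which may load the file differently (for example through docker-compose).

```python
import os
from pathlib import Path
from typing import Dict, Optional

TOKEN_NAME = "HUGGING_FACE_TOKEN"  # variable name used in the .env step above


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop optional quotes
    return values


def find_token(env_path: Path = Path(".env")) -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    token = os.environ.get(TOKEN_NAME)
    if token:
        return token
    if env_path.exists():
        return parse_env(env_path.read_text()).get(TOKEN_NAME)
    return None


if __name__ == "__main__":
    print("token found" if find_token() else f"{TOKEN_NAME} is not set")
```

The script reports only whether a token was found and never prints the token itself, so its output is safe to paste into an issue or log.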

Technical Documentation

📊 Interactive Function Flowcharts

This implementation includes a comprehensive suite of interactive flowcharts located in the /docs directory. These flowcharts use flowchart.js to provide detailed visualizations of the system's architecture and processes.

Viewing the Flowcharts

The flowcharts are interactive HTML files that can be viewed in several ways:

Option 1: Direct Browser Access

After cloning the repository, open any flowchart HTML file directly in your browser:

# Windows
start docs\math_flowchart.html

# macOS
open docs/math_flowchart.html

Option 2: Local Development Server

To serve the flowcharts over HTTP instead (Python's built-in server does not auto-refresh, so reload the page after making changes):

cd docs
python -m http.server 8000
# Visit http://localhost:8000 in your browser

Each flowchart provides an interactive visualization of different system components:

  • Math Module: Implementation flow of the math task handler
  • Base Classes: Core system architecture and interfaces
  • SVD Reinforcement: Weight matrix manipulation and RL loop
  • And many more...

🔄 Core System Components

📝 Task-Specific Implementations

🛠 Utility and Infrastructure

Each interactive flowchart provides:

  • 📋 Detailed function signatures and parameter descriptions
  • 🔄 Control flow visualization with animated transitions
  • 🔗 Component interdependencies with clickable navigation
  • ⚡ Data transformation pipeline visualization
  • 🚨 Error handling pathways and edge cases
  • 💡 Inline documentation and implementation notes

The flowcharts are designed to be both educational and practical:

  • 🎓 Perfect for understanding the system architecture
  • 🔍 Useful for debugging and development
  • 📚 Valuable for academic research and documentation
  • 🤝 Helpful for new contributors

Dynamic Visualizations

The implementation includes sophisticated animations that visualize the model's internal processes:

Animation Components

Located in animations/transformer2_animations.py, the visualization system provides:

  • Real-time SVD decomposition visualization
  • Z-vector adaptation trajectories
  • Weight matrix transformation animations
  • Task-specific adaptation visualization
  • Performance metric evolution

The animations are rendered using state-of-the-art visualization libraries and can be used for:

  • Research presentations
  • Educational purposes
  • Debugging and analysis
  • Performance monitoring

Media assets in animations/media/ support these visualizations with:

  • Component diagrams
  • State transition animations
  • Performance graphs
  • Architecture schematics

Container Structure

The Docker implementation includes:

  • Ubuntu 22.04 base image with CUDA 12.1 support
  • Conda environment with Python 3.11
  • PyTorch with CUDA support
  • All project dependencies pre-configured
  • Mounted volumes for code and model caching

Evaluation Methods

Transformer² supports three adaptation methods:

  1. Prompt-based Adaptation

    • Uses specific prompts to classify tasks
    • Selects appropriate pre-trained z-vectors
  2. Classifier-based Adaptation

    • Employs a trained task classifier
    • Automatically identifies tasks during inference
  3. Few-shot Adaptation

    • Combines multiple pre-trained z-vectors through weighted interpolation
    • Optimizes weights based on few-shot evaluation performance

Configuration

Key configuration files:

  • environment.yml: Conda environment specification
  • docker-compose.yml: Container orchestration settings
  • Dockerfile: Container build instructions
  • requirements.txt: Python dependencies

Performance Considerations

As noted in the original paper, the framework shows significant improvements across various tasks:

  • Outperforms LoRA on text-based tasks
  • Shows strong performance in vision-language tasks
  • Demonstrates effective cross-model knowledge transfer

For detailed performance metrics and comparisons, refer to the original paper.

Extending Compute Resources

For users with limited local GPU resources, several cloud platforms offer free or cost-effective GPU access:

🌩️ Google Colab Integration

  1. Setup Steps:

    !git clone https://github.com/HarleyCoops/self-adaptive-llms.git
    %cd self-adaptive-llms
    !pip install -r requirements.txt
  2. Environment Variables:

    import os
    os.environ['HUGGING_FACE_TOKEN'] = 'your_token_here'
  3. Running Evaluations:

    !python svd_reinforce_hydra.py --config-dir=cfgs --config-name=config \
        base_model@_global_=llama3i8b optimization@_global_=cem \
        task@_global_=few_shot_math

📊 Kaggle Notebooks

  1. Setup:

    • Create a new Notebook
    • Select a GPU (T4 or P100) as the accelerator
    • Enable internet access
  2. Installation:

    !git clone https://github.com/HarleyCoops/self-adaptive-llms.git
    %cd self-adaptive-llms
    !pip install -r requirements.txt
  3. Configuration:

    import os
    os.environ['HUGGING_FACE_TOKEN'] = 'your_token_here'

☁️ Vast.ai (Pay-as-you-go Option)

  1. Create Instance:

    • Select an instance with 12+ GB VRAM
    • Choose Ubuntu 22.04 with CUDA support
  2. Setup Commands:

    git clone https://github.com/HarleyCoops/self-adaptive-llms.git
    cd self-adaptive-llms
    pip install -r requirements.txt
  3. Environment Setup:

    export HUGGING_FACE_TOKEN='your_token_here'

🔄 Code Modifications for Cloud

When using cloud resources, consider these adjustments:

  1. Memory Optimization:

    # In tasks/math.py, adjust GPU memory usage based on available VRAM
    gpu_memory_utilization=0.8  # Increase if more VRAM available
  2. Batch Size Adjustment:

    # Increase for better performance with more VRAM
    max_num_batched_tokens=4096
  3. Checkpoint Saving:

    # Add to your training loop to save progress
    model.save_checkpoint('/content/checkpoints/')
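Because the trainable state in SVF is just the z-vectors, a checkpoint can be very small. The sketch below assumes the z-vectors are available as a dictionary of NumPy arrays keyed by layer name; the function names, the key `layer0_q_proj`, and the file naming scheme are all hypothetical and do not reflect the repository's own checkpoint format.

```python
import tempfile
from pathlib import Path
from typing import Dict

import numpy as np


def save_z_vectors(z_vectors: Dict[str, np.ndarray], directory: str, step: int) -> Path:
    """Write all z-vectors for one training step into a single .npz file."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"z_vectors_step{step:06d}.npz"
    np.savez(path, **z_vectors)
    return path


def load_z_vectors(path: Path) -> Dict[str, np.ndarray]:
    """Read the z-vectors back into a plain dictionary."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}


# Round trip in a temporary directory; on Colab the target would be a
# persistent location such as a mounted drive instead.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_z_vectors({"layer0_q_proj": np.ones(4)}, tmp, step=10)
    restored = load_z_vectors(saved)
```

Zero-padding the step number keeps the files in chronological order when sorted by name, which makes it easy to resume from the latest one after a session timeout.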

📝 Best Practices

  1. Resource Management:

    • Monitor GPU memory usage with nvidia-smi
    • Use persistent storage for model checkpoints
    • Implement early stopping for efficient resource use
  2. Data Handling:

    • Cache downloaded models and datasets
    • Use efficient data loading techniques
    • Implement proper cleanup procedures
  3. Cost Optimization:

    • Use free tiers when possible (Colab, Kaggle)
    • Monitor usage on pay-as-you-go platforms
    • Implement automatic shutdown on completion
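For the GPU memory monitoring mentioned above, `nvidia-smi` can emit machine-readable output through its `--query-gpu` and `--format=csv,noheader,nounits` options. The sketch below wraps that call; the function names are hypothetical, and it assumes every GPU reports numeric memory values in MiB.

```python
import shutil
import subprocess
from typing import List, Tuple

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> List[Tuple[int, int]]:
    """Turn 'used, total' lines (MiB) into a list of integer pairs, one per GPU."""
    rows: List[Tuple[int, int]] = []
    for line in csv_text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> List[Tuple[int, int]]:
    """Return per-GPU (used, total) memory, or an empty list without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {used} / {total} MiB ({100 * used / total:.0f}%)")
```

Calling `query_gpu_memory()` between evaluation runs gives a quick check on whether a setting such as `gpu_memory_utilization` leaves enough headroom on the current instance.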

Citation

@misc{sun2025texttransformer2selfadaptivellms,
      title={$\text{Transformer}^2$: Self-adaptive LLMs}, 
      author={Qi Sun and Edoardo Cetin and Yujin Tang},
      year={2025},
      eprint={2501.06252},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

This implementation builds upon the original work by Sakana AI, adapting it for Windows environments through containerization.
