
EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

arXiv | Project Page | Code | HuggingFace


🎯 Overview

EVOLVE-VLA enables Vision-Language-Action (VLA) models to continuously adapt through autonomous environment interaction—learning by doing, not just watching.

Current VLA models rely on supervised fine-tuning (SFT) with extensive expert demonstrations, leading to:

  • 💰 High labor costs: Hundreds of demonstrations per task
  • 🔒 Rigid memorization: Policies that merely replay training trajectories
  • ⚠️ Poor adaptation: Inability to recover from execution deviations

Our test-time training framework addresses these limitations by enabling VLAs to self-improve during deployment through online reinforcement learning, requiring minimal or even zero task-specific demonstrations.


✨ Key Features

🚀 Test-Time Training

  • Continue learning during deployment through environment interaction
  • Minimal supervision required (1-shot or zero-shot)
  • Autonomous exploration and self-improvement via online RL (see the loop sketch after this list)
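
For intuition, here is a minimal sketch of what such a test-time training loop could look like. The environment interface, the `policy` object, and the reward call are illustrative placeholders, not the released EVOLVE-VLA API; the dense reward model is described in the next subsection.

```python
# Minimal sketch of online test-time training (all names are
# hypothetical placeholders, not the actual EVOLVE-VLA interface).

def test_time_training(policy, env, reward_model, num_episodes=100):
    for _ in range(num_episodes):
        obs = env.reset()
        reward_model.reset()
        trajectory = []                     # (obs, action, reward) tuples
        done = False
        while not done:
            action = policy.act(obs)        # autonomous exploration
            next_obs, done, info = env.step(action)
            # Dense feedback from a learned progress reward instead of
            # an oracle success signal (see "Progress-Based Reward").
            reward = reward_model.score(next_obs)
            trajectory.append((obs, action, reward))
            obs = next_obs
        # Online RL update (e.g., a policy-gradient step) on the
        # freshly collected rollout; no expert demonstration needed.
        policy.update(trajectory)
```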

🎯 Progress-Based Reward

  • Learned progress estimator replaces impractical oracle rewards
  • Dense, continuous feedback for sample-efficient learning
  • No ground-truth success signal needed at test time (a reward-shaping sketch follows this list)
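
A common way to turn such a progress estimate into a dense reward, and roughly the role it plays here, is to reward the per-step increase in estimated progress. The wrapper below is a hedged illustration; the estimator interface and class name are assumptions, not the paper's exact formulation.

```python
class ProgressReward:
    """Hypothetical wrapper turning a learned progress estimator
    (observation -> task-completion fraction in [0, 1]) into a dense
    per-step reward. Names and interface are illustrative only."""

    def __init__(self, progress_estimator):
        self.estimate = progress_estimator
        self.prev = 0.0

    def reset(self):
        self.prev = 0.0

    def score(self, obs):
        # Reward the *increase* in estimated progress, giving the
        # policy continuous feedback without any ground-truth
        # success signal at test time.
        current = self.estimate(obs)
        reward = current - self.prev
        self.prev = current
        return reward
```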

🔧 Technical Innovations

  1. Accumulative Progress Estimation: Smooths noisy point-wise estimates into stable signals through interval-based sampling and incremental aggregation

  2. Progressive Horizon Extension: Gradual curriculum learning that extends exploration horizons, enabling the policy to master simpler sub-tasks before tackling complete long-horizon tasks (both innovations are sketched below)
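
To make the two ideas concrete, here is a short sketch: the first function aggregates progress increments over sampled observation intervals rather than trusting any single point-wise estimate, and the second grows the allowed rollout horizon on a fixed schedule. Interval lengths, the schedule, and all names are assumed for illustration and are not the paper's exact formulation.

```python
import random

def accumulated_progress(estimate, observations, interval=5, samples=20):
    """Illustrative smoothing of noisy point-wise progress estimates.

    Samples observation pairs `interval` steps apart and aggregates
    their estimated progress increments, so point-wise noise averages
    out. Assumes the trajectory is longer than `interval`.
    """
    increments = []
    for _ in range(samples):
        start = random.randrange(len(observations) - interval)
        delta = (estimate(observations[start + interval])
                 - estimate(observations[start]))
        increments.append(delta)
    return sum(increments) / len(increments)


def horizon_schedule(step, base_horizon=50, max_horizon=400, grow_every=1000):
    """Illustrative progressive horizon extension: start with short
    rollouts so the policy masters early sub-tasks, then periodically
    extend the horizon toward the full long-horizon task."""
    extensions = step // grow_every
    return min(base_horizon * (1 + extensions), max_horizon)
```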


📊 Main Results

| Setting | Improvement | Details |
|---|---|---|
| Long-Horizon Tasks | +8.6% | LIBERO-Long benchmark |
| 1-Shot Learning | +22.0% | Minimal demonstration data |
| Cross-Task Transfer | 0% → 20.8% | Zero-shot generalization without task-specific SFT |

🌟 Emergent Capabilities

Through autonomous test-time training, EVOLVE-VLA develops skills entirely absent from demonstrations:

  • Error Recovery: Re-attempting failed grasps and adjusting strategies
  • Adaptation: Handling unexpected object state changes
  • Novel Strategies: Discovering alternative manipulation approaches (e.g., grasping a cup by the body instead of the handle)

🎥 Demo Videos

Check out our project page for video demonstrations showing:

  1. Error recovery through repeated grasp attempts
  2. Adaptation to unexpected state changes
  3. Novel manipulation strategies learned autonomously

🛠️ Code & Model

📦 Coming Soon

We are committed to releasing the following upon publication:

  • 🔓 Full training code for EVOLVE-VLA
  • 🔓 Inference codebase for deploying trained models
  • 🔓 Pre-trained models on HuggingFace
  • 🔓 Evaluation scripts for LIBERO benchmark
  • 📖 Detailed documentation and tutorials

Stay tuned! Star this repo to get notified when the code is released.


📖 Citation

If you find our work useful, please cite:

@article{bai2025evolve,
  title={EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models},
  author={Bai, Zechen and Gao, Chen and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2512.14666},
  year={2025}
}

🙏 Acknowledgments

We thank the following projects and teams for their valuable contributions to the community:

  • OpenVLA for the open-source VLA model and codebase
  • SimpleVLA-RL for pioneering work on RL fine-tuning for VLAs
  • verl for the efficient RL training framework
  • VLAC for the vision-language action critic model
  • LIBERO team for providing the benchmark

⭐ Star this repo to stay updated on our code release!

Project Page | Paper | Code (Soon) | Model (Soon)
