EVOLVE-VLA enables Vision-Language-Action (VLA) models to continuously adapt through autonomous environment interaction—learning by doing, not just watching.
Current VLA models rely on supervised fine-tuning (SFT) with extensive expert demonstrations, leading to:
- 💰 High labor costs: Hundreds of demonstrations per task
- 🔒 Rigid memorization: Policies that merely replay training trajectories
- ❌ Poor adaptation: Inability to recover from execution deviations
Our test-time training framework addresses these limitations by enabling VLAs to self-improve during deployment through online reinforcement learning, requiring minimal or even zero task-specific demonstrations.
- Continue learning during deployment through environment interaction
- Minimal supervision required (1-shot or zero-shot)
- Autonomous exploration and self-improvement via online RL
- Learned progress estimator replaces impractical oracle rewards
- Dense, continuous feedback for sample-efficient learning
- No access to ground-truth success signals needed at test time
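The reward-shaping idea behind a learned progress estimator can be pictured with a minimal sketch. Everything below is illustrative, not the released implementation: `ProgressEstimator` is a toy stand-in that maps an observation to a scalar progress estimate in [0, 1], and each step is rewarded by the *increase* in estimated progress, so the return telescopes to final-minus-initial progress with no ground-truth success signal involved.

```python
class ProgressEstimator:
    """Toy stand-in for a learned progress model: here, progress is
    simply proportional to how close a 1-D observation is to the goal."""

    def __init__(self, goal: float):
        self.goal = goal

    def __call__(self, obs: float) -> float:
        dist = abs(obs - self.goal)
        return max(0.0, 1.0 - dist / self.goal)


def dense_rewards(progress_fn, observations):
    """Reward each transition by the change in estimated progress.
    Summed over a trajectory, this telescopes to
    progress(final observation) - progress(initial observation)."""
    estimates = [progress_fn(o) for o in observations]
    return [b - a for a, b in zip(estimates, estimates[1:])]


estimator = ProgressEstimator(goal=10.0)
trajectory = [0.0, 2.0, 5.0, 9.0, 10.0]  # observations approaching the goal
rewards = dense_rewards(estimator, trajectory)
# The cumulative reward approximately equals final minus initial progress.
```

Because every transition produces a nonzero signal, the policy gets feedback at each step rather than a single sparse success bit at the end, which is what makes online RL sample-efficient in this setting.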
- Accumulative Progress Estimation: smooths noisy point-wise estimates into a stable signal through interval-based sampling and incremental aggregation
- Progressive Horizon Extension: a gradual curriculum that extends the exploration horizon, letting the policy master simpler sub-tasks before tackling complete long-horizon tasks
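Both ideas can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's exact method: the aggregation rule (average estimates per interval, accumulate only positive increments) and the linear horizon schedule are hypothetical stand-ins chosen to show the shape of each mechanism.

```python
def accumulate_progress(pointwise, interval=3):
    """Accumulative progress sketch: average noisy point-wise estimates
    over fixed intervals, then accumulate only the positive increments
    between consecutive interval means, ignoring transient dips."""
    means = [
        sum(pointwise[i:i + interval]) / len(pointwise[i:i + interval])
        for i in range(0, len(pointwise), interval)
    ]
    progress, out, prev = 0.0, [], means[0]
    for m in means:
        progress += max(0.0, m - prev)  # discard negative (noisy) steps
        out.append(progress)
        prev = m
    return out


def horizon_schedule(total_horizon, stages=4):
    """Progressive horizon sketch: linearly extend the exploration
    horizon across curriculum stages up to the full task length."""
    return [round(total_horizon * (s + 1) / stages) for s in range(stages)]


noisy = [0.1, 0.0, 0.2, 0.3, 0.25, 0.35, 0.6, 0.55, 0.65]
smoothed = accumulate_progress(noisy, interval=3)   # non-decreasing signal
stages = horizon_schedule(200, stages=4)            # [50, 100, 150, 200]
```

The accumulated signal is non-decreasing by construction, which is what makes it usable as a stable reward; the schedule simply grows the rollout length stage by stage so early training focuses on the final sub-tasks of a long-horizon goal.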
| Setting | Improvement | Details |
|---|---|---|
| Long-Horizon Tasks | +8.6% | LIBERO-Long benchmark |
| 1-Shot Learning | +22.0% | Minimal demonstration data |
| Cross-Task Transfer | 0% → 20.8% | Zero-shot generalization without task-specific SFT |
Through autonomous test-time training, EVOLVE-VLA develops skills entirely absent from demonstrations:
- ✅ Error Recovery: Re-attempting failed grasps and adjusting strategies
- ✅ Adaptation: Handling unexpected object state changes
- ✅ Novel Strategies: Discovering alternative manipulation approaches (e.g., grasping cup body instead of handle)
Check out our project page for video demonstrations showing:
- Error recovery through repeated grasp attempts
- Adaptation to unexpected state changes
- Novel manipulation strategies learned autonomously
We are committed to releasing the following upon publication:
- 🔓 Full training code for EVOLVE-VLA
- 🔓 Inference codebase for deploying trained models
- 🔓 Pre-trained models on HuggingFace
- 🔓 Evaluation scripts for LIBERO benchmark
- 📖 Detailed documentation and tutorials
Stay tuned! Star this repo to get notified when the code is released.
If you find our work useful, please cite:
@article{bai2025evolve,
  title={EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models},
  author={Bai, Zechen and Gao, Chen and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2512.14666},
  year={2025}
}

We thank the following projects and teams for their valuable contributions to the community:
- OpenVLA for the open-source VLA model and codebase
- SimpleVLA-RL for pioneering work on RL fine-tuning for VLAs
- verl for the efficient RL training framework
- VLAC for the vision-language action critic model
- LIBERO team for providing the benchmark