Moritz Reuss1, Hongyi Zhou1, Marcel Ruehle1, Ömer Erdinç Yağmurlu1, Fabian Otto2, Rudolf Lioutikov1
1Intuitive Robots Lab (IRL), Karlsruhe Institute of Technology (KIT) 2Microsoft Research
FLOWER VLA is a lightweight, efficient Vision-Language-Action (VLA) policy for robotic manipulation that achieves state-of-the-art performance on multiple benchmarks. It is built on a rectified flow architecture with several key features:
- Efficient Architecture: At ~1B parameters, FLOWER is significantly smaller than most VLA models
- Low Training Cost: Only requires ~200 GPU hours of pretraining
- Low Memory Footprint: Uses <3GB of GPU memory for inference in the single-image setting
- SOTA Performance: Achieves state-of-the-art results on the CALVIN and LIBERO benchmarks
For pretraining FLOWER and for finetuning on Aloha, check out our other codebase.
FLOWER VLA uses a Florence-2-large-based VLM combined with a rectified flow architecture to predict robot actions from visual observations and language instructions. The model efficiently handles multi-step action sequences through a chunking mechanism.
Key architectural components:
- Florence-2 DaVit Image Encoder [350M params]
- Language conditioning through cross-attention
- Half of the Florence-2 LLM layers for vision and language fusion [205M]
- Rectified Flow Prediction for fast action generation (all CALVIN and LIBERO results are achieved using just 4 denoising steps)
- Global AdaLN for parameter efficient conditioning
- Action chunking for multi-step action generation
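The rectified flow head can be understood as integrating a learned velocity field from Gaussian noise (t=0) to an action chunk (t=1) in a handful of Euler steps. Here is a minimal, illustrative sketch of such a few-step sampler; the function and variable names are our own and not taken from the codebase, and the toy velocity field stands in for the trained model:

```python
import numpy as np

def sample_actions(velocity_fn, action_shape, num_steps=4, seed=0):
    """Integrate the rectified-flow ODE dx/dt = v(x, t) from Gaussian
    noise at t=0 to an action chunk at t=1 with a few Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(action_shape)   # start from pure noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)      # one Euler step along the flow
    return x

# Toy velocity field: points straight from the current state toward a
# fixed target chunk; a trained model predicts this from observations.
target = np.full((10, 7), 0.5)              # 10-step chunk of 7-DoF actions
v = lambda x, t: (target - x) / max(1.0 - t, 1e-8)
actions = sample_actions(v, (10, 7))        # reaches target after 4 steps
```

Because the toy field describes a straight-line flow, even 4 Euler steps recover the target exactly, which is the intuition behind rectified flow's few-step sampling.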
To begin, clone this repository locally:

```bash
git clone --recurse-submodules [email protected]:intuitive-robots/flower_vla_calvin.git
export flower_calvin_ROOT=$(pwd)/flower_vla_calvin
```
Install the requirements. (Note: we provide a modified version of pyhash, given the numerous problems we encountered when installing it manually on our SLURM cluster. You can also try installing setuptools via pip.)
```bash
cd $flower_calvin_ROOT
conda create -n flower_cal python=3.9
conda activate flower_cal
cd calvin_env/tacto
pip install -e .
cd ..
pip install -e .
cd ..
cd LIBERO
pip install -r requirements.txt
pip install -e .
pip install numpy~=1.23
cd ..
pip install setuptools==57.5.0
cd pyhash-0.9.3
python setup.py build
python setup.py install
cd ..
```
Next, install the remaining packages:

```bash
pip install -r requirements.txt
```
If you want to train on the CALVIN dataset, choose a split (D or ABCD) and download it:

```bash
cd $flower_calvin_ROOT/dataset
sh download_data.sh D  # or: sh download_data.sh ABCD
```
To train FLOWER with 4 GPUs, run:

```bash
python flower/training.py
```
You can use the pretrained FLOWER checkpoint from hf-link to train your own model on any of the datasets.
Note that during training, the full CALVIN evaluation or LIBERO rollouts are run after `rollout_lh_skip_epochs` epochs and then every `callbacks.rollout_lh.rollout_freq * 1k` training steps. Check the training config to adapt these parameters.
To replicate the original training results, we recommend using 4 GPUs with a batch size of 8, training for 40k steps for ABC (ABCD), and evaluating after 19 epochs to get the best possible results. See the training configs for details.
Since FLOWER uses action chunking, it needs to load multiple (~10) `episode_{}.npz` files for each sample. Combined with batching, this requires a large amount of disk bandwidth per iteration (usually ~2000 MB/iteration), which can significantly reduce GPU utilization during training, depending on your hardware. You can therefore use the script `extract_by_key.py` to extract the data into a single file, avoiding opening many episode files when using the CALVIN dataset.
```bash
python preprocess/extract_by_key.py -i /YOUR/PATH/TO/CALVIN/ \
    --in_task all
```
Run this command to see more detailed information:

```bash
python preprocess/extract_by_key.py -h
```
Important params:
- `--in_root`: `/YOUR/PATH/TO/CALVIN/`, e.g. `/data3/geyuan/datasets/CALVIN/`
- `--extract_key`: a key of `dict(episode_xxx.npz)`, default is `'rel_actions'`; the saved file name depends on this (i.e. `ep_{extract_key}.npy`)

Optional params:
- `--in_task`: default is `'all'`, meaning all task folders (e.g. `task_ABCD_D/`) of CALVIN
- `--in_split`: default is `'all'`, meaning both `training/` and `validation/`
- `--out_dir`: optional, default is `None`, and will be converted to `{in_root}/{in_task}/{in_split}/extracted/`
- `--force`: whether to overwrite existing extracted data

Thanks to @ygtxr1997 for debugging the GPU utilization and providing a merge request.
Download the pretrained FLOWER checkpoints from Hugging Face. You can find all checkpoints under:
We provide pretrained checkpoints for all CALVIN and LIBERO challenges.
Below is the average performance of FLOWER on CALVIN. It is currently state-of-the-art for all variants of CALVIN:
| Train→Test | Method | PrT | 1 | 2 | 3 | 4 | 5 | Avg. Len. |
|---|---|---|---|---|---|---|---|---|
| ABCD→D | Diff-P-CNN | × | 86.3% | 72.7% | 60.1% | 51.2% | 41.7% | 3.16±0.06 |
| | Diff-P-T | × | 78.3% | 53.9% | 33.8% | 20.4% | 11.3% | 1.98±0.09 |
| | RoboFlamingo | ✓ | 96.4% | 89.6% | 82.4% | 74.0% | 66.0% | 4.09±0.00 |
| | GR-1 | ✓ | 94.9% | 89.6% | 84.4% | 78.9% | 73.1% | 4.21±0.00 |
| | MDT | × | 98.6% | 95.8% | 91.6% | 86.2% | 80.1% | 4.52±0.02 |
| | MoDE | ✓ | 97.1% | 92.5% | 87.9% | 83.5% | 77.9% | 4.39±0.04 |
| | KomosVLA | ✓ | 98.0% | 93.6% | 85.4% | 77.8% | 70.4% | 4.49 |
| | FLOWER (ours) | × | 99.1% | 97.8% | 95.2% | 92.4% | 87.8% | 4.67±0.04 |
| ABC→D | Diff-P-CNN | × | 63.5% | 35.3% | 19.4% | 10.7% | 6.4% | 1.35±0.05 |
| | Diff-P-T | × | 62.2% | 30.9% | 13.2% | 5.0% | 1.6% | 1.13±0.02 |
| | RoboFlamingo | ✓ | 82.4% | 61.9% | 46.6% | 33.1% | 23.5% | 2.47±0.00 |
| | GR-1 | ✓ | 85.4% | 71.2% | 59.6% | 49.7% | 40.1% | 3.06±0.00 |
| | 3DDA | × | 93.8% | 80.3% | 66.2% | 53.3% | 41.2% | 3.35 |
| | MoDE | ✓ | 96.2% | 88.9% | 81.1% | 71.8% | 63.5% | 4.01±0.04 |
| | KomosVLA | ✓ | 98.0% | 93.6% | 85.4% | 77.8% | 70.4% | 4.25 |
| | VPP | ✓ | 95.7% | 91.2% | 86.3% | 81.0% | 75.0% | 4.29 |
| | Seer | ✓ | 96.3% | 91.6% | 86.1% | 80.3% | 74.0% | 4.28 |
| | FLOWER (ours) | × | 99.3% | 95.9% | 90.5% | 84.8% | 77.5% | 4.54±0.02 |
| D→D | FLOWER (ours) | × | 98.4% | 94.0% | 87.9% | 81.7% | 74.1% | 4.36±0.04 |
FLOWER achieves strong performance across all LIBERO benchmarks:

| Benchmark | FLOWER Success Rate |
|---|---|
| LIBERO-10 | 94.5% |
| LIBERO-90 | 93.4% |
| LIBERO-SPATIAL | 97.2% |
| LIBERO-OBJECT | 99.3% |
| LIBERO-GOAL | 96.9% |
Sometimes the following line causes problems for the Python environment, so just delete it:

```python
log.info(f"Using calvin_env with commit {get_git_commit_hash(Path(calvin_env.__file__))}.")
```

This line lives in the CALVIN env repo: https://github.com/mees/calvin_env/blob/797142c588c21e76717268b7b430958dbd13bf48/calvin_env/envs/play_table_env.py#L72
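Instead of editing the file by hand, you could comment the line out with `sed`. This is a hypothetical one-liner, not part of the codebase; the file path is an assumption, so adjust it to wherever your local `calvin_env` checkout lives:

```shell
# Assumed location of the submodule checkout; adjust to your setup.
FILE=calvin_env/calvin_env/envs/play_table_env.py
# Prefix the offending log line with '# ' (a .bak backup is kept).
if [ -f "$FILE" ]; then
    sed -i.bak '/get_git_commit_hash/s/^/# /' "$FILE"
fi
```

Commenting the line out rather than deleting it makes the change easy to revert if a later update needs it again.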
This work is only possible because of the code from the following open-source projects and datasets. We thank all authors for their work:
Original: https://github.com/mees/calvin
License: MIT
Original: https://github.com/Lifelong-Robot-Learning/LIBERO
License: https://github.com/Lifelong-Robot-Learning/LIBERO?tab=MIT-1-ov-file
Original: mimictest
License: license
Original: https://github.com/lukashermann/hulc
License: MIT
Original: https://github.com/intuitive-robots/FLOWER_Diffusion_Policy
License: https://github.com/intuitive-robots/FLOWER_Diffusion_Policy/blob/main/LICENSE
If you found the code useful, please cite our work (arXiv link coming very soon):
```bibtex
@inproceedings{
reuss2025flower,
???
}
```