Expected zero shot performance on custom robot

Hi! Thanks a lot for open-sourcing this awesome work.

I’ve been testing OmniVLA on a custom wheeled robot in outdoor environments, and I’m seeing mixed performance, as shown in the videos below (video speed is 3x):

https://github.com/user-attachments/assets/31c438d4-8dea-4381-908b-3bc135552b3e.mp4

_acceptable performance_

https://github.com/user-attachments/assets/4e8a50b1-ea0e-4c75-9fdf-a0296189b186.mp4

_bad performance_


The green line is generated from OmniVLA’s output waypoints, scaled by a metric resolution of 0.2 m. I tried both the satellite and prompt modalities, and the behavior is very similar.
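To make the scaling explicit, here is a minimal sketch of what I mean. It assumes the model outputs normalized (x, y) waypoints as an (N, 2) array in the robot frame; the function name and array layout are illustrative and not taken from the OmniVLA code.

```python
import numpy as np

METRIC_RESOLUTION_M = 0.2  # meters per normalized waypoint unit (value quoted above)


def waypoints_to_meters(waypoints, resolution=METRIC_RESOLUTION_M):
    """Scale normalized (x, y) waypoints to meters.

    The (N, 2) layout is an assumption made for this sketch.
    """
    waypoints = np.asarray(waypoints, dtype=float)
    if waypoints.ndim != 2 or waypoints.shape[1] != 2:
        raise ValueError("expected an (N, 2) array of (x, y) waypoints")
    return waypoints * resolution
```

If the intended scale factor is different for a robot of my size, that alone could explain part of the behavior.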

I’m running inference on a desktop RTX 5090. The forward pass takes ~100 ms, but I’m only receiving images at 4 Hz. I also tried all released checkpoints (omnivla-original, omnivla-original-balance, omnivla-finetuned-cast) with similar results.
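As a rough back-of-the-envelope check on what the 4 Hz image rate means for control, the sketch below estimates how old the underlying image can get before a new command replaces the previous one. It assumes zero camera and transport latency and that each command is held until the next one arrives; both are simplifications, and the speed values are the limits given further down.

```python
IMAGE_RATE_HZ = 4.0   # image rate reported above
INFERENCE_S = 0.100   # approximate forward-pass time reported above
MAX_LINEAR = 0.3      # m/s, command limit used in my wrapper
MAX_ANGULAR = 0.75    # rad/s, command limit used in my wrapper


def worst_case_staleness(image_rate_hz=IMAGE_RATE_HZ, inference_s=INFERENCE_S):
    """Age of the image behind a command just before the next command replaces it."""
    return 1.0 / image_rate_hz + inference_s


staleness = worst_case_staleness()           # about 0.35 s
max_travel_m = MAX_LINEAR * staleness        # about 0.105 m
max_rotation_rad = MAX_ANGULAR * staleness   # about 0.26 rad
```

So at the speed limits the robot can move roughly 10 cm, or turn roughly 15 degrees, on information that is up to 0.35 s old.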

For inference, I wrote a ROS wrapper around the official script:

https://github.com/NHirose/OmniVLA/blob/main/inference/run_omnivla.py

The wrapper fills in the robot state and produces a cmd_vel output, while keeping the core inference code unchanged. I limited the command speeds to:

- linear velocity ≤ 0.3 m/s
- angular velocity ≤ 0.75 rad/s
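For context, the clamping works along the lines of the sketch below. This is not the actual wrapper code: the proportional mapping from a waypoint to a velocity command, the `dt` horizon, and all names are assumptions made for illustration. Only the two limits are the real values.

```python
import math

MAX_LINEAR = 0.3    # m/s, limit stated above
MAX_ANGULAR = 0.75  # rad/s, limit stated above


def clamp(value, limit):
    """Restrict value to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def waypoint_to_cmd_vel(x, y, dt=1.0):
    """Map one waypoint (x forward, y left, in meters) to (linear, angular).

    Hypothetical proportional controller: try to cover the distance and the
    heading error within dt seconds, then clamp both to the limits.
    """
    distance = math.hypot(x, y)
    heading_error = math.atan2(y, x)
    linear = clamp(distance / dt, MAX_LINEAR)
    angular = clamp(heading_error / dt, MAX_ANGULAR)
    return linear, angular
```

With these limits, any waypoint farther than 0.3 m ahead saturates the linear command, so the scaled waypoint distance mostly affects the heading.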

Here is a sample video where the model fails to keep the robot on the sidewalk. The prompt is: "navigate on the center of the sidewalk".

https://github.com/user-attachments/assets/01efb820-4834-4d0b-90a1-09ce3d7aa770.mp4

I was wondering:
- What zero-shot performance should I expect from OmniVLA on unseen robots?
- Roughly how much data do you think would be needed to fine-tune the model to match the performance you get on the training platforms?
- What's the best format for storing the fine-tuning dataset?

Thanks!

