
I collected 200 robot trajectories and fine-tuned Lingbot-VLA on them for a total of 200,000 steps, saving a checkpoint every 50,000 steps.
I tested the 100,000-step checkpoint and found the inference performance to be very poor, even though the training loss dropped to around 0.02. I am training on A100 GPUs with a batch size of 32 (16 per node). The dataset is in LeRobot format.
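One check I'm considering is whether the action statistics my deployment pipeline uses for (de)normalization actually match the training data. Below is a minimal sketch of that comparison; the `.npy` file paths are placeholders (the real loading step depends on the LeRobot version and where the checkpoint stores its norm stats):

```python
import numpy as np

# Placeholder: actions stacked from all 200 trajectories, shape (N, action_dim).
# The actual loading call depends on the LeRobot version, so it is stubbed here.
actions = np.load("all_actions.npy")

# Per-dimension statistics of the raw training actions.
train_mean = actions.mean(axis=0)
train_std = actions.std(axis=0)

# Placeholder: the stats the deployment pipeline loads to denormalize outputs.
deploy_mean = np.load("norm_stats_mean.npy")
deploy_std = np.load("norm_stats_std.npy")

# If these diverge, the policy's outputs are denormalized into the wrong range,
# which could explain a low training loss alongside poor rollouts.
print("mean max abs diff:", np.max(np.abs(train_mean - deploy_mean)))
print("std  max abs diff:", np.max(np.abs(train_std - deploy_std)))
```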
I am struggling to identify the root cause. Could you provide some insights? Is the issue more likely the dataset size, action normalization, the training configuration, or the deployment pipeline?