Poor inference performance in the real robot #28

Description

@gzyabc
I have collected 200 robot trajectories and trained Lingbot-VLA on them for a total of 200,000 steps, saving a checkpoint every 50,000 steps.

I tested the 100,000-step checkpoint and found the real-robot inference performance to be very poor, even though the training loss dropped to around 0.02. I am training on A100 GPUs with a global batch size of 32 (16 per node). The dataset is in LeRobot format.

I am struggling to identify the root cause. Could you provide some insight? Is the problem more likely the dataset size, action normalization, the training configuration, or the deployment pipeline?
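One quick sanity check for the action-normalization hypothesis is to compare the statistics of the actions the policy was trained on against what the deployment pipeline feeds it. The sketch below is generic NumPy, not the LeRobot or Lingbot-VLA API; the `(N, action_dim)` array layout and the assumption of min-max normalization to `[-1, 1]` are illustrative assumptions.

```python
import numpy as np

def action_stats(actions: np.ndarray) -> dict:
    """Per-dimension statistics for an (N, action_dim) array of actions."""
    return {
        "min": actions.min(axis=0),
        "max": actions.max(axis=0),
        "mean": actions.mean(axis=0),
        "std": actions.std(axis=0),
    }

def looks_normalized(actions: np.ndarray, atol: float = 1e-2) -> bool:
    """Heuristic: if actions were min-max normalized to [-1, 1] during
    training, every dimension should lie roughly within that interval."""
    s = action_stats(actions)
    return bool(np.all(s["min"] >= -1.0 - atol) and np.all(s["max"] <= 1.0 + atol))

# Toy example: 1000 fake 7-DoF actions, normalized vs. raw joint angles
rng = np.random.default_rng(0)
normalized = rng.uniform(-1.0, 1.0, size=(1000, 7))
raw_radians = rng.uniform(-3.14, 3.14, size=(1000, 7))
print(looks_normalized(normalized))    # expected: True
print(looks_normalized(raw_radians))   # expected: False
```

If the training-time stats and the deployment-time stats disagree (for example, the robot driver sends raw joint angles while training used normalized deltas), that mismatch alone can produce exactly this symptom: low training loss but useless real-robot behavior.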
