Lerobot data loader issue with lerobot_base_train_config.gin #21

@lial1115

Description

The current version of x-mobility does not appear to support lerobot data: training with configs/lerobot_base_train_config.gin fails with a runtime error. The key discrepancy I observed is in the number of semantic labels:

  • lerobot provides 15 semantic labels
  • The corresponding parquet data only includes 7 semantic labels

When running the training command (detailed below), x-mobility cannot handle the 15 labels from lerobot: label ids beyond the 7 expected classes are treated as out of range, which triggers a CUDA device-side assert in the NLL loss kernel.
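A quick way to confirm the mismatch before training is to count label ids that fall outside the range the loss expects. The following is only a minimal sketch: the helper name `out_of_range_labels` and the sample data are hypothetical, and it assumes the semantic label map has already been flattened into a sequence of integer ids.

```python
from collections import Counter


def out_of_range_labels(labels, n_classes):
    """Count label ids that fall outside the valid range [0, n_classes)."""
    return Counter(int(v) for v in labels if not 0 <= int(v) < n_classes)


# Hypothetical flattened label map containing ids 0..14 (15 semantic labels).
sample = list(range(15)) + [14, 14]

# Check against the 7 classes the loss is assumed to be configured for.
bad = out_of_range_labels(sample, n_classes=7)
print(sorted(bad))
```

Any non-empty result means the loss would hit the same `cur_target < n_classes` assertion on those pixels.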

Reproduction Steps

Execute the training script with the specified config and data paths:

python3 train.py -c configs/lerobot_base_train_config.gin \
-d /mnt/data/notebook3/x_mobility_isaac_sim_mobilitygen \
-o /mnt/data/notebook3/mobilitygen_output

Error Log

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stack trace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1

/pytorch/aten/src/ATen/native/cuda/NLLLoss2d.cu:73: nll_loss2d_forward_no_reduce_kernel: block: [88,0,0], thread: [963,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
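As the log suggests, CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous so the stack trace points at the failing call. One way to set it, sketched here, is from Python before CUDA is initialised; where exactly this would go in train.py is an assumption, since the script's contents are not shown in this issue.

```python
import os

# Set this before CUDA is initialised (in practice, before importing torch),
# so that the device-side assert is reported at the call that caused it.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Exporting the variable in the shell before running the training command has the same effect.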
