Custom Dataset Generation for Training #7

Hi,

I have gone through the README.md instructions for training, evaluation, and ROS deployment using the provided datasets and pretrained model checkpoints. Now, **I am trying to generate my own custom dataset** for my use case, in order to train X-Mobility from scratch. I performed EDA on the provided datasets, `x_mobility_isaac_sim_nav2_100k` and `x_mobility_isaac_sim_random_160k`, so that I have a foundation to base my new dataset on. Based on that, I have the following questions about the dataset fields:
________________________________________________________________________________________________________________
1. Each `.pqt` file has the following columns:
```
'acqtime', 'camera_intrinsic_matrix', 'camera_extrinsic_matrix',
'perspective_semantic_image', 'perspective_semantic_image_shape',
'perspective_semantic_image_labels', 'camera_image', 'driving_command',
'route', 'ego_speed', 'path', 'route_poses', 'goal_id',
'semantic_labels'.
```
**Is my deduction correct that `'acqtime'`, `'camera_intrinsic_matrix'`, `'camera_extrinsic_matrix'` and `'goal_id'` are NOT used by either of the 2 models?** If they are used in the code, can you please clarify how the data for `'acqtime'`, `'camera_extrinsic_matrix'` and `'goal_id'` is obtained?
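One way to sanity-check this deduction is to search the repository's Python sources for each column name. The sketch below is only an illustration: `find_column_usages` is a hypothetical helper, not part of X-Mobility, and a plain quoted-string match will miss columns that are accessed through constants or renamed keys.

```python
from pathlib import Path

# Columns whose usage is in question (names taken from the .pqt schema above).
COLUMNS = ["acqtime", "camera_intrinsic_matrix", "camera_extrinsic_matrix", "goal_id"]


def find_column_usages(repo_root, columns):
    """Map each column name to the .py files that mention it as a quoted string."""
    usages = {name: [] for name in columns}
    for py_file in sorted(Path(repo_root).rglob("*.py")):
        text = py_file.read_text(encoding="utf-8", errors="ignore")
        for name in columns:
            # Hypothetical heuristic: look for 'name' or "name" literals only.
            if f"'{name}'" in text or f'"{name}"' in text:
                usages[name].append(str(py_file))
    return usages


if __name__ == "__main__":
    # Run from the root of a local clone of the repository.
    for name, files in find_column_usages(".", COLUMNS).items():
        print(name, "->", files or "no references found")
```

An empty list for a column only suggests it is unused; it does not prove it.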
________________________________________________________________________________________________________________
2. **`'route'` vs `'route_poses'`**
I see that the `'route'` and `'route_poses'` fields are both used in the ActionPolicy code. In the provided `.pqt` files, each entry of the `'route'` field is a .jpg image encoded as a bytestring which, when decoded, gives a 64x64 binary image representing the route being followed. The `'route_poses'` field, on the other hand, is an array of length 40, which the code later converts into an array of coordinate pairs (((x1, y1), (x2, y2)), ((x2, y2), (x3, y3)), etc.). While going through [`action_policy.py`](https://github.com/NVlabs/X-MOBILITY/blob/main/model/x_mobility/action_policy.py), I noticed that `'route_vectors'`, which is obtained from `'route_poses'`, is used in the forward pass [(line 174)](https://github.com/NVlabs/X-MOBILITY/blob/fa20de916cee3f15e2ae20a3073e9adcdc4694d6/model/x_mobility/action_policy.py#L174), while `'route'` is used in the inference function [(line 210)](https://github.com/NVlabs/X-MOBILITY/blob/fa20de916cee3f15e2ae20a3073e9adcdc4694d6/model/x_mobility/action_policy.py#L210). **Since `'route'` and `'route_poses'` are 2 different types of data (bytestring vs. array), how is one used for the forward pass and the other for inference?**
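To make the pairing described above concrete, here is a minimal sketch of how a flat pose array could be turned into consecutive segment pairs. It assumes the 40 values are interleaved `x, y` coordinates of 20 points, which I have not confirmed against the repository; `poses_to_segments` is a hypothetical helper, not the function used in `action_policy.py`.

```python
def poses_to_segments(route_poses):
    """Turn a flat [x1, y1, x2, y2, ...] sequence into ((x_i, y_i), (x_{i+1}, y_{i+1})) pairs."""
    if len(route_poses) % 2 != 0:
        raise ValueError("expected an even number of values (interleaved x, y)")
    # Assumption: even indices are x values, odd indices are y values.
    points = list(zip(route_poses[0::2], route_poses[1::2]))
    # Pair every point with its successor to form route segments.
    return list(zip(points[:-1], points[1:]))


flat = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
segments = poses_to_segments(flat)
print(segments)  # [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (1.0, 1.0))]
```

Under this assumption, an entry with 40 values would yield 20 points and 19 segments.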
________________________________________________________________________________________________________________
3. **How is the `'path'` field generated?**
I understand that `'path'` is the final output of the X-Mobility model during inference, but I need to generate entries for the `'path'` field for training, and I am not sure how to go about this. I assumed there would be some similarity between the `'route_poses'` field and the `'path'` field, because the former is used to create the final path in [`x_mobility_navigator.py`, line 276](https://github.com/NVlabs/X-MOBILITY/blob/fa20de916cee3f15e2ae20a3073e9adcdc4694d6/ros2_deployment/x_mobility_navigator/x_mobility_navigator/x_mobility_navigator.py#L276). In the `.pqt` file entries, however, `'path'` is an array of length 10 whereas `'route_poses'` is of length 40, and their values do not appear to be correlated. **Could you provide any guidance on how to generate the `'path'` field for training purposes, and explain why `'route_poses'` and `'path'` have different lengths and formats?**
________________________________________________________________________________________________________________
4. **Can transfer learning be performed in this framework?** Specifically, can I use pre-trained checkpoints and continue training the model with my custom dataset?
_________________________________________________________________________________________________________________
Apologies in advance for the lengthy post.
