Model Modification for Autoregressive Usage #12
base: main
This pull request modifies the training pipeline to support autoregressive generation of fluent sign language poses, conditioned on the whole disfluent sequence and the previously generated fluent history.

Summary of changes:
1. data/load_data.py (SignLanguagePoseDataset): autoregressive chunk sampling and normalization changes (detailed below).
2. core/models.py (SignLanguagePoseDiffusion).
3. core/training.py (PoseTrainingPortal): introduced a weight (lambda_vel) for the velocity loss term (see the sketch after this list).
4. config/option.py: added parser.add_argument entries for '--lambda_vel' and '--load_num'.
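The description does not show how the new flags feed into the loss, so the following is a minimal, hypothetical sketch of the idea: lambda_vel weights a first-order velocity term added to the pose reconstruction loss. The argument names come from the PR; the defaults, tensor shapes, and the helper function itself are assumptions, not the repository's actual code.

```python
import argparse
import torch.nn.functional as F

# Illustrative only: the real definitions live in config/option.py and core/training.py.
parser = argparse.ArgumentParser()
parser.add_argument('--lambda_vel', type=float, default=1.0,
                    help='weight of the velocity loss term (default assumed)')
parser.add_argument('--load_num', type=int, default=-1,
                    help='see config/option.py for the actual meaning')

def pose_and_velocity_loss(pred, target, lambda_vel):
    """pred/target: (batch, time, ...) pose tensors; hypothetical helper."""
    pose_loss = F.mse_loss(pred, target)
    # Velocity = first-order temporal difference of the poses.
    vel_loss = F.mse_loss(pred[:, 1:] - pred[:, :-1],
                          target[:, 1:] - target[:, :-1])
    return pose_loss + lambda_vel * vel_loss
```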
1. data/load_data.py (SignLanguagePoseDataset):
Modified __getitem__ to enable autoregressive training. Instead of returning a fixed initial segment of the fluent sequence, it now randomly samples a target chunk (data, of length chunk_len) from the ground-truth fluent sequence, extracts the ground-truth fluent pose history immediately preceding that chunk, and returns the history as conditions['previous_output'].
The full disfluent sequence remains as conditions['input_sequence'].
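A minimal sketch of that sampling logic, assuming plain NumPy arrays of shape (frames, ...); the helper name and shapes are illustrative, not the dataset's actual implementation.

```python
import numpy as np

def sample_autoregressive_item(fluent, disfluent, chunk_len):
    """Hypothetical stand-in for the dataset's __getitem__ sampling step."""
    # Randomly choose where the target chunk starts in the fluent sequence.
    start = np.random.randint(0, max(len(fluent) - chunk_len + 1, 1))
    target_chunk = fluent[start:start + chunk_len]
    conditions = {
        # Ground-truth fluent history preceding the chunk.
        'previous_output': fluent[:start],
        # The full disfluent sequence is always passed as conditioning.
        'input_sequence': disfluent,
    }
    return target_chunk, conditions
```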
Replaced the custom global mean/std calculation with pose_anonymization.data.normalization.normalize_mean_std. Data is now normalized by calling this function on the Pose objects after loading.
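Based on that description, and assuming normalize_mean_std takes a Pose and returns the normalized Pose (which the text implies), loading and normalizing might look roughly like:

```python
from pose_format import Pose
from pose_anonymization.data.normalization import normalize_mean_std

# Hypothetical path; in the dataset this happens right after each file is loaded.
with open("example.pose", "rb") as f:
    pose = Pose.read(f.read())

# Replaces the previous custom global mean/std computation.
pose = normalize_mean_std(pose)
```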
Ensured the sampled target_chunk is always padded with zeros to the fixed chunk_len within __getitem__ if the sampled segment is shorter (e.g., at the end of a sequence or for short sequences); the corresponding target_mask is padded with True (masked).
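A sketch of that padding behaviour (the array shapes and the helper are assumptions, not the dataset's actual code):

```python
import numpy as np

def pad_to_chunk_len(target_chunk, chunk_len):
    """Zero-pad a (frames, ...) chunk to chunk_len and build its mask (hypothetical helper)."""
    n_valid = len(target_chunk)
    missing = chunk_len - n_valid
    # Valid frames are unmasked (False); padded frames are masked (True).
    target_mask = np.zeros(n_valid, dtype=bool)
    if missing > 0:
        pad = np.zeros((missing,) + target_chunk.shape[1:], dtype=target_chunk.dtype)
        target_chunk = np.concatenate([target_chunk, pad], axis=0)
        target_mask = np.concatenate([target_mask, np.ones(missing, dtype=bool)])
    return target_chunk, target_mask
```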
Parameter renaming: the fluent_frames parameter of __init__ is now referred to internally as chunk_len to better reflect its role in the autoregressive setup.