Port CarRacing workflow to Gymnasium and modernize training stack #46
- switch data generation and rollout utilities to Gymnasium’s CarRacing-v3, adapting reset/step signatures, reward accumulation, render modes, and controller loading semantics
- harden MDRNN GMM loss by clamping component scales, using softplus parameterization, and replacing manual log-sum logic with torch.logsumexp
- refresh training scripts: cast observations to float32, enforce drop_last loaders, rework latent projection to handle arbitrary channel counts, migrate LR scheduler imports, and add shape/debug logging
- overhaul VAE training loop with torchvision v2 transforms, β-scheduled KL weighting, richer progress logging, and scaffolding for AMP while updating the loss implementation
- remove the bundled ReduceLROnPlateau clone now that torch.optim’s scheduler is used directly and bump requirements to torch/torchvision ≥2.1 with gymnasium[box2d]
- check in exp_dir/ training artifacts (controller/MDRNN checkpoints, job logs, VAE samples) and a PostScript torch asset
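
For reviewers unfamiliar with the Gymnasium API change behind the rollout port, here is a minimal sketch of the new reset/step handling. It is not the code in this PR: `run_episode` and `policy` are hypothetical names, and the environment is passed in so the loop does not depend on Box2D being installed.

```python
from typing import Any, Callable, Optional, Tuple


def run_episode(
    env: Any,
    policy: Callable[[Any], Any],
    max_steps: int = 1000,
    seed: Optional[int] = None,
) -> Tuple[float, int]:
    """Run one episode against a Gymnasium-style env (hypothetical helper).

    Returns the accumulated reward and the number of steps taken.
    """
    # Gymnasium's reset() returns (observation, info), not just the observation.
    obs, _info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(obs)
        # Gymnasium's step() returns five values: the old `done` flag is split
        # into `terminated` (task ended) and `truncated` (time limit or cutoff).
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

With Gymnasium and Box2D installed, the environment would typically be created with `gymnasium.make("CarRacing-v3", render_mode="rgb_array")`, where the render mode is fixed at construction time rather than passed to `render()`.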
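
The GMM loss hardening can be illustrated roughly as follows. This is a sketch under assumptions, not the PR's implementation: the function name `gmm_nll`, the tensor layout, and the `min_sigma` floor are all hypothetical, and the mixture is assumed to have diagonal Gaussian components.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Normal


def gmm_nll(target, mus, raw_sigmas, logit_pi, min_sigma=1e-4):
    """Negative log-likelihood of `target` under a diagonal Gaussian mixture.

    Assumed shapes (hypothetical layout):
      target:     (B, D)
      mus:        (B, K, D)
      raw_sigmas: (B, K, D)  unconstrained network outputs
      logit_pi:   (B, K)     unnormalized mixture logits
    """
    # Softplus keeps scales positive; the clamp stops them collapsing to zero.
    sigmas = F.softplus(raw_sigmas).clamp(min=min_sigma)
    # Normalized log mixture weights.
    log_pi = F.log_softmax(logit_pi, dim=-1)
    # Per-component log density, summed over the feature dimension: (B, K).
    comp_log_prob = Normal(mus, sigmas).log_prob(target.unsqueeze(-2)).sum(dim=-1)
    # logsumexp over components replaces a hand-written max-shift-and-sum.
    log_prob = torch.logsumexp(log_pi + comp_log_prob, dim=-1)
    return -log_prob.mean()
```

Using `torch.logsumexp` delegates the max-subtraction trick to the library, and the floor on the scales keeps the log density finite even when the network emits very negative raw values.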
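
The description does not say which β schedule the VAE loop uses. As one possible reading, a linear warm-up might look like the sketch below; `linear_beta`, `warmup_steps`, and `beta_max` are hypothetical names chosen for illustration.

```python
def linear_beta(step: int, warmup_steps: int, beta_max: float = 1.0) -> float:
    """Linearly ramp the KL weight from 0 to `beta_max` over `warmup_steps`.

    Assumption: a linear warm-up; the actual schedule in the PR may differ.
    """
    if warmup_steps <= 0:
        # No warm-up requested: use the full KL weight from the first step.
        return beta_max
    return beta_max * min(1.0, step / warmup_steps)


# Intended use inside a training step (names are illustrative only):
#   loss = reconstruction_loss + linear_beta(global_step, 1000) * kl_divergence
```

Starting with a small KL weight lets the decoder learn to reconstruct before the latent code is pulled toward the prior, which is the usual motivation for scheduling β.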