Fix ML environment registration bugs and improve meta-learning entry points#573
Open
marcusfechner wants to merge 4 commits into Farama-Foundation:master from
This PR fixes several bugs encountered when first using the Meta-Learning environments by following the examples in the documentation. These issues can be confusing for newcomers to Meta-World, and I hope these fixes make the initial experience significantly smoother.
Summary
"Fix ML1 train and test envs only using test split" (
e227f7b): Thefor split in ["train", "test"]loop registered lambdas forMeta-World/ML1-trainandMeta-World/ML1-test, but the lambda capturedsplitby reference rather than by value. By the time either lambda was called, the loop had finished andsplitwas always"test". This meantML1-trainenvironments were silently using test tasks instead of training tasks. Fixed by binding the loop variable via a default argument (_split=split).Reproduce:
"Make sample_tasks_on_reset True to prevent initial tasks not being set on reset()" (
ed5f15b):PseudoRandomTaskSelectWrapperdefaulted tosample_tasks_on_reset=False, which meant that on the initialreset()call no task was sampled and the environment had no task set. This caused environments created through the ML entry points to start in an undefined task state. Changing the default toTrueensures a task is always sampled when the environment is reset.Reproduce:
"Add single env entry point for ML1" (
141a8a9): ML1 registrations only provided avector_entry_point, so callinggym.make("Meta-World/ML1-train", env_name="...")would fail because there was no non-vector entry point. Added a_ml1_entry_pointfunction and registered it asentry_pointalongside the existingvector_entry_point, enabling single-environment creation viagym.make().Reproduce:
"Refactor meta_batch_size handling in ML environment functions to allow None values" (
1fa1b11):meta_batch_sizedefaulted to20across all ML entry points, which was arbitrary and would silently fail for benchmarks where the number of environment classes didn't divide20. Changed the default toNone, which now auto-defaults to the number of environment classes (one sub-environment per class) and can be overwritten by specifyingmeta_batch_size. I Also added descriptive assertion messages that report the actual values whenmeta_batch_sizeis too small or not evenly divisible.Reproduce: