-
-
Notifications
You must be signed in to change notification settings - Fork 982
Include initial state manager for BipedalWalker #1305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Include initial state manager for BipedalWalker #1305
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting solution
Could you add tests and some general documentation on how to use this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the changes, I don't understand why the class is required or half the changes.
Could you explain what the changes do and why they are implemented like this?
.gitignore
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove these changes
Yes, I'll include an overview, tests and a tutorial here soon. In short, the idea is to provide the possibility of designing the obstacle course, by choosing which obstacles at which point should be included in the terrain and also their characteristics. Also, if the design choice for the solution is a problem, I can refactor it to suit a specific standard. |
To me, the class seems unnecessary, or at least in its overly complex. We wish to have an easy-to-understand code base |
I've removed the class and created two internal functions:
Looks simpler to me, let me know what you think. Where can I include examples on how to use this? Here's an example on how this could be used to design a specific terrain (or copy an existing terrain): import gymnasium as gym
GRASS, STUMP, STAIRS, PIT, _STATES_ = range(5)
OBSTACLES = dict(
down_stairs=dict(state=2, metadata=(-1, 4, 2)),
up_stairs=dict(state=2, metadata=(1, 4, 2)),
small_stump=dict(state=1, metadata=1),
large_stump=dict(state=1, metadata=3),
hole=dict(state=3, metadata=3),
)
env = gym.make("BipedalWalker-v3", hardcore=True, render_mode="human")
metadata = dict(
states=[OBSTACLES["hole"], OBSTACLES["hole"], OBSTACLES["large_stump"]],
x_variations=False,
y_variations=True
)
# Use the `options` parameter to pass the terrain metadata and initialize the environment
options = dict(metadata=metadata)
env.reset()
# Execute for visualization
for _ in range(100):
# this is where you would insert your policy
action = env.action_space.sample()
# step (transition) through the environment with the action
# receiving the next observation, reward and if the episode has terminated or truncated
observation, reward, terminated, truncated, info = env.step(action) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arthur-plautz Your changes make way more sense, thanks for the changes.
Could you add documentation in the class docstring with a section on determining terrian with a note in the version section.
Then could you add some tests in tests/envs/test_env_implementations.py
to check that the implementation works as expected.
@@ -281,7 +288,80 @@ def _destroy(self): | |||
self.legs = [] | |||
self.joints = [] | |||
|
|||
def _process_terrain_metadata(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a docstring to explain what the function does
self.terrain_grass = TERRAIN_GRASS | ||
self._terrain_metadata = dict(states=[]) | ||
|
||
def _generate_terrain_state(self, state: int) -> any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docstring
|
||
if self._predefined_terrain: | ||
if state == GRASS: | ||
next_state = self._terrain_metadata["states"][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why for GRASS do we only get the first while for the next state we pop the first?
Description
The changes described in this PR contribute with the possibility of controlling the terrain generation for the
BipedalWalkerV3
.Being able to retrieve and configure a specific terrain for this environment can be very useful for multiple use-cases, by allowing the user to have a greater visibility and control over the terrain.
The
TerrainMetadata
class was added to manage the read/write operations over random values generated to build the terrain, when calling theenv.reset
method. The code snippet below showcases the use of the this new functionality:Further details on the motivation of this change are described in the referenced issue below.
Fixes # 892
Type of change
Please delete options that are not relevant.
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)