Commit cd701a0

Fixes for CarRacing-v3 (#496)
* Fixes for CarRacing-v3 and Gymnasium v1.0
* Update to constant schedule class
* Add score normalization for bipedal walker and lunar lander
* Update CarRacing hyperparams
* Update SB3
1 parent ad1ae18 commit cd701a0

10 files changed: 27 additions, 13 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ hub
 *.mp4
 *.json
 _build/
+run_crossq_bipedal.sh

 tests/dummy_env/build/

CHANGELOG.md

Lines changed: 4 additions & 2 deletions
@@ -1,15 +1,17 @@
-## Release 2.6.1 (WIP)
+## Release 2.7.0a0 (WIP)

 ### Breaking Changes
-- Upgraded to SB3 >= 2.6.1
+- Upgraded to SB3 >= 2.7.0
 - `linear_schedule` now returns a `SimpleLinearSchedule` object for better portability
 - Renamed `LunarLander-v2` to `LunarLander-v3` in hyperparameters
+- Renamed `CarRacing-v2` to `CarRacing-v3` in hyperparameters

 ### New Features

 ### Bug fixes
 - Docker GPU images are now working again
 - Use `ConstantSchedule`, and `SimpleLinearSchedule` instead of `constant_fn` and `linear_schedule`
+- Fixed `CarRacing-v3` hyperparameters for newer Gymnasium version

 ### Documentation
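
The changelog entries on `ConstantSchedule` and `SimpleLinearSchedule` describe replacing function-based schedules with schedule objects. The sketch below illustrates that idea only; the class names `ConstantScheduleSketch` and `LinearScheduleSketch` are hypothetical stand-ins written for this example and are not the SB3 classes. It assumes the usual SB3 convention that a schedule is called with `progress_remaining`, which goes from 1 (start of training) to 0 (end).

```python
class ConstantScheduleSketch:
    """Hypothetical stand-in: always returns the same value."""

    def __init__(self, value: float):
        self.value = value

    def __call__(self, progress_remaining: float) -> float:
        # The progress argument is ignored for a constant schedule.
        return self.value


class LinearScheduleSketch:
    """Hypothetical stand-in: decays linearly from initial_value to 0."""

    def __init__(self, initial_value: float):
        self.initial_value = initial_value

    def __call__(self, progress_remaining: float) -> float:
        # progress_remaining is 1.0 at the start of training and 0.0 at the end.
        return progress_remaining * self.initial_value


constant = ConstantScheduleSketch(3.0)
linear = LinearScheduleSketch(2.0)
halfway_constant = constant(0.5)
halfway_linear = linear(0.5)
```

A plain class instance keeps its parameters as ordinary attributes, which is one plausible reading of the "better portability" wording; a closure returned by a factory function hides them in its captured scope.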

hyperparams/ppo.yml

Lines changed: 4 additions & 4 deletions
@@ -347,13 +347,13 @@ MiniGrid-ObstructedMaze-2Dlh-v0:
   n_timesteps: !!float 1e7 # Unsolved


-CarRacing-v2:
+CarRacing-v3:
   env_wrapper:
     - rl_zoo3.wrappers.FrameSkip:
         skip: 2
-    - gymnasium.wrappers.resize_observation.ResizeObservation:
-        shape: 64
-    - gymnasium.wrappers.gray_scale_observation.GrayScaleObservation:
+    - rl_zoo3.wrappers.YAMLCompatResizeObservation:
+        shape: [64, 64]
+    - gymnasium.wrappers.transform_observation.GrayscaleObservation:
         keep_dim: true
   frame_stack: 2
   normalize: "{'norm_obs': False, 'norm_reward': True}"

hyperparams/ppo_lstm.yml

Lines changed: 4 additions & 4 deletions
@@ -283,13 +283,13 @@ InvertedPendulumSwingupBulletEnv-v0:
   clip_range: 0.2


-CarRacing-v2:
+CarRacing-v3:
   env_wrapper:
     # - rl_zoo3.wrappers.FrameSkip:
     # skip: 2
-    - gymnasium.wrappers.resize_observation.ResizeObservation:
-        shape: 64
-    - gymnasium.wrappers.gray_scale_observation.GrayScaleObservation:
+    - rl_zoo3.wrappers.YAMLCompatResizeObservation:
+        shape: [64, 64]
+    - gymnasium.wrappers.transform_observation.GrayscaleObservation:
         keep_dim: true
   frame_stack: 2
   normalize: "{'norm_obs': False, 'norm_reward': True}"

hyperparams/sac.yml

Lines changed: 1 addition & 1 deletion
@@ -161,7 +161,7 @@ MinitaurBulletDuckEnv-v0:
   learning_starts: 10000

 # To be tuned
-CarRacing-v2:
+CarRacing-v3:
   env_wrapper:
     - rl_zoo3.wrappers.FrameSkip:
         skip: 2

rl_zoo3/plots/plot_from_file.py

Lines changed: 2 additions & 0 deletions
@@ -156,6 +156,8 @@ def plot_from_file(): # noqa: C901
             "Ant": "AntBulletEnv-v0",
             "Hopper": "HopperBulletEnv-v0",
             "Walker": "Walker2DBulletEnv-v0",
+            "LunarLanderContinuous": "LunarLanderContinuous-v3",
+            "BipedalWalker": "BipedalWalker-v3",
         }
         # Backward compat
         skip_all_algos_dict = False

rl_zoo3/plots/score_normalization.py

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@ class ReferenceScore(NamedTuple):
     ReferenceScore("AntBulletEnv-v0", 300, 3500),
     ReferenceScore("HopperBulletEnv-v0", 20, 2500),
     ReferenceScore("Walker2DBulletEnv-v0", 200, 2500),
+    ReferenceScore("LunarLanderContinuous-v3", -200, 250),
+    ReferenceScore("BipedalWalker-v3", -100, 300),
 ]


 # Alternative scaling
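
The two new `ReferenceScore` entries pair each environment with a low and a high reference return. A minimal sketch of how such a pair could be used for min-max normalization follows. The field names `env_id`, `min_score`, `max_score` and the helper `normalize_score` are assumptions made for this example; the diff shows only the positional values, not the field names or the normalization code.

```python
from typing import NamedTuple


class ReferenceScore(NamedTuple):
    # Field names are assumed for illustration; the diff only shows positional values.
    env_id: str
    min_score: float
    max_score: float


# Values copied from the two entries added in this commit.
REFERENCES = {
    ref.env_id: ref
    for ref in [
        ReferenceScore("LunarLanderContinuous-v3", -200, 250),
        ReferenceScore("BipedalWalker-v3", -100, 300),
    ]
}


def normalize_score(env_id: str, score: float) -> float:
    """Map a raw return to a scale where min_score -> 0 and max_score -> 1."""
    ref = REFERENCES[env_id]
    return (score - ref.min_score) / (ref.max_score - ref.min_score)


lunar_best = normalize_score("LunarLanderContinuous-v3", 250)
walker_mid = normalize_score("BipedalWalker-v3", 100)
```

Under this scheme, returns from environments with very different reward ranges land on a common scale, which is what makes cross-environment aggregate plots comparable.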

rl_zoo3/version.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.6.1a1
+2.7.0a0

rl_zoo3/wrappers.py

Lines changed: 7 additions & 0 deletions
@@ -4,10 +4,17 @@
 import numpy as np
 from gymnasium import spaces
 from gymnasium.core import ObsType
+from gymnasium.wrappers import ResizeObservation
 from sb3_contrib.common.wrappers import TimeFeatureWrapper  # noqa: F401 (backward compatibility)
 from stable_baselines3.common.type_aliases import GymResetReturn, GymStepReturn


+# Convert to tuple, so it is compatible with YAML
+class YAMLCompatResizeObservation(ResizeObservation):
+    def __init__(self, env: gym.Env, shape: list[int]):
+        super().__init__(env, (shape[0], shape[1]))
+
+
 class TruncatedOnSuccessWrapper(gym.Wrapper):
     """
     Reset on success and offsets the reward.
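
The `YAMLCompatResizeObservation` wrapper exists because a YAML entry such as `shape: [64, 64]` is loaded as a Python list, while the wrapper's comment indicates the parent class wants a tuple. The sketch below reproduces that pattern without Gymnasium. `StrictResize` and `ListCompatResize` are hypothetical stand-ins written for this example, and the tuple check inside `StrictResize` is an assumption about the kind of validation the parent performs, not a copy of Gymnasium's code.

```python
from __future__ import annotations

from typing import Sequence


class StrictResize:
    """Hypothetical stand-in for a wrapper that insists on a 2-tuple shape."""

    def __init__(self, shape: tuple[int, int]):
        if not isinstance(shape, tuple) or len(shape) != 2:
            raise TypeError(f"shape must be a 2-tuple, got {shape!r}")
        self.shape = shape


class ListCompatResize(StrictResize):
    """Accepts the list a YAML loader produces and converts it to a tuple."""

    def __init__(self, shape: Sequence[int]):
        # Same conversion as in the diff: build the tuple from the first two items.
        super().__init__((shape[0], shape[1]))


# What `shape: [64, 64]` looks like after YAML parsing: a list, not a tuple.
yaml_shape = [64, 64]
resized = ListCompatResize(yaml_shape)

# Passing the list directly to the strict class fails under the assumed check.
try:
    StrictResize(yaml_shape)
    strict_rejected_list = False
except TypeError:
    strict_rejected_list = True
```

Putting the conversion in a thin subclass keeps the hyperparameter files declarative: the YAML only names the wrapper and its arguments, and no Python-side preprocessing of the config is needed.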

setup.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 See https://github.com/DLR-RM/rl-baselines3-zoo
 """
 install_requires = [
-    "sb3_contrib>=2.6.1a1,<3.0",
+    "sb3_contrib>=2.7.0a0,<3.0",
     "gymnasium>=0.29.1,<1.2.0",
     "huggingface_sb3>=3.0,<4.0",
     "tqdm",
