v2.6.0: New `LogEveryNTimesteps` callback and `has_attr` method, refactored hyperparameter optimization
SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx
To upgrade:
pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade
New Features:
- Added `has_attr` method for `VecEnv` to check if an attribute exists
- Added `LogEveryNTimesteps` callback to dump logs every N timesteps (note: you need to pass `log_interval=None` to avoid any interference)
- Added Gymnasium v1.1 support
Bug fixes:
- `SubprocVecEnv` will now exit gracefully (without big traceback) when using `KeyboardInterrupt`
SB3-Contrib
- Renamed `_dump_logs()` to `dump_logs()`
- Fixed issues with `SubprocVecEnv` and `MaskablePPO` by using `vec_env.has_attr()` (pickling issues, mask function not present)
RL Zoo
- Refactored hyperparameter optimization. The Optuna Journal storage backend is now supported (recommended default) and you can easily load tuned hyperparameters via the new `--trial-id` argument of `train.py`.
- Save the exact command line used to launch a training
- Added support for special vectorized envs (e.g. Brax, IsaacSim) by allowing the `VecEnv` class used to instantiate the env in the `ExperimentManager` to be overridden
- Allow disabling auto-logging by passing `--log-interval -2` (useful when logging things manually)
- Added Gymnasium v1.1 support
- Fixed use of old HF api in `get_hf_trained_models()`
SBX (SB3 + Jax)
- Updated PPO to support `net_arch`, and additional fixes
- Fixed entropy coeff wrongly logged for SAC and derivatives.
- Fixed PPO `predict()` for envs that were not normalized (action spaces with limits != [-1, 1])
- PPO now logs the standard deviation
Deprecations:
- `algo._dump_logs()` is deprecated in favor of `algo.dump_logs()` and will be removed in SB3 v2.7.0
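Code that has to run both before and after this rename can look up the public method first and fall back to the private one. The helper below is a hypothetical sketch (`dump_logs_compat` is not part of SB3), and the two stand-in classes exist only to make the example runnable without a trained model.

```python
def dump_logs_compat(algo) -> None:
    """Call algo.dump_logs() when it exists, else fall back to algo._dump_logs()."""
    public = getattr(algo, "dump_logs", None)
    if callable(public):
        public()
    else:
        # Older releases only expose the private name
        algo._dump_logs()


class NewStyleAlgo:
    """Stand-in for an algorithm that already has the public method."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def dump_logs(self) -> None:
        self.calls.append("dump_logs")

    def _dump_logs(self) -> None:
        self.calls.append("_dump_logs")


class OldStyleAlgo:
    """Stand-in for an algorithm that only has the private method."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def _dump_logs(self) -> None:
        self.calls.append("_dump_logs")


new_algo, old_algo = NewStyleAlgo(), OldStyleAlgo()
dump_logs_compat(new_algo)
dump_logs_compat(old_algo)
print(new_algo.calls, old_algo.calls)
```

Once only SB3 v2.6.0 or later needs to be supported, calling `algo.dump_logs()` directly is simpler.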
Others:
- Updated black from v24 to v25
- Improved error messages when checking Box space equality (loading `VecNormalize`)
- Updated test to reflect how `set_wrapper_attr` should be used now
Documentation:
- Clarify the use of Gym wrappers with `make_vec_env` in the section on Vectorized Environments (@pstahlhofen)
- Updated callback doc for `EveryNTimesteps`
- Added doc on how to set env attributes via `VecEnv` calls
- Added ONNX export example for `MultiInputPolicy` (@darkopetrovic)
New Contributors
- @pstahlhofen made their first contribution in #2079
- @darkopetrovic made their first contribution in #2098
Full Changelog: v2.5.0...v2.6.0