
Commit bddd1ab

Release 2.5.1 (#304)
* Fix typo in kl-div
* Update tb legacy instructions
* Bump version
* Capitalize Leibler
* Typo in GAIL model
1 parent f238a4c commit bddd1ab


13 files changed (+26 lines added, -16 lines removed)


docs/guide/tensorboard.rst

Lines changed: 8 additions & 0 deletions
@@ -89,6 +89,14 @@ For that, you need to define several environment variables:
   export OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard'
   export OPENAI_LOGDIR=path/to/tensorboard/data
 
+and to configure the logger using:
+
+.. code-block:: python
+
+  from stable_baselines.logger import configure
+
+  configure()
+
 
 Then start tensorboard with:
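For readers following the legacy OpenAI Baselines logging path above, a minimal Python-side sketch of the same setup; the log directory is a placeholder, and it assumes the logger reads these environment variables when configure() is called, as the snippet added by this commit implies:

    import os

    # Same variables as the shell exports in the docs; the path is a placeholder.
    os.environ['OPENAI_LOG_FORMAT'] = 'stdout,log,csv,tensorboard'
    os.environ['OPENAI_LOGDIR'] = '/tmp/tb_legacy_logs'

    from stable_baselines.logger import configure

    # Picks up the variables above and enables the tensorboard output format.
    configure()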

docs/misc/changelog.rst

Lines changed: 5 additions & 3 deletions
@@ -5,9 +5,11 @@ Changelog
 
 For download links, please look at `Github release page <https://github.com/hill-a/stable-baselines/releases>`_.
 
-Pre-Release 2.5.1a0 (WIP)
+Release 2.5.1 (2019-05-04)
 --------------------------
 
+**Bug fixes + improvements in the VecEnv**
+
 - doc update (fix example of result plotter + improve doc)
 - fixed logger issues when stdout lacks ``read`` function
 - fixed a bug in ``common.dataset.Dataset`` where shuffling was not disabled properly (it affects only PPO1 with recurrent policies)

@@ -20,8 +22,8 @@ Pre-Release 2.5.1a0 (WIP)
   ``set_attr`` now returns ``None`` rather than a list of ``None``. (@kantneel)
 - ``GAIL``: ``gail.dataset.ExpertDataset` supports loading from memory rather than file, and
   ``gail.dataset.record_expert`` supports returning in-memory rather than saving to file.
-- added support in ``VecEnvWrapper`` for accessing attributes of arbitrarily deeply nested
-  instances of ``VecEnvWrapper`` and ``VecEnv``. This is allowed as long as the attribute belongs
+- added support in ``VecEnvWrapper`` for accessing attributes of arbitrarily deeply nested
+  instances of ``VecEnvWrapper`` and ``VecEnv``. This is allowed as long as the attribute belongs
   to exactly one of the nested instances i.e. it must be unambiguous. (@kantneel)
 - fixed bug where result plotter would crash on very short runs (@Pastafarianist)
 - added option to not trim output of result plotter by number of timesteps (@Pastafarianist)
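The nested-wrapper entry in the changelog above can be pictured with a minimal sketch, assuming the lookup is exposed as plain attribute access on the outermost wrapper as the entry describes; clip_obs (an attribute of VecNormalize) is chosen purely as an illustrative target:

    import gym
    from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize, VecFrameStack

    venv = DummyVecEnv([lambda: gym.make('CartPole-v1')])
    venv = VecNormalize(venv)            # defines e.g. clip_obs
    venv = VecFrameStack(venv, n_stack=4)

    # The lookup walks the wrapper chain and resolves clip_obs on the nested
    # VecNormalize instance; it must exist on exactly one nested instance.
    print(venv.clip_obs)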

setup.py

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@
     license="MIT",
     long_description=long_description,
     long_description_content_type='text/markdown',
-    version="2.5.1a0",
+    version="2.5.1",
 )
 
 # python setup.py sdist

stable_baselines/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -9,4 +9,4 @@
 from stable_baselines.trpo_mpi import TRPO
 from stable_baselines.sac import SAC
 
-__version__ = "2.5.1a0"
+__version__ = "2.5.1"
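A quick way to confirm which release is installed, using the attribute bumped here:

    import stable_baselines

    # Expected to print "2.5.1" once this release is installed.
    print(stable_baselines.__version__)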

stable_baselines/acktr/acktr_cont.py

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ def learn(env, policy, value_fn, gamma, lam, timesteps_per_batch, num_timesteps,
     :param num_timesteps: (int) the total number of timesteps to run
     :param animate: (bool) if render env
     :param callback: (function) called every step, used for logging and saving
-    :param desired_kl: (float) the Kullback leibler weight for the loss
+    :param desired_kl: (float) the Kullback-Leibler weight for the loss
     """
     obfilter = ZFilter(env.observation_space.shape)

stable_baselines/acktr/acktr_disc.py

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ class ACKTR(ActorCriticRLModel):
     :param vf_fisher_coef: (float) The weight for the fisher loss on the value function
     :param learning_rate: (float) The initial learning rate for the RMS prop optimizer
     :param max_grad_norm: (float) The clipping value for the maximum gradient
-    :param kfac_clip: (float) gradient clipping for Kullback leiber
+    :param kfac_clip: (float) gradient clipping for Kullback-Leibler
     :param lr_schedule: (str) The type of scheduler for the learning rate update ('linear', 'constant',
         'double_linear_con', 'middle_drop' or 'double_middle_drop')
     :param verbose: (int) the verbosity level: 0 none, 1 training information, 2 tensorflow debug
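As context for the kfac_clip parameter documented above, a minimal construction sketch; the value 0.001 is believed to be the library default and is passed explicitly here only for illustration:

    import gym
    from stable_baselines import ACKTR
    from stable_baselines.common.policies import MlpPolicy
    from stable_baselines.common.vec_env import DummyVecEnv

    env = DummyVecEnv([lambda: gym.make('CartPole-v1')])

    # kfac_clip bounds the KL-based clipping used by the K-FAC optimizer.
    model = ACKTR(MlpPolicy, env, kfac_clip=0.001, verbose=1)
    model.learn(total_timesteps=25000)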

stable_baselines/acktr/kfac.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ def __init__(self, learning_rate=0.01, momentum=0.9, clip_kl=0.01, kfac_update=2
 
     :param learning_rate: (float) The learning rate
     :param momentum: (float) The momentum value for the TensorFlow momentum optimizer
-    :param clip_kl: (float) gradient clipping for Kullback leiber
+    :param clip_kl: (float) gradient clipping for Kullback-Leibler
     :param kfac_update: (int) update kfac after kfac_update steps
     :param stats_accum_iter: (int) how may steps to accumulate stats
     :param full_stats_init: (bool) whether or not to fully initalize stats

stable_baselines/acktr/utils.py

Lines changed: 2 additions & 2 deletions
@@ -33,12 +33,12 @@ def dense(input_tensor, size, name, weight_init=None, bias_init=0, weight_loss_d
 
 def kl_div(action_dist1, action_dist2, action_size):
     """
-    Kullback leiber divergence
+    Kullback-Leibler divergence
 
     :param action_dist1: ([TensorFlow Tensor]) action distribution 1
     :param action_dist2: ([TensorFlow Tensor]) action distribution 2
     :param action_size: (int) the shape of an action
-    :return: (float) Kullback leiber divergence
+    :return: (float) Kullback-Leibler divergence
     """
     mean1, std1 = action_dist1[:, :action_size], action_dist1[:, action_size:]
     mean2, std2 = action_dist2[:, :action_size], action_dist2[:, action_size:]
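For reference, the quantity named in this docstring is the closed-form KL divergence between two diagonal Gaussians packed as [mean, std]. A NumPy sketch of that formula (not the TensorFlow implementation in this file):

    import numpy as np

    def diag_gaussian_kl(mean1, std1, mean2, std2):
        """KL(N1 || N2) for diagonal Gaussians, summed over the action dimensions."""
        return np.sum(
            np.log(std2 / std1)
            + (std1 ** 2 + (mean1 - mean2) ** 2) / (2.0 * std2 ** 2)
            - 0.5,
            axis=-1,
        )

    # Identical distributions give zero divergence.
    mean, std = np.zeros((1, 3)), np.ones((1, 3))
    assert np.allclose(diag_gaussian_kl(mean, std, mean, std), 0.0)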

stable_baselines/common/distributions.py

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ def neglogp(self, x):
 
     def kl(self, other):
         """
-        Calculates the Kullback-Leiber divergence from the given probabilty distribution
+        Calculates the Kullback-Leibler divergence from the given probabilty distribution
 
         :param other: ([float]) the distibution to compare with
         :return: (float) the KL divergence of the two distributions

stable_baselines/gail/model.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ class GAIL(TRPO):
     :param expert_dataset: (ExpertDataset) the dataset manager
     :param gamma: (float) the discount value
     :param timesteps_per_batch: (int) the number of timesteps to run per batch (horizon)
-    :param max_kl: (float) the kullback leiber loss threashold
+    :param max_kl: (float) the Kullback-Leibler loss threshold
     :param cg_iters: (int) the number of iterations for the conjugate gradient calculation
     :param lam: (float) GAE factor
     :param entcoeff: (float) the weight for the entropy loss
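To put max_kl in context, a hedged usage sketch modeled on the project documentation of this era; the ExpertDataset import path, the expert_pendulum.npz file name, and the keyword names are assumptions taken from the docstring and changelog rather than verified against this exact revision:

    import gym
    from stable_baselines import GAIL
    from stable_baselines.gail import ExpertDataset

    # 'expert_pendulum.npz' is a placeholder for trajectories recorded with the
    # gail.dataset.record_expert helpers mentioned in the changelog.
    dataset = ExpertDataset(expert_path='expert_pendulum.npz', verbose=1)

    env = gym.make('Pendulum-v0')

    # max_kl is the trust-region threshold inherited from TRPO.
    model = GAIL('MlpPolicy', env, expert_dataset=dataset, max_kl=0.01, verbose=1)
    model.learn(total_timesteps=100000)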
