Commit 55e810a

Switch to Markdown documentation (MyST parser)
1 parent d42f915 commit 55e810a

30 files changed

Lines changed: 732 additions & 781 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@
 ### Bug fixes

 ### Documentation
+- Switched to Markdown documentation (using MyST parser)

 ### Other

docs/conda_env.yml

Lines changed: 1 addition & 0 deletions
@@ -24,3 +24,4 @@ dependencies:
 - tqdm
 - pyyaml>=5.1
 - pytablewriter==1.2.0
+- myst-parser>=4,<6

docs/conf.py

Lines changed: 22 additions & 3 deletions
@@ -70,6 +70,7 @@
     "sphinx.ext.viewcode",
     # 'sphinx.ext.intersphinx',
     # 'sphinx.ext.doctest'
+    "myst_parser",
 ]

 autodoc_typehints = "description"
@@ -86,8 +87,7 @@
 # The suffix(es) of source filenames.
 # You can specify multiple suffix as a list of string:
 #
-# source_suffix = ['.rst', '.md']
-source_suffix = ".rst"
+source_suffix = [".rst", ".md"]

 # The master toctree document.
 master_doc = "index"
@@ -102,7 +102,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path .
-exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "README.md"]

 # The name of the Pygments (syntax highlighting) style to use.
 pygments_style = "sphinx"
@@ -200,6 +200,25 @@ def setup(app):

 # -- Extension configuration -------------------------------------------------

+myst_heading_anchors = 4
+# See: https://myst-parser.readthedocs.io/en/latest/syntax/optional.html
+myst_enable_extensions = [
+    # "amsmath",
+    "attrs_inline",
+    "colon_fence",
+    "deflist",
+    "dollarmath",
+    "fieldlist",
+    # "html_admonition",
+    "html_image",
+    # "linkify",
+    # "replacements",
+    # "smartquotes",
+    # "strikethrough",
+    "substitution",
+    # "tasklist",
+]
+
 # Example configuration for intersphinx: refer to the Python standard library.
 # intersphinx_mapping = {
 #     'python': ('https://docs.python.org/3/', None),

docs/guide/config.md

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
(config)=
2+
3+
# Configuration
4+
5+
## Hyperparameter yaml syntax
6+
7+
The syntax used in `hyperparameters/algo_name.yml` for setting
8+
hyperparameters (likewise the syntax to [overwrite
9+
hyperparameters](https://github.com/DLR-RM/rl-baselines3-zoo#overwrite-hyperparameters)
10+
on the cli) may be specialized if the argument is a function. See
11+
examples in the `hyperparameters/` directory. For example:
12+
13+
- Specify a linear schedule for the learning rate:
14+
15+
```yaml
16+
learning_rate: lin_0.012486195510232303
17+
```
18+
19+
Specify a different activation function for the network:
20+
21+
```yaml
22+
policy_kwargs: "dict(activation_fn=nn.ReLU)"
23+
```
24+
25+
For a custom policy:
26+
27+
```yaml
28+
policy: my_package.MyCustomPolicy # for instance stable_baselines3.ppo.MlpPolicy
29+
```
+
+## Env Normalization
+
+In the hyperparameter file, `normalize: True` means that the training
+environment will be wrapped in a
+[VecNormalize](https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/vec_env/vec_normalize.py#L13)
+wrapper.
+
+[Normalization
+uses](https://github.com/DLR-RM/rl-baselines3-zoo/issues/64) the
+default parameters of `VecNormalize`, with the exception of `gamma`,
+which is set to match that of the agent. This can be
+[overridden](https://github.com/DLR-RM/rl-baselines3-zoo/blob/v0.10.0/hyperparams/sac.yml#L239)
+using the appropriate `hyperparams/algo_name.yml`, e.g.
+
+```yaml
+normalize: "{'norm_obs': True, 'norm_reward': False}"
+```
+
+## Env Wrappers
+
+You can specify in the hyperparameter config one or more wrappers to use
+around the environment.
+
+For one wrapper:
+
+```yaml
+env_wrapper: gym_minigrid.wrappers.FlatObsWrapper
+```
+
+For multiple wrappers, specify a list:
+
+```yaml
+env_wrapper:
+  - rl_zoo3.wrappers.TruncatedOnSuccessWrapper:
+      reward_offset: 1.0
+  - sb3_contrib.common.wrappers.TimeFeatureWrapper
+```
+
+Note that wrapper parameters (here `reward_offset`) can be specified in the same entry.
+
+By default, the environment is wrapped with a `Monitor` wrapper to
+record episode statistics. You can specify arguments to it using
+the `monitor_kwargs` parameter to log additional data. That data *must* be
+present in the info dictionary at the last step of each episode.
+
+For instance, for recording success with goal envs
+(e.g. `FetchReach-v1`):
+
+```yaml
+monitor_kwargs: dict(info_keywords=('is_success',))
+```
+
+or recording the final x position with `Ant-v3`:
+
+```yaml
+monitor_kwargs: dict(info_keywords=('x_position',))
+```
+
+Note: for known `GoalEnv` environments such as `FetchReach`,
+`info_keywords=('is_success',)` is actually the default.
+
+You can also specify environment keyword arguments with:
+
+```yaml
+env_kwargs:
+  gravity: 0.0
+```
+
+## VecEnvWrapper
+
+You can specify which `VecEnvWrapper` to use in the config, the same
+way as for env wrappers (see above), using the `vec_env_wrapper` key.
+
+For instance:
+
+```yaml
+vec_env_wrapper: stable_baselines3.common.vec_env.VecMonitor
+```
+
+Note: `VecNormalize` is supported separately using the `normalize`
+keyword, and `VecFrameStack` has a dedicated keyword `frame_stack`.
+
+## Callbacks
+
+Following the same syntax as env wrappers, you can also add custom
+callbacks to use during training.
+
+```yaml
+callback:
+  - rl_zoo3.callbacks.ParallelTrainCallback:
+      gradient_steps: 256
+```
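
To make two of the value forms documented in `docs/guide/config.md` concrete, here is a minimal sketch of how a loader might interpret them: a `lin_`-prefixed learning rate, and a wrapper or callback entry that is either a dotted path or a one-key mapping from path to keyword arguments. This is not the zoo's actual implementation. The names `parse_learning_rate` and `normalize_specs` are hypothetical, and the sketch assumes that a linear schedule scales the initial value by the remaining training progress (1 at the start, 0 at the end). The input to `normalize_specs` is assumed to be the already-parsed YAML value.

```python
from typing import Any, Callable, Dict, List, Tuple, Union


def parse_learning_rate(value: Union[str, float]) -> Callable[[float], float]:
    """Turn a config value into a function of remaining progress (1 -> 0)."""
    if isinstance(value, str) and value.startswith("lin_"):
        initial_value = float(value[len("lin_"):])
        # Assumption: linear decay, proportional to the remaining progress.
        return lambda progress_remaining: progress_remaining * initial_value
    constant = float(value)
    return lambda _progress_remaining: constant


def normalize_specs(
    entry: Union[str, List[Any], None],
) -> List[Tuple[str, Dict[str, Any]]]:
    """Convert a parsed `env_wrapper`/`callback` value into (path, kwargs) pairs."""
    if entry is None:
        return []
    items = entry if isinstance(entry, list) else [entry]
    specs: List[Tuple[str, Dict[str, Any]]] = []
    for item in items:
        if isinstance(item, dict):
            # One-key mapping, e.g. {"pkg.Class": {"param": value}}
            if len(item) != 1:
                raise ValueError(f"expected a single-key mapping, got {item!r}")
            path, kwargs = next(iter(item.items()))
            specs.append((path, dict(kwargs or {})))
        else:
            # Plain dotted path with no keyword arguments
            specs.append((str(item), {}))
    return specs


# Usage with values shaped like the YAML examples in config.md
schedule = parse_learning_rate("lin_0.5")
wrappers = normalize_specs(
    [
        {"rl_zoo3.wrappers.TruncatedOnSuccessWrapper": {"reward_offset": 1.0}},
        "sb3_contrib.common.wrappers.TimeFeatureWrapper",
    ]
)
```

Returning plain `(path, kwargs)` pairs keeps the sketch free of imports from the wrapped packages. Resolving each dotted path to a class (for example with `importlib`) would be a separate step.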

docs/guide/config.rst

Lines changed: 0 additions & 129 deletions
This file was deleted.

docs/guide/custom_env.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
(custom)=
2+
3+
# Custom Environment
4+
5+
The easiest way to add support for a custom environment is to edit
6+
`rl_zoo3/import_envs.py` and register your environment here. Then, you
7+
need to add a section for it in the hyperparameters file
8+
(`hyperparams/algo.yml` or a custom yaml file that you can specify
9+
using `--conf-file` argument).

docs/guide/custom_env.rst

Lines changed: 0 additions & 11 deletions
This file was deleted.
