DLR-RM
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 1 addition & 0 deletions
diff --git a/docs/conda_env.yml b/docs/conda_env.yml
Lines changed: 1 addition & 0 deletions
diff --git a/docs/conf.py b/docs/conf.py
Lines changed: 22 additions & 3 deletions
diff --git a/docs/guide/config.md b/docs/guide/config.md
Lines changed: 122 additions & 0 deletions
diff --git a/docs/guide/config.rst b/docs/guide/config.rst
Lines changed: 0 additions & 129 deletions
diff --git a/docs/guide/custom_env.md b/docs/guide/custom_env.md
Lines changed: 9 additions & 0 deletions
diff --git a/docs/guide/custom_env.rst b/docs/guide/custom_env.rst
Lines changed: 0 additions & 11 deletions
@@ -12,6 +12,7 @@
 ### Bug fixes
 
 ### Documentation
+- Switched to Markdown documentation (using MyST parser)
 
 ### Other
 
 
@@ -24,3 +24,4 @@ dependencies:
     - tqdm
     - pyyaml>=5.1
     - pytablewriter==1.2.0
+    - myst-parser>=4,<6
@@ -70,6 +70,7 @@
     "sphinx.ext.viewcode",
     # 'sphinx.ext.intersphinx',
     # 'sphinx.ext.doctest'
+    "myst_parser",
 ]
 
 autodoc_typehints = "description"
@@ -86,8 +87,7 @@
 # The suffix(es) of source filenames.
 # You can specify multiple suffix as a list of string:
 #
-# source_suffix = ['.rst', '.md']
-source_suffix = ".rst"
+source_suffix = [".rst", ".md"]
 
 # The master toctree document.
 master_doc = "index"
@@ -102,7 +102,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path .
-exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "README.md"]
 
 # The name of the Pygments (syntax highlighting) style to use.
 pygments_style = "sphinx"
@@ -200,6 +200,25 @@ def setup(app):
 
 # -- Extension configuration -------------------------------------------------
 
+myst_heading_anchors = 4
+# See: https://myst-parser.readthedocs.io/en/latest/syntax/optional.html
+myst_enable_extensions = [
+    # "amsmath",
+    "attrs_inline",
+    "colon_fence",
+    "deflist",
+    "dollarmath",
+    "fieldlist",
+    # "html_admonition",
+    "html_image",
+    # "linkify",
+    # "replacements",
+    # "smartquotes",
+    # "strikethrough",
+    "substitution",
+    # "tasklist",
+]
+
 # Example configuration for intersphinx: refer to the Python standard library.
 # intersphinx_mapping = {
 #     'python': ('https://docs.python.org/3/', None),
 
@@ -0,0 +1,122 @@
+(config)=
+
+# Configuration
+
+## Hyperparameter yaml syntax
+
+The syntax used in `hyperparams/algo_name.yml` for setting
+hyperparameters (and likewise the syntax to [overwrite
+hyperparameters](https://github.com/DLR-RM/rl-baselines3-zoo#overwrite-hyperparameters)
+on the CLI) may be specialized when the argument is a function. See
+the examples in the `hyperparams/` directory. For example:
+
+Specify a linear schedule for the learning rate:
+
+```yaml
+learning_rate: lin_0.012486195510232303
+```
+
+Specify a different activation function for the network:
+
+```yaml
+policy_kwargs: "dict(activation_fn=nn.ReLU)"
+```
+
+For a custom policy:
+
+```yaml
+policy: my_package.MyCustomPolicy  # for instance stable_baselines3.ppo.MlpPolicy
+```
+
+## Env Normalization
+
+In the hyperparameter file, `normalize: True` means that the training
+environment will be wrapped in a
+[VecNormalize](https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/vec_env/vec_normalize.py#L13)
+wrapper.
+
+[Normalization
+uses](https://github.com/DLR-RM/rl-baselines3-zoo/issues/64) the
+default parameters of `VecNormalize`, with the exception of `gamma`
+which is set to match that of the agent. This can be
+[overridden](https://github.com/DLR-RM/rl-baselines3-zoo/blob/v0.10.0/hyperparams/sac.yml#L239)
+using the appropriate `hyperparams/algo_name.yml`, e.g.
+
+```yaml
+normalize: "{'norm_obs': True, 'norm_reward': False}"
+```
+
+## Env Wrappers
+
+You can specify in the hyperparameter config one or more wrappers to use
+around the environment.
+
+For one wrapper:
+
+```yaml
+env_wrapper: gym_minigrid.wrappers.FlatObsWrapper
+```
+
+For multiple wrappers, specify a list:
+
+```yaml
+env_wrapper:
+    - rl_zoo3.wrappers.TruncatedOnSuccessWrapper:
+        reward_offset: 1.0
+    - sb3_contrib.common.wrappers.TimeFeatureWrapper
+```
+
+Note that each wrapper can also take keyword arguments, as `reward_offset` shows above.
+
+By default, the environment is wrapped with a `Monitor` wrapper to
+record episode statistics. You can pass arguments to it using the
+`monitor_kwargs` parameter to log additional data. That data *must* be
+present in the info dictionary at the last step of each episode.
+
+For instance, for recording success with goal envs
+(e.g. `FetchReach-v1`):
+
+```yaml
+monitor_kwargs: dict(info_keywords=('is_success',))
+```
+
+or recording final x position with `Ant-v3`:
+
+```yaml
+monitor_kwargs: dict(info_keywords=('x_position',))
+```
+
+Note: for known `GoalEnv` environments such as `FetchReach`,
+`info_keywords=('is_success',)` is actually the default.
+
+You can also specify environment keyword arguments with:
+
+```yaml
+env_kwargs:
+  gravity: 0.0
+```
+
+## VecEnvWrapper
+
+You can specify which `VecEnvWrapper` to use in the config, the same
+way as for env wrappers (see above), using the `vec_env_wrapper` key.
+
+For instance:
+
+```yaml
+vec_env_wrapper: stable_baselines3.common.vec_env.VecMonitor
+```
+
+Note: `VecNormalize` is supported separately using the `normalize`
+keyword, and `VecFrameStack` has a dedicated keyword `frame_stack`.
+
+## Callbacks
+
+Following the same syntax as env wrappers, you can also add custom
+callbacks to use during training.
+
+```yaml
+callback:
+  - rl_zoo3.callbacks.ParallelTrainCallback:
+      gradient_steps: 256
+```
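As a rough sketch of how the entries documented in `config.md` could be interpreted once the YAML has been loaded: a `lin_`-prefixed learning rate becomes a schedule, and an `env_wrapper` or `callback` entry is normalized into `(dotted path, kwargs)` pairs. The helper names `parse_learning_rate` and `normalize_specs` are hypothetical and are not taken from `rl_zoo3`. The sketch also assumes that the schedule is called with the remaining training progress, going from 1 to 0.

```python
from typing import Any, Callable, Dict, List, Tuple, Union

Schedule = Callable[[float], float]


def parse_learning_rate(value: Union[str, float]) -> Schedule:
    """Hypothetical helper: turn 'lin_<float>' into a linear schedule."""
    if isinstance(value, str) and value.startswith("lin_"):
        initial_value = float(value[len("lin_"):])
        # Assumption: progress_remaining goes from 1 (start) to 0 (end).
        return lambda progress_remaining: progress_remaining * initial_value
    constant = float(value)
    # Plain numbers are treated as a constant schedule.
    return lambda _progress_remaining: constant


def normalize_specs(spec: Any) -> List[Tuple[str, Dict[str, Any]]]:
    """Hypothetical helper: normalize a wrapper/callback entry.

    Accepts a single dotted path, or a list whose items are either a
    dotted path or a one-key mapping {dotted_path: kwargs}.
    """
    if spec is None:
        return []
    items = spec if isinstance(spec, list) else [spec]
    normalized = []
    for item in items:
        if isinstance(item, dict):
            if len(item) != 1:
                raise ValueError("expected exactly one key per entry")
            path, kwargs = next(iter(item.items()))
            normalized.append((path, dict(kwargs or {})))
        else:
            normalized.append((item, {}))
    return normalized


# Values as a YAML loader would return them for the examples in config.md.
lr = parse_learning_rate("lin_0.012486195510232303")
specs = normalize_specs(
    [
        {"rl_zoo3.wrappers.TruncatedOnSuccessWrapper": {"reward_offset": 1.0}},
        "sb3_contrib.common.wrappers.TimeFeatureWrapper",
    ]
)
```

Each dotted path would then still need to be imported (for example with `importlib`) and instantiated with its keyword arguments; that step is left out here.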
@@ -0,0 +1,9 @@
+(custom)=
+
+# Custom Environment
+
+The easiest way to add support for a custom environment is to edit
+`rl_zoo3/import_envs.py` and register your environment there. Then, you
+need to add a section for it in the hyperparameters file
+(`hyperparams/algo.yml` or a custom yaml file that you can specify
+using the `--conf-file` argument).