Commit 9e2b3cb

Improve the tutorial rendering (#1353)
1 parent 8d5fd4f commit 9e2b3cb

12 files changed (+49 -40 lines)

Diff for: docs/conf.py (+25 -2)

@@ -12,11 +12,11 @@

 # -- Project information -----------------------------------------------------
 import os
-import re
 import sys
 import time

 import sphinx_gallery.gen_rst
+import sphinx_gallery.sorting
 from furo.gen_tutorials import generate_tutorials


@@ -123,10 +123,30 @@ def setup(app):

 .. rst-class:: sphx-glr-example-title

+.. note::
+    This example is compatible with Gymnasium version |release|.
+
 .. _sphx_glr_{1}:

 """

+tutorial_sorting = {
+    "tutorials/gymnasium_basics": [
+        "environment_creation",
+        "implementing_custom_wrappers",
+        "handling_time_limits",
+        "load_quadruped_model",
+        "*",
+    ],
+    "tutorials/training_agents": [
+        "blackjack_q_learning",
+        "frozenlake_q_learning",
+        "mujoco_reinforce",
+        "vector_a2c",
+        "*",
+    ],
+}
+
 sphinx_gallery_conf = {
     "ignore_pattern": r"__init__\.py",
     "examples_dirs": "./tutorials",
@@ -135,10 +155,13 @@ def setup(app):
     "show_signature": False,
     "show_memory": False,
     "min_reported_time": float("inf"),
-    "filename_pattern": f"{re.escape(os.sep)}run_",
+    # "filename_pattern": f"{re.escape(os.sep)}run_",
     "default_thumb_file": os.path.join(
         os.path.dirname(__file__), "_static/img/gymnasium-github.png"
     ),
+    # order of the tutorial presentation
+    "within_subsection_order": sphinx_gallery.sorting.FileNameSortKey,
+    "subsection_order": lambda folder: tutorial_sorting[folder],
 }

 # All tutorials in the tutorials directory will be generated automatically
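For readers unfamiliar with this ordering pattern: an explicit list ending in "*" is typically consumed as a sort key, where named items sort by their list position and anything unlisted falls into the wildcard slot. A minimal sketch of that idea (explicit_order_key is a hypothetical helper for illustration, not sphinx-gallery API):

    def explicit_order_key(order):
        # Hypothetical helper: items named in `order` sort by position;
        # everything else sorts at the "*" wildcard's position.
        wildcard = order.index("*") if "*" in order else len(order)

        def key(name):
            return order.index(name) if name in order else wildcard

        return key

    order = ["blackjack_q_learning", "frozenlake_q_learning",
             "mujoco_reinforce", "vector_a2c", "*"]
    names = ["vector_a2c", "some_new_tutorial", "blackjack_q_learning"]
    print(sorted(names, key=explicit_order_key(order)))
    # -> ['blackjack_q_learning', 'vector_a2c', 'some_new_tutorial']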

Diff for: docs/tutorials/README.rst (-7)

@@ -1,9 +1,2 @@
 Tutorials
 =========
-
-We provide two sets of tutorials: basics and training.
-
-* The aim of the basics tutorials is to showcase the fundamental API of Gymnasium to help users implement it
-* The most common application of Gymnasium is for training RL agents, the training tutorials aim to show a range of example implementations for different environments
-
-Additionally, we provide the third party tutorials as a link for external projects that utilise Gymnasium that could help users.

Diff for: docs/tutorials/gymnasium_basics/README.rst (+3 -7)

@@ -1,10 +1,6 @@
 Gymnasium Basics
-----------------
+================

-.. toctree::
-   :hidden:
+.. _gallery_section_name:

-   environment_creation
-   implementing_custom_wrappers
-   handling_time_limits
-   load_quadruped_model
+The aim of these tutorials is to showcase the fundamental API of Gymnasium to help users implement it.

Diff for: docs/tutorials/gymnasium_basics/environment_creation.py (+1 -4)

@@ -3,10 +3,7 @@
 Make your own custom environment
 ================================

-This documentation overviews creating new environments and relevant
-useful wrappers, utilities and tests included in Gymnasium designed for
-the creation of new environments.
+This tutorial shows how to create a new environment and links to the relevant wrappers, utilities and tests included in Gymnasium.

 Setup
 ------
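
As a rough illustration of what the tutorial builds toward, a minimal Gymnasium environment only needs spaces plus reset and step. A sketch, assuming placeholder spaces and values (ExampleEnv is illustrative, not the tutorial's environment):

    import gymnasium as gym
    from gymnasium import spaces


    class ExampleEnv(gym.Env):
        """Minimal skeleton of a custom environment (illustrative only)."""

        def __init__(self):
            self.observation_space = spaces.Discrete(4)
            self.action_space = spaces.Discrete(2)

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            return 0, {}  # initial observation, info dict

        def step(self, action):
            # returns: observation, reward, terminated, truncated, info
            return 0, 0.0, True, False, {}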

Diff for: docs/tutorials/gymnasium_basics/handling_time_limits.py (+4 -1)

@@ -2,7 +2,10 @@
 Handling Time Limits
 ====================

-In using Gymnasium environments with reinforcement learning code, a common problem observed is how time limits are incorrectly handled. The ``done`` signal received (in previous versions of OpenAI Gym < 0.26) from ``env.step`` indicated whether an episode has ended. However, this signal did not distinguish whether the episode ended due to ``termination`` or ``truncation``.
+This tutorial explains how time limits should be correctly handled with ``termination`` and ``truncation`` signals.
+
+The ``done`` signal received (in previous versions of OpenAI Gym < 0.26) from ``env.step`` indicated whether an episode had ended.
+However, this signal did not distinguish whether the episode ended due to ``termination`` or ``truncation``.

 Termination
 -----------
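
To make the distinction concrete, here is a sketch of the post-0.26 step loop (CartPole-v1 is just a stand-in environment):

    import gymnasium as gym

    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=42)
    for _ in range(500):
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        # When bootstrapping a value target, treat truncation differently:
        # the episode was cut short, not ended by the environment's dynamics.
        if terminated or truncated:
            obs, info = env.reset()
    env.close()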

Diff for: docs/tutorials/gymnasium_basics/implementing_custom_wrappers.py (+1)

@@ -3,6 +3,7 @@
 ============================

 In this tutorial we will describe how to implement your own custom wrappers.
+
 Wrappers are a great way to add functionality to your environments in a modular way.
 This will save you a lot of boilerplate code.
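
For a flavour of what the tutorial covers, a minimal observation wrapper looks like this (ClipObservation and its bounds are illustrative, not taken from the tutorial):

    import gymnasium as gym
    import numpy as np


    class ClipObservation(gym.ObservationWrapper):
        """Clip every observation to [low, high] (illustrative example)."""

        def __init__(self, env, low=-1.0, high=1.0):
            super().__init__(env)
            self.low, self.high = low, high

        def observation(self, observation):
            return np.clip(observation, self.low, self.high)


    env = ClipObservation(gym.make("CartPole-v1"))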

Diff for: docs/tutorials/gymnasium_basics/load_quadruped_model.py (+1 -2)

@@ -2,8 +2,7 @@
 Load custom quadruped robot environments
 ========================================

-In this tutorial we will see how to use the `MuJoCo/Ant-v5` framework to create a quadruped walking environment,
-using a model file (ending in `.xml`) without having to create a new class.
+In this tutorial we create a MuJoCo quadruped walking environment using a model file (ending in ``.xml``) without having to create a new class.

 Steps:
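
This relies on the v5 MuJoCo environments accepting a custom model file. A minimal sketch, assuming the path below points at your own MJCF model (the file name is a placeholder):

    import gymnasium as gym

    # Reuse Ant-v5's environment logic with a custom MuJoCo model file.
    env = gym.make(
        "Ant-v5",
        xml_file="./my_quadruped.xml",  # placeholder: path to your robot's MJCF
    )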

Diff for: docs/tutorials/training_agents/README.rst (+3 -7)

@@ -1,10 +1,6 @@
 Training Agents
----------------
+===============

-.. toctree::
-   :hidden:
+.. _gallery_section_name:

-   blackjack_q_learning
-   frozenlake_q_learning
-   mujoco_reinforce
-   vector_a2c
+The most common application of Gymnasium is for training RL agents. Therefore, these tutorials aim to show a range of example implementations for different environments.

Diff for: docs/tutorials/training_agents/blackjack_q_learning.py (+3 -2)

@@ -1,7 +1,8 @@
 """
-Solving Blackjack with Q-Learning
-=================================
+Solving Blackjack with Tabular Q-Learning
+=========================================

+This tutorial trains an agent for Blackjack using tabular Q-learning.
 """

 # %%
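
The heart of tabular Q-learning is a one-line value update. A minimal sketch (the hyperparameters are illustrative, not the tutorial's):

    from collections import defaultdict

    import numpy as np

    n_actions = 2  # Blackjack-v1: stick or hit
    q_values = defaultdict(lambda: np.zeros(n_actions))
    alpha, gamma = 0.01, 0.95  # learning rate and discount (illustrative)

    def update(state, action, reward, terminated, next_state):
        # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        future = 0.0 if terminated else np.max(q_values[next_state])
        td_error = reward + gamma * future - q_values[state][action]
        q_values[state][action] += alpha * td_error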

Diff for: docs/tutorials/training_agents/frozenlake_q_learning.py (+3 -2)

@@ -1,7 +1,8 @@
 """
-Frozenlake benchmark
-====================
+Solving Frozenlake with Tabular Q-Learning
+==========================================

+This tutorial trains an agent for FrozenLake using tabular Q-learning.
 """

 # %%
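
FrozenLake's small discrete spaces make a dense Q-table natural. A sketch of the table plus epsilon-greedy action selection (epsilon and the seed are illustrative choices):

    import gymnasium as gym
    import numpy as np

    env = gym.make("FrozenLake-v1")
    q_table = np.zeros((env.observation_space.n, env.action_space.n))
    epsilon = 0.1  # illustrative exploration rate
    rng = np.random.default_rng(0)

    def epsilon_greedy(state):
        # Explore with probability epsilon, otherwise exploit the table.
        if rng.random() < epsilon:
            return env.action_space.sample()
        return int(np.argmax(q_table[state]))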

Diff for: docs/tutorials/training_agents/mujoco_reinforce.py (+1 -3)

@@ -7,9 +7,7 @@
    :width: 400
    :alt: agent-environment-diagram

-This tutorial serves 2 purposes:
- 1. To understand how to implement REINFORCE [1] from scratch to solve Mujoco's InvertedPendulum-v4
- 2. Implementation a deep reinforcement learning algorithm with Gymnasium's v0.26+ `step()` function
+This tutorial implements REINFORCE with neural networks for a MuJoCo environment.

 We will be using **REINFORCE**, one of the earliest policy gradient methods. Unlike going under the burden of learning a value function first and then deriving a policy out of it,
 REINFORCE optimizes the policy directly. In other words, it is trained to maximize the probability of Monte-Carlo returns. More on that later.
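
The essence of REINFORCE is a return-weighted log-probability objective. A hedged PyTorch sketch of the loss, not the tutorial's exact implementation:

    import torch

    def reinforce_loss(log_probs, rewards, gamma=0.99):
        """Return-weighted negative log-likelihood (REINFORCE)."""
        returns, g = [], 0.0
        for r in reversed(rewards):  # discounted Monte-Carlo returns
            g = r + gamma * g
            returns.insert(0, g)
        returns = torch.tensor(returns)
        # Negative sign because optimizers minimize.
        return -(torch.stack(log_probs) * returns).sum()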

Diff for: docs/tutorials/training_agents/vector_a2c.py (+4 -3)

@@ -1,14 +1,15 @@
 """
-Training A2C with Vector Envs and Domain Randomization
-======================================================
+Speeding up A2C Training with Vector Envs
+=========================================

+This tutorial demonstrates training with vector environments to speed it up.
 """

 # %%
 # Notice
 # ------
 #
-# If you encounter an RuntimeError like the following comment raised on multiprocessing/spawn.py, wrap up the code from ``gym.vector.make=`` or ``gym.vector.AsyncVectorEnv`` to the end of the code by ``if__name__ == '__main__'``.
+# If you encounter a RuntimeError like the following, raised from multiprocessing/spawn.py, wrap the code from ``gym.make_vec`` or ``gym.vector.AsyncVectorEnv`` to the end of the file in ``if __name__ == '__main__'``.
 #
 # ``An attempt has been made to start a new process before the current process has finished its bootstrapping phase.``
 #
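
Concretely, the guard looks like this. A minimal sketch using ``gym.vector.AsyncVectorEnv`` (the environment id and worker count are placeholders):

    import gymnasium as gym

    def make_env():
        return gym.make("CartPole-v1")  # placeholder environment

    if __name__ == "__main__":
        # Subprocess-backed vector envs must be created under this guard so
        # that spawned workers do not re-execute the module's top level.
        envs = gym.vector.AsyncVectorEnv([make_env for _ in range(4)])
        observations, infos = envs.reset(seed=42)
        envs.close()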
