From 4da0908e8ac052970c3969a4daf59c6e66aa109e Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 15:48:53 -0400 Subject: [PATCH 1/9] update setup.py --- docs/source/guide/basic_concepts.rst | 2 +- docs/source/guide/benchmark_performance.rst | 2 +- setup.py | 8 ++++---- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/source/guide/basic_concepts.rst b/docs/source/guide/basic_concepts.rst index 13cbcb78..621463c3 100644 --- a/docs/source/guide/basic_concepts.rst +++ b/docs/source/guide/basic_concepts.rst @@ -324,7 +324,7 @@ The library contains an automatically plotting utility that generates appropriat This will generate a plot that looks like the following (after tweaking the whitespace through the ``matplotlib`` UI): -.. image:: ../../../benchmarks/atari40.png +.. image:: ../../../benchmarks/atari_40m.png An optional parameter is ``test_episodes``, which is set to 100 by default. After running for the given number of frames, the agent will be evaluated for a number of episodes specified by ``test_episodes`` with training disabled. diff --git a/docs/source/guide/benchmark_performance.rst b/docs/source/guide/benchmark_performance.rst index 831237cd..4d3be98a 100644 --- a/docs/source/guide/benchmark_performance.rst +++ b/docs/source/guide/benchmark_performance.rst @@ -43,7 +43,7 @@ our agents achieved very similar behavior to the agents tested by DeepMind. MuJoCo Benchmark ------------------ -`MuJoCo https://mujoco.org`_ is "a free and open source physics engine that aims to facilitate research and development in robotics, biomechanics, graphics and animation, and other areas where fast and accurate simulation is needed." +`MuJoCo <https://mujoco.org>`_ is "a free and open source physics engine that aims to facilitate research and development in robotics, biomechanics, graphics and animation, and other areas where fast and accurate simulation is needed." 
The MuJoCo Gym environments are a common benchmark in RL research for evaluating agents with continuous action spaces. We ran each continuous preset for 5 million timesteps (in this case, timesteps are equal to frames). The learning rate was decayed over the course of training using cosine annealing. diff --git a/setup.py b/setup.py index feb7c15a..ac0ee84d 100644 --- a/setup.py +++ b/setup.py @@ -26,10 +26,10 @@ "torch-testing==0.0.2", # pytorch assertion library ], "docs": [ - "sphinx~=3.2.1", # documentation library - "sphinx-autobuild~=2020.9.1", # documentation live reload - "sphinx-rtd-theme~=0.5.0", # documentation theme - "sphinx-automodapi~=0.13.0", # autogenerate docs for modules + "sphinx~=7.2.6", # documentation library + "sphinx-autobuild~=2024.2.4", # documentation live reload + "sphinx-rtd-theme~=2.0.0", # documentation theme + "sphinx-automodapi~=0.17.0", # autogenerate docs for modules ], } From 4b7551ec3376a956311d7e382df478fd932dfcf2 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 15:53:28 -0400 Subject: [PATCH 2/9] fix sphinx warnings --- docs/source/conf.py | 2 +- docs/source/guide/basic_concepts.rst | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/source/conf.py b/docs/source/conf.py index 203248f9..8bdcb5c4 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -72,4 +72,4 @@ # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". 
-html_static_path = ['_static'] +# html_static_path = ['_static'] diff --git a/docs/source/guide/basic_concepts.rst b/docs/source/guide/basic_concepts.rst index 621463c3..912ea728 100644 --- a/docs/source/guide/basic_concepts.rst +++ b/docs/source/guide/basic_concepts.rst @@ -160,8 +160,8 @@ A few other quick things to note: ``f.no_grad(x)`` runs a forward pass with ``to ``f.target(x)`` calls the *target network* (an advanced concept used in algorithms such as DQN. For example, David Silver's `course notes `_) associated with the ``Approximation``, also with ``torch.no_grad()``. The ``autonomous-learning-library`` provides a few thin wrappers over ``Approximation`` for particular purposes, such as ``QNetwork``, ``VNetwork``, ``FeatureNetwork``, and several ``Policy`` implementations. -Environments ------------- +ALL Environments +---------------- The importance of the ``Environment`` in reinforcement learning nearly goes without saying. In the ``autonomous-learning-library``, the prepackaged environments are simply wrappers for `OpenAI Gym <https://gym.openai.com>`_, the de facto standard library for RL environments. @@ -216,8 +216,8 @@ Of course, this control loop is not exactly feature-packed. Generally, it's better to use the ``Experiment`` module described later. -Presets ------- +ALL Presets ----------- In the ``autonomous-learning-library``, agents are *compositional*, which means that the behavior of a given ``Agent`` depends on the behavior of several other objects. Users can compose agents with specific behavior by passing appropriate objects into the constructor of the high-level algorithms contained in ``all.agents``. @@ -274,8 +274,8 @@ If a ``Preset`` is loaded from disk, then we can instantiate a test ``Agent`` us -Experiment ---------- +ALL Experiments --------------- Finally, we have all of the components necessary to introduce the ``run_experiment`` helper function. ``run_experiment`` is the built-in control loop for running reinforcement learning experiments. 
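Patch 2 above ends on the docs' description of ``run_experiment`` as the library's built-in control loop. For readers following along, the agent/environment interaction it automates can be sketched without the library at all; ``DummyEnv`` and ``DummyAgent`` below are illustrative stand-ins, not library classes:

```python
# Minimal sketch of the RL control loop that run_experiment automates.
# DummyEnv and DummyAgent are illustrative stand-ins, not library classes.
import random


class DummyEnv:
    """A trivial episodic environment: every episode ends after 5 steps."""

    def reset(self):
        self.t = 0
        return self.t  # initial state

    def step(self, action):
        self.t += 1
        reward = 1.0  # constant survival bonus, CartPole-style
        done = self.t >= 5
        return self.t, reward, done


class DummyAgent:
    def act(self, state):
        return random.choice([0, 1])  # uniformly random policy


def run_episodes(agent, env, episodes):
    """Run full episodes and collect the undiscounted return of each."""
    returns = []
    for _ in range(episodes):
        state, total, done = env.reset(), 0.0, False
        while not done:
            action = agent.act(state)
            state, reward, done = env.step(action)
            total += reward
        returns.append(total)
    return returns


returns = run_episodes(DummyAgent(), DummyEnv(), episodes=3)
print(returns)  # → [5.0, 5.0, 5.0]
```

The real ``run_experiment`` layers logging, test episodes, and preset handling on top of this same loop.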
From 2c5f519103cefad73149a2edc9a6be35d0aeded2 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 16:06:25 -0400 Subject: [PATCH 3/9] edit handling of docs dependencies --- .readthedocs.yml | 1 + setup.py | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/.readthedocs.yml b/.readthedocs.yml index 5a2a5475..20a7039e 100644 --- a/.readthedocs.yml +++ b/.readthedocs.yml @@ -11,6 +11,7 @@ python: path: . extra_requirements: - docs + - dev sphinx: configuration: docs/source/conf.py diff --git a/setup.py b/setup.py index ac0ee84d..54b33f01 100644 --- a/setup.py +++ b/setup.py @@ -36,7 +36,7 @@ extras["all"] = ( extras["atari"] + extras["mujoco"] + extras["pybullet"] + extras["ma-atari"] ) -extras["dev"] = extras["all"] + extras["test"] + extras["docs"] +extras["dev"] = extras["all"] + extras["test"] setup( name="autonomous-learning-library", From 7a204eb3444dad82c9415bec808b1f97f7935947 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 16:33:25 -0400 Subject: [PATCH 4/9] update docs to latest --- all/scripts/__init__.py | 14 ++++++++++++++ all/scripts/plot.py | 1 + docs/source/conf.py | 2 +- docs/source/guide/basic_concepts.rst | 11 +++++------ docs/source/guide/getting_started.rst | 12 ++++++------ docs/source/index.rst | 2 +- 6 files changed, 28 insertions(+), 14 deletions(-) diff --git a/all/scripts/__init__.py b/all/scripts/__init__.py index e69de29b..451cdde7 100644 --- a/all/scripts/__init__.py +++ b/all/scripts/__init__.py @@ -0,0 +1,14 @@ +from . import plot +from . import train_atari +from . import train_classic +from . import train_continuous +from . import train_mujoco +from . import train_multiagent_atari +from . import train_pybullet +from . import train +from . import watch_atari +from . import watch_classic +from . import watch_continuous +from . import watch_mujoco +from . import watch_multiagent_atari +from . 
import watch_pybullet diff --git a/all/scripts/plot.py b/all/scripts/plot.py index 0ba4472a..b6056d66 100644 --- a/all/scripts/plot.py +++ b/all/scripts/plot.py @@ -1,3 +1,4 @@ +"""Plot the results of experiments.""" import argparse from all.experiments import plot_returns_100 diff --git a/docs/source/conf.py b/docs/source/conf.py index 8bdcb5c4..332536d3 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -18,7 +18,7 @@ # -- Project information ----------------------------------------------------- project = 'autonomous-learning-library' -copyright = '2020, Chris Nota' +copyright = '2024, Chris Nota' author = 'Chris Nota' # The full version, including alpha/beta/rc tags diff --git a/docs/source/guide/basic_concepts.rst b/docs/source/guide/basic_concepts.rst index 912ea728..08ff00b4 100644 --- a/docs/source/guide/basic_concepts.rst +++ b/docs/source/guide/basic_concepts.rst @@ -164,7 +164,7 @@ ALL Environments ---------------- The importance of the ``Environment`` in reinforcement learning nearly goes without saying. -In the ``autonomous-learning-library``, the prepackaged environments are simply wrappers for `OpenAI Gym <https://gym.openai.com>`_, the de facto standard library for RL environments. +In the ``autonomous-learning-library``, the prepackaged environments are simply wrappers for `Gymnasium <https://gymnasium.farama.org>`_ (formerly OpenAI Gym), the de facto standard library for RL environments. .. figure:: ./ale.png @@ -173,7 +173,7 @@ In the ``autonomous-learning-library``, the prepackaged environments are simply We add a few additional features: -1) ``gym`` primarily uses ``numpy.array`` for representing states and actions. We automatically convert to and from ``torch.Tensor`` objects so that agent implementations need not consider the difference. +1) ``gymnasium`` primarily uses ``numpy.array`` for representing states and actions. We automatically convert to and from ``torch.Tensor`` objects so that agent implementations need not consider the difference. 
2) We add properties to the environment for ``state``, ``reward``, etc. This simplifies the control loop and is generally useful. 3) We apply common preprocessors, such as several standard Atari wrappers. However, where possible, we prefer to perform preprocessing using ``Body`` objects to maximize the flexibility of the agents. @@ -181,7 +181,7 @@ Below, we show how several different types of environments can be created: .. code-block:: python - from all.environments import AtariEnvironment, GymEnvironment, PybulletEnvironment + from all.environments import AtariEnvironment, GymEnvironment, MujocoEnvironment # create an Atari environment on the gpu env = AtariEnvironment('Breakout', device='cuda') @@ -190,7 +190,7 @@ Below, we show how several different types of environments can be created: env = GymEnvironment('CartPole-v0') # create a MuJoCo environment on the cpu - env = PybulletEnvironment('cheetah') + env = MujocoEnvironment('HalfCheetah-v4') Now we can write our first control loop: @@ -284,7 +284,7 @@ Here is a quick example: .. code-block:: python - from gym import envs from all.experiments import run_experiment from all.presets import atari from all.environments import AtariEnvironment @@ -313,7 +312,7 @@ You can view the results in ``tensorboard`` by running the following command: tensorboard --logdir runs -In addition to the ``tensorboard`` logs, every 100 episodes, the mean and standard deviation of the previous 100 episode returns are written to ``runs/[agent]/[env]/returns100.csv``. +In addition to the ``tensorboard`` logs, every 100 episodes, the mean, standard deviation, min, and max of the previous 100 episode returns are written to ``runs/[agent]/[env]/returns100.csv``. This is much faster to read and plot than Tensorboard's proprietary format. 
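The ``returns100.csv`` file described above is plain CSV, which is exactly what makes it faster to read and plot than Tensorboard's event files. Here is a sketch of parsing it with the standard library; note that the precise column layout assumed here (frame, mean, std, min, max) is an illustration, so check a real file before relying on it:

```python
# Sketch: parsing a returns100.csv as written by the experiment logger.
# The column order (frame, mean, std, min, max) is an ASSUMPTION for
# illustration; inspect a real runs/[agent]/[env]/returns100.csv first.
import csv
import io

# Synthetic stand-in for a real returns100.csv file.
fake_csv = io.StringIO(
    "100000,21.5,4.2,12.0,33.0\n"
    "200000,55.1,9.8,30.0,78.0\n"
    "300000,130.7,20.3,90.0,180.0\n"
)

rows = [
    {
        "frame": int(r[0]),
        "mean": float(r[1]),
        "std": float(r[2]),
        "min": float(r[3]),
        "max": float(r[4]),
    }
    for r in csv.reader(fake_csv)
]

# Find the evaluation window with the best mean return.
best = max(rows, key=lambda r: r["mean"])
print(best["frame"], best["mean"])  # → 300000 130.7
```

For a real file, replace ``fake_csv`` with ``open("runs/<agent>/<env>/returns100.csv")``.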
The library contains an automatic plotting utility that generates appropriate plots for an *entire* ``runs`` directory as follows: diff --git a/docs/source/guide/getting_started.rst b/docs/source/guide/getting_started.rst index 34caa4df..cd666d50 100644 --- a/docs/source/guide/getting_started.rst +++ b/docs/source/guide/getting_started.rst @@ -4,9 +4,9 @@ Getting Started Prerequisites ------------- -The Autonomous Learning Library requires a recent version of PyTorch (~=1.8.0 recommended). +The Autonomous Learning Library requires a recent version of PyTorch (at least v2.2.0 is recommended). Additionally, Tensorboard is required in order to enable logging. -We also strongly recommend using a machine with a fast GPU (at minimum a GTX 970 or better, a GTX 1080ti or better is preferred). +We also strongly recommend using a machine with a fast GPU with at least 11 GB of VRAM (a GTX 1080ti or better is preferred). Installation ------------ @@ -35,7 +35,7 @@ An alternate approach, that may be useful when following this tutorial, is to in cd autonomous-learning-library pip install -e .[dev] -``dev`` will install all of the optional dependencies for developers of the repo, such as unit test and documentation dependencies, as well as all environments. +``dev`` will install all of the optional dependencies for developers of the repo, such as unit test dependencies, as well as all environments. If you chose to clone the repository, you can test your installation by running the unit test suite: .. code-block:: bash @@ -56,7 +56,7 @@ For example, a PPO agent can be run on Cart-Pole as follows: all-classic CartPole-v0 a2c -The results will be written to ``runs/a2c_<env>_<id>``, where ``<env>`` and ``<id>`` are strings generated by the library. +The results will be written to ``runs/a2c_CartPole-v0_<id>``, where ``<id>`` is generated by the library. You can view these results and other information through `tensorboard`: ..
code-block:: bash @@ -84,9 +84,9 @@ Finally, to watch the trained model in action, we provide a `watch` script for .. code-block:: bash - all-watch-classic CartPole-v0 runs/a2c_<env>_<id>/preset.pt + all-watch-classic CartPole-v0 runs/a2c_CartPole-v0_<id>/preset.pt You need to find the ``<id>`` by checking the ``runs`` directory. Each of these scripts can be found in the ``scripts`` directory of the main repository. -Be sure to check out the ``atari`` and ``continuous`` scripts for more fun! +Be sure to check out the ``atari`` and ``mujoco`` scripts for more fun! diff --git a/docs/source/index.rst b/docs/source/index.rst index f3b311d2..a2218842 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -26,7 +26,7 @@ Enjoy! guide/benchmark_performance .. toctree:: - :maxdepth: 4 + :maxdepth: 1 :caption: Modules: modules/agents From ea0bc8d79e939b2059f7f064938b4fd0405d9744 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 17:04:29 -0400 Subject: [PATCH 5/9] run formatter --- all/scripts/__init__.py | 30 ++++++++++++++++-------------- all/scripts/plot.py | 1 - 2 files changed, 16 insertions(+), 15 deletions(-) diff --git a/all/scripts/__init__.py b/all/scripts/__init__.py index 451cdde7..04462d2f 100644 --- a/all/scripts/__init__.py +++ b/all/scripts/__init__.py @@ -1,14 +1,16 @@ -from . import plot -from . import train_atari -from . import train_classic -from . import train_continuous -from . import train_mujoco -from . import train_multiagent_atari -from . import train_pybullet -from . import train -from . import watch_atari -from . import watch_classic -from . import watch_continuous -from . import watch_mujoco -from . import watch_multiagent_atari -from . import watch_pybullet +from . 
import ( + plot, + train, + train_atari, + train_classic, + train_continuous, + train_mujoco, + train_multiagent_atari, + train_pybullet, + watch_atari, + watch_classic, + watch_continuous, + watch_mujoco, + watch_multiagent_atari, + watch_pybullet, +) diff --git a/all/scripts/plot.py b/all/scripts/plot.py index b6056d66..0ba4472a 100644 --- a/all/scripts/plot.py +++ b/all/scripts/plot.py @@ -1,4 +1,3 @@ -"""Plot the results of experiments.""" import argparse from all.experiments import plot_returns_100 From 93d09747b1962c508c2566487e2f6ec70fbb13b3 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 17:06:09 -0400 Subject: [PATCH 6/9] correct getting started doc --- docs/source/guide/getting_started.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/guide/getting_started.rst b/docs/source/guide/getting_started.rst index cd666d50..2166d105 100644 --- a/docs/source/guide/getting_started.rst +++ b/docs/source/guide/getting_started.rst @@ -50,7 +50,7 @@ Running a Preset Agent The goal of the Autonomous Learning Library is to provide components for building new agents. However, the library also includes a number of "preset" agent configurations for easy benchmarking and comparison, as well as some useful scripts. -For example, a PPO agent can be run on Cart-Pole as follows: +For example, an a2c agent can be run on CartPole as follows: .. code-block:: bash @@ -89,4 +89,4 @@ Finally, to watch the trained model in action, we provide a `watch` scripts for You need to find the by checking the ``runs`` directory. Each of these scripts can be found the ``scripts`` directory of the main repository. -Be sure to check out the ``atari`` and ``mujoco`` scripts for more fun! +Be sure to check out the ``all-atari`` and ``all-mujoco`` scripts for more fun! 
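The getting-started text changed in the patch above tells the reader to find the generated run id by checking the ``runs`` directory. One way to automate that lookup is to take the most recently modified directory matching the run prefix; this is a sketch under that assumption, with illustrative directory names and a hypothetical ``latest_run`` helper (not part of the library):

```python
# Sketch: locating the generated run id by picking the most recently
# modified directory under runs/. The naming pattern and the latest_run
# helper are illustrative, not part of the library.
import os
import tempfile
from pathlib import Path


def latest_run(runs_dir, prefix):
    """Return the most recently modified run directory matching prefix."""
    candidates = [
        d for d in Path(runs_dir).iterdir()
        if d.is_dir() and d.name.startswith(prefix)
    ]
    return max(candidates, key=lambda d: d.stat().st_mtime)


# Simulate a runs/ directory containing two training runs with fake ids.
runs = Path(tempfile.mkdtemp())
older = runs / "a2c_CartPole-v0_abc123"
newer = runs / "a2c_CartPole-v0_def456"
older.mkdir()
newer.mkdir()
os.utime(older, (1_000_000, 1_000_000))  # force distinct, ordered mtimes
os.utime(newer, (2_000_000, 2_000_000))

latest = latest_run(runs, "a2c_CartPole-v0_")
print(latest.name)  # → a2c_CartPole-v0_def456
```

With a real ``runs`` directory, ``latest_run("runs", "a2c_CartPole-v0_") / "preset.pt"`` would give the path expected by the ``all-watch-classic`` script.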
From f9b5a7d2bd013797c952a1e041ebe6dab2c19d3b Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 17:08:49 -0400 Subject: [PATCH 7/9] remove script imports --- all/scripts/__init__.py | 16 ---------------- 1 file changed, 16 deletions(-) diff --git a/all/scripts/__init__.py b/all/scripts/__init__.py index 04462d2f..e69de29b 100644 --- a/all/scripts/__init__.py +++ b/all/scripts/__init__.py @@ -1,16 +0,0 @@ -from . import ( - plot, - train, - train_atari, - train_classic, - train_continuous, - train_mujoco, - train_multiagent_atari, - train_pybullet, - watch_atari, - watch_classic, - watch_continuous, - watch_mujoco, - watch_multiagent_atari, - watch_pybullet, -) From 385eec8d22326afe292818e4ba984dbc758c18bd Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 17:10:11 -0400 Subject: [PATCH 8/9] remove dev installation from .readthedocs.yml --- .readthedocs.yml | 1 - 1 file changed, 1 deletion(-) diff --git a/.readthedocs.yml b/.readthedocs.yml index 20a7039e..5a2a5475 100644 --- a/.readthedocs.yml +++ b/.readthedocs.yml @@ -11,7 +11,6 @@ python: path: . 
extra_requirements: - docs - - dev sphinx: configuration: docs/source/conf.py From 57313b1c24cb5047c03acb9d571cb304d7f05957 Mon Sep 17 00:00:00 2001 From: "Nota, Christopher" Date: Sun, 17 Mar 2024 17:15:18 -0400 Subject: [PATCH 9/9] fix localhost link --- docs/source/guide/getting_started.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/guide/getting_started.rst b/docs/source/guide/getting_started.rst index 2166d105..81dc7e11 100644 --- a/docs/source/guide/getting_started.rst +++ b/docs/source/guide/getting_started.rst @@ -63,7 +63,7 @@ You can view these results and other information through `tensorboard`: tensorboard --logdir runs -By opening your browser to <http://localhost:6006>, you should see a dashboard that looks something like the following (you may need to adjust the "smoothing" parameter): +By opening your browser to `http://localhost:6006 <http://localhost:6006>`_, you should see a dashboard that looks something like the following (you may need to adjust the "smoothing" parameter): .. image:: tensorboard.png
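A closing note on the ``~=`` pins updated in ``setup.py`` in the first patch of this series: this is PEP 440's compatible-release operator, so ``sphinx~=7.2.6`` means roughly ``>=7.2.6, ==7.2.*``. A toy model for three-part versions only (a real resolver such as pip's must also handle pre-releases, epochs, and wildcards):

```python
def compatible(candidate, spec):
    """Toy model of PEP 440's '~=' for plain X.Y.Z versions only."""
    c = tuple(int(p) for p in candidate.split("."))
    s = tuple(int(p) for p in spec.split("."))
    # ~=X.Y.Z  ≈  >=X.Y.Z and ==X.Y.*
    return c >= s and c[:2] == s[:2]


print(compatible("7.2.8", "7.2.6"))  # True: same 7.2 series, newer patch
print(compatible("7.3.0", "7.2.6"))  # False: next minor series is excluded
print(compatible("7.2.5", "7.2.6"))  # False: older than the floor
```

This is why the ``docs`` extras above can pick up bugfix releases of Sphinx automatically without jumping to an incompatible minor version.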