
Commit 42f2514

Merge pull request #124 from cpnota/release/0.4.0
Release/0.4.0
2 parents 4575eca + ba03123 commit 42f2514

124 files changed (+2992 −1730 lines)

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -1,7 +1,11 @@
 # python
 *.pyc
 __pycache__
-all.egg-info
+autonomous_learning_library.egg-info
+
+# build directories
+/build
+/dist
 
 # editor
 .vscode

.pylintrc

Lines changed: 2 additions & 1 deletion
@@ -330,7 +330,7 @@ indent-after-paren=4
 indent-string='    '
 
 # Maximum number of characters on a single line.
-max-line-length=100
+max-line-length=120
 
 # Maximum number of lines in a module.
 max-module-lines=1000
@@ -437,6 +437,7 @@ good-names=i,
            n,
            t,
            e,
+           u,
            kl,
            ax
 

.readthedocs.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# .readthedocs.yml
+# Read the Docs configuration file
+# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
+
+# Required
+version: 2
+
+# Build documentation in the docs/ directory with Sphinx
+sphinx:
+  configuration: docs/source/conf.py
+
+# Build documentation with MkDocs
+#mkdocs:
+#  configuration: mkdocs.yml
+
+# Optionally build your docs in additional formats such as PDF and ePub
+formats: all
+
+# Optionally set the version of Python and requirements required to build your docs
+python:
+  version: 3.7
+  install:
+    - method: pip
+      path: .
+      extra_requirements:
+        - docs

CONTRIBUTING.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# Contributing
+
+Contributions and suggestions are welcome!
+If you are interested in contributing either bug fixes or new features, open an issue and we can talk about it!
+New PRs will require:
+
+1. New unit tests for any new or changed common module, and all unit tests should pass.
+2. All code should follow a similar style to the rest of the repository and the linter should pass.
+3. Documentation of new features.
+4. Manual approval.
+
+
+We use the [GitFlow](https://datasift.github.io/gitflow/IntroducingGitFlow.html) model, meaning that all PRs should be opened against the `develop` branch!
+To begin, you can run the following commands:
+
+```
+git clone https://github.com/cpnota/autonomous-learning-library.git
+cd autonomous-learning-library
+git checkout develop
+pip install -e .[docs]
+```
+
+The unit tests may be run using:
+
+```
+make test
+```
+
+Finally, you can rebuild the documentation using:
+
+```
+cd docs
+make clean && make html
+```
+
+Happy hacking!

Makefile

Lines changed: 12 additions & 3 deletions
@@ -1,7 +1,6 @@
 install:
-	pip install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp37-cp37m-linux_x86_64.whl
-	pip install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp37-cp37m-linux_x86_64.whl
-	pip install tensorflow
+	conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
+	pip install tensorboard
 	pip install -e .
 
 lint:
@@ -15,3 +14,13 @@ tensorboard:
 
 benchmark:
 	tensorboard --logdir benchmarks/runs --port=6007
+
+clean:
+	rm -rf dist
+	rm -rf build
+
+build: clean
+	python setup.py sdist bdist_wheel
+
+deploy: lint test build
+	twine upload dist/*

README.md

Lines changed: 45 additions & 127 deletions
@@ -1,36 +1,30 @@
-# The Autonomous Learning Library: An Object-Oriented Deep Reinforcement Learning Library in Pytorch
-
-The Autonomous Learning Library (`all`) is an object-oriented deep reinforcement learning library in `pytorch`. The goal of the library is to provide implementations of modern reinforcement learning algorithms that reflect the way that reinforcement learning researchers think about agent design and to provide the components necessary to build and test new ideas with minimal overhead.
-
-## Why use `all`?
-
-The primary reason for using `all` over its many competitors is because it contains components that allow you to *build your own* reinforcement learning agents.
-We provide out-of-the-box modules for:
-
-- [x] Custom Q-Networks, V-Networks, policy networks, and feature networks
-- [x] Generic function approximation
-- [x] Target networks
-- [x] Polyak averaging
-- [x] Experience Replay
-- [x] Prioritized Experience Replay
-- [x] Advantage Estimation
-- [x] Generalized Advantage Estimation (GAE)
-- [x] Easy parameter and learning rate scheduling
-- [x] An enhanced `nn` module (includes dueling layers, noisy layers, action bounds, and the coveted `nn.Flatten`)
-- [x] `gym` to `pytorch` wrappers
-- [x] Atari wrappers
-- [x] An `Experiment` API for comparing and evaluating agents
-- [x] A `SlurmExperiment` API for running massive experiments on computing clusters
-- [x] A `Writer` object for easily logging information in `tensorboard`
-- [x] Plotting utilities for generating paper-worthy result plots
-
-Rather than being embedded in the agents, all of these modules are available for use by your own custom agents.
-Additionally, the included agents accept custom versions of any of the above objects.
-Have a new type of replay buffer in mind?
-Code it up and pass it directly to our `DQN` and `DDPG` implementations.
-Additionally, our agents were written with readibility as a primary concern, so they are easy to modify.
-
-## Algorithms
+# The Autonomous Learning Library: A PyTorch Library for Building Reinforcement Learning Agents
+
+The `autonomous-learning-library` is an object-oriented deep reinforcement learning (DRL) library for PyTorch.
+The goal of the library is to provide the necessary components for quickly building and evaluating novel reinforcement learning agents,
+as well as providing high-quality reference implementations of modern DRL algorithms.
+The full documentation can be found at the following URL: [https://autonomous-learning-library.readthedocs.io](https://autonomous-learning-library.readthedocs.io).
+
+## Tools for Building New Agents
+
+The primary goal of the `autonomous-learning-library` is to facilitate the rapid development of new reinforcement learning agents by providing common tools for building and evaluating agents, such as:
+
+* A flexible function `Approximation` API that integrates features such as target networks, gradient clipping, learning rate schedules, model checkpointing, multi-headed networks, loss scaling, logging, and more.
+* Various memory buffers, including prioritized experience replay (PER), generalized advantage estimation (GAE), and more.
+* A `torch`-based `Environment` interface that simplifies agent implementations by cutting out the `numpy` middleman.
+* Common wrappers and agent enhancements for replicating standard benchmarks.
+* [Slurm](https://slurm.schedmd.com/documentation.html) integration for running large-scale experiments.
+* Plotting and logging utilities, including `tensorboard` integration and utilities for generating common plots.
+
+See the [documentation](https://autonomous-learning-library.readthedocs.io) guide for a full description of the functionality provided by the `autonomous-learning-library`.
+Additionally, we provide an [example project](https://github.com/cpnota/all-example-project) which demonstrates the best practices for building new agents.
+
+## High-Quality Reference Implementations
+
+The `autonomous-learning-library` separates reinforcement learning agents into two modules: `all.agents`, which provides flexible, high-level implementations of many common algorithms which can be adapted to new problems and environments, and `all.presets`, which provides specific instantiations of these agents tuned for particular sets of environments, including Atari games, classic control tasks, and PyBullet robotics simulations. Some benchmark results showing performance on par with published results can be found below:
+
+![atari40](benchmarks/atari40.png)
+![pybullet](benchmarks/pybullet.png)
 
 As of today, `all` contains implementations of the following deep RL algorithms:
 
@@ -49,131 +43,55 @@ It also contains implementations of the following "vanilla" agents, which provid
 - [x] Vanilla Q-Learning
 - [x] Vanilla Sarsa
 
-We will try to stay up-to-date with advances in the field, but we do not intend to implement every algorithm. Rather, we prefer to maintain a smaller set of high-quality agents that have achieved notoriety in the field.
-
-We have labored to make sure that our implementations produce results comparable to published results.
-Here's a sampling of performance on several Atari games:
-
-![atari40](atari40.png)
-
-These results were generated using the `all.presets.atari` module, the `SlurmExperiment` utility, and the `all.experiments.plots` module.
-
-## Example
-
-Our agents implement a single method: `action = agent.act(state, reward)`.
-Much of the complexity is handled behind the scenes, making the final API simple.
-Unlike many libraries, we do not combine the learning algorithm and the training loop.
-Instead, our agents can be embedded in existing applications and trained in the online setting.
-
-The `all.presets` includes agents that preconfigured for certain types of environments.
-It can be used as follows:
-
-```python
-from all.presets.atari import dqn
-from all.environments import AtariEnvironment
+## Installation
 
-env = AtariEnvironment('Breakout')
-agent = dqn(lr=3e-4)(env)
+First, you will need a recent version of [PyTorch](https://pytorch.org) (>1.3), as well as [Tensorboard](https://pypi.org/project/tensorboard/).
+Then, you can install the `autonomous-learning-library` through PyPI:
 
-while True:
-    if env.done:
-        env.reset()
-    else:
-        env.step(action)
-    env.render()
-    action = agent.act(env.state, env.reward)
 ```
-
-However, generally we recommend using the `Experiment` API, which adds many additional features:
-
-```python
-from all.presets.atari import a2c, dqn
-from all.environments import AtariEnvironment
-from all.experiments import Experiment
-
-# use graphics card for much faster training
-device = 'cuda'
-experiment = Experiment(AtariEnvironment('Breakout', device=device), frames=10e6)
-experiment.run(a2c(device=device))
-experiment.run(dqn(device=device))
+pip install autonomous-learning-library
 ```
 
-Results can be viewed by typing:
+Alternatively, you can install directly from this repository:
 
 ```
-make tensorboard
-```
-
-## Installation
-
-This library is built on top of `pytorch`.
-If you don't want your trials to take forever, it is highly recommended that you make sure your installation has CUDA support (and that you have a CUDA compatible GPU).
-You'll also need `tensorflow` in order to use `tensorboard` (used for storing and plotting runs).
-
-There are two ways to install the `autonomous-learning-library`: a "light" installation, which assumes that the major dependencies are already installed, and a "full" installation which installs everything from scratch.
-
-### Light Installation
-
-Use this if you already have `pytorch` and `tensorflow` installed.
-Simply run:
-
-```bash
+git clone https://github.com/cpnota/autonomous-learning-library.git
+cd autonomous-learning-library
 pip install -e .
 ```
 
-Presto! If you have any trouble with installing the Gym environments, check out their [GitHub page](https://github.com/openai/gym) and try whatever they recommend in [current year].
-
-### Full Installation
+You can also install the prerequisites using:
 
-If you're on Linux and don't have `pytorch` or `tensorflow` installed, we did you the courtesy of providing a helpful install script:
-
-```bash
-make install
 ```
-
-With any luck, the `all` library should now be installed!
-
-### Testing Your Installation
-
-The unit tests may be run using:
-
+pip install autonomous-learning-library[pytorch]
 ```
-make test
-```
-
-If the unit tests pass with no errors, it is more than likely that your installation works! The unit tests run every agent using both `cpu` and `cuda` for a few timesteps/episodes.
 
 ## Running the Presets
 
-You can easily benchmark the included algorithms using the scripts in `./benchmarks`.
-To run a simple `CartPole` benchmark, run:
+If you just want to test out some cool agents, the `scripts` directory contains the basic code for doing so.
 
 ```
-python scripts/classic.py CartPole-v1 dqn
+python scripts/atari.py Breakout a2c
 ```
 
-Results are printed to the console, and can also be viewed by running:
+You can watch the training progress using:
 
 ```
-make tensorboard
+tensorboard --logdir runs
 ```
 
 and opening your browser to http://localhost:6006.
-
-To run an Atari benchmark in CUDA mode (warning: this could take several hours to run, depending on your machine):
+Once the model is trained to your satisfaction, you can watch the trained model play using:
 
 ```
-python scripts/atari.py Pong dqn
+python scripts/watch_atari.py Breakout "runs/_a2c [id]"
 ```
 
-If you want to run in `cpu` mode (~10x slower on my machine), you can add ```--device cpu```:
-
-```
-python scipts/atari.py Pong dqn --device cpu
-```
+where `id` is the ID of your particular run. You should be able to find it using tab completion or by looking in the `runs` directory.
+The `autonomous-learning-library` also contains presets and scripts for classic control and PyBullet environments.
 
 ## Note
 
-This library was built at the [Autonomous Learning Laboratory](http://all.cs.umass.edu) (ALL) at the [University of Massachusetts, Amherst](https://www.umass.edu).
+This library was built in the [Autonomous Learning Laboratory](http://all.cs.umass.edu) (ALL) at the [University of Massachusetts, Amherst](https://www.umass.edu).
 It was written and is currently maintained by Chris Nota (@cpnota).
 The views expressed or implied in this repository do not necessarily reflect the views of the ALL.
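
For readers who want to go beyond the bundled scripts, the following sketch shows how a preset, environment, and experiment might be combined programmatically. It is adapted from the `Experiment` example removed from the old README above; the exact module paths and the `frames` argument may have changed in release 0.4.0, so treat it as illustrative rather than authoritative.

```python
# Illustrative sketch adapted from the Experiment example in the old README
# (removed by this commit); module paths and signatures may differ in 0.4.0.
from all.presets.atari import a2c, dqn
from all.environments import AtariEnvironment
from all.experiments import Experiment

device = 'cuda'  # use the graphics card for much faster training
experiment = Experiment(AtariEnvironment('Breakout', device=device), frames=10e6)
experiment.run(a2c(device=device))
experiment.run(dqn(device=device))
```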

all/agents/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,7 @@
 from .ddqn import DDQN
 from .dqn import DQN
 from .ppo import PPO
+from .rainbow import Rainbow
 from .sac import SAC
 from .vac import VAC
 from .vpg import VPG
@@ -16,8 +17,11 @@
     "A2C",
     "C51",
     "DDPG",
+    "DDQN",
     "DQN",
     "PPO",
+    "Rainbow",
+    "SAC",
     "VAC",
     "VPG",
     "VQN",

all/agents/_agent.py

Lines changed: 6 additions & 8 deletions
@@ -25,13 +25,11 @@ def act(self, state, reward):
         This method allows the agent to do whatever is necessary for itself on a given timestep.
         However, the agent must ultimately return an action.
 
-        Parameters
-        ----------
-        state: The environment state at the current timestep
-        reward: The reward from the previous timestep
-        info (optional): The info object from the environment
+        Args:
+            state (all.environment.State): The environment state at the current timestep.
+            reward (torch.Tensor): The reward from the previous timestep.
+            info (:obj:, optional): The info object from the environment.
 
-        Returns
-        _______
-        action: The action to take at the current timestep
+        Returns:
+            torch.Tensor: The action to take at the current timestep.
         """
