Merge pull request #124 from cpnota/release/0.4.0
Release/0.4.0
cpnota authored Jan 20, 2020
2 parents 4575eca + ba03123 commit 42f2514
Showing 124 changed files with 2,992 additions and 1,730 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -1,7 +1,11 @@
# python
*.pyc
__pycache__
all.egg-info
autonomous_learning_library.egg-info

# build directories
/build
/dist

# editor
.vscode
3 changes: 2 additions & 1 deletion .pylintrc
@@ -330,7 +330,7 @@ indent-after-paren=4
indent-string='    '

# Maximum number of characters on a single line.
max-line-length=100
max-line-length=120

# Maximum number of lines in a module.
max-module-lines=1000
@@ -437,6 +437,7 @@ good-names=i,
n,
t,
e,
u,
kl,
ax

26 changes: 26 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,26 @@
# .readthedocs.yml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/source/conf.py

# Build documentation with MkDocs
#mkdocs:
# configuration: mkdocs.yml

# Optionally build your docs in additional formats such as PDF and ePub
formats: all

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.7
  install:
    - method: pip
      path: .
      extra_requirements:
        - docs
36 changes: 36 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,36 @@
# Contributing

Contributions and suggestions are welcome!
If you are interested in contributing either bug fixes or new features, open an issue and we can talk about it!
New PRs will require:

1. New unit tests for any new or changed common module, and all unit tests should pass.
2. All code should follow a similar style to the rest of the repository and the linter should pass.
3. Documentation of new features.
4. Manual approval.


We use the [GitFlow](https://datasift.github.io/gitflow/IntroducingGitFlow.html) model, meaning that all PRs should be opened against the `develop` branch!
To begin, you can run the following commands:

```
git clone https://github.com/cpnota/autonomous-learning-library.git
cd autonomous-learning-library
git checkout develop
pip install -e .[docs]
```

The unit tests may be run using:

```
make test
```

Finally, you can rebuild the documentation using:

```
cd docs
make clean && make html
```

Happy hacking!
15 changes: 12 additions & 3 deletions Makefile
@@ -1,7 +1,6 @@
install:
    pip install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp37-cp37m-linux_x86_64.whl
    pip install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp37-cp37m-linux_x86_64.whl
    pip install tensorflow
    conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
    pip install tensorboard
    pip install -e .

lint:
@@ -15,3 +14,13 @@ tensorboard:

benchmark:
    tensorboard --logdir benchmarks/runs --port=6007

clean:
    rm -rf dist
    rm -rf build

build: clean
    python setup.py sdist bdist_wheel

deploy: lint test build
    twine upload dist/*
172 changes: 45 additions & 127 deletions README.md
@@ -1,36 +1,30 @@
# The Autonomous Learning Library: An Object-Oriented Deep Reinforcement Learning Library in Pytorch

The Autonomous Learning Library (`all`) is an object-oriented deep reinforcement learning library in `pytorch`. The goal of the library is to provide implementations of modern reinforcement learning algorithms that reflect the way that reinforcement learning researchers think about agent design and to provide the components necessary to build and test new ideas with minimal overhead.

## Why use `all`?

The primary reason for using `all` over its many competitors is that it contains components that allow you to *build your own* reinforcement learning agents.
We provide out-of-the-box modules for:

- [x] Custom Q-Networks, V-Networks, policy networks, and feature networks
- [x] Generic function approximation
- [x] Target networks
- [x] Polyak averaging
- [x] Experience Replay
- [x] Prioritized Experience Replay
- [x] Advantage Estimation
- [x] Generalized Advantage Estimation (GAE)
- [x] Easy parameter and learning rate scheduling
- [x] An enhanced `nn` module (includes dueling layers, noisy layers, action bounds, and the coveted `nn.Flatten`)
- [x] `gym` to `pytorch` wrappers
- [x] Atari wrappers
- [x] An `Experiment` API for comparing and evaluating agents
- [x] A `SlurmExperiment` API for running massive experiments on computing clusters
- [x] A `Writer` object for easily logging information in `tensorboard`
- [x] Plotting utilities for generating paper-worthy result plots

Rather than being embedded in the agents, all of these modules are available for use by your own custom agents.
Additionally, the included agents accept custom versions of any of the above objects.
Have a new type of replay buffer in mind?
Code it up and pass it directly to our `DQN` and `DDPG` implementations.
Additionally, our agents were written with readability as a primary concern, so they are easy to modify.

## Algorithms
# The Autonomous Learning Library: A PyTorch Library for Building Reinforcement Learning Agents

The `autonomous-learning-library` is an object-oriented deep reinforcement learning (DRL) library for PyTorch.
The goal of the library is to provide the necessary components for quickly building and evaluating novel reinforcement learning agents,
as well as to provide high-quality reference implementations of modern DRL algorithms.
The full documentation can be found at the following URL: [https://autonomous-learning-library.readthedocs.io](https://autonomous-learning-library.readthedocs.io).

## Tools for Building New Agents

The primary goal of the `autonomous-learning-library` is to facilitate the rapid development of new reinforcement learning agents by providing common tools for building and evaluating agents, such as:

* A flexible function `Approximation` API that integrates features such as target networks, gradient clipping, learning rate schedules, model checkpointing, multi-headed networks, loss scaling, logging, and more.
* Various memory buffers, including prioritized experience replay (PER), generalized advantage estimation (GAE), and more.
* A `torch`-based `Environment` interface that simplifies agent implementations by cutting out the `numpy` middleman.
* Common wrappers and agent enhancements for replicating standard benchmarks.
* [Slurm](https://slurm.schedmd.com/documentation.html) integration for running large-scale experiments.
* Plotting and logging utilities including `tensorboard` integration and utilities for generating common plots.

See the [documentation](https://autonomous-learning-library.readthedocs.io) guide for a full description of the functionality provided by the `autonomous-learning-library`.
Additionally, we provide an [example project](https://github.com/cpnota/all-example-project) which demonstrates the best practices for building new agents.
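
To give a flavor of the `torch`-based `Environment` interface described above, the sketch below runs a preset agent directly against an Atari environment. It is adapted from the usage examples that appear later in this diff; the device, preset, and step count are illustrative rather than prescriptive.

```python
from all.environments import AtariEnvironment
from all.presets.atari import dqn

device = 'cuda'  # use 'cpu' if no compatible GPU is available
env = AtariEnvironment('Breakout', device=device)
agent = dqn(device=device)(env)  # presets are described in the next section

# env.state and env.reward are torch tensors on the chosen device,
# so no numpy conversion is needed anywhere in the loop.
env.reset()
action = agent.act(env.state, env.reward)
for _ in range(10000):  # illustrative number of timesteps
    if env.done:
        env.reset()
    else:
        env.step(action)
    action = agent.act(env.state, env.reward)
```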

## High-Quality Reference Implementations

The `autonomous-learning-library` separates reinforcement learning agents into two modules: `all.agents`, which provides flexible, high-level implementations of many common algorithms that can be adapted to new problems and environments, and `all.presets`, which provides specific instantiations of these agents tuned for particular sets of environments, including Atari games, classic control tasks, and PyBullet robotics simulations. Benchmark results demonstrating performance on par with published results can be found below:

![atari40](benchmarks/atari40.png)
![pybullet](benchmarks/pybullet.png)

As of today, `all` contains implementations of the following deep RL algorithms:

@@ -49,131 +43,55 @@ It also contains implementations of the following "vanilla" agents, which provid
- [x] Vanilla Q-Learning
- [x] Vanilla Sarsa

We will try to stay up-to-date with advances in the field, but we do not intend to implement every algorithm. Rather, we prefer to maintain a smaller set of high-quality agents that have achieved prominence in the field.

We have labored to make sure that our implementations produce results comparable to published results.
Here's a sampling of performance on several Atari games:

![atari40](atari40.png)

These results were generated using the `all.presets.atari` module, the `SlurmExperiment` utility, and the `all.experiments.plots` module.

## Example

Our agents implement a single method: `action = agent.act(state, reward)`.
Much of the complexity is handled behind the scenes, making the final API simple.
Unlike many libraries, we do not combine the learning algorithm and the training loop.
Instead, our agents can be embedded in existing applications and trained in the online setting.

The `all.presets` module includes agents that are preconfigured for certain types of environments.
It can be used as follows:

```python
from all.presets.atari import dqn
from all.environments import AtariEnvironment
## Installation

env = AtariEnvironment('Breakout')
agent = dqn(lr=3e-4)(env)
First, you will need a recent version of [PyTorch](https://pytorch.org) (>1.3), as well as [Tensorboard](https://pypi.org/project/tensorboard/).
Then, you can install the `autonomous-learning-library` from PyPI:

while True:
    if env.done:
        env.reset()
    else:
        env.step(action)
    env.render()
    action = agent.act(env.state, env.reward)
```

However, generally we recommend using the `Experiment` API, which adds many additional features:

```python
from all.presets.atari import a2c, dqn
from all.environments import AtariEnvironment
from all.experiments import Experiment

# use graphics card for much faster training
device = 'cuda'
experiment = Experiment(AtariEnvironment('Breakout', device=device), frames=10e6)
experiment.run(a2c(device=device))
experiment.run(dqn(device=device))
pip install autonomous-learning-library
```

Results can be viewed by typing:
Alternatively, you can install directly from this repository:

```
make tensorboard
```

## Installation

This library is built on top of `pytorch`.
If you don't want your trials to take forever, it is highly recommended that you make sure your installation has CUDA support (and that you have a CUDA compatible GPU).
You'll also need `tensorflow` in order to use `tensorboard` (used for storing and plotting runs).

There are two ways to install the `autonomous-learning-library` : a "light" installation, which assumes that the major dependencies are already installed, and a "full" installation which installs everything from scratch.

### Light Installation

Use this if you already have `pytorch` and `tensorflow` installed.
Simply run:

```bash
git clone https://github.com/cpnota/autonomous-learning-library.git
cd autonomous-learning-library
pip install -e .
```

Presto! If you have any trouble installing the Gym environments, check out their [GitHub page](https://github.com/openai/gym) and try whatever they recommend in [current year].

### Full Installation
You can also install the prerequisites using:

If you're on Linux and don't have `pytorch` or `tensorflow` installed, we did you the courtesy of providing a helpful install script:

```bash
make install
```

With any luck, the `all` library should now be installed!

### Testing Your Installation

The unit tests may be run using:

pip install autonomous-learning-library[pytorch]
```
make test
```

If the unit tests pass with no errors, it is more than likely that your installation works! The unit tests run every agent using both `cpu` and `cuda` for a few timesteps/episodes.

## Running the Presets

You can easily benchmark the included algorithms using the scripts in `./benchmarks`.
To run a simple `CartPole` benchmark, run:
If you just want to test out some cool agents, the `scripts` directory contains the basic code for doing so.

```
python scripts/classic.py CartPole-v1 dqn
python scripts/atari.py Breakout a2c
```

Results are printed to the console, and can also be viewed by running:
You can watch the training progress using:

```
make tensorboard
tensorboard --logdir runs
```

and opening your browser to http://localhost:6006.

To run an Atari benchmark in CUDA mode (warning: this could take several hours to run, depending on your machine):
Once the model is trained to your satisfaction, you can watch the trained model play using:

```
python scripts/atari.py Pong dqn
python scripts/watch_atari.py Breakout "runs/_a2c [id]"
```

If you want to run in `cpu` mode (~10x slower on my machine), you can add ```--device cpu```:

```
python scripts/atari.py Pong dqn --device cpu
```
where `id` is the ID of your particular run. You should be able to find it using tab completion or by looking in the `runs` directory.
The `autonomous-learning-library` also contains presets and scripts for classic control and PyBullet environments.

## Note

This library was built at the [Autonomous Learning Laboratory](http://all.cs.umass.edu) (ALL) at the [University of Massachusetts, Amherst](https://www.umass.edu).
This library was built in the [Autonomous Learning Laboratory](http://all.cs.umass.edu) (ALL) at the [University of Massachusetts, Amherst](https://www.umass.edu).
It was written and is currently maintained by Chris Nota (@cpnota).
The views expressed or implied in this repository do not necessarily reflect the views of the ALL.
4 changes: 4 additions & 0 deletions all/agents/__init__.py
@@ -5,6 +5,7 @@
from .ddqn import DDQN
from .dqn import DQN
from .ppo import PPO
from .rainbow import Rainbow
from .sac import SAC
from .vac import VAC
from .vpg import VPG
@@ -16,8 +17,11 @@
"A2C",
"C51",
"DDPG",
"DDQN",
"DQN",
"PPO",
"Rainbow",
"SAC",
"VAC",
"VPG",
"VQN",
14 changes: 6 additions & 8 deletions all/agents/_agent.py
@@ -25,13 +25,11 @@ def act(self, state, reward):
This method allows the agent to do whatever is necessary for itself on a given timestep.
However, the agent must ultimately return an action.
Parameters
----------
state: The environment state at the current timestep
reward: The reward from the previous timestep
info (optional): The info object from the environment
Args:
state (all.environment.State): The environment state at the current timestep.
reward (torch.Tensor): The reward from the previous timestep.
info (:obj:, optional): The info object from the environment.
Returns
_______
action: The action to take at the current timestep
Returns:
torch.Tensor: The action to take at the current timestep.
"""