OpenAI-final-project

Introducing the GoalGridWorld Environment project that was inspired by the grounded language learning (here is the reference paper https://arxiv.org/abs/1706.06551). This project was executed within the OpenAI scholarship program. The GoalGridWorld environment is a matrix where an agent navigates according to commands [go, avoid] to hit or avoid target cells that are represented as randomly displaced pairs of colored objects [triangle, square, circle/red, green, blue]. Basically, there are 18 possible commands but only 8 actual "words". Thanks to an NLP-like approach in this case, an agent can handle more complex/combinatorial commands.

Feel free to use the code to run your experiments.

I want to thank Alec Radford for his mentorship and introducing me to NLP and Joshua Achiam for his valuable advice on reinforcement learning and creating the Spinning Up (https://spinningup.openai.com/en/latest/index.html) that was extremely helpful for my project.

Setup

conda install --yes --file requirements.txt
Source activate spinningup
Python3 train.py

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
README.md		README.md
requirements.txt		requirements.txt
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

OpenAI-final-project

Setup

About

Uh oh!

Releases

Packages

Languages

SophiaAr/OpenAI-final-project

Folders and files

Latest commit

History

Repository files navigation

OpenAI-final-project

Setup

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages