
Commit 7021891

Author: Dean Wampler
Refinements for Recsys and references
1 parent fb6eb6a commit 7021891

File tree: 4 files changed, +29 -5 lines


images/AnyscaleAcademyLogo.png (595 Bytes)

ray-rllib/References-Reinforcement-Learning.ipynb

Lines changed: 12 additions & 1 deletion
@@ -36,6 +36,7 @@
     "\n",
     "Several blog posts and series provide concise introductions to RL:\n",
     "\n",
+    "* [Anatomy of a custom environment for RLlib](https://medium.com/distributed-computing-with-ray/anatomy-of-a-custom-environment-for-rllib-327157f269e5).\n",
     "* [A Reinforcement Learning Cheat Sheet](https://towardsdatascience.com/reinforcement-learning-cheat-sheet-2f9453df7651).\n",
     "* [Reinforcement Learning Explained](https://www.oreilly.com/radar/reinforcement-learning-explained/), Junling Hu, 2016. A gentle introduction to the ideas of RL.\n",
     "* [A Beginner's Guide to Deep Reinforcement Learning](https://pathmind.com/wiki/deep-reinforcement-learning), Pathmind, 2019. From Pathmind, which uses RLlib for its products and services. Lots of good references at the end of this post.\n",
@@ -91,6 +92,16 @@
     "* Ziyu Wang, Tom Schaul, Matteo Hessel, et al., \"Dueling Network Architectures for Deep Reinforcement Learning\", November 2015, [arxiv](https://arxiv.org/abs/1511.06581)."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Recommender Systems Using RL\n",
+    "\n",
+    "* Ken Goldberg, Theresa Roeder, Dhruv Gupta, Chris Perkins, \"Eigentaste: A Constant Time Collaborative Filtering Algorithm\", *Information Retrieval*, 4(2), 133-151 (July 2001) [pdf](https://goldberg.berkeley.edu/pubs/eigentaste.pdf).\n",
+    "* Julian McAuley and Jure Leskovec, \"From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews\", [arxiv](https://arxiv.org/abs/1303.4402) (March 18, 2013).\n"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -150,7 +161,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.7.6"
+  "version": "3.7.7"
  }
 },
 "nbformat": 4,

ray-rllib/explore-rllib/03-Custom-Environments-Reward-Shaping.ipynb

Lines changed: 3 additions & 1 deletion
@@ -15,7 +15,9 @@
    "We cover two important concepts: \n",
    "\n",
    "1. How to create your own _Markov Decision Process_ abstraction.\n",
-   "2. How to shape the reward of your environment so make your agent more effective. "
+   "2. How to shape the reward of your environment so make your agent more effective. \n",
+   "\n",
+   "For a more detailed discussion of how to build a custom environment for training a policy with RLlib using OpenAI [Gym](https://gym.openai.com/), see the [Recsys](../recsys/00-Recsys-Overview.ipynb) (recommender system) lessons and the blog post [\"Anatomy of a custom environment for RLlib\"](https://medium.com/distributed-computing-with-ray/anatomy-of-a-custom-environment-for-rllib-327157f269e5). Full source code for that post is available at <https://github.com/DerwenAI/gym_example>. "
   ]
  },
  {

ray-rllib/recsys/01-Recsys.ipynb

Lines changed: 14 additions & 3 deletions
@@ -836,7 +836,9 @@
  {
   "cell_type": "code",
   "execution_count": null,
-  "metadata": {},
+  "metadata": {
+   "scrolled": true
+  },
   "outputs": [],
   "source": [
    "from ray.tune.registry import register_env\n",
@@ -856,7 +858,9 @@
  {
   "cell_type": "code",
   "execution_count": null,
-  "metadata": {},
+  "metadata": {
+   "scrolled": true
+  },
   "outputs": [],
   "source": [
    "TRAIN_ITER = 20\n",
@@ -1073,6 +1077,13 @@
   "source": [
    "ray.shutdown()"
   ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "metadata": {},
+  "outputs": [],
+  "source": []
  }
 ],
 "metadata": {
@@ -1091,7 +1102,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.7.4"
+  "version": "3.7.7"
  }
 },
 "nbformat": 4,
