
Commit 7021891

Author: Dean Wampler
Refinements for Recsys and references
1 parent fb6eb6a commit 7021891

File tree: 4 files changed, +29 -5 lines


images/AnyscaleAcademyLogo.png (595 Bytes)

ray-rllib/References-Reinforcement-Learning.ipynb

Lines changed: 12 additions & 1 deletion
@@ -36,6 +36,7 @@
     "\n",
     "Several blog posts and series provide concise introductions to RL:\n",
     "\n",
+    "* [Anatomy of a custom environment for RLlib](https://medium.com/distributed-computing-with-ray/anatomy-of-a-custom-environment-for-rllib-327157f269e5).\n",
     "* [A Reinforcement Learning Cheat Sheet](https://towardsdatascience.com/reinforcement-learning-cheat-sheet-2f9453df7651).\n",
     "* [Reinforcement Learning Explained](https://www.oreilly.com/radar/reinforcement-learning-explained/), Junling Hu, 2016. A gentle introduction to the ideas of RL.\n",
     "* [A Beginner's Guide to Deep Reinforcement Learning](https://pathmind.com/wiki/deep-reinforcement-learning), Pathmind, 2019. From Pathmind, which uses RLlib for its products and services. Lots of good references at the end of this post.\n",
@@ -91,6 +92,16 @@
     "* Ziyu Wang, Tom Schaul, Matteo Hessel, et al., \"Dueling Network Architectures for Deep Reinforcement Learning\", November 2015, [arxiv](https://arxiv.org/abs/1511.06581)."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Recommender Systems Using RL\n",
+    "\n",
+    "* Ken Goldberg, Theresa Roeder, Dhruv Gupta, Chris Perkins, \"Eigentaste: A Constant Time Collaborative Filtering Algorithm\", *Information Retrieval*, 4(2), 133-151 (July 2001) [pdf](https://goldberg.berkeley.edu/pubs/eigentaste.pdf).\n",
+    "* Julian McAuley and Jure Leskovec, \"From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews\", [arxiv](https://arxiv.org/abs/1303.4402) (March 18, 2013).\n"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -150,7 +161,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.7.6"
+  "version": "3.7.7"
  }
 },
 "nbformat": 4,

ray-rllib/explore-rllib/03-Custom-Environments-Reward-Shaping.ipynb

Lines changed: 3 additions & 1 deletion
@@ -15,7 +15,9 @@
    "We cover two important concepts: \n",
    "\n",
    "1. How to create your own _Markov Decision Process_ abstraction.\n",
-   "2. How to shape the reward of your environment so make your agent more effective. "
+   "2. How to shape the reward of your environment so make your agent more effective. \n",
+   "\n",
+   "For a more detailed discussion of how to build a custom environment for training a policy with RLlib using OpenAI [Gym](https://gym.openai.com/), see the [Recsys](../recsys/00-Recsys-Overview.ipynb) (recommender system) lessons and the blog post [\"Anatomy of a custom environment for RLlib\"](https://medium.com/distributed-computing-with-ray/anatomy-of-a-custom-environment-for-rllib-327157f269e5). Full source code for that post is available at <https://github.com/DerwenAI/gym_example>. "
   ]
  },
  {

ray-rllib/recsys/01-Recsys.ipynb

Lines changed: 14 additions & 3 deletions
@@ -836,7 +836,9 @@
  {
   "cell_type": "code",
   "execution_count": null,
-  "metadata": {},
+  "metadata": {
+   "scrolled": true
+  },
   "outputs": [],
   "source": [
    "from ray.tune.registry import register_env\n",
@@ -856,7 +858,9 @@
  {
   "cell_type": "code",
   "execution_count": null,
-  "metadata": {},
+  "metadata": {
+   "scrolled": true
+  },
   "outputs": [],
   "source": [
    "TRAIN_ITER = 20\n",
@@ -1073,6 +1077,13 @@
   "source": [
    "ray.shutdown()"
   ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "metadata": {},
+  "outputs": [],
+  "source": []
  }
 ],
 "metadata": {
@@ -1091,7 +1102,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.7.4"
+  "version": "3.7.7"
  }
 },
 "nbformat": 4,
