|
17 | 17 | "cell_type": "markdown", |
18 | 18 | "metadata": {}, |
19 | 19 | "source": [ |
20 | | - "The full source code for this example recommender system is also in `recsys.py`. You can run it with the default settings, e.g., to exercise the code, use the following command:\n", |
| 20 | + "For reference, the GitHub public repo for this code is available at <https://github.com/anyscale/academy/blob/master/ray-rllib/recsys>, and the full source code for this example recommender system is in the `recsys.py` script. You can run it with default settings to exercise the code:\n", |
21 | 21 | "\n", |
22 | 22 | "```shell\n", |
23 | 23 | "python recsys.py\n", |
|
108 | 108 | "outputs": [], |
109 | 109 | "source": [ |
110 | 110 | "from pathlib import Path\n", |
| 111 | + "import os\n", |
111 | 112 | "import pandas as pd\n", |
112 | 113 | "\n", |
113 | | - "DATA_PATH = Path(\"jester-data-1.csv\")\n", |
| 114 | + "DATA_PATH = Path(os.getcwd()) / Path(\"jester-data-1.csv\")\n", |
114 | 115 | "sample = load_data(DATA_PATH)\n", |
115 | 116 | "\n", |
116 | 117 | "df = pd.DataFrame(sample)\n", |
|
212 | 213 | "cell_type": "markdown", |
213 | 214 | "metadata": {}, |
214 | 215 | "source": [ |
215 | | - "That plot shows a \"knee\" in the curve at `k=12` where the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~8% of the items. That choice has an inherent trade-off:\n", |
| 216 | + "This kind of cluster analysis has stochastic aspects, so results may differ on different runs. Generally, the plot shows a \"knee\" in the curve near `k=7` as the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~14% of the items. That choice has an inherent trade-off:\n", |
216 | 217 | "\n", |
217 | 218 | " * too few clusters → poor predictions (less accuracy)\n", |
218 | 219 | " * too many clusters → poor predictive power (less recall)\n", |
219 | 220 | "\n", |
220 | | - "Now we can run K-means in `scikit-learn` with that hyperparameter `k=12` to get the clusters that we'll use in our RL environment:" |
| 221 | + "Now we can run K-means in `scikit-learn` with that hyperparameter `k=7` to get the clusters that we'll use in our RL environment:" |
221 | 222 | ] |
222 | 223 | }, |
223 | 224 | { |
|
226 | 227 | "metadata": {}, |
227 | 228 | "outputs": [], |
228 | 229 | "source": [ |
229 | | - "K_CLUSTERS = 12\n", |
| 230 | + "K_CLUSTERS = 7\n", |
230 | 231 | "\n", |
231 | 232 | "km = KMeans(n_clusters=K_CLUSTERS)\n", |
232 | 233 | "km.fit(X)\n", |
|
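The `k=7` choice above comes from an elbow ("knee") analysis of clustering error. A minimal sketch of how such a curve can be produced, using a toy stand-in for the notebook's feature matrix `X` (the real `X` comes from earlier cells):

```python
import numpy as np
from sklearn.cluster import KMeans

# toy stand-in for the notebook's feature matrix X
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))

# within-cluster sum of squares (inertia) for a range of k values
inertias = []
for k in range(1, 15):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)

# the error always shrinks as k grows; the "knee" is where the drop levels out
assert inertias[0] > inertias[-1]
```

Plotting `inertias` against `k` gives the elbow curve; the knee location is a judgment call, which is why the text hedges that results may differ across runs.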
279 | 280 | "cell_type": "markdown", |
280 | 281 | "metadata": {}, |
281 | 282 | "source": [ |
282 | | - "BTW, let's take a look at the top three clusters from this analysis…" |
| 283 | + "BTW, let's take a look at the K clusters from this analysis…" |
283 | 284 | ] |
284 | 285 | }, |
285 | 286 | { |
|
290 | 291 | "source": [ |
291 | 292 | "plt.scatter(\n", |
292 | 293 | " X[y_km == 0, 0], X[y_km == 0, 1],\n", |
293 | | - " s=100, c=\"lightgreen\",\n", |
| 294 | + " s=50, c=\"lightgreen\",\n", |
294 | 295 | " marker=\"s\", edgecolor=\"black\",\n", |
295 | | - " label=\"cluster 1\"\n", |
| 296 | + " label=\"cluster 0\"\n", |
296 | 297 | ")\n", |
297 | 298 | "\n", |
298 | 299 | "plt.scatter(\n", |
299 | 300 | " X[y_km == 1, 0], X[y_km == 1, 1],\n", |
300 | | - " s=100, c=\"orange\",\n", |
| 301 | + " s=50, c=\"orange\",\n", |
301 | 302 | " marker=\"o\", edgecolor=\"black\",\n", |
302 | | - " label=\"cluster 2\"\n", |
| 303 | + " label=\"cluster 1\"\n", |
303 | 304 | ")\n", |
304 | 305 | "\n", |
305 | 306 | "plt.scatter(\n", |
306 | 307 | " X[y_km == 2, 0], X[y_km == 2, 1],\n", |
307 | | - " s=100, c=\"lightblue\",\n", |
| 308 | + " s=50, c=\"lightblue\",\n", |
308 | 309 | " marker=\"v\", edgecolor=\"black\",\n", |
| 310 | + " label=\"cluster 2\"\n", |
| 311 | + ")\n", |
| 312 | + "\n", |
| 313 | + "plt.scatter(\n", |
| 314 | + " X[y_km == 3, 0], X[y_km == 3, 1],\n", |
| 315 | + " s=50, c=\"blue\",\n", |
| 316 | + " marker=\"^\", edgecolor=\"black\",\n", |
309 | 317 | " label=\"cluster 3\"\n", |
310 | 318 | ")\n", |
311 | 319 | "\n", |
| 320 | + "plt.scatter(\n", |
| 321 | + " X[y_km == 4, 0], X[y_km == 4, 1],\n", |
| 322 | + " s=50, c=\"yellow\",\n", |
| 323 | + " marker=\"<\", edgecolor=\"black\",\n", |
| 324 | + " label=\"cluster 4\"\n", |
| 325 | + ")\n", |
| 326 | + "\n", |
| 327 | + "plt.scatter(\n", |
| 328 | + " X[y_km == 5, 0], X[y_km == 5, 1],\n", |
| 329 | + " s=50, c=\"purple\",\n", |
| 330 | + " marker=\">\", edgecolor=\"black\",\n", |
| 331 | + " label=\"cluster 5\"\n", |
| 332 | + ")\n", |
| 333 | + "\n", |
| 334 | + "plt.scatter(\n", |
| 335 | + " X[y_km == 6, 0], X[y_km == 6, 1],\n", |
| 336 | + " s=50, c=\"brown\",\n", |
| 337 | + " marker=\"X\", edgecolor=\"black\",\n", |
| 338 | + " label=\"cluster 6\"\n", |
| 339 | + ")\n", |
| 340 | + "\n", |
312 | 341 | "# plot the centroids\n", |
313 | 342 | "plt.scatter(\n", |
314 | 343 | " km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],\n", |
315 | 344 | " s=250, marker=\"*\",\n", |
316 | 345 | " c=\"red\", edgecolor=\"black\",\n", |
317 | | - " label=\"centroids\"\n", |
| 346 | + " label=\"centers\"\n", |
318 | 347 | ")\n", |
319 | 348 | "\n", |
320 | 349 | "plt.legend(scatterpoints=1)\n", |
|
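The seven nearly identical `plt.scatter` calls above can be collapsed into a loop over per-cluster colors and markers. A sketch of the equivalent pattern, with toy stand-in data (in the notebook, `X`, `km`, and `y_km` come from the earlier cells; the color/marker lists mirror the ones used above):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# toy stand-in data for the notebook's X / y_km
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
km = KMeans(n_clusters=7, n_init=10, random_state=0).fit(X)
y_km = km.labels_

colors = ["lightgreen", "orange", "lightblue", "blue", "yellow", "purple", "brown"]
markers = ["s", "o", "v", "^", "<", ">", "X"]

# one scatter call per cluster, driven by the label index
for i, (c, m) in enumerate(zip(colors, markers)):
    plt.scatter(X[y_km == i, 0], X[y_km == i, 1],
                s=50, c=c, marker=m, edgecolor="black",
                label=f"cluster {i}")

# plot the centers
plt.scatter(km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],
            s=250, marker="*", c="red", edgecolor="black", label="centers")
plt.legend(scatterpoints=1)
```

Beyond brevity, the loop keeps the cluster labels in sync with the `y_km` indices automatically, avoiding the off-by-one labeling that the diff above had to correct by hand.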
326 | 355 | "cell_type": "markdown", |
327 | 356 | "metadata": {}, |
328 | 357 | "source": [ |
329 | | - "Not bad. Those clusters show some separation, at least along those three most dominant dimensions." |
| 358 | + "Not bad: based on the centers, these clusters show some separation, at least when plotted in two dimensions." |
330 | 359 | ] |
331 | 360 | }, |
332 | 361 | { |
|
830 | 859 | "metadata": {}, |
831 | 860 | "outputs": [], |
832 | 861 | "source": [ |
833 | | - "TRAIN_ITER = 30\n", |
| 862 | + "TRAIN_ITER = 20\n", |
834 | 863 | "\n", |
835 | 864 | "df = pd.DataFrame(columns=[ \"min_reward\", \"avg_reward\", \"max_reward\", \"steps\", \"checkpoint\"])\n", |
836 | 865 | "status = \"reward {:6.2f} {:6.2f} {:6.2f} len {:4.2f} saved {}\"\n", |
|
940 | 969 | "AGENT.restore(BEST_CHECKPOINT)\n", |
941 | 970 | "history = []\n", |
942 | 971 | "\n", |
943 | | - "for episode_reward in run_rollout(AGENT, env, n_iter=100, verbose=False):\n", |
| 972 | + "for episode_reward in run_rollout(AGENT, env, n_iter=500, verbose=False):\n", |
944 | 973 | " history.append(episode_reward)\n", |
945 | 974 | " \n", |
946 | 975 | "print(\"average reward:\", round(sum(history) / len(history), 3))" |
|
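The rollout cell above averages episode rewards yielded by a generator. The same pattern, shown with a hypothetical stub standing in for `run_rollout` (the real function is defined earlier in the notebook and steps the trained agent through the environment):

```python
import random

def run_rollout_stub(n_iter, seed=1):
    # stand-in for the notebook's run_rollout(): yields one total reward per episode
    rng = random.Random(seed)
    for _ in range(n_iter):
        yield rng.uniform(0.0, 10.0)

history = []
for episode_reward in run_rollout_stub(n_iter=500):
    history.append(episode_reward)

avg = round(sum(history) / len(history), 3)
```

Raising `n_iter` from 100 to 500, as the diff does, simply tightens this average by sampling more episodes.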
984 | 1013 | "cell_type": "markdown", |
985 | 1014 | "metadata": {}, |
986 | 1015 | "source": [ |
987 | | - "## Evaluate Learning with TensorBoard\n", |
| 1016 | + "## Evaluate learning with TensorBoard\n", |
988 | 1017 | "\n", |
989 | 1018 | "You also can run [TensorBoard](https://www.tensorflow.org/tensorboard) to visualize the RL training metrics from the log files. The results during training were written to a directory under `$HOME/ray_results`\n", |
990 | 1019 | "\n", |
|
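Since the text notes that results land under `$HOME/ray_results`, pointing TensorBoard at that directory is one command (assuming `tensorboard` is installed in the active environment):

```shell
tensorboard --logdir=$HOME/ray_results
```

Then open the URL that TensorBoard prints (typically `http://localhost:6006`) to browse the training metrics.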
1003 | 1032 | "cell_type": "markdown", |
1004 | 1033 | "metadata": {}, |
1005 | 1034 | "source": [ |
| 1035 | + "## Exercises\n", |
| 1036 | + "\n", |
| 1037 | + "For the exercises, there are several ways to modify the Gym environment or the RLlib training parameters, then compare how the outcomes differ:\n", |
| 1038 | + "\n", |
| 1039 | + " 1. Re-run using smaller and larger K values\n", |
| 1040 | + " 2. Adjust the rewards for depleted and unrated actions\n", |
| 1041 | + " 3. Increase the number of training iterations\n", |
| 1042 | + " 4. Compare use of the other dataset partitions during rollout: `\"jester-data-2.csv\"` or `\"jester-data-3.csv\"`\n", |
| 1043 | + "\n", |
| 1044 | + "For each of these variations compare:\n", |
| 1045 | + "\n", |
| 1046 | + " * baseline with random actions \n", |
| 1047 | + " * baseline with the naïve strategy\n", |
| 1048 | + " * predicted average reward from training\n", |
| 1049 | + " * stats from the rollout\n", |
| 1050 | + "\n", |
| 1051 | + "Let's discuss the results as a group.\n", |
| 1052 | + "\n", |
| 1053 | + "Other questions to discuss:\n", |
| 1054 | + "\n", |
| 1055 | + " 1. In what ways could the \"warm start\" be improved?\n", |
| 1056 | + " 2. How could this code be modified to scale to millions of users? Or to thousands of items?" |
| 1057 | + ] |
| 1058 | + }, |
| 1059 | + { |
| 1060 | + "cell_type": "markdown", |
| 1061 | + "metadata": {}, |
| 1062 | + "source": [ |
| 1063 | + "## Clean up\n", |
| 1064 | + "\n", |
1006 | 1065 | "Finally, let's shutdown Ray gracefully:" |
1007 | 1066 | ] |
1008 | 1067 | }, |
|
1032 | 1091 | "name": "python", |
1033 | 1092 | "nbconvert_exporter": "python", |
1034 | 1093 | "pygments_lexer": "ipython3", |
1035 | | - "version": "3.7.7" |
| 1094 | + "version": "3.7.4" |
1036 | 1095 | } |
1037 | 1096 | }, |
1038 | 1097 | "nbformat": 4, |
|