|
17 | 17 | "cell_type": "markdown", |
18 | 18 | "metadata": {}, |
19 | 19 | "source": [ |
20 | | - "The full source code for this example recommender system is also in `recsys.py`. You can run it with the default settings, e.g., to exercise the code, use the following command:\n", |
| 20 | + "For reference, the GitHub public repo for this code is available at <https://github.com/anyscale/academy/blob/master/ray-rllib/recsys>, and the full source code for this example recommender system is in the `recsys.py` script. You can run it with default settings to exercise the code:\n", |
21 | 21 | "\n", |
22 | 22 | "```shell\n", |
23 | 23 | "python recsys.py\n", |
|
108 | 108 | "outputs": [], |
109 | 109 | "source": [ |
110 | 110 | "from pathlib import Path\n", |
| 111 | + "import os\n", |
111 | 112 | "import pandas as pd\n", |
112 | 113 | "\n", |
113 | | - "DATA_PATH = Path(\"jester-data-1.csv\")\n", |
| 114 | + "DATA_PATH = Path(os.getcwd()) / Path(\"jester-data-1.csv\")\n", |
114 | 115 | "sample = load_data(DATA_PATH)\n", |
115 | 116 | "\n", |
116 | 117 | "df = pd.DataFrame(sample)\n", |
|
212 | 213 | "cell_type": "markdown", |
213 | 214 | "metadata": {}, |
214 | 215 | "source": [ |
215 | | - "That plot shows a \"knee\" in the curve at `k=12` where the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~8% of the items. That choice has an inherent trade-off:\n", |
| 216 | + "This kind of cluster analysis has stochastic aspects, so results may differ on different runs. Generally, the plot shows a \"knee\" in the curve near `k=7` as the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~14% of the items. That choice has an inherent trade-off:\n", |
216 | 217 | "\n", |
217 | 218 | " * too few clusters → poor predictions (less accuracy)\n", |
218 | 219 | " * too many clusters → poor predictive power (less recall)\n", |
219 | 220 | "\n", |
220 | | - "Now we can run K-means in `scikit-learn` with that hyperparameter `k=12` to get the clusters that we'll use in our RL environment:" |
| 221 | + "Now we can run K-means in `scikit-learn` with that hyperparameter `k=7` to get the clusters that we'll use in our RL environment:" |
221 | 222 | ] |
222 | 223 | }, |
223 | 224 | { |
|
226 | 227 | "metadata": {}, |
227 | 228 | "outputs": [], |
228 | 229 | "source": [ |
229 | | - "K_CLUSTERS = 12\n", |
| 230 | + "K_CLUSTERS = 7\n", |
230 | 231 | "\n", |
231 | 232 | "km = KMeans(n_clusters=K_CLUSTERS)\n", |
232 | 233 | "km.fit(X)\n", |
|
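The `k=7` choice above comes from an elbow ("knee") analysis of clustering error. A minimal sketch of how such a curve can be produced, using a toy stand-in for the notebook's feature matrix `X` (the real `X` comes from earlier cells):

```python
import numpy as np
from sklearn.cluster import KMeans

# toy stand-in for the notebook's feature matrix X
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))

# within-cluster sum of squares (inertia) for a range of k values
inertias = []
for k in range(1, 15):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)

# the error always shrinks as k grows; the "knee" is where the drop levels out
assert inertias[0] > inertias[-1]
```

Plotting `inertias` against `k` gives the elbow curve; the knee location is a judgment call, which is why the text hedges that results may differ across runs.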
279 | 280 | "cell_type": "markdown", |
280 | 281 | "metadata": {}, |
281 | 282 | "source": [ |
282 | | - "BTW, let's take a look at the top three clusters from this analysis…" |
| 283 | + "BTW, let's take a look at the K clusters from this analysis…" |
283 | 284 | ] |
284 | 285 | }, |
285 | 286 | { |
|
290 | 291 | "source": [ |
291 | 292 | "plt.scatter(\n", |
292 | 293 | " X[y_km == 0, 0], X[y_km == 0, 1],\n", |
293 | | - " s=100, c=\"lightgreen\",\n", |
| 294 | + " s=50, c=\"lightgreen\",\n", |
294 | 295 | " marker=\"s\", edgecolor=\"black\",\n", |
295 | | - " label=\"cluster 1\"\n", |
| 296 | + " label=\"cluster 0\"\n", |
296 | 297 | ")\n", |
297 | 298 | "\n", |
298 | 299 | "plt.scatter(\n", |
299 | 300 | " X[y_km == 1, 0], X[y_km == 1, 1],\n", |
300 | | - " s=100, c=\"orange\",\n", |
| 301 | + " s=50, c=\"orange\",\n", |
301 | 302 | " marker=\"o\", edgecolor=\"black\",\n", |
302 | | - " label=\"cluster 2\"\n", |
| 303 | + " label=\"cluster 1\"\n", |
303 | 304 | ")\n", |
304 | 305 | "\n", |
305 | 306 | "plt.scatter(\n", |
306 | 307 | " X[y_km == 2, 0], X[y_km == 2, 1],\n", |
307 | | - " s=100, c=\"lightblue\",\n", |
| 308 | + " s=50, c=\"lightblue\",\n", |
308 | 309 | " marker=\"v\", edgecolor=\"black\",\n", |
| 310 | + " label=\"cluster 2\"\n", |
| 311 | + ")\n", |
| 312 | + "\n", |
| 313 | + "plt.scatter(\n", |
| 314 | + " X[y_km == 3, 0], X[y_km == 3, 1],\n", |
| 315 | + " s=50, c=\"blue\",\n", |
| 316 | + " marker=\"^\", edgecolor=\"black\",\n", |
309 | 317 | " label=\"cluster 3\"\n", |
310 | 318 | ")\n", |
311 | 319 | "\n", |
| 320 | + "plt.scatter(\n", |
| 321 | + " X[y_km == 4, 0], X[y_km == 4, 1],\n", |
| 322 | + " s=50, c=\"yellow\",\n", |
| 323 | + " marker=\"<\", edgecolor=\"black\",\n", |
| 324 | + " label=\"cluster 4\"\n", |
| 325 | + ")\n", |
| 326 | + "\n", |
| 327 | + "plt.scatter(\n", |
| 328 | + " X[y_km == 5, 0], X[y_km == 5, 1],\n", |
| 329 | + " s=50, c=\"purple\",\n", |
| 330 | + " marker=\">\", edgecolor=\"black\",\n", |
| 331 | + " label=\"cluster 5\"\n", |
| 332 | + ")\n", |
| 333 | + "\n", |
| 334 | + "plt.scatter(\n", |
| 335 | + " X[y_km == 6, 0], X[y_km == 6, 1],\n", |
| 336 | + " s=50, c=\"brown\",\n", |
| 337 | + " marker=\"X\", edgecolor=\"black\",\n", |
| 338 | + " label=\"cluster 6\"\n", |
| 339 | + ")\n", |
| 340 | + "\n", |
312 | 341 | "# plot the centroids\n", |
313 | 342 | "plt.scatter(\n", |
314 | 343 | " km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],\n", |
315 | 344 | " s=250, marker=\"*\",\n", |
316 | 345 | " c=\"red\", edgecolor=\"black\",\n", |
317 | | - " label=\"centroids\"\n", |
| 346 | + " label=\"centers\"\n", |
318 | 347 | ")\n", |
319 | 348 | "\n", |
320 | 349 | "plt.legend(scatterpoints=1)\n", |
|
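The seven nearly identical `plt.scatter` calls above can be collapsed into a loop over per-cluster colors and markers. A sketch of the equivalent pattern, with toy stand-in data (in the notebook, `X`, `km`, and `y_km` come from the earlier cells; the color/marker lists mirror the ones used above):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# toy stand-in data for the notebook's X / y_km
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
km = KMeans(n_clusters=7, n_init=10, random_state=0).fit(X)
y_km = km.labels_

colors = ["lightgreen", "orange", "lightblue", "blue", "yellow", "purple", "brown"]
markers = ["s", "o", "v", "^", "<", ">", "X"]

# one scatter call per cluster, driven by the label index
for i, (c, m) in enumerate(zip(colors, markers)):
    plt.scatter(X[y_km == i, 0], X[y_km == i, 1],
                s=50, c=c, marker=m, edgecolor="black",
                label=f"cluster {i}")

# plot the centers
plt.scatter(km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],
            s=250, marker="*", c="red", edgecolor="black", label="centers")
plt.legend(scatterpoints=1)
```

Beyond brevity, the loop keeps the cluster labels in sync with the `y_km` indices automatically, avoiding the off-by-one labeling that the diff above had to correct by hand.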
326 | 355 | "cell_type": "markdown", |
327 | 356 | "metadata": {}, |
328 | 357 | "source": [ |
329 | | - "Not bad. Those clusters show some separation, at least along those three most dominant dimensions." |
| 358 | + "Not bad: based on the centers, these clusters show some separation, at least when plotted in two dimensions." |
330 | 359 | ] |
331 | 360 | }, |
332 | 361 | { |
|
830 | 859 | "metadata": {}, |
831 | 860 | "outputs": [], |
832 | 861 | "source": [ |
833 | | - "TRAIN_ITER = 30\n", |
| 862 | + "TRAIN_ITER = 20\n", |
834 | 863 | "\n", |
835 | 864 | "df = pd.DataFrame(columns=[ \"min_reward\", \"avg_reward\", \"max_reward\", \"steps\", \"checkpoint\"])\n", |
836 | 865 | "status = \"reward {:6.2f} {:6.2f} {:6.2f} len {:4.2f} saved {}\"\n", |
|
940 | 969 | "AGENT.restore(BEST_CHECKPOINT)\n", |
941 | 970 | "history = []\n", |
942 | 971 | "\n", |
943 | | - "for episode_reward in run_rollout(AGENT, env, n_iter=100, verbose=False):\n", |
| 972 | + "for episode_reward in run_rollout(AGENT, env, n_iter=500, verbose=False):\n", |
944 | 973 | " history.append(episode_reward)\n", |
945 | 974 | " \n", |
946 | 975 | "print(\"average reward:\", round(sum(history) / len(history), 3))" |
|
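The rollout cell above averages episode rewards yielded by a generator. The same pattern, shown with a hypothetical stub standing in for `run_rollout` (the real function is defined earlier in the notebook and steps the trained agent through the environment):

```python
import random

def run_rollout_stub(n_iter, seed=1):
    # stand-in for the notebook's run_rollout(): yields one total reward per episode
    rng = random.Random(seed)
    for _ in range(n_iter):
        yield rng.uniform(0.0, 10.0)

history = []
for episode_reward in run_rollout_stub(n_iter=500):
    history.append(episode_reward)

avg = round(sum(history) / len(history), 3)
```

Raising `n_iter` from 100 to 500, as the diff does, simply tightens this average by sampling more episodes.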
984 | 1013 | "cell_type": "markdown", |
985 | 1014 | "metadata": {}, |
986 | 1015 | "source": [ |
987 | | - "## Evaluate Learning with TensorBoard\n", |
| 1016 | + "## Evaluate learning with TensorBoard\n", |
988 | 1017 | "\n", |
989 | 1018 | "You also can run [TensorBoard](https://www.tensorflow.org/tensorboard) to visualize the RL training metrics from the log files. The results during training were written to a directory under `$HOME/ray_results`\n", |
990 | 1019 | "\n", |
|
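Since the text notes that results land under `$HOME/ray_results`, pointing TensorBoard at that directory is one command (assuming `tensorboard` is installed in the active environment):

```shell
tensorboard --logdir=$HOME/ray_results
```

Then open the URL that TensorBoard prints (typically `http://localhost:6006`) to browse the training metrics.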
1003 | 1032 | "cell_type": "markdown", |
1004 | 1033 | "metadata": {}, |
1005 | 1034 | "source": [ |
| 1035 | + "## Exercises\n", |
| 1036 | + "\n", |
| 1037 | + "For the exercises, there are several ways to modify the Gym environment or the RLlib training parameters, then compare how the outcomes differ:\n", |
| 1038 | + "\n", |
| 1039 | + " 1. Re-run using smaller and larger K values\n", |
| 1040 | + " 2. Adjust the rewards for depleted and unrated actions\n", |
| 1041 | + " 3. Increase the number of training iterations\n", |
| 1042 | + " 4. Compare use of the other dataset partitions during rollout: `\"jester-data-2.csv\"` or `\"jester-data-3.csv\"`\n", |
| 1043 | + "\n", |
| 1044 | + "For each of these variations compare:\n", |
| 1045 | + "\n", |
| 1046 | + " * baseline with random actions \n", |
| 1047 | + " * baseline with the naïve strategy\n", |
| 1048 | + " * predicted average reward from training\n", |
| 1049 | + " * stats from the rollout\n", |
| 1050 | + "\n", |
| 1051 | + "Let's discuss the results as a group.\n", |
| 1052 | + "\n", |
| 1053 | + "Other questions to discuss:\n", |
| 1054 | + "\n", |
| 1055 | + " 1. In what ways could the \"warm start\" be improved?\n", |
| 1056 | + " 2. How could this code be modified to scale to millions of users? Or to thousands of items?" |
| 1057 | + ] |
| 1058 | + }, |
| 1059 | + { |
| 1060 | + "cell_type": "markdown", |
| 1061 | + "metadata": {}, |
| 1062 | + "source": [ |
| 1063 | + "## Clean up\n", |
| 1064 | + "\n", |
1006 | 1065 | "Finally, let's shutdown Ray gracefully:" |
1007 | 1066 | ] |
1008 | 1067 | }, |
|
1032 | 1091 | "name": "python", |
1033 | 1092 | "nbconvert_exporter": "python", |
1034 | 1093 | "pygments_lexer": "ipython3", |
1035 | | - "version": "3.7.7" |
| 1094 | + "version": "3.7.4" |
1036 | 1095 | } |
1037 | 1096 | }, |
1038 | 1097 | "nbformat": 4, |
|