UW-Madison-DataScience
diff --git a/notebooks/01-introduction.ipynb b/notebooks/01-introduction.ipynb
Lines changed: 4 additions & 4 deletions
diff --git a/notebooks/02-regression.ipynb b/notebooks/02-regression.ipynb
Lines changed: 6 additions & 6 deletions
diff --git a/notebooks/03-classification.ipynb b/notebooks/03-classification.ipynb
Lines changed: 12 additions & 12 deletions
diff --git a/notebooks/04-ensemble-methods.ipynb b/notebooks/04-ensemble-methods.ipynb
Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "50eec27f",
+   "id": "5950f81b",
    "metadata": {},
    "source": [
     "---\n",
@@ -44,7 +44,7 @@
     "\n",
     "The image below shows the relationships between artificial intelligence, machine learning and deep learning.\n",
     "\n",
-    "![An infographic showing some of the relationships between AI, ML, and DL](fig/introduction/AI_ML_DL_differences.png)\n",
+    "![An infographic showing some of the relationships between AI, ML, and DL](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/introduction/AI_ML_DL_differences.png)\n",
     "The image above is by Tukijaaliwa, CC BY-SA 4.0, via Wikimedia Commons, original source\n",
     "\n",
     "\n",
@@ -126,7 +126,7 @@
     "\n",
     "If we want our ML models to make predictions or classifications, we also provide \"labels\" as our expected \"answers/results\". The model will then be trained on the input features to try and match our provided labels. This is done by providing a \"Target Array\" (usually referred to as the code variable `y`) which contains the \"labels or values\" that we wish to predict using the features data.\n",
     "\n",
-    "![Types of Machine Learning](fig/introduction/sklearn_input.png)\n",
+    "![Types of Machine Learning](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/introduction/sklearn_input.png)\n",
     "Figure from the [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook)\n",
     "\n",
     "# What will we cover today?\n",
@@ -135,7 +135,7 @@
     "\n",
     "The figure below provides a nice overview of some of the sub-domains of ML and the techniques used within each sub-domain. We recommend checking out the Scikit-Learn [webpage](https://scikit-learn.org/stable/index.html) for additional examples of the topics we will cover in this lesson. We will cover topics highlighted in blue: classical learning techniques such as regression, classification, clustering, and dimension reduction, as well as ensemble methods and a brief introduction to neural networks using perceptrons.\n",
     "\n",
-    "![Types of Machine Learning](fig/introduction/ML_summary.png)\n",
+    "![Types of Machine Learning](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/introduction/ML_summary.png)\n",
     "[Image from Vasily Zubarev via their blog](https://vas3k.com/blog/machine_learning/) with modifications in blue to denote lesson content.\n",
     "\n",
     "{% include links.md %}\n",
 
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "53247487",
+   "id": "c26c7a74",
    "metadata": {},
    "source": [
     "---\n",
@@ -48,7 +48,7 @@
     "\n",
     "Regression can be as simple as drawing a \"line of best fit\" through data points, known as linear regression, or more complex models such as polynomial regression, and is used routinely around the world in both industry and research. You may have already used regression in the past without knowing that it is also considered a machine learning technique!\n",
     "\n",
-    "![Example of linear and polynomial regressions](fig/regression_example.png)\n",
+    "![Example of linear and polynomial regressions](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/regression_example.png)\n",
     "\n",
     "## Linear regression using Scikit-Learn\n",
     "\n",
@@ -95,7 +95,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Comparison of the regressions of our dataset](fig/penguin_regression.png)\n",
+    "![Comparison of the regressions of our dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/penguin_regression.png)\n",
     "\n",
     "In this regression example we will create a Linear Regression model that will try to predict `y` values based upon `x` values.\n",
     "\n",
@@ -180,7 +180,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Comparison of the regressions of our dataset](fig/regress_penguin_lin.png)\n",
+    "![Comparison of the regressions of our dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/regress_penguin_lin.png)\n",
     "\n",
     "\n",
     "\n",
@@ -222,7 +222,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Comparison of the regressions of our dataset](fig/penguin_regression_all.png)\n",
+    "![Comparison of the regressions of our dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/penguin_regression_all.png)\n",
     "\n",
     "Oh dear. It looks like our linear regression fits okay for our subset of the penguin data, and a few additional samples, but there appears to be a cluster of points that are poorly predicted by our model. Even if we re-trained our model using all samples it looks unlikely that our model would perform much better due to the two-cluster nature of our dataset.\n",
     "\n",
@@ -378,7 +378,7 @@
     "{: .language-python}\n",
     "\n",
     "\n",
-    "![Comparison of the regressions of our dataset](fig/penguin_regression_poly.png)\n",
+    "![Comparison of the regressions of our dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/penguin_regression_poly.png)\n",
     "\n",
     "::::::::::::::::::::::::::::::::::::: challenge\n",
     "\n",
 
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "7c7f9ca5",
+   "id": "2433c099",
    "metadata": {},
    "source": [
     "---\n",
@@ -34,10 +34,10 @@
     "## The penguins dataset\n",
     "We're going to be using the penguins dataset of Allison Horst, published [here](https://github.com/allisonhorst/palmerpenguins), The dataset contains 344 size measurements for three penguin species (Chinstrap, Gentoo and Adélie) observed on three islands in the Palmer Archipelago, Antarctica.\n",
     "\n",
-    "![*Artwork by @allison_horst*](fig/palmer_penguins.png)\n",
+    "![*Artwork by @allison_horst*](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/palmer_penguins.png)\n",
     "\n",
     "The physical attributes measured are flipper length, beak length, beak width, body mass, and sex.\n",
-    "![*Artwork by @allison_horst*](fig/culmen_depth.png)\n",
+    "![*Artwork by @allison_horst*](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/culmen_depth.png)\n",
     "\n",
     "In other words, the dataset contains 344 rows with 7 features i.e. 5 physical attributes, species and the island where the observations were made.\n",
     "\n",
@@ -126,7 +126,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Visualising the penguins dataset](fig/e3_penguins_vis.png)\n",
+    "![Visualising the penguins dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_penguins_vis.png)\n",
     "\n",
     "As there are four measurements for each penguin, we need quite a few plots to visualise all four dimensions against each other. Here is a handy Seaborn function to do so:\n",
     "\n",
@@ -136,15 +136,15 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Visualising the penguins dataset](fig/pairplot.png)\n",
+    "![Visualising the penguins dataset](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/pairplot.png)\n",
     "\n",
     "We can see that penguins from each species form fairly distinct spatial clusters in these plots, so that you could draw lines between those clusters to delineate each species. This is effectively what many classification algorithms do. They use the training data to delineate the observation space, in this case the 4 measurement dimensions, into classes. When given a new observation, the model finds which of those class areas the new observation falls in to.\n",
     "\n",
     "\n",
     "## Classification using a decision tree\n",
     "We'll first apply a decision tree classifier to the data. Decisions trees are conceptually similar to flow diagrams (or more precisely for the biologists: dichotomous keys). They split the classification problem into a binary tree of comparisons, at each step comparing a measurement to a value, and moving left or right down the tree until a classification is reached.\n",
     "\n",
-    "![Decision tree for classifying penguins](fig/decision_tree_example.png)\n",
+    "![Decision tree for classifying penguins](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/decision_tree_example.png)\n",
     "\n",
     "\n",
     "Training and using a decision tree in Scikit-Learn is straightforward:\n",
@@ -183,7 +183,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Decision tree for classifying penguins](fig/e3_dt_2.png)\n",
+    "![Decision tree for classifying penguins](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_dt_2.png)\n",
     "\n",
     "The first first question (`depth=1`) splits the training data into \"Adelie\" and \"Gentoo\" categories using the criteria `flipper_length_mm <= 206.5`, and the next two questions (`depth=2`) split the \"Adelie\" and \"Gentoo\" categories into \"Adelie & Chinstrap\" and \"Gentoo & Chinstrap\" predictions. \n",
     "\n",
@@ -214,7 +214,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Classification space for our decision tree](fig/e3_dt_space_2.png)\n",
+    "![Classification space for our decision tree](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_dt_space_2.png)\n",
     "\n",
     "## Tuning the `max_depth` hyperparameter\n",
     "\n",
@@ -244,7 +244,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Performance of decision trees of various depths](fig/e3_dt_overfit.png)\n",
+    "![Performance of decision trees of various depths](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_dt_overfit.png)\n",
     "\n",
     "Here we can see that a `max_depth=2` performs slightly better on the test data than those with `max_depth > 2`. This can seem counter intuitive, as surely more questions should be able to better split up our categories and thus give better predictions?\n",
     "\n",
@@ -260,7 +260,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Simplified decision tree](fig/e3_dt_6.png)\n",
+    "![Simplified decision tree](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_dt_6.png)\n",
     "\n",
     "It looks like our decision tree has split up the training data into the correct penguin categories and more accurately than the `max_depth=2` model did, however it used some very specific questions to split up the penguins into the correct categories. Let's try visualising the classification space for a more intuitive understanding:\n",
     "~~~\n",
@@ -277,7 +277,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Classification space of the simplified decision tree](fig/e3_dt_space_6.png)\n",
+    "![Classification space of the simplified decision tree](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_dt_space_6.png)\n",
     "\n",
     "Earlier we saw that the `max_depth=2` model split the data into 3 simple bounding boxes, whereas for `max_depth=5` we see the model has created some very specific classification boundaries to correctly classify every point in the training data.\n",
     "\n",
@@ -454,7 +454,7 @@
     "- **`C`**: Balances smoothness of the decision boundary and misclassifications; start with `C=1`, increase for tighter boundaries, decrease to prevent overfitting.\n",
     "\n",
     "\n",
-    "![Classification space generated by the SVM model](fig/e3_svc_space.png)\n",
+    "![Classification space generated by the SVM model](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/e3_svc_space.png)\n",
     "\n",
     "While this SVM model performs slightly worse than our decision tree (95.6% vs. 98.5%), it's likely that the non-linear boundaries will perform better when exposed to more and more real data, as decision trees are prone to overfitting and requires complex linear models to reproduce simple non-linear boundaries. It's important to pick a model that is appropriate for your problem and data trends!\n",
     "\n",
 
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "f32a0da4",
+   "id": "d442dc97",
    "metadata": {},
    "source": [
     "---\n",
@@ -47,7 +47,7 @@
     "\n",
     "If we trained the same model multiple times on the same data we would expect very similar answers, and so the emphasis with stacking is to choose different models that can be used to build up a reliable concensus. Regression is then typically a good choice for the final decision-making model.\n",
     "\n",
-    "![Stacking](fig/stacking.jpeg)\n",
+    "![Stacking](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/stacking.jpeg)\n",
     "\n",
     "[Image from Vasily Zubarev via their blog](https://vas3k.com/blog/machine_learning/)\n",
     "\n",
@@ -59,7 +59,7 @@
     "\n",
     "The most common example is known as the Random Forest algorithm, which we'll take a look at later on. Random Forests are typically used as a faster, computationally cheaper alternative to Neural Networks, which is ideal for real-time applications like camera face detection prompts.\n",
     "\n",
-    "![Stacking](fig/bagging.jpeg)\n",
+    "![Stacking](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/bagging.jpeg)\n",
     "\n",
     "[Image from Vasily Zubarev via their blog](https://vas3k.com/blog/machine_learning/)\n",
     "\n",
@@ -69,7 +69,7 @@
     "\n",
     "Just like for bagging, boosting is trained mostly on subsets, however in this case these subsets are not randomly generated but are instead built using poorly estimated predictions. Boosting can produce some very high accuracies by learning from it's mistakes, but due to the iterative nature of these improvements it doesn't parallelize well unlike the other ensemble methods. Despite this it can still be a faster, and computationally cheaper alternative to Neural Networks.\n",
     "\n",
-    "![Stacking](fig/boosting.jpeg)\n",
+    "![Stacking](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/boosting.jpeg)\n",
     "\n",
     "[Image from Vasily Zubarev via their blog](https://vas3k.com/blog/machine_learning/)\n",
     "\n",
@@ -125,7 +125,7 @@
     "\n",
     "Random forests are built on decision trees and can provide another way to address over-fitting. Rather than classifying based on one single decision tree (which could overfit the data), an average of results of many trees can be derived for more robust/accurate estimates compared against single trees used in the ensemble.\n",
     "\n",
-    "![Random Forests](fig/randomforest.png) \n",
+    "![Random Forests](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/randomforest.png) \n",
     "\n",
     "[Image from Venkatak Jagannath](https://commons.wikimedia.org/wiki/File:Random_forest_diagram_complete.png)\n",
     "\n",
@@ -169,7 +169,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![random forest trees](fig/rf_5_trees.png)\n",
+    "![random forest trees](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/rf_5_trees.png)\n",
     "\n",
     "We can see the first 5 (of 100) trees that were fitted as part of the forest. \n",
     "\n",
@@ -193,7 +193,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![random forest clf space](fig/EM_rf_clf_space.png)\n",
+    "![random forest clf space](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/EM_rf_clf_space.png)\n",
     "\n",
     "There is still some overfitting indicated by the regions that contain only single points but using the same hyper-parameter settings used to fit the decision tree classifier, we can see that overfitting is reduced.\n",
     "\n",
@@ -329,7 +329,7 @@
     "~~~\n",
     "{: .language-python}\n",
     "\n",
-    "![Regressor predictions and average from stack](fig/house_price_voting_regressor.svg)\n",
+    "![Regressor predictions and average from stack](https://github.com/UW-Madison-DataScience/machine-learning-novice-sklearn/raw/gh-pages/episodes/fig/house_price_voting_regressor.svg)\n",
     "\n",
     "Finally, lets see how the average compares against each single estimator in the stack? \n",
     "\n",