
Commit 7a33657

Updated lesson 26 demo and activity
1 parent 55d9b96 commit 7a33657

File tree

2 files changed: +291, -48 lines


notebooks/unit4/lesson_26/Lesson_26_activity.ipynb

Lines changed: 24 additions & 15 deletions
@@ -94,7 +94,7 @@
 "source": [
 "### 1.2. Train test split\n",
 "\n",
-"Use `train_test_split` to split the data into training and testing sets. Use `random_state=42` for reproducibility."
+"Use `train_test_split` to split the data into training and testing sets. Use `random_state=315` for reproducibility."
 ]
 },
 {
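The updated split cell can be sketched as follows (placeholder arrays stand in for the notebook's real data, which is not shown in this diff):

```python
# Sketch of the train/test split with the new seed; X and y are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(40).reshape(20, 2)   # 20 samples, 2 features (toy stand-in)
y = np.array([0, 1] * 10)          # binary labels (toy stand-in)

# random_state=315 pins the shuffle so every run yields the same split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=315)

print(X_train.shape, X_test.shape)  # default split is 75/25: (15, 2) (5, 2)
```

With the default `test_size`, a quarter of the rows land in the test set; fixing `random_state` is what makes the activity reproducible across runs.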
@@ -163,9 +163,7 @@
 "source": [
 "### 2.2. Test set evaluation\n",
 "\n",
-"For classification, we use accuracy instead of R². Use the model's `predict` method to get predictions and `score` method to get accuracy.\n",
-"\n",
-"**Hint:** `model.score(X, y)` returns the accuracy for classifiers."
+"For classification, we can use accuracy, F1 score, and/or AUC-ROC (among others) instead of R². Use sklearn's [`metrics`](https://scikit-learn.org/stable/api/sklearn.metrics.html) module."
 ]
 },
 {
@@ -177,7 +175,11 @@
 "source": [
 "logistic_predictions = # YOUR CODE HERE\n",
 "logistic_accuracy = # YOUR CODE HERE\n",
-"print(f'Logistic Regression accuracy on test set: {logistic_accuracy:.4f}')"
+"logistic_f1 = # YOUR CODE HERE\n",
+"logistic_auc = # YOUR CODE HERE\n",
+"print(f'Logistic regression accuracy on test set: {logistic_accuracy:.4f}')\n",
+"print(f'Logistic regression F1 score on test set: {logistic_f1:.4f}')\n",
+"print(f'Logistic regression AUC-ROC score on test set: {logistic_auc:.4f}')"
 ]
 },
 {
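A filled-in version of this cell could look like the sketch below. The toy data and `LogisticRegression` fit are placeholders for the notebook's own model and split; the key point is which `sklearn.metrics` function computes each score, and that AUC-ROC wants probabilities rather than hard labels:

```python
# Sketch: the three classification metrics on a held-out test set (toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # linearly separable toy labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=315)

model = LogisticRegression().fit(X_train, y_train)

logistic_predictions = model.predict(X_test)
logistic_accuracy = accuracy_score(y_test, logistic_predictions)
logistic_f1 = f1_score(y_test, logistic_predictions)
# AUC-ROC is computed from predicted probabilities, not hard labels
logistic_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(f'Logistic regression accuracy on test set: {logistic_accuracy:.4f}')
print(f'Logistic regression F1 score on test set: {logistic_f1:.4f}')
print(f'Logistic regression AUC-ROC score on test set: {logistic_auc:.4f}')
```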
@@ -187,7 +189,7 @@
 "source": [
 "### 2.3. Performance analysis\n",
 "\n",
-"For classification, we visualize performance using a confusion matrix."
+"For classification, visualize performance using a confusion matrix."
 ]
 },
 {
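The confusion-matrix step can be sketched like this (hand-picked labels for illustration; the notebook would use its own `y_test` and predictions, and `ConfusionMatrixDisplay` for the plot):

```python
# Sketch: a 2x2 confusion matrix from binary predictions.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred)
# rows = true class, columns = predicted class
print(cm)  # [[2 1]
           #  [1 2]]
# To render it as a plot (needs matplotlib):
#   from sklearn.metrics import ConfusionMatrixDisplay
#   ConfusionMatrixDisplay(cm).plot()
```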
@@ -207,14 +209,14 @@
 "source": [
 "## 3. Multilayer perceptron (MLP) classifier\n",
 "\n",
-"Now let's build a neural network classifier using `MLPClassifier`.\n",
+"Now let's build a neural network classifier using sklearn's [`MLPClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html).\n",
 "\n",
 "### 3.1. Single epoch training function\n",
 "\n",
 "Complete the training function below. It should:\n",
 "1. Split the data into training and validation sets\n",
 "2. Call `partial_fit` on the model (remember to pass `classes=[0, 1]` on the first call)\n",
-"3. Record training and validation accuracy in the history dictionary\n",
+"3. Record training and validation [`log_loss`](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html) (aka binary cross-entropy) in the history dictionary\n",
 "\n",
 "**Hint:** Use `model.partial_fit(X, y, classes=[0, 1])` for the first epoch. For subsequent epochs, `partial_fit` remembers the classes."
 ]
@@ -228,14 +230,17 @@
 "source": [
 "def train(model: MLPClassifier, df: pd.DataFrame, training_history: dict, classes: list = None) -> tuple[MLPClassifier, dict]:\n",
 " '''Trains sklearn MLP classifier model on given dataframe using validation split.\n",
-" Returns the updated model and training history dictionary.'''\n",
+" Returns the updated model and training history dictionary containing training and\n",
+" validation log loss. If classes are not provided, assumes 0 and 1.'''\n",
+"\n",
+" global features, label\n",
 "\n",
 " df, val_df = train_test_split(df, random_state=315)\n",
 " \n",
 " # YOUR CODE HERE: call partial_fit on the model\n",
 " # If classes is provided, pass it to partial_fit\n",
 " \n",
-" # YOUR CODE HERE: append training and validation accuracy to history\n",
+" # YOUR CODE HERE: append training and validation log loss to history\n",
 " \n",
 " return model, training_history"
 ]
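One way the completed function could look, sketched with placeholder globals (`features = ['x1', 'x2']`, `label = 'y'`) and toy data since the notebook's real columns are not in this diff:

```python
# Sketch of a completed train(): one partial_fit epoch plus log-loss tracking.
import numpy as np
import pandas as pd
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

features, label = ['x1', 'x2'], 'y'   # placeholder column names

def train(model, df, training_history, classes=None):
    '''Trains the MLP for one epoch on a train/validation split of df and
    appends training/validation log loss to training_history.'''
    if classes is None:
        classes = [0, 1]
    train_df, val_df = train_test_split(df, random_state=315)

    # partial_fit needs `classes` on the first call; repeating it is harmless
    model.partial_fit(train_df[features], train_df[label], classes=classes)

    # log_loss (binary cross-entropy) is computed from predicted probabilities
    training_history['training_loss'].append(
        log_loss(train_df[label], model.predict_proba(train_df[features])))
    training_history['validation_loss'].append(
        log_loss(val_df[label], model.predict_proba(val_df[features])))
    return model, training_history

# Quick smoke test on toy data
rng = np.random.default_rng(0)
df = pd.DataFrame({'x1': rng.normal(size=200), 'x2': rng.normal(size=200)})
df['y'] = (df['x1'] > 0).astype(int)

history = {'training_loss': [], 'validation_loss': []}
mlp = MLPClassifier(hidden_layer_sizes=(8,), random_state=315)
mlp, history = train(mlp, df, history, classes=[0, 1])
```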
@@ -267,8 +272,8 @@
 "epochs = 10\n",
 "\n",
 "training_history = {\n",
-" 'training_accuracy': [],\n",
-" 'validation_accuracy': []\n",
+" 'training_loss': [],\n",
+" 'validation_loss': []\n",
 "}\n",
 "\n",
 "mlp_model = # YOUR CODE HERE: create MLPClassifier\n",
@@ -285,7 +290,7 @@
 "source": [
 "### 3.3. Learning curves\n",
 "\n",
-"Plot the training and validation accuracy over epochs to visualize the learning process."
+"Plot the training and validation loss over epochs to visualize the learning process."
 ]
 },
 {
@@ -295,7 +300,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# YOUR CODE HERE: plot training and validation accuracy\n",
+"# YOUR CODE HERE: plot training and validation loss\n",
 "# Use plt.plot() for each curve\n",
 "# Add title, xlabel, ylabel, and legend"
 ]
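The plotting cell could be filled in roughly as below (dummy loss values and a headless backend so the sketch runs outside the notebook, where the figure would instead render inline):

```python
# Sketch: learning curves from the training_history dictionary.
import matplotlib
matplotlib.use('Agg')  # headless backend; not needed inside the notebook
import matplotlib.pyplot as plt

training_history = {
    'training_loss': [0.69, 0.55, 0.47, 0.42, 0.39],     # placeholder values
    'validation_loss': [0.70, 0.58, 0.52, 0.50, 0.49],   # placeholder values
}

plt.plot(training_history['training_loss'], label='training loss')
plt.plot(training_history['validation_loss'], label='validation loss')
plt.title('MLP learning curves')
plt.xlabel('Epoch')
plt.ylabel('Log loss')
plt.legend()
plt.savefig('learning_curves.png')
```

A widening gap between the two curves over epochs is the usual sign of overfitting.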
@@ -319,7 +324,11 @@
 "source": [
 "mlp_predictions = # YOUR CODE HERE\n",
 "mlp_accuracy = # YOUR CODE HERE\n",
-"print(f'MLP accuracy on test set: {mlp_accuracy:.4f}')"
+"mlp_f1 = # YOUR CODE HERE\n",
+"mlp_auc = # YOUR CODE HERE\n",
+"print(f'MLP accuracy on test set: {mlp_accuracy:.4f}')\n",
+"print(f'MLP F1 score on test set: {mlp_f1:.4f}')\n",
+"print(f'MLP AUC-ROC score on test set: {mlp_auc:.4f}')"
 ]
 },
 {

notebooks/unit4/lesson_26/Lesson_26_demo.ipynb

Lines changed: 267 additions & 33 deletions
Large diffs are not rendered by default.
