Commit e2d34d7

Merge pull request #111 from gperdrizet/dev
Updated lesson 12, 13 and 14 solution notebooks
2 parents 657eb98 + 4f84ff6 commit e2d34d7

6 files changed: +1434 -116 lines changed

notebooks/notebooks.yml

Lines changed: 3 additions & 1 deletion
@@ -133,4 +133,6 @@ units:
   - name: "In class demo"
     file: "Lesson_14_demo.ipynb"
   - name: "Activity"
-    file: "Lesson_14_activity.ipynb"
+    file: "Lesson_14_activity.ipynb"
+  - name: "Activity solution"
+    file: "Lesson_14_activity_solution.ipynb"

notebooks/unit2/lesson_12/Lesson_12_activity_solution.ipynb

Lines changed: 211 additions & 20 deletions
Large diffs are not rendered by default.

notebooks/unit2/lesson_13/Lesson_13_activity_solution.ipynb

Lines changed: 77 additions & 10 deletions
@@ -190,9 +190,17 @@
     " - **Bonus**: Calculate the standard deviation and explain what it tells you about the variability in monthly rainfall patterns"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c0fa8e75",
+   "metadata": {},
+   "source": [
+    "### Probability of rain calculation"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
    "id": "996e854d",
    "metadata": {},
    "outputs": [
@@ -205,14 +213,21 @@
     }
    ],
    "source": [
-    "# 1. Calculate probability of rain\n",
     "rainy_days = len(df[df['weather_condition'] == 'Rainy'])\n",
     "total_days = len(df)\n",
     "p_rain = rainy_days / total_days\n",
     "\n",
     "print(f\"Based on our data, there's a {p_rain*100:.1f}% chance of rain on any given day\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "32786f54",
+   "metadata": {},
+   "source": [
+    "### Binomial distribution calculation"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 8,
@@ -226,6 +241,14 @@
     "probabilities = stats.binom.pmf(k_values, n_days, p_rain)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "22f972af",
+   "metadata": {},
+   "source": [
+    "### Binomial distribution plot"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 9,
@@ -270,9 +293,17 @@
     "**Probability of 15+ rainy days**: We can calculate this using the cumulative distribution function."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "0ecbe5a9",
+   "metadata": {},
+   "source": [
+    "### Probability of >= 15 rainy days"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": null,
    "id": "5bc3a33e",
    "metadata": {},
    "outputs": [
@@ -285,12 +316,19 @@
     }
    ],
    "source": [
-    "# Calculate probability of 15 or more rainy days\n",
     "prob_15_or_more = 1 - stats.binom.cdf(14, n_days, p_rain)\n",
     "\n",
     "print(f\"Probability of 15+ rainy days: {prob_15_or_more:.4f} ({prob_15_or_more*100:.2f}%)\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "981ac816",
+   "metadata": {},
+   "source": [
+    "### Extra: Binomial cumulative distribution function (CDF) visualization"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 14,
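
Review note on cell `5bc3a33e`: the solution computes P(X >= 15) as `1 - stats.binom.cdf(14, n_days, p_rain)`, using 14 rather than 15 because the CDF is inclusive of its argument. The sketch below is a standard-library cross-check of that identity, not the notebook's code: it swaps SciPy for `math.comb`, and the values of `n_days` and `p_rain` are hypothetical placeholders, since the notebook derives them from its dataset and they do not appear in this diff.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min), summed directly from the PMF."""
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))


# Hypothetical values for illustration only (not taken from the notebook).
n_days = 30
p_rain = 0.3

# Direct upper-tail sum: k = 15, 16, ..., n_days.
tail = prob_at_least(15, n_days, p_rain)

# Complement form, mirroring 1 - cdf(14): subtract P(X <= 14) from 1.
complement = 1 - sum(binom_pmf(k, n_days, p_rain) for k in range(0, 15))

print(f"P(X >= 15), direct sum:      {tail:.6f}")
print(f"P(X >= 15), 1 - P(X <= 14):  {complement:.6f}")
```

Both forms agree up to floating-point rounding. Writing `cdf(15, ...)` instead would drop the k = 15 term and give P(X >= 16).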
@@ -401,9 +439,17 @@
     " - **Bonus**: Repeat the experiment with different sample sizes (n=5, n=10, n=50). How does sample size affect the spread and normality of the sampling distribution?"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c879505c",
+   "metadata": {},
+   "source": [
+    "### Population distribution"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": null,
    "id": "da2ffd4c",
    "metadata": {},
    "outputs": [
@@ -427,7 +473,6 @@
     }
    ],
    "source": [
-    "# 1. Examine population distribution\n",
     "population_mean = df['rainfall_inches'].mean()\n",
     "population_std = df['rainfall_inches'].std()\n",
     "\n",
@@ -449,14 +494,21 @@
     "The population distribution is highly right-skewed with many zero values (no rain) and a long tail of higher rainfall amounts."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "d2b5ce7c",
+   "metadata": {},
+   "source": [
+    "### Sampling"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": null,
    "id": "9b1cacd5",
    "metadata": {},
    "outputs": [],
    "source": [
-    "# 2. Create sampling distribution\n",
     "n_samples = 1000\n",
     "sample_size = 30\n",
     "sample_means = []\n",
@@ -468,9 +520,17 @@
     "sample_means = np.array(sample_means)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "af68a64a",
+   "metadata": {},
+   "source": [
+    "### Sampling distribution plot"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": null,
    "id": "6dd93618",
    "metadata": {},
    "outputs": [
@@ -486,7 +546,6 @@
     }
    ],
    "source": [
-    "# 3. Visualize sampling distribution\n",
     "standard_error = population_std / np.sqrt(sample_size)\n",
     "\n",
     "# Normal curve\n",
@@ -502,6 +561,14 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "ebf6dae2",
+   "metadata": {},
+   "source": [
+    "### Sampling distribution versus population comparison"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 23,
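
Review note on cells `9b1cacd5` and `6dd93618`: the solution draws `n_samples = 1000` samples of `sample_size = 30` and compares the spread of the sample means against `standard_error = population_std / np.sqrt(sample_size)`. The sampling loop itself falls outside the rendered hunks, so the sketch below is only an assumption about its shape. It uses the standard library instead of NumPy and pandas, samples with replacement, and builds a hypothetical right-skewed population (mostly zeros with an exponential tail) in place of the notebook's `rainfall_inches` column.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed population: about 60% dry days (0.0 inches),
# the rest drawn from an exponential distribution. Not the notebook's data.
population = [
    0.0 if random.random() < 0.6 else random.expovariate(2.0)
    for _ in range(10_000)
]

population_mean = statistics.fmean(population)
population_std = statistics.stdev(population)  # n - 1 denominator, like pandas .std()

n_samples = 1000
sample_size = 30

# Mean of each random sample (drawn with replacement, an assumption here).
sample_means = [
    statistics.fmean(random.choices(population, k=sample_size))
    for _ in range(n_samples)
]

# Predicted spread of the sample means versus the spread actually observed.
standard_error = population_std / sample_size ** 0.5
observed_spread = statistics.stdev(sample_means)

print(f"Population mean:      {population_mean:.4f}")
print(f"Mean of sample means: {statistics.fmean(sample_means):.4f}")
print(f"Standard error:       {standard_error:.4f}")
print(f"Observed spread:      {observed_spread:.4f}")
```

Even though the population is far from normal, the mean of the sample means should sit close to the population mean and their spread close to the standard error. Changing `sample_size` to 5, 10, or 50 is one way to explore the bonus question from the activity.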
