Quantile Regression Chapter #859

Draft: wants to merge 46 commits into base: main

Commits (46)
2d348cc  first draft of quantile regression chapter (lona-k, Feb 10, 2025)
f3e30aa  add chapter (lona-k, Feb 10, 2025)
a6836f7  fix chapter title (lona-k, Feb 10, 2025)
51bcca0  fix references (lona-k, Feb 10, 2025)
f5982fa  change plot view (lona-k, Feb 14, 2025)
8cbb910  change packages (lona-k, Feb 16, 2025)
dabeddf  ... (be-marc, Feb 17, 2025)
e81322b  ... (be-marc, Feb 20, 2025)
27b5d43  applied code review changes (lona-k, Mar 18, 2025)
62ce9be  ... (lona-k, Mar 18, 2025)
7739a51  Update quantile_regression.qmd (berndbischl, Apr 10, 2025)
642b5ca  apply suggested changes (lona-k, May 13, 2025)
61c84d7  fix mlr3 loading (lona-k, May 13, 2025)
e47b108  ... (lona-k, May 13, 2025)
cba1139  add quantile regression survey reference (lona-k, May 13, 2025)
c406f46  change chapter order (lona-k, May 13, 2025)
652f437  fix mlr3 installation (lona-k, May 13, 2025)
f52281f  ... (lona-k, May 13, 2025)
ba78287  Merge branch 'main' into quantile_regression_chapter (lona-k, May 13, 2025)
4623b49  ... (lona-k, May 14, 2025)
cd98472  ... (lona-k, May 14, 2025)
f457175  ... (be-marc, May 19, 2025)
9195731  add learner weights (be-marc, May 21, 2025)
d954914  ... (be-marc, May 21, 2025)
971f649  ... (be-marc, May 21, 2025)
108af3a  ... (be-marc, May 21, 2025)
f7984a0  ... (be-marc, May 21, 2025)
7e01d3a  Merge branch 'weights' into quantile_regression_chapter (be-marc, May 21, 2025)
81b61ca  ... (be-marc, May 21, 2025)
02bbf3b  ... (be-marc, May 21, 2025)
9148578  ... (be-marc, May 23, 2025)
1c28b4a  ... (be-marc, May 23, 2025)
865bab9  ... (be-marc, May 23, 2025)
6000026  Merge branch 'main' into weights (be-marc, May 23, 2025)
d52e8ed  ... (be-marc, May 26, 2025)
2ed60bc  ... (be-marc, May 26, 2025)
1d1b409  ... (be-marc, May 26, 2025)
b19464a  ... (be-marc, May 26, 2025)
83d4cb1  ... (be-marc, May 26, 2025)
fa66375  ... (be-marc, May 26, 2025)
0dc1d3e  Merge branch 'weights' into quantile_regression_chapter (be-marc, May 26, 2025)
43ea33d  update solutions (lona-k, May 28, 2025)
275b5f9  grammar fixes (lona-k, May 28, 2025)
acff42a  fix plot (lona-k, May 28, 2025)
c7234f0  change code chunk output options (lona-k, Jun 2, 2025)
81b9aa0  update solutions (lona-k, Jun 2, 2025)
1 change: 1 addition & 0 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -45,6 +45,7 @@ Imports:
rprojroot,
stringi
Remotes:
mlr-org/mlr3,
mlr-org/mlr3extralearners,
mlr-org/mlr3batchmark,
mlr-org/mlr3proba,
5 changes: 3 additions & 2 deletions book/_quarto.yml
@@ -43,8 +43,9 @@ book:
- chapters/chapter11/large-scale_benchmarking.qmd
- chapters/chapter12/model_interpretation.qmd
- chapters/chapter13/beyond_regression_and_classification.qmd
- chapters/chapter14/algorithmic_fairness.qmd
- chapters/chapter15/predsets_valid_inttune.qmd
- chapters/chapter14/quantile_regression.qmd
- chapters/chapter15/algorithmic_fairness.qmd
- chapters/chapter16/predsets_valid_inttune.qmd
- chapters/references.qmd
appendices:
- chapters/appendices/solutions.qmd # online only
11 changes: 11 additions & 0 deletions book/book.bib
@@ -1436,3 +1436,14 @@ @book{hutter2019automated
publisher = {Springer},
keywords = {}
}
@article{yu_quantile_2003,
author = {Yu, Keming and Lu, Zudi and Stander, Julian},
doi = {10.1111/1467-9884.00363},
journal = {Journal of the Royal Statistical Society: Series D (The Statistician)},
number = {3},
pages = {331--350},
title = {Quantile regression: applications and current research areas},
volume = {52},
year = {2003}
}
Binary file added book/chapters/appendices/Rplots.pdf
Binary file not shown.
88 changes: 61 additions & 27 deletions book/chapters/appendices/solutions.qmd
@@ -2018,13 +2018,45 @@ benchmark(design)$aggregate(meas)[, .(learner_id, clust.silhouette)]

We can see that we get the silhouette closest to `1` with `K=2`, so we might use this value for future experiments.

## Solutions to @sec-quantile-regression

1. Manually `$train()` a GBM regression model from `r ref_pkg("mlr3extralearners")` on the california_housing task to predict the 95th percentile of the target variable. Make sure that you split the data and only use the training data for fitting the learner.

We start by loading the packages, creating the task, and making the train-test split.

```{r solutions-108}
library(mlr3)
library(mlr3extralearners)

task = tsk("california_housing")
splits = partition(task)
```

In the next step, we initialize the learner as `"regr.gbm"` and explicitly set the `quantiles` parameter to 0.95. For the learner to be able to predict this quantile, we also need to set the `predict_type` to `"quantiles"`. Lastly, we train the learner using only the training data.

```{r solutions-109}
lrn_gbm = lrn("regr.gbm", predict_type = "quantiles", quantiles = 0.95)
lrn_gbm$train(task, row_ids = splits$train)
```

2. Use the test data to evaluate your learner with the pinball loss.

First, we use the learner from the previous exercise to predict on the test data. Then we calculate the pinball loss on these predictions.

```{r solutions-110}
prds_gbm = lrn_gbm$predict(task, row_ids = splits$test)
score_gbm = prds_gbm$score(msr("regr.pinball", alpha = 0.95, id = "q0.95"))
score_gbm
```
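For intuition, the pinball loss behind `msr("regr.pinball")` can be sketched in a few lines of base R. This is a simplified standalone version, not the mlr3 implementation, and it also shows why minimizing this loss targets a quantile: the constant that minimizes it over a sample is approximately the empirical 95% quantile.

```r
# Pinball (quantile) loss for quantile level alpha: undershooting the
# truth is weighted by alpha, overshooting by (1 - alpha).
pinball = function(q, y, alpha) {
  err = y - q
  mean(ifelse(err >= 0, alpha * err, (alpha - 1) * err))
}

set.seed(42)
y = rnorm(10000)

# minimize the 0.95 pinball loss over a grid of candidate constants
grid = seq(-3, 3, by = 0.001)
q_hat = grid[which.min(sapply(grid, function(q) pinball(q, y, 0.95)))]

c(minimizer = q_hat, empirical_q95 = unname(quantile(y, 0.95)))
```

Both values land near `qnorm(0.95)` (about 1.64), which is why training with `quantiles = 0.95` estimates the conditional 95th percentile rather than the conditional mean.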


## Solutions to @sec-fairness

1. Train a model of your choice on `tsk("adult_train")` and test it on `tsk("adult_test")`, use any measure of your choice to evaluate your predictions. Assume our goal is to achieve parity in false omission rates across the protected 'sex' attribute. Construct a fairness metric that encodes this and evaluate your model. To get a deeper understanding, look at the `r ref("groupwise_metrics")` function to obtain performance in each group.

First, we load the data and take a look at it.

```{r solutions-108}
```{r solutions-111}
library(mlr3)
library(mlr3fairness)
set.seed(8)
@@ -2036,7 +2068,7 @@ tsk_adult_train

We can now train a simple model, e.g., a decision tree and evaluate for accuracy.

```{r solutions-109}
```{r solutions-112}
learner = lrn("classif.rpart")
learner$train(tsk_adult_train)
prediction = learner$predict(tsk_adult_test)
@@ -2046,30 +2078,30 @@ prediction$score()
The *false omission rate parity* metric is available via the key `"fairness.fomr"`.
Note that evaluating our prediction now requires that we also provide the task.

```{r solutions-110}
```{r solutions-113}
msr_1 = msr("fairness.fomr")
prediction$score(msr_1, tsk_adult_test)
```
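To make explicit what this measure computes, here is a base-R sketch with made-up data (a simplified illustration, not the `mlr3fairness` code): the false omission rate is the share of actual positives among the predicted negatives, and the parity metric compares it across the protected groups.

```r
# False omission rate: P(truth = positive | prediction = negative),
# i.e. FN / (FN + TN) within the predicted-negative cases.
fomr = function(truth_pos, pred_neg) {
  sum(truth_pos & pred_neg) / sum(pred_neg)
}

set.seed(1)
truth_pos = runif(200) < 0.3           # hypothetical true labels
pred_neg  = runif(200) < 0.6           # hypothetical negative predictions
group     = rep(c("Female", "Male"), each = 100)

per_group = sapply(split(seq_along(group), group), function(i) {
  fomr(truth_pos[i], pred_neg[i])
})
per_group
abs(diff(per_group))  # discrepancy reported by the parity metric
```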

In addition, we can look at false omission rates in each group.
The `groupwise_metrics` function creates a metric for each group specified in the `pta` column role:

```{r solutions-111}
```{r solutions-114}
tsk_adult_test$col_roles$pta
```

We can then use this metric to evaluate our model again.
This gives us the false omission rates for male and female individuals separately.

```{r solutions-112}
```{r solutions-115}
msr_2 = groupwise_metrics(base_measure = msr("classif.fomr"), task = tsk_adult_test)
prediction$score(msr_2, tsk_adult_test)
```

2. Improve your model by employing pipelines that use pre- or post-processing methods for fairness. Evaluate your model along the two metrics and visualize the resulting metrics. Compare the different models using an appropriate visualization.

First, we again construct the learners from above.
```{r solutions-113}
```{r solutions-116}
library(mlr3pipelines)
lrn_1 = po("reweighing_wts") %>>% lrn("classif.rpart")
lrn_2 = po("learner_cv", lrn("classif.rpart")) %>>%
@@ -2078,7 +2110,7 @@

Then we run the benchmark again. Note that we use three-fold CV this time for comparison.

```{r solutions-114}
```{r solutions-117}
learners = list(learner, lrn_1, lrn_2)
design = benchmark_grid(tsk_adult_train, learners, rsmp("cv", folds = 3L))
bmr = benchmark(design)
@@ -2087,7 +2119,7 @@ bmr$aggregate(msrs(c("classif.acc", "fairness.fomr")))

We can now again visualize the result.

```{r solutions-115}
```{r solutions-118}
library(ggplot2)
fairness_accuracy_tradeoff(bmr, msr("fairness.fomr")) +
scale_color_viridis_d("Learner") +
@@ -2105,12 +2137,12 @@ We can notice two main results:

This can be achieved by adding "race" to the `"pta"` col_role.

```{r solutions-116}
```{r solutions-119}
tsk_adult_train$set_col_roles("race", add_to = "pta")
tsk_adult_train
```

```{r solutions-117}
```{r solutions-120}
tsk_adult_test$set_col_roles("race", add_to = "pta")
prediction$score(msr_1, tsk_adult_test)
```
@@ -2120,12 +2152,12 @@ Note that the metric by default computes the maximum discrepancy between all me

If we now compute the `groupwise_metrics`, we will get a metric for each intersection of the groups.

```{r solutions-118}
```{r solutions-121}
msr_3 = groupwise_metrics(msr("classif.fomr"), tsk_adult_train)
unname(sapply(msr_3, function(x) x$id))
```

```{r solutions-119}
```{r solutions-122}
prediction$score(msr_3, tsk_adult_test)
```

@@ -2143,7 +2175,7 @@ We'll go through them one by one to deepen our understanding:

We can investigate this further by looking at actual counts:

```{r solutions-120}
```{r solutions-123}
table(tsk_adult_test$data(cols = c("race", "sex", "target")))
```

@@ -2155,17 +2187,17 @@ We'll go through them one by one to deepen our understanding:

First, we create a subset containing only the observations with `sex` equal to `"Female"` and `race` equal to `"Black"` or `"White"`.

```{r solutions-121}
```{r solutions-124}
adult_subset = tsk_adult_test$clone()
df = adult_subset$data()
rows = seq_len(nrow(df))[df$race %in% c("Black", "White") & df$sex %in% c("Female")]
adult_subset$filter(rows)
adult_subset$set_col_roles("race", add_to = "pta")
```

And evaluate our measure again:

```{r solutions-122}
prediction$score(msr_3, adult_subset)
```{r solutions-125}
#| eval: false
prediction$score(msr_3, task = adult_subset)
```

We can see that among women, there is an even bigger discrepancy than among men.
@@ -2181,7 +2215,7 @@

We start by loading the packages and creating the task.

```{r solutions-123}
```{r solutions-126}
library(mlr3)
library(mlr3extralearners)
library(mlr3pipelines)
@@ -2192,14 +2226,14 @@ tsk_pima

Below, we see that the task has five features with missing values.

```{r solutions-124}
```{r solutions-127}
tsk_pima$missings()
```

Next, we create the LightGBM classifier, but don't specify the validation data yet.
We handle the missing values using a simple median imputation.

```{r solutions-125}
```{r solutions-128}
lrn_lgbm = lrn("classif.lightgbm",
num_iterations = 1000,
early_stopping_rounds = 10,
@@ -2216,15 +2250,15 @@ The call below sets the `$validate` field of the LightGBM pipeop to `"predefined"`.
Recall that only the graphlearner itself can specify *how* the validation data is generated.
The individual pipeops can either use it (`"predefined"`) or not (`NULL`).

```{r solutions-126}
```{r solutions-129}
set_validate(glrn, validate = 0.3, ids = "classif.lightgbm")
glrn$validate
glrn$graph$pipeops$classif.lightgbm$validate
```

Finally, we train the learner and inspect the validation scores and internally tuned parameters.

```{r solutions-127}
```{r solutions-130}
glrn$train(tsk_pima)

glrn$internal_tuned_values
@@ -2240,7 +2274,7 @@ glrn$internal_valid_scores
We start by setting the number of boosting iterations to an internal tune token, where the maximum number of boosting iterations is 1000 and the aggregation function is the maximum.
Note that the input to the aggregation function is a list of integer values (the early stopped values for the different resampling iterations), so we need to `unlist()` it first before taking the maximum.

```{r solutions-128}
```{r solutions-131}
library(mlr3tuning)

glrn$param_set$set_values(
@@ -2252,14 +2286,14 @@ glrn$param_set$set_values(
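The aggregation function receives a list with one early-stopped value per resampling iteration. A quick standalone check with hypothetical values (independent of mlr3) shows why we need `unlist()` before taking the maximum:

```r
# hypothetical early-stopped iteration counts from a 3-fold CV
early_stopped = list(117L, 243L, 198L)

# max() on a list of integers errors, so flatten it first
aggr = function(x) as.integer(max(unlist(x)))
aggr(early_stopped)  # 243
```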

Now, we change the validation data from `0.3` to `"test"`, where we can omit the `ids` specification as LightGBM is the base learner.

```{r solutions-129}
```{r solutions-132}
set_validate(glrn, validate = "test")
```

Next, we create the autotuner using the configuration given in the instructions.
As the internal validation measures are calculated by `lightgbm` and not `mlr3`, we need to specify whether the metric should be minimized.

```{r solutions-130}
```{r solutions-133}
at_lgbm = auto_tuner(
learner = glrn,
tuner = tnr("internal"),
@@ -2272,7 +2306,7 @@ at_lgbm$id = "at_lgbm"

Finally, we set up the benchmark design, run it, and evaluate the learners in terms of their classification accuracy.

```{r solutions-131}
```{r solutions-134}
design = benchmark_grid(
task = tsk_pima,
learners = list(at_lgbm, lrn("classif.rpart")),
@@ -2286,7 +2320,7 @@ bmr$aggregate(msr("classif.acc"))

3. Consider the code below:

```{r solutions-132}
```{r solutions-135}
branch_lrn = as_learner(
ppl("branch", list(
lrn("classif.ranger"),
@@ -2349,7 +2383,7 @@ Note that we would normally recommend setting the validation data to `"test"` wh

4. Look at the (failing) code below:

```{r solutions-133, error = TRUE}
```{r solutions-136, error = TRUE}
tsk_sonar = tsk("sonar")
glrn = as_learner(
po("pca") %>>% lrn("classif.xgboost", validate = 0.3)
14 changes: 14 additions & 0 deletions book/chapters/chapter1/introduction_and_overview.qmd
@@ -3,6 +3,20 @@ aliases:
- "/introduction_and_overview.html"
---


```{r}
# extra packages that must be installed in the docker image
remotes::install_github("mlr-org/mlr3")
remotes::install_github("mlr-org/mlr3pipelines")
remotes::install_github("mlr-org/mlr3fairness@weights")
remotes::install_github("mlr-org/mlr3learners")
remotes::install_github("mlr-org/mlr3extralearners")
remotes::install_cran("qgam")
remotes::install_github("mlr-org/mlr3batchmark")
remotes::install_cran("iml")
remotes::install_github("mlr-org/mlr3spatiotempcv@task_row_hash")
```

# Introduction and Overview {#sec-introduction}

{{< include ../../common/_setup.qmd >}}
Binary file added book/chapters/chapter10/Rplots.pdf
Binary file not shown.