---
title: "Calculating Bootstrap Confidence Intervals"
editor_options:
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

When working with data cubes, it’s essential to understand the uncertainty surrounding derived statistics. This tutorial introduces the `calculate_bootstrap_ci()` function from **dubicube**, which uses bootstrap replications to estimate the confidence intervals around statistics calculated from data cubes.

## Calculating bootstrap confidence intervals

In the [bootstrap tutorial](https://b-cubed-eu.github.io/dubicube/articles/bootstrap-method-cubes.html), we introduced bootstrapping as a way to assess the variability of statistics calculated from data cubes.
Bootstrapping involves repeatedly resampling the dataset and recalculating the statistic to create a distribution of possible outcomes (= bootstrap replicates).

This tutorial builds on that foundation by showing how to compute confidence intervals from those bootstrap replicates. Confidence intervals provide a useful summary of uncertainty by indicating a range within which the true value of the statistic is likely to lie. We consider four different types of intervals (with significance level $\alpha$, i.e. confidence level $1 - \alpha$). The choice of confidence interval types and their calculation is in line with the **boot** package in R (Canty & Ripley, [1999](https://CRAN.R-project.org/package=boot)), to ensure ease of implementation. They are based on the definitions provided by Davison & Hinkley ([1997, Chapter 5](https://doi.org/10.1017/CBO9780511802843)) (see also DiCiccio & Efron, [1996](https://doi.org/10.1214/ss/1032280214); Efron, [1987](https://doi.org/10.1080/01621459.1987.10478410)).

### 1. **Percentile**

Uses the percentiles of the bootstrap distribution.

$$
CI_{\text{perc}} = \left[ \hat{\theta}^*_{(\alpha/2)}, \hat{\theta}^*_{(1-\alpha/2)} \right]
$$

where $\hat{\theta}^*_{(\alpha/2)}$ and $\hat{\theta}^*_{(1-\alpha/2)}$ are the $\alpha/2$ and $1-\alpha/2$ quantiles of the bootstrap distribution, respectively.
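
As a minimal sketch in base R (not the **dubicube** implementation), the percentile interval can be read directly from the quantiles of the replicates. The simulated data `x`, the statistic (a sample mean) and the number of replicates are assumptions made purely for illustration.

```r
# Illustrative data: a skewed sample; the statistic is the sample mean
set.seed(42)
x <- rexp(50, rate = 0.5)
theta_hat <- mean(x)

# Bootstrap replicates: resample with replacement and recalculate
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))

# Percentile interval at significance level alpha (95 % confidence)
alpha <- 0.05
ci_perc <- unname(quantile(theta_star, probs = c(alpha / 2, 1 - alpha / 2)))
ci_perc
```

Note that `quantile()` interpolates between order statistics in its own way, so the limits may differ slightly from those of other implementations.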

### 2. **Bias-Corrected and Accelerated (BCa)**

Adjusts for bias and acceleration.

**Bias** refers to the systematic difference between the observed statistic from the original dataset and the center of the bootstrap distribution of the statistic. The bias correction term is calculated as:

$$
\hat{z}_0 = \Phi^{-1}\left(\frac{\#(\hat{\theta}^*_b < \hat{\theta})}{B}\right)
$$

where $\#$ is the counting operator, counting the number of times $\hat{\theta}^*_b$ is smaller than $\hat{\theta}$, and $\Phi^{-1}$ is the inverse cumulative distribution function of the standard normal distribution. $B$ is the number of bootstrap samples.

**Acceleration** quantifies how sensitive the variability of the statistic is to changes in the data. Its calculation is described in the section on calculating acceleration; its value $a$ can be interpreted as follows:

- $a = 0$: The statistic's variability does not depend on the data (e.g., symmetric distribution)
- $a > 0$: Small changes in the data have a large effect on the statistic's variability (e.g., positive skew)
- $a < 0$: Small changes in the data have a smaller effect on the statistic's variability (e.g., negative skew)

The bias and acceleration estimates are then used to calculate adjusted percentiles:

$$
\alpha_1 = \Phi\left( \hat{z}_0 + \frac{\hat{z}_0 + z_{\alpha/2}}{1 - \hat{a}(\hat{z}_0 + z_{\alpha/2})} \right), \quad
\alpha_2 = \Phi\left( \hat{z}_0 + \frac{\hat{z}_0 + z_{1 - \alpha/2}}{1 - \hat{a}(\hat{z}_0 + z_{1 - \alpha/2})} \right)
$$

So, we get:

$$
CI_{\text{bca}} = \left[ \hat{\theta}^*_{(\alpha_1)}, \hat{\theta}^*_{(\alpha_2)} \right]
$$
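
The adjusted percentiles can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the data are simulated, the statistic is a sample mean, and `a_hat` is a placeholder value chosen by hand rather than an estimated acceleration.

```r
# Illustrative data and bootstrap replicates of the sample mean
set.seed(42)
x <- rexp(50, rate = 0.5)
theta_hat <- mean(x)
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))

alpha <- 0.05
a_hat <- 0.03  # placeholder acceleration, assumed for illustration only

# Bias correction: proportion of replicates below the original estimate,
# mapped through the inverse standard normal CDF
z0_hat <- qnorm(mean(theta_star < theta_hat))

# Adjusted percentile levels alpha_1 and alpha_2
z <- qnorm(c(alpha / 2, 1 - alpha / 2))
alpha_adj <- pnorm(z0_hat + (z0_hat + z) / (1 - a_hat * (z0_hat + z)))

# BCa interval: quantiles of the replicates at the adjusted levels
ci_bca <- unname(quantile(theta_star, probs = alpha_adj))
ci_bca
```

With `z0_hat = 0` and `a_hat = 0`, the adjusted levels reduce to $\alpha/2$ and $1 - \alpha/2$, so the BCa interval coincides with the percentile interval.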

### 3. **Normal**

Assumes the bootstrap distribution of the statistic is approximately normal:

$$
CI_{\text{norm}} = \left[\hat{\theta} - \text{Bias}_{\text{boot}} - \text{SE}_{\text{boot}} \cdot z_{1-\alpha/2},
\hat{\theta} - \text{Bias}_{\text{boot}} + \text{SE}_{\text{boot}} \cdot z_{1-\alpha/2} \right]
$$

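where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution, $\text{Bias}_{\text{boot}}$ is the bootstrap estimate of bias (the mean of the bootstrap replicates minus $\hat{\theta}$), and $\text{SE}_{\text{boot}}$ is the standard deviation of the bootstrap replicates.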
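
A minimal base R sketch of this formula, assuming simulated data and a sample mean as the statistic (both chosen for illustration only), could look like this:

```r
# Illustrative data and bootstrap replicates of the sample mean
set.seed(42)
x <- rexp(50, rate = 0.5)
theta_hat <- mean(x)
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))

alpha <- 0.05

# Bootstrap bias and standard error
bias_boot <- mean(theta_star) - theta_hat
se_boot <- sd(theta_star)

# Normal interval: symmetric around the bias-corrected estimate
ci_norm <- theta_hat - bias_boot + c(-1, 1) * se_boot * qnorm(1 - alpha / 2)
ci_norm
```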

### 4. **Basic**

Reflects the bootstrap percentiles around the original estimate $\hat{\theta}$:

$$
CI_{\text{basic}} = \left[ 2\hat{\theta} - \hat{\theta}^*_{(1-\alpha/2)},
2\hat{\theta} - \hat{\theta}^*_{(\alpha/2)} \right]
$$

where $\hat{\theta}^*_{(\alpha/2)}$ and $\hat{\theta}^*_{(1-\alpha/2)}$ are the $\alpha/2$ and $1-\alpha/2$ quantiles of the bootstrap distribution, respectively.
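
The following base R sketch illustrates the basic interval on simulated data. The data, the statistic (a sample mean) and the number of replicates are assumptions for illustration.

```r
# Illustrative data and bootstrap replicates of the sample mean
set.seed(42)
x <- rexp(50, rate = 0.5)
theta_hat <- mean(x)
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))

alpha <- 0.05

# Lower and upper quantiles of the bootstrap distribution
q <- unname(quantile(theta_star, probs = c(alpha / 2, 1 - alpha / 2)))

# Basic interval: the upper quantile gives the lower limit and vice versa
ci_basic <- c(2 * theta_hat - q[2], 2 * theta_hat - q[1])
ci_basic
```

Because the quantiles swap roles, a bootstrap distribution with a long right tail produces a basic interval with a long left tail.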

## Calculating acceleration

The acceleration is calculated as follows:

$$
\hat{a} = \frac{1}{6} \frac{\sum_{i = 1}^{n}(I_i^3)}{\left( \sum_{i = 1}^{n}(I_i^2) \right)^{3/2}}
$$

where $I_i$ denotes the influence of data point $x_i$ on the estimation of $\theta$.
$I_i$ can be estimated using jackknifing.
Examples are (1) the negative jackknife: $I_i = (n-1)(\hat{\theta} - \hat{\theta}_{-i})$, and (2) the positive jackknife $I_i = (n+1)(\hat{\theta}_{-i} - \hat{\theta})$ (Frangos & Schucany, [1990](https://doi.org/10.1016/0167-9473(90)90109-U)).
Here, $\hat{\theta}_{-i}$ is the estimated value leaving out the $i$’th data point $x_i$.
The **boot** package also offers infinitesimal jackknife and regression estimation.
Implementation of these jackknife algorithms can be explored in the future.
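
As a rough sketch of the negative jackknife in base R (independent of `calculate_acceleration()`, whose internals are not shown here), the acceleration of a sample mean could be estimated as follows. The simulated data and the choice of statistic are assumptions for illustration.

```r
# Illustrative data; the statistic is the sample mean
set.seed(42)
x <- rexp(50, rate = 0.5)
n <- length(x)
theta_hat <- mean(x)

# Leave-one-out estimates: recalculate the statistic without data point i
theta_loo <- vapply(seq_len(n), function(i) mean(x[-i]), numeric(1))

# Negative jackknife influence values
infl <- (n - 1) * (theta_hat - theta_loo)

# Acceleration
a_hat <- sum(infl^3) / (6 * sum(infl^2)^(3 / 2))
a_hat
```

For the sample mean, these influence values reduce to $x_i - \bar{x}$, so the acceleration is proportional to the skewness of the data.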

For the BCa interval, `calculate_bootstrap_ci()` uses the function `calculate_acceleration()` to calculate the acceleration.
The latter can also be used on its own to quantify the sensitivity of a statistic’s variability to changes in the dataset.
For jackknifing, it uses the `perform_jackknifing()` function, which is not exported by **dubicube**.

## Getting started with dubicube

Our method can be used on any dataframe that contains a grouping variable and from which a statistic can be calculated.
For this tutorial, we focus on occurrence cubes.
Therefore, we will use the **b3gbi** package to process the raw data before we move on to bootstrapping.

```{r, message=FALSE, warning=FALSE}
# Load packages
library(ggplot2)      # Data visualisation
library(dplyr)        # Data wrangling
library(tidyr)        # Data wrangling

# Data loading and processing
library(frictionless) # Load example datasets
library(b3gbi)        # Process occurrence cubes
library(dubicube)     # Analysis of data quality & indicator uncertainty
```

### Loading and processing the data

We load the bird cube data from the **b3data** data package using **frictionless** (see also [here](https://github.com/b-cubed-eu/b3data-scripts)).
It is an occurrence cube for birds in Belgium between 2000 and 2024 using the MGRS grid at 10 km scale.

```{r}
# Read data package
b3data_package <- read_package(
  "https://zenodo.org/records/15211029/files/datapackage.json"
)

# Load bird cube data
bird_cube_belgium <- read_resource(b3data_package, "bird_cube_belgium_mgrs10")
```

We process the cube with **b3gbi**.
First, we select 2000 random rows to make the dataset smaller and reduce the computation time for this tutorial.
Then, we process the cube and keep the data from 2011 to 2020.

```r
set.seed(123)

# Make dataset smaller
rows <- sample(nrow(bird_cube_belgium), 2000)
bird_cube_belgium <- bird_cube_belgium[rows, ]

# Process cube
processed_cube <- process_cube(
  bird_cube_belgium,
  first_year = 2011,
  last_year = 2020,
  cols_occurrences = "n"
)
processed_cube
```

```{r, echo=FALSE, message=FALSE, results='hide'}
if (
  system.file("bootstrapping", "processed_cube.rds", package = "dubicube") == ""
) {
  # Read data package
  b3data_package <- read_package(
    "https://zenodo.org/records/15211029/files/datapackage.json"
  )

  # Load bird cube data
  bird_cube_belgium <- read_resource(b3data_package, "bird_cube_belgium_mgrs10")

  set.seed(123)

  # Make dataset smaller
  rows <- sample(nrow(bird_cube_belgium), 2000)
  bird_cube_belgium <- bird_cube_belgium[rows, ]

  # Process cube
  processed_cube <- process_cube(
    bird_cube_belgium,
    first_year = 2011,
    last_year = 2020,
    cols_occurrences = "n"
  )

  saveRDS(
    processed_cube,
    file.path("..", "..", "inst", "bootstrapping", "processed_cube.rds")
  )
} else {
  processed_cube <- readRDS(
    system.file("bootstrapping", "processed_cube.rds", package = "dubicube")
  )
}
```

```{r, echo=FALSE}
processed_cube
```

### Analysis of the data

Let's say we are interested in the mean number of observations per grid cell per year.
We create a function to calculate this.

```{r, echo=FALSE}
# nolint start: object_usage_linter.
```

```{r}
# Function to calculate the statistic of interest
# Mean observations per grid cell per year
mean_obs <- function(data) {
  data %>%
    dplyr::mutate(x = mean(obs), .by = "cellCode") %>%
    dplyr::summarise(diversity_val = mean(x), .by = "year") %>%
    as.data.frame()
}
```

```{r, echo=FALSE}
# nolint end
```

We get the following results:

```{r}
mean_obs(processed_cube$data)
```

On their own, these values don’t reveal how much uncertainty surrounds them. To better understand their variability, we use bootstrapping to estimate the distribution of the yearly means. From this distribution, we can calculate bootstrap confidence intervals.

### Bootstrapping

We use the `bootstrap_cube()` function to perform bootstrapping (see also the [bootstrap tutorial](https://b-cubed-eu.github.io/dubicube/articles/bootstrap-method-cubes.html)).

```r
bootstrap_results <- bootstrap_cube(
  data_cube = processed_cube,
  fun = mean_obs,
  grouping_var = "year",
  samples = 1000,
  seed = 123
)
```

```{r, echo=FALSE, message=FALSE, results='hide'}
if (
  system.file(
    "bootstrapping", "bootstrap_results.rds", package = "dubicube"
  ) == ""
) {
  bootstrap_results <- bootstrap_cube(
    data_cube = processed_cube,
    fun = mean_obs,
    grouping_var = "year",
    samples = 1000,
    seed = 123
  )

  saveRDS(
    bootstrap_results,
    file.path("..", "..", "inst", "bootstrapping", "bootstrap_results.rds")
  )
} else {
  bootstrap_results <- readRDS(
    system.file("bootstrapping", "bootstrap_results.rds", package = "dubicube")
  )
}
```

```{r, echo=FALSE}
print("Performing whole-cube bootstrap with `boot::boot()`.")
```

### Interval calculation

Now we can use the `calculate_bootstrap_ci()` function to calculate confidence limits. It relies on the following arguments:

- **`bootstrap_results`**:
  A dataframe containing the bootstrap replicates, where each row represents a bootstrap sample, as returned by `bootstrap_cube()`.

- **`grouping_var`**:
  The column(s) used for grouping the output of `fun()`. For example, if `fun()` returns one value per year, use `grouping_var = "year"`.

- **`type`**:
  A character vector specifying the type(s) of confidence intervals to compute. Options include:
    - `"perc"`: Percentile interval
    - `"bca"`: Bias-corrected and accelerated interval
    - `"norm"`: Normal interval
    - `"basic"`: Basic interval
    - `"all"`: Compute all available interval types (default)

- **`conf`**:
  The confidence level of the intervals. Default is `0.95` (95 % confidence level).

- **`aggregate`**:
  Logical. If `TRUE` (default), the function returns confidence limits per group. If `FALSE`, the confidence limits are added to the original bootstrap dataframe `bootstrap_results`.

- **`data_cube`**:
  Only used when `type = "bca"` and no boot method is used. The input data as a processed data cube (from `b3gbi::process_cube()`).

- **`fun`**:
  Only used when `type = "bca"` and no boot method is used. A user-defined function that computes the statistic(s) of interest from `data_cube$data`. This function should return a dataframe that includes a column named `diversity_val`, containing the statistic to evaluate.

- **`progress`**:
  Logical flag to show a progress bar. Set to `TRUE` to enable progress reporting; default is `FALSE`.

We get a warning message for BCa calculation because we are using a relatively small dataset.
Since we are working with `"boot"` objects, we do not need to specify `data_cube` or `fun`.

```{r}
ci_mean_obs <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic"),
  conf = 0.95
)
```

```{r}
head(ci_mean_obs)
```

We visualise the distribution of the bootstrap replicates and the confidence intervals.

```{r}
# Make interval type a factor
ci_mean_obs <- ci_mean_obs %>%
  mutate(
    year = as.numeric(year),
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r, warning=FALSE}
#| fig.alt: >
#|   Confidence intervals for mean number of occurrences over time.
# Convert bootstrap replicates to dataframe
bootstrap_results_df <- boot_list_to_dataframe(
  boot_list = bootstrap_results,
  grouping_var = "year"
) %>%
  mutate(year = as.numeric(year))

# Get bias values
bias_mean_obs <- bootstrap_results_df %>%
  distinct(year, estimate = est_original, `bootstrap estimate` = est_boot)

# Get estimate values
estimate_mean_obs <- bias_mean_obs %>%
  pivot_longer(cols = c("estimate", "bootstrap estimate"),
               names_to = "Legend", values_to = "value") %>%
  mutate(Legend = factor(Legend, levels = c("estimate", "bootstrap estimate"),
                         ordered = TRUE))
# Visualise
bootstrap_results_df %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_mean_obs,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Mean Number of Observations\nper Grid Cell",
       x = "", shape = "Legend:", colour = "Interval type:") +
  scale_x_continuous(breaks = sort(unique(bootstrap_results_df$year))) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

See the [visualising temporal trends tutorial](https://b-cubed-eu.github.io/dubicube/articles/visualising-temporal-trends.html) for information on which interval types should be calculated and/or reported and how temporal trends can be visualised.

## Advanced usage of `calculate_bootstrap_ci()`

### Comparison with a reference group

As discussed in the [bootstrap tutorial](https://b-cubed-eu.github.io/dubicube/articles/bootstrap-method-cubes.html), we can also compare indicator values to a reference group. In time series analyses, this often means comparing each year’s indicator to a baseline year (e.g., the first or last year in the series).
To do this, we perform bootstrapping over the difference between indicator values.
This process yields bootstrap replicate distributions of differences in indicator values.

```{r}
bootstrap_results_ref <- bootstrap_cube(
  data_cube = processed_cube,
  fun = mean_obs,
  grouping_var = "year",
  samples = 1000,
  ref_group = 2011,
  seed = 123
)
```

```{r}
head(bootstrap_results_ref)
```

If the BCa interval is calculated and a reference group is used, jackknifing is implemented differently.
Consider $\hat{\theta} = \hat{\theta}_1 - \hat{\theta}_2$ where $\hat{\theta}_1$ is the estimate for the indicator value of a non-reference period (sample size $n_1$) and $\hat{\theta}_2$ is the estimate for the indicator value of a reference period (sample size $n_2$).
The acceleration is now calculated as follows:

$$
\hat{a} = \frac{1}{6} \frac{\sum_{i = 1}^{n_1 + n_2}(I_i^3)}{\left( \sum_{i = 1}^{n_1 + n_2}(I_i^2) \right)^{3/2}}
$$

$I_i$ can be calculated using the negative or positive jackknife, such that

$\hat{\theta}_{-i} = \hat{\theta}_{1,-i} - \hat{\theta}_2 \text{ for } i = 1, \ldots, n_1$, and

$\hat{\theta}_{-i} = \hat{\theta}_{1} - \hat{\theta}_{2,-i} \text{ for } i = n_1 + 1, \ldots, n_1 + n_2$.
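
The two-group jackknife can be sketched in base R as follows. This is only an illustration: the two simulated samples, the statistic (a difference in means) and the use of $n = n_1 + n_2$ in the negative jackknife formula are assumptions of this sketch, not a description of the package internals.

```r
# Illustrative data for a non-reference period (x1) and a reference period (x2)
set.seed(42)
x1 <- rexp(40, rate = 0.5)
x2 <- rexp(60, rate = 0.8)
n1 <- length(x1)
n2 <- length(x2)

# Statistic: difference between the two period estimates
theta_hat <- mean(x1) - mean(x2)

# Leave-one-out estimates: drop one point from x1, then one point from x2
theta_loo <- c(
  vapply(seq_len(n1), function(i) mean(x1[-i]) - mean(x2), numeric(1)),
  vapply(seq_len(n2), function(i) mean(x1) - mean(x2[-i]), numeric(1))
)

# Negative jackknife influence values (n = n1 + n2 is an assumption here)
infl <- (n1 + n2 - 1) * (theta_hat - theta_loo)

# Acceleration over all n1 + n2 influence values
a_hat_ref <- sum(infl^3) / (6 * sum(infl^2)^(3 / 2))
a_hat_ref
```

Because $\hat{a}$ is a ratio of sums of influence values, any positive factor common to all $I_i$ cancels, so the exact choice of that multiplier does not change the result.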

Therefore, if you want to calculate the BCa intervals using `calculate_bootstrap_ci()`, you also need to provide `ref_group = 2011`.
Since we are not working with `"boot"` objects, we need to specify `data_cube` and `fun` as well.

```{r}
ci_mean_obs_ref <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results_ref,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic"),
  data_cube = processed_cube,   # Required for BCa
  fun = mean_obs,               # Required for BCa
  ref_group = 2011              # Required for BCa
)
```

```{r}
ci_mean_obs_ref %>%
  filter(int_type == "bca") %>%
  head()
```

We see that the mean number of observations is higher in some years compared to 2011.
Because the BCa intervals lie entirely above 0 in 2014, 2015, 2017 and 2018, we might even say that the increase is significant for those years.
This will be further explored in the [effect classification tutorial](https://b-cubed-eu.github.io/dubicube/articles/effect-classification.html).

```{r}
# Make interval type factor
ci_mean_obs_ref <- ci_mean_obs_ref %>%
  mutate(
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r}
#| fig.alt: >
#|   Confidence intervals for mean number of occurrences over time (ref).
# Get bias values
bias_mean_obs <- bootstrap_results_ref %>%
  distinct(year, estimate = est_original, `bootstrap estimate` = est_boot)

# Get estimate values
estimate_mean_obs <- bias_mean_obs %>%
  pivot_longer(cols = c("estimate", "bootstrap estimate"),
               names_to = "Legend", values_to = "value") %>%
  mutate(Legend = factor(Legend, levels = c("estimate", "bootstrap estimate"),
                         ordered = TRUE))
# Visualise
bootstrap_results_ref %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_mean_obs_ref,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Mean Number of Observations\nper Grid Cell Compared to 2011",
       x = "", shape = "Legend:", colour = "Interval type:") +
  scale_x_continuous(breaks = sort(unique(bootstrap_results_ref$year))) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

Note that the reference year should be chosen carefully.
Keep in mind which comparisons need to be made and what the motivation is behind the reference period.
A high or low value in the reference period relative to other periods, e.g. an exceptionally bad or good year, can affect the magnitude and direction of the calculated differences.
Whether this should be avoided depends on the motivation behind the choice and on the research question.
A reference period can be determined by legislation or by the start of a monitoring campaign.
A specific research question can also determine the periods that need to be compared.
Furthermore, the variability of the estimate for the reference period affects the width of the confidence intervals for the differences.
A more variable reference period will propagate greater uncertainty.
In the case of GBIF data, more data will typically be available for recent years than for earlier years.
If this is the case, it could make sense to select the last period as the reference period.
In a way, this also avoids arbitrariness in the choice of the reference period.
You then compare previous situations with the current situation (the last year), and you could repeat this comparison annually, for example.
Finally, when comparing multiple indicators, we recommend using a consistent reference period to maintain comparability.

### Transformations

Consider the calculation of Pielou's evenness on a random subset of the data from 2011-2015.
We take only a small subset of the dataset and artificially create a community with high evenness.
Pielou's evenness is an indicator that takes values between 0 and 1.
Higher evenness values indicate a more balanced community (a value of 1 means that all species are equally abundant), while low values indicate a more unbalanced community (a value of 0 means that one species dominates completely).

```{r}
set.seed(123)

# Make dataset smaller
rows <- sample(nrow(bird_cube_belgium), 1000)
bird_cube_belgium_even <- bird_cube_belgium[rows, ]

# Make dataset even
bird_cube_belgium_even$n <- rnbinom(nrow(bird_cube_belgium_even),
                                    size = 2, mu = 100)

# Process cube
processed_cube_even <- process_cube(
  bird_cube_belgium_even,
  first_year = 2011,
  last_year = 2015,
  cols_occurrences = "n"
)
```

We create a custom function to calculate evenness:

```{r, echo=FALSE}
# nolint start: object_usage_linter.
```

```{r}
calc_evenness <- function(data) {
  data %>%
    # Calculate number of observations
    dplyr::group_by(year, scientificName) %>%
    dplyr::summarise(obs = sum(obs), .groups = "drop_last") %>%
    # Calculate evenness by year
    dplyr::mutate(
      tot = sum(obs),
      p = obs / tot,
      p_ln_p = p * log(p),
      ln_S = log(dplyr::n_distinct(scientificName)),
      diversity_val = (-sum(p_ln_p)) / ln_S
    ) %>%
    dplyr::ungroup() %>%
    # Get distinct values
    dplyr::distinct(year, diversity_val)
}
```

```{r, echo=FALSE}
# nolint end
```

We perform bootstrapping as before. Note that you can also perform bootstrapping of `processed_cube_even` using the **b3gbi** function `pielou_evenness_ts()`.

```{r}
bootstrap_results_evenness <- bootstrap_cube(
  data_cube = processed_cube_even,
  fun = calc_evenness,
  grouping_var = "year",
  samples = 1000,
  seed = 123
)
```

We calculate the percentile, BCa, normal and basic intervals with `calculate_bootstrap_ci()`.
We get a warning message for BCa calculation because we are using a relatively small dataset.

```{r}
ci_evenness <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results_evenness,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic")
)
```

```{r}
# Make interval type factor
ci_evenness <- ci_evenness %>%
  mutate(
    year = as.numeric(year),
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r, warning=FALSE}
#| fig.alt: >
#|   Confidence intervals for evenness over time.
# Convert bootstrap replicates to dataframe
bootstrap_results_evenness_df <- boot_list_to_dataframe(
  boot_list = bootstrap_results_evenness,
  grouping_var = "year"
) %>%
  mutate(year = as.numeric(year))

# Get bias values
bias_mean_obs <- bootstrap_results_evenness_df %>%
  distinct(year, estimate = est_original, `bootstrap estimate` = est_boot)

# Get estimate values
estimate_mean_obs <- bias_mean_obs %>%
  pivot_longer(cols = c("estimate", "bootstrap estimate"),
               names_to = "Legend", values_to = "value") %>%
  mutate(Legend = factor(Legend, levels = c("estimate", "bootstrap estimate"),
                         ordered = TRUE))
# Visualise
bootstrap_results_evenness_df %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_evenness,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Evenness", x = "", shape = "Legend:", colour = "Interval type:") +
  scale_x_continuous(
    breaks = sort(unique(bootstrap_results_evenness_df$year))
  ) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

We notice that the normal and basic intervals have limits larger than 1, which is an impossible value for evenness.
This is because neither interval respects the bounds of the statistic: the normal interval is symmetric around $\hat{\theta} - \text{Bias}_{\text{boot}}$, and the basic interval reflects the bootstrap percentiles around $\hat{\theta}$.
We can use transformation functions to account for this.
The intervals are calculated on the scale of the transformation `h`, and the inverse function `hinv` is applied to the resulting limits.
For values between 0 and 1, we can use the logit function and its inverse:

```{r}
# Logit transformation
logit <- function(p) {
  log(p / (1 - p))
}

# Inverse logit transformation
inv_logit <- function(l) {
  exp(l) / (1 + exp(l))
}
```

We enter them through `calculate_bootstrap_ci()`.

```{r}
ci_evenness_trans <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results_evenness,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic"),
  h = logit,
  hinv = inv_logit
)
```

```{r}
# Make interval type factor
ci_evenness_trans <- ci_evenness_trans %>%
  mutate(
    year = as.numeric(year),
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r, warning=FALSE}
#| fig.alt: >
#|   Confidence intervals for evenness over time.
# Visualise
bootstrap_results_evenness_df %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_evenness_trans,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Evenness", x = "", shape = "Legend:", colour = "Interval type:") +
  scale_y_continuous(limits = c(NA, 1)) +
  scale_x_continuous(
    breaks = sort(unique(bootstrap_results_evenness_df$year))
  ) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

Now we see that all the intervals fall within the expected range.
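
The effect of the transformation can be sketched for the normal interval in base R. This sketch assumes that the transformation is applied to both the estimate and the replicates before the interval is calculated, as described above. The estimate `theta_hat`, the beta-distributed replicates and the helper `norm_ci()` are hypothetical and chosen only to mimic a statistic close to its upper bound.

```r
# Hypothetical estimate and bootstrap replicates close to the upper bound of 1
set.seed(42)
theta_hat <- 0.97
theta_star <- rbeta(2000, shape1 = 60, shape2 = 3)

# Hypothetical helper: normal interval from an estimate and its replicates
norm_ci <- function(est, reps, alpha = 0.05) {
  bias <- mean(reps) - est
  est - bias + c(-1, 1) * sd(reps) * qnorm(1 - alpha / 2)
}

# Interval on the original scale: the upper limit can exceed 1
ci_raw <- norm_ci(theta_hat, theta_star)

# Interval on the logit scale, back-transformed with the inverse logit
# (qlogis() and plogis() are the base R logit and inverse logit)
ci_trans <- plogis(norm_ci(qlogis(theta_hat), qlogis(theta_star)))

ci_raw
ci_trans
```

Since the inverse logit maps any real number into the interval between 0 and 1, the back-transformed limits cannot leave the valid range of the indicator.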

### Issues with bias correction for species richness indicators

Consider the calculation of observed species richness on the same subset used in the previous subsection.
We create a custom function to calculate richness:

```{r, echo=FALSE}
# nolint start: object_usage_linter.
```

```{r}
calc_richness <- function(data) {
  data %>%
    dplyr::group_by(year) %>%
    dplyr::summarise(diversity_val = n_distinct(scientificName),
                     .groups = "drop")
}
```

```{r, echo=FALSE}
# nolint end
```

We perform bootstrapping as before. We will not use any boot method (`method = "group_specific"`).

```{r}
bootstrap_results_richness <- bootstrap_cube(
  data_cube = processed_cube_even,
  fun = calc_richness,
  grouping_var = "year",
  samples = 1000,
  method = "group_specific",
  seed = 123
)
```

We calculate the percentile, BCa, normal and basic intervals with `calculate_bootstrap_ci()`.
We get a warning message for BCa calculation.
The bias correction factor $\hat{z}_0$ is infinite, so the BCa intervals cannot be calculated.

```{r}
ci_richness <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results_richness,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic"),
  data_cube = processed_cube_even,
  fun = calc_richness
)
```

We notice that none of the intervals cover the estimate. The percentile interval does not account for bias, the BCa interval cannot be calculated because the bias is too large, and the normal and basic intervals have overcompensated because of the large bootstrap bias.

```{r}
# Make interval type factor
ci_richness <- ci_richness %>%
  mutate(
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r}
#| fig.alt: >
#|   Confidence intervals for richness over time.
# Get bias values
bias_mean_obs <- bootstrap_results_richness %>%
  distinct(year, estimate = est_original, `bootstrap estimate` = est_boot)

# Get estimate values
estimate_mean_obs <- bias_mean_obs %>%
  pivot_longer(cols = c("estimate", "bootstrap estimate"),
               names_to = "Legend", values_to = "value") %>%
  mutate(Legend = factor(Legend, levels = c("estimate", "bootstrap estimate"),
                         ordered = TRUE))
# Visualise
bootstrap_results_richness %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_richness,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Observed species richness", x = "", shape = "Legend:",
       colour = "Interval type:") +
  scale_x_continuous(breaks = sort(unique(bootstrap_results_richness$year))) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

This issue arises because bootstrap resampling cannot introduce new species that were not present in the original sample (Dixon, [2001, p. 287](https://doi.org/10.1093/oso/9780195131871.003.0014)).
As a result, observed species richness, which is simply the count of unique species, can never be higher in a bootstrap replicate than in the original sample, so the bootstrap replicates are negatively biased.
This leads to an extreme mismatch between the original estimate and the distribution of bootstrap replicates.
In such cases, the BCa intervals may fail altogether (e.g., due to infinite bias correction factors), and other bootstrap intervals (normal, basic) may overcorrect.
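
The mechanism can be reproduced with a small toy example in base R. This is only a sketch under stated assumptions: the data are hypothetical (50 occurrences, each of a different species, not the cube used above), all `toy_*` names are made up for illustration, and the interval formulas are the standard textbook ones, which may differ in detail from what `calculate_bootstrap_ci()` does internally.

```{r}
# Toy data (hypothetical): 50 occurrences, each of a different species
set.seed(123)
toy_species <- paste0("sp", 1:50)
toy_richness <- function(x) length(unique(x))

# Original estimate and 1000 bootstrap replicates
toy_est <- toy_richness(toy_species)
toy_reps <- replicate(
  1000,
  toy_richness(sample(toy_species, replace = TRUE))
)

# Resampling cannot add species: no replicate exceeds the estimate
max(toy_reps) <= toy_est

# Bootstrap bias: mean of the replicates minus the original estimate
toy_bias <- mean(toy_reps) - toy_est

# BCa bias correction: normal quantile of the proportion of replicates
# below the original estimate (infinite if that proportion is 1)
toy_z0 <- qnorm(mean(toy_reps < toy_est))

# Percentile, normal and basic 95 % intervals (textbook formulas)
toy_q <- quantile(toy_reps, c(0.025, 0.975), names = FALSE)
toy_intervals <- rbind(
  perc  = toy_q,
  norm  = toy_est - toy_bias + c(-1, 1) * qnorm(0.975) * sd(toy_reps),
  basic = 2 * toy_est - rev(toy_q)
)
colnames(toy_intervals) <- c("ll", "ul")

toy_bias
toy_z0
toy_intervals
```

In this toy setting the percentile interval lies entirely below the original estimate, while the normal and basic intervals, which both subtract the negative bias, lie entirely above it.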

There is an option within `calculate_bootstrap_ci()` to center the confidence limits around the original estimate (`no_bias = TRUE`).
This means the confidence limits are still derived from the bootstrap distribution, but no correction is made for the bootstrap bias.
While it may "solve" technical problems with interval calculation (like infinite or undefined corrections), it does so at the cost of ignoring bootstrap bias.
This approach should only be used with caution and clear justification, such as when the bootstrap bias is known to be an artifact of a sampling limitation and not of the underlying data structure.
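
To make the idea of centering concrete, the sketch below shows one possible reading of it for a percentile interval: the bias is subtracted from the replicates before the quantiles are taken. This is an assumption made for illustration, not a description of how `no_bias = TRUE` is implemented in `calculate_bootstrap_ci()`. The replicates are simulated and all `demo_*` and `perc_*` names are hypothetical.

```{r}
set.seed(123)

# Hypothetical original estimate and simulated bootstrap replicates
# that are centred well below it
demo_est <- 50
demo_reps <- rbinom(1000, size = 50, prob = 0.64)
demo_bias <- mean(demo_reps) - demo_est

# Percentile interval on the raw replicates
perc_raw <- quantile(demo_reps, c(0.025, 0.975), names = FALSE)

# Percentile interval after shifting the replicates by the bias,
# so that their mean coincides with the original estimate
perc_centered <- quantile(
  demo_reps - demo_bias, c(0.025, 0.975),
  names = FALSE
)

perc_raw
perc_centered
```

The shifted interval keeps the width of the bootstrap distribution but covers the original estimate by construction, which is exactly why it says nothing about the bias itself.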

Because of this inherent limitation, alternative richness estimators that account for undetected species are preferred when uncertainty quantification is needed.
The **vegan** (Oksanen et al., [2024](https://cran.r-project.org/web/packages/vegan/index.html)) and **iNEXT** (Hsieh et al., [2016](https://doi.org/10.1111/2041-210X.12613)) R packages provide such estimators, including Chao, Jackknife, and coverage-based rarefaction/extrapolation, all of which are designed to handle unseen species and provide meaningful uncertainty estimates.
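
As a minimal illustration of how such an estimator accounts for unseen species, the bias-corrected Chao1 estimator can be computed by hand in base R. The abundance vector below is hypothetical, the `toy_abundance`, `s_obs`, `f1`, `f2` and `chao1` names are made up for this sketch, and in practice the packages mentioned above should be preferred over a hand-rolled version.

```{r}
# Hypothetical abundance counts per species (not from the cube above)
toy_abundance <- c(10, 6, 4, 3, 2, 2, 1, 1, 1, 1)

s_obs <- sum(toy_abundance > 0)  # observed richness: 10
f1 <- sum(toy_abundance == 1)    # singletons: 4
f2 <- sum(toy_abundance == 2)    # doubletons: 2

# Bias-corrected Chao1: observed richness plus an estimate of the
# number of undetected species based on the rare species counts
chao1 <- s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
chao1
```

With many singletons relative to doubletons, the estimate exceeds the observed richness, which reflects the species that were probably missed.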

Some of these estimators are also implemented directly for occurrence cubes in recent versions of **b3gbi** (≥ v0.4.0), offering integration into existing cube-based workflows.
However, it is important to note that these are alternative estimators: they are not equivalent to observed richness and will yield different values by design.

```{r}
ci_richness_no_bias <- calculate_bootstrap_ci(
  bootstrap_results = bootstrap_results_richness,
  grouping_var = "year",
  type = c("perc", "bca", "norm", "basic"),
  no_bias = TRUE,
  data_cube = processed_cube_even,
  fun = calc_richness
)
```

Indeed, with `no_bias = TRUE` the intervals are now centered around the original estimate.

```{r}
# Make interval type factor
ci_richness_no_bias <- ci_richness_no_bias %>%
  mutate(
    int_type = factor(
      int_type, levels = c("perc", "bca", "norm", "basic")
    )
  )
```

```{r}
#| fig.alt: >
#|   Confidence intervals for richness over time, centered around the estimate.
# Visualise
bootstrap_results_richness %>%
  ggplot(aes(x = year)) +
  # Distribution
  geom_violin(aes(y = rep_boot, group = year),
              fill = alpha("cornflowerblue", 0.2)) +
  # Estimates and bias
  geom_point(data = estimate_mean_obs, aes(y = value, shape = Legend),
             colour = "firebrick", size = 2, alpha = 0.5) +
  # Intervals
  geom_errorbar(data = ci_richness_no_bias,
                aes(ymin = ll, ymax = ul, colour = int_type),
                position = position_dodge(0.8), linewidth = 0.8) +
  # Settings
  labs(y = "Observed species richness", x = "", shape = "Legend:",
       colour = "Interval type:") +
  scale_x_continuous(breaks = sort(unique(bootstrap_results_richness$year))) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_text(face = "bold"))
```

## References
<!-- spell-check: ignore:start -->
Canty, A., & Ripley, B. (1999). boot: Bootstrap Functions (Originally by Angelo Canty for S) [Computer software]. https://CRAN.R-project.org/package=boot

Davison, A. C., & Hinkley, D. V. (1997). *Bootstrap Methods and their Application* (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511802843

DiCiccio, T. J., & Efron, B. (1996). Bootstrap confidence intervals. Statistical Science, 11(3). https://doi.org/10.1214/ss/1032280214

Dixon, P. M. (2001). The Bootstrap and the Jackknife: Describing the Precision of Ecological Indices. In S. M. Scheiner & J. Gurevitch (Eds.), Design and Analysis of Ecological Experiments (Second Edition, pp. 267–288). Oxford University Press. https://doi.org/10.1093/oso/9780195131871.003.0014

Efron, B. (1987). Better Bootstrap Confidence Intervals. Journal of the American Statistical Association, 82(397), 171–185. https://doi.org/10.1080/01621459.1987.10478410

Frangos, C. C., & Schucany, W. R. (1990). Jackknife estimation of the bootstrap acceleration constant. Computational Statistics & Data Analysis, 9(3), 271–281. https://doi.org/10.1016/0167-9473(90)90109-U
<!-- spell-check: ignore:end -->