README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
options(scipen = 100)
```

# mice.mcerror

<!-- badges: start -->
[![Codecov test coverage](https://codecov.io/gh/ellessenne/mice.mcerror/branch/main/graph/badge.svg)](https://app.codecov.io/gh/ellessenne/mice.mcerror?branch=main)
[![R-CMD-check](https://github.com/ellessenne/mice.mcerror/workflows/R-CMD-check/badge.svg)](https://github.com/ellessenne/mice.mcerror/actions)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->

The {mice.mcerror} package is an add-on package to {[mice](https://CRAN.R-project.org/package=mice)} that can be used to calculate Monte Carlo error estimates for statistics computed using multiply imputed data.

Note that the Monte Carlo error estimate of an MI statistic is defined as the standard error of the mean of the pseudo-values for that statistic, computed by omitting one
imputation at a time (i.e., using the jackknife).

> _Warning:_ this package is still experimental, so please test it out and find where it breaks!

## Installation

You can install the development version of {mice.mcerror} from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("ellessenne/mice.mcerror")
```

# Workflow

We replicate the following example from Stata:

```stata
. use http://www.stata-press.com/data/r14/mheart1s20

. mi estimate, dots mcerror: logit attack smokes age bmi hsgrad female

Imputations (20):
  .........10.........20 done

Multiple-imputation estimates                   Imputations       =         20
Logistic regression                             Number of obs     =        154
                                                Average RVI       =     0.0312
                                                Largest FMI       =     0.1355
DF adjustment:   Large sample                   DF:     min       =   1,060.38
                                                        avg       = 223,362.56
                                                        max       = 493,335.88
Model F test:       Equal FMI                   F(   5,71379.3)   =       3.59
Within VCE type:          OIM                   Prob > F          =     0.0030

------------------------------------------------------------------------------
      attack |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      smokes |   1.198595   .3578195     3.35   0.001     .4972789    1.899911
             |   .0068541   .0008562     0.01   0.000     .0056572    .0082212
             |
         age |   .0360159   .0154399     2.33   0.020     .0057541    .0662776
             |   .0002654   .0000351     0.01   0.001     .0002319    .0003108
             |
         bmi |   .1039416   .0476136     2.18   0.029      .010514    .1973692
             |   .0038014   .0008904     0.09   0.006     .0039928    .0044049
             |
      hsgrad |   .1578992   .4049257     0.39   0.697    -.6357464    .9515449
             |   .0091517   .0010209     0.02   0.016     .0086215    .0100602
             |
      female |  -.1067433   .4164735    -0.26   0.798    -.9230191    .7095326
             |   .0077566   .0009279     0.02   0.015      .006985    .0088408
             |
       _cons |  -5.478143   1.685075    -3.25   0.001    -8.782394   -2.173892
             |   .1079841   .0248274     0.07   0.000     .1310618    .1050817
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.
```

Using the {mice.mcerror} it is possible to replicate this.
First, let's fit all models as you would do using {mice}:

```{r}
library(mice)
library(mice.mcerror)

data("mheart1s20.mice")
fit <- with(
  data = mheart1s20.mice,
  expr = glm(
    formula = attack ~ smokes + age + bmi + hsgrad + female,
    family = binomial(link = "logit")
  )
)
```

The pooled estimates can be obtained by using the `summary()` and `pool()` functions:

```{r}
summary(pool(fit), conf.int = TRUE)
```

Analogously, Monte Carlo errors can be computed using the `summary()` and `mcerror()` functions:

```{r}
summary(mcerror(fit, conf.int = TRUE))
```

Please note that with `mcerror()` we need to use the argument `conf.int = TRUE` in the inner function, as we need to compute confidence intervals for each jackknife replicate.

Calling just `pool()` and `mcerror()` returns a larger set of imputation statistics:

```{r}
pool(fit)
mcerror(fit, conf.int = TRUE)
```

Compare this with results from Stata (displayed above) and see that they _should_ be the same!

Additional statistics can be obtained in Stata as follows:

```stata
. mi estimate, vartable mcerror nocitable

Multiple-imputation estimates                   Imputations       =         20
Logistic regression

Variance information
------------------------------------------------------------------------------
             |        Imputation variance                             Relative
             |    Within   Between     Total       RVI       FMI    efficiency
-------------+----------------------------------------------------------------
      smokes |   .127048    .00094   .128035   .007765   .007711       .999615
             |   .000559   .000211   .000613   .001744    .00172        .00009
             |                                                  
         age |   .000237   1.4e-06   .000238   .006245    .00621        .99969
             |   8.6e-07   4.6e-07   1.1e-06   .002054   .002033       .000107
             |                                                  
         bmi |   .001964   .000289   .002267   .154545   .135487       .993271
             |   .000026   .000077   .000085    .04134   .031986        .00166
             |                                                  
      hsgrad |   .162206   .001675   .163965   .010843   .010739       .999463
             |   .000521   .000552   .000827   .003579   .003516       .000185
             |                                                  
      female |   .172187   .001203    .17345   .007338    .00729       .999636
             |   .000614   .000297   .000773   .001811   .001788       .000094
             |                                                  
       _cons |    2.5946   .233211   2.83948   .094377   .086953       .995671
             |   .029651   .070081   .083436   .028332   .024216       .001263
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.
```