---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
options(scipen = 100)
```
# mice.mcerror
<!-- badges: start -->
[![Codecov test coverage](https://codecov.io/gh/ellessenne/mice.mcerror/branch/main/graph/badge.svg)](https://app.codecov.io/gh/ellessenne/mice.mcerror?branch=main)
[![R-CMD-check](https://github.com/ellessenne/mice.mcerror/workflows/R-CMD-check/badge.svg)](https://github.com/ellessenne/mice.mcerror/actions)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->
The {mice.mcerror} package is an add-on package to {[mice](https://CRAN.R-project.org/package=mice)} that can be used to calculate Monte Carlo error estimates for statistics computed using multiply imputed data.
The Monte Carlo error estimate of a multiple imputation (MI) statistic is defined as the standard error of the mean of the pseudo-values for that statistic, computed by omitting one imputation at a time (i.e., using the jackknife).
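To make the definition concrete, here is a minimal base R sketch of the jackknife calculation. It is an illustration only, not necessarily how {mice.mcerror} implements it: `jackknife_mcerror()` and the values in `q` are hypothetical, and the MI statistic is assumed to be the mean of the per-imputation estimates (as is the case for a pooled coefficient).

```r
# Jackknife Monte Carlo error of an MI statistic (illustrative sketch).
# `estimates`: hypothetical per-imputation values, one per imputed dataset.
# `stat`: function mapping a set of per-imputation values to the MI statistic.
jackknife_mcerror <- function(estimates, stat = mean) {
  M <- length(estimates)
  full <- stat(estimates)
  # Recompute the statistic omitting one imputation at a time
  leave_one_out <- vapply(
    seq_len(M),
    function(i) stat(estimates[-i]),
    numeric(1)
  )
  # Jackknife pseudo-values
  pseudo <- M * full - (M - 1) * leave_one_out
  # Standard error of the mean of the pseudo-values
  sd(pseudo) / sqrt(M)
}

# Hypothetical coefficient estimates from M = 5 imputed datasets
q <- c(1.21, 1.18, 1.20, 1.19, 1.22)
mce <- jackknife_mcerror(q)
```

When the statistic is a simple mean, each pseudo-value equals the corresponding per-imputation estimate, so the result reduces to `sd(q) / sqrt(length(q))`. The jackknife becomes useful for statistics that are not simple means, such as standard errors or confidence limits.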
> _Warning:_ this package is still experimental, so please test it out and find where it breaks!
## Installation
You can install the development version of {mice.mcerror} from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("ellessenne/mice.mcerror")
```
## Workflow
We replicate the following example from Stata:
```stata
. use http://www.stata-press.com/data/r14/mheart1s20
. mi estimate, dots mcerror: logit attack smokes age bmi hsgrad female
Imputations (20):
  .........10.........20 done

Multiple-imputation estimates                   Imputations       =         20
Logistic regression                             Number of obs     =        154
                                                Average RVI       =     0.0312
                                                Largest FMI       =     0.1355
DF adjustment:   Large sample                   DF:     min       =   1,060.38
                                                        avg       = 223,362.56
                                                        max       = 493,335.88
Model F test:       Equal FMI                   F(   5,71379.3)   =       3.59
Within VCE type:          OIM                   Prob > F          =     0.0030

------------------------------------------------------------------------------
      attack |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      smokes |   1.198595   .3578195     3.35   0.001     .4972789    1.899911
             |   .0068541   .0008562     0.01   0.000     .0056572    .0082212
             |
         age |   .0360159   .0154399     2.33   0.020     .0057541    .0662776
             |   .0002654   .0000351     0.01   0.001     .0002319    .0003108
             |
         bmi |   .1039416   .0476136     2.18   0.029      .010514    .1973692
             |   .0038014   .0008904     0.09   0.006     .0039928    .0044049
             |
      hsgrad |   .1578992   .4049257     0.39   0.697    -.6357464    .9515449
             |   .0091517   .0010209     0.02   0.016     .0086215    .0100602
             |
      female |  -.1067433   .4164735    -0.26   0.798    -.9230191    .7095326
             |   .0077566   .0009279     0.02   0.015      .006985    .0088408
             |
       _cons |  -5.478143   1.685075    -3.25   0.001    -8.782394   -2.173892
             |   .1079841   .0248274     0.07   0.000     .1310618    .1050817
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.
```
Using {mice.mcerror}, it is possible to replicate these results in R.
First, let's fit all models as you would do using {mice}:
```{r}
library(mice)
library(mice.mcerror)
data("mheart1s20.mice")
fit <- with(
  data = mheart1s20.mice,
  expr = glm(
    formula = attack ~ smokes + age + bmi + hsgrad + female,
    family = binomial(link = "logit")
  )
)
```
The pooled estimates can be obtained by using the `summary()` and `pool()` functions:
```{r}
summary(pool(fit), conf.int = TRUE)
```
Analogously, Monte Carlo errors can be computed using the `summary()` and `mcerror()` functions:
```{r}
summary(mcerror(fit, conf.int = TRUE))
```
Note that `conf.int = TRUE` must be passed to the inner function, `mcerror()`, rather than to `summary()`: the confidence intervals have to be computed for each jackknife replicate before their Monte Carlo errors can be estimated.
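The following base R sketch illustrates why the confidence limits are needed for every jackknife replicate. It is not the package's implementation: `pooled_lower()` and the values in `q` and `u` are hypothetical, and the large-sample degrees of freedom are assumed, as in the Stata output above.

```r
# Hypothetical per-imputation coefficient estimates and squared standard errors
q <- c(1.21, 1.18, 1.20, 1.19, 1.22, 1.17)
u <- c(0.128, 0.127, 0.129, 0.126, 0.128, 0.127)

# Pool with Rubin's rules and return the lower confidence limit,
# using the large-sample degrees of freedom (an assumption of this sketch)
pooled_lower <- function(q, u, level = 0.95) {
  m <- length(q)
  b <- var(q)                          # between-imputation variance
  w <- mean(u)                         # within-imputation variance
  total <- w + (1 + 1 / m) * b         # total variance
  r <- (1 + 1 / m) * b / w             # relative variance increase
  df <- (m - 1) * (1 + 1 / r)^2        # large-sample degrees of freedom
  mean(q) - qt(1 - (1 - level) / 2, df) * sqrt(total)
}

M <- length(q)
full <- pooled_lower(q, u)
# The limit must be recomputed for each leave-one-out set of imputations
loo <- vapply(
  seq_len(M),
  function(i) pooled_lower(q[-i], u[-i]),
  numeric(1)
)
pseudo <- M * full - (M - 1) * loo
mce_lower <- sd(pseudo) / sqrt(M)
```

Because the confidence limit is a nonlinear function of the per-imputation results, its Monte Carlo error cannot be derived from the pooled interval alone. Each replicate needs its own interval.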
Calling just `pool()` and `mcerror()` returns a larger set of imputation statistics:
```{r}
pool(fit)
mcerror(fit, conf.int = TRUE)
```
Compare this with results from Stata (displayed above) and see that they _should_ be the same!
Additional statistics can be obtained in Stata as follows:
```stata
. mi estimate, vartable mcerror nocitable
Multiple-imputation estimates                   Imputations       =         20
Logistic regression

Variance information
------------------------------------------------------------------------------
             |        Imputation variance                             Relative
             |    Within   Between     Total       RVI       FMI    efficiency
-------------+----------------------------------------------------------------
      smokes |   .127048    .00094   .128035   .007765   .007711       .999615
             |   .000559   .000211   .000613   .001744    .00172        .00009
             |
         age |   .000237   1.4e-06   .000238   .006245    .00621        .99969
             |   8.6e-07   4.6e-07   1.1e-06   .002054   .002033       .000107
             |
         bmi |   .001964   .000289   .002267   .154545   .135487       .993271
             |   .000026   .000077   .000085    .04134   .031986        .00166
             |
      hsgrad |   .162206   .001675   .163965   .010843   .010739       .999463
             |   .000521   .000552   .000827   .003579   .003516       .000185
             |
      female |   .172187   .001203    .17345   .007338    .00729       .999636
             |   .000614   .000297   .000773   .001811   .001788       .000094
             |
       _cons |    2.5946   .233211   2.83948   .094377   .086953       .995671
             |   .029651   .070081   .083436   .028332   .024216       .001263
------------------------------------------------------------------------------
Note: Values displayed beneath estimates are Monte Carlo error estimates.
```
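The columns of this table are linked by Rubin's rules, which gives a quick sanity check on the reported values. The sketch below copies three rows from the Stata output above (as displayed, hence rounded) and recomputes the total variance, the relative variance increase (RVI), and the relative efficiency. The data frame `vartable` and the `*_check` names are introduced here for illustration only.

```r
# Values copied from the Stata variance table above (rounded as displayed)
vartable <- data.frame(
  term    = c("smokes", "bmi", "_cons"),
  within  = c(0.127048, 0.001964, 2.5946),
  between = c(0.00094, 0.000289, 0.233211),
  total   = c(0.128035, 0.002267, 2.83948),
  rvi     = c(0.007765, 0.154545, 0.094377),
  fmi     = c(0.007711, 0.135487, 0.086953),
  re      = c(0.999615, 0.993271, 0.995671)
)
M <- 20 # number of imputations

# Total variance: within + (1 + 1/M) * between
total_check <- vartable$within + (1 + 1 / M) * vartable$between
# Relative variance increase: (1 + 1/M) * between / within
rvi_check <- (1 + 1 / M) * vartable$between / vartable$within
# Relative efficiency: 1 / (1 + FMI / M)
re_check <- 1 / (1 + vartable$fmi / M)

# Side-by-side comparison with the reported values
cbind(vartable[, c("term", "total", "rvi", "re")], total_check, rvi_check, re_check)
```

Small discrepancies are expected, because the inputs are rounded to the precision shown by Stata.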