---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# causatr
<!-- badges: start -->
[R-CMD-check](https://github.com/etverse/causatr/actions/workflows/R-CMD-check.yaml)
[Codecov test coverage](https://app.codecov.io/gh/etverse/causatr)
<!-- badges: end -->
**causatr** provides a unified interface for causal effect estimation via five
complementary methods: g-computation (parametric g-formula + ICE), inverse
probability weighting (IPW with a self-contained density-ratio engine),
augmented IPW (AIPW — doubly robust), structural nested mean models (SNM —
g-estimation for time-varying effect modification), and propensity score
matching (via [MatchIt](https://kosukeimai.github.io/MatchIt/)).

When several methods that rest on different modelling assumptions agree, you
can be more confident in your findings. This is called
**methodological triangulation**.

The package implements the methods described in Hernán & Robins (2025)
*Causal Inference: What If* with a simple two-step API:
1. **Fit** the causal model with `causat()`
2. **Contrast** interventions with `contrast()`
## Installation
Install the development version from GitHub:
```r
# install.packages("pak")
pak::pak("etverse/causatr")
```
## Quick example
Estimate the average causal effect of quitting smoking on weight gain
using the NHEFS dataset from Hernán & Robins (2025):
```{r example}
library(causatr)
data("nhefs")
# Step 1: Fit the outcome model via g-computation
fit <- causat(
nhefs,
outcome = "wt82_71",
treatment = "qsmk",
confounders = ~ sex + age + I(age^2) + race + factor(education) +
smokeintensity + I(smokeintensity^2) + smokeyrs + I(smokeyrs^2) +
factor(exercise) + factor(active) + wt71 + I(wt71^2) +
qsmk:smokeintensity,
censoring = "censored"
)
# Step 2: Contrast interventions
result <- contrast(
fit,
interventions = list(quit = static(1), continue = static(0)),
reference = "continue"
)
result
```
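
For intuition, the g-computation idea behind the two steps above can be sketched in base R. This is a minimal illustration on simulated data, not causatr's implementation: the data-generating model, the variable names (`L`, `A`, `Y`), and the true effect of 2 are all assumptions made for the example.

```r
# Simulated data (hypothetical): one confounder L, binary treatment A, outcome Y
set.seed(1)
n <- 2000
L <- rnorm(n)
A <- rbinom(n, 1, plogis(0.5 * L))        # treatment probability depends on L
Y <- 1 + 2 * A + 1.5 * L + rnorm(n)       # true effect of A on Y is 2
sim <- data.frame(L = L, A = A, Y = Y)

# 1. Fit an outcome model for E[Y | A, L]
outcome_model <- glm(Y ~ A + L, data = sim, family = gaussian())

# 2. Predict each unit's outcome with treatment set to 1, then to 0
mu1 <- predict(outcome_model, newdata = transform(sim, A = 1), type = "response")
mu0 <- predict(outcome_model, newdata = transform(sim, A = 0), type = "response")

# 3. Average the predictions and contrast them
ate_gcomp <- mean(mu1) - mean(mu0)

# Unadjusted difference in means, confounded by L
naive <- mean(sim$Y[sim$A == 1]) - mean(sim$Y[sim$A == 0])

c(gcomp = ate_gcomp, naive = naive)
```

The standardised estimate should land close to the simulated effect, while the unadjusted difference is pulled away from it by confounding through `L`.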
## Methodological triangulation
Compare g-computation, IPW, AIPW, and matching on the same data:
```{r triangulation}
conf <- ~ sex + age + race + smokeintensity + smokeyrs +
factor(exercise) + factor(active) + wt71
# G-computation (outcome model)
fit_gc <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
confounders = conf, censoring = "censored")
# IPW (treatment model)
fit_ipw <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
confounders = conf, estimator = "ipw")
# AIPW (doubly robust)
fit_aipw <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
confounders = conf, estimator = "aipw", censoring = "censored")
# Matching (propensity score)
fit_m <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
confounders = conf, estimator = "matching", estimand = "ATT")
# All four estimates
intv <- list(quit = static(1), cont = static(0))
rbind(
data.frame(estimator = "gcomp", contrast(fit_gc,
intv, reference = "cont")$contrasts),
data.frame(estimator = "ipw", contrast(fit_ipw,
intv, reference = "cont")$contrasts),
data.frame(estimator = "aipw", contrast(fit_aipw,
intv, reference = "cont")$contrasts),
data.frame(estimator = "matching", contrast(fit_m,
intv, reference = "cont")$contrasts)
)
```
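
To see why these estimators can be compared, here is a hand-rolled base R sketch of IPW and AIPW next to g-computation. It runs on simulated data with a known effect of 2 and is only an illustration of the textbook estimators under those assumptions. It does not reproduce causatr's density-ratio engine or its variance estimation.

```r
# Simulated data (hypothetical): one confounder L, binary treatment A, outcome Y
set.seed(2)
n <- 5000
L <- rnorm(n)
A <- rbinom(n, 1, plogis(0.5 * L))
Y <- 1 + 2 * A + 1.5 * L + rnorm(n)       # true effect of A on Y is 2
sim <- data.frame(L = L, A = A, Y = Y)

# Treatment model: estimated propensity scores P(A = 1 | L)
ps <- fitted(glm(A ~ L, data = sim, family = binomial()))

# Outcome model: predictions with treatment set to 1 and to 0
om <- glm(Y ~ A + L, data = sim)
mu1 <- predict(om, newdata = transform(sim, A = 1))
mu0 <- predict(om, newdata = transform(sim, A = 0))

# G-computation: relies on the outcome model only
ate_gcomp <- mean(mu1 - mu0)

# IPW with normalised (Hajek) weights: relies on the treatment model only
w <- ifelse(sim$A == 1, 1 / ps, 1 / (1 - ps))
ate_ipw <- weighted.mean(sim$Y[sim$A == 1], w[sim$A == 1]) -
  weighted.mean(sim$Y[sim$A == 0], w[sim$A == 0])

# AIPW: outcome predictions plus an inverse-probability-weighted residual term
ate_aipw <- mean(mu1 + sim$A * (sim$Y - mu1) / ps) -
  mean(mu0 + (1 - sim$A) * (sim$Y - mu0) / (1 - ps))

round(c(gcomp = ate_gcomp, ipw = ate_ipw, aipw = ate_aipw), 2)
```

Because both working models are correctly specified in this simulation, the three estimates should be close to each other. Disagreement on real data is a prompt to re-examine the models.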
## Intervention types
Beyond static interventions, causatr supports modified treatment policies (MTPs)
and stochastic interventions:
```{r interventions}
fit_cont <- causat(nhefs, outcome = "wt82_71",
treatment = "smokeintensity",
confounders = ~ sex + age + race + wt71,
censoring = "censored")
contrast(fit_cont,
interventions = list(
reduce10 = shift(-10),
halved = scale_by(0.5),
cap20 = threshold(0, 20),
observed = NULL
),
reference = "observed"
)
```
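
A modified treatment policy is a function that maps each unit's observed treatment to a new value. The base R sketch below shows what the four policies above could look like as plain functions, plugged into a g-computation step. It assumes that `shift(-10)` subtracts 10, `scale_by(0.5)` halves, and `threshold(0, 20)` clamps to the interval [0, 20]. Those readings, the simulated data, and the helper names are assumptions for illustration, not causatr internals.

```r
# Simulated data (hypothetical): continuous exposure A, confounder L, outcome Y
set.seed(3)
n <- 2000
L <- rnorm(n)
A <- pmax(0, round(20 + 5 * L + rnorm(n, sd = 8)))   # e.g. units per day
Y <- 2 - 0.1 * A + L + rnorm(n)
sim <- data.frame(L = L, A = A, Y = Y)

# Each policy maps the observed treatment to the treatment under intervention
policies <- list(
  reduce10 = function(a) a - 10,              # assumed meaning of shift(-10); can go below 0
  halved   = function(a) a * 0.5,             # assumed meaning of scale_by(0.5)
  cap20    = function(a) pmin(pmax(a, 0), 20), # assumed meaning of threshold(0, 20)
  observed = function(a) a                    # natural course: leave treatment as is
)

# Outcome model, then the mean predicted outcome under each policy
om <- glm(Y ~ A + L, data = sim)
policy_means <- sapply(policies, function(d) {
  mean(predict(om, newdata = transform(sim, A = d(A))))
})

# Contrast every policy against the observed treatment
policy_contrasts <- policy_means - policy_means["observed"]
round(policy_contrasts, 2)
```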
## Diagnostics
Check covariate balance and positivity after fitting:
```{r diagnostics, eval = FALSE}
diag <- diagnose(fit_ipw)
diag # positivity + balance summary
plot(diag) # Love plot (requires cobalt)
```
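
As a rough picture of what a balance check measures, the base R sketch below computes a standardised mean difference (SMD) for one covariate before and after inverse probability weighting. The simulated data and the `smd()` helper are hypothetical. This is not the output format of `diagnose()`, and cobalt offers several standardisation options beyond the one used here.

```r
# Simulated data (hypothetical): confounder L strongly predicts treatment A
set.seed(4)
n <- 5000
L <- rnorm(n)
A <- rbinom(n, 1, plogis(0.8 * L))

# Propensity scores and inverse probability weights
ps <- fitted(glm(A ~ L, family = binomial()))
w <- ifelse(A == 1, 1 / ps, 1 / (1 - ps))

# Standardised mean difference between treatment groups, optionally weighted.
# The denominator is the pooled unweighted SD, so both versions share a scale.
smd <- function(x, a, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[a == 1], w[a == 1])
  m0 <- weighted.mean(x[a == 0], w[a == 0])
  s <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
  (m1 - m0) / s
}

smd_raw <- smd(L, A)          # imbalance in the observed sample
smd_weighted <- smd(L, A, w)  # imbalance in the weighted pseudo-population

# Crude positivity check: propensity scores close to 0 or 1 signal trouble
ps_range <- range(ps)

round(c(raw = smd_raw, weighted = smd_weighted), 3)
```

A Love plot displays these before and after values for every covariate at once.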
## Features
- **Five estimation methods**: g-computation (parametric g-formula),
IPW (self-contained density-ratio engine — no runtime dependency on
WeightIt), AIPW (doubly robust — consistent if either outcome or
treatment model is correct), SNM (structural nested mean models —
g-estimation for time-varying effect modification via blip
parameters), and matching (via
[MatchIt](https://kosukeimai.github.io/MatchIt/)). Matching is
binary-only; continuous, categorical, count, and multivariate
treatments use g-comp, IPW, AIPW, or SNM.
- **Longitudinal support**: ICE g-computation (Zivich et al. 2024),
longitudinal IPW, longitudinal AIPW, and longitudinal SNM
(backward-sequential g-estimation) for time-varying treatments.
Sandwich variance via stacked estimating equations, plus parallel
bootstrap via `boot::boot()` (with optional
[future](https://future.futureverse.org/) backend).
- **Flexible interventions**: `static()`, `shift()`, `scale_by()`,
`threshold()`, `dynamic()`, `ipsi()` (incremental propensity score),
and `stochastic()` (user-defined randomised rules with Monte Carlo
integration). Which interventions are available depends on the
estimator — see the
[interventions vignette](https://etverse.github.io/causatr/articles/interventions.html).
- **Treatment types**: binary, continuous, categorical (k > 2), count
(Poisson / negative binomial propensity via `propensity_family =`),
and multivariate (joint) treatments. Multivariate IPW uses sequential
MTP factorisation (Díaz et al. 2023) with optional stabilised weights
(`stabilize = "marginal"`).
- **Any outcome family**: gaussian, binomial (logit / probit /
cloglog), Poisson, quasibinomial (fractional), Gamma, negative
binomial (`MASS::glm.nb`), beta regression (`betareg::betareg`), plus
any family you pass through `model_fn`.
- **Pluggable models**: `stats::glm`, `mgcv::gam`, splines via `ns()`
/ `bs()`, or any fit function with signature `(formula, data, family,
weights, ...)`. A two-tier numeric-variance fallback handles model
classes without a `sandwich::estfun` method.
- **Robust inference**: analytic sandwich SE (default, via a unified
influence-function engine) or nonparametric bootstrap with percentile
CIs. Cluster-robust sandwich via `cluster =`; survey designs
(`survey::svydesign`) auto-extract weights and PSU.
- **Built-in IPCW**: for MAR outcome censoring, `ipcw = TRUE` fits an
internal censoring model and computes stabilised IPCW weights —
provides doubly-robust protection under g-comp and is essential for
IPW under MAR censoring. Custom censoring models via
`censoring_model_fn =`.
- **Contrast types**: risk difference, risk ratio, and odds ratio. CIs
  for the risk ratio and odds ratio are computed on the log scale.
- **Estimands**: ATE, ATT, ATC, or custom subgroups via `subset =` /
`by =`.
- **Effect modification**: `by =` in `contrast()` for subgroup-specific
effects. Under IPW and matching the modifier must be a baseline
variable.
- **Transportability / generalizability**: transport causal estimates
from a study sample to a target population with `target =`. causatr
fits a sampling model P(S=1|L) and reweights (g-comp, IPW) or
augments (AIPW) the estimator to recover the target-population
estimand. Diagnostics include sampling-score overlap and weight
summaries.
- **Built-in diagnostics**: positivity checks, covariate balance via
[cobalt](https://ngreifer.github.io/cobalt/), weight summaries,
censoring model diagnostics, sampling model diagnostics, Love plots.
- **Tidy integration**: `tidy()` / `glance()` / `confint()` / `coef()`
/ `vcov()` / `plot()` (forest plot via
[forrest](https://github.com/etverse/forrest)) / broom-compatible
output.
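
The log-scale intervals mentioned under contrast types can be illustrated with a short delta-method sketch in base R. This is one common construction, shown under assumptions. The `ratio_ci()` helper and the input numbers are hypothetical, and the sketch is not taken from causatr's source.

```r
# Confidence interval for a risk ratio p1 / p0, built on the log scale.
# se1 and se0 are the standard errors of the two risks. cov10 is their
# covariance (0 if the estimates are treated as independent).
ratio_ci <- function(p1, p0, se1, se0, cov10 = 0, level = 0.95) {
  log_rr <- log(p1 / p0)
  # Delta method: Var(log p1 - log p0)
  se_log <- sqrt(se1^2 / p1^2 + se0^2 / p0^2 - 2 * cov10 / (p1 * p0))
  z <- qnorm(1 - (1 - level) / 2)
  c(
    estimate = exp(log_rr),
    lower = exp(log_rr - z * se_log),
    upper = exp(log_rr + z * se_log)
  )
}

# Hypothetical inputs: risks of 0.30 and 0.20 with their standard errors
ci <- ratio_ci(p1 = 0.30, p0 = 0.20, se1 = 0.02, se0 = 0.015)
ci
```

Working on the log scale keeps the lower bound positive and makes the interval asymmetric around the ratio, which suits a quantity bounded below by zero.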
## References
Hernán MA, Robins JM (2025). *Causal Inference: What If*. Chapman & Hall/CRC.
## Acknowledgements
This package was built with contributions from [Claude](https://claude.ai),
Anthropic's AI assistant.