Status: COMPLETE — all chunks 16a–16q shipped.
Depends on: Phase 2 (point gcomp), Phase 4 (self-contained IPW), Phase 5 (ICE, for longitudinal), Phase 10 (longitudinal IPW, for longitudinal), Phase 14 (for IPCW composition)
Composes with (planned): Phases 8, 10, 12, 14 via dedicated "AIPW composition" subsections.
causatr owns two estimation engines today — gcomp (outcome model) and IPW (self-contained density-ratio engine). Each is consistent under one modelling assumption: gcomp under a correct outcome model, IPW under a correct treatment model. Augmented IPW (Robins, Rotnitzky, Zhao 1994; Scharfstein, Rotnitzky, Robins 1999) is the natural composition of the two: it plugs in the outcome model for standardization and the IPW weights for a residual correction, giving an estimator that is consistent if either nuisance is correct ("doubly robust"), and efficient — in the semiparametric sense — when both are correct.
AIPW is the workhorse doubly-robust estimator in applied epi / biostats, distinct from TMLE (Targeted Maximum Likelihood) and SDR (Sequentially Doubly Robust) which are lmtp's territory. The difference is that AIPW uses parametric nuisance models with analytical sandwich variance, whereas TMLE/SDR are designed to accommodate ML nuisances via cross-fitting. AIPW is what you reach for when:
- You want both-ways protection against misspecification but don't want the complexity of cross-fitting.
- You want a principled efficiency gain over plain gcomp or plain IPW.
- You already have a credible outcome model and a credible propensity model — why not use both?
causatr is the right home for AIPW because (a) the gcomp and IPW engines already exist and are unit-tested, (b) the sandwich-variance machinery already handles stacked estimating-equation systems, (c) it fits the package's "GLMs/GAMs with analytical inference, no ML" philosophy, and (d) the triangulation story (gcomp vs IPW vs AIPW agreement) is scientifically informative.
- Point AIPW for static binary / continuous shift / continuous scale_by / binary dynamic / IPSI — the full set of interventions the self-contained IPW engine supports.
- Estimands: ATE (primary), ATT and ATC (static binary via the same Bayes-rule numerator trick used by IPW).
- Sandwich variance via a stacked estimating-equation system with three blocks (outcome model, propensity model, AIPW plug-in).
- Bootstrap variance (natural — one refit of both nuisance models per replicate).
- Longitudinal AIPW via ICE-AIPW (Bang & Robins 2005) — depends on Phases 5 + 10.
- DR property tests: deliberately misspecify one nuisance, verify consistency against analytical truth.
- Efficiency tests: AIPW SE ≤ gcomp SE and AIPW SE ≤ IPW SE when both nuisances are correct.
- External cross-check against
delicatessen(Python) on a shared DGP, sincedelicatessenimplements AIPW analytically with a similar stacked-EE sandwich.
-
TMLE / SDR with cross-fitting.
lmtpalready covers this; different design problem (ML nuisances, cross-fitting folds, targeting step). -
Machine-learning nuisances. AIPW with ML nuisances requires cross-fitting to recover
$\sqrt{n}$ -consistency; out of scope by the same logic as plain gcomp (see CLAUDE.md "Why GLMs/GAMs, not ML"). -
AIPW under multivariate treatment. Shipped as chunk 16m (composes Phase 8 joint density with AIPW functional). The multivariate longitudinal AIPW extension (joint time-varying treatments, composing chunk 16m with Phase 19's per-period density chain) shipped in Phase 25 (
PHASE_25_LONGITUDINAL_MULTIVARIATE_AIPW.md). - Targeted maximum likelihood. The targeting step is the TMLE innovation, not the AIPW innovation. AIPW is the non-targeted DR estimator.
-
Stochastic interventions under AIPW. Moved to Phase 24 (
PHASE_24_STOCHASTIC_AIPW.md). Stochastic AIPW requires extending thestochastic()constructor with a user-supplieddensityfunction to compute density-ratio weights. Phase 24 covers both stochastic IPW and stochastic AIPW in a unified phase.
For a static binary intervention a ∈ {0, 1}, the AIPW functional is
where
For a general intervention under the self-contained density-ratio engine (static / shift / scale_by / binary dynamic / IPSI), the AIPW functional generalises to
where make_weight_fn() under intervention g. For static binary this reduces to
Both forms have the same stacked-EE structure — the augmentation term is always "weight × outcome residual at the observed treatment." The implementation is a single pathway that calls existing gcomp (for
The AIPW estimating function for ψ is
Under regularity,
This is the main selling point and the main thing to test. Chunks 16d and 16e are devoted to DR tests with deliberately-misspecified nuisances.
When both nuisances are correctly specified,
- Under consistency of both nuisances,
$\mathrm{Var}(\hat\psi_{\mathrm{AIPW}}) \leq \mathrm{Var}(\hat\psi_{\mathrm{gcomp}})$ and$\mathrm{Var}(\hat\psi_{\mathrm{AIPW}}) \leq \mathrm{Var}(\hat\psi_{\mathrm{IPW}})$ asymptotically. - Efficiency gains over plain IPW can be substantial when the outcome model captures signal the propensity does not.
- Efficiency gains over plain gcomp are typically modest but never negative.
The efficiency story is the secondary selling point and is tested in chunk 16f.
| Intervention | Supported under AIPW? | Notes |
|---|---|---|
static(a) binary |
✓ | The classical case. ATE / ATT / ATC all available; ATT/ATC use the Bayes-rule numerator trick from Phase 4 (numerator absorbs |
static(a) continuous |
⛔ | Same rejection as IPW (Dirac point mass has no Lebesgue density). |
shift(δ), scale_by(c) continuous |
✓ | Density-ratio weight from make_weight_fn(); outcome model evaluated at shifted/scaled A. |
threshold(...) |
⛔ | Same rejection as IPW. |
dynamic(rule) binary |
✓ | Deterministic rule on binary; HT-indicator weight path. |
dynamic(rule) continuous |
⛔ | Same rejection as IPW (Dirac). |
ipsi(δ) |
✓ | Closed-form IPSI weight; outcome model evaluated at the IPSI-shifted expected-propensity treatment. Minor care: the outcome-model augmentation term evaluates at the observed A (not at a drawn IPSI treatment), so the estimator is well-defined without MC. |
stochastic(density=) (Phase 24) |
✓ (when density supplied) | Phase 24: MC-averaged outcome predictions + density-ratio weights via user-supplied density. Without density, rejected (gcomp-only). |
The rejections are exactly those inherited from the self-contained IPW engine — AIPW does not add any rejections, because the outcome-model side is always well-defined.
Estimand gating. check_estimand_intervention_compat() from Phase 4 carries over unchanged. ATT/ATC are only defined for static binary; all other AIPW interventions are ATE-only.
The estimating-equation system is
where
because
and the per-individual IF on ψ is the last row of vcov_from_if().
Implementation strategy. Reuse Phase 4/5's primitives:
-
prepare_model_if()gives$B_{\beta\beta}^{-1}$ and per-individual outcome scores. -
compute_ipw_if_self_contained_one()already builds the$(\alpha, \beta_{\mathrm{AIPW}})$ stacked IF for a single intervention under IPW — the ψ plug-in and$B_{\psi\alpha}$ cross-derivative are exactly what that function computes. For AIPW we extend this function to also absorb the$B_{\psi\beta}$ cross-derivative from the outcome model. - The result is a single per-individual IF vector on
$\hat\psi_{\mathrm{AIPW}}$ , whichvcov_from_if()squares up to a scalar variance (or a matrix across interventions).
The cross-derivatives numDeriv::jacobian of the augmentation term, exactly as IPW already does. No fundamentally new sandwich machinery.
AIPW works for whichever treatment types have both a supported outcome model and a supported propensity model:
| Treatment type | Outcome side | Propensity side | AIPW? |
|---|---|---|---|
| binary | gcomp via model_fn |
binary GLM via propensity_model_fn |
✓ |
| continuous | gcomp via model_fn |
gaussian GLM via propensity_model_fn |
✓ (shift / scale_by) |
| categorical | gcomp via model_fn |
multinomial via nnet::multinom |
✓ (static, dynamic) |
| count | gcomp via model_fn |
poisson or negbin | ✓ (integer shift / valid scale_by) |
| multivariate | gcomp multi | Phase 8 joint density | composition (not Phase 16) |
Sequential application of AIPW through time, using the ICE backward-iteration structure for the outcome side and per-period propensity models for the weight side. At each step
where
Depends on Phase 10 (longitudinal IPW) and Phase 5 (ICE). Chunk 16g.
Status: fixed via the full stacked-EE M-estimation sandwich. The
longitudinal-AIPW analytic sandwich (variance_if_aipw_longitudinal() +
R/variance_if_aipw_longitudinal_ee.R) is the full stacked M-estimation
sandwich numDeriv) of the
summed score; it captures every block-triangular cross-term and is consistent
on both balanced and unbalanced panels (monotone dropout / censoring
row-filter) — one consistent, general variance path. The per-id influence
function for the marginal mean is $\mathrm{IF}i = e\mu^\top B^{-1}\psi_i$, and
vcov_from_if() aggregates it (cluster-on-id by construction).
What this replaced. The earlier assembly used a forward-cascade
approximation of the block-triangular bread inverse. On a balanced panel that
was exactly correct (the DR property makes the pseudo-outcome deviation
causatr_longitudinal_aipw_unbalanced_sandwich); that abort is removed — the
stacked-EE sandwich now handles unbalanced panels directly.
Faithfulness by construction. Designs, weights, and fit masks are extracted
from causatr's own fitted models (the ice_aipw_iterate() result and
fit$details$treatment_models_by_time), so mu_hat exactly. aipw_long_stacked_if() guards on
max|colSums(psi(theta_hat))| relative to the score scale (class
causatr_longitudinal_aipw_sandwich_unfaithful) and steers to the bootstrap if
the reconstructed score does not vanish.
Supported nuisance models (analytic sandwich). Each model's score is
reconstructed from its maximum-likelihood GLM / multinomial form, so the analytic
sandwich covers GLM-family outcome and propensity models (gaussian, binomial,
poisson, Gamma, quasi-*, inverse-gaussian, MASS::glm.nb) and multinomial
(categorical) propensities (nnet::multinom, softmax residual score).
Explicit limitation: penalised / non-likelihood fitters — mgcv::gam and
betareg::betareg — have no vanishing GLM score whose Jacobian is the sandwich
bread (the GAM bread is Vp), so they abort with class
causatr_longitudinal_aipw_sandwich_model and steer to the bootstrap (consistent
with the longitudinal-ICE betareg bootstrap-only path).
Validation. Faithfulness delicatessen
M-estimation sandwich oracle (aipw_long_tau2_delicatessen.py, balanced +
unbalanced) to test-aipw-longitudinal.R). Follows Zivich et al. (2024, Stat. Med.) for the
ICE outcome-model chain composed with the AIPW propensity correction.
- Phase 8 (multivariate treatment IPW) × AIPW. Shipped as chunk 16m. Joint outcome model
Y ~ A1 + A2 + L(Phase 2) + joint density (Phase 8) composed in the AIPW functional. Includes stabilized weights, DR tests, sandwich + bootstrap variance. - Phase 10 (longitudinal IPW) × AIPW. This is exactly ICE-AIPW, chunk 16g.
- Phase 12 (stochastic) × AIPW. Moved to Phase 24 (
PHASE_24_STOCHASTIC_AIPW.md). Stochastic AIPW requires extending thestochastic()constructor with a user-supplieddensityfunction to compute density-ratio weights. Phase 24 covers stochastic IPW (originally "Phase 12b") and stochastic AIPW (originally chunk 16n) in a unified phase. - Phase 14 (IPCW) × AIPW. Triply-weighted estimator:
ψ̂_AIPW,IPCW = standardization + (treatment weight × censoring weight) × outcome residual. Sandwich: stacked EE with outcome model + propensity model + censoring model + plug-in. This is the most efficient estimator in the lineage and is a major deliverable once both Phase 14 and Phase 16 are done. - Survival composition. AIPW-survival (Bai et al. 2013; Zhang & Schaubel 2012) lives in the separate survival package (
etverse/survatr); it imports Phase 16 as the scalar-outcome AIPW primitive and layers the cross-time delta chain on top.
| Chunk | Scope | Depends on | Status |
|---|---|---|---|
| 16a | fit_aipw(): fit outcome model + propensity model, store both in fit$details |
Phases 2, 4 | ✅ done |
| 16b | compute_aipw_contrast_point(): AIPW functional for static binary ATE with sandwich variance |
16a | ✅ done |
| 16c | Bootstrap variance for AIPW (refit both nuisances per replicate) | 16a | ✅ done |
| 16d | DR test: misspecified outcome, correct propensity — verify consistency | 16b | ✅ done |
| 16e | DR test: correct outcome, misspecified propensity — verify consistency | 16b | ✅ done |
| 16f | Efficiency test: AIPW SE ≤ gcomp SE and AIPW SE ≤ IPW SE | 16b | ✅ done |
| 16g | Full intervention set: shift / scale_by / dynamic / IPSI; rejections for static-on-continuous / threshold / stochastic | 16b | ✅ done |
| 16h | ATT / ATC for static binary; effect modification (by = "sex") |
16b | ✅ done |
| 16i | Longitudinal AIPW (ICE-AIPW): sequential nuisance fits through the ICE backward loop | Phase 5, Phase 10, 16b | ✅ done |
| 16j | Categorical + count treatment extensions | 16b | ✅ done |
| 16k | delicatessen external cross-check on a shared DGP |
16b, 16d, 16e | ✅ done |
| 16l | Documentation, vignette, CLAUDE.md phase update, FEATURE_COVERAGE_MATRIX.md rows |
16a–16j | ✅ done |
| 16m | Multivariate point AIPW: joint outcome model Y ~ A1 + A2 + L + multivariate density-ratio weights in augmentation term. Composes Phase 8 joint density with Phase 16 AIPW functional. Lifts causatr_aipw_multivariate_pending rejection for point treatment. Includes stabilized weights (stabilize = "marginal"), propensity_model_fn default warning, and model.matrix(terms(model)) fix for custom fitters. |
Phase 8, 16b | ✅ done |
| 16n | PHASE_24_STOCHASTIC_AIPW.md). Stochastic interventions under AIPW (and IPW) require extending stochastic() with an optional density parameter, which is broader than a single composition chunk. |
— | moved → Phase 24 |
| 16o | IPCW + AIPW: triply-weighted estimator. Point estimate flows through merged IPCW weights in ext_w. Sandwich variance adds Channel 2d (censoring model correction) with three sub-components: direct Hajek effect, outcome cross-block, propensity cross-block. Also fixes latent Ch1_i dimension bug and fit_idx_ps alignment bug. |
Phase 14, 16b | ✅ done |
| 16p | AIPW test coverage parity: outcome types (gamma, quasibinomial, negbin, beta), survey weights, cluster-robust sandwich, missing data, subset estimand, GAM outcome/propensity models. Also discovered and fixed a bug: cluster_vec was not passed to variance_if() for AIPW point estimates in contrast.R. |
16b | ✅ done |
| 16q | Audit model_fn / propensity_model_fn defaults across causatr. Centralized missing()-based warnings in causat() for both arguments. Removed redundant warning from fit_aipw(). Global test suppression via options(causatr.suppress_default_warnings). Updated all vignette and roxygen examples to specify model_fn explicitly. |
all | ✅ done |
-
estimator = "aipw"usesmodel_fn(outcome) andpropensity_model_fn(treatment). When either is not specified,causat()warns and defaults tostats::glm(chunk 16q). AIPW does NOT silently reusemodel_fnfor propensity (chunk 16m). -
estimator = "aipw"triangulates gcomp and IPW, so the three point estimates (gcomp, IPW, AIPW) on the same DGP MUST agree up to Monte Carlo noise when both nuisances are correct. This is a regression invariant for every AIPW DGP test. - Under a deliberately-misspecified nuisance, AIPW MUST be consistent (chunks 16d, 16e). If either test fails, the implementation of the augmentation term is buggy.
- AIPW sandwich SE MUST be ≤ IPW sandwich SE and ≤ gcomp sandwich SE when both nuisances are correctly specified on a large-$n$ DGP (chunk 16f). Violations indicate either a sandwich bug or misspecification.
- AIPW MUST reject the same intervention / treatment / estimand combinations that IPW rejects — the outcome-model augmentation does not rescue Dirac-style point-mass interventions on continuous treatment.
- AIPW MUST NOT hard-abort when the outcome model is a GAM (
mgcv::gam); the sandwich cross-derivative usesnumDeriv::jacobianas the IPW engine already does. - The longitudinal-AIPW analytic sandwich is the full stacked-EE M-estimation sandwich and MUST agree tightly with the bootstrap, the delete-one-id jackknife, and the
delicatessenoracle on both balanced and unbalanced panels. Never re-introduce the forward-cascade shortcut or thecausatr_longitudinal_aipw_unbalanced_sandwichabort (the cascade dropped the baseline-pseudo-regression block and underestimated the SE by ~50% under dropout). The analytic sandwich covers GLM-family +glm.nb+ multinomial (nnet::multinom) nuisances; penalised / non-likelihood fitters (mgcv::gam,betareg) are an explicit limitation that routes to the bootstrap (causatr_longitudinal_aipw_sandwich_model). See the stacked-EE subsection above.
Linear-Gaussian with binary treatment:
L ~ N(0, 1)
A | L ~ Bernoulli(expit(0.3 + 0.5 * L))
Y = 2 + 3 * A + 1.5 * L + ε, ε ~ N(0, 1)
E[Y^1] - E[Y^0] = 3 (ATE)
- Chunk 16b: both nuisances correct — check point estimate, SE, CI vs analytical truth.
- Chunk 16d: outcome model
Y ~ A(drops L) — AIPW must still give ATE = 3. - Chunk 16e: propensity model
A ~ 1(drops L) — AIPW must still give ATE = 3. - Chunk 16f: all three estimators on the same DGP with
$n = 5000$ , 500 replicates, compare empirical SDs.
Reuse the Phase 10 / Phase 5 linear-Gaussian DGP (2-period, time-varying treatment + confounder), validated against lmtp::lmtp_tmle.
- Robins JM, Rotnitzky A, Zhao LP (1994). Estimation of regression coefficients when some regressors are not always observed. JASA 89:846–866. (foundational AIPW paper)
- Scharfstein DO, Rotnitzky A, Robins JM (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. JASA 94:1096–1120. (DR property framing)
- Bang H, Robins JM (2005). Doubly robust estimation in missing data and causal inference models. Biometrics 61:962–973. (ICE-AIPW / longitudinal extension)
- Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, Davidian M (2011). Doubly robust estimation of causal effects. Am J Epidemiol 173:761–767. (applied epi primer)
- Kang JD, Schafer JL (2007). Demystifying double robustness. Stat Sci 22:523–539. (practical DR pitfalls — cite as a caveat on finite-sample behavior)
- Tchetgen Tchetgen EJ (2009). A commentary on G. Molenberghs' review of missing data. Stat Methods Med Res 18:93–97. (AIPW efficiency vs IPW)