diff --git a/README.md b/README.md index 635aec47..35a4d362 100644 --- a/README.md +++ b/README.md @@ -137,7 +137,7 @@ StatsPAI's focus is **causal inference**. The grid below summarizes method-famil **Legend**: B = broad API coverage within this comparison table; Y = implemented entry points; P = partial, scattered, or single-algorithm support; N = no first-class entry point. These are API-breadth labels, not validation tiers. -**StatsPAI at a glance**: 1,020 registered functions in the live agent registry · 81 submodules · 269k LOC (core) + 97k LOC (tests). All four numbers are reproducible from the canonical generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is regenerated from the same script. For the API-breadth matrix (23 method families) and cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md). +**StatsPAI at a glance**: 1,022 registered functions in the live agent registry · 81 submodules · 269k LOC (core) + 97k LOC (tests). All four numbers are reproducible from the canonical generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is regenerated from the same script. For the API-breadth matrix (23 method families) and cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md). **Validation tiers matter**: `stability="stable"` means the public API is SemVer-stable; it does not by itself mean R/Stata/paper parity. Use `sp.list_functions(validation_status="certified")` for cross-language or published-reference evidence, and inspect `sp.describe_function(name)["limitations"]` before production use. See [`docs/guides/stability.md`](docs/guides/stability.md). diff --git a/README_CN.md b/README_CN.md index c28a6357..b35c9c4d 100644 --- a/README_CN.md +++ b/README_CN.md @@ -46,7 +46,7 @@ StatsPAI 聚焦**因果推断**。下表描述方法家族层面的 API 覆盖 **图例**:B = 本表范围内 API 覆盖较宽;Y = 有已实现入口;P = 部分、分散或单算法支持;N = 无一等入口。这些只是 API 广度标签,不是 validation tier。 -**StatsPAI 一句话概览**:live agent registry 中有 1,020 个注册函数 · 81 个子模块 · 269k 行核心代码 + 97k 行测试。这四个数字都可以由唯一的生成器 (`python scripts/registry_stats.py`) 现场复算;[`docs/stats.md`](docs/stats.md) 中的按模块拆分表也由同一个脚本回写。23 个方法家族的 API 广度矩阵以及跨生态行数对比,详见 [`docs/stats.md`](docs/stats.md)。这些覆盖数字描述 API 广度,不等同于每个函数都有 R/Stata 数值验证;生产使用请查看 `validation_status`。 +**StatsPAI 一句话概览**:live agent registry 中有 1,022 个注册函数 · 81 个子模块 · 269k 行核心代码 + 97k 行测试。这四个数字都可以由唯一的生成器 (`python scripts/registry_stats.py`) 现场复算;[`docs/stats.md`](docs/stats.md) 中的按模块拆分表也由同一个脚本回写。23 个方法家族的 API 广度矩阵以及跨生态行数对比,详见 [`docs/stats.md`](docs/stats.md)。这些覆盖数字描述 API 广度,不等同于每个函数都有 R/Stata 数值验证;生产使用请查看 `validation_status`。 **📦 v1.16.0(2026-05-29)— 正确性修复与跨语言对齐扩展** diff --git a/docs/guides/meta_analysis.md b/docs/guides/meta_analysis.md new file mode 100644 index 00000000..dd199d24 --- /dev/null +++ b/docs/guides/meta_analysis.md @@ -0,0 +1,95 @@ +# Meta-analysis (evidence synthesis) + +> **The output half of a systematic review.** Once you have extracted an +> effect size and its standard error from each study, `sp.meta_analysis` +> pools them — fixed-effect and random-effects — and reports the +> heterogeneity statistics, a prediction interval, Egger's test for +> small-study effects, and forest / funnel plots +> [@dersimonian1986meta; @higgins2002quantifying; @egger1997bias]. + +This is **summary-data** meta-analysis: you pass per-study effects and +SEs (log odds ratios, mean differences, log hazard ratios, …). It does +not fit individual-participant-data models. + +--- + +## 1. One call + +```python +import statspai as sp + +# five studies' log odds ratios and their standard errors +effects = [0.10, 0.25, -0.05, 0.30, 0.15] +ses = [0.05, 0.10, 0.08, 0.12, 0.06] + +m = sp.meta_analysis(effects, ses, labels=["Trial A", "B", "C", "D", "E"]) +print(m.summary()) +``` + +The summary reports **both** models — you do not have to choose blind: + +- **Fixed-effect (inverse-variance)** — assumes one common true effect. +- **Random-effects (DerSimonian-Laird)** — assumes the true effect varies + across studies; wider CI, and the model StatsPAI reports by default + (`method="DL"`). + +Switch the headline model with `method="fixed"`. + +--- + +## 2. Heterogeneity — is pooling even sensible? + +```python +m.q, m.q_pvalue # Cochran's Q and its chi-square p-value +m.i2 # I^2 (fraction of variation due to heterogeneity) +m.tau2 # between-study variance estimate +m.prediction_interval # where a *future* study's true effect is expected +``` + +A high `I^2` (say > 50%) or a small Q p-value warns that a single pooled +number hides real between-study differences — report the random-effects +estimate **and** the prediction interval, and consider a subgroup or +meta-regression analysis. The prediction interval is typically much wider +than the confidence interval of the pooled mean, which is the honest way +to convey heterogeneity. + +--- + +## 3. Publication bias / small-study effects + +```python +egg = m.egger_test() # {'intercept', 'se', 't', 'p_value', 'df'} +print(egg["p_value"]) + +m.funnel_plot() # visual asymmetry check +``` + +Egger's test regresses the standard normal deviate on precision; a +non-zero intercept flags funnel-plot asymmetry (often small-study or +publication bias). Treat it as a screen, not proof — asymmetry has many +possible causes. + +--- + +## 4. The forest plot + +```python +m.forest_plot() # per-study CIs + pooled diamond +``` + +--- + +## Notes & limitations + +- DerSimonian-Laird is the classic random-effects estimator; for very few + studies or rare events, REML / Paule-Mandel `tau^2` and Hartung-Knapp + CIs are more conservative and are on the roadmap. +- Effect sizes must be supplied on an additive scale (log-transform ratio + measures before pooling, then exponentiate the pooled estimate for + reporting). + +## Where to next + +- [Power & sample size for epidemiological designs](power_epi.md) +- [Mendelian randomization](mendelian_family.md) — summary-data MR reuses + the same inverse-variance machinery for genetic instruments. diff --git a/docs/reference/index.md b/docs/reference/index.md index 5bf355cf..95de41de 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -1,6 +1,6 @@ # API Reference — Overview -StatsPAI exposes 1,020 registered public functions under a single +StatsPAI exposes 1,022 registered public functions under a single `import statspai as sp` namespace. Reference pages are grouped by methodological area: diff --git a/docs/stats.md b/docs/stats.md index 2cccd17c..4dcd02a4 100644 --- a/docs/stats.md +++ b/docs/stats.md @@ -12,8 +12,8 @@ | Ecosystem / Project | Method | Files | Lines of code | Primary focus | | ------------------------------------ | ---------- | -----: | ------------: | ---------------------------------- | -| **StatsPAI** `src/statspai/` | measured | 647 | **269,043** | validation-tiered causal inference | -| StatsPAI tests (`tests/`) | measured | 529 | 96,514 | — | +| **StatsPAI** `src/statspai/` | measured | 648 | **269,360** | validation-tiered causal inference | +| StatsPAI tests (`tests/`) | measured | 530 | 96,592 | — | | statsmodels 0.14.x | measured | 948 | **381,981** | GLM / time series / general | | linearmodels | measured | 131 | 36,607 | panel / IV | | **Python causal-inference subtotal** | | 1,079 | **418,588** | | @@ -53,8 +53,8 @@ Sorted by LOC. This table is generated from the live source tree by `python scri | `plots` | 5,176 | 6 | 6 | | `spatial` | 5,136 | 29 | 35 | | `core` | 4,951 | 10 | 2 | +| `inference` | 4,941 | 15 | 24 | | `panel` | 4,914 | 12 | 17 | -| `inference` | 4,629 | 14 | 22 | | `matching` | 4,052 | 9 | 23 | | `frontier` | 4,008 | 8 | 12 | | `workflow` | 3,890 | 5 | 1 | @@ -120,7 +120,7 @@ Sorted by LOC. This table is generated from the live source tree by `python scri | `censoring` | 284 | 2 | 2 | | `causal` | 101 | 1 | 0 | | `schemas` | 0 | 0 | 0 | -| **Total** | **269,043** | **647** | **1020** | +| **Total** | **269,360** | **648** | **1022** | ## 3 · Causal-inference coverage matrix (full) diff --git a/mkdocs.yml b/mkdocs.yml index 9d42aa75..86df69e6 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -96,6 +96,7 @@ nav: - JOSS validation dossier: joss_validation_dossier.md - JSS source-audit dossier: jss_source_audit_dossier.md - Guides: + - "Meta-analysis (fixed/random-effects, heterogeneity, Egger, forest/funnel)": guides/meta_analysis.md - "Staggered DiD (Callaway–Sant'Anna)": guides/callaway_santanna.md - "Report card: cs_report()": guides/cs_report.md - "Repeated cross-sections": guides/repeated_cross_sections.md diff --git a/paper.bib b/paper.bib index 5a171ad5..cfa6fe0c 100644 --- a/paper.bib +++ b/paper.bib @@ -10,6 +10,36 @@ @article{chernozhukov2018double doi={10.1111/ectj.12097} } +@article{dersimonian1986meta, + title={Meta-analysis in clinical trials}, + author={DerSimonian, Rebecca and Laird, Nan}, + journal={Controlled Clinical Trials}, + volume={7}, + number={3}, + pages={177--188}, + year={1986} +} + +@article{higgins2002quantifying, + title={Quantifying heterogeneity in a meta-analysis}, + author={Higgins, Julian P. T. and Thompson, Simon G.}, + journal={Statistics in Medicine}, + volume={21}, + number={11}, + pages={1539--1558}, + year={2002} +} + +@article{egger1997bias, + title={Bias in meta-analysis detected by a simple, graphical test}, + author={Egger, Matthias and Davey Smith, George and Schneider, Martin and Minder, Christoph}, + journal={BMJ}, + volume={315}, + number={7109}, + pages={629--634}, + year={1997} +} + @article{wager2018estimation, title={Estimation and inference of heterogeneous treatment effects using random forests}, author={Wager, Stefan and Athey, Susan}, diff --git a/schemas/functions.json b/schemas/functions.json index 145e897d..000d86a8 100644 --- a/schemas/functions.json +++ b/schemas/functions.json @@ -22757,6 +22757,173 @@ "type": "object" } }, + { + "description": "Summary-data meta-analysis with fixed- and random-effects pooling.", + "name": "meta_analysis", + "parameters": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence/prediction intervals.", + "type": "number" + }, + "effects": { + "description": "Per-study effect sizes (e.g. log odds ratios, mean differences).", + "type": "number" + }, + "labels": { + "description": "Study labels for the forest plot.", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "default": "DL", + "description": "Which model the headline ``estimate`` reports: DerSimonian-Laird random effects (default) or fixed-effect inverse-variance. Both are always computed and available on the result.", + "enum": [ + "DL", + "fixed" + ], + "type": "string" + }, + "se": { + "description": "Per-study standard errors (must be positive).", + "type": "number" + } + }, + "required": [ + "effects", + "se" + ], + "type": "object" + } + }, + { + "description": "Result of a summary-data meta-analysis.", + "name": "MetaAnalysisResult", + "parameters": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence intervals and tests.", + "type": "number" + }, + "ci": { + "description": "ci parameter (tuple).", + "items": { + "type": "string" + }, + "type": "array" + }, + "effects": { + "description": "effects parameter (np.ndarray).", + "type": "string" + }, + "estimate": { + "description": "estimate parameter (float).", + "type": "number" + }, + "fixed_estimate": { + "description": "fixed_estimate parameter (float).", + "type": "number" + }, + "fixed_se": { + "description": "fixed_se parameter (float).", + "type": "number" + }, + "h2": { + "description": "h2 parameter (float).", + "type": "number" + }, + "i2": { + "description": "i2 parameter (float).", + "type": "number" + }, + "labels": { + "description": "labels parameter (List[str]).", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "description": "Estimator or algorithm variant to use.", + "type": "string" + }, + "p_value": { + "description": "p_value parameter (float).", + "type": "number" + }, + "prediction_interval": { + "description": "prediction_interval parameter (Optional[tuple]).", + "items": { + "type": "string" + }, + "type": "array" + }, + "q": { + "description": "Quantile level.", + "type": "number" + }, + "q_df": { + "description": "q_df parameter (int).", + "type": "integer" + }, + "q_pvalue": { + "description": "q_pvalue parameter (float).", + "type": "number" + }, + "random_estimate": { + "description": "random_estimate parameter (float).", + "type": "number" + }, + "random_se": { + "description": "random_se parameter (float).", + "type": "number" + }, + "se": { + "description": "se parameter (float).", + "type": "number" + }, + "se_studies": { + "description": "se_studies parameter (np.ndarray).", + "type": "string" + }, + "tau2": { + "description": "tau2 parameter (float).", + "type": "number" + }, + "weights": { + "description": "Observation weights.", + "type": "string" + } + }, + "required": [ + "estimate", + "se", + "ci", + "p_value", + "method", + "fixed_estimate", + "fixed_se", + "random_estimate", + "random_se", + "tau2", + "q", + "q_df", + "q_pvalue", + "i2", + "h2", + "prediction_interval", + "weights", + "effects", + "se_studies", + "labels" + ], + "type": "object" + } + }, { "description": "Class wrapper around :func:`msm` for programmatic access.", "name": "MarginalStructuralModel", diff --git a/schemas/index.json b/schemas/index.json index b05582dc..d5b863cc 100644 --- a/schemas/index.json +++ b/schemas/index.json @@ -1,8 +1,8 @@ { "counts": { "agent_cards": 362, - "functions": 1020, - "tools": 484 + "functions": 1022, + "tools": 485 }, "files": [ "tools.json", diff --git a/schemas/tools.json b/schemas/tools.json index 80310ac2..398d9e00 100644 --- a/schemas/tools.json +++ b/schemas/tools.json @@ -11657,6 +11657,48 @@ }, "name": "mepoisson" }, + { + "description": "Summary-data meta-analysis with fixed- and random-effects pooling.", + "input_schema": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence/prediction intervals.", + "type": "number" + }, + "effects": { + "description": "Per-study effect sizes (e.g. log odds ratios, mean differences).", + "type": "number" + }, + "labels": { + "description": "Study labels for the forest plot.", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "default": "DL", + "description": "Which model the headline ``estimate`` reports: DerSimonian-Laird random effects (default) or fixed-effect inverse-variance. Both are always computed and available on the result.", + "enum": [ + "DL", + "fixed" + ], + "type": "string" + }, + "se": { + "description": "Per-study standard errors (must be positive).", + "type": "number" + } + }, + "required": [ + "effects", + "se" + ], + "type": "object" + }, + "name": "meta_analysis" + }, { "description": "Meta-learner framework for CATE: S-, T-, X-, R-, DR-Learner. Validation: validated evidence tier (known-truth, reference, external-parity, or Monte Carlo artifact). Assumptions: Unconfoundedness: Y(d) perp D | X; Overlap: 0 < P(D=1 | X) < 1; For R-Learner / DR-Learner: orthogonality between treatment residual and outcome residual. Pre-conditions: binary treatment (0/1); covariates numeric; categoricals encoded; enough treated AND control to train separate outcome models (T/X/DR-Learner). Failure modes: Large divergence across learner types -> Use sp.compare_metalearners to identify which learner is biased; DR-Learner is safest under model misspecification; S-Learner estimates near zero regardless of true effect -> S-Learner regularization smooths treatment coefficient toward zero; use T/X/DR instead; X-Learner fails when treated group is very small -> X-Learner needs well-identified control-outcome model; fall back to T-Learner or weighted T-Learner. Alternatives: sp.causal_forest, sp.dml, sp.tmle, sp.bcf. Typical minimum N: 500.", "input_schema": { diff --git a/src/statspai/__init__.py b/src/statspai/__init__.py index 38f4a22f..574208d1 100644 --- a/src/statspai/__init__.py +++ b/src/statspai/__init__.py @@ -409,6 +409,8 @@ cr3_jackknife_vcov, g_computation, front_door, + meta_analysis, + MetaAnalysisResult, ) # Eager: ``msm`` collides (function + subpackage of same name). @@ -1327,6 +1329,9 @@ # G-methods family (g-computation / front-door) "g_computation", "front_door", + # Meta-analysis (evidence synthesis) + "meta_analysis", + "MetaAnalysisResult", # Marginal Structural Models (time-varying treatment) "msm", "MarginalStructuralModel", diff --git a/src/statspai/__init__.pyi b/src/statspai/__init__.pyi index a16b9695..8f56b69b 100644 --- a/src/statspai/__init__.pyi +++ b/src/statspai/__init__.pyi @@ -378,6 +378,8 @@ from .inference.conley import conley as conley from .inference.front_door import front_door as front_door from .inference.g_computation import g_computation as g_computation from .inference.ipw import ipw as ipw +from .inference.meta_analysis import MetaAnalysisResult as MetaAnalysisResult +from .inference.meta_analysis import meta_analysis as meta_analysis from .inference.jackknife import cr2_se as cr2_se from .inference.jackknife import jackknife_se as jackknife_se from .inference.jackknife import wild_cluster_boot as wild_cluster_boot diff --git a/src/statspai/inference/__init__.py b/src/statspai/inference/__init__.py index 8e8d7128..046b19c5 100644 --- a/src/statspai/inference/__init__.py +++ b/src/statspai/inference/__init__.py @@ -23,6 +23,7 @@ cluster_robust_se, cr3_jackknife_vcov, ) +from .meta_analysis import meta_analysis, MetaAnalysisResult __all__ = [ 'wild_cluster_bootstrap', @@ -47,4 +48,6 @@ 'multiway_cluster_vcov', 'cluster_robust_se', 'cr3_jackknife_vcov', + 'meta_analysis', + 'MetaAnalysisResult', ] diff --git a/src/statspai/inference/meta_analysis.py b/src/statspai/inference/meta_analysis.py new file mode 100644 index 00000000..3fbf40d5 --- /dev/null +++ b/src/statspai/inference/meta_analysis.py @@ -0,0 +1,309 @@ +"""Study-level meta-analysis (evidence synthesis). + +Pools effect sizes (and their standard errors) across studies, the workhorse +of systematic reviews in public health and clinical research. Provides both +the fixed-effect (inverse-variance) and random-effects (DerSimonian-Laird) +models, the standard heterogeneity statistics (Cochran's Q, I^2, tau^2, H^2), +a random-effects prediction interval, and Egger's test for small-study / +funnel-plot asymmetry — plus forest- and funnel-plot helpers. + +This is intentionally summary-data meta-analysis (you pass per-study effects +and SEs); it does not fit individual-participant-data models. + +References +---------- +DerSimonian, R. & Laird, N. (1986). "Meta-analysis in clinical trials." +*Controlled Clinical Trials*, 7(3), 177-188. [@dersimonian1986meta] + +Higgins, J.P.T. & Thompson, S.G. (2002). "Quantifying heterogeneity in a +meta-analysis." *Statistics in Medicine*, 21(11), 1539-1558. +[@higgins2002quantifying] + +Egger, M., Davey Smith, G., Schneider, M. & Minder, C. (1997). "Bias in +meta-analysis detected by a simple, graphical test." *BMJ*, 315(7109), +629-634. [@egger1997bias] +""" +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Sequence + +import numpy as np +from scipy import stats + +__all__ = [ + "MetaAnalysisResult", + "meta_analysis", +] + + +@dataclass +class MetaAnalysisResult: + """Result of a summary-data meta-analysis. + + Attributes + ---------- + estimate : float + Pooled effect under the chosen model (random-effects by default). + se, ci : float, tuple + Standard error and confidence interval of the pooled effect. + p_value : float + Two-sided p-value for the pooled effect being zero. + method : str + ``"fixed"`` or ``"DL"`` (DerSimonian-Laird random effects). + fixed_estimate, fixed_se : float + Fixed-effect pooled estimate and SE (always reported). + random_estimate, random_se : float + Random-effects pooled estimate and SE (always reported). + tau2, q, q_df, q_pvalue, i2, h2 : float + Between-study variance and heterogeneity statistics. + prediction_interval : tuple or None + Random-effects prediction interval (where a future study's true + effect is expected to lie); ``None`` when fewer than 3 studies. + weights : np.ndarray + Per-study weights under the chosen model (normalised to sum to 1). + """ + + estimate: float + se: float + ci: tuple + p_value: float + method: str + fixed_estimate: float + fixed_se: float + random_estimate: float + random_se: float + tau2: float + q: float + q_df: int + q_pvalue: float + i2: float + h2: float + prediction_interval: Optional[tuple] + weights: np.ndarray + effects: np.ndarray + se_studies: np.ndarray + labels: List[str] + alpha: float = 0.05 + + def egger_test(self) -> Dict[str, float]: + """Egger's regression test for funnel-plot asymmetry. + + Regresses the standard normal deviate ``y_i / se_i`` on precision + ``1 / se_i``; a non-zero intercept indicates small-study effects. + Requires at least 3 studies. + """ + y = self.effects + s = self.se_studies + k = len(y) + if k < 3: + return {"intercept": float("nan"), "se": float("nan"), + "t": float("nan"), "p_value": float("nan"), "df": k - 2} + snd = y / s # standard normal deviate + precision = 1.0 / s + X = np.column_stack([np.ones(k), precision]) + beta, *_ = np.linalg.lstsq(X, snd, rcond=None) + resid = snd - X @ beta + dof = k - 2 + sigma2 = float(resid @ resid) / dof + xtx_inv = np.linalg.inv(X.T @ X) + se_intercept = float(np.sqrt(sigma2 * xtx_inv[0, 0])) + t_stat = float(beta[0] / se_intercept) + p = float(2 * stats.t.sf(abs(t_stat), dof)) + return {"intercept": float(beta[0]), "se": se_intercept, + "t": t_stat, "p_value": p, "df": dof} + + def summary(self) -> str: + z = stats.norm.ppf(1 - self.alpha / 2) + out = ["=" * 72, "Meta-analysis (summary data)", "=" * 72] + out.append(f"Studies (k) : {len(self.effects)}") + model = ("random-effects (DerSimonian-Laird)" + if self.method == "DL" else "fixed-effect (inverse-variance)") + out.append(f"Model : {model}") + out.append("-" * 72) + fe_lo = self.fixed_estimate - z * self.fixed_se + fe_hi = self.fixed_estimate + z * self.fixed_se + re_lo = self.random_estimate - z * self.random_se + re_hi = self.random_estimate + z * self.random_se + fe_ci = f"[{fe_lo:.4f}, {fe_hi:.4f}]" + re_ci = f"[{re_lo:.4f}, {re_hi:.4f}]" + out.append(f"Fixed-effect pooled : {self.fixed_estimate:.4f} {fe_ci}") + out.append(f"Random-effects pooled: {self.random_estimate:.4f} {re_ci}") + if self.prediction_interval is not None: + pi_lo, pi_hi = self.prediction_interval + out.append(f" prediction interval: [{pi_lo:.4f}, {pi_hi:.4f}]") + out.append("-" * 72) + out.append("Heterogeneity:") + out.append(f" Q({self.q_df}) = {self.q:.4f}, p = {self.q_pvalue:.4g}") + i2_pct = 100 * self.i2 + out.append(f" I^2 = {i2_pct:.1f}% tau^2 = {self.tau2:.4f}" + f" H^2 = {self.h2:.4f}") + egg = self.egger_test() + if not np.isnan(egg["p_value"]): + intercept = egg["intercept"] + out.append(f"Egger's test: intercept = {intercept:.4f}, " + f"p = {egg['p_value']:.4g}") + out.append("=" * 72) + return "\n".join(out) + + def forest_plot(self, ax: Any = None, **kwargs: Any) -> Any: + """Forest plot: per-study effects + CIs with the pooled diamond.""" + import matplotlib.pyplot as plt + + if ax is None: + _, ax = plt.subplots( + figsize=kwargs.pop("figsize", (7, 0.4 * len(self.effects) + 2)) + ) + z = stats.norm.ppf(1 - self.alpha / 2) + y = self.effects + s = self.se_studies + k = len(y) + positions = np.arange(k, 0, -1) + ax.errorbar(y, positions, xerr=z * s, fmt="s", color="#333", + capsize=3, markersize=5, linestyle="none") + ax.axvline(0.0, color="grey", lw=0.8, linestyle="--") + pooled = self.estimate + p_lo, p_hi = self.ci + ax.axvline(pooled, color="#c1272d", lw=1.0) + ax.fill_betweenx([0.2, 0.8], p_lo, p_hi, color="#c1272d", alpha=0.3) + ax.set_yticks(list(positions) + [0.5]) + ax.set_yticklabels(self.labels + ["Pooled"]) + ax.set_xlabel("Effect size") + ax.set_title("Forest plot") + return ax + + def funnel_plot(self, ax: Any = None, **kwargs: Any) -> Any: + """Funnel plot: effect size vs. standard error (precision).""" + import matplotlib.pyplot as plt + + if ax is None: + _, ax = plt.subplots(figsize=kwargs.pop("figsize", (6, 5))) + ax.scatter(self.effects, self.se_studies, color="#333", s=25) + ax.axvline(self.estimate, color="#c1272d", lw=1.0) + ax.set_xlabel("Effect size") + ax.set_ylabel("Standard error") + ax.invert_yaxis() + ax.set_title("Funnel plot") + return ax + + def __repr__(self) -> str: + return ( + f"" + ) + + +def meta_analysis( + effects: Sequence[float], + se: Sequence[float], + *, + method: str = "DL", + labels: Optional[Sequence[str]] = None, + alpha: float = 0.05, +) -> MetaAnalysisResult: + """Summary-data meta-analysis with fixed- and random-effects pooling. + + Parameters + ---------- + effects : sequence of float + Per-study effect sizes (e.g. log odds ratios, mean differences). + se : sequence of float + Per-study standard errors (must be positive). + method : {"DL", "fixed"} + Which model the headline ``estimate`` reports: DerSimonian-Laird + random effects (default) or fixed-effect inverse-variance. Both are + always computed and available on the result. + labels : sequence of str, optional + Study labels for the forest plot. + alpha : float + Significance level for confidence/prediction intervals. + + Returns + ------- + MetaAnalysisResult + With ``.estimate``, ``.ci``, heterogeneity statistics, ``.summary()``, + ``.egger_test()``, ``.forest_plot()``, ``.funnel_plot()``. + + Examples + -------- + >>> import statspai as sp + >>> # five studies' log odds ratios and SEs + >>> r = sp.meta_analysis([0.10, 0.25, -0.05, 0.30, 0.15], + ... [0.05, 0.10, 0.08, 0.12, 0.06]) + >>> r.summary() + """ + y = np.asarray(effects, dtype=float) + s = np.asarray(se, dtype=float) + if y.shape != s.shape or y.ndim != 1: + raise ValueError("effects and se must be 1-D arrays of equal length.") + if np.any(s <= 0): + raise ValueError("all standard errors must be positive.") + k = len(y) + if k < 2: + raise ValueError("meta-analysis needs at least 2 studies.") + if method not in ("DL", "fixed"): + raise ValueError("method must be 'DL' or 'fixed'.") + if labels is None: + labels = [f"Study {i + 1}" for i in range(k)] + else: + labels = list(labels) + + # Fixed-effect inverse-variance. + w = 1.0 / s ** 2 + fe_est = float(np.sum(w * y) / np.sum(w)) + fe_se = float(np.sqrt(1.0 / np.sum(w))) + + # Cochran's Q and DerSimonian-Laird tau^2. + q = float(np.sum(w * (y - fe_est) ** 2)) + df = k - 1 + c = float(np.sum(w) - np.sum(w ** 2) / np.sum(w)) + tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0 + q_pvalue = float(stats.chi2.sf(q, df)) if df > 0 else float("nan") + i2 = max(0.0, (q - df) / q) if q > 0 else 0.0 + h2 = q / df if df > 0 else float("nan") + + # Random-effects (DL) weights. + w_re = 1.0 / (s ** 2 + tau2) + re_est = float(np.sum(w_re * y) / np.sum(w_re)) + re_se = float(np.sqrt(1.0 / np.sum(w_re))) + + z = stats.norm.ppf(1 - alpha / 2) + if method == "DL": + est, se_pooled, weights = re_est, re_se, w_re + else: + est, se_pooled, weights = fe_est, fe_se, w + + ci = (est - z * se_pooled, est + z * se_pooled) + p_value = float(2 * stats.norm.sf(abs(est / se_pooled))) + + # Random-effects prediction interval (Higgins et al. 2009): uses t_{k-2}. + prediction_interval: Optional[tuple] = None + if k >= 3: + t_val = stats.t.ppf(1 - alpha / 2, df=k - 2) + pi_se = np.sqrt(re_se ** 2 + tau2) + prediction_interval = (re_est - t_val * pi_se, re_est + t_val * pi_se) + + return MetaAnalysisResult( + estimate=est, + se=se_pooled, + ci=ci, + p_value=p_value, + method=method, + fixed_estimate=fe_est, + fixed_se=fe_se, + random_estimate=re_est, + random_se=re_se, + tau2=tau2, + q=q, + q_df=df, + q_pvalue=q_pvalue, + i2=i2, + h2=h2, + prediction_interval=prediction_interval, + weights=weights / np.sum(weights), + effects=y, + se_studies=s, + labels=labels, + alpha=alpha, + ) diff --git a/src/statspai/schemas/functions.json b/src/statspai/schemas/functions.json index 145e897d..000d86a8 100644 --- a/src/statspai/schemas/functions.json +++ b/src/statspai/schemas/functions.json @@ -22757,6 +22757,173 @@ "type": "object" } }, + { + "description": "Summary-data meta-analysis with fixed- and random-effects pooling.", + "name": "meta_analysis", + "parameters": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence/prediction intervals.", + "type": "number" + }, + "effects": { + "description": "Per-study effect sizes (e.g. log odds ratios, mean differences).", + "type": "number" + }, + "labels": { + "description": "Study labels for the forest plot.", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "default": "DL", + "description": "Which model the headline ``estimate`` reports: DerSimonian-Laird random effects (default) or fixed-effect inverse-variance. Both are always computed and available on the result.", + "enum": [ + "DL", + "fixed" + ], + "type": "string" + }, + "se": { + "description": "Per-study standard errors (must be positive).", + "type": "number" + } + }, + "required": [ + "effects", + "se" + ], + "type": "object" + } + }, + { + "description": "Result of a summary-data meta-analysis.", + "name": "MetaAnalysisResult", + "parameters": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence intervals and tests.", + "type": "number" + }, + "ci": { + "description": "ci parameter (tuple).", + "items": { + "type": "string" + }, + "type": "array" + }, + "effects": { + "description": "effects parameter (np.ndarray).", + "type": "string" + }, + "estimate": { + "description": "estimate parameter (float).", + "type": "number" + }, + "fixed_estimate": { + "description": "fixed_estimate parameter (float).", + "type": "number" + }, + "fixed_se": { + "description": "fixed_se parameter (float).", + "type": "number" + }, + "h2": { + "description": "h2 parameter (float).", + "type": "number" + }, + "i2": { + "description": "i2 parameter (float).", + "type": "number" + }, + "labels": { + "description": "labels parameter (List[str]).", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "description": "Estimator or algorithm variant to use.", + "type": "string" + }, + "p_value": { + "description": "p_value parameter (float).", + "type": "number" + }, + "prediction_interval": { + "description": "prediction_interval parameter (Optional[tuple]).", + "items": { + "type": "string" + }, + "type": "array" + }, + "q": { + "description": "Quantile level.", + "type": "number" + }, + "q_df": { + "description": "q_df parameter (int).", + "type": "integer" + }, + "q_pvalue": { + "description": "q_pvalue parameter (float).", + "type": "number" + }, + "random_estimate": { + "description": "random_estimate parameter (float).", + "type": "number" + }, + "random_se": { + "description": "random_se parameter (float).", + "type": "number" + }, + "se": { + "description": "se parameter (float).", + "type": "number" + }, + "se_studies": { + "description": "se_studies parameter (np.ndarray).", + "type": "string" + }, + "tau2": { + "description": "tau2 parameter (float).", + "type": "number" + }, + "weights": { + "description": "Observation weights.", + "type": "string" + } + }, + "required": [ + "estimate", + "se", + "ci", + "p_value", + "method", + "fixed_estimate", + "fixed_se", + "random_estimate", + "random_se", + "tau2", + "q", + "q_df", + "q_pvalue", + "i2", + "h2", + "prediction_interval", + "weights", + "effects", + "se_studies", + "labels" + ], + "type": "object" + } + }, { "description": "Class wrapper around :func:`msm` for programmatic access.", "name": "MarginalStructuralModel", diff --git a/src/statspai/schemas/index.json b/src/statspai/schemas/index.json index b05582dc..d5b863cc 100644 --- a/src/statspai/schemas/index.json +++ b/src/statspai/schemas/index.json @@ -1,8 +1,8 @@ { "counts": { "agent_cards": 362, - "functions": 1020, - "tools": 484 + "functions": 1022, + "tools": 485 }, "files": [ "tools.json", diff --git a/src/statspai/schemas/tools.json b/src/statspai/schemas/tools.json index 80310ac2..398d9e00 100644 --- a/src/statspai/schemas/tools.json +++ b/src/statspai/schemas/tools.json @@ -11657,6 +11657,48 @@ }, "name": "mepoisson" }, + { + "description": "Summary-data meta-analysis with fixed- and random-effects pooling.", + "input_schema": { + "properties": { + "alpha": { + "default": 0.05, + "description": "Significance level for confidence/prediction intervals.", + "type": "number" + }, + "effects": { + "description": "Per-study effect sizes (e.g. log odds ratios, mean differences).", + "type": "number" + }, + "labels": { + "description": "Study labels for the forest plot.", + "items": { + "type": "string" + }, + "type": "array" + }, + "method": { + "default": "DL", + "description": "Which model the headline ``estimate`` reports: DerSimonian-Laird random effects (default) or fixed-effect inverse-variance. Both are always computed and available on the result.", + "enum": [ + "DL", + "fixed" + ], + "type": "string" + }, + "se": { + "description": "Per-study standard errors (must be positive).", + "type": "number" + } + }, + "required": [ + "effects", + "se" + ], + "type": "object" + }, + "name": "meta_analysis" + }, { "description": "Meta-learner framework for CATE: S-, T-, X-, R-, DR-Learner. Validation: validated evidence tier (known-truth, reference, external-parity, or Monte Carlo artifact). Assumptions: Unconfoundedness: Y(d) perp D | X; Overlap: 0 < P(D=1 | X) < 1; For R-Learner / DR-Learner: orthogonality between treatment residual and outcome residual. Pre-conditions: binary treatment (0/1); covariates numeric; categoricals encoded; enough treated AND control to train separate outcome models (T/X/DR-Learner). Failure modes: Large divergence across learner types -> Use sp.compare_metalearners to identify which learner is biased; DR-Learner is safest under model misspecification; S-Learner estimates near zero regardless of true effect -> S-Learner regularization smooths treatment coefficient toward zero; use T/X/DR instead; X-Learner fails when treated group is very small -> X-Learner needs well-identified control-outcome model; fall back to T-Learner or weighted T-Learner. Alternatives: sp.causal_forest, sp.dml, sp.tmle, sp.bcf. Typical minimum N: 500.", "input_schema": { diff --git a/tests/test_meta_analysis.py b/tests/test_meta_analysis.py new file mode 100644 index 00000000..09a847d3 --- /dev/null +++ b/tests/test_meta_analysis.py @@ -0,0 +1,78 @@ +"""Tests for summary-data meta-analysis (``sp.meta_analysis``). + +Validation against closed-form inverse-variance pooling, the +DerSimonian-Laird heterogeneity formulas, and Egger's test behaviour under +symmetric vs. asymmetric funnels. +""" +import numpy as np +import pytest + +import statspai as sp + + +def test_fixed_effect_matches_inverse_variance(): + y = [0.10, 0.25, -0.05, 0.30, 0.15] + s = [0.05, 0.10, 0.08, 0.12, 0.06] + r = sp.meta_analysis(y, s, method="fixed") + w = 1.0 / np.array(s) ** 2 + fe = float(np.sum(w * np.array(y)) / np.sum(w)) + fe_se = float(np.sqrt(1.0 / np.sum(w))) + assert r.estimate == pytest.approx(fe, abs=1e-10) + assert r.se == pytest.approx(fe_se, abs=1e-10) + assert r.q_df == 4 + assert np.sum(r.weights) == pytest.approx(1.0, abs=1e-9) + + +def test_homogeneous_studies_zero_tau2(): + rng = np.random.default_rng(0) + se = rng.uniform(0.05, 0.15, 25) + y = 0.4 + rng.normal(0, se) # only sampling variation + r = sp.meta_analysis(y, se) + assert r.tau2 == pytest.approx(0.0, abs=1e-3) + assert r.i2 < 0.25 + assert r.random_estimate == pytest.approx(r.fixed_estimate, abs=1e-3) + assert r.estimate == pytest.approx(0.4, abs=0.05) + + +def test_heterogeneous_studies_inflate_random_se(): + rng = np.random.default_rng(1) + mu = rng.normal(0.4, 0.3, 25) # true between-study heterogeneity + se = rng.uniform(0.05, 0.15, 25) + y = mu + rng.normal(0, se) + r = sp.meta_analysis(y, se) + assert r.tau2 > 0 + assert r.i2 > 0.5 + assert r.random_se > r.fixed_se + # prediction interval is wider than the CI of the pooled mean + pi = r.prediction_interval + assert pi is not None + assert (pi[1] - pi[0]) > (r.ci[1] - r.ci[0]) + + +def test_egger_symmetric_vs_asymmetric(): + rng = np.random.default_rng(2) + se = rng.uniform(0.05, 0.30, 50) + y_sym = 0.3 + rng.normal(0, se) + p_sym = sp.meta_analysis(y_sym, se).egger_test()["p_value"] + assert p_sym > 0.10 + + se2 = rng.uniform(0.05, 0.30, 50) + y_asym = 0.3 + 2.0 * se2 + rng.normal(0, se2) # small-study effect + p_asym = sp.meta_analysis(y_asym, se2).egger_test()["p_value"] + assert p_asym < 0.05 + + +def test_validates_inputs(): + with pytest.raises(ValueError, match="at least 2"): + sp.meta_analysis([1.0], [0.1]) + with pytest.raises(ValueError, match="positive"): + sp.meta_analysis([1.0, 2.0], [0.1, -0.1]) + with pytest.raises(ValueError, match="DL|fixed"): + sp.meta_analysis([1.0, 2.0], [0.1, 0.2], method="bogus") + + +def test_summary_and_repr(): + r = sp.meta_analysis([0.1, 0.2, 0.15], [0.05, 0.06, 0.07]) + assert "Meta-analysis" in r.summary() + assert "I^2" in r.summary() + assert "MetaAnalysisResult" in repr(r)