52 changes: 52 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,52 @@
name: Deploy Docs

on:
  push:
    branches: [main]
  pull_request:
    paths:
      - "docs/**"
      - "mkdocs.yml"
      - ".github/workflows/docs.yml"

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - uses: Swatinem/rust-cache@v2

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: |
          pip install uv
          uv sync --all-extras

      - name: Generate docs assets
        run: .venv/bin/python scripts/generate_docs_assets.py

      - name: Verify generated assets
        run: |
          test -f docs/images/causal_impact_plot.png
          echo "PNG asset verified."

      - name: Build docs (strict mode)
        run: .venv/bin/mkdocs build --strict

      - name: Verify HTML output
        run: |
          grep -q "Getting Started" site/index.html
          echo "HTML content verified."

      - name: Deploy to GitHub Pages
        if: github.ref == 'refs/heads/main'
        run: .venv/bin/mkdocs gh-deploy --force
2 changes: 2 additions & 0 deletions .gitignore
@@ -11,3 +11,5 @@ build/
z-ai/
.claude/
*.DS_Store
site/
docs/images/*.png
85 changes: 85 additions & 0 deletions docs/api.md
@@ -0,0 +1,85 @@
# API Reference

## `CausalImpact`

```python
from causal_impact import CausalImpact

ci = CausalImpact(data, pre_period, post_period, model_args=None, alpha=0.05)
```

### Constructor Parameters

| Parameter | Type | Description |
|---|---|---|
| `data` | `DataFrame` or `ndarray` | First column is the response variable, remaining columns are covariates |
| `pre_period` | `list[str \| int]` | `[start, end]` of the pre-intervention period |
| `post_period` | `list[str \| int]` | `[start, end]` of the post-intervention period |
| `model_args` | `dict` or `ModelOptions` | MCMC parameters (see below) |
| `alpha` | `float` | Significance level for credible intervals (default: 0.05) |
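
Since `alpha` is the significance level, the credible intervals cover `1 - alpha` of the posterior (95% at the default). The sketch below illustrates that relationship on simulated draws. The helper `credible_interval` is hypothetical and is not part of `causal_impact`; how the library computes its intervals internally is not shown in this reference.

```python
import random


def credible_interval(draws, alpha=0.05):
    """Equal-tailed (1 - alpha) interval from posterior draws (illustrative helper)."""
    ordered = sorted(draws)
    n = len(ordered)
    # Nearest-rank positions of the alpha/2 and 1 - alpha/2 quantiles.
    lo_idx = int((alpha / 2) * (n - 1))
    hi_idx = int(round((1 - alpha / 2) * (n - 1)))
    return ordered[lo_idx], ordered[hi_idx]


# Simulated draws standing in for a posterior over the average effect.
rng = random.Random(0)
draws = [rng.gauss(4.5, 0.5) for _ in range(2000)]

lower, upper = credible_interval(draws, alpha=0.05)            # 95% interval
wide_lower, wide_upper = credible_interval(draws, alpha=0.01)  # 99% interval
print(f"95%: [{lower:.2f}, {upper:.2f}]  99%: [{wide_lower:.2f}, {wide_upper:.2f}]")
```

A smaller `alpha` yields a wider interval, which is why lowering `alpha` makes it harder for an interval to exclude zero.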

### Methods

| Method | Returns | Description |
|---|---|---|
| `summary(output="summary", digits=2)` | `str` | Tabular summary of causal effects. Set `output="report"` for narrative form. |
| `report()` | `str` | Narrative interpretation of results (shortcut for `summary(output="report")`) |
| `plot(metrics=None)` | `Figure` | Matplotlib figure with original/pointwise/cumulative panels. Pass a list like `["original", "cumulative"]` to select panels. |

### Properties

| Property | Type | Description |
|---|---|---|
| `inferences` | `DataFrame` | Per-timestep effects, predictions, and credible intervals |
| `summary_stats` | `dict` | Aggregate statistics (effect mean, CI, p-value, etc.) |
| `posterior_inclusion_probs` | `ndarray \| None` | Posterior inclusion probability per covariate (requires covariates) |

## `ModelOptions`

```python
from causal_impact import ModelOptions

opts = ModelOptions(niter=5000, seed=123)
ci = CausalImpact(data, pre_period, post_period, model_args=opts)
```

### Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `niter` | `int` | 1000 | Total MCMC iterations |
| `nwarmup` | `int` | 500 | Burn-in iterations to discard |
| `nchains` | `int` | 1 | Number of MCMC chains |
| `seed` | `int` | 0 | Random seed for reproducibility |
| `prior_level_sd` | `float` | 0.01 | Prior standard deviation for the local level |
| `standardize_data` | `bool` | `True` | Standardize data before fitting |
| `expected_model_size` | `int` | 1 | Expected number of active covariates for spike-and-slab prior |
| `nseasons` | `int \| None` | `None` | Seasonal cycle count |
| `season_duration` | `int \| None` | `None` | Duration of each seasonal block; defaults to 1 when `nseasons` is set |

!!! note "expected_model_size defaults"
    `CausalImpact` sets `expected_model_size=2` by default (matching R).
    `ModelOptions` keeps `expected_model_size=1` as the explicit default.
    When passing a `ModelOptions` instance, the `ModelOptions` value takes precedence.
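
To give a feel for why the two defaults differ in practice, here is a minimal sketch of how an expected model size is commonly turned into a prior inclusion probability in spike-and-slab regression. It follows the convention used by R's `bsts` (expected size divided by the number of covariates, capped at 1). Whether this port uses the same formula is an assumption, and `prior_inclusion_probability` is a hypothetical helper, not a library function.

```python
def prior_inclusion_probability(expected_model_size, n_covariates):
    """Prior probability that any single covariate is included (assumed convention)."""
    if n_covariates <= 0:
        raise ValueError("n_covariates must be positive")
    # Cap at 1.0 so the result is always a valid probability.
    return min(1.0, expected_model_size / n_covariates)


# With 10 candidate covariates:
p_causal_impact_default = prior_inclusion_probability(2, 10)  # dict / default path
p_model_options_default = prior_inclusion_probability(1, 10)  # ModelOptions path
# With a single covariate, an expected size of 2 is capped.
p_capped = prior_inclusion_probability(2, 1)

print(p_causal_impact_default, p_model_options_default, p_capped)
```

Under this assumption, passing a bare `ModelOptions()` halves the prior inclusion probability relative to the `CausalImpact` default, so set `expected_model_size` explicitly when parity with R matters.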

## `CausalImpactResults`

Accessed via `ci._results`. A frozen dataclass containing all computed quantities.

### Fields

| Field | Type | Description |
|---|---|---|
| `actual` | `ndarray` | Observed y values in the post period |
| `point_effects` | `ndarray` | Mean effect per time point |
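| `point_effect_lower` | `ndarray` | Lower credible-interval bound per time point (95% at the default `alpha=0.05`) |
| `point_effect_upper` | `ndarray` | Upper credible-interval bound per time point (95% at the default `alpha=0.05`) |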
| `point_effect_mean` | `float` | Mean of point effects across time |
| `ci_lower` | `float` | Lower CI bound on average effect |
| `ci_upper` | `float` | Upper CI bound on average effect |
| `cumulative_effect_total` | `float` | Total cumulative effect |
| `relative_effect_mean` | `float` | Relative effect (effect / predicted) |
| `p_value` | `float` | Bayesian one-sided tail probability |
| `predictions_mean` | `ndarray` | Mean counterfactual prediction |
| `predictions_lower` | `ndarray` | Lower CI on counterfactual |
| `predictions_upper` | `ndarray` | Upper CI on counterfactual |
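
The `p_value` field is described as a one-sided tail probability. As a rough sketch of what such a quantity looks like, the helper below counts posterior effect draws on the minority side of zero and adds one to both numerator and denominator so the result is never exactly 0. This mirrors a common convention (also used in the R package); it is an illustration on made-up draws, not this library's implementation, and `tail_area_probability` is a hypothetical name.

```python
def tail_area_probability(effect_draws):
    """One-sided posterior tail-area probability with a +1 correction (illustrative)."""
    n = len(effect_draws)
    n_pos = sum(1 for d in effect_draws if d > 0)
    n_neg = sum(1 for d in effect_draws if d < 0)
    return (min(n_pos, n_neg) + 1) / (n + 1)


# 999 made-up draws that are all positive: the smallest attainable p is 1/1000.
all_positive = [0.5 + 0.01 * i for i in range(999)]
p_strong = tail_area_probability(all_positive)

# A small mixed sample: one negative draw out of four.
p_weak = tail_area_probability([-1.0, 1.0, 2.0, 3.0])

print(p_strong, p_weak)  # 0.001 0.4
```

The "posterior probability of a causal effect" shown in the summary output is then simply `1 - p`.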
Empty file added docs/images/.gitkeep
Empty file.
192 changes: 192 additions & 0 deletions docs/index.md
@@ -0,0 +1,192 @@
# bsts-causalimpact

Bayesian structural time series for causal inference in Python.
A faithful port of Google's [CausalImpact](https://google.github.io/CausalImpact/) R package, with the Gibbs sampler implemented in Rust via PyO3.

No TensorFlow required. 10-30x faster than R.

## Installation

```bash
pip install bsts-causalimpact
```

Requires Python 3.10+. A Rust toolchain is needed only when building from source.

## Example: Measuring the Effect of an Intervention

This walkthrough mirrors the [R CausalImpact tutorial](https://google.github.io/CausalImpact/CausalImpact.html).

### 1. Create Example Data

Construct a synthetic dataset: a response variable `y` driven by a covariate `x`,
with a known intervention effect of +5 units injected after time point 100.

```python
import numpy as np
import pandas as pd
from causal_impact import CausalImpact

rng = np.random.default_rng(42)

n_pre, n_post = 100, 30
n = n_pre + n_post

x = rng.normal(0, 1, size=n).cumsum() + 100
y = 1.2 * x + rng.normal(0, 1, size=n)
y[n_pre:] += 5.0 # inject intervention effect

dates = pd.date_range("2020-01-01", periods=n, freq="D")
data = pd.DataFrame({"y": y, "x": x}, index=dates)

pre_period = ["2020-01-01", "2020-04-09"]
post_period = ["2020-04-10", "2020-05-09"]
```

The first column (`y`) is the response variable. Remaining columns are covariates
that the model uses to build a counterfactual prediction.
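
The period boundaries above are hard-coded date strings, so it is worth confirming that they match `n_pre = 100` and `n_post = 30`. The following standard-library check derives the same dates from the start date; it is a sanity check only and does not use `causal_impact`.

```python
from datetime import date, timedelta

start = date(2020, 1, 1)
n_pre, n_post = 100, 30

# Both ends of each period are inclusive, hence the "- 1".
pre_end = start + timedelta(days=n_pre - 1)
post_start = pre_end + timedelta(days=1)
post_end = post_start + timedelta(days=n_post - 1)

print(pre_end.isoformat(), post_start.isoformat(), post_end.isoformat())
# 2020-04-09 2020-04-10 2020-05-09
```

Note that 2020 is a leap year, which is why 100 days from January 1 ends on April 9 rather than April 10.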

### 2. Run the Analysis

```python
ci = CausalImpact(data, pre_period, post_period, model_args={"seed": 42})
```

`CausalImpact` fits a Bayesian structural time series model on the pre-intervention
data, then generates counterfactual predictions for the post-intervention period.

### 3. Visualize the Results

```python
fig = ci.plot()
fig.savefig("causal_impact_plot.png", dpi=150, bbox_inches="tight")
```

![CausalImpact Plot](images/causal_impact_plot.png)

The plot has three panels:

- Top panel: observed data (solid) vs. counterfactual prediction (dashed) with 95% credible interval
- Middle panel: pointwise causal effect (observed minus predicted)
- Bottom panel: cumulative causal effect over the post-intervention period
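
The middle and bottom panels are related by a running sum. A minimal sketch with made-up numbers (not output from the library) shows the relationship:

```python
from itertools import accumulate

# Made-up post-intervention values, for illustration only.
observed = [105.0, 106.5, 104.0, 107.0]
predicted = [100.0, 101.0, 100.5, 101.5]

# Middle panel: pointwise effect = observed minus counterfactual prediction.
pointwise = [o - p for o, p in zip(observed, predicted)]

# Bottom panel: cumulative effect = running sum of the pointwise effects.
cumulative = list(accumulate(pointwise))

print(pointwise)   # [5.0, 5.5, 3.5, 5.5]
print(cumulative)  # [5.0, 10.5, 14.0, 19.5]
```

The cumulative panel is only meaningful when the response is a flow (for example daily sales) rather than a stock (for example account balance).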

### 4. Summary Statistics

```python
print(ci.summary())
```

```
Posterior inference {CausalImpact}

                         Average            Cumulative
Actual                   117.11             3513.26
Prediction (s.d.)        112.66 (0.49)      3379.75 (14.79)
95% CI                   [111.63, 113.61]   [3348.81, 3408.33]

Absolute effect (s.d.)   4.45 (0.49)        133.51 (14.79)
95% CI                   [3.50, 5.48]       [104.93, 164.45]

Relative effect (s.d.)   3.95% (0.46%)      3.95% (0.46%)
95% CI                   [3.08%, 4.91%]     [3.08%, 4.91%]

Posterior tail-area probability p: 0.001
Posterior prob. of a causal effect: 99.90%
```

The summary table shows the average and cumulative causal effect, along with
credible intervals and the posterior tail-area probability `p`.
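
The rows of the table are tied together by simple arithmetic. The sketch below recomputes the derived rows from the rounded "Actual" and "Prediction" averages printed above; small differences from the table (for example 133.5 versus 133.51) come from that rounding.

```python
# Rounded values copied from the summary table above.
actual_avg = 117.11
predicted_avg = 112.66
n_post = 30
p = 0.001

absolute_effect = actual_avg - predicted_avg      # average effect per time point
relative_effect = absolute_effect / predicted_avg  # effect relative to counterfactual
cumulative_effect = absolute_effect * n_post       # summed over the post-period
posterior_prob = 1 - p                             # probability of a causal effect

print(f"{absolute_effect:.2f}")    # 4.45
print(f"{relative_effect:.2%}")    # 3.95%
print(f"{cumulative_effect:.1f}")  # 133.5
print(f"{posterior_prob:.2%}")     # 99.90%
```

The recovered average effect of about 4.45 is close to the true injected effect of 5.0, and the true value lies inside the reported 95% interval `[3.50, 5.48]`.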

### 5. Narrative Report

```python
print(ci.report())
```

```
Analysis report {CausalImpact}

During the post-intervention period, the response variable showed a increase
compared to what would have been expected without the intervention.

The average causal effect was 4.45 (95% CI [3.50, 5.48]).

The cumulative effect over the entire post-period was 133.51.

The relative effect was 4.0%.

This effect is statistically significant (p = 0.0010). The probability of
obtaining an effect of this magnitude by chance is very small. Hence, the
causal effect can be considered statistically significant.
```

### 6. Access Raw Inferences

```python
# Per-timestep effects, predictions, and credible intervals
df = ci.inferences
print(df.head())

# Aggregate statistics as a dict
stats = ci.summary_stats
print(stats["point_effect_mean"])
print(stats["p_value"])
```

## Working with Covariates

The model treats the first column as the response and all remaining columns
as covariates. Covariates must not be affected by the intervention.

```python
data = pd.DataFrame({
    "y": response,
    "x1": covariate_1,
    "x2": covariate_2,
}, index=dates)

ci = CausalImpact(data, pre_period, post_period)
```

When covariates are present, the model uses spike-and-slab variable selection
to determine which covariates are informative. Check posterior inclusion
probabilities:

```python
print(ci.posterior_inclusion_probs)
# e.g. array([0.98, 0.12]): x1 is strongly included, x2 is not
```
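
To read the probabilities alongside covariate names, one option is to pair them up and apply a cutoff. The sketch below uses the illustrative values from the comment above and a hypothetical threshold of 0.5; the cutoff is a reporting convenience chosen for this example, not something the library defines.

```python
# Illustrative values; in practice take them from data.columns[1:] and
# ci.posterior_inclusion_probs.
covariate_names = ["x1", "x2"]
inclusion_probs = [0.98, 0.12]

threshold = 0.5  # hypothetical cutoff for "informative"

by_name = dict(zip(covariate_names, inclusion_probs))
selected = [name for name, prob in by_name.items() if prob >= threshold]

print(by_name)   # {'x1': 0.98, 'x2': 0.12}
print(selected)  # ['x1']
```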

## Model Parameters

| Parameter | Default | Description |
|---|---|---|
| `niter` | 1000 | Total MCMC iterations |
| `nwarmup` | 500 | Burn-in iterations to discard |
| `nchains` | 1 | Number of MCMC chains |
| `seed` | 0 | Random seed for reproducibility |
| `prior_level_sd` | 0.01 | Prior standard deviation for the local level |
| `standardize_data` | `True` | Standardize data before fitting |
| `expected_model_size` | 2 | Expected number of active covariates (spike-and-slab prior) |
| `nseasons` | `None` | Seasonal cycle count |
| `season_duration` | `None` | Duration of each seasonal block (defaults to 1 when `nseasons` is set) |

Pass parameters via `model_args`:

```python
ci = CausalImpact(
    data, pre_period, post_period,
    model_args={"niter": 5000, "seed": 123, "prior_level_sd": 0.05}
)
```

## When This Method Is Valid

This method produces reliable estimates only when all of the following hold:

- Control series are not contaminated by the intervention
- The relationship between treated and control series is stable across the pre- and post-intervention periods
- The pre-intervention period is sufficiently long (rule of thumb: at least 3x the post-intervention period)

If any of these assumptions are violated, consider a difference-in-differences or
synthetic control approach instead.
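
The pre-period length rule of thumb above is easy to check before running an analysis. A minimal sketch follows; `pre_period_long_enough` is a hypothetical helper, and the factor of 3 is the heuristic stated above rather than a hard requirement.

```python
def pre_period_long_enough(n_pre, n_post, ratio=3):
    """Return True if the pre-period is at least `ratio` times the post-period."""
    if n_pre <= 0 or n_post <= 0:
        raise ValueError("period lengths must be positive")
    return n_pre >= ratio * n_post


# The walkthrough above uses 100 pre-period and 30 post-period points.
ok_example = pre_period_long_enough(100, 30)
too_short = pre_period_long_enough(60, 30)

print(ok_example, too_short)  # True False
```

This only checks length. The other two conditions (uncontaminated controls and a stable relationship) require domain knowledge and cannot be verified from series lengths alone.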
34 changes: 34 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,34 @@
site_name: bsts-causalimpact
site_url: https://yuminosukesato.github.io/bsts-causalimpact/
repo_url: https://github.com/YuminosukeSato/bsts-causalimpact
repo_name: YuminosukeSato/bsts-causalimpact

theme:
  name: material
  palette:
    - scheme: default
      primary: indigo
      accent: indigo
  features:
    - navigation.tabs
    - content.code.copy

nav:
  - Getting Started: index.md
  - API Reference: api.md
  - Migration from R: migration-from-r.md
  - Migration from tfp: migration-from-tfp.md
  - Compatibility Matrix: compatibility-matrix.md

plugins:
  - search

markdown_extensions:
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.superfences
  - admonition
  - pymdownx.details
  - tables
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -19,6 +19,10 @@ dev = [
    "pytest-cov",
    "ruff",
]
docs = [
    "mkdocs-material>=9.0",
    "pymdown-extensions>=10.0",
]

[tool.maturin]
features = ["pyo3/extension-module"]