Why Increased Healthcare Spending Fails to Reduce Wait Times in Canada

Modeled provincial healthcare investment efficiency across 10 Canadian provinces to uncover where funding breaks down, enabling evidence-based reallocation recommendations for system planners.

The Finding in 30 Seconds

At the national level, higher provincial budgets correlate with shorter wait times (r = −0.50, p < 0.001). Break the data down by province, and the relationship reverses — provinces with the highest budgets often have the longest waits.

This pattern is an instance of Simpson's paradox. It is consistent with reactive government funding: provinces with structurally long wait times receive budget increases, but aging populations, physician shortages, and facility constraints absorb the investment without producing wait-time improvement.

The budget signal is real. It is not strong enough to be the primary lever.

Quick Numbers

Metric	Value
Pearson correlation	r = −0.50, p < 0.001
Variance explained (OLS)	R² = 0.205 — budget explains 20.5% of wait variation
Effect size	~0.2 days per $1B (directional estimate, not a causal coefficient)
Unexplained variance	79.5% — structural factors dominate
Observations	n = 54 province-year observations out of a possible 60 (10 provinces × 6 years, 2013–2018)
Top predictive feature (XGBoost)	`budget_rank` — provincial structural position, not raw spend

What Should Be Done

This analysis does not establish causality. All recommendations are directional hypotheses requiring quasi-experimental validation.

Recommendation	Evidence Basis	Trade-off
Shift KPI to wait-days per $M CAD (efficiency, not total spend)	R² = 0.205: budget level alone is an insufficient performance signal	Efficiency metrics require consistent cost accounting across provinces
Target BC, NB, PEI first (highest unexplained within-province variance)	Province-level analysis shows these provinces diverge most from budget-predicted wait times	Requires provincial buy-in; not a funding conversation
Invest in structural capacity (physician supply, facility distribution) over aggregate budget transfers	79.5% of wait-time variance is not explained by budget, which points to structural factors as the more promising lever	Longer payback cycle; harder to show as a political win
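The first recommendation proposes wait-days per $M CAD as the KPI. One possible reading of that metric is a plain ratio, sketched below. The function name, the row layout, and the wait values are hypothetical; only the two budget magnitudes (about $59B for ON and $680M for PEI) come from this README.

```python
def wait_days_per_million(wait_days: float, budget_million_cad: float) -> float:
    """Wait days divided by budget in millions of CAD."""
    if budget_million_cad <= 0:
        raise ValueError("budget_million_cad must be positive")
    return wait_days / budget_million_cad


# Budgets are the magnitudes quoted in this README; wait values are invented.
rows = [
    {"province": "ON", "wait_days": 40.0, "budget_million_cad": 59_000.0},
    {"province": "PEI", "wait_days": 45.0, "budget_million_cad": 680.0},
]

kpi = {
    row["province"]: wait_days_per_million(row["wait_days"], row["budget_million_cad"])
    for row in rows
}

for province, value in kpi.items():
    print(f"{province}: {value:.5f} wait-days per $M CAD")
```

Because the raw ratio scales inversely with province size, comparisons across provinces would likely need the per-capita normalization described under Feature Engineering.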

Repository Structure

.
├── src/                          # Python pipeline
│   ├── config.py                 # Path constants, province maps, analysis params
│   ├── data_ingestion.py         # CIHI download + SQLite write; synthetic fallback
│   ├── data_cleaning.py          # Budget + wait time cleaning + merge
│   ├── feature_engineering.py   # 7 features: per-capita, lag, rank, trend
│   ├── modeling.py               # 3-model strategy: OLS → Ridge/Lasso → XGBoost
│   ├── evaluation.py             # Metrics, partial dependence, decision output
│   ├── run_pipeline.py           # Orchestrator — run this
│   └── requirements.txt
│
├── tests/
│   ├── conftest.py               # Shared pytest fixtures (synthetic data, session-scoped)
│   └── test_pipeline.py          # 25 smoke tests: schema, merge ~60 rows, 7 features, R² gate
│
├── docs/
│   ├── executive_brief.md        # 1-page standalone brief for system planners
│   ├── executive_one_pager.md    # Single-slide summary for senior executives
│   ├── decision_output.md        # Explicit recommendations with evidence + trade-offs
│   ├── program_narrative.md      # TPM framing: decisions, trade-offs, stakeholders
│   ├── slide_deck_outline.md     # 5-slide consulting-grade deck outline
│   ├── communications_guide.md   # Audience-specific briefing summaries and analytical FAQ
│   └── program_delivery.md       # Delivery plan: charter, WBS, gates, RACI, risk register
│
├── notebooks/
│   ├── canadian_healthcare_analysis.Rmd   # R analysis
│   └── canadian_healthcare_analysis.md    # Knitted markdown output
│
├── data/
│   ├── README.md                 # Data dictionary, schemas, province codes, assumptions
│   ├── input/                    # Raw CIHI xlsx files (gitignored — download fresh)
│   └── processed/                # Cleaned CSVs (gitignored — regenerated by pipeline)
│
├── outputs/                      # Rendered R outputs (PDF, HTML)
├── pipelines/
│   ├── README.md                 # Pipeline execution guide and deployment instructions
│   └── github_actions_pipeline.yml   # Draft CI/CD workflow (Phase 2 reference)
├── pytest.ini                    # pytest configuration (testpaths = tests)
├── .gitignore
└── README.md

Model Architecture

Model	Features	Purpose	Limitation
Baseline OLS	Budget only	Replicates R analysis; sanity-check gate (R² ≈ 0.205)	Omitted variable bias; no non-linearity
Ridge / Lasso	All 7 engineered features	Stability with n = 54; Lasso auto-selects features	Less interpretable; penalised coefficients
XGBoost*	All 7 engineered features	Non-linear pattern exploration; feature importance	Pattern exploration only — not for prediction or deployment

*XGBoost falls back to RandomForestRegressor if libomp is not installed (brew install libomp on macOS).
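The fallback described in the footnote can be sketched as an ordered import attempt. This is an assumption about how such a fallback might be written, not a copy of `src/modeling.py`; the helper name `load_tree_regressor` is hypothetical. It catches any exception because a missing native library can surface as an error other than `ImportError` when the module is loaded.

```python
import importlib

# Preferred backend first, fallback second.
CANDIDATES = [
    ("xgboost", "XGBRegressor"),
    ("sklearn.ensemble", "RandomForestRegressor"),
]


def load_tree_regressor():
    """Return the first importable regressor class, or None if neither loads."""
    for module_name, class_name in CANDIDATES:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, class_name)
        except Exception:
            # Package not installed, or its native library failed to load.
            continue
    return None


regressor_cls = load_tree_regressor()
print("tree backend:", regressor_cls.__name__ if regressor_cls else "none available")
```

Returning the class rather than an instance leaves hyperparameters to the caller, which matters because the two estimators do not share the same constructor arguments.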

Feature Engineering

	Simple baseline	This analysis
Features	Budget (raw, millions CAD)	7 features capturing per-capita normalization, temporal dynamics, structural position
Problem	Confounded by province size; ON ($59B) vs PEI ($680M) not comparable	Per-capita budget addresses scale; lag addresses timing; rank addresses structural position

All 7 features:

Feature	Addresses
`budget_per_capita`	Province size confound — most important correction
`volume_per_capita`	Demand-side pressure differences across provinces
`budget_lag1`	Tests reactive vs. proactive funding hypothesis
`province_encoded`	Structural position (fiscal scale, ordinal)
`year_trend`	Secular time trend (aging, technology)
`budget_yoy_change`	Direction of investment, not just level
`budget_rank`	Relative provincial position within each year

Population normalization uses a static 2016 Census baseline to avoid introducing temporal bias from interpolated estimates.
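A dependency-free sketch of five of the seven features is shown below to make the definitions concrete. The row layout, the function name `add_features`, the rounded population figures, the choice of fractional year-over-year change, and the ranking direction (1 = largest budget) are all assumptions; the actual logic lives in `src/feature_engineering.py` and may differ.

```python
from collections import defaultdict


def add_features(rows, population_2016):
    """Return new row dicts with five of the seven listed features added."""
    first_year = min(r["year"] for r in rows)
    by_province = defaultdict(dict)  # province -> {year: budget}
    by_year = defaultdict(list)      # year -> [budgets of all provinces]
    for r in rows:
        by_province[r["province"]][r["year"]] = r["budget_million"]
        by_year[r["year"]].append(r["budget_million"])

    out = []
    for r in rows:
        budget = r["budget_million"]
        prev = by_province[r["province"]].get(r["year"] - 1)  # None in first year
        # Rank 1 = largest budget within the same year (direction is an assumption).
        rank = 1 + sum(other > budget for other in by_year[r["year"]])
        out.append({
            **r,
            # Static population baseline, so only the budget varies over time.
            "budget_per_capita": budget * 1_000_000 / population_2016[r["province"]],
            "budget_lag1": prev,
            "budget_yoy_change": None if prev is None else (budget - prev) / prev,
            "budget_rank": rank,
            "year_trend": r["year"] - first_year,
        })
    return out


# Invented budgets (millions CAD) and rounded, illustrative populations.
rows = [
    {"province": "ON", "year": 2013, "budget_million": 50_000.0},
    {"province": "ON", "year": 2014, "budget_million": 55_000.0},
    {"province": "PE", "year": 2013, "budget_million": 600.0},
    {"province": "PE", "year": 2014, "budget_million": 630.0},
]
population_2016 = {"ON": 13_400_000, "PE": 143_000}

features = add_features(rows, population_2016)
for row in features:
    print(row)
```

The first year of each province has no lag value, so any model using `budget_lag1` or `budget_yoy_change` has to drop or impute those rows.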

Key Insights

Budget explains 20.5% of wait time variance — statistically significant, practically limited. The other 79.5% is the more important signal.
Simpson's Paradox at the provincial level — the national negative trend reverses province-by-province, consistent with reactive government funding into structurally constrained systems.
Low marginal return — approximately 0.2 days per $1B on the observed dataset. Large funding increases produce small outcomes.
Structural position outperforms raw budget: XGBoost feature importance ranks `budget_rank` above raw budget, which is consistent with the structural hypothesis but does not confirm it, since the model is used for pattern exploration only.
Diminishing returns — partial dependence analysis estimates that beyond ~$5,000–6,000 per capita, additional spending produces minimal further wait time reduction (directional, n = 54, not a policy rule).
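To put the marginal-return figure in concrete terms, the arithmetic below applies the reported ~0.2 days per $1B to a few hypothetical budget increases. It treats the estimate as linear purely for illustration; the function name is an assumption, and the result is not a forecast.

```python
# Directional estimate reported above; not a causal coefficient.
DAYS_PER_BILLION = 0.2


def implied_wait_reduction_days(extra_budget_billion: float) -> float:
    """Linear extrapolation of the reported estimate, for illustration only."""
    return DAYS_PER_BILLION * extra_budget_billion


for extra in (1.0, 5.0, 10.0):
    days = implied_wait_reduction_days(extra)
    print(f"+${extra:.0f}B -> about {days:.1f} days shorter wait")
```

Under this reading, even a $5B increase corresponds to roughly one day, which is the sense in which large funding increases produce small outcomes.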

What Was Not Done and Why

Approach	Why Not Used
Deep learning	n = 54; no latent structure; no generalisation basis; overkill
Causal inference (IV / DiD / RDD)	No valid instrument variable; no policy discontinuity; observational panel data
50+ feature models	Noise risk overwhelms n = 54; parsimony is the correct call, not a limitation
Individual patient-level analysis	Not in CIHI public data; aggregate-to-individual inference is the ecological fallacy

Choosing not to use a technique, with documented reasoning, is the senior analytical move.

Documents

Document	Purpose
docs/executive_brief.md	1-page brief — read this first
docs/executive_one_pager.md	Single-slide summary for senior executives
docs/decision_output.md	Full recommendation set with evidence and trade-offs
docs/program_narrative.md	TPM framing: analytical decisions, trade-offs, stakeholder context
docs/slide_deck_outline.md	5-slide consulting-grade deck outline
docs/communications_guide.md	Audience-specific briefing summaries for system planners and technical reviewers
docs/program_delivery.md	Program delivery plan: charter, WBS, milestones, gate criteria, RACI, risk register
data/README.md	Data dictionary, sourcing, schemas, assumptions

Setup

Prerequisites: Python 3.10+, pip

# 1. Clone the repo
git clone <repo-url>
cd <repo-name>

# 2. Install dependencies (all free / open-source)
pip install -r src/requirements.txt

# macOS only — required for XGBoost:
brew install libomp

# 3. Run the pipeline
python src/run_pipeline.py

# Optional: attempt live CIHI download
python src/run_pipeline.py --live

No server setup required. SQLite (Python stdlib). No .env file. No credentials.

Output:

data/healthcare.db — SQLite database with raw and processed tables
data/processed/merged_final.csv — read by R notebook
Terminal: model comparison table, feature importance, decision output
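Since this README does not list the table names inside `data/healthcare.db`, a small helper like the one below can be used to discover them after a run. The helper name `list_tables` is hypothetical; it relies only on the standard `sqlite3` module and the built-in `sqlite_master` catalogue.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the connected database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

After the pipeline has run, `list_tables(sqlite3.connect("data/healthcare.db"))` should return the raw and processed table names. Note that `sqlite3.connect` creates an empty file if the path does not exist yet, so an empty list usually means the pipeline has not been run.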

R notebook (optional):

# Run Python pipeline first to generate data/processed/merged_final.csv
# Then open notebooks/canadian_healthcare_analysis.Rmd in RStudio and knit

Running the test suite:

pytest

Tests run against synthetic data. No CIHI connection required. Expected: 25 tests pass in under 30 seconds.
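For orientation, a smoke test of the "7 features" kind might look like the sketch below. It is an illustrative guess at the style of check, not a copy of `tests/test_pipeline.py`; the helper `missing_features` and the test name are hypothetical, while the feature names come from the Feature Engineering table.

```python
# The seven engineered feature names listed in this README.
EXPECTED_FEATURES = (
    "budget_per_capita",
    "volume_per_capita",
    "budget_lag1",
    "province_encoded",
    "year_trend",
    "budget_yoy_change",
    "budget_rank",
)


def missing_features(columns):
    """Return the expected feature names that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_FEATURES if name not in present]


def test_all_seven_features_present():
    # In a real suite the columns would come from a fixture built on synthetic data.
    columns = ["province", "year", *EXPECTED_FEATURES]
    assert missing_features(columns) == []
```

Reporting the missing names, rather than only a count, makes a failing test point directly at the feature that was dropped.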

Data

Both datasets are free public data from the Canadian Institute for Health Information (CIHI).

Dataset	Source
National Health Expenditure Trends	CIHI data catalogue
Wait Times for Priority Procedures	CIHI data catalogue

Limitations

Correlational, not causal. This analysis does not establish causality. All policy recommendations are framed as directional hypotheses requiring quasi-experimental validation.
Small sample. n = 54 province-year observations. Results are directional; cross-validated R² reflects generalisation limits.
Pre-COVID data. 2013–2018 only. Post-2020 disruption likely changed these dynamics significantly.
Aggregate level. Province-year aggregates mask within-province variation. Ecological fallacy risk prohibits individual-level inference.
Budget is total expenditure, not procedure-specific. Targeted capacity investment analysis requires procedure-level budget data not in the CIHI public dataset.

Author

Nammn Joshii | LinkedIn | GitHub

Provenance: Original analysis: October 2019. Repository structured for public portfolio: April 2026. The analytical findings, data, and code are unchanged from the original analysis. Documentation (program narrative, decision output, delivery plan) reflects structured retrospective framing of the 2019 work.
