Commit 8472ff1

docs: add mathematical foundations page for all survival models
1 parent 2a636aa commit 8472ff1

3 files changed: +153 -0 lines changed

TODO.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# TODO – Roadmap for gen_surv

This document outlines future enhancements, features, and ideas for improving the gen_surv package.

---

## 📦 1. Interface and UX

- [ ] Create a `generate(..., return_type="df" | "dict")` interface
- [ ] Add `__version__` using `importlib.metadata` or `poetry-dynamic-versioning`
- [ ] Build a CLI with `typer` or `click`
- [ ] Add example notebooks for each model (`notebooks/` folder)

---

## 📚 2. Documentation

- [ ] Add a "Model Comparison Guide" section
- [ ] Add "How It Works" sections for each model
- [ ] Include usage tutorials in Jupyter format on RTD
- [ ] Optional: add multilingual docs using `sphinx-intl`

---

## 🧪 3. Testing and Quality

- [ ] Add property-based tests with `hypothesis`
- [ ] Cover edge cases (e.g., invalid parameters, n=0, negative censoring)
- [ ] Run tests on multiple Python versions (CI matrix)

---

## 🧠 4. Advanced Models

- [ ] Add Piecewise Exponential Model support
- [ ] Add competing risks / multi-event simulation
- [ ] Implement parametric AFT models (log-normal, log-logistic)
- [ ] Simulate time-varying hazards
- [ ] Add informative or covariate-dependent censoring

---

## 📊 5. Visualization and Analysis

- [ ] Create `plot_survival(df, model=...)` utilities
- [ ] Create `describe_survival(df)` summary helpers
- [ ] Export data to CSV / JSON / Feather

---

## 🌍 6. Ecosystem Integration

- [ ] Add a `GenSurvDataGenerator` compatible with `sklearn`
- [ ] Enable use with `lifelines` and `scikit-survival` (`sksurv`)
- [ ] Export in R-compatible formats (.csv, .rds)

---

## 🔁 7. Other Ideas

- [ ] Add performance benchmarks for each model
- [ ] Improve PyPI discoverability (add keywords)
- [ ] Create a Streamlit or Gradio live demo

docs/source/index.md

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ It includes generators for:
 :caption: Contents

 modules
+theory
 ```

docs/source/theory.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
# 📘 Mathematical Foundations of `gen_surv`

This page presents the mathematical formulation behind the survival models implemented in the `gen_surv` package.

---

## 1. Cox Proportional Hazards Model (CPHM)

The hazard function conditional on covariates is:

$$
h(t \mid X) = h_0(t) \exp(X \beta)
$$

where:

- \( h_0(t) \) is the baseline hazard
- \( X \beta \) is the linear predictor

### Weibull baseline hazard

With parameters \( \lambda > 0 \) and \( \rho > 0 \), the baseline hazard is:

$$
h_0(t) = \lambda \rho t^{\rho - 1}
$$

Integrating it from 0 to t gives the cumulative baseline hazard:

$$
\Lambda_0(t) = \int_0^t h_0(u) \, du = \lambda t^{\rho}
$$

The survival function then becomes:

$$
S(t \mid X) = \exp\left(-\Lambda_0(t) \exp(X \beta)\right)
$$

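The Weibull survival function can be inverted in closed form, which is one common way to simulate event times: setting \( S(t \mid X) \) equal to a uniform draw \( U \) and solving for t gives \( T = \left(-\log U / (\lambda \exp(X \beta))\right)^{1/\rho} \). The sketch below illustrates this using only the standard library. The function name and argument names are hypothetical, and this is not necessarily how `gen_surv` draws its times.

```python
import math
import random


def sample_weibull_cox_time(x, beta, lam, rho, rng):
    """Draw one event time by inverting S(t | X) = exp(-lam * t**rho * exp(x . beta))."""
    linear_predictor = sum(xj * bj for xj, bj in zip(x, beta))
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    return (-math.log(u) / (lam * math.exp(linear_predictor))) ** (1.0 / rho)


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
times = [
    sample_weibull_cox_time([1.0, 0.5], [0.3, -0.2], lam=0.1, rho=1.5, rng=rng)
    for _ in range(5)
]
```

Plugging a sampled time back into the survival function recovers the uniform draw that produced it, which is a quick way to check the inversion.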

---

## 2. Time-Dependent Covariate Model (TDCM)

A generalization of the CPHM in which covariates change over time:

$$
h(t \mid Z(t)) = h_0(t) \exp(Z(t) \beta)
$$

In this package, covariate values are simulated piecewise, with dependence across segments induced by correlated normal draws.

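To make the piecewise idea concrete, here is a minimal sketch under simplifying assumptions that are not taken from the package: a single covariate that is constant within each segment, an AR(1) update to correlate successive segments, and a constant baseline hazard `h0`. The event time is found by accumulating the hazard segment by segment until it reaches a unit-exponential target. All names are hypothetical.

```python
import math
import random


def simulate_tdcm_time(beta, h0, breaks, corr, rng):
    """Draw (event_time, covariate_path) with a piecewise-constant covariate.

    breaks: increasing segment start times beginning at 0.0 (must be non-empty);
    the last segment is open-ended.
    """
    # The event occurs when the cumulative hazard reaches an Exp(1) draw.
    target = -math.log(1.0 - rng.random())
    z = rng.gauss(0.0, 1.0)
    path = []
    cumulative = 0.0
    for k, start in enumerate(breaks):
        if k > 0:
            # AR(1) step: unit variance, correlation `corr` with the previous segment.
            z = corr * z + math.sqrt(1.0 - corr ** 2) * rng.gauss(0.0, 1.0)
        path.append(z)
        rate = h0 * math.exp(beta * z)  # hazard is constant within the segment
        end = breaks[k + 1] if k + 1 < len(breaks) else math.inf
        if cumulative + rate * (end - start) >= target:
            return start + (target - cumulative) / rate, path
        cumulative += rate * (end - start)


event_time, covariate_path = simulate_tdcm_time(
    beta=0.5, h0=0.2, breaks=[0.0, 1.0, 2.0], corr=0.7, rng=random.Random(42)
)
```

The returned path contains one covariate value per segment entered before the event, so its length indicates the segment in which the event fell.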

---

## 3. Continuous-Time Multi-State Markov Model (CMM)

The process is a continuous-time Markov chain with generator matrix \( Q \). Its transition probability matrix is the matrix exponential:

$$
P(t) = \exp(Qt)
$$

where:

- \( Q \) is the rate matrix: off-diagonal entries are transition rates, and each row sums to zero
- \( P(t)_{ij} \) is the probability of being in state j at time t, given a start in state i at time 0

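As an illustration of the matrix exponential, the sketch below approximates \( P(t) = \exp(Qt) \) with a truncated Taylor series combined with scaling and squaring, using plain Python lists. The helper names and the two-state rate matrix are assumptions made for the example. In practice `scipy.linalg.expm` would be the usual choice.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) by scaling and squaring a truncated Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds a**k / k! after this step
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling: exp(A)**2 = exp(2A)
    return result


# Hypothetical two-state chain: rate 0.5 from state 0 to 1, rate 0.2 from 1 to 0.
q = [[-0.5, 0.5], [0.2, -0.2]]
p = transition_matrix(q, 2.0)

# For two states, P(t)[0][0] has a closed form that can serve as a sanity check.
closed_form_p00 = 0.2 / 0.7 + (0.5 / 0.7) * math.exp(-0.7 * 2.0)
```

Each row of the result should sum to one, since it is a probability distribution over the destination states.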

---

## 4. Time-Homogeneous Hidden Markov Model (THMM)

This model simulates observations driven by unobserved (latent) state transitions.

Let:

- \( S_t \) be the latent state at time t
- \( Y_t \) be the variable observed at time t, whose distribution depends on \( S_t \)

The latent states follow a homogeneous Markov chain with transition matrix \( P \), and emissions are Gaussian:

$$
Y_t \mid S_t = k \sim \mathcal{N}(\mu_k, \sigma_k^2)
$$

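A short sketch of how such a process could be simulated in discrete steps: draw the latent state from the current row of \( P \), then draw the observation from the Gaussian attached to that state. The function name, the initial distribution argument, and the parameter values are hypothetical and are not taken from the package.

```python
import random


def simulate_thmm(n_steps, init_probs, trans, means, sds, rng):
    """Return (latent_states, observations) for a Gaussian-emission hidden Markov model."""
    states, observations = [], []
    state = rng.choices(range(len(init_probs)), weights=init_probs)[0]
    for _ in range(n_steps):
        states.append(state)
        # Emission: Y_t | S_t = k ~ N(means[k], sds[k]**2)
        observations.append(rng.gauss(means[state], sds[state]))
        # Transition: next state drawn from row `state` of the transition matrix
        state = rng.choices(range(len(trans)), weights=trans[state])[0]
    return states, observations


latent, observed = simulate_thmm(
    n_steps=20,
    init_probs=[1.0, 0.0],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    means=[0.0, 5.0],
    sds=[1.0, 0.5],
    rng=random.Random(7),
)
```

Only `observed` would be available to an analyst; `latent` is returned here so the simulated truth can be compared against fitted states.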

---

## Notes

All models support censoring, with the censoring time \( C_i \) drawn from one of:

- **Uniform:** \( C_i \sim U(0, \text{cens\_par}) \)
- **Exponential:** \( C_i \sim \text{Exp}(\text{cens\_par}) \)
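
The two censoring schemes can be sketched as follows: draw \( C_i \), then record the smaller of the event time and \( C_i \) together with a status flag. The function name and the `cens_type` argument are hypothetical. The page does not say whether `cens_par` is a rate or a mean in the exponential case, so the sketch assumes a rate.

```python
import random


def apply_censoring(event_time, cens_type, cens_par, rng):
    """Return (observed_time, status); status is 1 for an observed event, 0 if censored."""
    if cens_type == "uniform":
        c = rng.uniform(0.0, cens_par)
    elif cens_type == "exponential":
        # Assumption: cens_par is a rate, so the mean censoring time is 1 / cens_par.
        c = rng.expovariate(cens_par)
    else:
        raise ValueError(f"unknown censoring type: {cens_type!r}")
    return (event_time, 1) if event_time <= c else (c, 0)


rng = random.Random(0)
observed_time, status = apply_censoring(2.5, "uniform", 4.0, rng)
```

If `cens_par` is meant as a mean instead, passing `1.0 / cens_par` to `expovariate` gives the corresponding draw.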
