CDCgov · swo · Dec 29, 2025 · Dec 27, 2025 · Dec 29, 2025 · Dec 29, 2025
diff --git a/.github/workflows/mkdocs.yaml b/.github/workflows/mkdocs.yaml
@@ -0,0 +1,52 @@
+name: mkdocs
+on:
+  pull_request:
+  push:
+    branches:
+      - main
+
+env:
+  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
+
+# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
+permissions:
+  contents: write
+  pages: write
+  id-token: write
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    outputs:
+      page_artifact_id: ${{ steps.upload.outputs.artifact_id }}
+    steps:
+      - uses: actions/checkout@v5
+      - uses: astral-sh/setup-uv@v6
+        with:
+          enable-cache: true
+      - uses: actions/setup-python@v6
+        with:
+          python-version-file: ".python-version"
+      - run: uv sync --locked --only-group mkdocs
+      - run: uv run mkdocs build --strict
+      - uses: actions/upload-pages-artifact@v4
+        id: upload
+        with:
+          name: github-pages
+          path: site
+          retention-days: "3"
+
+  deploy:
+    if: ${{ github.event_name == 'push' && github.ref_name == 'main' }}
+    runs-on: ubuntu-latest
+    needs: build
+
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - uses: actions/deploy-pages@v4
+        id: deployment
+        with:
+          artifact_name: github-pages
+          preview: false
diff --git a/docs/gam.md b/docs/gam.md
@@ -1,7 +1,10 @@
-# Overview
+# GAMs
+
+## Overview
 
 A GAM (generalized additive model) relates vaccine uptake ($y_i$) to a smooth function ($f(.)$) of the elapsed variable ($x_i$, the number of days after vaccine roll-out) plus a random effect of season ($u_j$),
 through the link function $g^{-1}(.)$.
+
 ```math
 g^{-1}(E(y_i)) = f(x_i) + u_j + \beta_0
 
@@ -25,17 +28,20 @@ $y$ is a vector of observed vaccine uptake, $X$ is the design matrix of basis fu
 Because the main effect and the random effect are additive, we consider them separately for now.
 
 ### Main effect
+
 The loglikelihood function is:
 
 ```math
 Loglik(\beta, \lambda, u |y) = Loglik(y| \beta) - \lambda \beta^TS\beta
 ```
+
 $S$ is the penalty matrix, which penalizes the wiggliness of the smooth function. In our case, we use cubic splines as the basis functions $B_k(x)$, and wiggliness is measured by the integral of the squared second derivative of the smooth, which gives the entries of $S$:
 
 ```math
 S_{ij} = \int{B''_i(x)B''_j(x)dx}
 
 ```
+
 In this way, $S$ penalizes the curvature of the basis functions. $\lambda$ is a smoothing parameter that controls the balance between smoothness and fidelity to the data; it is estimated along with $\beta$.
 
 Exponentiating the log-likelihood function, we have:
@@ -83,6 +89,7 @@ For each smooth term, it is possible to have identifiability issue between $f(x_
 \sum_i^N{f(x_i)} = 0
 
 ```
+
 In matrix form, it is:
 
 ```math

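Note on the sum-to-zero constraint in `docs/gam.md` above: the matrix form is cut off by the hunk boundary, so the following is only a minimal sketch of one common way to impose $\sum_i f(x_i) = 0$, namely centering each column of the basis design matrix. The helper names `center_columns` and `smooth_values`, and the toy matrix `X`, are hypothetical and not taken from the repository.

```python
def center_columns(X):
    """Subtract each column's mean so that every column of X sums to zero."""
    n_rows = len(X)
    n_cols = len(X[0])
    col_means = [sum(row[k] for row in X) / n_rows for k in range(n_cols)]
    return [[row[k] - col_means[k] for k in range(n_cols)] for row in X]


def smooth_values(X, beta):
    """Evaluate f(x_i) = sum_k X[i][k] * beta[k] for every observation i."""
    return [sum(x_ik * b_k for x_ik, b_k in zip(row, beta)) for row in X]


# Toy design matrix: 4 observations, 3 basis functions (illustrative values only).
X = [
    [1.0, 0.0, 0.5],
    [0.2, 0.8, 0.1],
    [0.0, 0.3, 0.9],
    [0.4, 0.4, 0.4],
]
beta = [1.5, -2.0, 0.7]

# With centered columns, the fitted smooth sums to zero for any beta,
# so the intercept beta_0 is no longer confounded with the level of f.
f_centered = smooth_values(center_columns(X), beta)
print(abs(sum(f_centered)) < 1e-12)
```

Because the constraint holds for every choice of coefficients after centering, it removes the ambiguity between the intercept and the overall level of the smooth without restricting the shape of $f$.
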
diff --git a/docs/javascript/katex.js b/docs/javascript/katex.js
@@ -0,0 +1,10 @@
+document$.subscribe(({ body }) => {
+  renderMathInElement(body, {
+    delimiters: [
+      { left: "$$", right: "$$", display: true },
+      { left: "$", right: "$", display: false },
+      { left: "\\(", right: "\\)", display: false },
+      { left: "\\[", right: "\\]", display: true },
+    ],
+  });
+});
diff --git a/docs/model_details.md b/docs/model_details.md
@@ -1,19 +1,20 @@
-# Overview
+# Model details
 
 These are the mathematical details of the models used to capture and forecast vaccine uptake. There is currently just one model: a mixture of a logistic and linear function. This model proposes a latent true uptake curve, which is subject to observation error. A hierarchy accounts for the unique effects of grouping factors (e.g. season, geography, age) on model parameters.
 
-# Logistic Plus Linear (LPL) Model
+## Logistic Plus Linear (LPL) Model
 
-## Notation
+### Notation
 
 The following notation will be used for the LPL model:
+
 - $t$ = time since the start of the season, expressed as the fraction of a year elapsed
 - $V_t^{obs}$ = number of people surveyed at time $t$ who are vaccinated
 - $N_t^{obs}$ = total number of people surveyed at time $t$
 - $c_t$ = latent true cumulative uptake on day $t$
 - $G$ = grouping factors (e.g. season, geographic area, age group, race/ethnicity), indexed by $i$ with $I$ total factors
 
-## Summary
+### Summary
 
 At a high level, the LPL model is structured as follows:
 
@@ -35,7 +36,7 @@ Here, $t$ is rescaled by dividing by 365, so that $t$ represents the proportion
 \end{align*}
 ```
 
-## Observation Layer
+### Observation Layer
 
 The observed uptake is considered a draw from the beta-binomial distribution, governed in part by the true latent uptake in the population.
 
@@ -47,7 +48,7 @@ The observed uptake is considered a draw from the beta-binomial distribution, go
 
 Note that the shape parameters $\alpha$ and $\beta$ are not declared explicitly. Rather they are implied by an alternate mean and concentration parametrization, described below.
 
-## Functional Structure
+### Functional Structure
 
 The model's functional structure describes the latent true uptake curve:
 
@@ -57,7 +58,7 @@ The model's functional structure describes the latent true uptake curve:
 \end{align*}
 ```
 
- $c_{t,G_1,...,G_I}$ serves as the mean of the beta distribution in the beta-binomial likelihood in the observation-layer. A fixed concentration parameter $d$ is also required. From the mean and concentration, the two shape parameters of the beta distribution are as follows:
+$c_{t,G_1,...,G_I}$ serves as the mean of the beta distribution in the beta-binomial likelihood in the observation layer. A fixed concentration parameter $d$ is also required. From the mean and concentration, the two shape parameters of the beta distribution are as follows:
 
 ```math
 \begin{align*}
@@ -66,7 +67,7 @@ The model's functional structure describes the latent true uptake curve:
 \end{align*}
 ```
 
-## Hierarchical Structure
+### Hierarchical Structure
 
 Certain parameters of the latent true uptake curve have group-specific deviations, determined as follows:
 
@@ -79,7 +80,7 @@ Certain parameters of the latent true uptake curve have group-specific deviation
 
 and similarly for $M$.
 
-## Priors
+### Priors
 
 ```math
 \begin{align*}

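Note on `docs/model_details.md` above: the equations for the functional structure and the shape parameters fall outside the hunks shown, so the sketch below rests on two assumptions. First, the latent curve is taken to be a logistic term plus a zero-intercept linear term, with asymptote `A`, midpoint `H`, steepness `n`, and slope `M`. Second, the concentration `d` is taken to equal $\alpha + \beta$, the usual mean-concentration parametrization. The function names and the parameter values are hypothetical.

```python
import math


def lpl_uptake(t, A, H, n, M):
    """Assumed latent uptake: logistic (asymptote A, midpoint H, steepness n)
    plus a slope-only linear term M * t. t is the fraction of a year elapsed."""
    return A / (1.0 + math.exp(-n * (t - H))) + M * t


def beta_shapes(mean, d):
    """Convert a mean and concentration d (assumed to be alpha + beta)
    into the two shape parameters of a beta distribution."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if d <= 0.0:
        raise ValueError("concentration d must be positive")
    return mean * d, (1.0 - mean) * d


# Illustrative parameter values only; they are not fitted estimates.
c_t = lpl_uptake(t=0.25, A=0.4, H=0.25, n=20.0, M=0.1)
alpha, beta = beta_shapes(c_t, d=100.0)

# The shapes reproduce the requested mean and concentration.
print(math.isclose(alpha / (alpha + beta), c_t))
print(math.isclose(alpha + beta, 100.0))
```

Under this parametrization a larger `d` concentrates the beta distribution more tightly around the latent uptake, so observations are expected to scatter less around the curve.
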
diff --git a/docs/model_journal.md b/docs/model_journal.md
@@ -1,36 +1,37 @@
-# Overview
+# Model "journal"
 
 Choosing a model structure has been trickier than expected. This is an informal record of the attempts that have been made, their outcomes, and consequent directions.
 
-# Cumulative S Curves
+## Cumulative S Curves
 
 The Scenarios team would like to use vaccine uptake forecasts as input for their ODE models. For this purpose, they need uptake curves that are continuously differentiable, not forecasts that are a series of point estimates. Autoregressive and stochastic models are not suitable for this purpose. Consequently, families of S-curves that directly model cumulative uptake will be prioritized, starting with the Hill function.
 
-# Hill Function
+## Hill Function
 
 In brief, the original Hill function model considers latent true uptake to follow a Hill curve shape. The final uptake and midpoint parameters ("A" and "H", respectively) can deviate additively from their overall averages based on grouping factors (e.g. season, state), while the steepness parameter ("n") only has one overall value. Observed uptake is centered on the latent true uptake but may have error in either direction. The magnitude of this error is derived from the 95% confidence interval reported along with each point estimate in the NIS data.
 
-## Trouble Pt. 1: Truncated Normal Observation
+### Trouble Pt. 1: Truncated Normal Observation
 
 The Hill model would not fit - the MCMC chains remained stationary and returned hundreds of identical draws with ESS = 1.0. This happened whether grouping factors were included or not (i.e. one curve fit across all seasons at the national scale). The stationary chains were resolved by ignoring the empirical estimates of observation error: when observation error was fixed at a generous value (0.03) or was fit as a free parameter, the MCMC chains were no longer stationary.
 
 Why did empirical observation error break MCMC? The original Hill model used a truncated Normal draw to describe the observation process. The reported 95% confidence intervals were assumed to be Wald intervals, such that an interval's half-width divided by 1.96 approximates the standard deviation of the truncated Normal. These standard deviations were often on the order of 0.001, implying that the observed uptake curves are very close to the latent true uptake. But the Hill function does not fit the data that well: especially in the latter half of seasons, true uptake continues creeping upward while the Hill function asymptotes. Thus, no parameter set exists that can get the Hill function close enough to all data points, and MCMC gets stuck in flat portions of the likelihood landscape.
 
-## Solution Pt. 1: Beta-Binomial Observation
+### Solution Pt. 1: Beta-Binomial Observation
 
 MCMC chains were unstuck by reinterpreting the empirical confidence intervals in terms of the actual data collection process. Cumulative uptake is estimated by the proportion p of N phone survey participants who report being vaccinated. By considering an interval's half-width divided by 1.96 to be the standard error of the mean (SEM) for the reported uptake proportion p, N was estimated at each data point. Sensibly, the estimated N is on the order of 1,000 for individual states and 50,000 at the national scale.
 
 With estimates of pN and N in hand, the observation process was replaced with a beta-binomial likelihood, which inherently permits observations to vary farther from the latent true uptake, compared to the truncated Normal likelihood. Consequently, the MCMC chains began sampling parameter space more freely.
 
-## Trouble Pt. 2: Hill Function Shape
+### Trouble Pt. 2: Hill Function Shape
 
 Even with MCMC proceeding, other warning signs arose:
+
 - When grouping on season alone, season-specific deviations in both A and H from their overall averages have very wide 95% credible intervals, straddling 0. And yet, it is clear that uptake curves do differ from one another across seasons.
 - When grouping on season and state, the fitting proceeds very slowly (1-2 it/s). A, A-deviations-by-season, H, and H-deviations-by-season all had very low ESS (40-60, despite 500 samples after warmup). A-deviations-by-state had even lower ESS (10-15). H-deviations-by-state had higher ESS, but the magnitude of variation in H among states was estimated very close to 0.
 
 Together, these warning signs suggest some non-identifiability among the parameters that vary by grouping factor, perhaps again driven by the poor fit of the Hill function to uptake curves.
 
-## Solution Pt 2: Logistic + Linear Functions
+### Solution Pt. 2: Logistic + Linear Functions
 
 Many warning signs were alleviated by changing the structure of the latent true uptake from a pure Hill function to a logistic function plus a slope-only linear function (intercept = 0). In this model, the linear slope "M" and the logistic asymptote "A" can deviate additively from their overall averages by group, while the logistic midpoint "H" and steepness "n" are fixed across groups. In particular, this mixed function allows uptake to continue creeping upward late in a season.
 

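Note on `docs/model_journal.md` above: the "Solution Pt. 1" step of recovering N from a reported 95% confidence interval can be sketched as follows. The sketch assumes, as the journal does, that the interval is a Wald interval, so that SEM = half-width / 1.96 and SEM = sqrt(p(1 - p) / N). The function name `estimate_survey_size` and the example numbers are hypothetical and are not NIS data.

```python
import math


def estimate_survey_size(p, ci_lower, ci_upper, z=1.96):
    """Back out the survey size N and vaccinated count pN implied by a
    Wald interval around a reported uptake proportion p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    half_width = (ci_upper - ci_lower) / 2.0
    if half_width <= 0.0:
        raise ValueError("the interval must have positive width")
    sem = half_width / z               # standard error of the proportion
    n_total = p * (1.0 - p) / sem**2   # solve SEM^2 = p(1 - p) / N for N
    return n_total, p * n_total


# Round trip with made-up numbers: build a Wald interval for N = 1000, then recover N.
p_true, n_true = 0.4, 1000
sem_true = math.sqrt(p_true * (1.0 - p_true) / n_true)
n_hat, vaccinated_hat = estimate_survey_size(
    p_true, p_true - 1.96 * sem_true, p_true + 1.96 * sem_true
)
print(round(n_hat), round(vaccinated_hat))
```

The estimated counts are generally not integers, so they need rounding before being passed to a beta-binomial likelihood that expects integer counts.
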
diff --git a/mkdocs.yaml b/mkdocs.yaml
@@ -0,0 +1,59 @@
+site_name: IUP
+
+nav:
+  - index.md
+  - analytical_plan.md
+  - gam.md
+  - model_details.md
+  - model_journal.md
+
+repo_url: https://github.com/CDCgov/cfa-immunization-uptake-projection
+repo_name: repo
+
+# advanced configuration ------------------------------------------------------
+theme:
+  name: "material"
+  icon:
+    repo: fontawesome/brands/github
+
+plugins:
+  - mkdocstrings:
+      handlers:
+        python:
+          options:
+            # see <https://mkdocstrings.github.io/python/usage/> for options
+            show_root_heading: true
+            show_object_full_path: true
+  - search
+
+markdown_extensions:
+  - pymdownx.highlight:
+      anchor_linenums: true
+      line_spans: __span
+      pygments_lang_class: true
+  - pymdownx.inlinehilite
+  - pymdownx.snippets
+  - pymdownx.arithmatex:
+      generic: true
+  - pymdownx.superfences:
+      custom_fences:
+        # enables rendering of ```mermaid and ```math blocks
+        # the !! parts of this section will trip yaml checkers as "unsafe"
+        - name: mermaid
+          class: mermaid
+          format: !!python/name:pymdownx.superfences.fence_code_format
+        - name: math
+          class: arithmatex
+          format:
+            !!python/object/apply:pymdownx.arithmatex.arithmatex_fenced_format {
+              kwds: { mode: generic, tag: div },
+            }
+
+# math rendering
+extra_javascript:
+  - javascript/katex.js
+  - https://unpkg.com/katex@0/dist/katex.min.js
+  - https://unpkg.com/katex@0/dist/contrib/auto-render.min.js
+
+extra_css:
+  - https://unpkg.com/katex@0/dist/katex.min.css
diff --git a/pyproject.toml b/pyproject.toml
@@ -27,11 +27,9 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
-gam = ["scikit-fda"]
-
+gam = ["scikit-fda>=0.10.1"]
 
 [tool.uv]
-
 [tool.uv.sources]
 nisapi = { git = "https://github.com/CDCgov/nis-py-api" }
 
@@ -46,3 +44,9 @@ dev = [
     "pre-commit>=4.2.0",
     "pytest>=8.4.0",
 ]
+mkdocs = [
+    "mkdocs>=1.6.1",
+    "mkdocs-material>=9.7.1",
+    "mkdocstrings>=1.0.0",
+    "mkdocstrings-python>=2.0.1",
+]