Commit 9a12159

Merge pull request #1732 from ihmeuw/model_13.3
Model 13.3: Update to end-of-ENN prevalence for preterm in LNN CSMRisk equation
2 parents 183fba2 + 0b90624 commit 9a12159

10 files changed

Lines changed: 117 additions & 35 deletions

docs/source/models/causes/neonatal/index.rst

Lines changed: 9 additions & 3 deletions
@@ -208,7 +208,11 @@ unit time.
208208
Modeling Strategy
209209
+++++++++++++++++
210210

211-
The neonatal death model requires only the probability of death (aka "mortality risk") for the early and late neonatal time periods. Rather than using GBD mortality rates and converting them into probability of deaths, we will use mortality risk as direct input data into our model. We will calculate mortality risk input data as age-specific death counts divided by live birth counts from GBD.
211+
The neonatal death model requires only the probability of death (aka "mortality risk") for the early and late neonatal time periods.
212+
These mortality risks are age-group-, sex-, and location-specific.
213+
For brevity, sex and location subscripts are omitted in all equations.
214+
215+
Rather than using GBD mortality rates and converting them into probabilities of death, we will use mortality risk as direct input data into our model. We will calculate mortality risk input data as age-specific death counts divided by live birth counts from GBD.
212216

213217
Note that this strategy does not require any conversion from rates to probabilities, nor does it require any scaling to the duration of the age group. The mortality risk calculated as described below already represents the probability of dying within a neonatal age group and can be used directly as such in the simulation.
214218

@@ -231,7 +235,8 @@ and for a given cause of death:
231235
232236
Note that this strategy was updated in May of 2025 from a prior strategy of converting GBD mortality rates to probabilities. `The pull request that updated this strategy can be found here for reference. <https://github.com/ihmeuw/vivarium_research/pull/1654>`_ This strategy update was pursued following verification and validation issues in neonatal mortality and an exploration of potential solutions in model runs 6.1 through 6.4. Ultimately, a change from mortality rates to mortality risk was preferred given that it is the more policy relevant measure in the context of neonates, and accurately apportioning person time alive within the neonatal age group given the input data available to us was a challenge we judged to be unnecessary.
233237

234-
The calculation of :math:`\text{ACMRisk}_i` (the all-cause mortality risk for a single simulant, :math:`i`) is a bit complicated, however. We begin with a population ACMRisk and use the LBWSG PAF to derive a risk-deleted ACMRisk to which we can then apply the relative risk of LBWSG matching any risk exposure level. Mathematically this is achieved by the following formula:
238+
The calculation of :math:`\text{ACMRisk}_i` (the all-cause mortality risk for a single simulant, :math:`i`) is a bit complicated, however. We begin with a population ACMRisk and use the LBWSG PAF to derive a risk-deleted ACMRisk to which we can then apply the relative risk of LBWSG matching any risk exposure level. Mathematically this is achieved by the following formula.
239+
Starting with this equation, we omit age group subscripts for brevity; all quantities are still age-, sex-, and location-specific.
235240

236241
.. math::
237242
\begin{align*}
@@ -253,7 +258,6 @@ where :math:`\text{BW}_i` and :math:`\text{GA}_i` are the birth weight and gesta
253258
and :math:`\text{CSMRisk}_{i}^{k}` is the cause-specific mortality risk for subcause :math:`k` for simulant :math:`i` (both detailed in the `Modeled Subcauses`_
254259
linked from this page).
255260

256-
257261
In addition to determining which simulants die due to any cause, we also need to determine which subcause is underlying the death. This is done by sampling from a categorical distribution obtained by renormalizing the CSMRisks:
258262

259263
.. math::
@@ -349,6 +353,8 @@ Data Tables
349353
- GBD + assumption about relative risks + intervention model effects
350354
- see subcause models for details
351355

356+
.. _details_of_the_lbwsg_paf_calculation:
357+
352358
Details of the LBWSG PAF calculation
353359
++++++++++++++++++++++++++++++++++++
354360

docs/source/models/causes/neonatal/preterm_birth.rst

Lines changed: 87 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -154,11 +154,17 @@ Note that these probabilities are not used directly in the model and are include
154154
Modeling Strategy
155155
+++++++++++++++++
156156

157-
The Preterm Birth submodel requires only the birth-weight- and gestation-age-stratified cause specific mortality risks for preterm birth complications with and without respiratory distress syndrome during the early and late neonatal periods.
157+
The Preterm Birth submodel only needs to produce the birth-weight- and gestational-age-stratified cause-specific mortality risks for preterm birth complications with and without respiratory distress syndrome during the early and late neonatal periods.
158+
(These risks are also implicitly stratified by age group, sex, and location.)
158159

159160
Since this is a PAF-of-one cause, the calculation must take into account the "structural zeros" representing no mortality risk for simulants with a gestational age of 37 or more weeks.
160161

161-
The way these CSMRisks are used is the same for all subcauses, and therefore is included in the :ref:`Overall Neonatal Disorders Model <2021_cause_neonatal_disorders_mncnh>` page. This page describes the birth-weight- and gestational-age-specific cause specific mortality risks that are used for this cause on that page, :math:`\text{CSMRisk}^{\text{preterm with RDS}}_{\text{BW},\text{GA}}` and :math:`\text{CSMRisk}^{\text{preterm without RDS}}_{\text{BW},\text{GA}}`. In both cases, the formula is:
162+
The way these CSMRisks are used is the same for all subcauses, and therefore is included in the :ref:`Overall Neonatal Disorders Model <2021_cause_neonatal_disorders_mncnh>` page. This page describes how to calculate the birth-weight- and gestational-age-specific cause-specific mortality risks that are used for the preterm subcauses on that page, namely :math:`\text{CSMRisk}^{\text{preterm with RDS}}_{\text{BW},\text{GA}}` and :math:`\text{CSMRisk}^{\text{preterm without RDS}}_{\text{BW},\text{GA}}`.
163+
As in the equations on the overall neonatal disorders model page, all quantities here
164+
are age-group-, sex-, and location-specific; these subscripts are omitted for brevity.
165+
For both preterm subcauses, the formula is:
166+
167+
.. _preterm_csmrisk_equation:
162168

163169
.. math::
164170
\begin{align*}
@@ -171,7 +177,8 @@ The way these CSMRisks are used is the same for all subcauses, and therefore is
171177
\end{align*}
172178
173179
where :math:`k` is the subcause of interest (preterm birth with or without RDS),
174-
:math:`\text{CSMRisk}` is the cause-specific mortality riskk for preterm birth complications,
180+
:math:`\text{CSMRisk}` is the cause-specific mortality risk for preterm birth complications,
181+
:math:`p_{\text{preterm}}` is the prevalence of preterm (gestational age < 37 weeks) at the *beginning* of the age group,
175182
:math:`f_k` is the fraction of preterm deaths due to subcause :math:`k` (with or without RDS), :math:`\text{RR}_{\text{BW},\text{GA}}` is the relative risk of all-cause mortality for a birth weight of :math:`\text{BW}` and gestational age of :math:`\text{GA}`, and :math:`Z` is a normalizing constant selected so that :math:`E[\text{RR}_{\text{BW,GA}} | \text{GA}<37] \cdot Z = 1`. Solving for :math:`Z` gives :math:`Z = 1 / E[\text{RR}_{\text{BW,GA}} | \text{GA}<37]`.
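
As a rough illustration of the normalizing constant, the sketch below computes :math:`Z` from a prevalence-weighted mean of relative risks over the preterm categories. It is a simplification, not the pipeline itself: it assumes one relative risk per LBWSG category (the actual calculation uses simulant-level :math:`\text{RR}_{\text{BW},\text{GA}}` values from the PAF microsimulation), and the function name and all input numbers are hypothetical.

```python
from typing import Sequence


def normalizing_constant(
    prevalence: Sequence[float],
    relative_risk: Sequence[float],
    is_preterm: Sequence[bool],
) -> float:
    """Return Z = 1 / E[RR | GA < 37] using category-level inputs.

    Hypothetical helper: each category carries one prevalence weight and
    one relative risk; only categories flagged as preterm contribute.
    """
    preterm_weight = sum(p for p, pre in zip(prevalence, is_preterm) if pre)
    weighted_rr = sum(
        p * rr for p, rr, pre in zip(prevalence, relative_risk, is_preterm) if pre
    )
    expected_rr = weighted_rr / preterm_weight  # E[RR | GA < 37]
    return 1.0 / expected_rr


# Made-up example: two preterm categories and one term category.
prevalence = [0.05, 0.10, 0.85]
relative_risk = [10.0, 4.0, 1.0]
is_preterm = [True, True, False]

z = normalizing_constant(prevalence, relative_risk, is_preterm)
expected_rr_preterm = (0.05 * 10.0 + 0.10 * 4.0) / (0.05 + 0.10)
```

By construction, multiplying the conditional mean relative risk by ``z`` gives 1, which is the defining property of :math:`Z` stated above.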
176183

177184
.. note::
@@ -184,7 +191,7 @@ where :math:`k` is the subcause of interest (preterm birth with or without RDS),
184191

185192
We will use a **population size of 195_112** for this calculation. This number was selected in order to satisfy the following criteria:
186193

187-
- The population size per LWBSG exposure category is required to be a perfect square to be compatible with our strategy of initializing individual exposures on a grid within each LBWSG exposure category
194+
- The population size per LBWSG exposure category is required to be a perfect square to be compatible with our strategy of initializing individual exposures on a grid within each LBWSG exposure category
188195

189196
- The total population size of the PAF calculation pipeline must be divisible by the product of the number of LBWSG exposure categories (58), the number of sexes (2), and the number of age groups (2) used in the PAF calculation
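
The two criteria listed so far can be checked with a few lines of arithmetic. The sketch below assumes that "population size per LBWSG exposure category" means the total divided evenly across category, sex, and age group; the variable names are illustrative only.

```python
import math

TOTAL_POPULATION = 195_112
N_LBWSG_CATEGORIES = 58
N_SEXES = 2
N_AGE_GROUPS = 2

# Number of (category, sex, age group) strata the population is split across.
n_strata = N_LBWSG_CATEGORIES * N_SEXES * N_AGE_GROUPS

# Divisibility criterion: the remainder should be zero.
per_stratum, remainder = divmod(TOTAL_POPULATION, n_strata)

# Perfect-square criterion: simulants in a stratum can be placed on a
# side x side grid only if per_stratum is a perfect square.
side = math.isqrt(per_stratum)
is_perfect_square = side * side == per_stratum
```

Under this assumption, each stratum receives 841 simulants, which can be laid out on a 29 by 29 grid.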
190197

@@ -202,7 +209,73 @@ where :math:`k` is the subcause of interest (preterm birth with or without RDS),
202209
Also, it is possible that the choice of :math:`\text{RR}_{\text{BW},\text{GA}}` might not work for every subcause. Since we're moving all the preterm mortality into the preterm categories, there is less room there for mortality from other causes, so depending on the risks involved, we may need to shift mortality from some other causes into the non-preterm categories in order to avoid making things negative.
203210
It is even possible that there is no way to make this work consistently, meaning that any choice of weight function would lead to negative mortality risks. We expect that this will not be an issue, but we haven't actually tried it with the real data yet.
204211

205-
Each individual simulant :math:`i` has their own :math:`\text{CSMR}_i^k` that might be different from :math:`\text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}` (meaning the average birth-weight- and gestational-age-specific CSMRisk for simulants with the birth weight and gestational age matching simulant :math:`i`. We recommend implementing this as a pipeline eventually because it will be modified by interventions (or access to interventions) relevant to this subcause. (Until we implement those, we will have :math:`\text{CSMRisk}_{i}^k = \text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}`, though.)
212+
:math:`\text{CSMRisk}` and :math:`p_{\text{preterm}}` are calculated differently for the ENN and LNN age groups.
213+
For clarity of notation, in what follows we will again make explicit the age group
214+
subscripts that have been implicit on every quantity to this point.
215+
(Sex and location remain implicit.)
216+
We define the ENN CSMRisk as:
217+
218+
.. math::
219+
220+
\text{CSMRisk}_\text{ENN} = \frac{\text{enn_death_count}}{\text{live_birth_count}},
221+
222+
where the :math:`\text{enn_death_count}` and :math:`\text{live_birth_count}` are
223+
quantities pulled from GBD, as detailed in the table below.
224+
225+
The LNN CSMRisk is:
226+
227+
.. math::
228+
229+
\text{CSMRisk}_\text{LNN} = \frac{\text{lnn_death_count}}{\text{live_birth_count} - \text{enn_all_cause_death_count}},
230+
231+
where, again, all quantities are pulled from GBD as detailed in the table below.
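
A minimal sketch of the two definitions above, for a single sex and location. The function names are hypothetical and the counts are made up for illustration; they are not GBD values.

```python
def csmrisk_enn(enn_death_count: float, live_birth_count: float) -> float:
    """Early neonatal CSMRisk: preterm deaths in ENN per live birth."""
    return enn_death_count / live_birth_count


def csmrisk_lnn(
    lnn_death_count: float,
    live_birth_count: float,
    enn_all_cause_death_count: float,
) -> float:
    """Late neonatal CSMRisk: preterm deaths in LNN per ENN survivor.

    The denominator removes everyone who died of any cause during ENN,
    since only survivors are at risk in the LNN age group.
    """
    survivors_of_enn = live_birth_count - enn_all_cause_death_count
    return lnn_death_count / survivors_of_enn


# Made-up counts for illustration only.
live_births = 100_000.0
enn_risk = csmrisk_enn(enn_death_count=500.0, live_birth_count=live_births)
lnn_risk = csmrisk_lnn(
    lnn_death_count=200.0,
    live_birth_count=live_births,
    enn_all_cause_death_count=2_000.0,
)
```

Note that the LNN denominator uses the all-cause ENN death count, not the cause-specific one, so the two functions take different death-count inputs.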
232+
233+
:math:`p_{\text{preterm}}`, as mentioned above, represents the prevalence/exposure
234+
of preterm (gestational age < 37 weeks) at the *beginning* of the age group.
235+
For ENN, the beginning of the age group is birth, so the prevalence of preterm
236+
at birth is a sum of the birth prevalence for all LBWSG categories with gestational
237+
age less than 37 weeks:
238+
239+
.. math::
240+
241+
p_{\text{preterm},\text{ENN}} = \sum_{\{\text{cat}: \text{GA}<37\}} \text{lbwsg_birth_prevalence}_\text{cat},
242+
243+
where :math:`\text{lbwsg_birth_prevalence}` can be pulled from GBD with minor transformations,
244+
as detailed in the table below.
245+
246+
For LNN, the situation is more complicated, because we need to account
247+
for differential mortality in the ENN period.
248+
Therefore, the easiest way to calculate :math:`p_{\text{preterm},\text{LNN}}` is to get the end-of-ENN preterm
249+
prevalence from the same LBWSG PAF calculation pipeline used for :math:`Z`
250+
above.
251+
As detailed at :ref:`details_of_the_lbwsg_paf_calculation` on the neonatal all-cause
252+
mortality page, there are two iterative steps using microsimulation, with the late neonatal calculations
253+
using the result of the early neonatal calculations.
254+
Similarly to the LNN PAF, *after* the early neonatal calculations are complete, the prevalence of
255+
preterm at the end of the ENN age group should be calculated.
256+
This value should be used as :math:`p_{\text{preterm},\text{LNN}}` for the purposes
257+
of the CSMRisk equation.
258+
259+
Determining the prevalence of preterm is a bit more complex than it sounds, because in the PAF calculation pipeline,
260+
the same number of simulants are assigned to each LBWSG category, rather than assigning each simulant
261+
to a random category with probability equal to that category's prevalence at birth.
262+
Due to this initialization strategy, all quantities calculated in the pipeline must use *weights*
263+
to account for the fact that the simulants in the categories with higher birth prevalence represent more people.
264+
Therefore, :math:`p_{\text{preterm},\text{LNN}}` is calculated as follows:
265+
266+
.. math::
267+
268+
p_{\text{preterm},\text{LNN}} = \frac{
269+
\sum_{\{\text{cat}: \text{GA}<37\}} \text{lbwsg_birth_prevalence}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}
270+
}{
271+
\sum_{\text{cat}} \text{lbwsg_birth_prevalence}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}
272+
},
273+
274+
where :math:`n_\text{cat}` is the number of simulants initialized into each LBWSG category at birth
275+
and :math:`n^\text{deaths}_\text{cat}` is the number of deaths in each category when ENN mortality was applied.
276+
Note that :math:`n_\text{cat}` will not vary by LBWSG exposure category under the current approach of assigning the same number of simulants to each LBWSG category.
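
The weighted calculation can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline implementation: the function names are hypothetical, categories are represented as parallel lists, and the prevalence and death counts are invented.

```python
from typing import Sequence


def preterm_prevalence_at_birth(
    birth_prevalence: Sequence[float], is_preterm: Sequence[bool]
) -> float:
    """p_preterm for ENN: sum of birth prevalence over preterm categories."""
    return sum(p for p, pre in zip(birth_prevalence, is_preterm) if pre)


def preterm_prevalence_end_of_enn(
    birth_prevalence: Sequence[float],
    is_preterm: Sequence[bool],
    n_initialized: Sequence[int],
    n_deaths: Sequence[int],
) -> float:
    """p_preterm for LNN: birth prevalence reweighted by ENN survival."""
    # Weight of each category among ENN survivors:
    # birth prevalence times the fraction of its simulants that survived.
    survivor_weight = [
        p * (n - d) / n
        for p, n, d in zip(birth_prevalence, n_initialized, n_deaths)
    ]
    preterm_survivors = sum(
        w for w, pre in zip(survivor_weight, is_preterm) if pre
    )
    return preterm_survivors / sum(survivor_weight)


# Made-up example: two preterm categories and one term category,
# each initialized with the same number of simulants.
birth_prevalence = [0.05, 0.10, 0.85]
is_preterm = [True, True, False]
n_initialized = [841, 841, 841]
n_deaths = [84, 21, 4]

p_preterm_enn = preterm_prevalence_at_birth(birth_prevalence, is_preterm)
p_preterm_lnn = preterm_prevalence_end_of_enn(
    birth_prevalence, is_preterm, n_initialized, n_deaths
)
```

In this example the preterm categories have higher ENN mortality, so the end-of-ENN prevalence is lower than the birth prevalence, which is the differential-mortality effect the LNN calculation is meant to capture.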
277+
278+
Each individual simulant :math:`i` has their own :math:`\text{CSMRisk}_i^k` that might be different from :math:`\text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}` (meaning the average birth-weight- and gestational-age-specific CSMRisk for simulants with the birth weight and gestational age matching simulant :math:`i`). We recommend implementing this as a Vivarium pipeline eventually because it will be modified by interventions (or access to interventions) relevant to this subcause. (Until we implement those, we will have :math:`\text{CSMRisk}_{i}^k = \text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}`, though.)
206279

207280
The following table shows the data needed for these
208281
calculations.
@@ -212,8 +285,9 @@ Data Tables
212285

213286
.. note::
214287

215-
All quantities pulled from GBD in the following table are for a
216-
specific year, sex, age group, and location.
288+
All quantities pulled from GBD in the following table are pulled
289+
for all modeled years, sexes, age groups, and locations,
290+
except when the age group is explicitly specified.
217291

218292
.. list-table:: Data values and sources
219293
:header-rows: 1
@@ -224,36 +298,24 @@ Data Tables
224298
- Note
225299
* - enn_all_cause_death_count
226300
- Count of deaths due to all causes in the early neonatal age group
227-
- GBD: source='codcorrect', metric_id=1, cause_id=294
301+
- GBD: source='codcorrect', metric_id=1, cause_id=294, age_group_id=2
228302
-
229303
* - enn_death_count
230304
- Count of deaths due to cause neonatal preterm birth complications in the early neonatal age group
231-
- GBD: source='codcorrect', metric_id=1, cause_id=381
305+
- GBD: source='codcorrect', metric_id=1, cause_id=381, age_group_id=2
232306
-
233307
* - lnn_death_count
234308
- Count of deaths due to cause neonatal preterm birth complications in the late neonatal age group
235-
- GBD: source='codcorrect', metric_id=1, cause_id=381
309+
- GBD: source='codcorrect', metric_id=1, cause_id=381, age_group_id=3
236310
-
237311
* - live_birth_count
238312
- Count of live births
239313
- GBD: covariate_id = 1106
240314
-
241-
* - csmrisk_enn
242-
- neonatal preterm birth complications mortality risk in the early neonatal age group
243-
- enn_death_count / live_birth_count
244-
-
245-
* - csmrisk_lnn
246-
- neonatal preterm birth complications mortality risk in the late neonatal age group
247-
- lnn_death_count / (live_birth_count - enn_all_cause_death_count)
248-
-
249-
* - :math:`\text{CSMRisk}`
250-
- neonatal preterm birth complications mortality risk
251-
- either csmrisk_enn or csmrisk_lnn depending on the simulant's age group
315+
* - lbwsg_birth_prevalence
316+
- Birth prevalence of low birthweight and short gestation risk factor
317+
- GBD with post-processing: rei_id = 339, then remove the extraneous category and rescale prevalence :ref:`as described here <rescaling_lbwsg_exposure_data_pulled_from_gbd_2019>`.
252318
-
253-
* - :math:`p_\text{preterm}`
254-
- Prevalence of gestational age <37 weeks at birth
255-
- Derived from :ref:`GBD LBWSG exposure <risk_exposure_lbwsg>`
256-
- Equal to the sum of exposures for all categories with gestational age at birth <37 weeks. A list of such categories can be generated in a manner similar to `this notebook <https://github.com/ihmeuw/vivarium_research_nutrition_optimization/blob/data_prep/data_prep/LBW%20categories.ipynb>`_
257319
* - :math:`f_\text{preterm w RDS}`
258320
- fraction of preterm deaths with RDS
259321
- 85%

docs/source/models/concept_models/vivarium_mncnh_portfolio/concept_model.rst

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1224,7 +1224,7 @@ Default stratifications to all observers should include scenario and input draw.
12241224
- Default
12251225
- Default
12261226
* - 12.1
1227-
- Bugfix to calculation of prevalence of preterm, to ensure we include categories with an upper bound of 37 weeks
1227+
- Bugfix to calculation of prevalence of preterm in :ref:`this equation <preterm_csmrisk_equation>`, to ensure we include categories with an upper bound of 37 weeks
12281228
- Baseline
12291229
- ``model12.1``
12301230
- Default
@@ -1269,6 +1269,13 @@ Default stratifications to all observers should include scenario and input draw.
12691269
- Default
12701270
- Default
12711271
- Default
1272+
* - 13.3
1273+
- Update to use end-of-ENN LBWSG prevalence for the :math:`p_\text{preterm}` for the LNN age group in :ref:`this equation <preterm_csmrisk_equation>`. Details can be found in the diff of `this pull request <https://github.com/ihmeuw/vivarium_research/pull/1732/files>`_.
1274+
- Baseline
1275+
- ``model13.3``
1276+
- Default
1277+
- Default
1278+
- Default
12721279
* - 14.0
12731280
- Wave II updates to the :ref:`antenatal care attendance module <2024_vivarium_mncnh_portfolio_anc_module>`
12741281
- Baseline
@@ -1652,6 +1659,11 @@ Default stratifications to all observers should include scenario and input draw.
16521659
expected change is small but should be in the direction of better verification to GBD
16531660
-
16541661
-
1662+
* - 13.3
1663+
- * Check that neonatal all-cause mortality risks match expectation
1664+
* Check that neonatal cause-specific mortality risks match expectation
1665+
-
1666+
-
16551667
* - 14.0
16561668
- * Confirm ANC attendance exposure varies as expected by pregnancy term length
16571669
* Confirm ANC attendance exposure matches expectation

docs/source/models/concept_models/vivarium_mncnh_portfolio/initial_attributes_module/module_document.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -72,7 +72,7 @@ There is no need for a diagram for the initial attributes module of this simulat
7272
:ref:`facility choice model document
7373
<2024_facility_model_vivarium_mncnh_portfolio>` for instructions
7474
- Will be used to determine ANC attendance in ANC module
75-
* - B. LWBSG category propensity
75+
* - B. LBWSG category propensity
7676
- See the :ref:`correlated propensities
7777
<facility_choice_correlated_propensities_section>` section of the
7878
:ref:`facility choice model document

docs/source/models/concept_models/vivarium_nutrition_optimization/pregnancies/concept_model.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -62,7 +62,7 @@ Documents that contain information specific to the overall model and the child s
6262

6363
**Questions:**
6464

65-
- For stillbirths that become live births due to intervention impact, what should their LWBSG exposure be? Hypothetically near-stillbirths should have lower birth weights than others. Ask Nick K.! GBD may be estimating these outcomes directly? Current assumption is that they will have randomly sampled exposure.
65+
- For stillbirths that become live births due to intervention impact, what should their LBWSG exposure be? Hypothetically near-stillbirths should have lower birth weights than others. Ask Nick K.! GBD may be estimating these outcomes directly? Current assumption is that they will have randomly sampled exposure.
6666

6767
- How should we handle averted stillbirths in our optimization objectives? Note that because stillbirths do not accumulate any DALYs, an objective to minimize DALYs could disincentivize averting stillbirths, which would be inconsistent with improving outcomes.
6868
