Commit ccd8a53

Merge branch 'main' into tutorial_implementation
2 parents 3dd465b + 9a12159 commit ccd8a53

24 files changed

Lines changed: 1122 additions & 193 deletions

docs/source/models/causes/neonatal/index.rst

Lines changed: 13 additions & 4 deletions
@@ -208,7 +208,11 @@ unit time.
208208
Modeling Strategy
209209
+++++++++++++++++
210210

211-
The neonatal death model requires only the probability of death (aka "mortality risk") for the early and late neonatal time periods. Rather than using GBD mortality rates and converting them into probability of deaths, we will use mortality risk as direct input data into our model. We will calculate mortality risk input data as age-specific death counts divided by live birth counts from GBD.
211+
The neonatal death model requires only the probability of death (aka "mortality risk") for the early and late neonatal time periods.
212+
These mortality risks are age-group-, sex-, and location-specific.
213+
For brevity, sex and location subscripts are omitted in all equations.
214+
215+
Rather than using GBD mortality rates and converting them into probabilities of death, we will use mortality risk as direct input data into our model. We will calculate mortality risk input data as age-specific death counts divided by live birth counts from GBD.
212216

213217
Note that this strategy does not require any conversion from rates to probabilities, NOR does it require any scaling to the duration of the age group. The mortality risk calculated as described below already represents the probability of dying within a neonatal age group and can be used directly as such in the simulation.
214218

@@ -231,7 +235,8 @@ and for a given cause of death:
231235
232236
Note that this strategy was updated in May of 2025 from a prior strategy of converting GBD mortality rates to probabilities. `The pull request that updated this strategy can be found here for reference. <https://github.com/ihmeuw/vivarium_research/pull/1654>`_ This strategy update was pursued following verification and validation issues in neonatal mortality and an exploration of potential solutions in model runs 6.1 through 6.4. Ultimately, a change from mortality rates to mortality risk was preferred given that it is the more policy-relevant measure in the context of neonates, and accurately apportioning person-time alive within the neonatal age group given the input data available to us was a challenge we judged to be unnecessary.
233237

234-
The calculation of :math:`\text{ACMRisk}_i` (the all-cause mortality risk for a single simulant, :math:`i`) is a bit complicated, however. We begin with a population ACMRisk and use the LBWSG PAF to derive a risk-deleted ACMRisk to which we can then apply the relative risk of LBWSG matching any risk exposure level. Mathematically this is achieved by the following formula:
238+
The calculation of :math:`\text{ACMRisk}_i` (the all-cause mortality risk for a single simulant, :math:`i`) is a bit complicated, however. We begin with a population ACMRisk and use the LBWSG PAF to derive a risk-deleted ACMRisk to which we can then apply the relative risk of LBWSG matching any risk exposure level. Mathematically this is achieved by the following formula.
239+
Starting with this equation, we omit age group subscripts for brevity; all quantities are still age-, sex-, and location-specific.
235240

236241
.. math::
237242
\begin{align*}
@@ -253,7 +258,6 @@ where :math:`\text{BW}_i` and :math:`\text{GA}_i` are the birth weight and gesta
253258
and :math:`\text{CSMRisk}_{i}^{k}` is the cause-specific mortality risk for subcause :math:`k` for simulant :math:`i` (both detailed in the `Modeled Subcauses`_
254259
linked from this page).
255260

256-
257261
In addition to determining which simulants die due to any cause, we also need to determine which subcause is underlying the death. This is done by sampling from a categorical distribution obtained by renormalizing the CSMRisks:
258262

259263
.. math::
@@ -349,7 +353,10 @@ Data Tables
349353
- GBD + assumption about relative risks + intervention model effects
350354
- see subcause models for details
351355

352-
**Details of the** :math:`\text{PAF}_\text{LBWSG}` **calculation:**
356+
.. _details_of_the_lbwsg_paf_calculation:
357+
358+
Details of the LBWSG PAF calculation
359+
++++++++++++++++++++++++++++++++++++
353360

354361
As stated in the table above, :math:`\text{PAF}_\text{LBWSG}` is the population attributable fraction of all-cause mortality for low birth weight and short gestation. It is computed so that PAF = 1 - 1 / E(:math:`\text{RR}_{\text{BW},\text{GA}}`) from the capped interpolated relative risk function (with expectation taken over the distribution of LBWSG exposure).
355362

@@ -374,6 +381,8 @@ Using the `LBWSG PAF calculation simulation <https://github.com/ihmeuw/vivarium_
374381

375382
So,
376383

384+
.. _details_of_the_lbwsg_paf_calculation_equation:
385+
377386
.. math::
378387
379388
E(\text{RR})_\text{population} = \frac{\sum_{\text{cat}} E(\text{RR})_\text{cat} \times p^\text{birth}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}}{\sum_{\text{cat}} p^\text{birth}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}}
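
The weighted average above, together with the relationship PAF = 1 - 1 / E(RR) stated earlier, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based inputs, and the two made-up categories are hypothetical and are not taken from the PAF calculation pipeline itself.

```python
def population_mean_rr(mean_rr, birth_prevalence, n_initial, n_deaths):
    """Survival-weighted mean relative risk across LBWSG categories.

    All arguments are dicts keyed by category name (hypothetical layout).
    """
    numerator = 0.0
    denominator = 0.0
    for cat in mean_rr:
        # Fraction of simulants in this category that survived.
        survival = (n_initial[cat] - n_deaths[cat]) / n_initial[cat]
        # Weight = birth prevalence times survival fraction.
        weight = birth_prevalence[cat] * survival
        numerator += mean_rr[cat] * weight
        denominator += weight
    return numerator / denominator


def paf_from_mean_rr(mean_rr_population):
    """PAF = 1 - 1 / E(RR)."""
    return 1.0 - 1.0 / mean_rr_population


# Made-up illustrative inputs, not GBD data.
mean_rr = {"cat_a": 4.0, "cat_b": 1.0}
birth_prevalence = {"cat_a": 0.2, "cat_b": 0.8}
n_initial = {"cat_a": 100, "cat_b": 100}
n_deaths = {"cat_a": 50, "cat_b": 0}

e_rr = population_mean_rr(mean_rr, birth_prevalence, n_initial, n_deaths)
paf = paf_from_mean_rr(e_rr)
```

With these inputs the weights are 0.1 and 0.8, so the population mean RR is 1.2 / 0.9 and the resulting PAF is 0.25.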

docs/source/models/causes/neonatal/preterm_birth.rst

Lines changed: 87 additions & 25 deletions
@@ -154,11 +154,17 @@ Note that these probabilities are not used directly in the model and are include
154154
Modeling Strategy
155155
+++++++++++++++++
156156

157-
The Preterm Birth submodel requires only the birth-weight- and gestation-age-stratified cause specific mortality risks for preterm birth complications with and without respiratory distress syndrome during the early and late neonatal periods.
157+
The Preterm Birth submodel only needs to produce the birth-weight- and gestational-age-stratified cause-specific mortality risks for preterm birth complications with and without respiratory distress syndrome during the early and late neonatal periods.
158+
(These risks are also implicitly stratified by age group, sex, and location.)
158159

159160
Since this is a PAF-of-one cause, the calculation must take into account the "structural zeros" representing no mortality risk for simulants with a gestational age of 37 or more weeks.
160161

161-
The way these CSMRisks are used is the same for all subcauses, and therefore is included in the :ref:`Overall Neonatal Disorders Model <2021_cause_neonatal_disorders_mncnh>` page. This page describes the birth-weight- and gestational-age-specific cause specific mortality risks that are used for this cause on that page, :math:`\text{CSMRisk}^{\text{preterm with RDS}}_{\text{BW},\text{GA}}` and :math:`\text{CSMRisk}^{\text{preterm without RDS}}_{\text{BW},\text{GA}}`. In both cases, the formula is:
162+
The way these CSMRisks are used is the same for all subcauses, and therefore is included in the :ref:`Overall Neonatal Disorders Model <2021_cause_neonatal_disorders_mncnh>` page. This page describes how to calculate the birth-weight- and gestational-age-specific cause-specific mortality risks that are used for the preterm subcauses on that page, namely :math:`\text{CSMRisk}^{\text{preterm with RDS}}_{\text{BW},\text{GA}}` and :math:`\text{CSMRisk}^{\text{preterm without RDS}}_{\text{BW},\text{GA}}`.
163+
As in the equations on the overall neonatal disorders model page, all quantities here
164+
are age-group-, sex-, and location-specific; these subscripts are omitted for brevity.
165+
For both preterm subcauses, the formula is:
166+
167+
.. _preterm_csmrisk_equation:
162168

163169
.. math::
164170
\begin{align*}
@@ -171,7 +177,8 @@ The way these CSMRisks are used is the same for all subcauses, and therefore is
171177
\end{align*}
172178
173179
where :math:`k` is the subcause of interest (preterm birth with or without RDS),
174-
:math:`\text{CSMRisk}` is the cause-specific mortality riskk for preterm birth complications,
180+
:math:`\text{CSMRisk}` is the cause-specific mortality risk for preterm birth complications,
181+
:math:`p_{\text{preterm}}` is the prevalence of preterm (gestational age < 37 weeks) at the *beginning* of the age group,
175182
:math:`f_k` is the fraction of preterm deaths due to subcause :math:`k` (with or without RDS), :math:`\text{RR}_{\text{BW},\text{GA}}` is the relative risk of all-cause mortality for a birth weight of :math:`\text{BW}` and gestational age of :math:`\text{GA}`, and :math:`Z` is a normalizing constant selected so that :math:`E[\text{RR}_{\text{BW,GA}} | \text{GA}<37] \cdot Z = 1`. Solving for :math:`Z` gives :math:`Z = 1 / E[\text{RR}_{\text{BW,GA}} | \text{GA}<37]`.
176183

177184
.. note::
@@ -184,7 +191,7 @@ where :math:`k` is the subcause of interest (preterm birth with or without RDS),
184191

185192
We will use a **population size of 195_112** for this calculation. This number was selected in order to satisfy the following criteria:
186193

187-
- The population size per LWBSG exposure category is required to be a perfect square to be compatible with our strategy of initializing individual exposures on a grid within each LBWSG exposure category
194+
- The population size per LBWSG exposure category is required to be a perfect square to be compatible with our strategy of initializing individual exposures on a grid within each LBWSG exposure category
188195

189196
- The total population size of the PAF calculation pipeline must be divisible by the product of the number of LBWSG exposure categories (58), the number of sexes (2), and the number of age groups (2) used in the PAF calculation
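
As a quick arithmetic check of the two criteria listed above, the following sketch verifies that 195_112 is divisible by 58 × 2 × 2 and that the resulting per-stratum size is a perfect square. It assumes that "population size per LBWSG exposure category" means the size of one category within a single sex and age group; the constant and function names are hypothetical.

```python
import math

N_LBWSG_CATEGORIES = 58
N_SEXES = 2
N_AGE_GROUPS = 2


def describe_population_size(total_population):
    """Check divisibility and the perfect-square grid requirement."""
    n_strata = N_LBWSG_CATEGORIES * N_SEXES * N_AGE_GROUPS  # 232
    per_stratum, remainder = divmod(total_population, n_strata)
    # Side length of the square exposure grid, if one exists.
    grid_side = math.isqrt(per_stratum)
    return {
        "n_strata": n_strata,
        "divisible": remainder == 0,
        "per_stratum": per_stratum,
        "grid_side": grid_side,
        "perfect_square": grid_side * grid_side == per_stratum,
    }


result = describe_population_size(195_112)
```

Here 195_112 / 232 = 841 = 29 × 29, so under this assumption each category would hold a 29-by-29 grid of simulants per sex and age group.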
190197

@@ -202,7 +209,73 @@ where :math:`k` is the subcause of interest (preterm birth with or without RDS),
202209
Also, it is possible that the choice of :math:`\text{RR}_{\text{BW},\text{GA}}` might not work for every subcause. Since we're moving all the preterm mortality into the preterm categories, there is less room there for mortality from other causes, so depending on the risks involved, we may need to shift mortality from some other causes into the non-preterm categories in order to avoid making things negative.
203210
It is even possible that there is no way to make this work consistently, meaning that any choice of weight function would lead to negative mortality risks. We expect that this will not be an issue, but we haven't actually tried it with the real data yet.
204211

205-
Each individual simulant :math:`i` has their own :math:`\text{CSMR}_i^k` that might be different from :math:`\text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}` (meaning the average birth-weight- and gestational-age-specific CSMRisk for simulants with the birth weight and gestational age matching simulant :math:`i`. We recommend implementing this as a pipeline eventually because it will be modified by interventions (or access to interventions) relevant to this subcause. (Until we implement those, we will have :math:`\text{CSMRisk}_{i}^k = \text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}`, though.)
212+
:math:`\text{CSMRisk}` and :math:`p_{\text{preterm}}` are calculated differently for the ENN and LNN age groups.
213+
For clarity of notation, in what follows we will again make explicit the age group
214+
subscripts that have been implicit on every quantity to this point.
215+
(Sex and location remain implicit.)
216+
We define the ENN CSMRisk as:
217+
218+
.. math::
219+
220+
\text{CSMRisk}_\text{ENN} = \frac{\text{enn_death_count}}{\text{live_birth_count}},
221+
222+
where the :math:`\text{enn_death_count}` and :math:`\text{live_birth_count}` are
223+
quantities pulled from GBD, as detailed in the table below.
224+
225+
The LNN CSMRisk is:
226+
227+
.. math::
228+
229+
\text{CSMRisk}_\text{LNN} = \frac{\text{lnn_death_count}}{\text{live_birth_count} - \text{enn_all_cause_death_count}},
230+
231+
where, again, all quantities are pulled from GBD as detailed in the table below.
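
The two definitions above can be written as a minimal Python sketch. The function names mirror the quantities in the equations, but the functions themselves and the example counts are hypothetical and are not GBD values.

```python
def csmrisk_enn(enn_death_count, live_birth_count):
    """ENN CSMRisk: cause-specific ENN deaths per live birth."""
    return enn_death_count / live_birth_count


def csmrisk_lnn(lnn_death_count, live_birth_count, enn_all_cause_death_count):
    """LNN CSMRisk: cause-specific LNN deaths per simulant alive at the start of LNN."""
    # The denominator removes everyone who died of any cause during ENN.
    alive_at_start_of_lnn = live_birth_count - enn_all_cause_death_count
    return lnn_death_count / alive_at_start_of_lnn


# Made-up illustrative counts.
example_enn = csmrisk_enn(enn_death_count=500, live_birth_count=100_000)
example_lnn = csmrisk_lnn(
    lnn_death_count=98,
    live_birth_count=100_000,
    enn_all_cause_death_count=2_000,
)
```

Note that the LNN denominator uses all-cause ENN deaths, not just preterm deaths, because any ENN death removes a simulant from the population at risk in the LNN period.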
232+
233+
:math:`p_{\text{preterm}}`, as mentioned above, represents the prevalence/exposure
234+
of preterm (gestational age < 37 weeks) at the *beginning* of the age group.
235+
For ENN, the beginning of the age group is birth, so the prevalence of preterm
236+
at birth is a sum of the birth prevalence for all LBWSG categories with gestational
237+
age less than 37 weeks:
238+
239+
.. math::
240+
241+
p_{\text{preterm},\text{ENN}} = \sum_{\{\text{cat}: \text{GA}<37\}} \text{lbwsg_birth_prevalence}_\text{cat},
242+
243+
where :math:`\text{lbwsg_birth_prevalence}` can be pulled from GBD with minor transformations,
244+
as detailed in the table below.
245+
246+
For LNN, the situation is more complicated, because we need to account
247+
for differential mortality in the ENN period.
248+
Therefore, the easiest way to calculate :math:`p_{\text{preterm},\text{LNN}}` is to get the end-of-ENN preterm
249+
prevalence from the same LBWSG PAF calculation pipeline used for :math:`Z`
250+
above.
251+
As detailed at :ref:`details_of_the_lbwsg_paf_calculation` on the neonatal all-cause
252+
mortality page, there are two iterative steps using microsimulation, with the late neonatal calculations
253+
using the result of the early neonatal calculations.
254+
Similarly to the LNN PAF, *after* the early neonatal calculations are complete, the prevalence of
255+
preterm at the end of the ENN age group should be calculated.
256+
This value should be used as :math:`p_{\text{preterm},\text{LNN}}` for the purposes
257+
of the CSMRisk equation.
258+
259+
Determining the prevalence of preterm is a bit more complex than it sounds, because in the PAF calculation pipeline,
260+
the same number of simulants are assigned to each LBWSG category, rather than assigning each simulant
261+
to a random category with probability equal to that category's prevalence at birth.
262+
Due to this initialization strategy, all quantities calculated in the pipeline must use *weights*
263+
to account for the fact that the simulants in the categories with higher birth prevalence represent more people.
264+
Therefore, :math:`p_{\text{preterm},\text{LNN}}` is calculated as follows:
265+
266+
.. math::
267+
268+
p_{\text{preterm},\text{LNN}} = \frac{
269+
\sum_{\{\text{cat}: \text{GA}<37\}} \text{lbwsg_birth_prevalence}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}
270+
}{
271+
\sum_{\text{cat}} \text{lbwsg_birth_prevalence}_\text{cat} \times \frac{n_\text{cat} - n^\text{deaths}_\text{cat}}{n_\text{cat}}
272+
},
273+
274+
where :math:`n_\text{cat}` is the number of simulants initialized into each LBWSG category at birth
275+
and :math:`n^\text{deaths}_\text{cat}` is the number of deaths in each category when ENN mortality was applied.
276+
Note that :math:`n_\text{cat}` will not vary by LBWSG exposure category under the current approach of assigning the same number of simulants to each LBWSG category.
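
The weighted prevalence calculation above can be sketched as follows. This is an illustration only: the function name, the dictionary inputs, and the two example categories are hypothetical, and the real pipeline may organize these data differently.

```python
def preterm_prevalence_lnn(birth_prevalence, is_preterm, n_initial, n_deaths):
    """Prevalence of preterm among simulants surviving the ENN period.

    birth_prevalence, n_initial, n_deaths: dicts keyed by LBWSG category.
    is_preterm: dict mapping each category to True if GA < 37 weeks.
    """
    numerator = 0.0
    denominator = 0.0
    for cat, prevalence in birth_prevalence.items():
        # Birth prevalence scaled by the category's ENN survival fraction.
        surviving = prevalence * (n_initial[cat] - n_deaths[cat]) / n_initial[cat]
        denominator += surviving
        if is_preterm[cat]:
            numerator += surviving
    return numerator / denominator


# Made-up illustrative inputs with one preterm and one term category.
birth_prevalence = {"preterm_cat": 0.1, "term_cat": 0.9}
is_preterm = {"preterm_cat": True, "term_cat": False}
n_initial = {"preterm_cat": 100, "term_cat": 100}
n_deaths = {"preterm_cat": 10, "term_cat": 1}

p_lnn = preterm_prevalence_lnn(birth_prevalence, is_preterm, n_initial, n_deaths)

# With no ENN deaths, the result reduces to the preterm prevalence at birth.
no_deaths = {"preterm_cat": 0, "term_cat": 0}
p_no_deaths = preterm_prevalence_lnn(birth_prevalence, is_preterm, n_initial, no_deaths)
```

In this example the preterm prevalence falls from 0.1 at birth to 0.09 / 0.981 at the end of ENN, because the preterm category has higher ENN mortality.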
277+
278+
Each individual simulant :math:`i` has their own :math:`\text{CSMRisk}_i^k` that might be different from :math:`\text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}` (meaning the average birth-weight- and gestational-age-specific CSMRisk for simulants with the birth weight and gestational age matching simulant :math:`i`). We recommend implementing this as a Vivarium pipeline eventually because it will be modified by interventions (or access to interventions) relevant to this subcause. (Until we implement those, we will have :math:`\text{CSMRisk}_{i}^k = \text{CSMRisk}^k_{\text{BW}_i,\text{GA}_i}`, though.)
206279

207280
The following table shows the data needed for these
208281
calculations.
@@ -212,8 +285,9 @@ Data Tables
212285

213286
.. note::
214287

215-
All quantities pulled from GBD in the following table are for a
216-
specific year, sex, age group, and location.
288+
All quantities pulled from GBD in the following table are pulled
289+
for all modeled years, sexes, age groups, and locations,
290+
except when the age group is explicitly specified.
217291

218292
.. list-table:: Data values and sources
219293
:header-rows: 1
@@ -224,36 +298,24 @@ Data Tables
224298
- Note
225299
* - enn_all_cause_death_count
226300
- Count of deaths due to all causes in the early neonatal age group
227-
- GBD: source='codcorrect', metric_id=1, cause_id=294
301+
- GBD: source='codcorrect', metric_id=1, cause_id=294, age_group_id=2
228302
-
229303
* - enn_death_count
230304
- Count of deaths due to cause neonatal preterm birth complications in the early neonatal age group
231-
- GBD: source='codcorrect', metric_id=1, cause_id=381
305+
- GBD: source='codcorrect', metric_id=1, cause_id=381, age_group_id=2
232306
-
233307
* - lnn_death_count
234308
- Count of deaths due to cause neonatal preterm birth complications in the late neonatal age group
235-
- GBD: source='codcorrect', metric_id=1, cause_id=381
309+
- GBD: source='codcorrect', metric_id=1, cause_id=381, age_group_id=3
236310
-
237311
* - live_birth_count
238312
- Count of live births
239313
- GBD: covariate_id = 1106
240314
-
241-
* - csmrisk_enn
242-
- neonatal preterm birth complications mortality risk in the early neonatal age group
243-
- enn_death_count / live_birth_count
244-
-
245-
* - csmrisk_lnn
246-
- neonatal preterm birth complications mortality risk in the late neonatal age group
247-
- lnn_death_count / (live_birth_count - enn_all_cause_death_count)
248-
-
249-
* - :math:`\text{CSMRisk}`
250-
- neonatal preterm birth complications mortality risk
251-
- either csmrisk_enn or csmrisk_lnn depending on the simulant's age group
315+
* - lbwsg_birth_prevalence
316+
- Birth prevalence of low birthweight and short gestation risk factor
317+
- GBD with post-processing: rei_id = 339, then remove the extraneous category and rescale prevalence :ref:`as described here <rescaling_lbwsg_exposure_data_pulled_from_gbd_2019>`.
252318
-
253-
* - :math:`p_\text{preterm}`
254-
- Prevalence of gestational age <37 weeks at birth
255-
- Derived from :ref:`GBD LBWSG exposure <risk_exposure_lbwsg>`
256-
- Equal to the sum of exposures for all categories with gestational age at birth <37 weeks. A list of such categories can be generated in a manner similar to `this notebook <https://github.com/ihmeuw/vivarium_research_nutrition_optimization/blob/data_prep/data_prep/LBW%20categories.ipynb>`_
257319
* - :math:`f_\text{preterm w RDS}`
258320
- fraction of preterm deaths with RDS
259321
- 85%

docs/source/models/concept_models/vivarium_mncnh_portfolio/ai_ultrasound_module/module_document.rst

Lines changed: 0 additions & 1 deletion
@@ -37,7 +37,6 @@ AI Ultrasound Module
3737

3838
.. contents::
3939
:local:
40-
:depth: 2
4140

4241
1.0 Overview
4342
++++++++++++

docs/source/models/concept_models/vivarium_mncnh_portfolio/anemia_component/module_document.rst

Lines changed: 22 additions & 1 deletion
@@ -201,7 +201,28 @@ Note that simulants who died during labor should not experience any YLDs due to
201201
4.0 Verification and Validation Criteria
202202
+++++++++++++++++++++++++++++++++++++++++
203203

204-
- Baseline simulated anemia YLDs should match corresponding pregnancy-specific GBD values. TODO: define specifically what these are (do they save pregnancy-specific impairment prevalence in GBD 2023 or do we need to calculate our own targets again?)
204+
- Baseline simulated anemia YLDs should match corresponding pregnancy-specific GBD values. Run the following command to load the data from GBD 2023:
205+
206+
.. code-block:: python
207+
208+
from db_queries import get_outputs

get_outputs(
209+
location_id=[165,179,214],
210+
topic='rei',
211+
rei_id=[205,206,207], # We also have rei_id=192 for all anemia and rei_id=432 for moderate and severe combined
212+
population_group_id=16,
213+
sex_id=2,
214+
year_id=2023,
215+
release_id=16, # release_id=33 also works
216+
compare_version_id=8306,
217+
measure_id=[3,5],
218+
age_group_id=[7, 8, 9, 10, 11, 12, 13, 14, 15, 24, 169]
219+
)
220+
221+
.. note::
222+
223+
Make sure you have the latest version of ``db_queries`` to be able to use the ``population_group_id`` argument. To get pregnancy-specific results, the population group and the age groups need to be specified, because the default is all ages.
224+
As of the time of writing (July 2025), we can only use ``population_group_id=16`` with ``get_outputs()``. There were a few EPIC/COMO runs with pregnancy this GBD round, which are noted in the `tracking HUB page <https://hub.ihme.washington.edu/spaces/GBDdirectory/pages/229280352/GBD+2023+EPIC+COMO+tracking>`_.
225+
205226

206227
5.0 References
207228
+++++++++++++++
