Albrja/mic 6867/update-population-characteristics-ordering#646
Albrja/mic 6867/update-population-characteristics-ordering#646
Conversation
| - Fertility fires in on_time_step_prepare, creating newborns at age=0 | ||
| - Aging fires in on_time_step, incrementing all simulant ages (including newborns) | ||
|
|
||
| Therefore, after one time step, newborns should have age > 0 (specifically |
There was a problem hiding this comment.
This is to test that this order is happening. We could set up these components to not include newborns in the time step population but this is just to show the correct order is happening right. The tests would obviously be updated if we decided to include some of these nuances.
| sim.setup() | ||
| mortality_rates = sim.get_population("mortality_rate") | ||
|
|
||
| # Step 1: Under the new ordering, mortality (priority 3) fires before |
There was a problem hiding this comment.
This was the only test that failed due to issues caused by new ordering. State transitions use to happen before mortality and now they happen after is the TLDR of what is happening here.
| "vivarium_dependencies[pandas,numpy_lt_2,scipy,tables,loguru,pyarrow]", | ||
| "vivarium_build_utils>=3.0.2,<4.0.0", | ||
| "vivarium>=4.1.0,<5.0.0", | ||
| "vivarium @ git+https://github.com/ihmeuw/vivarium@albrja/mic-6867/observer-priority", |
There was a problem hiding this comment.
TODO: Revert before merging
| Priority set such that aging occurs before observations (priority 2) | ||
| and before mortality (priority 3). | ||
| """ | ||
| return 1 |
There was a problem hiding this comment.
This should be 0, right? And accordingly, each of person-time observation and mortality should go down by 1 as well?
| return 1 | |
| return 0 |
| before mortality removes simulants. | ||
| """ | ||
| return 0 | ||
| return 3 |
There was a problem hiding this comment.
| return 3 | |
| return 2 |
| pop_filter=f"is_alive == True", | ||
| when="time_step__prepare", | ||
| when="time_step", | ||
| priority=2, |
There was a problem hiding this comment.
| priority=2, | |
| priority=1, |
| pop_filter="is_alive == True", | ||
| when="time_step__prepare", | ||
| when="time_step", | ||
| priority=2, |
There was a problem hiding this comment.
| priority=2, | |
| priority=1, |
| pop_filter=pop_filter, | ||
| when="time_step__prepare", | ||
| when="time_step", | ||
| priority=2, |
There was a problem hiding this comment.
| priority=2, | |
| priority=1, |
| pop_filter: str = "", | ||
| include_untracked: bool = False, | ||
| when: str = "collect_metrics", | ||
| priority: int = 5, |
There was a problem hiding this comment.
Use the default constant from vivarium?
| ).all() | ||
|
|
||
|
|
||
| def test_person_time_includes_dead_simulants(base_config, base_plugins): |
There was a problem hiding this comment.
It shouldn't include simulants who died on previous time-steps, and that's what this test name suggests.
| def test_person_time_includes_dead_simulants(base_config, base_plugins): | |
| def test_person_time_includes_dying_simulants(base_config, base_plugins): |
There was a problem hiding this comment.
I have updated the name and doc string to hopefully clear this up.
Albrja/mic 6867/update-population-characteristics-ordering
Update order of fertility, age, observation, and mortality of population
Changes and notes
-update priority of various population and observer components to change order of when population characteristics are updated
-new order is: fertility on time step prepare, on time step: aging, person time observations, mortality, all other components (default priority)
Code reviewer
PR Review: Update when population characteristics happen in lifecycle methods
Summary
This PR introduces priority-based ordering of time-step lifecycle events in vivarium_public_health, ensuring that aging, observations, mortality, and disease transitions execute in a defined sequence. The key execution order is: aging (priority 1) → observations (priority 2) → mortality (priority 3) → disease transitions (priority 5, vivarium default) → cleanup (priority 9). Previously, transitions could fire before mortality.
Additionally, BasePopulation is decomposed into sub-components — AgeOutSimulants (handles aging-out logic), Mortality (already existed, now registered as a sub-component), and Disability (new, registers the all-cause disability weight pipeline). The on_time_step for BasePopulation now filters for living simulants before aging. Tests are updated to account for the new ordering, particularly test_mortality_cause_of_death which uses a 2-step approach.
Design
Priority constants are scattered magic numbers with no central registry — base_population.py:59-70, mortality.py:138-145. Priorities (1, 2, 3, 5, 9) are spread across files as bare integers. Each docstring cross-references others by number, creating a fragile web. Consider a shared constants module or enum:
class LifecyclePriority: AGING = 1 OBSERVATIONS = 2 MORTALITY = 3 DISEASE_TRANSITIONS = 5 # vivarium framework default CLEANUP = 9
AgeOutSimulants cleanup priority is implicit — base_population.py:452-530. AgeOutSimulants inherits the default cleanup priority while BasePopulation explicitly sets time_step_cleanup_priority = 9. The ordering between these is critical (age-out must mark is_aged_out before cleanup evaluates exit_time) but undocumented. Consider adding an explicit time_step_cleanup_priority property to AgeOutSimulants.
Mortality hardcoded as sub-component of BasePopulation limits composability — base_population.py:78. Any simulation using BasePopulation always gets Mortality and Disability with no opt-out. This is convenient for the common case, but prevents using BasePopulation without mortality (tests work around this with all_cause_mortality_rate: 0). Worth noting as a deliberate trade-off.
Disability component is very thin — base_population.py:798-832. The entire class registers a single pipeline in setup(). This could be 3 lines inside BasePopulation.setup. The separate class adds ceremony for potential future extensibility that isn't described. Acceptable but worth noting.
Maintainability
Disease transition priority (5) is an undocumented coupling — mortality.py:138-145. Mortality.time_step_priority = 3 relies on disease transitions happening at the vivarium framework default priority (5), which is never stated in this repo. If the upstream default changes, the ordering breaks silently. Add an explicit note in the docstring.
time_step_cleanup_priority = 9 gap is unexplained — base_population.py:68-70. The gap from 5 to 9 (leaving 6-8 unused) has no rationale. A brief docstring note would clarify intent: "Set to 9 to ensure all other processing completes before exit times are finalized."
test_mortality_creates_attributes uses fragile hardcoded string lists — test_mortality.py:43-57. The test hardcodes "is_aged_out", "simulant_step_size", "all_causes.disability_weight" as known non-mortality attributes. If any sub-component adds a column, this test breaks even though mortality is unchanged. Consider asserting mortality's columns are a subset rather than computing via subtraction.
Comment typo: "AgedOutSimulants" should be "AgeOutSimulants" — test_mortality.py:52. The comment reads AgedOutSimulants but the class is AgeOutSimulants.
DRY
Near-identical update_exit_times in Mortality and AgeOutSimulants — mortality.py:294-300, base_population.py:483-493. Both methods follow the same 5-line algorithm: filter for matching simulants → intersect with NaT exit times → set clock() + step_size(). The only differences are the query string and include_untracked=True. Consider extracting a shared helper to reduce maintenance burden.
Repeated sim-setup boilerplate in test_mortality.py — The InteractiveContext + config update pattern is repeated in setup_sim_with_pop_and_mortality, test_mortality_cause_of_death, and test_unmodeled_causes. A factory fixture could consolidate these.
Tests
test_mortality_cause_of_death does not validate step-1 behavior — test_mortality.py:144-150. The test takes 2 steps but only asserts on step-2 deaths. Step 1 is where the ordering change is most visible: healthy simulants should only die of "other_causes" (no excess mortality), while sick simulants can die of "sick". A regression where transitions fire before mortality would not be caught. Add step-1 assertions.
Dead-code branch weakens the assertion — test_mortality.py:175-185. The loop for mortality_rate in rates silently skips when mortality_rate == 0 and the else: mortality_rate = mortality_rate on line 160 is a no-op. If a bug zeroes out all rates, the test passes without asserting anything. Remove the dead else branch and assert at least one non-zero rate was checked per cause.
No test for simultaneous death and age-out — A simulant near untracking_age could both die (mortality at priority 3) and age out (cleanup at priority 9) in the same step. Both update_exit_times modifiers would fire. No test verifies the behavior in this edge case.
test_aged_out_simulant_untracking never verifies actual untracking — test_base_population.py:131-186. The test always uses include_untracked=True, so it never validates that register_tracked_query("is_aged_out == False") actually excludes aged-out simulants from the default population view.
No integration test for the full priority chain — There is no single test validating that observations (priority 2) capture person-time before mortality (priority 3) removes simulants. This is the core motivation for the reordering.
Documentation
CHANGELOG entry is too vague — CHANGELOG.rst:3. The entry "Update when population characteristics happen in lifecycle methods" doesn't describe the behavioral change. Users upgrading from 5.1.x won't know that mortality now fires before disease transitions. Expand to list the priority ordering, the new sub-components, and the behavioral implications.
No centralized lifecycle priority documentation — The priority values are documented individually in property docstrings but there's no single table or section enumerating the full ordering. Add a priority-ordering table to the module docstring of base_population.py or concept docs.
BasePopulation class docstring doesn't mention sub-components — base_population.py:33-44. Users adding Mortality() explicitly alongside BasePopulation() would get a duplicate component error. The docstring should list the auto-registered sub-components and note that users should not add them separately.
Test comment references "priority 5" without sourcing — test_mortality.py:144. The comment says "disease transitions (priority 5)" but this is the vivarium framework default, not set in this repo. Clarify: "disease transitions (vivarium default priority 5)".
Functionality
exit_time semantics are correct by coincidence — mortality.py:294-300. Both update_exit_times methods set exit time to self.clock() + self.step_size(). This is evaluated lazily as an attribute modifier during BasePopulation.on_time_step_cleanup (priority 9). The test at test_mortality.py:98 asserts exit_time == sim._clock._clock_time, which works because _clock_time has already advanced by that point. However, the semantic intent (death happens mid-step but exit time is end-of-step) is not documented. If the pipeline evaluation phase ever changes, this could silently produce incorrect exit times.
test_unmodeled_causes uses * 365 instead of DAYS_PER_YEAR — test_mortality.py:224. The assertion other_causes_mortality_rate.unique()[0] * 365 == 0.5 uses a magic multiplier that doesn't match utilities.DAYS_PER_YEAR (which is 365.25). This works because the mock data setup produces values that happen to cancel out, but it's misleading.
No-op assignment in test_mortality_cause_of_death — test_mortality.py:180: else: mortality_rate = mortality_rate is dead code, likely a leftover from refactoring.
Minor Nits
test_mortality.py:52: Typo AgedOutSimulants → AgeOutSimulants.
test_mortality.py:180: Remove no-op else: mortality_rate = mortality_rate.
AgeOutSimulants and Disability are not exported from init.py. If they are internal, consider prefixing with _; if public, add to exports.
Overall
The priority-based lifecycle ordering is a well-motivated architectural change that makes the execution order explicit and deterministic. The decomposition into sub-components is reasonable. The main areas for improvement are:
Testing gaps: Step-1 assertion in cause-of-death test, simultaneous death+age-out edge case, and full priority chain integration test are missing.
Documentation: The CHANGELOG and docstrings need to clearly communicate the new ordering and its implications for downstream models. A centralized priority table would prevent future confusion.
DRY: The duplicated update_exit_times pattern should be extracted before a third instance appears.
Code hygiene: Remove the dead-code branch in the test and fix the typo.
Testing