Add Cluster-based E2E tests for VCT by patricktnast · Pull Request #288 · ihmeuw/vivarium_cluster_tools

patricktnast · 2026-02-18T22:49:47Z

Add Cluster-based E2E tests for VCT

Description

Category: testing
JIRA issue: https://jira.ihme.washington.edu/browse/MIC-6827

Changes and notes

I added some basic tests of psimulate run, restart, and expand. This will help us sort out the jobmon refactor.

I added a load test too, but it is failing, I think more or less as expected.

Testing


albrja · 2026-02-19T01:25:51Z

tests/psimulate/test_e2e.py

+def _read_metadata(output_dir: Path) -> pd.DataFrame:
+    """Read the finished simulation metadata CSV from an output directory."""
+    metadata_path = output_dir / "finished_sim_metadata.csv"
+    assert metadata_path.exists(), f"Metadata file not found at {metadata_path}"


Nit: This seems unnecessary since we will runtime error on the next line if this fails.

albrja · 2026-02-19T01:29:57Z

tests/conftest.py

+                item.add_marker(skip_e2e)
+
+
+def is_on_slurm() -> bool:


Isn't this a pytest plugin in VTU now?

It actually is not, but I think it is fair to say that it should be.

albrja · 2026-02-19T01:30:35Z

tests/conftest.py

+    return shutil.which("sbatch") is not None
+
+
+def is_slow_test_day(slow_test_day: str = SLOW_TEST_DAY) -> bool:


This should definitely be a plugin.
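For reference, a minimal sketch of the two helpers under discussion. Only the `shutil.which("sbatch")` check and the two signatures come from the diff; the value of `SLOW_TEST_DAY` and the body of `is_slow_test_day` are assumptions made for illustration, not the VTU or VCT implementation.

```python
import shutil
from datetime import datetime

# Assumption: the diff does not show this value; "Sunday" is a placeholder.
SLOW_TEST_DAY = "Sunday"

# Explicit English day names, indexed by datetime.weekday() (Monday == 0),
# so the comparison does not depend on the process locale.
DAY_NAMES = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]


def is_on_slurm() -> bool:
    """Return True if the Slurm submission command is on the PATH."""
    return shutil.which("sbatch") is not None


def is_slow_test_day(slow_test_day: str = SLOW_TEST_DAY) -> bool:
    """Return True if today is the designated slow-test day (hypothetical body)."""
    return DAY_NAMES[datetime.today().weekday()] == slow_test_day
```

Keeping both helpers free of pytest imports would make them easy to lift into a shared plugin later.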

rmudambi · 2026-02-19T16:11:08Z

tests/psimulate/data/e2e_model_spec.yaml

@@ -0,0 +1,22 @@
+# Minimal vivarium model specification for E2E testing.
+# Runs 2 time steps with 10 simulants -- completes in seconds.
+# No components, no data artifact, no results observers.


Won't we need observers to do some sorts of E2E tests - e.g. to test that we write results to disk properly?

rmudambi · 2026-02-19T16:12:34Z

tests/psimulate/test_e2e.py

+    for _ in range(10):
+        if not results_dir.exists():
+            break  # the dir has been removed
+        # Take a quick nap to ensure processes are finished with the directory


Sometimes copilot comments amuse me

funny enough, this is actually a direct copy/paste from Easylink
https://github.com/ihmeuw/easylink/blame/main/tests/conftest.py
Not sure where Steve got it from, though!
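The diff only shows the top of the cleanup loop, so the following is a rough sketch of the pattern it appears to follow: check whether the directory is gone, wait briefly, and try removing it again. The function name, the `attempts` and `delay` parameters, and the use of `shutil.rmtree` are assumptions, not code from this PR or from Easylink.

```python
import shutil
import time
from pathlib import Path


def remove_dir_with_retries(
    results_dir: Path, attempts: int = 10, delay: float = 1.0
) -> bool:
    """Try to delete results_dir, retrying while other processes let go of it.

    Returns True if the directory is gone when the function returns.
    """
    for _ in range(attempts):
        if not results_dir.exists():
            break  # the dir has been removed
        # Pause so that lingering processes can finish with the directory
        time.sleep(delay)
        # ignore_errors=True lets a failed attempt fall through to the next retry
        shutil.rmtree(results_dir, ignore_errors=True)
    return not results_dir.exists()
```

Returning a boolean lets the calling fixture decide whether a leftover directory should fail the test or only emit a warning.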


rmudambi · 2026-02-19T16:19:17Z

tests/psimulate/test_e2e.py

+        self, shared_tmp_path: Path, slurm_project: str
+    ) -> None:
+        """Expand a completed run by adding draws and seeds, verify new jobs complete."""
+        _, output_dir = _run_basic_simulation(shared_tmp_path, slurm_project)


Nit: is it possible to leverage the fact that you have already done a psimulate run in other tests so you don't have to duplicate that work? This is not that important, as these jobs are fast.

rmudambi · 2026-02-19T16:20:48Z

tests/psimulate/test_e2e.py

+        "-P",
+        slurm_project,
+        "-r",
+        "00:10:00",


You said above that these sims take a few seconds to run. Can we reduce the run-time request? What about memory?

Uh, I shouldn't have let Copilot keep in the "-- completes in seconds." but they are indeed pretty fast. I can bump it down to a couple of minutes to be on the safe side once I add in another component and observer.

The memory option is actually typed in click as an int.
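A small sketch of how the shared Slurm arguments could be built in one place with a shorter runtime request. Only the `-P` and `-r` flags and the `hh:mm:ss` format appear in the diff; the helper name, the two-minute default, and the format check are assumptions. The memory option is left out because the thread only establishes that click types it as an int.

```python
import re


def common_slurm_args(slurm_project: str, max_runtime: str = "00:02:00") -> list[str]:
    """Build the project and runtime arguments shared by the E2E psimulate calls."""
    # Guard against typos such as "2m" before anything is submitted to the cluster.
    if not re.fullmatch(r"\d{2,}:[0-5]\d:[0-5]\d", max_runtime):
        raise ValueError(f"max_runtime must look like hh:mm:ss, got {max_runtime!r}")
    return ["-P", slurm_project, "-r", max_runtime]
```

Each test can then extend its own command list with these arguments instead of repeating the flags.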

rmudambi · 2026-02-19T16:25:29Z

tests/conftest.py

+    )


 def pytest_collection_modifyitems(config: Config, items: list[Function]) -> None:


This logic seems to be duplicated across many repos. Is it possible to move it into VTU?

apparently so!


rmudambi · 2026-02-19T22:20:13Z

tests/psimulate/test_e2e.py

+        assert deaths_dir.exists()
+
+        deaths_df = pd.read_parquet(deaths_dir)
+        assert not deaths_df.empty


Check if we have results from each of our parallel runs.
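A possible shape for that check: compare the draw and seed combinations that were requested against the ones present in the results. The helper below is a sketch with hypothetical names and works on plain pairs. With the parquet results, the observed pairs would come from whichever columns identify a run, which the diff does not show.

```python
from itertools import product
from typing import Iterable


def missing_runs(
    expected_draws: Iterable[int],
    expected_seeds: Iterable[int],
    observed_pairs: Iterable[tuple[int, int]],
) -> set[tuple[int, int]]:
    """Return the (draw, seed) combinations that have no rows in the results."""
    expected = set(product(expected_draws, expected_seeds))
    return expected - set(observed_pairs)
```

In the test, asserting that the returned set is empty would fail with a message listing exactly which parallel runs wrote no results.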

patricktnast added 13 commits February 11, 2026 16:10

[COPILOT] Scaffold E2E tests

ca2ff47

Merge remote-tracking branch 'origin/main' into pnast/testing/e2e-cluster

6c30936

remove a test

49b273b

replace flag with is_on_slurm check

f3a4c1b

add appropriate skips

2886a7e

adjust shared dir creation

9b7a886

remove a test i didn't like

f5befe0

reduce docstring

a9314fa

add psimulate expand test

ae0d6aa

add load test test

1975d7e

format

fd55a6e

consolidate test

f9d27c5

change max time to 10m

438ae7c

patricktnast changed the title ~~Pnast/testing/e2e cluster~~ Add Cluster-based E2E tests for VCT Feb 19, 2026

patricktnast marked this pull request as ready for review February 19, 2026 00:28

patricktnast requested review from albrja, hussain-jafari, rmudambi and stevebachmeier as code owners February 19, 2026 00:28

mypy

758c1d6

albrja approved these changes Feb 19, 2026


rmudambi reviewed Feb 19, 2026


patricktnast and others added 8 commits February 19, 2026 09:33

Update tests/psimulate/test_e2e.py

94be2d7

Co-authored-by: Rajan Mudambi <11376379+rmudambi@users.noreply.github.com>

add mortality to e2e sim

701d5db

remove a comment

f67e3e5

use VTU plugin

0fe8766

create temp dependency

1390925

reduce runtime requirements

7d5e4ef

Merge branch 'pnast/testing/e2e-cluster' of github.com:ihmeuw/vivarium_cluster_tools into pnast/testing/e2e-cluster

388c1b5

adjust expand test and remove duplicate sims

540de85

DRY common slurm args

b4cdd8c

rmudambi approved these changes Feb 19, 2026

