Add Cluster-based E2E tests for VCT #288

Open
patricktnast wants to merge 23 commits into main from pnast/testing/e2e-cluster

@patricktnast (Contributor) commented Feb 18, 2026

Add Cluster-based E2E tests for VCT

Description

Changes and notes

I added some basic tests of psimulate run, restart, and expand. This will help us sort out the jobmon refactor.

I also added a load test, but it is failing, I think more or less as expected.

Testing

@patricktnast changed the title from "Pnast/testing/e2e cluster" to "Add Cluster-based E2E tests for VCT" on Feb 19, 2026
@patricktnast marked this pull request as ready for review on February 19, 2026 00:28
def _read_metadata(output_dir: Path) -> pd.DataFrame:
    """Read the finished simulation metadata CSV from an output directory."""
    metadata_path = output_dir / "finished_sim_metadata.csv"
    assert metadata_path.exists(), f"Metadata file not found at {metadata_path}"
Contributor

Nit: This seems unnecessary since we will runtime error on the next line if this fails.
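
A minimal sketch of the point above, assuming the line after the assert opens the file. The helper name `read_metadata_rows` is hypothetical, and the standard library `csv` module stands in for pandas so the example runs without dependencies; `pd.read_csv` raises the same `FileNotFoundError` for a missing path. The error message already names the path, so the explicit assert adds little.

```python
from __future__ import annotations

import csv
import tempfile
from pathlib import Path


def read_metadata_rows(output_dir: Path) -> list[dict[str, str]]:
    """Hypothetical stand-in for _read_metadata that uses csv instead of pandas."""
    metadata_path = output_dir / "finished_sim_metadata.csv"
    # No existence assert: open() raises FileNotFoundError and names the path.
    with metadata_path.open(newline="") as f:
        return list(csv.DictReader(f))


def missing_file_error(output_dir: Path) -> str | None:
    """Return the error message raised for a missing metadata file, if any."""
    try:
        read_metadata_rows(output_dir)
    except FileNotFoundError as err:
        return str(err)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The message includes the full path to finished_sim_metadata.csv.
        print(missing_file_error(Path(tmp)))
```

The one thing the assert buys is a custom message; if that matters, a `pytest.fail` or an explicit exception would survive `python -O`, which strips asserts.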

item.add_marker(skip_e2e)


def is_on_slurm() -> bool:
Contributor

Isn't this a pytest plugin in VTU now?

Contributor Author

It actually is not, but I think it is fair to say that it should be.

    return shutil.which("sbatch") is not None


def is_slow_test_day(slow_test_day: str = SLOW_TEST_DAY) -> bool:
Contributor

This should definitely be a plugin.
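
For reference, a rough sketch of what a shareable version of this helper might look like. The diff excerpt shows only the signature, so the body, the value of `SLOW_TEST_DAY`, and the extra `today` parameter (added here so the function can be tested deterministically) are all assumptions rather than the code in this PR.

```python
from __future__ import annotations

import datetime as dt

# Assumed value: the excerpt does not show what SLOW_TEST_DAY is set to.
SLOW_TEST_DAY = "Sunday"

# Fixed English names avoid locale-dependent strftime("%A") output.
_DAY_NAMES = (
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
)


def is_slow_test_day(
    slow_test_day: str = SLOW_TEST_DAY, today: dt.date | None = None
) -> bool:
    """Return True if `today` (default: the current date) falls on `slow_test_day`."""
    if slow_test_day not in _DAY_NAMES:
        raise ValueError(f"Unknown day name: {slow_test_day!r}")
    current = today if today is not None else dt.date.today()
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return _DAY_NAMES[current.weekday()] == slow_test_day
```

Injecting the date keeps the helper easy to unit test once it lives in a shared plugin.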

@@ -0,0 +1,22 @@
# Minimal vivarium model specification for E2E testing.
# Runs 2 time steps with 10 simulants -- completes in seconds.
# No components, no data artifact, no results observers.
Contributor

Won't we need observers to do some sorts of E2E tests - e.g. to test that we write results to disk properly?

for _ in range(10):
    if not results_dir.exists():
        break  # the dir has been removed
    # Take a quick nap to ensure processes are finished with the directory
Contributor

Sometimes Copilot comments amuse me.

Contributor Author

Funnily enough, this is actually a direct copy/paste from Easylink:
https://github.com/ihmeuw/easylink/blame/main/tests/conftest.py
Not sure where Steve got it from, though!
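
For readers without the full diff, here is a self-contained sketch of the retry-cleanup pattern the excerpt appears to use. Only the loop header, the existence check, and the "quick nap" comment come from the diff; the `time.sleep` call, the `shutil.rmtree` call, the function name, and its parameters are assumptions.

```python
import shutil
import tempfile
import time
from pathlib import Path


def remove_dir_with_retries(
    results_dir: Path, attempts: int = 10, nap_seconds: float = 0.1
) -> bool:
    """Try to delete `results_dir`, retrying while other processes release it.

    Returns True if the directory is gone when the function returns.
    """
    for _ in range(attempts):
        if not results_dir.exists():
            break  # the dir has been removed
        # Take a quick nap to ensure processes are finished with the directory
        time.sleep(nap_seconds)
        # ignore_errors lets the next iteration retry instead of raising.
        shutil.rmtree(results_dir, ignore_errors=True)
    return not results_dir.exists()


if __name__ == "__main__":
    scratch = Path(tempfile.mkdtemp())
    (scratch / "results.txt").write_text("done")
    print(remove_dir_with_retries(scratch, nap_seconds=0.0))
```

Retrying is mostly useful on shared or network filesystems, where deletes can fail transiently while another process still holds a file open.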

        self, shared_tmp_path: Path, slurm_project: str
    ) -> None:
        """Expand a completed run by adding draws and seeds, verify new jobs complete."""
        _, output_dir = _run_basic_simulation(shared_tmp_path, slurm_project)
Contributor

Nit: is it possible to leverage the fact that you have already done a psimulate run in other tests so you don't have to duplicate that work? This is not that important, as these jobs are fast.

"-P",
slurm_project,
"-r",
"00:10:00",
Contributor

You said above that these sims take a few seconds to run. Can we reduce the run-time request? What about memory?

Contributor Author

Uh, I shouldn't have let Copilot keep the "-- completes in seconds." comment, but the sims are indeed pretty fast. I can bump the request down to a couple of minutes to be on the safe side once I add another component and an observer.

Contributor Author

The memory option is actually typed as int in click.

)


def pytest_collection_modifyitems(config: Config, items: list[Function]) -> None:
Contributor

This logic seems to be duplicated across many repos. Is it possible to move it into VTU?

Contributor Author

apparently so!

assert deaths_dir.exists()

deaths_df = pd.read_parquet(deaths_dir)
assert not deaths_df.empty
Contributor

Check if we have results from each of our parallel runs.
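
One possible shape for that check, sketched with plain Python so it runs standalone. The helper name `missing_runs` is hypothetical, and it assumes each parallel run is identified by a (draw, seed) pair and that the test expects every combination of the requested draws and seeds.

```python
from itertools import product
from typing import Hashable, Iterable


def missing_runs(
    observed: Iterable[tuple[Hashable, Hashable]],
    expected_draws: Iterable[Hashable],
    expected_seeds: Iterable[Hashable],
) -> set[tuple[Hashable, Hashable]]:
    """Return the expected (draw, seed) pairs that have no rows in the results."""
    expected = set(product(expected_draws, expected_seeds))
    return expected - set(observed)


if __name__ == "__main__":
    observed_pairs = [(1, 10), (1, 11), (2, 10)]
    # Reports the one combination with no results: {(2, 11)}
    print(missing_runs(observed_pairs, [1, 2], [10, 11]))
```

In the test, the observed pairs could come from something like `zip(deaths_df["input_draw"], deaths_df["random_seed"])`, assuming the results carry columns with those names, followed by `assert not missing_runs(...)`.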

3 participants