
Commit 5c3d1a6

Authored by nina-xu (Nina Xu) and binaryaaron
chore: sync from NMP MR !5254 - e2e tests (#55)
# Summary

Sync CI tests from NMP

> Files changed in MR but NOT synced (outside src/ and tests/, may need manual review):
> Makefile
> README.md
> images/ci_pipeline.png

## Pre-Review Checklist

<!-- These checks should be completed before a PR is reviewed, -->
<!-- but you can submit a draft early to indicate that the issue is being worked on. -->

Ensure that the following pass:

- [x] `make format && make lint` or via prek validation.
- [ ] `make test` passes locally
- [ ] `make e2e` passes locally
- [ ] `make test-ci-container` passes locally (recommended)

## Pre-Merge Checklist

<!-- These checks need to be completed before a PR is merged, -->
<!-- but as PRs often change significantly during review, -->
<!-- it's OK for them to be incomplete when review is first requested. -->

- [x] New or updated tests for any fix or new behavior
- [ ] Updated documentation for new features and behaviors, including docstrings for API docs.

## Other Notes

<!-- Please add the issue number that should be closed when this PR is merged. -->

- Closes #<issue>

---------

Signed-off-by: aagonzales <aagonzales@nvidia.com>
Co-authored-by: Nina Xu <ninaxu@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: aagonzales <aagonzales@nvidia.com>
1 parent e871be4 commit 5c3d1a6

29 files changed

Lines changed: 46645 additions & 994 deletions

.agent/skills/diagnose-failures/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ Map GitHub Actions job names to local commands:
 | `Lint` | `make lint` (runs ruff + ty + copyright) |
 | `Typecheck` | `bash tools/lint/run-ty-check.sh` |
 | `Unit Tests` | `make test-ci` or `make test` |
+| `Config-Dataset e2e` | `make test-nss-<config>-<dataset>-ci` |

 Fetch CI logs:
 ```bash

.agent/skills/python-testing-patterns/SKILL.md

Lines changed: 0 additions & 1 deletion
@@ -14,7 +14,6 @@ compatibility: "Python 3.11+, pytest with xdist and asyncio_mode=auto."
 # Make targets (preferred)
 make test # Unit tests, excluding slow
 make test-slow # All tests including slow, excluding e2e
-make test-sdk-related # Config/SDK/CLI/API tests only
 make test-ci-container # CI tests in a Linux container

 # Direct pytest (via uv)

.claude/commands/gpu-test.md

Lines changed: 1 addition & 0 deletions
@@ -9,4 +9,5 @@ Run GPU-dependent tests. Requires CUDA.
 * All e2e tests: `make test-e2e`
 * Default e2e only: `make test-e2e-default`
 * DP e2e only: `make test-e2e-dp`
+* Config-dataset combo: `make test-nss-tinyllama_unsloth-clinc_oos-ci` (12 combos total, see `tests/TESTING.md`)
 * Note: e2e tests run with `-n 0` (single process)

.claude/commands/test-sdk-related.md

Lines changed: 0 additions & 9 deletions
This file was deleted.

.copyrightignore

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ README.md
 SECURITY.md
 THIRD_PARTY.md
 design.md
+tests/stub_datasets/LICENSES.md

 # AI / IDE / CI configuration directories
 .agent/

AGENTS.md

Lines changed: 10 additions & 2 deletions
@@ -106,13 +106,21 @@ make test
 # All tests including slow (excludes e2e)
 make test-slow

-# SDK-related tests only
-make test-sdk-related
+# GPU integration (requires CUDA)
+make test-gpu-integration
+
+# End-to-end tests (requires CUDA)
+make test-e2e
+
+# Specific config-dataset e2e combo (12 total)
+make test-nss-tinyllama_unsloth-clinc_oos-ci

 # CI tests in a Linux container
 make test-ci-container
 ```

+See [tests/TESTING.md](tests/TESTING.md) for the full test matrix and details.
+
 Test runner: `uv run --frozen pytest -n auto --dist loadscope -vv`

 Markers (defined in `pytest.ini`):

CONTRIBUTING.md

Lines changed: 3 additions & 3 deletions
@@ -265,15 +265,15 @@ make test
 # Run all tests including slow tests (excludes e2e)
 make test-slow

-# Run SDK-related tests (config, sdk, cli, api)
-make test-sdk-related
-
 # Run GPU integration tests (requires CUDA)
 make test-gpu-integration

 # Run end-to-end tests (requires CUDA)
 make test-e2e

+# Run a specific config-dataset e2e combo (12 total, see tests/TESTING.md)
+make test-nss-tinyllama_unsloth-clinc_oos-ci
+
 # Run CI tests locally in a Linux container (Docker/Podman)
 make test-ci-container

Makefile

Lines changed: 26 additions & 8 deletions
@@ -28,6 +28,7 @@ PYTORCH_DEPS ?= cpu
 PYTEST_ADDOPTS := -n auto --dist loadscope -vv
 PYTEST_CI_OPTS := --cov --cov-report json:coverage.json
 PYTEST_CMD := uv run --frozen pytest $(PYTEST_ADDOPTS)
+PYTEST_NO_XDIST_CMD := $(PYTEST_CMD) -n 0

 # Display platform info
 $(info local system architecture: $(PLATFORM)/$(ARCH))
@@ -153,14 +154,6 @@ test-slow: ## Run all tests including slow tests (excludes e2e)
 	pushd $(NSS_ROOT_PATH) && \
 	$(PYTEST_CMD) $(NSS_ROOT_PATH)/tests -m "not e2e" --run-slow

-.PHONY: test-sdk-related
-test-sdk-related: ## Run SDK-related tests (config, sdk, cli, api)
-	$(PYTEST_CMD) \
-		$(NSS_ROOT_PATH)/tests/config \
-		$(NSS_ROOT_PATH)/tests/sdk \
-		$(NSS_ROOT_PATH)/tests/cli \
-		$(NSS_ROOT_PATH)/tests/api
-
 .PHONY: test-ci
 test-ci: ## Run CI unit tests excluding slow and GPU tests
 	pushd $(NSS_ROOT_PATH) && \
@@ -368,3 +361,28 @@ synchronize-from-nmp: synchronize-py-files-from-nmp synchronize-metafiles-from-n
 		echo "NMP_REPO_PATH '$(NMP_REPO_PATH)' is not a valid directory."; \
 		exit 1; \
 	fi
+
+
+# ============================================================
+# Config-Dataset Combination Tests (12 total)
+# ============================================================
+# Generated targets: test-nss-{CONFIG}-{DATASET}-ci
+# CONFIGS : tinyllama_unsloth tinyllama_dp smollm3_unsloth smollm3_dp mistral_nodp mistral_dp
+# DATASETS: clinc_oos dow_jones_index
+# Example usage:
+# make test-nss-tinyllama_unsloth-clinc_oos-ci
+# make test-nss-tinyllama_dp-dow_jones_index-ci
+
+NSS_CONFIGS := tinyllama_unsloth tinyllama_dp smollm3_unsloth smollm3_dp mistral_nodp mistral_dp
+NSS_DATASETS := clinc_oos dow_jones_index
+
+define nss_combo_test
+test-nss-$(1)-$(2)-ci: ## Run pytest test for $(shell echo $(1) | tr '_' '-') config with $(shell echo $(2) | tr '_' '-') dataset
+	$(MAKE) bootstrap-nss cu128
+	$(PYTEST_NO_XDIST_CMD) -vv $(PYTEST_CI_OPTS) $(NSS_ROOT_PATH)/tests/e2e/test_dataset_config.py -k "test_$(2)_dataset[$(subst _,-,$(1))]"
+endef
+
+$(foreach config,$(NSS_CONFIGS),\
+  $(foreach dataset,$(NSS_DATASETS),\
+    $(eval $(call nss_combo_test,$(config),$(dataset)))))
+

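The new Makefile section generates one target per config-dataset pair by expanding a template inside two nested `$(foreach ...)` loops. As a rough illustration of what that expansion yields, the Python sketch below enumerates the target names and the pytest `-k` expressions built from the values shown in the hunk. The helper `combo_targets` is hypothetical and not part of the repository; it only mirrors the naming scheme and the `$(subst _,-,$(1))` substitution visible in the diff.

```python
from itertools import product

# Values copied from NSS_CONFIGS and NSS_DATASETS in the Makefile hunk.
NSS_CONFIGS = [
    "tinyllama_unsloth", "tinyllama_dp",
    "smollm3_unsloth", "smollm3_dp",
    "mistral_nodp", "mistral_dp",
]
NSS_DATASETS = ["clinc_oos", "dow_jones_index"]


def combo_targets(configs=NSS_CONFIGS, datasets=NSS_DATASETS):
    """Return (make target, pytest -k expression) pairs, one per combination.

    Hypothetical helper for illustration only: the outer loop runs over
    configs and the inner loop over datasets, like the nested foreach calls.
    """
    pairs = []
    for config, dataset in product(configs, datasets):
        target = f"test-nss-{config}-{dataset}-ci"
        # The template replaces underscores with hyphens in the config name
        # only; the dataset name keeps its underscores.
        k_expr = f"test_{dataset}_dataset[{config.replace('_', '-')}]"
        pairs.append((target, k_expr))
    return pairs


if __name__ == "__main__":
    for target, k_expr in combo_targets():
        print(f'{target}  ->  -k "{k_expr}"')
```

Six configs times two datasets gives the 12 combinations mentioned in the Makefile comment. Note that the config name appears with underscores in the target name but with hyphens in the test ID passed to `-k`.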
README.md

Lines changed: 5 additions & 3 deletions
@@ -380,19 +380,21 @@ make test
 # Run all tests including slow tests (excludes e2e)
 make test-slow

-# Run SDK-related tests (config, sdk, cli, api)
-make test-sdk-related
-
 # Run GPU integration tests (requires CUDA)
 make test-gpu-integration

 # Run end-to-end tests (requires CUDA)
 make test-e2e

+# Run a specific config-dataset e2e combo (12 total, see tests/TESTING.md)
+make test-nss-tinyllama_unsloth-clinc_oos-ci
+
 # Run specific test files directly
 uv run pytest tests/cli/test_run.py
 ```

+See [tests/TESTING.md](tests/TESTING.md) for the full test matrix and usage.
+
 ### Container-Based Testing

 You can run the CI test suite locally in a Linux container using Docker or Podman:

src/nemo_safe_synthesizer/sdk/library_builder.py

Lines changed: 4 additions & 0 deletions
@@ -363,6 +363,10 @@ def generate(self) -> SafeSynthesizer:
         if self._total_start is None:
             self._total_start = time.monotonic()

+        # Clean up trainer model if it exists (only present when train->generate in same session)
+        if hasattr(self, "trainer") and self.trainer is not None:
+            self.trainer.delete_trainable_model()
+
         # Select backend based on time_series configuration
         if self._nss_config.time_series and self._nss_config.time_series.is_timeseries:
             self.generator = TimeseriesBackend(

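The guard added to `generate()` has to work both when training ran earlier in the same session and when it did not. The sketch below isolates that pattern under simplifying assumptions: `FakeTrainer` and `BuilderSketch` are hypothetical stand-ins, not the real classes, and the sketch assumes that `train()` is the only place the `trainer` attribute is set.

```python
class FakeTrainer:
    """Hypothetical stand-in for the real trainer; it only records the cleanup call."""

    def __init__(self):
        self.model_deleted = False

    def delete_trainable_model(self):
        self.model_deleted = True


class BuilderSketch:
    """Hypothetical, stripped-down builder showing only the cleanup guard."""

    def train(self):
        # Assumption for this sketch: train() is what creates the attribute.
        self.trainer = FakeTrainer()
        return self

    def generate(self):
        # Same condition as in the diff: skip cleanup when no trainer exists.
        if hasattr(self, "trainer") and self.trainer is not None:
            self.trainer.delete_trainable_model()
        return self


# Generate-only session: no trainer attribute, so the guard skips cleanup.
BuilderSketch().generate()

# Train-then-generate session: the trainer's model is cleaned up first.
builder = BuilderSketch().train().generate()
print(builder.trainer.model_deleted)
```

Checking `hasattr` before reading the attribute keeps a generate-only session from raising `AttributeError`, and the `is not None` check covers the case where the attribute exists but was never populated.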