Skip to content

Commit b4ae90d

Browse files
committed
chore: rebase and updates
Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>
1 parent 1152761 commit b4ae90d

6 files changed

Lines changed: 34 additions & 6 deletions

File tree

.github/workflows/ci-checks.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -169,7 +169,7 @@ jobs:
169169
run: |
170170
make bootstrap-nss cpu
171171
make test-smoke
172-
172+
173173
174174
# ---------------------------------------------------------------------------
175175
# Single required status check for branch protection.

STYLE_GUIDE.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -727,10 +727,10 @@ Testing conventions are substantial enough to warrant their own section. For the
727727
- Fixture scope: function-scoped by default. Session scope only when empirically justified by test runtime -- not based on assumptions about cost.
728728
- Assertions: bare `assert` is the primary style; `pytest.raises()` with `match=` for exceptions; `pytest.approx()` for floating-point comparisons
729729
- Docstrings: optional for simple tests, recommended for complex/e2e tests explaining purpose
730-
- Markers: auto-assigned by path via `pytest_collection_modifyitems` (`/e2e/` -> `e2e`, `/gpu_integration/` -> `gpu_integration`, default -> `unit`). Explicit markers: `@pytest.mark.slow`, `@pytest.mark.timeout()`.
730+
- Markers: auto-assigned by path via `pytest_collection_modifyitems` (`/e2e/` -> `e2e`, `/smoke/` -> `smoke`, default -> `unit`). Explicit markers: `@pytest.mark.slow`, `@pytest.mark.requires_gpu`, `@pytest.mark.timeout()`.
731731
- `conftest.py`: shared fixtures per directory; root conftest has `load_test_dataset()` and `load_test_dataframe()` helpers
732732
- Use `tmp_path` fixture for file operations, never write to the repo tree
733-
- Mark CUDA-dependent tests with `@pytest.mark.e2e` or `@pytest.mark.gpu_integration`
733+
- Mark CUDA-dependent tests with `@pytest.mark.e2e`, `@pytest.mark.smoke`, or `@pytest.mark.requires_gpu`
734734
- Mock only external boundaries, not internal implementation details
735735
- Test isolation: no shared mutable state or execution-order dependencies between tests. If something must be run first before executing a test, include it in the test or a fixture.
736736
- Use `@pytest.mark.parametrize` for testing multiple input combinations rather than copy-pasting similar tests

docs/user-guide/troubleshooting.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -438,4 +438,3 @@ Checklist:
438438
439439
Library bugs. If you encounter this error through documented interfaces,
440440
please [file an issue on GitHub](https://github.com/NVIDIA-NeMo/Safe-Synthesizer/issues).
441-

pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ dependencies = [
3737
"colorama>=0.4.6",
3838
"tqdm>=4.67.1",
3939
"setuptools>=80.0.0",
40-
40+
4141
]
4242

4343
[dependency-groups]

pytest.ini

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,8 @@ markers =
1818
smoke: Smoke tests - quick tests exercising training/generation hot paths with tiny models
1919
e2e: End-to-end tests - test the entire pipeline from data to generation to evaluation
2020
requires_gpu: Test needs CUDA hardware (orthogonal modifier, stacks on smoke/e2e)
21+
smollm2: SmolLM2 Hub download tests (used by Makefile for process isolation)
22+
unsloth: Unsloth backend tests (process-isolated from DP tests)
2123
noautouse: Marker to skip autouse fixtures for specific tests
2224

2325
# Note: Unit tests (testing single classes/functions with no infrastructure dependencies)

tests/smoke/README.md

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,34 @@ end-to-end without throwing. Use the smallest model that exercises the path
1919
(the local `tiny_llama` stub for most things, SmolLM2-135M when you need
2020
a real tokenizer/model).
2121

22+
## GPU Test Process Isolation
23+
24+
GPU smoke tests run in three separate single-process (`-n 0`) pytest invocations to avoid CUDA and import-time conflicts:
25+
26+
1. Local tiny-model tests (everything except SmolLM2 and Unsloth)
27+
2. SmolLM2 Hub download test (downloads ~270MB from HuggingFace)
28+
3. Unsloth backend test (process-isolated from DP tests)
29+
30+
Why: Unsloth monkey-patches transformers at import time, poisoning Opacus/DP if they share a process. CUDA device-side asserts also cascade across xdist workers. The Makefile `test-smoke-gpu` target handles the split automatically via `-k` filters.
31+
32+
Tests use pytestmark decorators:
33+
34+
```python
35+
pytestmark = [
36+
pytest.mark.requires_gpu,
37+
pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available"),
38+
pytest.mark.skipif(sys.platform == "darwin", reason="Not applicable on macOS"),
39+
]
40+
```
41+
42+
For SmolLM2 and Unsloth tests, add the marker to a test function:
43+
44+
```python
45+
@pytest.mark.usefixtures("_register_smollm2") # for SmolLM2 tests
46+
def test_full_pipeline_smollm2(...):
47+
...
48+
```
49+
2250
## Things that will bite you
2351

2452
- LoRA rank must be 8 (not 4). vLLM silently rejects rank 4. Use `lora_r=8`.
@@ -27,7 +55,6 @@ a real tokenizer/model).
2755
- Stub tokenizer vocab is 32000. If you change the tiny model config, keep `vocab_size=32000` or you'll get shape mismatches.
2856
- Always set `use_unsloth=False` unless you're specifically testing Unsloth. The `auto` default can pull it in and it monkey-patches transformers globally.
2957
- CPU tests need `optim="adamw_torch"`. The production default (`paged_adamw_32bit`) requires bitsandbytes CUDA kernels.
30-
- Unsloth tests run in a separate process. Unsloth patches transformers at import time, which breaks Opacus/DP if they share a process. The Makefile handles this automatically.
3158

3259
## What's in `conftest.py`?
3360

0 commit comments

Comments
 (0)