Commit 0081473
Speed up slow unit/gpu/example tests (#1616)
### What does this PR do?
Type of change: test infrastructure / test speedups + CI stabilization
Make the test suite faster, `tests/unit` hermetic, and the CI lanes
stable, without losing coverage. Most changes are mechanical test/infra
edits; the buckets below cover the diff broadly.
**Unit tests — hermetic (no HF Hub):** toy local datasets/configs + the
local tiny tokenizer (with a checked-in chat template) replace Hub
assets; `tests/unit/conftest.py` enforces offline mode. Genuinely-HF
tests moved to `tests/gpu*` (e.g. the new
`tests/gpu/torch/utils/test_dataset_utils.py`). `CONTRIBUTING.md`
documents the hermetic-unit-test expectation.
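A minimal sketch of how a `conftest.py` could enforce offline mode, assuming the standard Hugging Face offline environment variables (`HF_HUB_OFFLINE`, `TRANSFORMERS_OFFLINE`, `HF_DATASETS_OFFLINE`). The helper name `enforce_offline` is hypothetical, and the actual `tests/unit/conftest.py` may do this differently:

```python
import os

# Standard Hugging Face switches that disable network access to the Hub.
OFFLINE_ENV = {
    "HF_HUB_OFFLINE": "1",
    "TRANSFORMERS_OFFLINE": "1",
    "HF_DATASETS_OFFLINE": "1",
}


def enforce_offline(env=os.environ):
    """Force offline mode (hypothetical helper, not the repo's actual code)."""
    for key, value in OFFLINE_ENV.items():
        env[key] = value
    return env


# In a conftest.py this would run at import time, so the variables are set
# before any test module imports `transformers` or `datasets`.
enforce_offline()
```

Setting the variables at conftest import time matters because the Hugging Face libraries typically read them when they are first imported.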
**Unit-test speedups (no coverage loss):** speculative (disable CPU
torch.compile), calibrator (fewer histogram bins), ONNX conv/dynamo
(smaller shapes + representative subset), Ruler/sparse-attention (local
tokenizer), data-parallel autoquant (world size 4→2). Shared
`tiny_tokenizer` fixture. The distributed test helper now uses a private
`spawn` context instead of mutating the global start method (avoids
cross-test contamination).
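The private-context idea can be sketched with the standard library alone. The function names below are hypothetical and only illustrate the pattern; the repo's distributed helper is not shown in this commit message:

```python
import multiprocessing as mp


def get_private_context():
    """Return a spawn context without touching the process-wide default."""
    # mp.set_start_method("spawn") would change global state for every later
    # test in the same process; get_context() returns an isolated context.
    return mp.get_context("spawn")


def run_workers(target, world_size, args=()):
    """Run target(rank, world_size, *args) in world_size spawned processes."""
    ctx = get_private_context()
    procs = [
        ctx.Process(target=target, args=(rank, world_size, *args))
        for rank in range(world_size)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    # Exit code 0 means the worker finished without an uncaught exception.
    return [proc.exitcode for proc in procs]
```

With the `spawn` start method, `target` has to be a picklable module-level function, since each child process re-imports the module that defines it.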
**Rarely-used autonas/fastnas tests:** heavy parametrize cases marked
`@pytest.mark.manual`, one representative kept per test (fastnas
preferred); lighter sibling tests still cover core behavior. The legacy
FSDP1 NAS distributed test is also dropped: FastNAS/AutoNAS aren't used
with either FSDP1 or FSDP2, and FSDP1 is superseded by the newer FSDP2
API, so a single FSDP2 case is kept as a sanity check.
**gpu_megatron:** deduplicate distributed worker pools by world_size
within a module (saves a redundant pool spin-up in multi-pool files;
module-scoped, no cross-module reuse).
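One way to picture the deduplication is a small cache keyed by `world_size` that lives for the duration of one test module. `WorkerPoolCache` and `fake_factory` are hypothetical names used for illustration; the real pool type and fixture wiring are not described here:

```python
class WorkerPoolCache:
    """Create at most one worker pool per world_size (illustrative sketch)."""

    def __init__(self, factory):
        self._factory = factory  # callable: world_size -> pool object
        self._pools = {}

    def get(self, world_size):
        # Reuse an existing pool for this world_size instead of spinning up
        # a second one.
        if world_size not in self._pools:
            self._pools[world_size] = self._factory(world_size)
        return self._pools[world_size]

    def close(self):
        # Tear down every pool when the owning module is done.
        for pool in self._pools.values():
            close = getattr(pool, "close", None)
            if callable(close):
                close()
        self._pools.clear()


# Stand-in factory that records how many pools were actually created.
created = []


def fake_factory(world_size):
    created.append(world_size)
    return {"world_size": world_size}


cache = WorkerPoolCache(fake_factory)
cache.get(2)
cache.get(2)  # reused, no second spin-up
cache.get(4)
```

Creating one such cache per module (for example from a module-scoped pytest fixture) would match the "no cross-module reuse" constraint stated above.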
**Example tests:** reduce per-test work via args that default to current
behavior (tests pass the fast values) — torch_onnx TRT optimization
level, diffusers calibration/inference steps, eagle `sample_size`,
megatron_bridge iters/calib, llm_sparsity data slice, export
safetensors-structure `calib_size`. Also enable the recently added
`gpt-oss` example tests in CI.
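The "default to current behavior, tests pass the fast values" pattern can be sketched with `argparse`. The flag `--calib_size` is named in this PR, but the default of 512, the `--n_steps` flag, and its default are placeholders chosen for illustration:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative example script")
    # Placeholder defaults: they stand in for whatever the script did before
    # the flag existed, so existing users see no behavior change.
    parser.add_argument("--calib_size", type=int, default=512,
                        help="number of calibration samples")
    parser.add_argument("--n_steps", type=int, default=30,
                        help="number of inference steps (hypothetical flag)")
    return parser


# A user running the script with no extra flags gets the prior behavior.
default_args = build_parser().parse_args([])

# A test passes small values to cut per-test work.
fast_args = build_parser().parse_args(["--calib_size", "8", "--n_steps", "2"])
```

Because only the tests override the values, the backward-compatibility claim in the checklist below holds as long as the defaults match the previously hard-coded constants.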
**Per-test timeouts:** `pytest-timeout` with a default per-directory
timeout (60s unit / 300s gpu+example) enforced in `tests/conftest.py`
(`timeout_func_only` in `pyproject.toml`), so a new test cannot silently
exceed the budget — an unmapped test dir crashes collection. A few
inherently slow tests carry explicit higher per-test overrides
(CUDA-compile, autotune, dflash).
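The per-directory lookup could look roughly like the sketch below. The 60s and 300s values come from this description; the exact directory keys, the function name `timeout_for`, and the error type are assumptions:

```python
from pathlib import PurePosixPath

# Default per-test budgets in seconds, keyed by the directory under tests/.
# The keys here are illustrative; the real mapping may list more directories.
DEFAULT_TIMEOUTS = {"unit": 60, "gpu": 300, "examples": 300}


def timeout_for(test_path):
    """Return the default timeout for a test file, failing on unmapped dirs."""
    parts = PurePosixPath(test_path).parts
    try:
        top = parts[parts.index("tests") + 1]
    except (ValueError, IndexError):
        raise RuntimeError(f"{test_path} is not under tests/") from None
    if top not in DEFAULT_TIMEOUTS:
        # Failing loudly is what keeps a new test directory from silently
        # running without a budget.
        raise RuntimeError(f"no default timeout mapped for tests/{top}")
    return DEFAULT_TIMEOUTS[top]
```

In a `conftest.py`, a `pytest_collection_modifyitems` hook could call such a function for each collected item and attach `pytest.mark.timeout(seconds)` from `pytest-timeout` unless the test already carries an explicit override.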
**CUDA kernel pre-compilation:** a dedicated `tests/gpu/_extensions`
test JIT-builds the conv3d implicit-GEMM kernel up front (collected
before the functional tests in the same process) so the one-time build
cost no longer lands on, and times out, the first functional test that
uses it. Mirrored into the `llm_ptq`/`vlm_ptq` example lanes.
**Test relocation & optional-dependency guards:** vLLM sparsity plugin
test moved to `tests/gpu_vllm` (drops the in-test `importorskip`);
diffusers-dependent unit test guarded with `importorskip("diffusers")`
for partial-install lanes; `gpt_oss` example test dir renamed to
`gpt-oss` to match the CI matrix.
**Diffusers test models:** shared model-path constants in
`tests/_test_utils/examples/models.py` consolidated/renamed and point at
tiny `hf-internal-testing` test pipes (SDXL/SD3/FLUX) so
cachify/quantize/export tests run on toy weights; `local_id`s
normalized.
**Shared dataset utils:** `examples/llm_sparsity/.../hf_pts.py` now uses
`get_dataset_dataloader` (drops the bespoke cnn_dailymail-only
`get_calib_dataloader`; supports any registered/HF/JSONL dataset,
includes attention_mask); `data_prep.py` gains `--max_samples`.
**CI workflows:** container image bumps (pytorch 26.04→26.05, TRT-LLM
rc16→rc17) and tightened lane timeouts (unit 30→15 min, gpu lanes
trimmed, onnx example lane 45 min).
**Imports at top of file:** in-function imports across the test suite
are moved to module top per the coding guideline, conservatively —
optional deps stay guarded (in-function or behind a module-level
`importorskip`) in `tests/unit` since the partial-install lane runs
without them, and build/hardware-availability imports (apex, triton,
megatron/transformer_engine, tensorrt_llm) plus `_test_utils` lazy
guards are left in place.
**Kernel warning filters:** the repeated `filterwarnings` blanket-ignore
in six `tests/gpu/torch/kernels/**` modules is consolidated into a
scoped hook in `tests/gpu/torch/kernels/conftest.py` (kernel tests only
— the rest of the suite keeps surfacing warnings).
**Eagle example speedups:** `torch.compile` (eagle recipe default) added
~2 min to every eagle training test; it's now disabled in the eagle
example tests except one smoke (`test_llama_eagle3[1-False]`), and the
downstream resume / AR-validate / export tests point at the compile-free
checkpoint. Measured: `test_ar_validate` 139s→17s, offline training
142s→22s, streaming 140s→23s — the compile path is still smoke-tested
once.
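For reference, the savings implied by the measurements above work out as follows (the timings are the ones quoted in this paragraph; the variable names are illustrative):

```python
# (before_seconds, after_seconds) as reported above.
timings = {
    "test_ar_validate": (139, 17),
    "offline training": (142, 22),
    "streaming": (140, 23),
}

summary = {
    name: {
        "saved_s": before - after,
        "speedup": round(before / after, 1),
    }
    for name, (before, after) in timings.items()
}

# Combined wall-clock time saved across the three tests.
total_saved = sum(entry["saved_s"] for entry in summary.values())

for name, entry in summary.items():
    print(f"{name}: saved {entry['saved_s']}s ({entry['speedup']}x faster)")
print(f"total saved: {total_saved}s")
```

Each test saves roughly two minutes, consistent with the ~2 min compile overhead mentioned above.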
**Example lanes install editable (`-e`):** so example scripts launched
as subprocesses resolve `modelopt` to the same source path as the test
process and reuse the pre-compiled CUDA-extension cache instead of
recompiling (~2 min/test); verified in the TRT-LLM container.
**Tiny test tokenizer:** `get_tiny_tokenizer` defaults to left padding
(what decoder-LM calibration expects) and ships a terse
generation-tagged chat template — replacing a verbose ChatML one that
inflated tokenized length on the 128-vocab tokenizer and broke the
offline-PTQ example tests' `max-seq-len` filter.
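To illustrate why left padding suits decoder-LM calibration, here is a small pure-Python sketch. `pad_batch` is a hypothetical stand-in, not the `get_tiny_tokenizer` implementation:

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad token-id lists to a common length on the given side."""
    width = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        fill = [pad_id] * (width - len(seq))
        if side == "left":
            padded.append(fill + list(seq))
        else:
            padded.append(list(seq) + fill)
    return padded


batch = [[11, 12, 13], [21]]
left = pad_batch(batch, pad_id=0, side="left")
right = pad_batch(batch, pad_id=0, side="right")

# With left padding the last position of every row is a real token, which is
# the position a decoder-only model continues from.
last_left = [row[-1] for row in left]
last_right = [row[-1] for row in right]
```

With right padding, the shorter row ends in a pad token, so anything that reads the final position sees padding rather than content.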
**Restored Hub-download coverage:** the live (ungated) HF dataset
round-trips exercising `get_dataset_samples`' download branch now live
in `tests/gpu/torch/utils/test_dataset_utils.py` (they had been dropped
from the hermetic unit file without a counterpart).
Individual file changes not explicitly called out above fall under this
general test/CI cleanup.
### Testing
Unit + the touched gpu_megatron files validated locally; example/GPU
lanes validated in CI.
### Before your PR is "*Ready for review*"
- Is this change backward compatible?: ✅ (tests + example CLI args
default to prior behavior)
- If you copied code from any other sources or added a new PIP
dependency: N/A
- Did you write any new necessary tests?: N/A (optimizes/relocates
existing tests)
- Did you update Changelog?: N/A
- Did you get Claude approval on this PR?: ❌ (pending)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **Chores**
  * Updated container image versions for PyTorch (26.04→26.05),
    TensorRT-LLM (1.3.0rc16→1.3.0rc17), and ONNX/TensorRT (26.04→26.05).
* **Tests**
  * Enhanced test isolation: unit tests now run hermetically without
    HuggingFace Hub access.
  * Optimized test runtime via smaller model/dataset parameters and
    parallel test caching.
  * Added CUDA extension availability tests and extended dataset utility
    coverage.
* **Documentation**
  * Updated testing guidelines in `CONTRIBUTING.md` to emphasize offline
    test design.
* **Chores**
  * Added pytest timeout configuration and improved CI/CD workflow
    efficiency with editable installs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent ca7eb64 commit 0081473
88 files changed
Lines changed: 907 additions & 723 deletions
File tree
- .github/workflows
- examples
- diffusers/quantization
- llm_eval
- llm_ptq/scripts
- llm_sparsity/weight_sparsity
- tests
- _test_utils
- examples
- onnx
- torch
- distributed
- quantization
- tokenizer
- examples
- diffusers
- gpt-oss
- llm_eval
- llm_ptq/_extensions
- llm_sparsity/weight_sparsity
- megatron_bridge
- speculative_decoding
- vlm_ptq/_extensions
- gpu_megatron
- torch
- export
- quantization
- plugins
- gpu_vllm/torch/sparsity/attention_sparsity
- gpu
- _extensions
- onnx/quantization
- autotune
- torch
- export
- kernels
- common/attention
- quantization/conv
- sparsity/attention
- nas
- quantization
- plugins
- sparsity/attention_sparsity
- utils
- regression/torch/speculative
- unit
- onnx
- autocast
- quantization
- recipe
- torch
- deploy/utils
- export
- nas
- quantization
- plugins
- sparsity/attention_sparsity
- speculative/plugins
- utils