Commit 54fb87e
[OMNIML-4869] author_yaml (#1574)
Draft PR opened by **pensieve-intern** for
[OMNIML-4869](https://jirasw.nvidia.com/browse/OMNIML-4869).
Stage `author_yaml` of Epic `OMNIML-4868`. The agent ran from the SPEC
on the ticket description; review every change before marking ready.
_Always-draft is enforced — the bot never auto-merges._
---
**Agent's self-narration** (stripped from PR diff; surfaced here for
context):
`VERIFICATION_COMMENT.txt`:
```
OMNIML-4869 status: model not staged on cw_dfw.
Spec checks:
- HF Hub model exists: Qwen/Qwen3.5-4B (public, not gated; model_type=qwen3_5).
- Cluster stage check failed: /lustre/fsw/portfolios/coreai/projects/coreai_dlalgo_modelopt/hf-local/Qwen/Qwen3.5-4B not found.
Requested action on Epic: please stage the model via `/manage-assets cw_dfw get Qwen/Qwen3.5-4B`.
Notes:
- YAML already exists at tools/launcher/examples/Qwen/Qwen3.5-4B/specdec_bench_mtp_vllm.yaml with tp_size=1, gpus_per_node=1, and container vllm/vllm-openai:qwen3_5-cu130.
- Once staged, I can run the required dry-run: `uv run launch.py --yaml tools/launcher/examples/Qwen/Qwen3.5-4B/specdec_bench_mtp_vllm.yaml --dryrun --yes -v`.
```
_Pollution-strip removed `VERIFICATION_COMMENT.txt` from this commit
(sidecar narration and/or incidental lockfile regeneration are never
part of the agent's intended deliverable)._
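The cluster stage check in the narration above amounts to testing whether the model directory exists under the local HF mirror. The agent's actual check is not shown in this commit, so the following is only a minimal sketch: the function names are hypothetical, and treating an empty directory as "not staged" is an assumption.

```python
from pathlib import Path


def staged_model_dir(hf_local_root: str, repo_id: str) -> Path:
    """Return the directory where a staged copy of repo_id is expected.

    Assumes the mirror lays models out as <root>/<org>/<name>, matching
    the path quoted in the verification comment.
    """
    return Path(hf_local_root) / repo_id


def is_model_staged(hf_local_root: str, repo_id: str) -> bool:
    """Report whether the model directory exists and is non-empty.

    Assumption: an empty directory counts as not staged.
    """
    path = staged_model_dir(hf_local_root, repo_id)
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    # Root and repo id are taken from the verification comment.
    root = "/lustre/fsw/portfolios/coreai/projects/coreai_dlalgo_modelopt/hf-local"
    repo = "Qwen/Qwen3.5-4B"
    if is_model_staged(root, repo):
        print(f"{repo} is staged at {staged_model_dir(root, repo)}")
    else:
        print(f"{repo} not found at {staged_model_dir(root, repo)}")
```

A check like this only confirms that files are present; it does not validate that the checkpoint is complete or loadable.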
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
  * Added a benchmark configuration to evaluate speculative decoding (MTP)
    performance for the Qwen3.5-4B model, enabling separate speed and
    high-throughput (32k) runs with adjustable decoding and runtime
    parameters.
  * Configured execution settings for single-node GPU benchmarking and
    standardized container/runtime invocation for reproducible performance
    tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>

1 parent 7ae4ee7 commit 54fb87e
1 file changed
Lines changed: 67 additions & 0 deletions