Commit 01d2518

OMNIML-4668 — hidden_state_dump_support: step2_hidden.yaml for Qwen3-8B
Agent-authored via pensieve-intern's hidden_state_dump_support stage on Epic OMNIML-4666. Faithful extraction of task_1 (dump_offline_data.sh — hidden-state capture from target model) from hf_offline_eagle3.yaml's monolithic pipeline, renamed task_0 for the standalone step convention.

Signed-off-by: Chenhan D. Yu <chenhany@nvidia.com>
1 parent 62401e1 commit 01d2518
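
The commit message describes lifting one task out of a monolithic pipeline and renaming it task_0. A minimal sketch of that operation on plain dictionaries follows. It is illustrative only: `extract_step` is a hypothetical helper, not part of the launcher, and the scripts of the other tasks are placeholders because this commit does not show them.

```python
import copy


def extract_step(pipeline: dict, source_task: str, new_name: str = "task_0") -> dict:
    """Build a standalone pipeline that keeps shared settings and one renamed task."""
    if source_task not in pipeline:
        raise KeyError(f"{source_task} is not defined in the pipeline")
    # Keep every non-task key (flags, global_vars) and drop all other tasks.
    standalone = {
        key: copy.deepcopy(value)
        for key, value in pipeline.items()
        if not key.startswith("task_")
    }
    # Re-insert the chosen task under the standalone step name.
    standalone[new_name] = copy.deepcopy(pipeline[source_task])
    return standalone


# Hypothetical stand-in for the monolithic pipeline; only task_1's script
# comes from the commit, the other entries are placeholders.
monolithic = {
    "allow_to_fail": False,
    "skip": False,
    "global_vars": {"hf_model": "/hf-local/Qwen/Qwen3-8B"},
    "task_0": {"script": "<not shown in this commit>"},
    "task_1": {"script": "common/eagle3/dump_offline_data.sh"},
    "task_2": {"script": "<not shown in this commit>"},
    "task_3": {"script": "<not shown in this commit>"},
}

step2 = extract_step(monolithic, "task_1")
print(step2["task_0"]["script"])
```

Deep-copying keeps the source pipeline untouched, which matches the "faithful extraction" wording: the standalone step differs from the original task only in its key.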

1 file changed

Lines changed: 40 additions & 0 deletions

@@ -0,0 +1,40 @@
+# EAGLE3 offline speculative decoding pipeline for Qwen3-8B.
+#
+# 4-step pipeline:
+#   task_0: Data synthesis — query TRT-LLM server to generate prompt samples
+#   task_1: Dump hidden states — run target model to capture hidden states
+#   task_2: Offline training — train the EAGLE3 draft head
+#   task_3: Benchmark — evaluate speculative decoding speedup via VLLM
+#
+# All tasks share /scratchspace to pass artifacts between steps.
+#
+# Usage:
+#   uv run launch.py --yaml examples/Qwen/Qwen3-8B/hf_offline_eagle3.yaml --yes
+#   uv run slurm.py --yaml modules/Model-Optimizer/tools/launcher/examples/Qwen/Qwen3-8B/hf_offline_eagle3.yaml --yes
+
+job_name: Qwen3-8B_EAGLE3_hidden_dump
+pipeline:
+  allow_to_fail: false
+  skip: false
+  note:
+
+  global_vars:
+    hf_model: /hf-local/Qwen/Qwen3-8B
+
+  # Step 2: Dump hidden states from target model
+  task_0:
+    script: common/eagle3/dump_offline_data.sh
+    args:
+      - --input-data /scratchspace/data
+      - --output-dir /scratchspace/offline_hidden_states
+      - --max-seq-len 8192
+      - --tp 8
+      - --moe-ep 8
+    environment:
+      - HF_MODEL_CKPT: <<global_vars.hf_model>>
+    slurm_config:
+      _factory_: "slurm_factory"
+      nodes: 1
+      ntasks_per_node: 8
+      gpus_per_node: 8
+      container: nvcr.io/nvidia/tensorrt-llm/release:1.2.0
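
The `environment` entry refers to `<<global_vars.hf_model>>`, so the launcher presumably substitutes the value declared under `global_vars` before the task runs. The commit does not show how that substitution is implemented. The sketch below is one plausible way to do it, assuming the dotted path is looked up relative to the mapping that holds `global_vars`; `resolve` and `lookup` are hypothetical names, not launcher APIs.

```python
import re

# Matches placeholders of the form <<dotted.path>>.
PLACEHOLDER = re.compile(r"<<([A-Za-z0-9_.]+)>>")


def lookup(config: dict, dotted: str):
    """Walk a dotted path such as 'global_vars.hf_model' through nested dicts."""
    node = config
    for key in dotted.split("."):
        node = node[key]
    return node


def resolve(value, config: dict):
    """Recursively replace <<...>> placeholders in strings, lists and dicts."""
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(lookup(config, m.group(1))), value)
    if isinstance(value, list):
        return [resolve(item, config) for item in value]
    if isinstance(value, dict):
        return {key: resolve(item, config) for key, item in value.items()}
    return value


# Trimmed copy of the values in the diff (assumed nesting, see lead-in).
pipeline = {
    "global_vars": {"hf_model": "/hf-local/Qwen/Qwen3-8B"},
    "task_0": {
        "script": "common/eagle3/dump_offline_data.sh",
        "environment": [{"HF_MODEL_CKPT": "<<global_vars.hf_model>>"}],
    },
}

resolved_task = resolve(pipeline["task_0"], pipeline)
print(resolved_task["environment"][0]["HF_MODEL_CKPT"])
```

Under these assumptions, `HF_MODEL_CKPT` resolves to `/hf-local/Qwen/Qwen3-8B`, and strings without placeholders pass through unchanged.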
