Commit 0487c85

chore: add parallel + svdq config yaml
1 parent 0d5b267 commit 0487c85

1 file changed

Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
+parallelism_config:
+  ulysses_size: auto
+  attention_backend: native
+quantize_config:
+  quant_type: "svdq_nvfp4_r128_dq"
+  svdq_kwargs:
+    quantize_device: "cuda"
+    runtime_kernel: "v2"
+    fused_mlp: true
+  exclude_layers:
+    - "embedder"
+    - "embed"
+  verbose: false
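
The diff view does not preserve indentation, so the nesting of the added keys is not certain. As a rough sketch, the Python mapping below shows one plausible reading of the 13 added lines: `quantize_device`, `runtime_kernel`, and `fused_mlp` are assumed to sit under `svdq_kwargs`, while `exclude_layers` and `verbose` are assumed to sit directly under `quantize_config`. The `check_config` helper is hypothetical and only checks the basic shape. It is not part of the commit or of any library.

```python
# Assumed nesting of the added YAML; the indentation is a guess.
config = {
    "parallelism_config": {
        "ulysses_size": "auto",
        "attention_backend": "native",
    },
    "quantize_config": {
        "quant_type": "svdq_nvfp4_r128_dq",
        "svdq_kwargs": {
            "quantize_device": "cuda",
            "runtime_kernel": "v2",
            "fused_mlp": True,
        },
        "exclude_layers": ["embedder", "embed"],
        "verbose": False,
    },
}


def check_config(cfg: dict) -> list[str]:
    """Hypothetical shape check: return a list of problems, empty if none."""
    problems = []
    # Both top-level sections are expected to be mappings.
    for section in ("parallelism_config", "quantize_config"):
        if not isinstance(cfg.get(section), dict):
            problems.append(f"missing or non-mapping section: {section}")
    quant = cfg.get("quantize_config")
    if not isinstance(quant, dict):
        quant = {}
    # quant_type is a string identifier in the diff.
    if not isinstance(quant.get("quant_type"), str):
        problems.append("quantize_config.quant_type should be a string")
    # exclude_layers is a list of layer-name strings in the diff.
    layers = quant.get("exclude_layers", [])
    if not (isinstance(layers, list) and all(isinstance(n, str) for n in layers)):
        problems.append("quantize_config.exclude_layers should be a list of strings")
    return problems


if __name__ == "__main__":
    print(check_config(config))
```

If the real file nests the keys differently, only the `config` literal needs to change; the checks touch just the two top-level sections, `quant_type`, and `exclude_layers`.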

0 commit comments
