Commit 93a12f1

remove CMP from all 2of4 examples
Signed-off-by: Kyle Sayers <[email protected]>
1 parent acca004 commit 93a12f1

4 files changed: 2 additions, 49 deletions

examples/quantization_2of4_sparse_w4a16/2of4_w4a16_group-128_recipe.yaml

14 deletions
@@ -6,20 +6,6 @@ sparsity_stage:
       mask_structure: "2:4"
       targets: ["Linear"]
       ignore: ["re:.*lm_head"]
-finetuning_stage:
-  run_type: train
-  finetuning_modifiers:
-    ConstantPruningModifier:
-      targets: [
-        're:.*q_proj.weight',
-        're:.*k_proj.weight',
-        're:.*v_proj.weight',
-        're:.*o_proj.weight',
-        're:.*gate_proj.weight',
-        're:.*up_proj.weight',
-        're:.*down_proj.weight',
-      ]
-      start: 0
 quantization_stage:
   run_type: oneshot
   quantization_modifiers:

examples/quantization_2of4_sparse_w4a16/2of4_w4a16_recipe.yaml

14 deletions
@@ -6,20 +6,6 @@ sparsity_stage:
       mask_structure: "2:4"
       targets: ["Linear"]
       ignore: ["re:.*lm_head"]
-finetuning_stage:
-  run_type: train
-  finetuning_modifiers:
-    ConstantPruningModifier:
-      targets: [
-        're:.*q_proj.weight',
-        're:.*k_proj.weight',
-        're:.*v_proj.weight',
-        're:.*o_proj.weight',
-        're:.*gate_proj.weight',
-        're:.*up_proj.weight',
-        're:.*down_proj.weight',
-      ]
-      start: 0
 quantization_stage:
   run_type: oneshot
   quantization_modifiers:

examples/sparse_2of4_quantization_fp8/README.md

2 additions, 10 deletions
@@ -63,21 +63,13 @@ recipe = [
 ]
 
 if fp8_enabled:
-    recipe.extend([
+    recipe.append(
         QuantizationModifier(
             targets=["Linear"],
             ignore=["lm_head"],
             scheme="FP8_DYNAMIC",
         ),
-        ConstantPruningModifier(
-            targets=[
-                r"re:.*q_proj.weight", r"re:.*k_proj.weight", r"re:.*v_proj.weight",
-                r"re:.*o_proj.weight", r"re:.*gate_proj.weight", r"re:.*up_proj.weight",
-                r"re:.*down_proj.weight",
-            ],
-            start=0,
-        ),
-    ])
+    )
 ```
 
 2. **Apply Compression**
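
For readers skimming the hunk above, the README snippet after this change can be sketched roughly as follows. This is a minimal illustration, not the README's actual code. `PlaceholderModifier` and `build_recipe` are hypothetical names, and the `QuantizationModifier` class here is a local stand-in that only mirrors the keyword arguments visible in the diff, so the sketch runs without the real library installed. The first element of `recipe` is not visible in this hunk, which is why a placeholder is used.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in: the real first element of `recipe` is outside the hunk.
@dataclass
class PlaceholderModifier:
    name: str = "first-stage modifier (not shown in the diff)"


# Local stand-in mirroring only the arguments that appear in the diff.
@dataclass
class QuantizationModifier:
    targets: list = field(default_factory=list)
    ignore: list = field(default_factory=list)
    scheme: str = ""


def build_recipe(fp8_enabled: bool) -> list:
    """Assemble the modifier list the way the updated snippet does."""
    recipe = [PlaceholderModifier()]
    if fp8_enabled:
        # With ConstantPruningModifier gone, only one modifier is added,
        # so append() replaces the earlier extend([...]) call.
        recipe.append(
            QuantizationModifier(
                targets=["Linear"],
                ignore=["lm_head"],
                scheme="FP8_DYNAMIC",
            ),
        )
    return recipe
```

Calling `build_recipe(True)` yields a two-element list whose second entry is the quantization modifier, while `build_recipe(False)` leaves the list with only its first entry.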

tests/e2e/vLLM/recipes/Sparse_2of4/recipe_sparse_2of4_fp8_dynamic.yaml

11 deletions
@@ -9,17 +9,6 @@ sparsity_stage:
 quantization_stage:
   run_type: oneshot
   quantization_modifiers:
-    ConstantPruningModifier:
-      targets: [
-        're:.*q_proj.weight',
-        're:.*k_proj.weight',
-        're:.*v_proj.weight',
-        're:.*o_proj.weight',
-        're:.*gate_proj.weight',
-        're:.*up_proj.weight',
-        're:.*down_proj.weight',
-      ]
-      start: 0
     QuantizationModifier:
       targets: ["Linear"]
       ignore: ["lm_head"]
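
One possible way to confirm that none of the four touched files still mentions the removed modifier is a plain text scan. The sketch below is an assumption about how a reviewer might check this, not part of the commit. `find_mentions` and `scan` are hypothetical helper names, and the paths are the ones listed in this diff, assumed to be relative to the repository root.

```python
from pathlib import Path

# Paths taken from this commit's file list (relative to the repository root).
CHANGED_FILES = [
    "examples/quantization_2of4_sparse_w4a16/2of4_w4a16_group-128_recipe.yaml",
    "examples/quantization_2of4_sparse_w4a16/2of4_w4a16_recipe.yaml",
    "examples/sparse_2of4_quantization_fp8/README.md",
    "tests/e2e/vLLM/recipes/Sparse_2of4/recipe_sparse_2of4_fp8_dynamic.yaml",
]


def find_mentions(text: str, name: str = "ConstantPruningModifier") -> list[int]:
    """Return 1-based numbers of the lines in `text` that contain `name`."""
    return [i for i, line in enumerate(text.splitlines(), start=1) if name in line]


def scan(paths, name: str = "ConstantPruningModifier") -> dict[str, list[int]]:
    """Map each existing file that still mentions `name` to its matching lines."""
    hits = {}
    for p in paths:
        path = Path(p)
        if not path.is_file():
            continue  # skip quietly when not run from the repository root
        lines = find_mentions(path.read_text(encoding="utf-8"), name)
        if lines:
            hits[p] = lines
    return hits
```

Running `scan(CHANGED_FILES)` from a checkout at this commit would be expected to return an empty dictionary if the removal is complete. A substring scan is deliberately simple and does not parse YAML, so it would also flag mentions inside comments.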
