scripts/inc_woq_g2_bkc.md
Lines changed: 33 additions & 5 deletions
@@ -33,13 +33,42 @@ This script 1) converts official model weights from `torch.float8_e4m3fn` format
33
33
> [!NOTE]
34
34
> For INC WoQ requantization, make sure to:
35
35
> 1) Specify the path to the measurement files in the quantization configuration JSON file.
36
-
>
36
+
>
37
37
> 2) Set the `QUANT_CONFIG` environment variable to point to this configuration file.
38
-
>
38
+
>
39
39
> For more details, refer to the `INC WOQ ReQuant` section in the `single_16k_len_inc.sh` script.
40
40
41
+
### Configure the Measurement Statistics Results
42
+
43
+
The environment variable `INC_MEASUREMENT_DUMP_PATH_PREFIX` specifies the root directory where measurement statistics were saved.
44
+
The final path is constructed by joining this root directory with the `dump_stats_path` defined in the quantization JSON file specified by the `QUANT_CONFIG` environment variable.
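
The joining rule described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not INC's actual implementation: the helper name `resolve_dump_stats_path`, the example config contents, and the `/data/root` prefix are all hypothetical, and the sketch assumes a plain path join.

```python
import json
import os
import tempfile


def resolve_dump_stats_path(env=os.environ):
    """Hypothetical helper: join the measurement root with `dump_stats_path`.

    Assumes QUANT_CONFIG points to a JSON file that contains a
    `dump_stats_path` key, as described in the text above.
    """
    with open(env["QUANT_CONFIG"]) as f:
        config = json.load(f)
    # Fall back to an empty prefix when the variable is unset (an assumption).
    prefix = env.get("INC_MEASUREMENT_DUMP_PATH_PREFIX", "")
    # Note: os.path.join drops the prefix if dump_stats_path is absolute.
    return os.path.join(prefix, config["dump_stats_path"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_path = os.path.join(tmp, "quant_config.json")
        with open(config_path, "w") as f:
            # Illustrative value only; use the one from your own config file.
            json.dump({"dump_stats_path": "measurements/inc_output"}, f)
        env = {
            "QUANT_CONFIG": config_path,
            "INC_MEASUREMENT_DUMP_PATH_PREFIX": "/data/root",
        }
        print(resolve_dump_stats_path(env))
```

Under these assumptions, a relative `dump_stats_path` keeps the quantization JSON portable across machines, since only the environment variable has to change.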
45
+
46
+
#### Example
47
+
48
+
If the measurements are downloaded to `/path/to/vllm-fork/scripts/nc_workspace_measure_kvache`, the directory contains the following files:
49
+
50
+
```bash
51
+
user:vllm-fork$ pwd
52
+
/path/to/vllm-fork
53
+
user:vllm-fork$ ls -l ./scripts/nc_workspace_measure_kvache
54
+
-rw-r--r-- 1 user Software-SG 1949230 May 15 08:05 inc_measure_output_hooks_maxabs_0_8.json
55
+
-rw-r--r-- 1 user Software-SG 254451 May 15 08:05 inc_measure_output_hooks_maxabs_0_8_mod_list.json
56
+
-rw-r--r-- 1 user Software-SG 1044888 May 15 08:05 inc_measure_output_hooks_maxabs_0_8.npz
57
+
...
58
+
```
59
+
60
+
Then we export `INC_MEASUREMENT_DUMP_PATH_PREFIX=/path/to/vllm-fork`, and INC resolves the full path as follows: