20 changes: 20 additions & 0 deletions examples/model_free_ptq/glm_4.6_fp8_block.py
@@ -0,0 +1,20 @@
from llmcompressor import model_free_ptq

Comment on lines +1 to +2
Contributor

medium

To support using os.path.basename for parsing the model path, please import the os module. It's good practice to group standard library imports first, followed by third-party imports.

Suggested change
from llmcompressor import model_free_ptq
import os
from llmcompressor import model_free_ptq

MODEL_ID = "zai-org/GLM-4.6"
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-FP8-BLOCK"
Contributor

medium

For better readability and robustness when parsing the model path, it's recommended to use os.path.basename to extract the model name. This avoids manual string splitting and is generally safer for handling paths.

Suggested change
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-FP8-BLOCK"
SAVE_DIR = os.path.basename(MODEL_ID.rstrip("/")) + "-FP8-BLOCK"
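To illustrate the reviewer's point, a quick sketch comparing the two approaches (the `MODEL_ID` value is taken from the example above; the local-path string is a hypothetical illustration):

```python
import os

MODEL_ID = "zai-org/GLM-4.6"

# Manual splitting and os.path.basename agree for a typical HF model stub:
print(MODEL_ID.rstrip("/").split("/")[-1])     # GLM-4.6
print(os.path.basename(MODEL_ID.rstrip("/")))  # GLM-4.6

# basename also handles local filesystem paths without extra splitting logic:
print(os.path.basename("/models/GLM-4.6".rstrip("/")))  # GLM-4.6
```

Both forms produce the same directory name here; `os.path.basename` simply states the intent (take the last path component) directly.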


# Apply FP8-Block to the model
# Once quantized, the model is saved
# using compressed-tensors to the SAVE_DIR.
model_free_ptq(
    model_stub=MODEL_ID,
    save_directory=SAVE_DIR,
    scheme="FP8_BLOCK",
    ignore=[
        "re:.*gate$",
        "lm_head",
        "model.embed_tokens",
    ],
    max_workers=15,
    device="cuda:0",
)