Description
This is where `quantize.py` saves the quantized model:
Line 97 in 9fd1ead
It looks like this:
```python
def save_checkpoint(self, cfg: DictConfig):
    ckpt_dict = self._model.state_dict()
    file_name = cfg.checkpointer.checkpoint_files[0].split(".")[0]
    output_dir = Path(cfg.checkpointer.output_dir)
    output_dir.mkdir(exist_ok=True)
    checkpoint_file = Path.joinpath(
        output_dir, f"{file_name}-{self._quantization_mode}".rstrip("-qat")
    ).with_suffix(".pt")
    torch.save(ckpt_dict, checkpoint_file)
```
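As a side note on the filename construction: `.rstrip("-qat")` presumably means to drop a literal `-qat` suffix from the mode name, but `str.rstrip` removes any trailing characters drawn from the set `{'-', 'q', 'a', 't'}`, so it only happens to behave as intended for some names. A minimal demonstration (the example strings are hypothetical, not actual torchtune mode names):

```python
# Works by coincidence: 'w' is not in the strip set, so stripping
# stops right after the "-qat" suffix has been consumed.
print("model-8da4w-qat".rstrip("-qat"))  # model-8da4w

# Silently eats characters that were never part of a "-qat" suffix:
# the trailing 'a' and '-' are both in the strip set.
print("model-2a".rstrip("-qat"))  # model-2

# A suffix check expresses the intent unambiguously.
name = "model-8da4w-qat"
if name.endswith("-qat"):
    name = name[: -len("-qat")]
print(name)  # model-8da4w
```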
This means that even if the input is multiple `.safetensors` files, the recipe collapses everything into a single `.pt` file, which makes the output much less attractive to use downstream.
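One way to preserve a multi-file layout would be to re-shard the state dict under a byte budget and emit a Hugging Face style `weight_map` index alongside the shards. The sketch below shows only the grouping and index logic; `shard_keys` and `build_index` are hypothetical helpers, not torchtune APIs, and the actual tensor serialization (e.g. via `safetensors`) is left out:

```python
def shard_keys(param_bytes: dict[str, int], max_shard_bytes: int) -> list[list[str]]:
    """Greedily group parameter names into shards under a byte budget."""
    shards: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, nbytes in param_bytes.items():
        # Start a new shard once adding this tensor would exceed the budget.
        if current and used + nbytes > max_shard_bytes:
            shards.append(current)
            current, used = [], 0
        current.append(name)
        used += nbytes
    if current:
        shards.append(current)
    return shards


def build_index(shards: list[list[str]], stem: str) -> dict:
    """Map each parameter name to the shard file that would contain it."""
    n = len(shards)
    weight_map = {}
    for i, shard in enumerate(shards, start=1):
        fname = f"{stem}-{i:05d}-of-{n:05d}.safetensors"
        for name in shard:
            weight_map[name] = fname
    return {"weight_map": weight_map}
```

For example, `shard_keys({"a": 8, "b": 6, "c": 5}, 12)` yields `[["a"], ["b", "c"]]`, and `build_index` then assigns `"a"` to `stem-00001-of-00002.safetensors` and the rest to the second shard.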