Description
System Info
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- transformers version: 4.43.4
- Platform: Linux-6.1.0-37-amd64-x86_64-with-glibc2.35
- Python version: 3.10.14
- Huggingface_hub version: 0.27.1
- Safetensors version: 0.6.2
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.8.0+cu128 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: no
- Using GPU in script?: yes
- GPU type: NVIDIA RTX 4000 Ada Generation
Information
- The official example scripts
- My own modified scripts
Reproduction
This problem has already been reported here: huggingface/accelerate#3101 (comment)
However, the workaround suggested there, safe_serialization=False, no longer seems to work, and it only partially solves the problem anyway.
The following error occurs when I try to save a custom model containing a torch.nn.GRU with model.save_pretrained('./test'):
Traceback (most recent call last):
  File "/path/main.py", line 156, in <module>
    model.save_pretrained('./test')
  File "/opt/miniconda3/envs/customlipsync/lib/python3.10/site-packages/huggingface_hub/hub_mixin.py", line 408, in save_pretrained
    self._save_pretrained(save_directory)
  File "/opt/miniconda3/envs/customlipsync/lib/python3.10/site-packages/huggingface_hub/hub_mixin.py", line 755, in _save_pretrained
    save_model_as_safetensor(model_to_save, str(save_directory / constants.SAFETENSORS_SINGLE_FILE))
  File "/opt/miniconda3/envs/customlipsync/lib/python3.10/site-packages/safetensors/torch.py", line 169, in save_model
    to_removes = _remove_duplicate_names(state_dict)
  File "/opt/miniconda3/envs/customlipsync/lib/python3.10/site-packages/safetensors/torch.py", line 113, in _remove_duplicate_names
    raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'gru.weight_ih_l0'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.
A possible fix might be to add a check for the case where there is only one shared tensor, like the one I found here: https://huggingface.co/spaces/safetensors/convert/blob/main/convert.py#L54, to the torch implementation here (I guess):
    if not complete_names:
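To illustrate the kind of check meant here, below is a minimal sketch of the idea applied to a plain state dict before saving. It is not the safetensors implementation: the helper names covers_entire_storage and clone_lone_partial_views are hypothetical, and the toy tensor only imitates a GRU weight that is a view into a larger flat buffer. The assumption is that a tensor which is the sole user of its storage, but does not cover all of it, can safely be cloned so that it owns exactly the memory it needs.

```python
from collections import defaultdict

import torch


def covers_entire_storage(tensor: torch.Tensor) -> bool:
    """True if the tensor starts at the base of its storage and spans all of it."""
    storage = tensor.untyped_storage()
    starts_at_base = tensor.data_ptr() == storage.data_ptr()
    same_size = tensor.nelement() * tensor.element_size() == storage.nbytes()
    return starts_at_base and same_size


def clone_lone_partial_views(state_dict: dict) -> dict:
    """Hypothetical helper: clone tensors that are the only name on their
    storage but do not cover it, so each ends up owning a compact storage."""
    # Group parameter names by the storage they live in.
    groups = defaultdict(list)
    for name, tensor in state_dict.items():
        groups[tensor.untyped_storage().data_ptr()].append(name)

    fixed = dict(state_dict)
    for names in groups.values():
        # Only the single-name case is handled; genuinely shared tensors
        # are left untouched for the caller to deal with.
        if len(names) == 1 and not covers_entire_storage(state_dict[names[0]]):
            fixed[names[0]] = state_dict[names[0]].clone()
    return fixed


# Toy stand-in for a GRU weight that is a view into a larger flat buffer.
flat_buffer = torch.arange(12, dtype=torch.float32)
state_dict = {
    "gru.weight_ih_l0": flat_buffer[:6].view(2, 3),
    "linear.weight": torch.ones(2, 2),
}
fixed_state_dict = clone_lone_partial_views(state_dict)
```

The resulting dict could then be passed to a saving routine instead of the original state dict. Note that cloning costs extra memory for the affected tensors, and tensors that really do share storage with each other are deliberately not touched by this sketch.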
Expected behavior
The model is saved without an error.
Silverster98