Description
System Info
safetensors version : 0.6.0
torch version : 2.6.0
torch_npu version : 2.6.0
ascend cann version : 8.2.RC1
npu version : 24.1.rc3
Hello, I'm using the document parsing component Docling on a Huawei Ascend server with the versions listed above. Docling raises an error while loading its safetensors model file, and I can't figure out why. The same code and model file work fine on a CUDA GPU but fail on the Ascend NPU. I'd appreciate any help in resolving this issue. The error message is as follows:
Traceback (most recent call last):
File "/usr/local/python3.11.0/bin/docling", line 8, in <module>
sys.exit(app())
^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/typer/main.py", line 341, in __call__
raise e
File "/usr/local/python3.11.0/lib/python3.11/site-packages/typer/main.py", line 324, in __call__
return get_command(self)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/typer/core.py", line 694, in main
return _main(
^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/typer/core.py", line 195, in _main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/typer/main.py", line 699, in wrapper
return callback(**use_params)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/cli/main.py", line 690, in convert
export_documents(
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/cli/main.py", line 192, in export_documents
for conv_res in conv_results:
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/document_converter.py", line 258, in convert_all
for conv_res in conv_res_iter:
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/document_converter.py", line 293, in _convert
for item in map(
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/document_converter.py", line 339, in _process_document
conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/document_converter.py", line 360, in _execute_pipeline
pipeline = self._get_pipeline(in_doc.format)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/document_converter.py", line 322, in _get_pipeline
self.initialized_pipelines[cache_key] = pipeline_class(
^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/pipeline/standard_pdf_pipeline.py", line 85, in __init__
TableStructureModel(
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling/models/table_structure_model.py", line 84, in __init__
self.tf_predictor = TFPredictor(
^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling_ibm_models/tableformer/data_management/tf_predictor.py", line 131, in __init__
self._model = self._load_model()
^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/docling_ibm_models/tableformer/data_management/tf_predictor.py", line 208, in _load_model
missing, unexpected = load_model(model, model_fn, device=self._device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/safetensors/torch.py", line 224, in load_model
to_removes = _remove_duplicate_names(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.0/lib/python3.11/site-packages/safetensors/torch.py", line 115, in _remove_duplicate_names
raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'_encoder._resnet.0.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.
Information
- The official example scripts
- My own modified scripts
Reproduction
def _remove_duplicate_names(
    state_dict: Dict[str, torch.Tensor],
    *,
    preferred_names: Optional[List[str]] = None,
    discard_names: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    if preferred_names is None:
        preferred_names = []
    preferred_names = set(preferred_names)
    if discard_names is None:
        discard_names = []
    discard_names = set(discard_names)
    shareds = _find_shared_tensors(state_dict)
    to_remove = defaultdict(list)
    for shared in shareds:
        complete_names = set(
            [name for name in shared if _is_complete(state_dict[name])]
        )
        if not complete_names:
            raise RuntimeError(
                "Error while trying to find names to remove to save state dict, but found no suitable name to keep"
                f" for saving amongst: {shared}. None is covering the entire storage.Refusing to save/load the model"
                " since you could be storing much more memory than needed. Please refer to"
                " https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an"
                " issue."
            )
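Note that the set in the error message contains a single name, `_encoder._resnet.0.weight`, so the tensor is not being rejected because it shares storage with another tensor; it is rejected because `_is_complete` returned False for it. The sketch below is a pure-Python model of what that check appears to test, written from my reading of safetensors and not copied from it: a tensor counts as complete when it starts at the base of its storage and its byte size equals the storage size. `TensorView`, its fields, the addresses, the tensor shape, and the 32 bytes of padding are all hypothetical values chosen for illustration. Whether the NPU backend really reports a larger storage size or a different base pointer is an assumption that would need checking on the Ascend machine.

```python
from dataclasses import dataclass


@dataclass
class TensorView:
    """Hypothetical stand-in for the tensor properties the check reads."""
    data_ptr: int        # address of the tensor's first element
    storage_ptr: int     # address where the backing storage starts
    nelement: int        # number of elements in the tensor
    itemsize: int        # bytes per element (4 for float32)
    storage_nbytes: int  # size the backend reports for the storage


def is_complete(t: TensorView) -> bool:
    """True when the tensor starts at the storage base and spans all of it."""
    starts_at_base = t.data_ptr == t.storage_ptr
    spans_storage = t.nelement * t.itemsize == t.storage_nbytes
    return starts_at_base and spans_storage


# Hypothetical float32 weight of shape (64, 3, 7, 7): 9408 elements.
NELEMENT = 64 * 3 * 7 * 7
NBYTES = NELEMENT * 4

# Storage is exactly as large as the tensor: the check passes.
exact = TensorView(0x1000, 0x1000, NELEMENT, 4, NBYTES)

# Assumed case: the backend reports extra padding bytes for the storage.
padded = TensorView(0x1000, 0x1000, NELEMENT, 4, NBYTES + 32)

# Assumed case: the tensor does not start at the storage base.
offset = TensorView(0x1040, 0x1000, NELEMENT, 4, NBYTES)

for label, view in [("exact", exact), ("padded", padded), ("offset", offset)]:
    print(label, is_complete(view))
```

If either assumed case matches what the NPU reports for this weight, every name in the group fails the check and `complete_names` ends up empty, which is the branch that raises the RuntimeError above.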
Expected behavior
The model should load on the Ascend NPU the same way it does on a CUDA GPU. Any help is appreciated!