
Difference between embed/convert? #53

@vrdn-23


Hey @justinchuby,

First off, thanks for the amazing project! I've been looking for something like this for a while.

I had a couple of questions about the fundamental difference between the convert and embed operations in this library. I have a model trained in-house (which I unfortunately can't share) that converts successfully but fails when running an embed.

I am also trying to understand which method would help me assess the memory usage of the model based on the converted safetensors file (if you know of a way to do this directly through ONNX, I would be very interested to hear it). For example, would I be able to use the hf-mem tool to accurately calculate the VRAM usage of the ONNX model after conversion?
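For what it's worth, the weight footprint can be read straight from the safetensors file itself, independent of this library or hf-mem: the format begins with an 8-byte little-endian header length followed by a JSON header that records each tensor's byte offsets. A minimal sketch (generic safetensors parsing, not an onnx-safetensors API; the function name is made up) that sums the tensor byte ranges:

```python
import json
import struct

def estimate_weight_bytes(path: str) -> int:
    """Estimate total weight bytes from a safetensors file's header.

    A .safetensors file starts with an 8-byte little-endian u64 giving the
    JSON header length, followed by the header itself, which maps tensor
    names to {"dtype", "shape", "data_offsets": [begin, end]}.
    """
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    total = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        begin, end = info["data_offsets"]
        total += end - begin
    return total
```

Note this only counts the stored weights, so at best it is a lower bound on VRAM: activations, workspace buffers, and any runtime-specific overhead come on top of it.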

~/dev/inferentia main ?30 > uv run onnx-safetensors convert model.onnx model.safetensors                                                                                                    10s py 3.12  14:30:02
Loading ONNX model from model.onnx...
Converting model to safetensors format...
Saving model.safetensors (onnx::MatMul_12379): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 380/380 [00:00<00:00, 3608.57it/s]
Model saved to model.safetensors
~/dev/inferentia main ?31 > uv run onnx-safetensors embed model.onnx model.safetensors                                                                                                          py 3.12  14:39:53
Loading ONNX model from model.onnx...
Embedding model into safetensors file...
Traceback (most recent call last):
  File "/Users/vidamoda/dev/inferentia/.venv/bin/onnx-safetensors", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_cli.py", line 96, in main
    embed_command(args)
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_cli.py", line 53, in embed_command
    onnx_safetensors.save_safetensors_model(model, output_path)
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_safetensors_io.py", line 640, in save_safetensors_model
    safetensors.serialize_file(tensor_dict, safetensors_model_path, metadata=metadata)
safetensors_rust.SafetensorError: Error while serializing: header too large


Thanks again for the project!
