Hey @justinchuby,
First off, thanks for the amazing project! I've been looking for something like this for a while.
I had a couple of questions about the fundamental difference between the `convert` and `embed` operations in this library. I have a model trained in-house (which I unfortunately can't share) that converts successfully but fails when running `embed`.

I am also trying to understand how I could assess the memory usage of the model based on the converted safetensors file (if you know of a way to do this directly through ONNX, I would be very interested to hear it). For example, would I be able to use the hf-mem tool to accurately estimate the VRAM usage of the ONNX model after conversion?
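For context on the memory question, here is what I have been trying so far. Since a safetensors file starts with a little-endian u64 header length followed by a JSON header listing each tensor's dtype and shape, a rough lower bound on weight memory can be computed by parsing just that header. This is only a sketch of my own (`estimate_safetensors_bytes` is my helper, not part of onnx-safetensors), and it ignores runtime overhead such as activations:

```python
import json
import struct

def estimate_safetensors_bytes(path):
    """Sum raw tensor sizes from a .safetensors header: a rough lower
    bound on the memory needed to hold the weights."""
    dtype_bytes = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2,
                   "I64": 8, "I32": 4, "I16": 2, "I8": 1,
                   "U8": 1, "BOOL": 1}
    with open(path, "rb") as f:
        # First 8 bytes: little-endian u64 giving the JSON header size.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    total = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, no tensor data
            continue
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n * dtype_bytes[info["dtype"]]
    return total

# Demo: hand-build a minimal safetensors file with one 2x3 float32 tensor.
header = {"w": {"dtype": "F32", "shape": [2, 3], "data_offsets": [0, 24]}}
hjson = json.dumps(header).encode()
with open("tiny.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(hjson)))
    f.write(hjson)
    f.write(b"\x00" * 24)

print(estimate_safetensors_bytes("tiny.safetensors"))  # → 24
```

My understanding is that tools like hf-mem work from this same header information, so I would expect the weight counts to agree; whether that translates to an accurate VRAM figure for the ONNX model at inference time is exactly what I am unsure about.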
```console
> uv run onnx-safetensors convert model.onnx model.safetensors
Loading ONNX model from model.onnx...
Converting model to safetensors format...
Saving model.safetensors (onnx::MatMul_12379): 100%|██████████| 380/380 [00:00<00:00, 3608.57it/s]
Model saved to model.safetensors

> uv run onnx-safetensors embed model.onnx model.safetensors
Loading ONNX model from model.onnx...
Embedding model into safetensors file...
Traceback (most recent call last):
  File "/Users/vidamoda/dev/inferentia/.venv/bin/onnx-safetensors", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_cli.py", line 96, in main
    embed_command(args)
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_cli.py", line 53, in embed_command
    onnx_safetensors.save_safetensors_model(model, output_path)
  File "/Users/vidamoda/dev/inferentia/.venv/lib/python3.12/site-packages/onnx_safetensors/_safetensors_io.py", line 640, in save_safetensors_model
    safetensors.serialize_file(tensor_dict, safetensors_model_path, metadata=metadata)
safetensors_rust.SafetensorError: Error while serializing: header too large
```
Thanks again for the project!