docs: add external data file handling section to PyTorch export tuto… #28764

Open
CandyButcher27 wants to merge 1 commit into microsoft:gh-pages from CandyButcher27:docs/external-data-file-handling

@CandyButcher27

Fixes #28763

Description

Added a new section "Handling External Data Files (Large Models)" to the PyTorch export tutorial.

Documents a common production gotcha: exporting large PyTorch models with opset 17+ automatically splits the output into model.onnx + model.onnx.data. InferenceSession fails if the .data file is missing.
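
A pre-flight check for this failure could look like the sketch below. It assumes the external data file sits next to the model and is named by appending `.data` to the model file name, as in the `model.onnx` + `model.onnx.data` pair above. `find_missing_external_data` is a hypothetical helper name, not part of ONNX Runtime.

```python
from pathlib import Path


def find_missing_external_data(model_path):
    """Return the expected sidecar path if it is absent, else None.

    Assumption: the external data file is named "<model file name>.data"
    and lives in the same directory as the model.
    """
    model_file = Path(model_path)
    sidecar = model_file.with_name(model_file.name + ".data")
    return None if sidecar.exists() else sidecar
```

A small model exported as a single file has no sidecar at all, so a non-`None` result only signals a problem when the export actually produced a `.data` file that was then left behind during deployment.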

Also documents how to merge the split files into a single self-contained .onnx file for distribution using the onnx library.
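
As a rough illustration of the merge step, the sketch below loads a split model with its external tensors and saves it again as one file. `merge_external_data` and the file names are hypothetical; only `onnx.load` and `onnx.save` come from the `onnx` package.

```python
def merge_external_data(src_path, dst_path):
    """Write a single-file copy of a model that uses external data."""
    # Imported lazily so this helper can be defined without onnx installed.
    import onnx

    # load_external_data=True reads the tensors from the sidecar file
    # into the in-memory model.
    model = onnx.load(src_path, load_external_data=True)
    # With the tensors embedded, a plain save writes one self-contained file.
    onnx.save(model, dst_path)
    return dst_path
```

For example, `merge_external_data("model.onnx", "model_merged.onnx")` would be run from the directory holding both `model.onnx` and `model.onnx.data`. This only works while the serialized model stays under protobuf's 2 GB message limit.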

Motivation and Context

This gap trips up anyone exporting modern PyTorch models for CPU deployment. Neither the Python inference tutorial nor the PyTorch export tutorial mentioned this behavior. Discovered while deploying an EfficientNet-B2 model exported with PyTorch 2.12 + opset 17.

@CandyButcher27
Author

@microsoft-github-policy-service agree

```python
import onnx

# Load the graph and read the tensors stored in the external .data file into memory.
model = onnx.load("model.onnx", load_external_data=True)
```
Member

@copilot what do you recommend for files larger than 2 GB (not supported by protobuf)?
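
One hedged way to frame the question is a size check before attempting any merge: if the model and its external data together approach the protobuf limit, the split layout has to stay. The sketch below assumes the `<model>.onnx.data` naming used in this PR and treats the on-disk sizes as an approximation of the merged size. `can_merge_into_single_file` is a hypothetical helper.

```python
from pathlib import Path

# Protobuf cannot serialize a single message larger than 2**31 - 1 bytes.
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def can_merge_into_single_file(model_path, limit_bytes=PROTOBUF_LIMIT_BYTES):
    """Estimate whether model + external data would fit in one protobuf file."""
    model_file = Path(model_path)
    # Assumption: the sidecar is named "<model file name>.data".
    sidecar = model_file.with_name(model_file.name + ".data")
    total = model_file.stat().st_size
    if sidecar.exists():
        total += sidecar.stat().st_size
    # Conservative comparison: anything at or above the limit stays split.
    return total < limit_bytes
```

When this returns `False`, the model would be shipped as the original pair of files rather than merged.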
