[Training] Any way to profile the training of the model? #22614

Open
@martinkorelic

Describe the issue

Hi, I was wondering whether there is a way to profile a training session, with details about the individual operations and memory allocations of the ONNX training model.
Something like this exists for inference, but I am not sure the same steps work when we create a training session.
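For reference, the inference-side flow mentioned above can be sketched in Python roughly as follows. `profile_inference` uses the `onnxruntime` Python API (`SessionOptions.enable_profiling`, `InferenceSession.end_profiling`); the helper names and the assumed trace layout (per-node events with `cat == "Node"` and an `args["op_name"]` field) are illustrative assumptions, not something confirmed for training sessions.

```python
import json
from collections import defaultdict


def profile_inference(model_path, feeds):
    """Run one profiled inference pass and return the path of the JSON trace."""
    # Imported lazily so the summarizer below can be used without onnxruntime.
    import onnxruntime as ort

    options = ort.SessionOptions()
    options.enable_profiling = True
    session = ort.InferenceSession(
        model_path, options, providers=["CPUExecutionProvider"]
    )
    session.run(None, feeds)
    # end_profiling() stops profiling and returns the name of the trace file.
    return session.end_profiling()


def summarize_events(events):
    """Sum the durations of per-node events by operator type, largest first."""
    totals = defaultdict(int)
    for event in events:
        # Assumed layout: node-level events carry cat == "Node" and args["op_name"].
        if event.get("cat") != "Node":
            continue
        op_name = event.get("args", {}).get("op_name")
        if op_name is None:
            continue
        totals[op_name] += event.get("dur", 0)
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def summarize_profile(profile_path):
    """Load a profile trace (a JSON list of events) and summarize it."""
    with open(profile_path) as f:
        return summarize_events(json.load(f))
```

Calling `summarize_profile(profile_inference("model.onnx", feeds))` (with a hypothetical model path and input dictionary) would give a per-operator time breakdown for inference. What I am looking for is the equivalent for a training session.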

Is there any other way to get details about the training computation of the ONNX model? Are there any plans to add such a feature?

To reproduce

Create a training session with profiling enabled via SessionOptions.EnableProfiling().
Observe that no profiling output is saved after the session is released.

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

PyTorch Version

Execution Provider

Default CPU

Execution Provider Library Version

No response

Labels

stale: issues that have not been addressed in a while; categorized by a bot
training: issues related to ONNX Runtime training; typically submitted using template
