Add documentation warning: Don’t use torch.profiler.profile context manager around Trainer methods #20864
**References:**
- https://github.com/pytorch/pytorch/issues/88472
- https://github.com/Lightning-AI/lightning/issues/16958
why this reference?
These issues capture user reports and maintainer confirmations of the error caused by wrapping Trainer methods in torch.profiler.profile. They are cited to support the reason for this doc warning.
Can you please elaborate on how #16958 ("Bump pytest from 7.2.0 to 7.2.2 in /requirements") captures the profiler issue?
Thanks for pointing out the confusion. I've removed the unrelated Lightning PR and forum links from the references. Only the directly relevant PyTorch issue is now cited to support this documentation warning.
What does this PR do?
This PR adds a prominent documentation warning to the Trainer class docstring (and optionally Profiler docs) to alert users not to wrap Trainer.fit(), Trainer.validate(), or similar methods inside a manual torch.profiler.profile context manager.
Instead, users are advised to use the profiler argument in the Trainer constructor, which is the robust and officially supported method for profiling with Lightning.
This change aims to prevent a common source of cryptic internal errors and crashes reported by users who attempt to manually profile the Trainer, and improves developer experience by making the correct usage obvious in the docs.
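For reviewers who want the two patterns side by side, here is a minimal sketch of the usage this warning recommends. It assumes the `lightning.pytorch` import path and the built-in `"simple"` profiler name. The helper names `build_profiled_trainer` and `fit_with_profiling` are hypothetical and exist only for illustration, as are the `model` and `train_dataloaders` arguments. The discouraged pattern appears only as a comment.

```python
def build_profiled_trainer(max_epochs: int = 1, profiler: str = "simple"):
    """Return a Trainer that profiles through its own `profiler` argument.

    Hypothetical helper for illustration; not part of Lightning.
    """
    # Imported lazily so this sketch can be loaded without Lightning installed.
    from lightning.pytorch import Trainer

    # Recommended: pass the profiler to the Trainer constructor so that
    # Lightning manages the profiler itself.
    return Trainer(max_epochs=max_epochs, profiler=profiler)


def fit_with_profiling(model, train_dataloaders):
    """Fit `model` with profiling enabled via the Trainer argument."""
    trainer = build_profiled_trainer()
    trainer.fit(model, train_dataloaders)
    return trainer


# Discouraged (the pattern the warning targets), shown as a comment only:
#
#   with torch.profiler.profile() as prof:
#       trainer.fit(model, train_dataloaders)
```

Calling `fit_with_profiling(model, train_dataloaders)` with a user-defined `LightningModule` and dataloader would run one epoch with the simple profiler. Passing a different value, such as `"advanced"` or `"pytorch"`, selects another built-in profiler without changing the call site.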
Fixes #20779
(You can fill in the relevant issue if one exists, such as #16958)
This PR does not introduce any breaking changes.
Before submitting
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist
Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Thanks for considering this! This documentation improvement should help prevent many user headaches and support requests.
📚 Documentation preview 📚: https://pytorch-lightning--20864.org.readthedocs.build/en/20864/