feat: add filter_keys to log only specified device stats #21707
Pull request overview
Adds an optional filter_keys argument to DeviceStatsMonitor to allow users to restrict which device stats keys get logged, with a warning for unknown keys to help catch typos.
Changes:
- Extend `DeviceStatsMonitor.__init__` with `filter_keys: Optional[set[str]]` and apply filtering before logging.
- Emit a `rank_zero_warn` when `filter_keys` contains keys not present in collected stats.
- Add unit tests covering key filtering behavior and warning emission for unknown keys.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/lightning/pytorch/callbacks/device_stats_monitor.py` | Adds `filter_keys` support, warning on unknown keys, and updates docstring/examples. |
| `tests/tests_pytorch/callbacks/test_device_stats_monitor.py` | Adds tests validating filtering behavior and warning on unrecognized keys. |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #21707 +/- ##
=========================================
- Coverage 87% 79% -8%
=========================================
Files 270 267 -3
Lines 23973 23926 -47
=========================================
- Hits 20748 18814 -1934
- Misses 3225 5112 +1887
What does this PR do?
Adds a `filter_keys: Optional[Set[str]] = None` argument to `DeviceStatsMonitor`.
When provided, only the specified keys from `get_device_stats()` are logged.
If a key in `filter_keys` is not found in the collected stats, a `rank_zero_warn` is emitted to catch typos early.

Fixes #11796
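The behavior described above can be sketched in plain Python. This is a minimal standalone illustration under stated assumptions, not the code from this PR: the helper name `filter_device_stats` is hypothetical, the sample stat keys are chosen only for illustration, and the standard-library `warnings.warn` stands in for Lightning's `rank_zero_warn` so the snippet runs without Lightning installed.

```python
import warnings
from typing import Optional


def filter_device_stats(
    device_stats: dict[str, float],
    filter_keys: Optional[set[str]] = None,
) -> dict[str, float]:
    """Return only the entries of ``device_stats`` named in ``filter_keys``.

    Hypothetical helper for illustration; ``warnings.warn`` stands in for
    Lightning's ``rank_zero_warn``.
    """
    # No filter requested: keep every collected stat.
    if filter_keys is None:
        return dict(device_stats)

    # Warn instead of raising when a requested key matches nothing,
    # so a typo is surfaced without aborting the run.
    unknown = filter_keys - set(device_stats)
    if unknown:
        warnings.warn(
            f"filter_keys contains keys not found in device stats: {sorted(unknown)}",
            UserWarning,
        )

    return {key: value for key, value in device_stats.items() if key in filter_keys}


# Sample stats dictionary (illustrative keys and values only).
sample_stats = {"cpu_percent": 12.5, "cpu_vm_percent": 40.0, "cpu_swap_percent": 0.0}
print(filter_device_stats(sample_stats, {"cpu_percent"}))
```

Warning rather than raising keeps a long training run alive when a key is misspelled, at the cost of that metric silently missing from the logs until the warning is noticed.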
Motivation
The original issue requested fine-grained control over which device metrics get logged.
The `cpu_stats` flag added in #11795 resolved the CPU+GPU simultaneous logging gap.
This PR addresses the remaining open ask: filtering which specific metric keys are tracked,
without introducing per-accelerator flags or a new class hierarchy.
Changes
- `DeviceStatsMonitor.__init__` accepts a new `filter_keys` argument
- Filtering is applied uniformly across all accelerators
- Unrecognized keys trigger `rank_zero_warn` rather than raising, to avoid breaking runs over a typo
Usage
Tests
- `test_device_stats_monitor_filter_keys`: parametrized, verifies correct keys are present/absent on CPU
- `test_device_stats_monitor_filter_keys_unrecognized_warns`: verifies `UserWarning` is emitted for unknown keys
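The shape of these two tests can be sketched with the standard library alone. This is an assumption-laden illustration, not the tests added in this PR: `select_stats` is a hypothetical stand-in for the callback's filtering step, the stat keys are sample values, and `unittest` with `subTest` is used in place of whatever parametrization the real test suite relies on.

```python
import unittest
import warnings


def select_stats(stats, filter_keys):
    """Hypothetical stand-in for the callback's filtering step."""
    unknown = set(filter_keys) - set(stats)
    if unknown:
        # The real callback is described as using rank_zero_warn here.
        warnings.warn(f"Unrecognized filter_keys: {sorted(unknown)}", UserWarning)
    return {key: value for key, value in stats.items() if key in filter_keys}


class FilterKeysTest(unittest.TestCase):
    # Sample stats; keys and values are illustrative only.
    STATS = {"cpu_percent": 1.0, "cpu_vm_percent": 2.0, "cpu_swap_percent": 3.0}

    def test_only_requested_keys_are_kept(self):
        cases = [
            ({"cpu_percent"}, {"cpu_percent"}),
            ({"cpu_percent", "cpu_vm_percent"}, {"cpu_percent", "cpu_vm_percent"}),
        ]
        for filter_keys, expected in cases:
            with self.subTest(filter_keys=filter_keys):
                kept = set(select_stats(self.STATS, filter_keys))
                self.assertEqual(kept, expected)

    def test_unknown_key_warns(self):
        # An unknown key should warn, not raise.
        with self.assertWarns(UserWarning):
            select_stats(self.STATS, {"cpu_percent", "not_a_real_key"})
```

Saved to a file, the sketch can be run with `python -m unittest <file>`.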
📚 Documentation preview 📚: https://pytorch-lightning--21707.org.readthedocs.build/en/21707/