# 🚀 Your All-in-One Tool for Pytorch Model Analysis 🚀
- Docs: https://docs.torchmeter.top (backup link)
- Intro: Provides comprehensive measurement of a Pytorch model's Parameters, FLOPs/MACs, Memory-Cost, Inference-Time and Throughput, with highly customizable result display ✨
## ① Zero-Intrusion Proxy

- Acts as a drop-in decorator without any changes to the underlying model
- Seamlessly integrates with Pytorch modules while preserving full compatibility (attributes and methods)
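The proxy behavior described above can be sketched in plain Python: a thin wrapper forwards any attribute or method it does not define itself to the wrapped object via `__getattr__`. This is an illustrative sketch of the general pattern only, not TorchMeter's actual implementation; the class and method names below are invented:

```python
class Proxy:
    """Minimal zero-intrusion wrapper: adds features without touching the wrappee."""

    def __init__(self, wrapped):
        # object.__setattr__ bypasses any custom attribute logic on the proxy
        object.__setattr__(self, "_wrapped", wrapped)

    def __getattr__(self, name):
        # Called only when `name` is not found on the proxy itself, so every
        # attribute/method of the underlying object stays reachable unchanged
        return getattr(self._wrapped, name)

    def extra_feature(self):
        # Functionality layered on by the proxy, invisible to the wrapped object
        return f"measuring {type(self._wrapped).__name__}"


class Model:
    def __init__(self):
        self.example_attr = "ABC"

    def forward(self, x):
        return x * 2


proxied = Proxy(Model())
print(proxied.example_attr)     # "ABC" -> delegated to Model
print(proxied.forward(3))       # 6    -> delegated to Model
print(proxied.extra_feature())  # provided by the proxy itself
```

Because `__getattr__` fires only for names the wrapper lacks, measurement features can be layered on top while the underlying model's interface stays fully usable.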
## ② Full-Stack Model Analytics

Holistic performance analytics across 5 dimensions:
1. **Parameter Analysis**
   - Total/trainable parameter quantification
   - Layer-wise parameter distribution analysis
   - Gradient state tracking (`requires_grad` flags)

2. **Computational Profiling**
   - FLOPs/MACs precision calculation
   - Operation-wise calculation distribution analysis
   - Dynamic input/output detection (number, type, shape, ...)

3. **Memory Diagnostics**
   - Input/output tensor memory awareness
   - Hierarchical memory consumption analysis

4. **Inference Latency** & 5. **Throughput Benchmarking**
   - Auto warm-up phase execution (eliminates cold-start bias)
   - Device-specific high-precision timing
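The warm-up-then-measure pattern behind the latency and throughput dimensions can be sketched generically. This is an illustrative sketch, not TorchMeter's internals; `fn` stands in for a model's forward pass, and the default `warmup`/`repeat` counts are arbitrary:

```python
import time

def benchmark(fn, *args, warmup=10, repeat=50):
    """Time `fn(*args)`: discard warm-up runs, then average the timed runs."""
    for _ in range(warmup):
        # Warm-up absorbs one-off costs: caches, lazy init, JIT, autotuning...
        fn(*args)

    start = time.perf_counter()
    for _ in range(repeat):
        fn(*args)
    elapsed = time.perf_counter() - start

    latency = elapsed / repeat     # seconds per call
    throughput = repeat / elapsed  # calls per second
    return latency, throughput

lat, thr = benchmark(lambda x: sum(i * i for i in range(x)), 10_000)
print(f"latency ~ {lat * 1e3:.3f} ms, throughput ~ {thr:.1f} it/s")
```

On a real accelerator one would additionally synchronize the device around the timed region; that device-specific step is what the "device-specific high-precision timing" bullet refers to.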
## ③ Rich Visualization

1. **Programmable tabular report**
   - Dynamic table structure adjustment
   - Style customization and real-time rendering
   - Real-time data analysis in a programmable way

2. **Rich-text hierarchical operation tree**
   - Style customization and real-time rendering
   - Smart module folding based on structural-equivalence detection, for intuitive insights into the model structure
## ④ Fine-Grained Customization
- Real-time hot-reload rendering: Dynamic adjustment of rendering configuration for operation trees, report tables and their nested components
- Progressive update: Namespace assignment + dictionary batch update
## ⑤ Config-Driven Runtime Management
- Centralized control: Singleton-managed global configuration for dynamic behavior adjustment
- Portable presets: Export/import YAML profiles for runtime behaviors, eliminating repetitive setup
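The ideas above (a singleton config, progressive updates via namespace assignment or dictionary batch update, and exportable presets) can be sketched generically. Everything here is illustrative: the field names `render_interval` and `show_unit` are placeholders, and JSON stands in for the YAML profiles the library actually uses:

```python
import json
import os
import tempfile

class Config:
    """Singleton-style config: every caller shares one instance."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # Placeholder defaults, not TorchMeter's real option names
            cls._instance.__dict__.update(render_interval=0.15, show_unit=True)
        return cls._instance

    def update(self, **kwargs):
        """Dictionary-style batch update."""
        self.__dict__.update(kwargs)

    def dump(self, path):
        """Export current settings as a portable preset."""
        with open(path, "w") as f:
            json.dump(self.__dict__, f)

    def load(self, path):
        """Re-import a previously exported preset."""
        with open(path) as f:
            self.__dict__.update(json.load(f))

cfg = Config()
cfg.render_interval = 0.5    # progressive update: namespace assignment
cfg.update(show_unit=False)  # progressive update: dictionary batch update

preset = os.path.join(tempfile.gettempdir(), "torchmeter_preset.json")
cfg.dump(preset)             # portable preset: reuse across sessions
assert Config() is cfg       # singleton: any later Config() sees the same state
```

The singleton is what makes the control "centralized": adjusting one shared object changes behavior everywhere, and dumping it once eliminates repetitive setup in the next session.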
## ⑥ Portability and Practicality
- Decoupled pipeline: Separation of data collection and visualization
- Automatic device synchronization: Maintains production-ready status by keeping model and data co-located
- Dual-mode reporting with export flexibility:
  - Measurement-units mode vs. raw-data mode
  - Multi-format export (CSV / Excel) for analysis integration
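The difference between the two reporting modes can be illustrated with a tiny CSV export. The column names, the example parameter counts, and the mega-unit scaling rule are invented for illustration; TorchMeter's actual exporters and column layout are described in its docs:

```python
import csv
import io

# Hypothetical (module, parameter-count) measurements
rows = [("backbone", 1_245_312), ("classifier", 8)]

def to_csv(rows, raw=True):
    """Render rows as CSV, either with exact numbers or scaled units."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["module", "params"])
    for name, n in rows:
        # Raw mode keeps exact numbers; units mode scales for readability
        writer.writerow([name, n if raw else f"{n / 1e6:.2f} M"])
    return buf.getvalue()

print(to_csv(rows, raw=True))   # exact values, machine-friendly
print(to_csv(rows, raw=False))  # human-readable measurement units
```

Raw mode suits downstream analysis (pandas, spreadsheets); units mode suits reports meant for human eyes.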
**Note**

Compatibility:

- OS: Windows / Linux / macOS
- Python: >= 3.8
- Pytorch: >= 1.7.0
## ① Through Python Package Managers

The most convenient way, suitable for installing the latest released stable version.

```shell
# pip series
pip/pipx/pipenv install torchmeter

# Or via conda
conda install torchmeter

# Or via uv
uv add torchmeter

# Or via poetry
poetry add torchmeter

# For other managers, refer to their own documentation
```
## ② Through Binary Distribution

Suitable for installing released historical versions.

1. Download the `.whl` file from PyPI or GitHub Releases.
2. Install it locally:

```shell
# Replace x.x.x with the actual version
pip install torchmeter-x.x.x.whl
```
## ③ Through Source Code

Suitable for those who want to try out upcoming features (may have unknown bugs).

```shell
git clone https://github.com/TorchMeter/torchmeter.git
cd torchmeter

# To install a released stable version, use this
# (don't forget to replace x.x.x with the actual version):
git checkout vx.x.x  # stable

# To try the latest development version (alpha/beta), use this:
git checkout master  # development version

pip install .
```
Refer to the tutorials for all scenarios.
## ① Delegate your model to torchmeter
Implementation of `ExampleNet`:

```python
import torch.nn as nn

class ExampleNet(nn.Module):
    def __init__(self):
        super(ExampleNet, self).__init__()
        self.backbone = nn.Sequential(
            self._nested_repeat_block(2),
            self._nested_repeat_block(2)
        )
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(3, 2)

    def _inner_net(self):
        return nn.Sequential(
            nn.Conv2d(10, 10, 1),
            nn.BatchNorm2d(10),
            nn.ReLU(),
        )

    def _nested_repeat_block(self, repeat: int = 1):
        inners = [self._inner_net() for _ in range(repeat)]
        return nn.Sequential(
            nn.Conv2d(3, 10, 3, stride=1, padding=1),
            nn.BatchNorm2d(10),
            nn.ReLU(),
            *inners,
            nn.Conv2d(10, 3, 1),
            nn.BatchNorm2d(3),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.backbone(x)
        x = self.gap(x)
        x = x.squeeze(dim=(2, 3))
        return self.classifier(x)
```
```python
import torch.nn as nn
from torchmeter import Meter
from torch.cuda import is_available as is_cuda

# 1️⃣ Prepare your pytorch model; here is a simple example
underlying_model = ExampleNet()  # see above for the implementation of `ExampleNet`

# Set an extra attribute on the model to show
# how torchmeter acts as a zero-intrusion proxy later
underlying_model.example_attr = "ABC"

# 2️⃣ Wrap your model with torchmeter
model = Meter(underlying_model)

# 3️⃣ Validate the zero-intrusion proxy

# Access the model's attribute
print(model.example_attr)

# Access the model's method
# `_inner_net` is a method defined in ExampleNet
print(hasattr(model, "_inner_net"))

# Move the model to another device (now on cpu)
print(model)
if is_cuda():
    model.to("cuda")
    print(model)  # now on cuda
```
## ② Get insights into the model structure

```python
from rich import print

print(model.structure)
```
## ③ Quantify model performance with a detailed breakdown

```python
import torch

# Parameter Analysis
# Suppose the `backbone` part of ExampleNet is frozen
_ = model.backbone.requires_grad_(False)
print(model.param)
tb, data = model.profile('param', no_tree=True)

# Before measuring calculation, you should first execute a feed-forward pass.
# You do **not** need to worry about device mismatch;
# just feed the model with the input.
input = torch.randn(1, 3, 32, 32)
output = model(input)

# Computational Profiling
print(model.cal)  # `cal` for calculation
tb, data = model.profile('cal', no_tree=True)

# Memory Diagnostics
print(model.mem)  # `mem` for memory
tb, data = model.profile('mem', no_tree=True)

# Performance Benchmarking
print(model.ittp)  # `ittp` for inference time & throughput
tb, data = model.profile('ittp', no_tree=True)

# Overall Analytics
print(model.overview())
```
## ④ Export results for further analysis

```python
# Export to csv
model.profile('param', show=False, save_to="params.csv")

# Export to excel
model.profile('cal', show=False, save_to="../calculation.xlsx")
```
## ⑤ Advanced usage
- Attribute/method access of the underlying model
- Automatic device synchronization
- Smart module folding
- Performance gallery
- Customized visualization
- Best practices for the programmable tabular report
- Instant export and postponed export
- Centralized configuration management
- Submodule exploration
Thank you for wanting to make TorchMeter even better!
There are several ways to make a contribution:
Before jumping in, let's ensure smooth collaboration by reviewing our contribution guidelines first.
Thanks again!
**Note**

@Ahzyuan: I'd like to say sorry in advance. Due to my master's studies and job search, I may be too busy in the coming year to address contributions promptly. I'll do my best to handle them as soon as possible. Thanks a lot for your understanding and patience!
Refer to the official code-of-conduct file for more details.
- TorchMeter is an open-source project built by developers worldwide. We're committed to fostering a friendly, safe, and inclusive environment for all participants.
- This code applies to all community spaces, including but not limited to GitHub repositories, community forums, etc.
TorchMeter is released under the AGPL-3.0 License; see the LICENSE file for the full text. Please carefully review the terms in the LICENSE file before using or distributing TorchMeter. Ensure compliance with the licensing conditions, especially when integrating this project into larger systems or proprietary software.