TT Swiss - A swiss army knife for model bringup 🇨🇭

This repo is a collection of useful tools for enabling models to work on TT hardware. This includes:

Memory profiler ttmem - inspects the memory usage of a model. You likely need it if you see errors like Out of Memory: Not enough space to allocate <nbytes> B DRAM buffer across <nbanks> banks
Model analyzer ttchop - analyzes PyTorch models to identify which modules/ops work on TT hardware. Generates an interactive HTML report showing pass/fail status for each module.
Claude skills and commands - we recommend copying these into your ~/.claude or <tt-xla-path>/.claude directory for easier debugging of models.

One click setup

This installs both the ttmem and ttchop CLI tools and their Python packages:

pip install git+https://github.com/vkovinicTT/tt-swiss.git

Prerequisites

Before using tt-swiss, you need to configure TT-XLA for memory logging and op by op testing:

1. Build `tt-xla` with debug flags and Python bindings enabled

source venv/activate

cmake -G Ninja -B build -DCMAKE_BUILD_TYPE=Debug -DTT_RUNTIME_DEBUG=ON -DTTMLIR_ENABLE_BINDINGS_PYTHON=ON

cmake --build build

2. Export runtime logger flags (for op and memory info)

export TTMLIR_RUNTIME_LOGGER_LEVEL=DEBUG
export TT_RUNTIME_MEMORY_LOG_LEVEL=operation
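
If the model is launched from a Python driver instead of a shell, the same two variables can be handed to the child process. The following is a minimal sketch: the helper names `profiling_env` and `run_with_logging` are hypothetical and not part of tt-swiss; only the variable names and values come from the export step above.

```python
import os
import subprocess
import sys

# Variable names and values are taken from the export step above.
PROFILING_FLAGS = {
    "TTMLIR_RUNTIME_LOGGER_LEVEL": "DEBUG",
    "TT_RUNTIME_MEMORY_LOG_LEVEL": "operation",
}


def profiling_env(base=None):
    """Return a copy of the environment with the logging flags set."""
    env = dict(os.environ if base is None else base)
    env.update(PROFILING_FLAGS)
    return env


def run_with_logging(script_path):
    """Run a model script in a child process with the flags enabled.

    Hypothetical helper for illustration; it is not a tt-swiss API.
    """
    return subprocess.run(
        [sys.executable, script_path], env=profiling_env(), check=False
    )
```

Copying the environment rather than mutating `os.environ` keeps the flags scoped to the profiled run.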

3. Initialize TTRT artifacts

ttrt query --save-artifacts  # for blackhole qb, also add: --disable-eth-dispatch

Installation

cd /path/to/tt-xla
source venv/activate
pip install git+https://github.com/vkovinicTT/tt-swiss.git

Editable Install (for development)

If you want to modify tt-swiss and have changes reflected immediately:

cd /path/to/tt-swiss
pip install -e .

Note: Editable installs require setuptools>=64. If you get an error about a missing build_editable hook, upgrade setuptools first:
pip install --upgrade setuptools pip

Note: Always activate the tt-xla environment first (source venv/activate). This sets up the required paths for the model analyzer to find tt-xla's op-by-op test infrastructure.

Memory profiler

Browser usage

No installation is needed. Follow these steps:

1. Check that you have a debug build of tt-xla: grep CMAKE_BUILD_TYPE build/CMakeCache.txt. If not, rebuild in debug mode.
2. Set these environment variables when running your Python script: TTMLIR_RUNTIME_LOGGER_LEVEL=DEBUG TT_RUNTIME_MEMORY_LOG_LEVEL=operation
3. Upload your logs to http://yyz2-forge-dash.local.tenstorrent.com:9000/
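
The debug-build check can also be scripted, for example as a guard at the top of a profiling script. This sketch assumes the standard CMakeCache.txt entry format (NAME:TYPE=VALUE) and the build/ directory used in the build step; the function names are illustrative only.

```python
from pathlib import Path


def cached_build_type(cache_text):
    """Return the CMAKE_BUILD_TYPE value from CMakeCache.txt contents, or None."""
    for line in cache_text.splitlines():
        line = line.strip()
        # Cache entries look like NAME:TYPE=VALUE, e.g. CMAKE_BUILD_TYPE:STRING=Debug.
        if line.startswith("CMAKE_BUILD_TYPE:"):
            _, _, value = line.partition("=")
            return value
    return None


def is_debug_build(build_dir="build"):
    """True if <build_dir>/CMakeCache.txt records a Debug build."""
    cache = Path(build_dir) / "CMakeCache.txt"
    if not cache.is_file():
        return False
    return cached_build_type(cache.read_text()) == "Debug"
```

Run it from the tt-xla checkout root so that the relative build/ path resolves.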

Usage

Try It with Example Logs

Example log files are included in example_logs/ so you can try ttmem without running a model first.

Interactive CLI (Recommended for Remote Development)

ttmem

The interactive CLI guides you through the process with prompts:

Asks if you have a log file ready (shows prerequisites if not)
Prompts for the log file path with autocomplete
Parses the log and generates the HTML report
Optionally starts an HTTP server for remote viewing

When working on a remote machine via VS Code Remote SSH, the HTTP server option allows you to view the report in your local browser. VS Code automatically forwards the port, so http://localhost:8000/report.html will work from your local machine.

Command Line Interface

# Default: run + parse + visualize (recommended)
tt-memory-profiler path/to/your_model.py

# Only capture logs (for later processing)
tt-memory-profiler --log path/to/your_model.py

# Parse existing log file
tt-memory-profiler --analyze logs/your_model_20260122_143957/your_model_profile.log

# Generate visualization from existing run
tt-memory-profiler --visualize logs/your_model_20260122_143957/

# Specify custom output directory
tt-memory-profiler --output-dir /path/to/output path/to/your_model.py

Output Structure

Output is stored in ./logs/ relative to your current working directory (or --output-dir if specified):

./logs/<script_name>_YYYYMMDD_HHMMSS/
├── <script_name>_memory.json      # Memory stats per operation
├── <script_name>_operations.json  # Operation metadata per operation
├── <script_name>_profile.log      # Raw logs
└── <script_name>_report.html      # Interactive visualization
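
Because every run gets its own timestamped directory, a script can pick out the most recent run by parsing the directory names. This is a small sketch that assumes only the <script_name>_YYYYMMDD_HHMMSS naming shown above; `parse_run_dir` and `latest_run` are hypothetical helpers, not tt-swiss functions.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches "<script_name>_YYYYMMDD_HHMMSS" as shown in the layout above.
RUN_DIR_PATTERN = re.compile(r"^(?P<script>.+)_(?P<stamp>\d{8}_\d{6})$")


def parse_run_dir(name):
    """Split a run directory name into (script_name, datetime), or return None."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group("stamp"), "%Y%m%d_%H%M%S")
    except ValueError:
        return None  # Digits were present but do not form a valid date/time.
    return match.group("script"), stamp


def latest_run(logs_dir="logs", script_name=None):
    """Return the newest run directory under logs_dir, optionally for one script."""
    candidates = []
    for path in Path(logs_dir).iterdir():
        parsed = parse_run_dir(path.name) if path.is_dir() else None
        if parsed and (script_name is None or parsed[0] == script_name):
            candidates.append((parsed[1], path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```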

View Visualization

Option 1: Using ttmem (recommended for remote development)

Run ttmem, select "Yes" when asked to serve via HTTP
Open http://localhost:PORT/report.html in your browser
VS Code Remote SSH automatically forwards the port

Option 2: Using VS Code Live Server

Right-click on the HTML file and choose "Open with Live Server"
Requires the Live Server extension in VS Code

Docker Users

If you are running inside a Docker container, the HTTP server binds to 0.0.0.0 but the port is not exposed by default. You need to forward port 8000 when starting your container:

docker run -p 8000:8000 <your-image>

If the container is already running, restart it with the port published, or run a separate proxy container on the same Docker network that publishes the port. If ttmem picks a different port (e.g. 8001 if 8000 is busy), forward that port instead.

Once the port is forwarded, open http://localhost:8000/report.html in your host browser to view the report.
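
To see why the served port can differ from 8000, it helps to look at how a "next free port" fallback is typically written. The sketch below is an illustration of that general technique and is not ttmem's actual implementation; the function name, the 20-port search window, and the default host are assumptions.

```python
import socket


def first_free_port(start=8000, attempts=20, host="0.0.0.0"):
    """Return the first port >= start that can be bound on host."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # Port is busy or not permitted; try the next one.
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

Whatever port is reported at startup is the one to pass to docker run -p, using the same number on both sides of the colon.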

Features

Interactive HTML visualization with memory graphs, fragmentation analysis, peak operations
Synchronized JSON outputs (nth element = same operation)
Filtered data (excludes deallocate operations)
Timestamped runs (never overwrites previous data)
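
Since the two JSON outputs are synchronized by position, they can be walked together in a post-processing script. This sketch assumes each file holds a top-level JSON list; the fields inside each entry are not documented here, so the code does not touch them. `load_paired_records` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_paired_records(run_dir, script_name):
    """Yield (index, memory_entry, operation_entry) for one profiling run.

    Assumes each JSON file holds a top-level list. Entry fields are not
    inspected because their schema is not documented in this README.
    """
    run_dir = Path(run_dir)
    memory = json.loads((run_dir / f"{script_name}_memory.json").read_text())
    operations = json.loads((run_dir / f"{script_name}_operations.json").read_text())
    if len(memory) != len(operations):
        raise ValueError("memory and operations files are out of sync")
    for index, (mem, op) in enumerate(zip(memory, operations)):
        yield index, mem, op
```

The length check fails early if the two files come from different runs.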

Model Analysis Tool

Analyze PyTorch models to identify which modules/ops work on TT hardware.

Prerequisites

Quick Start

ttchop \
    --model-path path/to/model.py::load_model \
    --inputs-path path/to/model.py::get_inputs

What It Does

Extract Modules: Identifies all unique modules in the model
Run Op-by-Op Analysis: Tests each module hierarchically on TT hardware
Generate Report: Creates interactive HTML visualization showing pass/fail status

Usage

The tool requires two Python functions:

load_model() - Returns the PyTorch model
get_inputs() - Returns sample input tensors

# Basic usage
ttchop --model-path model.py::load_model --inputs-path model.py::get_inputs

# Specify output directory
ttchop --model-path model.py::load_model --inputs-path model.py::get_inputs --dir ./output
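
The file.py::function_name form used by --model-path and --inputs-path can be checked before launching a long analysis. The following sketch shows one way such a spec could be resolved with the standard importlib machinery; it is an assumption about the format's meaning, not ttchop's actual loader, and `resolve_callable` is a hypothetical name.

```python
import importlib.util
from pathlib import Path


def resolve_callable(spec):
    """Load `path/to/file.py::function_name` and return the function object."""
    file_part, sep, func_name = spec.rpartition("::")
    if not sep or not file_part or not func_name:
        raise ValueError(f"expected 'file.py::function', got {spec!r}")
    path = Path(file_part).resolve()
    module_spec = importlib.util.spec_from_file_location(path.stem, path)
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)  # Runs the file's top-level code.
    func = getattr(module, func_name, None)
    if not callable(func):
        raise AttributeError(f"{path} has no callable named {func_name!r}")
    return func
```

Calling resolve_callable("model.py::load_model")() on the CPU first catches typos in the path or function name before any device work starts.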

Output

<ModelClass>/
├── unique_modules.json      # Module analysis results with status
├── analysis_report.html     # Interactive tree visualization
└── module_irs/              # IR files for each module

Example

# model.py
import torch
import torch.nn as nn

def load_model():
    # Just return the model on CPU - the tool handles device placement
    return nn.Sequential(
        nn.Conv2d(3, 64, 3),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),  # (N, 64, H, W) -> (N, 64, 1, 1)
        nn.Flatten(),             # (N, 64, 1, 1) -> (N, 64), as nn.Linear(64, 10) expects
        nn.Linear(64, 10)
    )

def get_inputs():
    # Just return CPU tensors - the tool handles device placement
    return torch.randn(1, 3, 224, 224)

Note: Your functions should return CPU models/tensors. The tool automatically handles moving them to the TT device.

ttchop --model-path model.py::load_model --inputs-path model.py::get_inputs
