TritonParse Release Notes v0.4.0 (115 commits)
- Date range: 2025-12-26 to 2026-01-21
- Scope: Major feature release - New `bisect` CLI subcommand for automated Triton/LLVM regression bisection, SASS source mapping support, BlockPingpong IR analysis, advanced filter syntax, and significant infrastructure improvements.
Highlights
- New `bisect` CLI Subcommand: Complete regression bisection system for Triton and LLVM. Automatically find culprit commits with `git bisect` integration, LLVM bump detection, commit pair testing, and Rich TUI real-time progress display. Supports resumable workflows and multiple operation modes.
- SASS Source Mapping: Full SASS (NVIDIA assembly) source mapping support with fuzzy matching. Enables bidirectional mapping between SASS and other IR types (TTIR, TTGIR, PTX) in the website UI.
- BlockPingpong Detection: New IR analysis capability to detect and categorize block pingpong scheduling patterns in TTGIR, with color-coded visualization in the website UI.
- Standalone Reproducer: New `--embed-context` flag embeds JSON context directly into generated Python scripts, creating fully self-contained single-file reproducers for easy sharing and bug reports.
- Advanced Filter Syntax: Enhanced `--args-list` filtering with support for nested properties (`C_ptr.dtype`), array indexing (`C_ptr.shape[0]`), and list matching (`C_ptr.shape=[3024, 10752]`).
- Infrastructure Modernization: Parse module refactored into dedicated subdirectory, unified logging system, centralized SVG icons, test directory restructuring, and ESLint integration for website.
Changes by area
New `bisect` CLI Subcommand
A complete regression bisection system spanning more than 6,000 lines of code across more than 55 PRs, organized in 7 architectural layers.
- Operation modes (PR-43 ~ PR-52):
  - `tritonparseoss bisect --good <commit> --bad <commit>` - Triton-only bisect
  - `--llvm-only` - Direct LLVM commit bisection
  - `--pair-test` - Test (Triton, LLVM) commit pairs from CSV
  - `--commits-csv` - Full 4-phase workflow (Triton bisect → LLVM bump detection → pair test → LLVM bisect)
  - `--resume` / `--status` - Resume interrupted bisect or check status
- Core bisector architecture (PR-15 ~ PR-21):
  - `BaseBisector` - Abstract base class with template method pattern
  - `TritonBisector` - Triton commit bisection with automatic build and test
  - `LLVMBisector` - LLVM commit bisection with Triton rebuild
  - Commit validation and correct bisect range detection
- Commit detection and pair testing (PR-22 ~ PR-27):
  - `CommitDetector` - Automatically detects LLVM version bump commits
  - `LLVMBumpInfo` - Captures old/new LLVM hash information
  - `PairTester` - CSV-driven (Triton, LLVM) commit pair testing
  - LLVM range filtering for efficient pair selection
- State management (PR-28 ~ PR-31):
  - `BisectPhase` enum: `TRITON_BISECT`, `TYPE_CHECK`, `PAIR_TEST`, `LLVM_BISECT`, `COMPLETED`, `FAILED`
  - `BisectState` dataclass with JSON serialization
  - `StateManager` for persistent state with auto-resume support
  - Automatic state file discovery (`find_latest_state()`)
- Rich TUI interface (PR-32 ~ PR-42):
  - `BisectUI` - Split-screen layout with progress and output panels
  - Real-time progress updates with phase, commit, and step information
  - Graceful fallback to plain text when Rich is unavailable
  - `print_final_summary()` - Beautiful summary with GitHub links
- Shell scripts (PR-06 ~ PR-13):
  - `bisect_triton.sh` - Triton build and test script for git bisect
  - `bisect_llvm.sh` - LLVM + Triton build with COMPAT_MODE support
  - `test_commit_pairs.sh` - Sequential pair testing with CSV support
  - `scripts/__init__.py` - Script path utilities
- Execution infrastructure (PR-01 ~ PR-05, PR-14):
  - `ShellExecutor` - Blocking and streaming command execution
  - `CommandResult` dataclass with duration tracking
  - `BisectLogger` - Dual logging (file + TUI callback)
  - `run_git_bisect_sequence()` - Complete git bisect workflow
  - `uv` package manager support via `config.py` (PR-54)
  - Clean build environment before each bisect step (PR-55)
- Unit tests (Test-PR-01 ~ Test-PR-03):
  - Tests for `state.py`, `commit_detector.py`, `pair_tester.py`
  - Tests for `executor.py` and `logger.py` (Layer 0)
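The state-management layer listed above can be pictured with a minimal sketch. Only the names `BisectPhase`, `BisectState`, `find_latest_state()` and the enum members come from these notes. The enum string values, the `BisectState` fields, and the `bisect_state_*.json` file pattern are assumptions made for illustration, not the actual TritonParse implementation.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from pathlib import Path
from typing import Optional


class BisectPhase(Enum):
    # Member names come from the release notes; the string values are assumed.
    TRITON_BISECT = "triton_bisect"
    TYPE_CHECK = "type_check"
    PAIR_TEST = "pair_test"
    LLVM_BISECT = "llvm_bisect"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class BisectState:
    # Hypothetical fields; the real dataclass likely tracks more.
    phase: BisectPhase
    good: str
    bad: str
    culprit: Optional[str] = None

    def to_json(self) -> str:
        data = asdict(self)
        data["phase"] = self.phase.value  # enums are not JSON serializable
        return json.dumps(data, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "BisectState":
        data = json.loads(text)
        data["phase"] = BisectPhase(data["phase"])
        return cls(**data)


def find_latest_state(directory: Path) -> Optional[Path]:
    """Return the most recently modified state file, or None if there is none."""
    # The file name pattern is an assumption for this sketch.
    candidates = list(directory.glob("bisect_state_*.json"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Storing the phase alongside the commit range is what would let a resumed run skip straight to the phase it was interrupted in.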
SASS Source Mapping Support
- Fuzzy matching for SASS (commit 762844e):
  - New `extract_sass_mappings()` function in `ir_parser.py`
  - `ignore_column` parameter for fuzzy matching (SASS lacks column info)
  - Automatic fuzzy matching when source or target IR is "sass"
  - SASS comment line mapping (`//## File "/path", line N`)
  - Skip `.nv_debug_ptx_txt` debug file references
- Website UI integration (#249):
  - SASS code panel support in IR Code View
  - Bidirectional highlighting between SASS and other IRs
  - Updated default trace with SASS code (commit 1b2d6a9)
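As a rough illustration of the comment-line mapping described in this section, the sketch below attributes each SASS instruction line to the most recent `//## File "/path", line N` comment. The function name `map_sass_lines` and the choice to ignore `.nv_debug_ptx_txt` comments while keeping the previous location are assumptions; this is not the actual `extract_sass_mappings()` code. Because SASS carries only a file and a line, any match against other IRs would have to compare on those two fields alone, which is what ignoring the column amounts to.

```python
import re
from typing import Dict, Optional, Tuple

# Matches SASS location comments such as: //## File "/src/kernel.py", line 42
SASS_LOC_RE = re.compile(r'^\s*//## File "(?P<file>[^"]+)", line (?P<line>\d+)')


def map_sass_lines(sass_text: str) -> Dict[int, Tuple[str, int]]:
    """Map each non-empty SASS line index to the (file, line) it was generated from."""
    mapping: Dict[int, Tuple[str, int]] = {}
    current: Optional[Tuple[str, int]] = None
    for idx, text in enumerate(sass_text.splitlines()):
        match = SASS_LOC_RE.match(text)
        if match:
            path = match.group("file")
            # Assumed behavior: debug-file references are ignored and the
            # previously seen source location stays in effect.
            if ".nv_debug_ptx_txt" not in path:
                current = (path, int(match.group("line")))
            continue
        if current is not None and text.strip():
            mapping[idx] = current
    return mapping
```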
BlockPingpong Detection
- IR analysis enhancement (commits 50deca4, fe3092f, 0426510, 2dc0eac):
  - New BlockPingpong pattern detection in `ir_analysis.py` (~257 lines)
  - Automatic categorization of ping-pong scheduling patterns
  - Pattern matching descriptions for each category
  - Color-coded visualization in website UI
  - Dedicated Pingpong section in IR Analysis interface
Reproducer Enhancements
- Standalone reproducer (#252):
  - New `--embed-context` CLI flag (default: False)
  - Embeds JSON context directly into Python script
  - Creates fully self-contained single-file reproducer
  - Ideal for sharing, bug reports, and archiving
- Compile params support (#295):
  - Pass compile parameters to kernel invocation
  - Fixes issue #277
- Improved identification (#293, #294):
  - `line_index` added to reproducer filename
  - Metadata comments in generated scripts
- Bug fixes:
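The idea behind the standalone reproducer in this section, embedding the JSON context in the script rather than shipping a sidecar file, can be sketched as follows. The template, the `CONTEXT` variable, and `render_standalone()` are hypothetical names for illustration; the script that `--embed-context` actually generates may be structured differently.

```python
import json

# Hypothetical template for a generated single-file reproducer.
TEMPLATE = '''\
import json

# Context embedded at generation time, so no sidecar JSON file is needed.
CONTEXT = json.loads({context_literal})

if __name__ == "__main__":
    print(CONTEXT["kernel"])
'''


def render_standalone(context: dict) -> str:
    """Return Python source with the context embedded as a string literal."""
    # repr() of the JSON text yields a valid Python string literal,
    # with quotes and backslashes escaped correctly.
    context_literal = repr(json.dumps(context))
    return TEMPLATE.format(context_literal=context_literal)
```

Serializing to JSON first and embedding the result as a string literal keeps the generated file valid even when the context contains quotes or backslashes.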
Advanced Filter Syntax
- Nested property filtering (commit 3ee5df5):
  - Dot notation: `C_ptr.dtype=torch.bfloat16`
  - Array indexing: `C_ptr.shape[0]=3024`
  - List matching: `C_ptr.shape=[3024, 10752]`
  - Unified nested dict unwrapping across all value sources
  - Filter kernel launches by tensor metadata (shape, dtype, stride)
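To make the three forms above concrete, here is a minimal sketch of how a single `path=expected` expression could be evaluated against nested argument metadata. The helpers `resolve_path()` and `matches()` are hypothetical and do not mirror the real `--args-list` parser. In particular, splitting several comma-separated expressions is left out, since list values contain commas themselves.

```python
import json
import re
from typing import Any

# One path segment: a name optionally followed by [index] parts, e.g. "shape[0]".
SEGMENT_RE = re.compile(r"^(?P<name>[^\[\]]+)(?P<indexes>(\[\d+\])*)$")


def resolve_path(args: dict, path: str) -> Any:
    """Follow a dotted path with optional integer indexes through nested data."""
    value: Any = args
    for segment in path.split("."):
        match = SEGMENT_RE.match(segment)
        if match is None:
            raise ValueError(f"bad path segment: {segment!r}")
        value = value[match.group("name")]
        for index in re.findall(r"\[(\d+)\]", match.group("indexes")):
            value = value[int(index)]
    return value


def matches(args: dict, expression: str) -> bool:
    """Evaluate a single `path=expected` filter expression."""
    path, expected = expression.split("=", 1)
    actual = resolve_path(args, path.strip())
    try:
        # Numbers and lists are compared structurally, e.g. "[3024, 10752]".
        return actual == json.loads(expected)
    except json.JSONDecodeError:
        # Anything else (such as torch.bfloat16) is compared as text.
        return str(actual) == expected.strip()
```

For example, with `args = {"C_ptr": {"dtype": "torch.bfloat16", "shape": [3024, 10752]}}`, all three expressions from the list above evaluate to `True`.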
Website UI Improvements
- Code panel enhancements:
- Infrastructure:
Infrastructure & Code Quality
- Module reorganization:
- Test infrastructure:
  - Test directory restructuring: `tests/cpu/` and `tests/gpu/`
  - Extract GPU TensorBlob, complex kernels, reproducer E2E tests
  - Extract GPU structured logging + context manager tests
  - Extract CPU tests to dedicated directory
  - CI workflow updated for new test structure
- Code formatting:
- Bug fixes:
  - Kernel selector overflow fix (commit b5c72b8)
  - Substring matching bug in call graph dependency filtering (commit 0ec75af)
  - PAR compatibility in function_extractor (commit 48551a2)
  - `ast.unparse()` for proper indentation in reproducer extraction (commit 1d8a33d)
  - `--kernel-import` help message fix (commit 18cf9d8)
  - `source_repo_dir` support for mapping production file paths (commit a952d99)
  - BisectLogger unique logger names per instance (#251)
Documentation
- Simplified CHANGELOG.md with links to GitHub releases (#226)
- Website version bumped to 0.3.2 with dependency updates (#238)
Compatibility notes
- New Feature: The `bisect` subcommand is an additive feature that doesn't affect existing workflows.
- SASS Support: To use SASS source mapping, traces must include SASS IR (enable via `enable_sass_dump=True` or `TRITONPARSE_DUMP_SASS=1`).
- Filter Syntax: The new advanced filter syntax is backward compatible; existing filter expressions continue to work.
- Test Directory: Tests have been reorganized into `tests/cpu/` and `tests/gpu/` subdirectories.
Upgrade guidance
- Use bisect for regression hunting:

      # Basic Triton bisect
      tritonparseoss bisect --triton-dir /path/to/triton \
        --test-script test.py --good v2.0.0 --bad HEAD

      # Full workflow with LLVM bump detection
      tritonparseoss bisect --triton-dir /path/to/triton \
        --test-script test.py --good v2.0.0 --bad HEAD \
        --commits-csv pairs.csv

      # Resume interrupted bisect
      tritonparseoss bisect --resume

      # Check status
      tritonparseoss bisect --status

- Generate standalone reproducers:

      tritonparseoss reproduce trace.ndjson --kernel matmul --embed-context

- Use advanced filtering:

      tritonparseoss info trace.ndjson --args-list "C_ptr.shape[0]=3024,C_ptr.dtype=torch.bfloat16"

- SASS source mapping: Enable SASS dump in your trace, then load in website UI for full bidirectional mapping support.
- BlockPingpong analysis: Load traces with TTGIR in the website UI; pingpong patterns are automatically detected and displayed in the IR Analysis section.