TritonParse v0.4.0 Release 🎉

@FindHao FindHao released this 22 Jan 18:07
· 18 commits to main since this release

TritonParse Release Notes v0.4.0 (115 commits)

  • Date range: 2025-12-26 to 2026-01-21
  • Scope: Major feature release - New bisect CLI subcommand for automated Triton/LLVM regression bisection, SASS source mapping support, BlockPingpong IR analysis, advanced filter syntax, and significant infrastructure improvements.

Highlights

  • 🔍 New bisect CLI Subcommand: Complete regression bisection system for Triton and LLVM. Automatically find culprit commits with git bisect integration, LLVM bump detection, commit pair testing, and Rich TUI real-time progress display. Supports resumable workflows and multiple operation modes.
  • 📊 SASS Source Mapping: Full SASS (NVIDIA assembly) source mapping support with fuzzy matching. Enables bidirectional mapping between SASS and other IR types (TTIR, TTGIR, PTX) in the website UI.
  • 🔬 BlockPingpong Detection: New IR analysis capability to detect and categorize block pingpong scheduling patterns in TTGIR, with color-coded visualization in the website UI.
  • 📦 Standalone Reproducer: New --embed-context flag embeds JSON context directly into generated Python scripts, creating fully self-contained single-file reproducers for easy sharing and bug reports.
  • 🎛️ Advanced Filter Syntax: Enhanced --args-list filtering with support for nested properties (C_ptr.dtype), array indexing (C_ptr.shape[0]), and list matching (C_ptr.shape=[3024, 10752]).
  • 🏗️ Infrastructure Modernization: Parse module refactored into dedicated subdirectory, unified logging system, centralized SVG icons, test directory restructuring, and ESLint integration for website.

Changes by area

🔍 New bisect CLI Subcommand

A complete regression bisection system of more than 6,000 lines of code across 55+ PRs, organized in 7 architectural layers.

  • Operation modes (PR-43 ~ PR-52):

    • tritonparseoss bisect --good <commit> --bad <commit> - Triton-only bisect
    • --llvm-only - Direct LLVM commit bisection
    • --pair-test - Test (Triton, LLVM) commit pairs from CSV
    • --commits-csv - Full 4-phase workflow (Triton bisect → LLVM bump detection → pair test → LLVM bisect)
    • --resume / --status - Resume interrupted bisect or check status
  • Core bisector architecture (PR-15 ~ PR-21):

    • BaseBisector - Abstract base class with template method pattern
    • TritonBisector - Triton commit bisection with automatic build and test
    • LLVMBisector - LLVM commit bisection with Triton rebuild
    • Commit validation and correct bisect range detection
  • Commit detection and pair testing (PR-22 ~ PR-27):

    • CommitDetector - Automatically detects LLVM version bump commits
    • LLVMBumpInfo - Captures old/new LLVM hash information
    • PairTester - CSV-driven (Triton, LLVM) commit pair testing
    • LLVM range filtering for efficient pair selection
  • State management (PR-28 ~ PR-31):

    • BisectPhase enum: TRITON_BISECT, TYPE_CHECK, PAIR_TEST, LLVM_BISECT, COMPLETED, FAILED
    • BisectState dataclass with JSON serialization
    • StateManager for persistent state with auto-resume support
    • Automatic state file discovery (find_latest_state())
  • Rich TUI interface (PR-32 ~ PR-42):

    • BisectUI - Split-screen layout with progress and output panels
    • Real-time progress updates with phase, commit, and step information
    • Graceful fallback to plain text when Rich unavailable
    • print_final_summary() - Beautiful summary with GitHub links
  • Shell scripts (PR-06 ~ PR-13):

    • bisect_triton.sh - Triton build and test script for git bisect
    • bisect_llvm.sh - LLVM + Triton build with COMPAT_MODE support
    • test_commit_pairs.sh - Sequential pair testing with CSV support
    • scripts/__init__.py - Script path utilities
  • Execution infrastructure (PR-01 ~ PR-05, PR-14):

    • ShellExecutor - Blocking and streaming command execution
    • CommandResult dataclass with duration tracking
    • BisectLogger - Dual logging (file + TUI callback)
    • run_git_bisect_sequence() - Complete git bisect workflow
    • uv package manager support via config.py (PR-54)
    • Clean build environment before each bisect step (PR-55)
  • Unit tests (Test-PR-01 ~ Test-PR-03):

    • Tests for state.py, commit_detector.py, pair_tester.py
    • Tests for executor.py and logger.py (Layer 0)
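
The state-management layer listed above (a phase enum plus a JSON-serializable state dataclass) can be pictured with a minimal sketch. The phase names are taken from these notes; the enum values, the field names (`good_commit`, `bad_commit`, `culprit`), and the `to_json` / `from_json` helpers are assumptions made for illustration and are not TritonParse's actual schema or API.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class BisectPhase(str, Enum):
    # Phase names come from the release notes; the string values are assumed.
    TRITON_BISECT = "triton_bisect"
    TYPE_CHECK = "type_check"
    PAIR_TEST = "pair_test"
    LLVM_BISECT = "llvm_bisect"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class BisectState:
    # Hypothetical fields: just enough to resume an interrupted run.
    good_commit: str
    bad_commit: str
    phase: BisectPhase = BisectPhase.TRITON_BISECT
    culprit: Optional[str] = None

    def to_json(self) -> str:
        data = asdict(self)
        data["phase"] = self.phase.value  # store the enum as a plain string
        return json.dumps(data, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "BisectState":
        data = json.loads(text)
        data["phase"] = BisectPhase(data["phase"])  # rebuild the enum member
        return cls(**data)
```

Writing the result of `to_json()` to disk after each step and reading it back with `from_json()` is one plausible way a resume feature could pick up at the recorded phase.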

📊 SASS Source Mapping Support

  • Fuzzy matching for SASS (commit 762844e):

    • New extract_sass_mappings() function in ir_parser.py
    • ignore_column parameter for fuzzy matching (SASS lacks column info)
    • Automatic fuzzy matching when source or target IR is "sass"
    • SASS comment line mapping (//## File "/path", line N)
    • Skip .nv_debug_ptx_txt debug file references
  • Website UI integration (#249):

    • SASS code panel support in IR Code View
    • Bidirectional highlighting between SASS and other IRs
    • Updated default trace with SASS code (commit 1b2d6a9)
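
As a rough illustration of the SASS comment form quoted above (`//## File "/path", line N`), the sketch below attaches each SASS line to the most recent location comment. Because SASS carries no column information, the location is only a (file, line) pair, which is what makes the matching "fuzzy". The function name `map_sass_lines` and the choice to ignore `.nv_debug_ptx_txt` comments while keeping the previous location active are assumptions; the real `extract_sass_mappings()` may behave differently.

```python
import re
from typing import Dict, Optional, Tuple

# Matches the SASS location comment form: //## File "/path", line N
_SASS_LOC = re.compile(r'^\s*//## File "(?P<file>[^"]+)", line (?P<line>\d+)')


def map_sass_lines(sass_text: str) -> Dict[int, Tuple[str, int]]:
    """Map 1-based SASS line numbers to the (file, line) of the preceding comment."""
    mapping: Dict[int, Tuple[str, int]] = {}
    current: Optional[Tuple[str, int]] = None
    for lineno, text in enumerate(sass_text.splitlines(), start=1):
        match = _SASS_LOC.match(text)
        if match:
            # Assumption: debug-file references are dropped and the
            # previously seen source location stays active.
            if ".nv_debug_ptx_txt" not in match.group("file"):
                current = (match.group("file"), int(match.group("line")))
            continue
        if current is not None and text.strip():
            mapping[lineno] = current
    return mapping
```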

🔬 BlockPingpong Detection

  • IR analysis enhancement (commits 50deca4, fe3092f, 0426510, 2dc0eac):
    • New BlockPingpong pattern detection in ir_analysis.py (~257 lines)
    • Automatic categorization of ping-pong scheduling patterns
    • Pattern matching descriptions for each category
    • Color-coded visualization in website UI
    • Dedicated Pingpong section in IR Analysis interface

📦 Reproducer Enhancements

  • Standalone reproducer (#252):

    • New --embed-context CLI flag (default: False)
    • Embeds JSON context directly into Python script
    • Creates fully self-contained single-file reproducer
    • Ideal for sharing, bug reports, and archiving
  • Compile params support (#295):

    • Pass compile parameters to kernel invocation
    • Fixes issue #277
  • Improved identification (#293, #294):

    • line_index added to reproducer filename
    • Metadata comments in generated scripts
  • Bug fixes:

    • Fix reproducer for inductor-generated Triton kernels (commit 430510c)
    • Fix isort reordering issue (#254)
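
The idea behind the standalone reproducer, carrying the JSON context inside the script rather than beside it, can be sketched as follows. The helper `embed_context` and the `CONTEXT` variable name are hypothetical; these notes do not show the layout that `--embed-context` actually generates.

```python
import json


def embed_context(context: dict) -> str:
    """Return Python source that carries `context` inline instead of reading a JSON file."""
    payload = json.dumps(context, sort_keys=True)
    # repr() produces a valid Python string literal, so quotes and
    # backslashes inside the JSON survive the round trip.
    return (
        "import json\n\n"
        f"CONTEXT = json.loads({payload!r})\n"
    )
```

A script built this way needs no companion file, which is the property that makes a single-file reproducer easy to attach to a bug report.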

🎛️ Advanced Filter Syntax

  • Nested property filtering (commit 3ee5df5):
    • Dot notation: C_ptr.dtype=torch.bfloat16
    • Array indexing: C_ptr.shape[0]=3024
    • List matching: C_ptr.shape=[3024, 10752]
    • Unified nested dict unwrapping across all value sources
    • Filter kernel launches by tensor metadata (shape, dtype, stride)
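
To make the three filter forms above concrete, here is a minimal sketch of how a `path=value` expression could be resolved against nested argument metadata. The helpers `resolve_path` and `matches`, and the dict layout of the launch record, are assumptions for illustration only; they are not TritonParse's implementation, and splitting several comma-separated filters is deliberately left out.

```python
import ast
import re
from typing import Any

# A path is a run of names and [index] parts, e.g. C_ptr.shape[0]
_TOKEN = re.compile(r"([A-Za-z_]\w*)|\[(\d+)\]")


def resolve_path(args: dict, path: str) -> Any:
    """Follow a dotted/indexed path such as 'C_ptr.shape[0]' through nested data."""
    value: Any = args
    for name, index in _TOKEN.findall(path):
        value = value[name] if name else value[int(index)]
    return value


def matches(args: dict, expression: str) -> bool:
    """Evaluate one 'path=value' filter against a launch's argument metadata."""
    path, _, raw = expression.partition("=")
    try:
        expected = ast.literal_eval(raw.strip())  # numbers and lists
    except (ValueError, SyntaxError):
        expected = raw.strip()  # bare strings such as torch.bfloat16
    return resolve_path(args, path.strip()) == expected
```

With a hypothetical launch record `{"C_ptr": {"dtype": "torch.bfloat16", "shape": [3024, 10752]}}`, each of the three expressions listed above evaluates to a match.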

🌐 Website UI Improvements

  • Code panel enhancements:

    • Vertical resize capability for IR Code View panels (#253)
    • Horizontal scroll tip banner (#250)
    • Long kernel name overflow fix (#246)
    • Index prefix added to kernel selector (#279)
  • Code quality improvements (#228 ~ #235):

    • ESLint added to CI workflow (#235)
    • 8 PRs fixing React hooks, TypeScript, and lint errors
    • Fix Python source line highlight clearing (#236)
    • Display Python source line numbers from original file offset (#239)
  • Infrastructure:

    • Dependabot configuration for npm dependencies (#264)
    • Runtime accessibility test in CI (#265)
    • SVG icons centralized using @heroicons/react (#241)
    • Remote URL button fix (commit 7df4b87)
    • Compile-time flag for internal wiki link (commit 03c71e5)

🏗️ Infrastructure & Code Quality

  • Module reorganization:

    • Parse module refactored into tritonparse/parse/ subdirectory (#240)
    • Unified logger modules to tp_logger.py (#242)
    • Hierarchical sub-loggers under "tritonparse" namespace
  • Test infrastructure:

    • Test directory restructuring: tests/cpu/ and tests/gpu/
    • Extract GPU TensorBlob, complex kernels, reproducer E2E tests
    • Extract GPU structured logging + context manager tests
    • Extract CPU tests to dedicated directory
    • CI workflow updated for new test structure
  • Code formatting:

    • Align OSS formatting with internal pyfmt config (#256)
    • Black 25.11.0 style applied (commit b2d12f9)
  • Bug fixes:

    • Kernel selector overflow fix (commit b5c72b8)
    • Substring matching bug in call graph dependency filtering (commit 0ec75af)
    • PAR compatibility in function_extractor (commit 48551a2)
    • ast.unparse() for proper indentation in reproducer extraction (commit 1d8a33d)
    • --kernel-import help message fix (commit 18cf9d8)
    • source_repo_dir support for mapping production file paths (commit a952d99)
    • BisectLogger unique logger names per instance (#251)
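
The hierarchical sub-logger layout mentioned under module reorganization relies on standard `logging` behavior: a logger named `tritonparse.<something>` is a child of `tritonparse` and inherits its level and handlers. The sketch below uses only the standard library; the sub-logger names `tritonparse.parse` and `tritonparse.bisect` are assumed for illustration, and the actual names in `tp_logger.py` may differ.

```python
import logging

# Configure the package-level logger once; children inherit level and handlers.
root = logging.getLogger("tritonparse")
root.setLevel(logging.INFO)
root.addHandler(logging.StreamHandler())

# Hypothetical sub-logger names; the real module names may differ.
parse_log = logging.getLogger("tritonparse.parse")
bisect_log = logging.getLogger("tritonparse.bisect")

parse_log.info("parsed trace")   # propagates to the handler on "tritonparse"
bisect_log.debug("step detail")  # dropped: the inherited level is INFO
```

Giving each instance its own child name, as the `BisectLogger` fix above does, avoids two instances sharing handlers on the same logger object.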

📚 Documentation

  • Simplified CHANGELOG.md with links to GitHub releases (#226)
  • Website version bumped to 0.3.2 with dependency updates (#238)

Compatibility notes

  • New Feature: The bisect subcommand is an additive feature that doesn't affect existing workflows.
  • SASS Support: To use SASS source mapping, traces must include SASS IR (enable via enable_sass_dump=True or TRITONPARSE_DUMP_SASS=1).
  • Filter Syntax: The new advanced filter syntax is backward compatible; existing filter expressions continue to work.
  • Test Directory: Tests have been reorganized into tests/cpu/ and tests/gpu/ subdirectories.

Upgrade guidance

  1. Use bisect for regression hunting:

    # Basic Triton bisect
    tritonparseoss bisect --triton-dir /path/to/triton \
        --test-script test.py --good v2.0.0 --bad HEAD
    
    # Full workflow with LLVM bump detection
    tritonparseoss bisect --triton-dir /path/to/triton \
        --test-script test.py --good v2.0.0 --bad HEAD \
        --commits-csv pairs.csv
    
    # Resume interrupted bisect
    tritonparseoss bisect --resume
    
    # Check status
    tritonparseoss bisect --status
  2. Generate standalone reproducers:

    tritonparseoss reproduce trace.ndjson --kernel matmul --embed-context
  3. Use advanced filtering:

    tritonparseoss info trace.ndjson --args-list "C_ptr.shape[0]=3024,C_ptr.dtype=torch.bfloat16"
  4. SASS source mapping: Enable SASS dump in your trace, then load in website UI for full bidirectional mapping support.

  5. BlockPingpong analysis: Load traces with TTGIR in the website UI; pingpong patterns are automatically detected and displayed in the IR Analysis section.