XProf v2.22.2

@Matt-Hurd Matt-Hurd released this 13 Apr 20:17

In this release, we've moved from Python- and platform-specific builds to platform-only builds. Rather than supporting only Python 3.10 through 3.13, we now support Python 3.10 and later, including variants such as the free-threaded 3.14t.
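
To make the packaging change concrete, here is a minimal sketch of how the two kinds of wheel differ in their filename tags. It relies only on the standard wheel filename layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The filenames and the helper names `parse_wheel_tags` and `is_platform_only` are hypothetical illustrations, not the actual artifacts or code shipped with this release.

```python
def parse_wheel_tags(filename: str) -> dict[str, str]:
    """Split a wheel filename into its standard components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # 5 fields without a build tag, 6 with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    # The last three fields are always python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_platform_only(filename: str) -> bool:
    """True when the wheel is tied to a platform but not to one interpreter."""
    tags = parse_wheel_tags(filename)
    return tags["python"].startswith("py") and tags["abi"] == "none"


# Hypothetical filenames, for illustration only.
per_python = "example_pkg-1.0.0-cp312-cp312-manylinux2014_x86_64.whl"
platform_only = "example_pkg-1.0.0-py3-none-manylinux2014_x86_64.whl"

for name in (per_python, platform_only):
    print(name, "->", "platform-only" if is_platform_only(name) else "Python-specific")
```

A wheel tagged for one interpreter and ABI has to be rebuilt for every Python version, while a platform-only wheel can be installed by any compatible interpreter on that platform.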

Release Notes

Highlights

  • Trace Viewer V2 Enhancements: Significant UI/UX improvements, including a new mouse mode toolbar, area selection, static search, and persistent UI states.
  • Performance Optimizations: Faster trace processing by avoiding unnecessary memory copies, using proto arenas, and optimizing compression levels.
  • Build & Packaging: Improved multi-Python version build support and resolved various OSS build and macOS compatibility issues.

Detailed Changes

Trace Viewer V2 & Frontend

  • New Features:
    • Implemented Static Search in Trace Viewer V2 with a Material 3 pill-shaped search box.
    • Added a Mouse Mode Toolbar, keyboard shortcuts, and zoom interactions.
    • Added support for Area Selection and drag-and-drop for JSON files.
    • Added a feature to click to copy track names.
    • Displayed an area chart for top-level process utilization in the timeline preview.
  • UI/UX Improvements:
    • Updated fonts and styling to Material 3 specifications.
    • Added vertical scrolling and automatic track expansion when navigating to an event.
    • Made the detail drawer resizable and preserved its size across sessions.
    • Preserved group expanded states across trace data updates.
    • Improved counter event hover visual effects.
  • Bug Fixes & Stability:
    • Fixed canvas resize flickering.
    • Fixed mouse wheel shortcuts and multi-host selection.
    • Prevented layout shifts by reserving scrollbar width.
    • Stabilized Trace Viewer URL generation.
  • Internal Refactoring:
    • Introduced a scheduler to manage redraws.
    • Refactored timeline navigation and drawing logic.
    • Renamed Emscripten bindings for better TypeScript readability.
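
The notes do not describe how the redraw scheduler works. As a rough illustration of the general idea, the sketch below assumes a scheduler that coalesces many redraw requests into at most one draw per frame. The class `RedrawScheduler` and its methods are hypothetical and written in Python for brevity; the actual Trace Viewer V2 implementation may differ substantially.

```python
from typing import Callable


class RedrawScheduler:
    """Coalesces redraw requests into at most one draw per frame (hypothetical)."""

    def __init__(self, draw: Callable[[], None]) -> None:
        self._draw = draw
        self._dirty = False
        self.draw_count = 0

    def request_redraw(self) -> None:
        # Many callers may request a redraw; only a flag is set here.
        self._dirty = True

    def on_frame(self) -> bool:
        """Called once per frame; draws only if something requested it."""
        if not self._dirty:
            return False
        self._dirty = False
        self._draw()
        self.draw_count += 1
        return True


scheduler = RedrawScheduler(draw=lambda: None)
scheduler.request_redraw()  # e.g. from a mouse move
scheduler.request_redraw()  # e.g. from a data update in the same frame
scheduler.on_frame()        # performs a single draw for both requests
scheduler.on_frame()        # nothing pending, so no draw
print("draws performed:", scheduler.draw_count)
```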

Core Profiler & Backend

  • Performance Optimizations:
    • Sped up trace processing by avoiding unnecessary memory copies of Perfetto output.
    • Used Proto Arenas to speed up LoadFromFile and WriteToCord.
    • Changed default gzip compression level from 6 to 1 for faster processing.
    • Reduced memory allocation during the creation of the prefix trie search index.
    • Optimized TraceEvent serialization for LevelDB storage.
  • Features & Protocol Updates:
    • Added the TraceDataResponse proto definition for transferring trace data over the wire, intended to significantly reduce payload size relative to the current JSON baseline.
    • Enabled async full DMA feature in 3P Trace Viewer.
    • Added counter lines for megascale collective count and in-flight bytes.
  • Smart Suggestions:
    • Optimized EventTimeFractionAnalyzer to process all events at once.
    • Added caching and prevented redundant fetches for smart suggestions.
  • Bug Fixes:
    • Fixed SparseCore busy and idle time calculations.
    • Synchronized viewport initialized ranges and locked concurrent fetches.
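
The gzip change above trades compression ratio for speed. The following sketch shows one way to observe that trade-off with Python's standard `gzip` module. The payload is synthetic and only stands in for trace data (an assumption), the helper `compress_with_level` is hypothetical, and the timings depend on the machine, so no particular result is implied.

```python
import gzip
import time


def compress_with_level(payload: bytes, level: int) -> tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one gzip level."""
    start = time.perf_counter()
    compressed = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    # Round-trip check: every level must decompress to the original bytes.
    assert gzip.decompress(compressed) == payload
    return len(compressed), elapsed


# Synthetic, repetitive JSON-like payload standing in for trace data.
payload = b'{"name": "fusion.1", "ts": 12345, "dur": 67}\n' * 50_000

# Levels are passed explicitly because Python's gzip.compress defaults to 9.
for level in (6, 1):
    size, seconds = compress_with_level(payload, level)
    print(f"level={level}: {size} bytes in {seconds:.4f}s")
```

Running this against a representative trace file, rather than the synthetic payload, gives a better sense of how much size is given up for the faster level.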

Graph Viewer

  • Fixed HLO download bug.
  • Added fallback to find HloProto by node name.
  • Added an HLO metadata fixer.

Build, Infrastructure & Packaging

  • Created a CustomBdistWheel class to enable multi-Python version builds.
  • Fixed Kokoro build failures by forcing platlib in setup.py.
  • Limited macOS linkopts to exclude Linux-exclusive flags.
  • Included Trace Viewer V2 assets in the pip package and fixed OSS build issues.
  • Decoupled heavy google-cloud-cpp dependencies.

Documentation & Tests

  • Added documentation for Custom Call Profiling and for upcoming CPU perf counters support.
  • Updated the demo trace to a modern MaxText workload.
  • Added automated load performance benchmarks and various unit tests to improve coverage.

Individual Commits

  • Fix Kokoro build failure by forcing platlib in setup.py. by Profiler Team in #2513
  • Synchronize mouse mode changes from keyboard shortcuts to UI. by Profiler Team in #2502
  • Stabilize Trace Viewer URL generation and remove redundant caches. by Profiler Team in #2492
  • [XProf: trace viewer] Code cleanup: Refactor timeline_test.cc by grouping and sorting tests by @lilysjtu2011 in #2505
  • [XProf: trace viewer] Extract GetNextGroupStartLevel helper function. by @lilysjtu2011 in #2510
  • [XProf: trace viewer] Go to next search result on Enter by @lilysjtu2011 in #2506
  • [XProf: trace viewer] Expand related tracks when revealing an event by @lilysjtu2011 in #2498
  • [XProf: trace viewer] Add vertical scrolling when navigating to an event by @lilysjtu2011 in #2497
  • [XProf: trace viewer] Implement static search in Trace Viewer V2 by @lilysjtu2011 in #2496
  • Create custom CustomBdistWheel class to enable multi-Python version builds by @Matt-Hurd in #2504
  • [XProf: trace viewer] Fix canvas resize flickering by @lilysjtu2011 in #2500
  • [XProf: trace viewer] Refactor timeline navigation to reveal by @lilysjtu2011 in #2494
  • [XProf: trace viewer] Use stable hash function for event color generation by @lilysjtu2011 in #2499
  • [XProf: trace viewer] Keep visualization visible when unfolded for process tracks by @lilysjtu2011 in #2487
  • Add mouse mode toolbar and fix cursor sync in Trace Viewer v2. by Profiler Team in #2484
  • Add mouse mode shortcuts and zoom interaction to Trace Viewer v2. by Profiler Team in #2474
  • Limit macos linkopts to exclude linux-exclusive ones. by @Matt-Hurd in #2486
  • Replace broken sed command with a patch file by @Matt-Hurd in #2475
  • [XProf: trace viewer] Add automated load performance benchmark by @lilysjtu2011 in #2473
  • Enable async full DMA feature in 3P Trace Viewer. by Profiler Team in #2459
  • [XProf: trace viewer] Add JSON drag-and-drop support by @lilysjtu2011 in #2461
  • Add HLO metadata fixer by @subhamsoni-google in #2468
  • Implement Details Selector in Trace Viewer v2 UI by Profiler Team in #2460
  • Implement area selection and migrate click selection to MouseReleased in Trace Viewer. by Profiler Team in #2464
  • [XProf: trace viewer] Introduce a scheduler to manage redraws in Trace Viewer v2. by @lilysjtu2011 in #2067
  • Fix SparseCore busy and idle time calculations. by @bmass02 in #2465
  • Add TraceDataResponse proto definition. This would be used for transferring trace data over the wire significantly reducing the payload size from the current JSON baseline. by Profiler Team in #2436
  • Support uint32_t, int64_t, and uint64_t types when converting absl::any to emscripten::val for WASM bindings. Also add styling for .single-event-table and .selection-info to make single-event tables more compact. by Profiler Team in #2454
  • This test verifies that performance counters are successfully collected and can be processed by the xprof plugin when using jax.profiler.trace. The test runs a simple matrix multiplication and addition, then checks for the presence of 'perf_counters' data in the generated xplane file. by Profiler Team in #2453
  • Optimize TraceEvent serialization for LevelDB storage. by Profiler Team in #2382
  • Fix HLO Download Bug in Graph Viewer. by @smit-hinsu in #2455
  • Synchronize viewport initialized ranges and lock concurrent fetches. by Profiler Team in #2451
  • Enable option to not use cached smart suggestion results in 3P by @RuiyangZhu in #2440
  • SS Optimization: Make EventTimeFractionAnalyzer to process all events at once by @RuiyangZhu in #2406
  • Fix Build Error in 3P Xprof Build Script by @RuiyangZhu in #2439
  • [XProf: trace viewer] Merge Timeline Level Y Positions into precalculated relative offsets by @lilysjtu2011 in #2435
  • [XProf: trace viewer] Cleanup ImGui Table Flags and Extract Pixel Constants by @lilysjtu2011 in #2434
  • [XProf: trace viewer] Make trace viewer v2 search box a GM3 pill shape by @lilysjtu2011 in #2432
  • [XProf: trace viewer] Refactor Timeline logic by @lilysjtu2011 in #2433
  • [XProf: trace viewer] Display an area chart for the top-level process utilization in the timeline preview. by @lilysjtu2011 in #2431
  • [XProf: trace viewer] Improve counter event hover visual effect by @lilysjtu2011 in #2430
  • Resolve rest of formatting and lint warnings in ts files. by @zzzaries in #2425
  • [XProf: trace viewer] Add click to copy track name feature by @lilysjtu2011 in #2429
  • [XProf: trace viewer] Format Process Track Name in Two Lines by @lilysjtu2011 in #2428
  • [XProf: trace viewer] Update Trace Viewer fonts to Material 3 specifications by @lilysjtu2011 in #2427
  • [XProf: trace viewer] UI styling and track alignment improvements. by @lilysjtu2011 in #2426
  • Add docs for upcoming cpu perf counters support by @Matt-Hurd in #2419
  • Add fallback to find HloProto by node name in Graph Viewer. by @subhamsoni-google in #2391
  • [XProf: trace viewer] Separate drawing logic for ruler and vertical lines by @lilysjtu2011 in #2421
  • Create EventFractionAnalyzerResults Proto by @RuiyangZhu in #2387
  • [XProf: trace viewer] Add secondary container color and fix process track color collision by @lilysjtu2011 in #2420
  • [XProf: trace viewer] Add aggregated event preview for collapsed groups. by @lilysjtu2011 in #2395
  • [XProf: trace viewer] Rename Emscripten bindings for TypeScript readability. by @lilysjtu2011 in #2388
  • Add an option to not use cached smart suggestion results in 1P by @RuiyangZhu in #2394
  • Project import generated by Copybara by Profiler Team in #2418
  • Don't show legacy DCN view in Trace Viewer by default. Users should use the new by @bbadawi1 in #2404
  • [XProf: trace viewer] Fix mouse wheel shortcuts and add input handler tests. by @lilysjtu2011 in #2414
  • Change gzip compression level from 6 (default) to 1. by @bbadawi1 in #2412
  • Apply code formatting to TypeScript files. by @zzzaries in #2415
  • Speed up trace processing by avoiding unnecessary memory copy of perfetto output. by @bbadawi1 in #2410
  • Switch EventForest to use new XPlaneVisitor factory. by @bmass02 in #2401
  • Use SmartSuggestionProcessor in ConvertMultiXSpacesToSmartSuggestion in xplane_to_tools_data by @xiongbolu-mineral in #2372
  • Use proto arenas to speed up LoadFromFile and WriteToCord. by @bbadawi1 in #2409
  • Add documentation for Custom Call Profiling in XProf. by @cliveverghese in #2408
  • Update demo trace to modern MaxText workload. by @Matt-Hurd in #2407
  • [XProf: trace viewer] Add tests to improve test coverage and fix regressions. by @lilysjtu2011 in #2402
  • Fix multi-host selection in trace viewer by @muditgokhale2 in #2403
  • Use channel_id instead of rendezvous for connecting megascale events. by @bbadawi1 in #2392
  • Refactor AddHloProto to get module_id from module by @cliveverghese in #2304
  • Switch EventForest to use new XPlaneVisitor factory. by @muditgokhale2 in #2384
  • Add counter lines for megascale collective count and in-flight bytes. by @bbadawi1 in #2381
  • Use public targets from google_cloud_cpp by @ethanluoyc in #2376
  • Add default constructors to hlo_proto_map by @cliveverghese in #2349
  • [XProf: trace viewer] Making one-line threads not collapsible. by @lilysjtu2011 in #2370
  • [XProf: trace viewer] Preserve group expanded states across trace data updates. by @lilysjtu2011 in #2367
  • Only render the timeline graph if the allocation timeline is not empty. by @muditgokhale2 in #2378
  • Reduce memory allocation while creation of prefix trie trace events search index. by Profiler Team in #2379
  • Switch EventForest to use new XPlaneVisitor factory. by @bmass02 in #2371
  • Register BarrierCoresRule for 3P XProf by @xiongbolu-mineral in #2353
  • Update xprof WORKSPACE.BAZEL with patch commands for emsdk by Profiler Team in #2374
  • Cache smart suggestion result for 3P XProf by @RuiyangZhu in #2366
  • Fix missing overview page contents after smart suggestion request by @xiongbolu-mineral in #2359
  • Remove unused function ConvertMultiXSpacesToTfDataBottleneckAnalysis from xplane_to_tools_data by @xiongbolu-mineral in #2365
  • Include trace viewer v2 assets in the pip package, and fix OSS build issues by Profiler Team in #2351
  • [XProf: trace viewer] Make the trace viewer drawer size persistent. by @lilysjtu2011 in #2343
  • [XProf: trace viewer] Reserve scrollbar width to avoid layout shift. by @lilysjtu2011 in #2356
  • Refactor: Convert recursive event tree traversal to iterative by Profiler Team in #2360
  • Refactor: Abstract smart suggestion check in DataServiceV2 by @xiongbolu-mineral in #2354
  • Don't populate disaggregated serving latency for irrelevant sessions by Profiler Team in #2358
  • Prevent redundant smart suggestion fetches for the same session. by @xiongbolu-mineral in #2352
  • Decouple third_party/xprof/convert:repository from heavy google-cloud-cpp dependencies. by Profiler Team in #2344
  • Register 3P specific rules in smart suggestion processor and disable it from profile processor framework by @xiongbolu-mineral in #2348
  • When a module already has a run_id, use it instead of creating a new one. by @bbadawi1 in #2350
  • [XProf: trace viewer] Add angular-split to make the detail drawer resizable. by @lilysjtu2011 in #2341
  • Update XProf documentation and README. by Profiler Team in #2346
  • Fix documentation typo in OpMetrics proto by @charlesalaras in #2347
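
One of the commits above switches event color generation to a stable hash function (#2499). The notes do not say which hash is used, so the sketch below only illustrates the general idea: a deterministic hash maps the same event name to the same color in every run. The palette, the function name `stable_color`, and the choice of CRC32 are all assumptions made for this example.

```python
import zlib

# Hypothetical palette; the real Trace Viewer colors are not given in these notes.
PALETTE = ["#4285F4", "#EA4335", "#FBBC04", "#34A853", "#A142F4", "#24C1E0"]


def stable_color(event_name: str) -> str:
    """Map an event name to a palette color that is identical across runs."""
    # zlib.crc32 is deterministic. Python's built-in hash() for strings is
    # salted per process by default, so it would pick different colors
    # from one run to the next.
    digest = zlib.crc32(event_name.encode("utf-8"))
    return PALETTE[digest % len(PALETTE)]


for name in ("fusion.1", "all-reduce", "copy-start"):
    print(name, "->", stable_color(name))
```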

Full Changelog: xprof-v2.22.0...xprof-v2.22.2