Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format by polarG · Pull Request #791 · microsoft/superbenchmark

polarG · 2026-03-20T17:52:42Z

Description
The hipblaslt-gemm benchmark result parser fails with MICROBENCHMARK_RESULT_PARSING_FAILURE (return code 33) when running against hipBLASLt v1500+. The benchmark kernels execute successfully and produce valid TFLOPS data, but SuperBench cannot parse the results into structured metrics.

Root cause: The parser hardcodes len(fields) != 23 to validate the output CSV, but hipBLASLt v1500 outputs 34 columns — it added a_type, b_type, c_type, d_type, scaleA, scaleB, scaleC, scaleD, amaxD, bias_type, aux_type, and hipblaslt-GB/s. This causes two bugs:

The field count check rejects every result line as invalid.
Even if it passed, fields[-2] would return hipblaslt-GB/s instead of hipblaslt-Gflops.
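
A minimal sketch of the two failure modes, using made-up, shortened headers rather than real hipBLASLt output. The relative order of the trailing columns follows the code comment in this PR (Gflops at index -2 in the old format, -3 in the new one). The name `us` for the last column and the helper `column_at` are assumptions for illustration only.

```python
# Hypothetical, shortened headers; the real formats have 23 and 34 columns.
old_header = ['m', 'n', 'k', 'hipblaslt-Gflops', 'us']
new_header = ['m', 'n', 'k', 'hipblaslt-Gflops', 'hipblaslt-GB/s', 'us']


def column_at(header, index):
    """Return the column name that a positional index would select."""
    return header[index]


# Old format: the positional index lands on the Gflops column.
print(column_at(old_header, -2))  # hipblaslt-Gflops
# New format: the same index now selects the bandwidth column.
print(column_at(new_header, -2))  # hipblaslt-GB/s

# A width check tuned to one format rejects every row of the other.
EXPECTED_OLD_WIDTH = len(old_header)  # stands in for the hardcoded 23
print(len(new_header) != EXPECTED_OLD_WIDTH)  # True: row treated as invalid
```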

Fix
Replace the hardcoded field-count check and positional index with header-based column lookup:

Parse the CSV header line to dynamically find the column index of hipblaslt-Gflops.
Validate that the data line has the same number of columns as the header (instead of a magic number).
Use the discovered column index to extract the Gflops value.

This approach is:

Backward compatible — works with the old 23-column format (hipBLASLt v600).
Forward compatible — will handle any future column additions as long as hipblaslt-Gflops remains in the header.
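
The three steps of the fix can be sketched roughly as follows. This is not the code merged into SuperBench: `parse_gflops` is a hypothetical helper, the sample header and data lines are shortened and invented for illustration (only the value 678.209 comes from the PR's test), and the `[0]:` prefix handling mirrors the one line of the patch quoted in the review.

```python
def parse_gflops(header_line, data_line):
    """Return the hipblaslt-Gflops value from one header/data line pair.

    Hypothetical helper for illustration; not the SuperBench implementation.
    """
    header = [name.strip() for name in header_line.split(',')]
    # Drop a leading "[N]:" style prefix from the first header field, if any.
    header[0] = header[0].split(']')[-1].lstrip(':')
    fields = [value.strip() for value in data_line.split(',')]

    # Validate the row width against the header instead of a magic number.
    if len(fields) != len(header):
        raise ValueError(
            f'expected {len(header)} fields from header, got {len(fields)}'
        )

    # Locate the metric by exact column name rather than by position.
    if 'hipblaslt-Gflops' not in header:
        raise ValueError('header has no hipblaslt-Gflops column')
    return float(fields[header.index('hipblaslt-Gflops')])


# Made-up, shortened samples standing in for the old and new formats.
old_header = '[0]:transA,transB,m,n,k,hipblaslt-Gflops,us'
old_row = 'N,N,4096,4096,4096,678.209,202.6'
new_header = '[0]:transA,transB,m,n,k,a_type,hipblaslt-Gflops,hipblaslt-GB/s,us'
new_row = 'N,N,4096,4096,4096,f16_r,678.209,496.7,202.6'

print(parse_gflops(old_header, old_row))  # 678.209
print(parse_gflops(new_header, new_row))  # 678.209
```

Because the column is found by name, the same function handles both widths, and a row whose width disagrees with its header is rejected instead of being misread.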

Copilot

Pull request overview

Updates hipblaslt-gemm result parsing to support hipBLASLt v1500+ CSV output by using header-based column lookup instead of hardcoded field counts/positions.

Changes:

Parse the CSV header to dynamically locate the hipblaslt-Gflops column and validate row width against the header.
Add a unit test covering the new v1500+ (34-column) output format.
Clarify parsing intent via inline comments.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py	Adds a new test case for v1500+ output and annotates the existing old-format test.
superbench/benchmarks/micro_benchmarks/hipblaslt_function.py	Switches parsing to header-based `hipblaslt-Gflops` extraction and header/data column-count validation.

Copilot · 2026-03-20T17:59:29Z

tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py

+        self.assertEqual(2, len(benchmark.result))
+        self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])


self.assertEqual(2, len(benchmark.result)) is a brittle assertion because it couples the test to the total number of emitted metrics rather than the behavior under test (correct Gflops extraction). Prefer asserting that the expected key exists (and optionally that no parsing error occurred) without pinning the total metric count. Also, comparing floats with assertEqual can be flaky due to floating-point representation; use assertAlmostEqual(..., places=...) for the numeric check.

Suggested change

-        self.assertEqual(2, len(benchmark.result))
-        self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])
+        self.assertIn('fp16_1_4096_4096_4096_flops', benchmark.result)
+        self.assertAlmostEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0], places=3)
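
A small, self-contained sketch of the assertion style the reviewer proposes. The `result` dictionary is a hypothetical stand-in for `benchmark.result`, and the extra `other_metric` key is invented to show that the test no longer depends on how many metrics are emitted.

```python
import unittest


class GflopsAssertionSketch(unittest.TestCase):
    """Illustrative test; `result` stands in for benchmark.result."""

    def test_key_and_value(self):
        key = 'fp16_1_4096_4096_4096_flops'
        # Hypothetical result; additional metrics do not affect the test.
        result = {key: [678.209], 'other_metric': [1.0]}

        # Assert the behavior under test: the key exists ...
        self.assertIn(key, result)
        # ... and the value matches to three decimal places.
        self.assertAlmostEqual(678.209, result[key][0], places=3)


# Run the sketch explicitly so the snippet does not call sys.exit().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GflopsAssertionSketch)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```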

Copilot · 2026-03-20T17:59:30Z

superbench/benchmarks/micro_benchmarks/hipblaslt_function.py

+            header_fields[0] = header_fields[0].split(']')[-1].lstrip(':')
+            gflops_col = None
+            for col_idx, col_name in enumerate(header_fields):
+                if 'hipblaslt-Gflops' in col_name:


The column lookup uses substring matching (in), which can accidentally match unexpected column names (e.g., a future column like hipblaslt-Gflops-peak). This should be an exact match after normalizing whitespace (e.g., col_name.strip() == 'hipblaslt-Gflops') to ensure the parser reliably targets the intended metric.

Suggested change

-                if 'hipblaslt-Gflops' in col_name:
+                if col_name.strip() == 'hipblaslt-Gflops':
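
The difference between the two checks can be seen in a short sketch. The header below is hypothetical: `hipblaslt-Gflops-peak` is the reviewer's example of a possible future column, not a real one, and both lookup functions are illustrative helpers that assume the search stops at the first match.

```python
TARGET = 'hipblaslt-Gflops'

# Hypothetical header: a future column extends the target name, and the
# real target carries stray whitespace.
header = ['m', 'n', 'k', 'hipblaslt-Gflops-peak', ' hipblaslt-Gflops ', 'us']


def find_substring(columns):
    """Return the first index whose name contains TARGET (substring match)."""
    for idx, name in enumerate(columns):
        if TARGET in name:
            return idx
    return None


def find_exact(columns):
    """Return the first index whose stripped name equals TARGET."""
    for idx, name in enumerate(columns):
        if name.strip() == TARGET:
            return idx
    return None


print(find_substring(header))  # 3: wrongly selects hipblaslt-Gflops-peak
print(find_exact(header))      # 4: selects the intended column
```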

Copilot · 2026-03-20T17:59:30Z

superbench/benchmarks/micro_benchmarks/hipblaslt_function.py

+            # This is needed because hipBLASLt output format varies across versions:
+            #   - v600  (old): 23 columns, Gflops at index -2
+            #   - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
+            #                  bias_type/aux_type/GB_s columns, Gflops at index -3
+            # Using header-based lookup ensures compatibility with both formats
+            # and any future column additions.


Since the implementation no longer relies on positional indices, the comment stating exact positions (e.g., 'Gflops at index -2/-3') can become misleading as formats evolve. Consider removing the index claims and focusing the comment on the key guarantee (header-driven lookup + header/data width validation). Also, the comment mentions GB_s but the sample header uses hipblaslt-GB/s.

Suggested change

-            # This is needed because hipBLASLt output format varies across versions:
-            #   - v600  (old): 23 columns, Gflops at index -2
-            #   - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
-            #                  bias_type/aux_type/GB_s columns, Gflops at index -3
-            # Using header-based lookup ensures compatibility with both formats
-            # and any future column additions.
+            # This is needed because the hipBLASLt output format varies across versions:
+            #   - v600  (old): 23 columns.
+            #   - v1500 (new): 34 columns, adding a_type/b_type/c_type/scaleA-D/amaxD/
+            #                  bias_type/aux_type/hipblaslt-GB/s columns.
+            # Using header-based lookup (plus header/data width validation) ensures
+            # compatibility across existing formats and resilience to future changes.

codecov · 2026-03-20T18:02:10Z

Codecov Report

❌ Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.71%. Comparing base (6b8e810) to head (622592c).

Files with missing lines	Patch %	Lines
.../benchmarks/micro_benchmarks/hipblaslt_function.py	81.81%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #791   +/-   ##
=======================================
  Coverage   85.70%   85.71%           
=======================================
  Files         102      102           
  Lines        7703     7712    +9     
=======================================
+ Hits         6602     6610    +8     
- Misses       1101     1102    +1

Flag	Coverage Δ
cpu-python3.10-unit-test	`70.98% <81.81%> (+0.02%)`	⬆️
cpu-python3.12-unit-test	`70.98% <81.81%> (+0.02%)`	⬆️
cpu-python3.7-unit-test	`70.45% <81.81%> (+0.02%)`	⬆️
cuda-unit-test	`83.60% <81.81%> (+<0.01%)`	⬆️

Fix hipblaslt cannot parse the result on v1500.

622592c

polarG requested a review from a team as a code owner March 20, 2026 17:52

Copilot AI review requested due to automatic review settings March 20, 2026 17:52

Copilot AI reviewed Mar 20, 2026

Copilot started reviewing on behalf of polarG March 20, 2026 18:17

polarG wants to merge 1 commit into main from dev/hongtaozhang/fix-hipblaslt-parse-in-new-version
