Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format#791
Pull request overview
Updates hipblaslt-gemm result parsing to support hipBLASLt v1500+ CSV output by using header-based column lookup instead of hardcoded field counts/positions.
Changes:
- Parse the CSV header to dynamically locate the hipblaslt-Gflops column and validate row width against the header.
- Add a unit test covering the new v1500+ (34-column) output format.
- Clarify parsing intent via inline comments.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py | Adds a new test case for v1500+ output and annotates the existing old-format test. |
| superbench/benchmarks/micro_benchmarks/hipblaslt_function.py | Switches parsing to header-based hipblaslt-Gflops extraction and header/data column-count validation. |
    self.assertEqual(2, len(benchmark.result))
    self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])
self.assertEqual(2, len(benchmark.result)) is a brittle assertion because it couples the test to the total number of emitted metrics rather than the behavior under test (correct Gflops extraction). Prefer asserting that the expected key exists (and optionally that no parsing error occurred) without pinning the total metric count. Also, comparing floats with assertEqual can be flaky due to floating-point representation; use assertAlmostEqual(..., places=...) for the numeric check.
Suggested change:

    - self.assertEqual(2, len(benchmark.result))
    - self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])
    + self.assertIn('fp16_1_4096_4096_4096_flops', benchmark.result)
    + self.assertAlmostEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0], places=3)
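
The assertion style recommended in this comment can be sketched as a standalone test. This is only an illustration: the `result` dictionary is a hypothetical stand-in for `benchmark.result`, and the extra keys in it are invented to show that the test does not depend on the total metric count.

```python
import unittest


class GflopsAssertionExample(unittest.TestCase):
    """Illustrative only; `result` is a hypothetical stand-in for benchmark.result."""

    def test_gflops_metric(self):
        # Hypothetical parsed output that also carries metrics this test ignores.
        result = {
            'return_code': [0],
            'fp16_1_4096_4096_4096_flops': [678.209],
            'some_future_metric': [1.0],
        }
        key = 'fp16_1_4096_4096_4096_flops'
        # Check that the metric exists instead of pinning len(result).
        self.assertIn(key, result)
        # Compare floats with a tolerance rather than exact equality.
        self.assertAlmostEqual(678.209, result[key][0], places=3)
```

Written this way, the test keeps passing if the benchmark later emits additional metrics, while still failing if the Gflops value is missing or wrong.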
    header_fields[0] = header_fields[0].split(']')[-1].lstrip(':')
    gflops_col = None
    for col_idx, col_name in enumerate(header_fields):
        if 'hipblaslt-Gflops' in col_name:
The column lookup uses substring matching (in), which can accidentally match unexpected column names (e.g., a future column like hipblaslt-Gflops-peak). This should be an exact match after normalizing whitespace (e.g., col_name.strip() == 'hipblaslt-Gflops') to ensure the parser reliably targets the intended metric.
Suggested change:

    - if 'hipblaslt-Gflops' in col_name:
    + if col_name.strip() == 'hipblaslt-Gflops':
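
To make the difference concrete, here is a small sketch comparing the two matching strategies. The header below is hypothetical: `hipblaslt-Gflops-peak` is the reviewer's example of a possible future column, not a column hipBLASLt is known to emit, and `find_column` is an illustrative helper rather than code from the PR.

```python
def find_column(header_fields, name):
    """Return the index of the column whose stripped name equals `name`, or None."""
    for idx, col in enumerate(header_fields):
        if col.strip() == name:
            return idx
    return None


# Hypothetical header in which a look-alike column precedes the real one.
header = ['m', 'n', 'k', ' hipblaslt-Gflops-peak', ' hipblaslt-Gflops ', 'us']

# Substring matching stops at the first column that merely contains the name.
substring_hit = next(i for i, c in enumerate(header) if 'hipblaslt-Gflops' in c)
# Exact matching after stripping whitespace finds the intended column.
exact_hit = find_column(header, 'hipblaslt-Gflops')

print(substring_hit, exact_hit)  # 3 4
```

With substring matching the parser would silently read the wrong column; with exact matching it either finds the intended one or returns `None`, which the caller can report as a parsing failure.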
    # This is needed because hipBLASLt output format varies across versions:
    # - v600 (old): 23 columns, Gflops at index -2
    # - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
    #   bias_type/aux_type/GB_s columns, Gflops at index -3
    # Using header-based lookup ensures compatibility with both formats
    # and any future column additions.
Since the implementation no longer relies on positional indices, the comment stating exact positions (e.g., 'Gflops at index -2/-3') can become misleading as formats evolve. Consider removing the index claims and focusing the comment on the key guarantee (header-driven lookup + header/data width validation). Also, the comment mentions GB_s but the sample header uses hipblaslt-GB/s.
Suggested change:

    - # This is needed because hipBLASLt output format varies across versions:
    - # - v600 (old): 23 columns, Gflops at index -2
    - # - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
    - #   bias_type/aux_type/GB_s columns, Gflops at index -3
    - # Using header-based lookup ensures compatibility with both formats
    - # and any future column additions.
    + # This is needed because the hipBLASLt output format varies across versions:
    + # - v600 (old): 23 columns.
    + # - v1500 (new): 34 columns, adding a_type/b_type/c_type/scaleA-D/amaxD/
    + #   bias_type/aux_type/hipblaslt-GB/s columns.
    + # Using header-based lookup (plus header/data width validation) ensures
    + # compatibility across existing formats and resilience to future changes.
Codecov Report
❌ Patch coverage is
Additional details and impacted files

| Coverage Diff | main | #791 | +/- |
|---|---|---|---|
| Coverage | 85.70% | 85.71% | |
| Files | 102 | 102 | |
| Lines | 7703 | 7712 | +9 |
| Hits | 6602 | 6610 | +8 |
| Misses | 1101 | 1102 | +1 |
Description
The hipblaslt-gemm benchmark result parser fails with MICROBENCHMARK_RESULT_PARSING_FAILURE (return code 33) when running against hipBLASLt v1500+. The benchmark kernels execute successfully and produce valid TFLOPS data, but SuperBench cannot parse the results into structured metrics.
Root cause: The parser hardcodes len(fields) != 23 to validate the output CSV, but hipBLASLt v1500 outputs 34 columns, having added a_type, b_type, c_type, d_type, scaleA, scaleB, scaleC, scaleD, amaxD, bias_type, aux_type, and hipblaslt-GB/s. This causes two bugs:
Fix
Replace the hardcoded field-count check and positional index with header-based column lookup:
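
A minimal sketch of header-based lookup with width validation follows. It is not the PR's actual implementation: `parse_gflops` is a hypothetical helper, and the headers and rows are shortened, made-up samples (only the `hipblaslt-Gflops` and `hipblaslt-GB/s` column names, the `]`/`:` prefix handling, and the value 678.209 are taken from this page).

```python
def parse_gflops(header_line, data_line, column='hipblaslt-Gflops'):
    """Look up `column` by name in the header and return its value from the data row."""
    header = [field.strip() for field in header_line.split(',')]
    # Drop a leading "[n]:" style prefix, mirroring the snippet under review.
    header[0] = header[0].split(']')[-1].lstrip(':')
    data = [field.strip() for field in data_line.split(',')]

    # Validate the row against the header instead of a hardcoded column count.
    if len(header) != len(data):
        raise ValueError(f'expected {len(header)} fields, got {len(data)}')
    if column not in header:
        raise ValueError(f'column {column!r} not found in header')
    return float(data[header.index(column)])


# Shortened, hypothetical samples: Gflops sits at index -2 in the old layout
# and at index -3 in the new one, as described in the reviewed comment.
old_header = '[0]:transA,transB,m,n,k,hipblaslt-Gflops,us'
old_row = 'N,N,4096,4096,4096,678.209,202.6'
new_header = '[0]:transA,transB,m,n,k,a_type,hipblaslt-Gflops,hipblaslt-GB/s,us'
new_row = 'N,N,4096,4096,4096,f16_r,678.209,744.1,202.6'

print(parse_gflops(old_header, old_row))  # 678.209
print(parse_gflops(new_header, new_row))  # 678.209
```

Because the column is found by name, the same function handles both layouts, and a row whose width disagrees with its header raises an error instead of yielding a value from the wrong position.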
This approach is: