bf16xint16 fails on AMD ROCm with tensor comparison assertion error #797

**Describe the bug**
The test compares a Helion bf16×int16 GEMM kernel against a PyTorch reference implementation. On ROCm the two results differ slightly, by more than the comparison tolerance allows, so the tensor comparison assertion fails.

**To Reproduce**
Check out PR https://github.com/pytorch/helion/pull/794 and remove the `@skipIfRocm` decorator from `test_bf16xint16` in `test/test_examples.py`. Then run the test in a ROCm environment.
Failed CI job: https://github.com/pytorch/helion/actions/runs/18213179448/job/51857480348

**Expected behavior**
The test should pass on ROCm, since both implementations perform the same mathematical operation:
  - Helion kernel: converts int16 -> bf16 inside the Triton kernel, then performs the GEMM
  - PyTorch reference: converts int16 -> bf16 with PyTorch, then performs `torch.matmul`
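For context on why two mathematically identical paths can still disagree numerically, here is a minimal pure-Python sketch. It is not a diagnosis of the ROCm failure: `to_bf16`, `dot_round_once`, and `dot_round_each_step` are hypothetical helpers written for this illustration, and the sketch makes no claim about how Helion, Triton, or `torch.matmul` actually accumulate. It only shows that bf16 keeps 8 significant bits, so the point at which rounding happens can change the result of the same dot product.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even). Finite inputs only."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 is the upper 16 bits of float32; add a bias so the
    # truncation below rounds to nearest, with ties going to even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot_round_once(a, b):
    """Accumulate in high precision and round to bf16 only at the end."""
    total = sum(to_bf16(float(x)) * to_bf16(float(y)) for x, y in zip(a, b))
    return to_bf16(total)


def dot_round_each_step(a, b):
    """Round the running sum to bf16 after every addition."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_bf16(acc + to_bf16(float(x)) * to_bf16(float(y)))
    return acc


# Illustrative inputs (not taken from the failing test).
a = [16.0, 1.0, 1.0]  # stands in for the bf16 operand
b = [16, 1, 1]        # stands in for the int16 operand

print(to_bf16(257.0))             # 257 is not representable in bf16
print(dot_round_once(a, b))
print(dot_round_each_step(a, b))
```

With these inputs the two accumulation strategies return different values even though both compute the same sum on paper, which is the kind of gap a fixed comparison tolerance has to absorb.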

