new api changes for unit tests for supported configurations #683

Open
rajprince-intel wants to merge 7 commits into intel:main from rajprince-intel:rajprinc/unit-test-new-api-changes
rajprince-intel commented Jan 6, 2026

Description

Migrate the GEMM unit tests below from the legacy Intel Xe-specific APIs to the new generic APIs.

Changes:

  • Grouped GEMM: gemm_universal_mainloopintelxexmx16group.cpp
  • Epilogue Fusion GEMM: gemm_universal_lincomb_per_rowbias_eltact.cpp

Type

  • Bug
  • Feature
  • Performance
  • Refactor

Testing

  • Tests pass
  • Xe12
  • Xe20

Performance

| Metric | Before | After |
|---|---|---|
| Performance | No change (API migration only) | No change |

References

Related to generic API modernization effort.

Checklist

  • Copyright headers preserved (Intel 2025)
  • Co-pilot Review (self-reviewed migration)
  • Deprecated APIs not used (migrated away from legacy APIs)

Quick Reference

| Component | Grouped GEMM | Epilogue Fusion GEMM |
|---|---|---|
| File | gemm_universal_mainloopintelxexmx16group.cpp | gemm_universal_lincomb_per_rowbias_eltact.cpp |
| GmemTiledCopy | XE_2D_U16x32x32_* → void | XE_2D_U16x32x32_* → void |
| MMA Atom | XE_8x16x16_* → XE_DPAS_TT<8, float, bfloat16_t> | XE_8x16x16_* → XE_DPAS_TT<8, float, bfloat16_t> |
| GEMM Dispatch | MainloopIntelXeXMX16Group → MainloopXeL1StagedGroup | MainloopIntelXeXMX16 → MainloopXeL1Staged |
| Epilogue Dispatch | IntelXeXMX16Group (unchanged) [based on example09] | IntelXeXMX16 → IntelXeGeneric |
| Epilogue Tile | N/A [based on example09] | Add void as 3rd param |
| Epilogue Atoms | Explicit (unchanged) [based on example09] | Explicit → void |
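
For readers unfamiliar with dispatch policies: the GEMM Dispatch and Epilogue Dispatch rows are tag types that select which collective implementation gets instantiated, so swapping one alias moves a test onto the new code path. A minimal sketch of that pattern follows. Every name in it (`LegacyMainloopSketch`, `StagedMainloopSketch`, `CollectiveSketch`) is a hypothetical stand-in, not a CUTLASS type, and the stage count is an assumed value.

```cpp
// Hypothetical stand-ins for the dispatch policy tags; NOT the CUTLASS types.
template <int Stages_>
struct LegacyMainloopSketch { static constexpr int Stages = Stages_; };

template <int Stages_>
struct StagedMainloopSketch { static constexpr int Stages = Stages_; };

// Primary template is left undefined: an unknown policy fails to compile.
template <class DispatchPolicy>
struct CollectiveSketch;

// One partial specialization per policy tag.
template <int Stages>
struct CollectiveSketch<LegacyMainloopSketch<Stages>> {
  static constexpr bool uses_legacy_path = true;
  static constexpr int stages = Stages;
};

template <int Stages>
struct CollectiveSketch<StagedMainloopSketch<Stages>> {
  static constexpr bool uses_legacy_path = false;
  static constexpr int stages = Stages;
};

constexpr int PipelineStages = 2;  // assumed value, for illustration only

// Changing only the policy alias selects a different specialization.
using Before = CollectiveSketch<LegacyMainloopSketch<PipelineStages>>;
using After  = CollectiveSketch<StagedMainloopSketch<PipelineStages>>;

static_assert(Before::uses_legacy_path);
static_assert(!After::uses_legacy_path);
static_assert(After::stages == PipelineStages);
```

The stage count passes through unchanged, which matches the one-line policy swaps in this PR where `PipelineStages` is kept as-is.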

Code Examples

Grouped GEMM Changes

- using GmemTiledCopyA = XE_2D_U16x32x32_LD_N;
+ using GmemTiledCopyA = void;

- MMA_Atom<XE_8x16x16_F32BF16BF16F32_TT>
+ MMA_Atom<XE_DPAS_TT<8, float, cute::bfloat16_t>>

- using GEMMDispatchPolicy = cutlass::gemm::MainloopIntelXeXMX16Group<PipelineStages>;
+ using GEMMDispatchPolicy = cutlass::gemm::MainloopXeL1StagedGroup<PipelineStages>;
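
Replacing an explicit copy atom with `void` reads as "let the collective choose". One common way to implement such a placeholder is sketched below, assuming the library maps `void` to a default chosen internally. The names `DefaultCopyAtomSketch`, `ExplicitCopyAtomSketch`, `resolve_or_default_t`, and `MainloopSketch` are hypothetical, and this is not a description of how CUTLASS actually resolves the parameter.

```cpp
#include <type_traits>

// Hypothetical stand-ins; these are NOT the CUTLASS copy atoms.
struct DefaultCopyAtomSketch {};   // what the library would pick on its own
struct ExplicitCopyAtomSketch {};  // a hand-picked atom, as in the legacy tests

// Resolve a requested type: void means "choose the default for me".
template <class Requested, class Default>
using resolve_or_default_t =
    std::conditional_t<std::is_void_v<Requested>, Default, Requested>;

// A mainloop-like template whose copy atom may be left as void.
template <class GmemTiledCopyA = void>
struct MainloopSketch {
  using CopyA = resolve_or_default_t<GmemTiledCopyA, DefaultCopyAtomSketch>;
};

// void falls back to the default; an explicit atom is passed through.
static_assert(std::is_same_v<MainloopSketch<void>::CopyA,
                             DefaultCopyAtomSketch>);
static_assert(std::is_same_v<MainloopSketch<ExplicitCopyAtomSketch>::CopyA,
                             ExplicitCopyAtomSketch>);
```

Under this design the test no longer hard-codes a tile-specific atom, so the same test source can follow whatever default the collective selects.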

Epilogue Fusion GEMM Changes

- using GEMMDispatchPolicy = cutlass::gemm::MainloopIntelXeXMX16<PipelineStages>;
+ using GEMMDispatchPolicy = cutlass::gemm::MainloopXeL1Staged<PipelineStages>;

- using EpilogueDispatchPolicy = cutlass::epilogue::IntelXeXMX16;
+ using EpilogueDispatchPolicy = cutlass::epilogue::IntelXeGeneric;

  using CollectiveEpilogue = cutlass::epilogue::collective::CollectiveEpilogue<
      EpilogueDispatchPolicy, TileShape,
+     void,                    // NEW: Epilogue tile (automatic)
      ElementAccumulator, ..., FusionCallBacks,
-     XE_2D_U32x8x16_LD_N, void, void, XE_2D_U32x8x16_ST_N, void, void>;
+     void, void>;             // Automatic load/store atoms
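
Since this is an API-only migration, the fused epilogue should compute the same result before and after. A host-side reference for a linear combination with per-row bias and elementwise activation, D = act(alpha * acc + beta * C + bias[row]), can be sketched as below. The function name `reference_epilogue`, the row-major layout, and the choice of ReLU as the activation are assumptions made for illustration; the PR does not state which activation the test instantiates.

```cpp
#include <array>
#include <cstddef>

// Elementwise activation for this sketch (ReLU is an assumption).
constexpr float relu(float x) { return x > 0.0f ? x : 0.0f; }

// Hypothetical host reference, row-major M x N:
//   D[m][n] = relu(alpha * acc[m][n] + beta * C[m][n] + bias[m])
// M and N must be given explicitly at the call site.
template <std::size_t M, std::size_t N>
constexpr std::array<float, M * N> reference_epilogue(
    const std::array<float, M * N>& acc,
    const std::array<float, M * N>& c,
    const std::array<float, M>& bias,
    float alpha, float beta) {
  std::array<float, M * N> d{};
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      const std::size_t idx = m * N + n;  // row-major index
      // The bias is indexed by row, so every column in a row shares it.
      d[idx] = relu(alpha * acc[idx] + beta * c[idx] + bias[m]);
    }
  }
  return d;
}
```

A check of this shape, run against the device output with a suitable tolerance, is one way to confirm that switching dispatch policies and atoms left the numerics unchanged.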

aschabana previously approved these changes Jan 6, 2026
@rajprince-intel rajprince-intel marked this pull request as ready for review January 16, 2026 09:28
5 participants

Comments