[SYCL] Use wide alignment for sycl::marray to improve performance#22275
[SYCL] Use wide alignment for sycl::marray to improve performance#22275uditagarwal97 wants to merge 3 commits into
sycl::marray to improve performance#22275Conversation
sycl::marray to improve performance
There was a problem hiding this comment.
Pull request overview
This PR introduces an opt-in (preview-breaking-changes) ABI-impacting alignment change for sycl::marray storage to improve vectorized access performance, and adds test coverage to validate the resulting IR/ABI differences plus an informational benchmark to compare sycl::vec vs sycl::marray.
Changes:
- Add a constexpr alignment computation for
sycl::marrayand (under preview-breaking-changes) applyalignas(...)to its storage. - Add an LLVM IR check test covering marray ops with and without preview-breaking-changes to validate alignment/ABI effects.
- Add a manual (UNSUPPORTED) e2e microbenchmark that prints timing comparisons between
sycl::vecandsycl::marray.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| sycl/include/sycl/marray.hpp | Introduces wide-alignment logic (preview-guarded) via a new alignment computation helper and alignas storage. |
| sycl/test/check_device_code/marray/marray_ops.cpp | New IR-check test validating marray ABI/alignment differences between default and preview builds. |
| sycl/test-e2e/Performance/vec_vs_marray.cpp | New manual performance comparison tool (UNSUPPORTED) printing vec vs marray timings across types/sizes. |
|
|
||
| template <typename T> struct is_device_copyable; | ||
|
|
||
| #ifdef __INTEL_PREVIEW_BREAKING_CHANGES |
There was a problem hiding this comment.
this breaks my heart. Do we have customers actually using the preview breaking changes flag?
I guess there is nothing to be done about it, and come 2027.0 we'll once again sweep up all this ABI breaking stuff in, but I wish we had a better approach. We work hard to get these improvements into the product, but then hide them where no one will find them.
There was a problem hiding this comment.
True, it's sad that we can't make this change default because it will break the ABI :(
| // The alignment guarantee is limited to 64 bytes because some host compilers | ||
| // (e.g. on Microsoft Windows) limit the maximum alignment of function | ||
| // parameters to this value. | ||
| constexpr std::size_t marrayAlignment(std::size_t NumElements, |
There was a problem hiding this comment.
probably not worth the effort but is there anyway vec and marray can use the SAME affordance for calculating alignment instead of duplicating the code in two different places?
There was a problem hiding this comment.
Not sure if we should do that because (1) sycl::vec will be deprecated in future in favor of sycl::marray (2) Alignment calculation of sycl::vec and sycl::marray is different because, unlike sycl::vec, sycl::marray can accept any arbitrary size
cperkinsintel
left a comment
There was a problem hiding this comment.
had some non-blocking commentary.
|
@intel/llvm-gatekeepers please consider merging |
|
Changing the PR to draft for the time being because of some ongoing internal discussions about sycl::marray alignment. |
Currently,
sycl::marrayusessizeof(DataType)alignment whilesycl::vecusessizeof(DataType)*NumElementsalignment, that's why for 16- and 8-bit data types, IGC is able to vectorizesycl::veccode but notsycl::marray. This caused performance regression when migrating fromsycl::vectosycl::marray.This PR fixes
sycl::marrayalignment.