[WIP] Replace shift_group_left with reduce_over_group in Reduce Sum #3987
Open
CuiYifeng wants to merge 1 commit into
8bd3f60 to b3d75f5
This pull request refactors the reduction utilities in the SYCL backend so that they use native SYCL reduction operations when available, improving performance and maintainability. The most important changes are grouped below:

Support for Native SYCL Reduction Operations
- Added `get_native_sycl_op` and `native_sycl_op_t` type traits to detect and propagate native SYCL reduction operators from functors, enabling the use of optimized native reductions when possible.
- Updated `SumFunctor` to define a `native_sycl_op` (`sycl::plus`), allowing it to leverage SYCL's built-in reduction.

Refactoring of Reduction Logic
- Added a `tree_reduce` function, which uses the native SYCL operator for floating-point types if one is available, and otherwise falls back to a manual reduction loop. It replaces the previous manual reduction code in `group_reduce` and `group_x_reduce`.

Integration and Propagation of Native Operators
- Updated `func_wrapper_t` and the reduction kernel logic to propagate the detected native SYCL operator through the reduction pipeline, so that the reduction method is selected at compile time.

These changes collectively make the reduction code more generic and extensible, and let it use hardware-accelerated reductions where possible while falling back to custom logic otherwise.
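
The detection and dispatch described above can be illustrated with a host-only sketch. It is not the code from this PR: `no_native_op`, `MaxAbsFunctor`, and `tree_reduce_sketch` are hypothetical names, `std::plus<>` stands in for `sycl::plus<>`, and a plain loop over a vector stands in for the work-group. In device code, the native branch would presumably call `sycl::reduce_over_group(group, value, op)` instead.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical marker type meaning "this functor declares no native operator".
struct no_native_op {};

// Primary template: used when Func has no nested native_sycl_op alias.
template <typename Func, typename = void>
struct get_native_sycl_op {
  using type = no_native_op;
};

// Specialization selected when Func declares a nested native_sycl_op alias.
template <typename Func>
struct get_native_sycl_op<Func, std::void_t<typename Func::native_sycl_op>> {
  using type = typename Func::native_sycl_op;
};

template <typename Func>
using native_sycl_op_t = typename get_native_sycl_op<Func>::type;

// Stand-in for SumFunctor; std::plus<> plays the role of sycl::plus<>.
template <typename T>
struct SumFunctor {
  using native_sycl_op = std::plus<>;
  T operator()(T a, T b) const { return a + b; }
};

// Hypothetical functor with no native counterpart.
template <typename T>
struct MaxAbsFunctor {
  T operator()(T a, T b) const {
    T x = a < 0 ? -a : a;
    T y = b < 0 ? -b : b;
    return x > y ? x : y;
  }
};

// Host-side sketch of the dispatch. Precondition: vals is non-empty.
// used_native (optional) reports which branch was compiled in.
template <typename T, typename Func>
T tree_reduce_sketch(std::vector<T> vals, Func f, bool* used_native = nullptr) {
  using native_op = native_sycl_op_t<Func>;
  constexpr bool use_native = std::is_floating_point_v<T> &&
                              !std::is_same_v<native_op, no_native_op>;
  if (used_native) *used_native = use_native;

  if constexpr (use_native) {
    // Native path: on a device this is where a group algorithm such as
    // sycl::reduce_over_group would be called with native_op{}.
    native_op op{};
    T acc = vals[0];
    for (std::size_t i = 1; i < vals.size(); ++i) acc = op(acc, vals[i]);
    return acc;
  } else {
    // Fallback path: manual tree reduction, halving the active range
    // each step and combining element i with element i + half.
    for (std::size_t n = vals.size(); n > 1; n = (n + 1) / 2) {
      const std::size_t half = (n + 1) / 2;
      for (std::size_t i = 0; i + half < n; ++i) {
        vals[i] = f(vals[i], vals[i + half]);
      }
    }
    return vals[0];
  }
}
```

Because the choice is made with `if constexpr`, only one branch is instantiated per functor and element type, so a functor without a `native_sycl_op` alias never needs to be usable with the native path.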