Updated FIR to add __shared__ optimization of Base_HIP variant. #543

Open
mxxw wants to merge 1 commit into llnl:develop from mxxw:wagner60/elcap_774_opt

Conversation

mxxw commented Sep 10, 2025

Modified RAJAPerf/src/apps/{FIR-Hip.cpp,FIR.hpp} so that the Base_HIP
variant uses shared (LDS) memory, which reduces pressure on the
vL1D/L2 caches. This resulted in a > 1.5x speedup under ROCm-6.4.0
for --size 100000000 (1E8).
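
For readers who have not opened the diff, here is a rough host-side C++ sketch of the general tile-plus-halo staging idea that a `__shared__` FIR kernel typically relies on. It is not the code from this PR. The filter length of 16, the block size of 256, the form `out[i] = sum_j coeff[j] * in[i + j]`, and the names `fir_reference` and `fir_tiled` are all assumptions made for illustration. The `tile` buffer stands in for a per-block `__shared__` array, and each outer loop iteration stands in for one thread block.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kCoeffLen  = 16;   // assumed filter length (hypothetical)
constexpr std::size_t kBlockSize = 256;  // assumed threads per block (hypothetical)

// Untiled reference: every output reads its inputs directly from `in`,
// so neighbouring outputs re-read mostly the same elements.
std::vector<double> fir_reference(const std::vector<double>& in,
                                  const std::vector<double>& coeff) {
  if (coeff.empty() || in.size() < coeff.size()) return {};
  const std::size_t n = in.size() - coeff.size() + 1;
  std::vector<double> out(n);
  for (std::size_t i = 0; i < n; ++i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < coeff.size(); ++j) sum += coeff[j] * in[i + j];
    out[i] = sum;
  }
  return out;
}

// Tiled version: each "block" first copies its inputs plus a halo of
// coeff.size() - 1 elements into a small local buffer, then computes
// all of its outputs from that buffer only.
std::vector<double> fir_tiled(const std::vector<double>& in,
                              const std::vector<double>& coeff) {
  if (coeff.empty() || in.size() < coeff.size()) return {};
  const std::size_t n = in.size() - coeff.size() + 1;
  std::vector<double> out(n);
  std::vector<double> tile(kBlockSize + coeff.size() - 1);  // stand-in for __shared__

  for (std::size_t block_start = 0; block_start < n; block_start += kBlockSize) {
    const std::size_t block_len = std::min(kBlockSize, n - block_start);
    const std::size_t tile_len  = block_len + coeff.size() - 1;

    // Cooperative load: on a GPU, the threads of the block would share this copy.
    std::copy(in.begin() + block_start, in.begin() + block_start + tile_len,
              tile.begin());
    // A GPU kernel would need a block-wide barrier (__syncthreads()) here.

    // Each loop iteration plays the role of one thread in the block.
    for (std::size_t t = 0; t < block_len; ++t) {
      double sum = 0.0;
      for (std::size_t j = 0; j < coeff.size(); ++j) sum += coeff[j] * tile[t + j];
      out[block_start + t] = sum;
    }
  }
  return out;
}
```

In this sketch each input element is fetched from the large array about once per block instead of up to `kCoeffLen` times, which is the kind of traffic reduction the description attributes to LDS. Calling `fir_tiled(in, coeff)` should give the same values as `fir_reference(in, coeff)` for any input at least as long as the filter.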
