Commit 2f25b8a

Refactor AArch64 Interpolation Filter 16x16 implementation (#431)
* Move InterpolationFilter{ARM.h => _neon.cpp}

  Since this header is only used in one place and would not share any code with an eventual SVE implementation, simply move it to a .cpp file similar to MCTF.cpp.

* Refactor simdFilter16xX_N8_neon

  The use of the vsrcv temporary array rather than simple local variables meant that LLVM emitted an unnecessary number of load/store instructions in the inner loops. Refactoring this to make the dependency between loop iterations more explicit allows for much nicer generated code.

  Running a video encoding job on a Neoverse V2 machine using the --preset=fast setting shows a ~1.8% improvement in reported FPS.
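
The shape of the second change can be sketched in portable scalar C++. This is an illustration only, not the code from the commit: the function names `filterWithArray` and `filterWithLocals`, the `Row` alias standing in for a NEON vector register, and the 4-tap window (the real kernel is 8-tap) are all assumptions made for brevity. The first variant keeps the sliding window of source rows in a temporary array that is shifted every iteration; the second carries the same rows in named locals, so the dependency from one iteration to the next is spelled out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for a vector register holding one source row.
using Row = int32_t;

// Variant A: the sliding window lives in a temporary array that is shifted
// by one element on every loop iteration.
std::vector<int32_t> filterWithArray(const std::vector<Row>& src, const int32_t (&coeff)[4])
{
  std::vector<int32_t> dst;
  if (src.size() < 4) return dst;

  // Slots 1..3 are preloaded; the first shift moves them into slots 0..2.
  Row window[4] = { 0, src[0], src[1], src[2] };
  for (std::size_t row = 3; row < src.size(); ++row)
  {
    for (int i = 0; i < 3; ++i) window[i] = window[i + 1];
    window[3] = src[row];

    int32_t sum = 0;
    for (int i = 0; i < 4; ++i) sum += coeff[i] * window[i];
    dst.push_back(sum);
  }
  return dst;
}

// Variant B: the same window carried in named locals. Each iteration loads
// exactly one new row and hands the other three on to the next iteration.
std::vector<int32_t> filterWithLocals(const std::vector<Row>& src, const int32_t (&coeff)[4])
{
  std::vector<int32_t> dst;
  if (src.size() < 4) return dst;

  Row s0 = src[0], s1 = src[1], s2 = src[2];
  for (std::size_t row = 3; row < src.size(); ++row)
  {
    const Row s3 = src[row];
    dst.push_back(coeff[0] * s0 + coeff[1] * s1 + coeff[2] * s2 + coeff[3] * s3);

    // Rotate the window: the explicit loop-carried dependency.
    s0 = s1;
    s1 = s2;
    s2 = s3;
  }
  return dst;
}
```

Both variants compute the same result. Whether a compiler actually keeps the array version in registers depends on the compiler, the element type and the surrounding code, so this sketch shows only the source-level difference and does not reproduce the code generation described in the commit message.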
1 parent 7acfaba commit 2f25b8a

2 files changed: 287 additions, 300 deletions

source/Lib/CommonLib/arm/InterpolationFilterARM.h

Lines changed: 0 additions & 299 deletions
This file was deleted.