Commit 510dd14
S390x simd implementation (#25757)
### Description
This change adds SIMD-optimized implementations of MLAS functions for s390x.
The implementation is modeled on the corresponding functions for ppc64le.
#### Build System Integration (onnxruntime_mlas.cmake):
* Adds a new S390X flag to the CMake build system to detect the target
architecture.
* Includes new source files specific to s390x (SgemmKernel.cpp,
DgemmKernel.cpp, Quantize.cpp, qgemm_kernel_zvector.cpp, etc.).
* Sets the necessary compiler flags (-mvx, -mzvector, -march=z15) to
enable z/Vector extensions.
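The compiler flags matter because z/Vector code only compiles when the vector language extension is switched on. The following is a minimal sketch of how a source file could guard such code, not code from this commit. It assumes the compiler defines `__VEC__` under `-mzvector` and that `<vecintrin.h>` provides `vec_splats`, `vec_xl`, and `vec_xst`. The names `ScaleAdd` and `SKETCH_HAVE_ZVECTOR` are hypothetical. On any other target the function compiles to the plain scalar loop.

```cpp
#include <cstddef>

#if defined(__s390x__) && defined(__VEC__)
#include <vecintrin.h>
#define SKETCH_HAVE_ZVECTOR 1  // hypothetical macro, used only in this sketch
#endif

// Computes y[i] += alpha * x[i] for i in [0, n).
void ScaleAdd(float alpha, const float* x, float* y, std::size_t n) {
  std::size_t i = 0;
#if defined(SKETCH_HAVE_ZVECTOR)
  // A 128-bit vector register holds four floats.
  const __vector float va = vec_splats(alpha);
  for (; i + 4 <= n; i += 4) {
    __vector float vx = vec_xl(0, x + i);  // load four elements of x
    __vector float vy = vec_xl(0, y + i);  // load four elements of y
    vy = vy + va * vx;                     // element-wise multiply-add
    vec_xst(vy, 0, y + i);                 // store the result back to y
  }
#endif
  // Scalar tail; on builds without z/Vector this loop handles every element.
  for (; i < n; ++i) {
    y[i] += alpha * x[i];
  }
}
```

Keeping the scalar loop as the tail means one function body serves both builds, so the vector path only has to handle full groups of four elements.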
#### Platform Abstraction (mlasi.h, platform.cpp):
* Defines MLAS_TARGET_S390X and MLAS_ZVECTOR_INTRINSICS for conditional
compilation.
* Integrates the new s390x kernels into the MLAS_PLATFORM dispatch
table.
* platform.cpp now checks for z/Vector support at runtime using
getauxval(AT_HWCAP) and HWCAP_S390_VXE, allowing it to fall back to
scalar implementations if the hardware support is not present.
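The runtime check and fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in platform.cpp. `SketchPlatform`, `DotKernel`, `DotScalar`, and `DotVector` are hypothetical names, and the "vector" kernel here is a stand-in that reuses the scalar loop. Only `getauxval`, `AT_HWCAP`, and `HWCAP_S390_VXE` come from the description. The guards let the sketch compile on other platforms, where it always reports no z/Vector support.

```cpp
#include <cstddef>

#if defined(__linux__) && defined(__s390x__)
#include <sys/auxv.h>
#endif

// Returns true only when the kernel reports the HWCAP_S390_VXE capability bit.
bool HasZVectorSupport() {
#if defined(__linux__) && defined(__s390x__) && defined(HWCAP_S390_VXE)
  return (getauxval(AT_HWCAP) & HWCAP_S390_VXE) != 0;
#else
  return false;  // not s390x Linux, or the macro is unavailable
#endif
}

// Hypothetical kernel signature: dot product of two float arrays.
using DotKernel = float (*)(const float*, const float*, std::size_t);

float DotScalar(const float* a, const float* b, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Stand-in for a z/Vector kernel. A real one would use vector intrinsics;
// this sketch simply forwards to the scalar loop.
float DotVector(const float* a, const float* b, std::size_t n) {
  return DotScalar(a, b, n);
}

// Hypothetical dispatch table: starts with the scalar kernel and upgrades
// to the vector kernel only if the hardware check succeeds.
struct SketchPlatform {
  DotKernel Dot;
  SketchPlatform() : Dot(DotScalar) {
    if (HasZVectorSupport()) {
      Dot = DotVector;
    }
  }
};
```

Callers go through the table (`platform.Dot(a, b, n)`) and never test the hardware themselves, which keeps the capability check in one place.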
#### New Kernel Implementations:
* qgemm_kernel_zvector.cpp: Implements quantized integer matrix
multiplication. This is the core of the performance improvement for
quantized models.
* SgemmKernelZVECTOR.cpp / DgemmKernelZVECTOR.h: Implement single- and
double-precision floating-point GEMM.
* QuantizeZVECTOR.cpp / Quantize.cpp: Implement quantization and
requantization kernels.
* FgemmKernelZVECTOR.h: A generic header providing templates and macros
for both single and double-precision GEMM, similar to the ppc64le
implementation.
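For readers unfamiliar with what the quantization and quantized GEMM kernels compute, here is a scalar reference sketch. It is not taken from this commit. The function names, the row-major layout, and the uint8 operand type are assumptions made for illustration. It shows linear quantization (divide by scale, round to nearest, add zero point, saturate) and an integer matrix multiply that subtracts zero points and accumulates in 32 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: quantizes one float to uint8 with a scale and zero point.
std::uint8_t QuantizeLinearRef(float x, float scale, std::int32_t zero_point) {
  assert(scale > 0.0f);
  // std::nearbyint rounds to nearest even under the default rounding mode.
  const float q = std::nearbyint(x / scale) + static_cast<float>(zero_point);
  return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}

// Hypothetical reference: C = (A - za) * (B - zb) with int32 accumulation.
// A is M x K, B is K x N, C is M x N, all row-major.
std::vector<std::int32_t> QGemmRef(const std::vector<std::uint8_t>& A,
                                   const std::vector<std::uint8_t>& B,
                                   std::size_t M, std::size_t N, std::size_t K,
                                   std::int32_t za, std::int32_t zb) {
  assert(A.size() == M * K && B.size() == K * N);
  std::vector<std::int32_t> C(M * N, 0);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      std::int32_t acc = 0;
      for (std::size_t k = 0; k < K; ++k) {
        const std::int32_t a = static_cast<std::int32_t>(A[m * K + k]) - za;
        const std::int32_t b = static_cast<std::int32_t>(B[k * N + n]) - zb;
        acc += a * b;
      }
      C[m * N + n] = acc;
    }
  }
  return C;
}
```

A scalar reference like this is also a convenient oracle when unit-testing a vectorized kernel, because both must produce identical integer results.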
### Motivation and Context
This change improves performance of onnxruntime on s390x.

1 parent e0569fd commit 510dd14
File tree
28 files changed (+4397, -35 lines)
- cmake
- external
- patches/eigen
- onnxruntime
- core/mlas
- inc
- lib
- s390x
- test/mlas/unittest