Added ZA lazy save to kai_matmul_clamp_f32_qsi8d32p1x4_qsi4c32p4vlx4_1x4vl_sme_dot
Fix QAI8/QSI8CXP matmul test failures by constraining generated qsi32 bias values to preserve int32 accumulator headroom.
Fix a clamping issue in matmul_clamp_qai8_qai8p_qai8p_test.cpp
Fix traditional matmul and imatmul packed offset helpers to use packing panel boundaries.
New Advanced SIMD micro-kernels
Matrix multiplication (MxN and 1xN) micro-kernels with QAI8DXP LHS, QSU2CXP RHS, and F32 output, optimized for FEAT_DotProd, along with the RHS packing kernel.
Documentation
Contribution policy updates as part of third-party contribution enablement
Added coding standard and conventions
New Transposed-B RHS packing micro-kernel versions of kai_rhs_pack_kxn_x32p16x1b_x32_x32_neon and kai_rhs_pack_kxn_x16p32x1b_x16_x16_neon:
kai_rhs_pack_nxk_x16p32x1bx16_x16_x16_neon
kai_rhs_pack_nxk_x32p16x1bx32_x32_x32_neon
New SME2 FP32 GEMV micro-kernel with 4vsx1 RHS format
New SME2 static Int8 GEMM/GEMV kernels and the RHS packing kernel.
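The QAI8/QSI8CXP test fix listed above constrains generated bias values so the int32 accumulator keeps headroom. A minimal sketch of that arithmetic follows, assuming a K-deep dot product of signed 8-bit operands; the helper name `max_abs_bias` and the symmetric, conservative bound are illustrative assumptions, not the logic the test suite actually uses.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of the library): largest |bias| that cannot
// overflow an int32 accumulator when added to a K-deep int8 x int8 dot product.
constexpr std::int64_t max_abs_bias(std::int64_t k) {
    // Largest magnitude of a single product: (-128) * (-128) = 16384.
    constexpr std::int64_t max_abs_product = 128 * 128;
    // Worst-case accumulated magnitude, computed in 64 bits to avoid overflow.
    const std::int64_t worst_case_acc = k * max_abs_product;
    const std::int64_t limit = std::numeric_limits<std::int32_t>::max();
    // If the dot product alone can exceed int32, no bias is safe.
    return worst_case_acc >= limit ? 0 : limit - worst_case_acc;
}

// Example: K = 1024 leaves 2^31 - 1 - 1024 * 16384 of headroom for the bias.
static_assert(max_abs_bias(1024) == 2130706431, "unexpected headroom for K = 1024");
```

Under these assumptions, a test generator could draw bias values from the range [-max_abs_bias(K), max_abs_bias(K)] so that the bias plus the worst-case dot product stays within int32.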