CPP implementations of Conv kernels #3

Open
Rohanjames1997 wants to merge 24 commits into main from conv_cpp
Rohanjames1997 commented Jun 23, 2025

This PR needs to implement four Conv kernels and three Pooling kernels in C++ with intrinsics, and then modify them to use BF16 MMLA kernels. The kernels are listed below, with a test command where one is available:

  • MlasConvPointwiseFloatKernelNeon. Test: ./build/Linux/Debug/onnxruntime_mlas_test --gtest_filter=Conv2dNchwc*/KH1/KW1/*
  • MlasConvDepthwiseFloatKernelNeon. Test: ./build/Linux/Debug/onnxruntime_mlas_test --gtest_filter=Conv2dNchwc*/Cpg1/Fpg1/*
  • MlasConvNchwcFloatKernelNeon
  • MlasConvNchwFloatKernelNeon
  • MlasPoolMaximumFloatKernelNeon
  • MlasPoolAverageExcludePadFloatKernelNeon
  • MlasPoolAverageIncludePadFloatKernelNeon
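
As a rough illustration of the "C++ + intrinsics" style these kernels target, here is a minimal sketch of the inner loop of a pointwise (1x1) convolution for a single output pixel and one block of four output channels. The function name `PointwiseBlockSketch`, the constant `kBlockSize`, and the filter layout (`filter[c * kBlockSize + j]`) are assumptions made for this example; this is not the MLAS kernel signature, and it omits strides, bias, and activation handling. A scalar fallback is included so the sketch also builds on non-Arm hosts.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical block width: four output channels fit one float32x4_t register.
constexpr std::size_t kBlockSize = 4;

// Computes one output pixel of a 1x1 (pointwise) convolution for a single
// block of kBlockSize output channels:
//   output[j] = sum over c of input[c] * filter[c * kBlockSize + j]
// Hypothetical helper for illustration only; not the MLAS kernel interface.
void PointwiseBlockSketch(const float* input, const float* filter,
                          float* output, std::size_t input_channels) {
    assert(input != nullptr && filter != nullptr && output != nullptr);
#if defined(__ARM_NEON)
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (std::size_t c = 0; c < input_channels; ++c) {
        // acc += filter_row * input[c], broadcasting the scalar input value.
        acc = vmlaq_n_f32(acc, vld1q_f32(filter + c * kBlockSize), input[c]);
    }
    vst1q_f32(output, acc);
#else
    // Scalar fallback with the same arithmetic, one lane at a time.
    float acc[kBlockSize] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t c = 0; c < input_channels; ++c) {
        for (std::size_t j = 0; j < kBlockSize; ++j) {
            acc[j] += filter[c * kBlockSize + j] * input[c];
        }
    }
    for (std::size_t j = 0; j < kBlockSize; ++j) {
        output[j] = acc[j];
    }
#endif
}
```

Each input channel contributes one multiply-accumulate across all four output lanes, which is the part a later BF16 MMLA variant would presumably replace with a matrix multiply-accumulate instruction.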
