Currently, Paddle ERNIE FP32 inference performance on CPU is as follows:
single thread: 251.464 ms
20 threads: 29.8818 ms
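
For reference, a minimal sketch of how these two latencies compare, and how a later INT8 result could be measured against them. It assumes both figures are per-inference latencies in milliseconds; the helper name `speedup` is hypothetical and not part of Paddle.

```python
# FP32 latencies copied from the figures above.
# Assumption: both are milliseconds per inference.
FP32_SINGLE_THREAD_MS = 251.464
FP32_20_THREADS_MS = 29.8818
NUM_THREADS = 20


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_ms <= 0 or candidate_ms <= 0:
        raise ValueError("latencies must be positive")
    return baseline_ms / candidate_ms


# Scaling of the FP32 model from 1 thread to 20 threads.
fp32_scaling = speedup(FP32_SINGLE_THREAD_MS, FP32_20_THREADS_MS)

# Fraction of ideal linear scaling actually achieved.
parallel_efficiency = fp32_scaling / NUM_THREADS

print(f"FP32 scaling, 1 -> {NUM_THREADS} threads: {fp32_scaling:.2f}x")
print(f"Parallel efficiency: {parallel_efficiency:.1%}")
```

Once an INT8 latency is measured with the same thread count, `speedup(fp32_ms, int8_ms)` gives the gain over the matching FP32 baseline; a value above 1.0 means INT8 is faster.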
Our goal is to show that ERNIE gets a performance gain over these FP32 numbers when run with real INT8 kernels.
@Sand3r- @wojtuss Please update your benchmark progress here.
@wzzju @luotao1 Please track the status here.