Handle chunk loop in assembly while aligning both arm64 and amd64 implementations
#400
| Job | Run time |
|---|---|
| | 1m 34s |
| | 59s |
| | 2m 33s |