pjbgf (Owner) commented Jul 27, 2025

These changes are a stepping stone towards future SIMD optimisations. Comments have been added throughout the assembly code to make it easier to review and maintain, with special attention to the stack layout.

The block implementations for both architectures previously handled the chunk loop in Go, and also called checkCollision from Go. Both have now been moved to assembly.
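
For reviewers who want a picture of what "the chunk loop in Go" refers to, here is a rough sketch of that shape. It is not the code from this PR: `chunkSize`, `processChunk`, `blockGo` and the toy body of `checkCollision` are all hypothetical stand-ins, and the real routines are implemented in assembly.

```go
package main

import "fmt"

// chunkSize is an assumed value for illustration only.
const chunkSize = 64

// processChunk is a hypothetical stand-in for the per-chunk native routine.
// It only does toy mixing so that the sketch compiles and runs.
func processChunk(state *[5]uint32, chunk []byte) {
	for i, b := range chunk {
		state[i%5] += uint32(b)
	}
}

// checkCollision is a placeholder with a toy rule: it flags a chunk whose
// first byte is 0xFF. The real check is unrelated to this rule.
func checkCollision(chunk []byte) bool {
	return chunk[0] == 0xFF
}

// blockGo walks the input one full chunk at a time, calling the per-chunk
// routine and then the collision check. This is the loop shape that the
// description says has been moved into assembly.
func blockGo(state *[5]uint32, p []byte) (chunks int, collision bool) {
	for len(p) >= chunkSize {
		chunk := p[:chunkSize]
		processChunk(state, chunk)
		if checkCollision(chunk) {
			collision = true
		}
		p = p[chunkSize:]
		chunks++
	}
	return chunks, collision
}

func main() {
	var state [5]uint32
	data := make([]byte, 2*chunkSize+2) // two full chunks plus a 2-byte tail
	chunks, collision := blockGo(&state, data)
	fmt.Println("chunks:", chunks, "collision:", collision)
}
```

With the loop in Go, every chunk costs at least one call across the Go/assembly boundary; moving the loop and the check into assembly means a single call per input buffer.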

pjbgf added 6 commits July 27, 2025 13:52
The native implementations of the DV mask calculation were missing
the noescape directives. For further optimisation, the wrapping funcs
are now marked with nosplit.

Signed-off-by: Paulo Gomes <[email protected]>
The recent changes seem to have caused a bug when -race is enabled.
Additional tests are under way to understand where the problem lies.
Once the SIMD implementation is in place, this needs to be reverted.

Signed-off-by: Paulo Gomes <[email protected]>

pjbgf commented Aug 19, 2025

Superseded by #198, due to the additional complexity this PR was introducing.

@pjbgf pjbgf closed this Aug 19, 2025