Mathias's Patch: Perf tuning for gcc + aarch64 #2

Merged
erin2722 merged 1 commit into v1.1.10 from mathias-perf-tuning on Feb 7, 2024

Conversation

erin2722
Collaborator

Applying google#176 to our fork so that we can upgrade with perf benefits now.

@erin2722 erin2722 force-pushed the mathias-perf-tuning branch from 911f364 to 90b7d1b Compare February 5, 2024 15:18
std::memmove(dst + kShortMemCopy,
static_cast<const uint8_t*>(src) + kShortMemCopy,
64 - kShortMemCopy);
FixedSizeMemMove<kShortMemCopy>(

[Q] We've already copied kShortMemCopy bytes on line 1126. This code used to copy the remaining bytes if size > kShortMemCopy by copying 64 - kShortMemCopy bytes. The new code, however, always copies kShortMemCopy, assuming 2 * kShortMemCopy == 64, right? If so, do we need to have a static assertion verifying this?

It is defined as 32 on line 1100, 30 lines up. Adding a static assert that 32*2 == 64 seems silly. Also the original AVX impl was relying on this as well.
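For readers following the exchange above, here is a minimal sketch of what the suggested compile-time check could look like. It is not the code from the patch: `FixedSizeMemMoveSketch` and `Copy64InTwoHalves` are hypothetical names invented for illustration, and `kShortMemCopy = 32` is taken from the reply above rather than from the source file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed value: the thread states that kShortMemCopy is defined as 32.
constexpr std::size_t kShortMemCopy = 32;

// Hypothetical stand-in for the FixedSizeMemMove helper named in the diff.
// It copies through a temporary buffer, so a single call tolerates
// overlapping source and destination ranges the way memmove does.
template <std::size_t kSize>
inline void FixedSizeMemMoveSketch(void* dst, const void* src) {
  unsigned char tmp[kSize];
  std::memcpy(tmp, src, kSize);
  std::memcpy(dst, tmp, kSize);
}

// Copies exactly 64 bytes as two fixed-size halves (hypothetical helper).
inline void Copy64InTwoHalves(std::uint8_t* dst, const std::uint8_t* src) {
  // The second call copies kShortMemCopy bytes, not 64 - kShortMemCopy,
  // so the two are only equivalent when 2 * kShortMemCopy == 64.
  static_assert(2 * kShortMemCopy == 64,
                "second half must cover the remaining 64 - kShortMemCopy bytes");
  FixedSizeMemMoveSketch<kShortMemCopy>(dst, src);
  FixedSizeMemMoveSketch<kShortMemCopy>(dst + kShortMemCopy,
                                        src + kShortMemCopy);
}
```

The assertion costs nothing at run time; whether it is worth the extra line when the constant is defined a few dozen lines above is the judgment call discussed in this thread.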

@samanca samanca left a comment

I just have one question for @RedBeard0531, otherwise LGTM.

@erin2722 erin2722 merged commit 80c9300 into v1.1.10 Feb 7, 2024
3 participants