Reduce allocator usage in Gossip implementation #4640


Open
alexpyattaev wants to merge 3 commits into master

Conversation


@alexpyattaev alexpyattaev commented Jan 26, 2025

Problem

Gossip is torturing the memory allocator too much. A significant share of these allocations occurs in flate2 compression.

Summary of Changes

  • Deflate called inflate for some reason and discarded the result. This is pure waste. Killed the unneeded call.
  • Switched to the bitvec crate to reduce reallocations caused by the lack of an API to read the underlying storage buffer (see the sketch below).
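
For illustration, a minimal sketch of the kind of direct storage access bitvec offers, assuming a BitVec<u8, Lsb0> backing store; this is not the PR's actual code:

use bitvec::prelude::*;

// Sketch only: with bitvec the backing bytes can be borrowed directly,
// so writing the bitfield out does not force a reallocating copy.
fn storage_bytes(bits: &BitVec<u8, Lsb0>) -> &[u8] {
    bits.as_raw_slice()
}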

Fixes #
Gossip allocating too much.

@alexpyattaev alexpyattaev changed the title kill useless call to inflate Reduce allocator usage in Gossip implementation Jan 26, 2025
compressed,
};
let _ = rv.inflate()?;
Author

Apparently this was a hack to validate that compression worked out. A crude but effective one I guess.

This seems like it was a sanity check.
I don't know how often it does fail though.
Have you checked if it ever fails or not?

can we keep this line as a debug check:

debug_assert_matches!(rv.inflate(), Ok(_));

@alexpyattaev alexpyattaev marked this pull request as ready for review January 26, 2025 22:00
@alexpyattaev alexpyattaev removed the request for review from gregcusack January 26, 2025 22:56
@alexpyattaev alexpyattaev marked this pull request as draft January 26, 2025 22:57
@behzadnouri

Chatted on slack.
We need to keep the old bv crate because any minor implementation differences between the two crates can introduce compatibility issues where a node running the new code cannot send (or receive) epoch-slots to (or from) a node running an older version of the code.

@alexpyattaev alexpyattaev marked this pull request as ready for review January 27, 2025 14:32
@gregcusack gregcusack self-requested a review January 27, 2025 15:44
@alexpyattaev

Revert to bv is complete. We even end up doing the exact same number of allocations as we would with bitvec.

@gregcusack gregcusack left a comment

Looks good, just a few questions, but then should be gtg.

Comment on lines +145 to 152
     min_slot - self.first_slot
 };
-for i in start..self.num {
-    if i >= self.slots.len() as usize {
+for i in start..self.num as u64 {
+    if i >= self.slots.len() {
         break;
     }
-    if self.slots.get(i as u64) {
+    if self.slots.get(i) {
         rv.push(self.first_slot + i as Slot);

are these just stylistic changes?

Author

It's to keep clippy happy.

Comment on lines +168 to +173
+let offset = *s - self.first_slot;
+if offset >= self.slots.len() {
+    return i;
+}
-self.slots.set(*s - self.first_slot, true);
-self.num = std::cmp::max(self.num, 1 + (*s - self.first_slot) as usize);
+self.slots.set(offset, true);
+self.num = std::cmp::max(self.num, 1 + offset as usize);

same here.

Author

this is purely to make the code more readable.

compressed,
};
let _ = rv.inflate()?;

Are we worried about forcing the sanity check on the receiver of this EpochSlot? Looks like maybe not, since you added the check here:

if status != flate2::Status::StreamEnd {
    return Err(Error::DecompressError);
}

Author

I think the original idea was to validate that compression worked (no idea why else it would be here). But since flate2 can tell us that "it worked", I see no point in running decompression to achieve the same result. The check you refer to is exactly that: StreamEnd, according to the docs, indicates that all input was consumed and all output fit into the provided buffer without reallocation.
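
For reference, a minimal standalone sketch of that compress-side check, assuming flate2's Compress/compress_vec API, a default compression level, and arbitrary headroom (not the actual epoch_slots.rs code):

use flate2::{Compress, Compression, FlushCompress, Status};

// Sketch only: compress in one pass and trust flate2's own status instead of
// inflating the result back just to prove that compression worked.
fn deflate_checked(input: &[u8]) -> Result<Vec<u8>, String> {
    // 64 bytes of headroom for incompressible input is an assumption here,
    // not the sizing used in epoch_slots.rs.
    let mut out = Vec::with_capacity(input.len() + 64);
    let mut compressor = Compress::new(Compression::default(), false);
    let status = compressor
        .compress_vec(input, &mut out, FlushCompress::Finish)
        .map_err(|e| format!("compress failed: {e}"))?;
    // StreamEnd means all input was consumed and the whole compressed stream
    // fit into `out` without another pass.
    if status != Status::StreamEnd {
        return Err(format!("compression did not finish: {status:?}"));
    }
    Ok(out)
}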

@behzadnouri behzadnouri left a comment

The change looks fine.
But just to make sure, have you verified that this change is backward compatible?
Like, is the new (old) code able to ingest epoch slots generated by the old (new) code?

Separately, with the new code, does deflate/inflate return Err more often or less often than before?

@alexpyattaev

Did some tests on MNB. The compression branch never errored, including the old validation path that decompresses to check (it does not get called much, though, so maybe it is possible to trigger errors there if we try harder). The decompression path errored in 43104/6059035 packets, each time with BufError (i.e. not enough space for decompression https://prisma.github.io/prisma-engines/doc/flate2/enum.Status.html). Previously, before we did any checking on the decompress path, all of these packets would be passed on to the validator and do something in there. I guess these could be formed by incompatible implementations?
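
For context, a minimal sketch of an inflate path that surfaces this kind of failure; the buffer bound, error strings, and the raw-deflate (no zlib header) flag are assumptions, not the agave implementation:

use flate2::{Decompress, FlushDecompress, Status};

// Sketch only: reject packets whose decompressed output does not fit into a
// pre-sized buffer. `max_out` is a hypothetical bound, not the real headroom
// constant from epoch_slots.rs.
fn inflate_checked(compressed: &[u8], max_out: usize) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(max_out);
    // `false` = raw deflate stream without a zlib header (an assumption here).
    let mut decompressor = Decompress::new(false);
    let status = decompressor
        .decompress_vec(compressed, &mut out, FlushDecompress::Finish)
        .map_err(|e| format!("corrupt stream: {e}"))?;
    // Anything other than StreamEnd (e.g. BufError) means the output did not
    // fit or the stream was truncated, so the packet is rejected.
    if status != Status::StreamEnd {
        return Err(format!("inflate did not finish: {status:?}"));
    }
    Ok(out)
}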

@behzadnouri

behzadnouri commented Jan 28, 2025

the decompression path errored in 43104/6059035 packets, each time with BufError

hmm, how often does this happen with the old code?

Separately, can you please check whether the number of epoch-slots that are received from the cluster but fail to sigverify increases or not?
Like we discussed in the chat before, signatures are defined on the serialized bytes, so we need to make sure that, starting with any serialized input bytes:

serialize(deserialize(bytes)) == bytes

otherwise the signatures do not verify. In other words,

bytes -> deserialize -> serialize -> bytes

should round-trip. So if the number of epoch-slots which don't sigverify increases, it means the above does not hold.
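
A minimal sketch of that round-trip property as a check, assuming bincode is the serializer (the helper name and bounds are illustrative, not agave code):

use serde::{de::DeserializeOwned, Serialize};

// Hypothetical helper: verifies that deserializing and re-serializing some
// signed input reproduces the exact same bytes, so signatures still verify.
fn round_trips<T: Serialize + DeserializeOwned>(bytes: &[u8]) -> bool {
    match bincode::deserialize::<T>(bytes) {
        Ok(value) => bincode::serialize(&value)
            .map(|reserialized| reserialized == bytes)
            .unwrap_or(false),
        Err(_) => false,
    }
}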

@alexpyattaev

hmm, how often does this happen with the old code?

The decompression path is exactly the same as in the old code, except for the check, which did not exist before. So in the old code it would fail exactly the same way, the failure would just be ignored.

@behzadnouri

the decompression path errored in 43104/6059035 packets, each time with BufError (i.e. not enough space for decompression https://prisma.github.io/prisma-engines/doc/flate2/enum.Status.html).

Pretty suspicious that it is giving BufError.
Does it help if we increase the headroom below?
https://github.com/anza-xyz/agave/blob/9ca57b16e/gossip/src/epoch_slots.rs#L101-L102

@alexpyattaev

Increasing the headroom to 128 bytes results in no notable reduction in the rate of packet decode errors. I guess only some nodes produce invalid packets that cannot be decompressed, and buffer size has nothing to do with it.
5963/649907 packets failed to decompress over a test run of 10 minutes.

@alexpyattaev

alexpyattaev commented Jan 29, 2025

Attaching a list of validators that send invalid packets:
invalids.txt
Update: got a total of 920 validators sending "malformed" packets.

@behzadnouri

Update: got a total of 920 validators sending "malformed" packets.

There is a newer version of flate2 than the one we are using: #3660,
but the dependabot patch isn't merged yet.
It might be worth giving it a shot to see whether we get more or fewer malformed packets.

@alexpyattaev

13436/897356, or 1.4%, of packets are reported as invalid by the inflate process with the new version, which is somewhat more than with the old version (0.8%). The more suspicious issue is that the new version fails the roundtrip tests in CI but passes the same tests on a devbox, even with multiple reruns.
