Skip to content

Conversation

@fernandodeluret
Copy link
Contributor

  • Deserialize snapshot metadata file
  • Use AccountFile lenght contained in metadata for deserialization

Comment on lines +108 to +116
let (slot_str, id_str) = file_name
.split_once('.')
.unwrap_or_else(|| panic!("Invalid file name: {}", file_name));
let slot = slot_str
.parse::<u64>()
.unwrap_or_else(|_| panic!("Invalid slot: {}", slot_str));
let id = id_str
.parse::<u64>()
.unwrap_or_else(|_| panic!("Invalid id: {}", id_str));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can share details on this? Wonderful addition of write version. Please share any resources too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not much resources about any of the changes here. I basically had to read through agave code, in theory we should only have 1 write version of the account per slot in snapshots (so also means 1 file per slot). But it's there just for cover that edge case. Also because it's a good way to double check that metadata deserialization went correctly (because we look on metadata not only by slot but also by write version)

Comment on lines +122 to +128
let mut size = None;
for account in accounts_metadata {
if account.id as u64 == id {
size = Some(account.accounts_current_len);
break;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell me a bit about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the basic idea of this(and all this change in general) is that we can teorically have files for which the accounts data does not match the file length (because of padding). So the safe way to decode accounts is to read the "accounts_current_len" from the metadata file, which is how agave is loading snapshots. In the tests I ran after the deserialization in practice they do match most of the time, but I left it there because it's the way that agave uses and also we cover the case there is difference between account data and file size

);

let solana_snapshot = SolanaSnapshot::unpack_compressed(config.path.clone())?;
let solana_snapshot = SolanaSnapshot::unpack_compressed(config.path.clone(), config.slot)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the slot is in the name of the file why pass it in? And also assuming folks don't change the filename not a big one but it is something that could break this. I don't think it should rely on the name of the tar but can rely on the naming and folder structure of the contents of the tar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, actually for now this is draft, but I havethe intention of doing the slot optional, and if not passed use the metadata slot (the file name can also be checked). But this is a good double check (actually triple check) about the slot, because we can set as the slot in the config the data received as result from the endpoint we get the snapshot (and over which we do some calculations). So when dealing with several snapshots in the same process this is just an reduntant check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still want to test further all this though (seems is working good so far, and could be a good resource for people working with snapshots, because there is not much docs out there about all the length check, at least that I have found)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants