
floresta-chain: new optimized chainstore #251


Open · wants to merge 1 commit into master from new-chainstore

Conversation

Davidson-Souza (Collaborator)

The current chainstore is based on `kv`, but it has a few problems:

  • When we flush, we get a huge heap spike
  • We incur a 2-3x storage overhead on headers
  • Retrieving headers during IBD gets slow if we flush early

This commit introduces a bare-bones, ad hoc store that consists of two parts:

  • An open-addressing, file-backed and memory-mapped hash map that keeps the block_hash -> block_height relation
  • A flat file that contains the serialized block headers, in ascending height order

To recover a header given the block height, we simply use pointer arithmetic inside the flat file (see the sketch below). If we only have the block hash, we use the map first, then find the header inside the flat file. This has a few advantages: we don't need explicit flushes (the OS flushes the maps at fixed intervals), flushes are asynchronous (the OS performs them), we get caching for free (mmap-ed pages stay in memory while we need them), and our cache reacts to system constraints, because the kernel always knows how much memory we still have.
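A minimal sketch of that lookup path. Everything here is illustrative, not the PR's actual code: `FlatStoreSketch`, `headers_mmap` and `index` are made-up names, and the PR's real on-disk record is 116 bytes (header + height + hash), while this sketch assumes bare 80-byte headers for simplicity.

```rust
use bitcoin::block::Header;
use bitcoin::consensus::deserialize;
use bitcoin::BlockHash;

const HEADER_SIZE: usize = 80; // serialized block header size (sketch only)

struct FlatStoreSketch {
    headers_mmap: memmap2::Mmap,                      // flat file of headers, ascending by height
    index: std::collections::HashMap<BlockHash, u32>, // stands in for the mmap-backed hash map
}

impl FlatStoreSketch {
    /// Height -> header is pure pointer arithmetic into the flat file.
    fn header_by_height(&self, height: u32) -> Option<Header> {
        let start = height as usize * HEADER_SIZE;
        let bytes = self.headers_mmap.get(start..start + HEADER_SIZE)?;
        deserialize(bytes).ok()
    }

    /// Hash -> header goes through the map first, then the flat file.
    fn header_by_hash(&self, hash: &BlockHash) -> Option<Header> {
        let height = *self.index.get(hash)?;
        self.header_by_height(height)
    }
}
```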

JoseSK999 (Contributor)

My understanding is that kv buckets also don't need explicit flushes. The data saved in a bucket is flushed to disk by the OS periodically, and it's kept in the inner sled pagecache (so we should have fast access), which is configured with a capacity of 100MB in KvChainStore::new.
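For reference, `kv` is backed by sled, which does flush asynchronously in the background. A sketch of the relevant knobs using sled's own API; only the 100 MB cache figure comes from the discussion above, the flush interval is an illustrative value, not Floresta's actual setting:

```rust
// Sketch: configuring a sled-backed store's in-memory page cache and
// periodic background flushing.
fn open_db(path: &str) -> sled::Result<sled::Db> {
    sled::Config::new()
        .path(path)
        .cache_capacity(100 * 1024 * 1024) // page cache size, in bytes
        .flush_every_ms(Some(500))         // periodic async flush (illustrative interval)
        .open()
}
```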

Davidson-Souza (Collaborator, Author)

That was my understanding too, but for some reason, even before using the cache (second part of #169), it didn't really flush on its own, and if we had an unclean shutdown, we would lose our progress (or a good part of it). It would also become increasingly CPU- and IO-heavy as we made progress; I suspect that's due to the block locator getting bigger as our chain grows.
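For context on the locator point: a block locator is a list of block hashes that is dense near the tip and exponentially sparse toward genesis, so it grows roughly logarithmically with chain height. A sketch of the standard construction, over heights only (illustrative, not Floresta's code):

```rust
// Standard locator spacing: the last ~10 blocks one by one, then the step
// doubles every entry, ending at genesis. At height 800,000 this is only
// ~30 entries, but a locator is rebuilt for every getheaders request.
fn locator_heights(tip: u32) -> Vec<u32> {
    let mut heights = Vec::new();
    let (mut height, mut step) = (tip as i64, 1i64);
    while height > 0 {
        heights.push(height as u32);
        if heights.len() >= 10 {
            step *= 2; // switch from dense to exponential spacing
        }
        height -= step;
    }
    heights.push(0); // always include genesis
    heights
}
```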

But the biggest problem for me, the one I couldn't find an alternative for, was the heap spike. It would always crash on my phone, before or after #169. With this PR, it runs fine!

JoseSK999 (Contributor) commented Oct 8, 2024

> It would also become increasingly CPU- and IO-heavy as we made progress; I suspect that's due to the block locator getting bigger as our chain grows.

Isn't that the expected behavior of a node in IBD, as it moves from old empty blocks to more recent ones?

Also, was the OS not flushing on its own on desktop or on mobile?

Davidson-Souza (Collaborator, Author)

> Isn't that the expected behavior of a node in IBD, as it moves from old empty blocks to more recent ones?

Not this much, at least not for headers. They are small and have a constant size.

> Also, was the OS not flushing on its own on desktop or on mobile?

Both. At least on my setup.

@Davidson-Souza Davidson-Souza force-pushed the new-chainstore branch 7 times, most recently from 4f43d68 to 9961e8c Compare January 2, 2025 22:05
@Davidson-Souza Davidson-Souza changed the title [WIP] floresta-chain: new optimized chainstore floresta-chain: new optimized chainstore Jan 3, 2025
@Davidson-Souza Davidson-Souza marked this pull request as ready for review January 3, 2025 00:26
jaoleal (Contributor) left a comment

Nice changes, here's my superficial review... I still haven't finished.

You're right, this needs a lot of testing and review. It looks like a nice job!

@Davidson-Souza Davidson-Souza force-pushed the new-chainstore branch 2 times, most recently from 9f33428 to 66f70e6 Compare February 11, 2025 16:19
JoseSK999 (Contributor) left a comment

Some initial review, looks good so far! I still have to check flat_chain_store.rs

JoseSK999 (Contributor) left a comment

Mainly nits, but I think I have found an issue with the update_block_index method implementation, as explained below.

@Davidson-Souza Davidson-Souza force-pushed the new-chainstore branch 3 times, most recently from 5175e30 to 0597417 Compare March 5, 2025 17:54
lucad70 (Contributor) commented Mar 21, 2025

I have started a review aiming to measure the heap usage on both master and this PR. However, the tool I am using stops recording as soon as I send a kill signal to florestad (I am using Mac's Instruments; for reference, access this link).

This is a snapshot showing both builds (this PR and master):
[Screenshot: pr_analysis_start_of_node]

I wanted to give more detailed info on shutdown; as I figure out how to do it, here are a few reviewing points:

  • On both builds there is a spike at startup (I'll also look into figuring out why)
  • This PR also improves heap usage overall during startup and regular operation. Check the total bytes column to see the decrease from 5.59 GiB to 5.02 GiB in heap allocations, roughly a 10% improvement overall.

Davidson-Souza (Collaborator, Author)

> I have started a review aiming to measure the heap usage on both master and this PR. However, the tool I am using stops recording as soon as I send a kill signal to florestad (I am using Mac's Instruments; for reference, access this link).
>
> This is a snapshot showing both builds (this PR and master):
> [Screenshot: pr_analysis_start_of_node]
>
> I wanted to give more detailed info on shutdown; as I figure out how to do it, here are a few reviewing points:
>
>   • On both builds there is a spike at startup (I'll also look into figuring out why)
>   • This PR also improves heap usage overall during startup and regular operation. Check the total bytes column to see the decrease from 5.59 GiB to 5.02 GiB in heap allocations, roughly a 10% improvement overall.

Hey @lucad70. Are you building floresta with the --experimental-db feature?

lucad70 (Contributor) commented Mar 24, 2025

> Hey @lucad70. Are you building floresta with the --experimental-db feature?

I am not. Should I be building it with the --experimental-db feature? Maybe I missed it while reading the documentation.

lucad70 (Contributor) commented Mar 24, 2025

> > Hey @lucad70. Are you building floresta with the --experimental-db feature?
>
> I am not. Should I be building it with the --experimental-db feature? Maybe I missed it while reading the documentation.

Okay, I have re-read here and am now testing on my machine. I am also fixing a few mistakes I made while profiling:

  • I used the debug build; I should use release instead
  • I did not use the experimental-db feature
  • I did only one run for each attempt
  • I did not make sure my computer had no other programs running

I am now addressing these issues and will return with more helpful data ASAP.

JoseSK999 (Contributor) left a comment

A few additional suggestions. I still have to review some things more deeply.

lucad70 (Contributor) commented Mar 27, 2025

I have been profiling for a few days now. After acquiring more data, I used ChatGPT to write a simple report. I will also post a few screenshots showing the heap usage profile for analysis as well. Here is the report:

Heap Allocation Analysis for Chainstore Modification

Introduction

This report presents a comparative heap allocation analysis between the current and proposed chainstore implementations. The profiling was performed using Instruments on MacOS, with five runs per test scenario (excluding the most discrepant run). The analysis evaluates heap allocations over different time spans: 2 minutes, 12 minutes, and 6 hours.

The two versions under comparison are:

  • Normal (KV): The existing chainstore implementation.
  • DB: The updated implementation.

The key memory metrics analyzed include:

  • Persistent: Memory allocations still present at the end of execution.
  • # Persistent: Count of persistent allocations.
  • # Transient: Count of transient allocations (allocated and freed within the run).
  • Total Bytes: Total memory allocated during the execution.
  • # Total: Total number of allocations performed.

2-Minute Execution

Normal Implementation

| Run | Persistent (MiB) | # Persistent | # Transient | Total Bytes (GiB) | # Total |
|---|---|---|---|---|---|
| 1 | 209.54 | 1,462,945 | 33,548,373 | 5.59 | 35,011,318 |
| 2 | 207.61 | 1,448,843 | 31,204,050 | 5.21 | 32,652,893 |
| 3 | 207.55 | 1,451,203 | 32,536,063 | 5.42 | 33,987,266 |
| 4 | 207.82 | 1,450,247 | 30,959,519 | 5.18 | 32,409,766 |
| 5 | 207.59 | 1,448,941 | 31,527,780 | 5.26 | 32,976,721 |
| Average | 208.02 | 1,452,435.8 | 31,955,157 | 5.33 | 33,407,592.8 |

DB Implementation

| Run | Persistent (MiB) | # Persistent | # Transient | Total Bytes (MiB) | # Total |
|---|---|---|---|---|---|
| 1 | 268.74 | 660,943 | 153,274 | 579.88 | 814,217 |
| 2 | 206.48 | 485,529 | 154,065 | 420.3 | 639,594 |
| 3 | 206.45 | 485,527 | 159,169 | 421.03 | 644,696 |
| 4 | 206.46 | 485,531 | 152,335 | 419.92 | 637,866 |
| 5 | 272.28 | 679,161 | 161,917 | 598.83 | 841,078 |
| Average | 232.08 | 559,338.2 | 156,152 | 487.99 | 715,490.2 |

Key Observations

  • The DB implementation has significantly fewer persistent allocations (~559k vs. ~1.45M).
  • The transient allocations in DB are ~156k, whereas in Normal they exceed 31M.
  • Total allocated memory in DB is significantly lower (487.99 MiB vs. 5.33 GiB).
  • The DB approach allocates roughly 11x fewer total bytes in this timeframe.

12-Minute Execution

Normal Implementation

| Run | Persistent (MiB) | # Persistent | # Transient | Total Bytes (GiB) | # Total |
|---|---|---|---|---|---|
| 1 | 284.27 | 1,451,352 | 280,032,364 | 37.61 | 281,483,716 |
| 2 | 289.79 | 1,502,171 | 287,634,425 | 38.35 | 289,136,596 |
| 3 | 280.76 | 1,449,592 | 269,995,056 | 36.31 | 271,444,648 |
| 4 | 284.36 | 1,452,638 | 279,944,774 | 37.86 | 281,397,412 |
| 5 | 290.19 | 1,495,210 | 263,835,709 | 35.85 | 265,330,919 |
| Average | 285.87 | 1,470,192.6 | 276,288,465.6 | 37.20 | 277,758,658.2 |

DB Implementation

| Run | Persistent (MiB) | # Persistent | # Transient | Total Bytes (MiB) | # Total |
|---|---|---|---|---|---|
| 1 | 206.52 | 485,542 | 572,618 | 488.46 | 1,058,160 |
| 2 | 272.34 | 679,178 | 529,953 | 658.13 | 1,209,131 |
| 3 | 206.59 | 485,566 | 185,185 | 425.83 | 670,751 |
| 4 | 206.52 | 485,547 | 548,792 | 483.87 | 1,034,339 |
| 5 | 206.55 | 485,576 | 517,334 | 478.63 | 1,002,910 |
| Average | 219.70 | 524,281.8 | 470,776.4 | 506.98 | 995,058.2 |

Key Observations

  • DB’s persistent allocations remain much lower (~524k vs. ~1.47M).
  • The transient allocations in DB (470k) are significantly lower than in Normal (~276M).
  • Total bytes allocated in DB are lower (506.98 MiB vs. 37.2 GiB).
  • DB's transient allocations are ~588x lower than Normal.

6-Hour Execution

DB Implementation Only

| Run | Persistent (MiB) | # Persistent | # Transient | Total Bytes (GiB) | # Total |
|---|---|---|---|---|---|
| 1 | 206.61 | 485,609 | 3,714,025 | 1.28 | 4,199,634 |

Key Observations

  • The DB implementation sustained only ~3.7M transient allocations over 6 hours.
  • The total bytes allocated in this period were only 1.28 GiB, significantly better than Normal.

Overall Findings & Conclusion

  1. DB implementation significantly reduces persistent and transient allocations, preventing heap spikes.
  2. Total bytes allocated in DB are much lower (1.28 GiB over 6 hours vs. 5.33 GiB in 2 min for Normal).
  3. The transient allocations in DB are orders of magnitude lower (~470k vs. ~276M in 12 min).
  4. Normal implementation shows excessive heap usage, leading to inefficiencies.

Recommendation

The DB implementation is vastly superior in terms of heap efficiency. It reduces memory footprint, transient allocations, and overall heap pressure, making it the preferred approach for optimizing chainstore performance.


Here are a few screenshots:
[Screenshot: DB heap profile]
[Screenshot: Normal heap profile]

JoseSK999 (Contributor) left a comment

A couple of other suggestions; I'm closer to a final review.

Davidson-Souza (Collaborator, Author)

Updated with this diff:

```diff
diff --git a/crates/floresta-chain/src/pruned_utreexo/flat_chain_store.rs b/crates/floresta-chain/src/pruned_utreexo/flat_chain_store.rs
index be41a88..d7ec29c 100644
--- a/crates/floresta-chain/src/pruned_utreexo/flat_chain_store.rs
+++ b/crates/floresta-chain/src/pruned_utreexo/flat_chain_store.rs
@@ -22,7 +22,7 @@
 //!
 //! We want to keep the load factor for the hash map as low as possible, while also avoiding
 //! re-hashing. So we pick a fairly big initial space to it, say 10 million buckets. Each bucket is
-//! 4 bytes long, so we have 40 MiB of map. Each [DiskBlockHeader] is 116 bytes long (80 bytes for
+//! 4 bytes long, so we have 40 MiB of map. Each [DiskBlockHeaderAndHash] is 116 bytes long (80 bytes for
 //! header + 4 for height + 32 for hash), so the maximum size for it, assuming 2.5 million headers
 //! (good for the next ~30 years), is 330 MiB. The smallest device I know that can run floresta
 //! has ~250 MiB of RAM, so we could  fit almost everything in memory. Given that newer blocks are
@@ -99,7 +99,6 @@ const UTREEXO_ACC_SIZE: usize = 32 * 64 + 8;
 
 #[derive(Clone)]
 /// Configuration for our flat chain store. See each field for more information
-
 pub struct FlatChainStoreConfig {
     /// The index map size, in buckets
     ///
@@ -137,7 +136,7 @@ pub struct FlatChainStoreConfig {
     /// The size of the fork headers file map, in headers
     ///
     /// This store keeps headers that are not in our main chain, but may be needed sometime. The
-    /// default value is having space for 1000 blocks.
+    /// default value is having space for 10,000 blocks.
     ///
     /// When computing the size in bytes, we will round the number to the nearest power of 2, minus
     /// 1. This lets us do some optimizations like use & instead of %, and use << instead of *.
@@ -346,7 +345,7 @@ pub struct FlatChainStore {
 /// and easy to read. It holds the memory map for the index, and the size of the index
 /// in buckets. It also keeps track of how many buckets are occupied, so we can re-hash
 /// the map when it gets full.
-pub struct BlockIndex {
+struct BlockIndex {
     /// The actual memory map for the index
     index_map: MmapMut,
 
@@ -367,7 +366,7 @@ impl BlockIndex {
     /// This function should only be called by [FlatChainStore::new], and it should never be called
     /// directly. It creates a new block index, given a memory map for the index, the size of the
     /// index in buckets, and how many buckets are occupied.
-    pub(super) fn new(index_map: MmapMut, index_size: usize, occupied: usize) -> Self {
+    fn new(index_map: MmapMut, index_size: usize, occupied: usize) -> Self {
         Self {
             index_map,
             index_size,
@@ -380,7 +379,7 @@ impl BlockIndex {
     /// If we have enough changes that we don't want to lose, we should flush the index map to
     /// disk. This will make sure that the index map is persisted, and we can recover it in case
     /// of a crash.
-    pub(super) fn flush(&self) -> Result<(), FlatChainstoreError> {
+    fn flush(&self) -> Result<(), FlatChainstoreError> {
         self.index_map.flush()?;
 
         Ok(())
@@ -732,7 +731,7 @@ impl FlatChainStore {
             return Err(FlatChainstoreError::NotFound);
         }
 
-        Ok(metadata.acc[0..size].iter().copied().collect())
+        Ok(metadata.acc[0..size].to_vec())
     }
 
     unsafe fn do_save_height(&self, best_block: BestChain) -> Result<(), FlatChainstoreError> {
@@ -868,7 +867,6 @@ impl FlatChainStore {
     }
 
     #[inline(always)]
-    #[must_use]
     #[doc(hidden)]
     fn get_cache_mut(&self) -> Result<MutexGuard<CacheType>, PoisonError<MutexGuard<CacheType>>> {
         self.cache.lock()
diff --git a/florestad/src/florestad.rs b/florestad/src/florestad.rs
index 0d5748c..6a1edfa 100644
--- a/florestad/src/florestad.rs
+++ b/florestad/src/florestad.rs
@@ -706,7 +706,7 @@ impl Florestad {
     #[cfg(feature = "experimental-db")]
     fn load_chain_store(data_dir: String) -> ChainStore {
         let config = FlatChainStoreConfig::new(data_dir + "/chaindata");
-        ChainStore::new(config.clone()).expect("failure while creating chainstate")
+        ChainStore::new(config).expect("failure while creating chainstate")
     }
 
     #[cfg(not(feature = "experimental-db"))]
@@ -715,7 +715,7 @@ impl Florestad {
         network: Network,
         assume_valid: Option<bitcoin::BlockHash>,
     ) -> ChainState<ChainStore<'static>> {
-        let db = ChainStore::new(data_dir.to_string()).expect("Could not read db");
+        let db = ChainStore::new(data_dir.clone()).expect("Could not read db");
         let assume_valid =
             assume_valid.map_or(AssumeValidArg::Hardcoded, AssumeValidArg::UserInput);
 
@@ -723,7 +723,7 @@ impl Florestad {
             Ok(chainstate) => chainstate,
             Err(err) => match err {
                 BlockchainError::ChainNotInitialized => {
-                    let db = ChainStore::new(data_dir.to_string()).expect("Could not read db");
+                    let db = ChainStore::new(data_dir).expect("Could not read db");
 
                     ChainState::<ChainStore>::new(db, network.into(), assume_valid)
                 }
@@ -746,7 +746,7 @@ impl Florestad {
             Ok(chainstate) => chainstate,
             Err(err) => match err {
                 BlockchainError::ChainNotInitialized => {
-                    let db = Self::load_chain_store(data_dir.clone());
+                    let db = Self::load_chain_store(data_dir);
 
                     ChainState::<ChainStore>::new(db, network.into(), assume_valid)
                }
```
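As an aside, the "use & instead of %" optimization mentioned in the doc comment above works because the map capacity is kept at a power of two. A sketch of the idea (illustrative function, not the PR's code):

```rust
// With a power-of-two capacity, the modulo used to pick a bucket reduces
// to a single bitwise AND against capacity - 1.
fn bucket_for(hash: u64, capacity: u64) -> u64 {
    debug_assert!(capacity.is_power_of_two());
    hash & (capacity - 1) // equivalent to hash % capacity
}
```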

JoseSK999 (Contributor) left a comment

Mainly changes for readability and clarity, and some suggested comments that I believe are important. My next review should be the last one.

JoseSK999 (Contributor) left a comment

We need to update, persist and correctly load the BlockIndex occupied count.
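One possible shape for that fix, as a sketch: keep the occupied count inside the memory-mapped metadata so it persists and reloads with everything else. Field and function names here are hypothetical, not the PR's final code.

```rust
use memmap2::MmapMut;

#[repr(C)]
struct MetadataSketch {
    occupied: u64, // number of index buckets in use; persisted with the mmap
    // ... other persisted fields
}

/// Bump the persisted counter whenever a new bucket is taken, so a restart
/// reloads the real occupancy instead of assuming zero.
fn bump_occupied(metadata_map: &mut MmapMut) {
    let meta = unsafe { &mut *(metadata_map.as_mut_ptr() as *mut MetadataSketch) };
    meta.occupied += 1;
}
```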

Davidson-Souza (Collaborator, Author) commented Apr 17, 2025

@JoseSK999 I think the problem here is the API for ChainStore. It should mark methods like save_header as taking a mutable reference to self; otherwise we end up in a situation where users can always call them in a thread-unsafe way. If we want to expose those types and allow users to implement their own chainstate, we must make sure this is not possible. For this PR, though, I think we can just leave it as is for now.
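A sketch of the API shape being suggested (simplified trait, not the actual ChainStore definition):

```rust
trait ChainStoreSketch {
    type Error;

    // Taking `&mut self` makes the borrow checker reject unsynchronized
    // concurrent writers at compile time...
    fn save_header(&mut self, header: &[u8]) -> Result<(), Self::Error>;

    // ...while reads can remain shared.
    fn get_header_by_height(&self, height: u32) -> Result<Option<Vec<u8>>, Self::Error>;
}
```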

For the map size, I'm using the fields inside metadata as you suggested. I'm also initializing the Metadata struct when we create a new chainstore.

The current chainstore is based on `kv`, but it has a few problems:
  - When we flush, we get a huge heap spike
  - We incur a 2-3x storage overhead on headers
  - Retrieving headers during IBD gets slow if we flush early

This commit introduces a bare-bones, ad hoc store that consists of
three parts:
  - An open-addressing, file-backed and memory-mapped hash map to keep
    the relation block_hash -> block_height
  - A flat file that contains block headers serialized, in ascending
    order
  - An LRU cache to avoid going through the map every time

To recover a header, given the block height, we simply use pointer
arithmetic inside the flat file. If we need to get from the block hash,
we use the map first, then find the header inside the flat file. This
has the advantage of not needing explicit flushes (the OS will flush it
at fixed intervals), flushes are async (the OS will do them), we get
caching for free (mmap-ed pages will stay in memory if we need them)
and our cache can react to system constraints, because the kernel
always knows how much memory we still have.
Davidson-Souza (Collaborator, Author)

Rebased
