-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Implement persistent reputation database for collator protocol (#7751) #10917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: alindima/collator-protocol-revamp-collation-manager
Are you sure you want to change the base?
Implement persistent reputation database for collator protocol (#7751) #10917
Conversation
Signed-off-by: Alexandru Cihodaru <[email protected]>
Implements disk persistence for the experimental collator protocol's
reputation system to preserve peer scores across validator restarts.
Key changes:
- Add PersistentDb wrapper that persists in-memory Db to disk
- Implement periodic persistence (10 min prod, 30 sec test mode)
- Add immediate persistence for slashes and para pruning
- Create serialization layer for ScoreEntry and reputation data
- Add ReputationConfig for database column configuration
Signed-off-by: Alexandru Cihodaru <[email protected]>
Signed-off-by: Alexandru Cihodaru <[email protected]>
Tests added:
- basic_persistence.rs: Validates that reputation data persists across
validator restarts and that the startup lookback mechanism correctly
processes missed blocks during downtime. Tests both large gaps (20+
blocks) and small gaps (<20 blocks).
- pruning.rs: Verifies that reputation data is correctly pruned when
parachains are deregistered. Tests para cleanup via sudo, session
boundary detection, immediate persistence of pruned state, and
correct loading of only remaining paras after restart.
Signed-off-by: Alexandru Cihodaru <[email protected]>
|
All GitHub workflows were cancelled due to failure one of the required jobs. |
alindima
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good job!
| // along with Polkadot. If not, see <http://www.gnu.org/licenses/>. | ||
|
|
||
| //! Serialization types for disk persistence of collator reputation data. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the db-related file would be better off in a separate module within the peer manager (including the in-memory db). And the in-memory db should be renamed from just plain db.
But let's do the renames in a follow-up PR, to not make review more difficult right now
|
|
||
| let last_finalized = instance.inner.processed_finalized_block_number().await; | ||
|
|
||
| // Use info level for clear observability in tests and production |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outdated comment
| // Load metadata | ||
| if let Some(meta) = self.load_metadata()? { | ||
| // We need to directly access the inner fields to restore state | ||
| // This requires making Db fields pub(super) or adding setter methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this comment
|
|
||
| /// Persist all in-memory data to disk. | ||
| /// | ||
| /// This should be called periodically by the main loop (currently, every 10 minutes). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optimisation: it should only be done for the paras that have been "dirtied" (had any change).
| } | ||
|
|
||
| /// Persist a single para's data to disk (called immediately after slash). | ||
| fn persist_para(&self, para_id: &ParaId) -> Result<(), PersistenceError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for slash we only need to persist one peer's reputation. The way it was previously stored (in the old PR), this was possible
| ); | ||
|
|
||
| if block_data.experimental_send_approved_peer { | ||
| // Create a valid PeerId in multihash format: [hash_code, digest_size, ...digest_bytes] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just double checking: this was added for test purposes, but the code can handle these invalid peerids gracefully right?
| // Copyright (C) Parity Technologies (UK) Ltd. | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| //! Test: Basic Persistence on Graceful Shutdown with Startup Lookback Verification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name of the test is not expressive enough. What reputation is this about?
Also, this tests persistence on restart more than anything
| @@ -0,0 +1,266 @@ | |||
| // Copyright (C) Parity Technologies (UK) Ltd. | |||
| // SPDX-License-Identifier: Apache-2.0 | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you want to have the tests running in CI you need to add them to the github actions config
| //! 5. **Deregister parachain 2001 using sudo** | ||
| //! 6. Wait for session boundary (triggers pruning check) | ||
| //! 7. Verify pruning logs show para 2001 was pruned | ||
| //! 8. Wait for periodic persistence (pruned state written to disk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pruned state is written immediately, is it not?
| ); | ||
|
|
||
| log::info!("Verifying para 2000 continues normal operation (para 2001 is deregistered)"); | ||
| assert_para_throughput(&validator0_client_after, 5, [(ParaId::from(PARA_ID_1), 4..6)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also assert that para 2001 has throughput 0
Implements persistent storage for the experimental collator protocol's reputation database.
Changes:
PersistentDbwrapper that persists the in-memory reputation DB to diskImplementation:
PersistentDbwraps the existingDband adds persistence on top:Tests:
basic_persistence.rs: Validates persistence across restarts and startup lookbackpruning.rs: Validates automatic cleanup on parachain deregistration