Description
TLDR: Support data injection from genesis files using a streaming framework, rather than relying on lazy IO.
For testing and benchmarking, we provide the ability to start in an arbitrary era with some initial data populated in the ledger state through the genesis files.
The data injected for benchmarking is very large. The current solution of using lazy lists (namely ListMap) while decoding those large genesis files has proved very fragile and has led to serious space leaks, causing issues for the performance and tracing team. It has been agreed to adjust the injection mechanism: instead of embedding the large injection data directly in the genesis file and decoding a big JSON file lazily, the genesis file will provide a filename for that data.
Here are currently all of the occurrences of injected data that can be large:
- Initial funds (a.k.a. initial UTxO):

      , sgInitialFunds :: LM.ListMap Addr Coin

- Initial stake pools (cardano-ledger/eras/shelley/impl/src/Cardano/Ledger/Shelley/Genesis.hs, lines 123 to 124 in c22a509):

      { sgsPools :: LM.ListMap (KeyHash StakePool) StakePoolParams
      -- ^ Pools to register

- Initial stake credentials with their stake pool delegations (cardano-ledger/eras/shelley/impl/src/Cardano/Ledger/Shelley/Genesis.hs, lines 129 to 130 in c22a509):

      , sgsStake :: LM.ListMap (KeyHash Staking) (KeyHash StakePool)
      -- ^ Stake-holding key hash credentials and the pools to delegate that stake

- Initial delegations to DReps:

      , cgInitialDReps :: ListMap (Credential DRepRole) DRepState
Each of these needs to transition to:
- a consistent interface that is specified in the `"extraConfig"` JSON field, as was recently done for Alonzo genesis in Add ability to inject any cost models via `AlonzoGenesis` #5379
- streaming data from a flat JSON file with hash computation (a hash algorithm implementation that supports streaming data needs to be chosen; I know sha256 definitely supports it). Preferably the `streaming` library should be used, since that is what is already being used in some other project in cardano-node. However, if there is insufficient support for streaming aeson data, then `conduit-aeson` can be used instead
- support for embedded data instead of streaming from a file, since people also use this interface for testing with small payloads
- support for the old fields, until the new one has been fully integrated and adopted. If both the old fields and `extraConfig` are provided, this should be an error
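The last two requirements (embedded data stays supported, and supplying both the old fields and `extraConfig` is an error) can be sketched as a small resolution function. This is only an illustration under stated assumptions: the `InjectionData` type below is a simplified stand-in that uses plain lists and a `String` hash instead of the ledger's `ListMap` and hash types, and `resolveInjection` is a hypothetical helper name, not an existing function in cardano-ledger.

```haskell
-- Simplified stand-in types; the real ledger types (ListMap, Addr, Coin)
-- are deliberately not used so that the sketch only needs `base`.
data InjectionData k v
  = InjectFromFile FilePath String -- ^ path plus expected hash (hex), both hypothetical
  | EmbeddedData [(k, v)]          -- ^ small payloads embedded directly
  deriving (Eq, Show)

-- | Decide where the injected data comes from (hypothetical helper).
-- An empty legacy list is treated as "legacy field not provided".
resolveInjection
  :: String                    -- ^ field name, used in the error message
  -> [(k, v)]                  -- ^ contents of the old genesis field
  -> Maybe (InjectionData k v) -- ^ value from "extraConfig", if present
  -> Either String (InjectionData k v)
resolveInjection name legacy extra =
  case (null legacy, extra) of
    (True, Nothing)  -> Right (EmbeddedData [])     -- nothing to inject
    (True, Just new) -> Right new                   -- new interface only
    (False, Nothing) -> Right (EmbeddedData legacy) -- old field still works
    (False, Just _)  ->
      Left (name ++ ": both the legacy field and extraConfig are set")

main :: IO ()
main = do
  let funds = [("addr1", 5 :: Integer)]
  print (resolveInjection "initialFunds" funds Nothing)
  print (resolveInjection "initialFunds" [] (Just (InjectFromFile "utxo.json" "abc123")))
  print (resolveInjection "initialFunds" funds (Just (EmbeddedData funds)))
```

Returning `Either String` keeps the check pure, so the decoder can decide how to surface the failure (for example via `fail` in a parser).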
The first part of this ticket is to design the interface using the injection of UTxO; the rest of the fields will follow in a subsequent PR.
I imagine Haskell types that look something like this:

    data InjectionData k v
      = InjectFromFile !FilePath !Hash
      | EmbeddedData (ListMap k v)

    data ShelleyExtraConfig = ShelleyExtraConfig
      { secInitialFunds :: InjectionData Addr Coin
      , secStakePools :: InjectionData (KeyHash StakePool) StakePoolParams
      , secStakeCredentials :: InjectionData (KeyHash Staking) (KeyHash StakePool)
      }

Initially, we can approach this by changing the transition interface to accept an action that allows reading a file:
https://github.com/IntersectMBO/cardano-ledger/blob/e04cde449e9f0dcf38d4bc822cc028ff8fedac4a/eras/shelley/impl/src/Cardano/Ledger/Shelley/Transition.hs#L119C3-L126

    injectIntoTestState ::
      MonadFail m =>
      (forall a. FilePath -> (Handle -> m a) -> m a) ->
      -- ^ File reading action
      TransitionConfig era ->
      NewEpochState era ->
      m (NewEpochState era)

We can later polish the interface a bit more in order to support direct streaming into LedgerHD, but for now this should suffice.
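
To illustrate how a caller might supply the file-reading action, here is a minimal sketch specialised to `IO`, where `withFile path ReadMode` already has the required shape. Everything beyond that is an assumption made for the example: the one-entry-per-line file format, the helper names `WithFile`, `ioWithFile`, `foldLines` and `loadEntries`, and the use of an FNV-1a checksum over characters as a stand-in for a real streaming hash such as sha256. It uses only `base`, not the `streaming` or `conduit-aeson` libraries.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE RankNTypes #-}

import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)
import System.IO (Handle, IOMode (ReadMode), hGetLine, hIsEOF, withFile)

-- | Same shape as the file-reading argument above, specialised to IO.
type WithFile = forall a. FilePath -> (Handle -> IO a) -> IO a

-- | The obvious IO instance: open for reading, close when the action ends.
ioWithFile :: WithFile
ioWithFile path = withFile path ReadMode

-- | One FNV-1a step per character. A toy stand-in for an incremental hash;
-- it is NOT sha256 and is not suitable for integrity checks.
fnv1aStep :: Word64 -> Char -> Word64
fnv1aStep h c = (h `xor` fromIntegral (ord c)) * 1099511628211

fnvOffset :: Word64
fnvOffset = 14695981039346656037

-- | Read the handle line by line with strict accumulators, so only the
-- current line is held in memory. Returns (entry count, running checksum).
foldLines :: Handle -> IO (Int, Word64)
foldLines h = go 0 fnvOffset
  where
    go !n !acc = do
      eof <- hIsEOF h
      if eof
        then pure (n, acc)
        else do
          line <- hGetLine h
          go (n + 1) (foldl' fnv1aStep acc line)

-- | Consume a file through whatever file-reading action the caller provides.
loadEntries :: WithFile -> FilePath -> IO (Int, Word64)
loadEntries withF path = withF path foldLines

main :: IO ()
main = do
  let path = "injection-demo.txt" -- hypothetical file name
  writeFile path "addr1 10\naddr2 20\n"
  (n, checksum) <- loadEntries ioWithFile path
  putStrLn ("entries: " ++ show n ++ ", checksum: " ++ show checksum)
```

Passing the action as an argument means tests can substitute a different implementation (for instance one that reads from a fixture directory) without changing the injection code.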