Skip to content

Conversation

@tbagrel1
Copy link
Contributor

Description

Please include a meaningful description of the PR and link the relevant issues
this PR might resolve.

Also note that:

  • New code should be properly tested (even if it does not add new features).
  • The fix for a regression should include a test that reproduces said regression.

@tbagrel1 tbagrel1 added the peras label Nov 20, 2025
@agustinmista agustinmista changed the base branch from main to peras-staging November 20, 2025 18:22
@agustinmista
Copy link
Contributor

There's an implementation detail that we should discuss with the researchers: whether or not we need to store votes targeting a different block than the one a node is currently championing.

  • If we store every vote we receive (from within the voting committee), then an attacked with many low-stake nodes that make it into the committee could cause a honest node to store an unreasonable amount of bogus votes for different blocks.
  • On the other hand, if we only keep track of votes being cast upon the block we are currently championing, then we would be unable to ever observe a quorum by ourselves, relying instead on certificate diffusion to reconcile the global state.

If an honest node is voting for the wrong block because of adversarial actions, is it realistic for it to expect receiving votes from the honest majority? Or, conversely, to receive a certificate when a quorum is reached elsewhere.

Copy link
Contributor

@agustinmista agustinmista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far! I have some thoughts around what we discussed the last time. Maybe they could lead to some simplifications.

Also, would you mind marking the new TODOs somehow? Like TODO(peras) so that we can easily grep for them.

PerasVoteStakeDistr ->
Either (PerasValidationErr blk) (ValidatedPerasVote blk)

const_PERAS_QUORUM_THRESHOLD :: PerasVoteStake
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it's not a tunable parameter, I think it would be good to move this to Ouroboros.Consensus.Peras.Params

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm not sure that the part of the code dealing with the PerasVoteDB (probably the ChainDB, as PerasCertDB is managed as a part of the ChainDB already) will have access to those params?

If we follow the same structure as for the PerasCertDB, we will have a PerasCertDBArgs struct as a parameter of openPerasVoteDB that could be used to host this param.

Comment on lines 227 to 232
data PerasVoteAggregateStatus blk
= PerasVoteAggregateQuorumNotReached {pvasVoteAggregate :: !(PerasVoteAggregate blk)}
| PerasVoteAggregateQuorumReachedAlready
{pvasVoteAggregate :: !(PerasVoteAggregate blk), pvasCert :: ValidatedPerasCert blk}
deriving stock (Generic, Eq, Ord, Show)
deriving anyclass NoThunks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this way better!

What I am still a bit concerned about, though, is that complete and incomplete voting aggregates are still represented by the same type, and that could make it easier to introduce bugs in the future.

Maybe we can define a GADT indexed by the voting state to make these two cases different at the type level, while still allowing for a single type to be carried around in most places:

type data PerasVotingStatus = NoQuorum | Quorum

data PerasVotingTally blk status where
  NoQuorum ::
    !(PerasVoteAggregate) -> 
    PerasVotingTally NoQuorum
  Quorum :: 
    !(PerasVoteAggregate blk) ->
    !(ValidatedPerasCert blk) ->
    PerasVotingTally Quorum

This way, we could enforce the voting code to only accept the expected state transitions:

  • NoQuorum -> NoQuorum
  • NoQuorum -> Quorum
  • Quorum -> Quorum
    (note that Quorum -> NoQuorum shouldn't be possible)

As well as allowing future client code using these aggregates to specify when it expects voting to have reached a quorum, as opposed to handling it dynamically via a Maybe and/or some error "impossible case". E.g., instead of pvasMaybeCert, we could have:

perasVotingCert :: PerasVotingTally blk Quorum -> ValidatedPerasCert blk
perasVotingCert (Quorum _ cert) = cert

Which can only be used to extract the cert from the aggregate bundle after voting has reached a quorum.


Going one step further, we could also encode the number of votes that we have seen after reaching a quorum in the type-level Quorum constructor:

type data PerasVotingStatus = NoQuorum | Quorum Nat

This would allow us to enrich the state transitions to:

  • NoQuorum -> NoQuorum
  • NoQuorum -> Quorum Z
  • Quorum n -> Quorum (S n)

And we could use Quorum 0 as the witness for the state transition that produces a certificate. This way, I think we could able to get rid of the PerasVoteResult type* or, alternatively, turn it into a thin wrapper over PerasVotingTally to hide the existential:

data PerasVotingTally blk status where
  NoQuorum ::
    !(PerasVoteAggregate) -> 
    PerasVotingTally NoQuorum
  Quorum ::
    !(SNat n) ->
    !(PerasVoteAggregate blk) ->
    !(ValidatedPerasCert blk) ->
    PerasVotingTally (Quorum n)

-- constructor not exported
newtype PerasVoteResult blk = PerasVoteResult (Some (PerasVoteTally blk))

-- instead we could expose only the things the client need, e.g.:
-- to extract the certificate if we have _just_ reached a quorum, and never again
perasVoteResultEmittedCert :: PerasVoteResult blk -> Maybe (ValidatedPerasCert blk)
perasVoteResultEmittedCert (PerasVoteResult (Some tally)) =
  case tally of
    Quorum Z _ cert -> Just cert
    _ -> Nothing

[*] depending on whether the future client code might also be interested in reacting to the event of adding a certificate that was already in the VoteDB (I think it wouldn't, as that's sounds as something purely relevant to tracing, which is already being handled by the VoteDB implementation)

, pvaTotalStake = initialStake
}
vote =
if not (getPerasVoteRound vote == roundNo && getPerasVoteVotedBlock vote == point)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would apply DeMorgan's law to this condition:

if getPerasVoteRound vote /= roundNo || getPerasVoteVotedBlock vote /= point


-- | Add a vote to an existing aggregate if it isn't already present, and update
-- the stake accordingly.
-- PRECONDITION: the vote's target must match the aggregate's target.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only being used by updatePerasVoteAggregateStatus. Does the latter also inherit this precondition? If not, I would then embed this function as a local binding to updatePerasVoteAggregateStatus and get rid of the error. Otherwise, updatePerasVoteAggregateStatus should also mention the precondition.

Comment on lines 328 to 339
instance Serialise (HeaderHash blk) => Serialise (PerasVote blk) where
encode PerasVote{pvVoteRound, pvVotedBlock, pvVoteVoterId} =
encodeListLen 3
<> encode pvVoteRound
<> encode pvVotedBlock
<> KeyHash.toCBOR pvVoteVoterId
decode = do
decodeListLenOf 3
pvVoteRound <- decode
pvVotedBlock <- decode
pvVoteVoterId <- KeyHash.fromCBOR
pure $ PerasVote{pvVoteRound, pvVotedBlock, pvVoteVoterId}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put a TODO so we don't forget about adding roundtrip tests for these.

Either (UpdateRoundVoteStateError blk) (PerasRoundVoteState blk)
updatePerasRoundVoteState vote cfg roundState =
if getPerasVoteRound vote /= getPerasVoteRound roundState
then error "updatePerasRoundVoteTallys: vote round does not match aggregate round"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this error become unnecessary if you turn this function into a local definition inside updatePerasRoundVoteStates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure. The fact that the roundNo in a RoundState/target in a TargetState is consistent with the key it is given in the map is an invariant that is not enforced statistically, so I think it makes sense to still have an error case for it (i.e. it works as an assert if we mess up somewhere in the implementation).

Moving such a function to a local definition would make it harder to test, and I think that would be a bigger issue maybe?

Copy link
Contributor

@agustinmista agustinmista Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I just wanted to understand if this error call came from splitting a function with internal invariants into two smaller ones, or if it was something external.

Not sure if we will ever want to test this function in isolation. More likely through the election API via quickcheck-dynamic, no?

Copy link
Contributor

@nbacquey nbacquey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to have a closer look at the implementation, but I have some remarks already


data PerasVoteStakeDistr
getPerasVoteStakeOf :: PerasVoteStakeDistr -> VoterId -> PerasVoteStake
getPerasVoteStakeOf = undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put an error here, stating why it hasn't been implemented yet, and maybe a comment with possibilities for future implementations

-- | TODO: this may become a Ledger protocol parameter
-- see https://github.com/tweag/cardano-peras/issues/119
quorumThreshold :: PerasVoteStake
quorumThreshold = PerasVoteStake 0.75
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be too much trouble to rebase the PR on top of #1784 and make it part of PerasParams?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be done at some point :)

, Eq (IdOf (PerasCert blk))
, Show (IdOf (PerasCert blk))
, NoThunks (IdOf (PerasCert blk))
, Serialise (IdOf (PerasCert blk))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, do you need all those instances when you already know HasId (PerasCert blk) and type IdOf (ValidatedPerasCert blk) = IdOf (PerasCert blk)?
I'm not saying you should change anything, but it seems redundant to me. Maybe you could say something like IdOf (PerasCert blk) ~ IdOf (ValidatedPerasCert blk) in the constraints?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum HLS was complaining when I only had HasId (PerasCert blk) => and suggested that I introduced the apparently-redundant constraints.

I'll try with (HasId (PerasCert blk), IdOf (PerasCert blk) ~ IdOf (ValidatedPerasCert blk)) =>! This may be the part that I was missing.

-- see https://github.com/tweag/cardano-peras/issues/73
instance StandardHash blk => BlockSupportsPeras blk where
newtype PerasCfg blk = PerasCfg
data PerasCfg blk = PerasCfg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark as above: would it be possible to add the quorum to PerasParams from #1784 instead?

, Eq (IdOf (PerasVote blk))
, Show (IdOf (PerasVote blk))
, NoThunks (IdOf (PerasVote blk))
, Serialise (IdOf (PerasVote blk))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark as above: could it be simplified with something like IdOf (PerasVote blk) ~ IdOf (ValidatedPerasVote blk)?

-- the vote was actually added, or if it was already present.
, getVoteSnapshot :: STM m (PerasVoteSnapshot blk)
-- ^ Interface to read the known votes, mostly for diffusion
, getForgedCertForRound :: PerasRoundNo -> STM m (Maybe (ValidatedPerasCert blk))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have STM m here and not in other functions? Should we get STM everywhere, or leave the choice of using STM to the individual implementations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually reused most of the signatures from the PerasCertDB interface. Which were designed by Alex, and I remember that already at that time, I wasn't fully satisfied by the alternating use of either STM m or m. But Alex seems to prefer it that way, and seemed to have more context than me about the project, so I didn't push my concerns any further.

Anyway, let me try to remember the rationale:

  • addCert is in m because it can throw exceptions, and because it might need on-disk storage
  • getCertSnapshot is in STM m because it only reads the in-memory index. getCertsAfter from PerasCertSnapshot should be in m because it may require loading from disk, but Alex argued to keep it pure at the moment since we only keep stuff in memory (it's the part I disagreed with)
  • garbageCollect is in m because again it might alter on-disk state.

Tbh, I'm not sure we need the concept of "snapshot" as intermediary interfaces if we don't have actual on-disk storage. The role of the snapshot is IMO to split between what we can get immediately (i.e. in-memory index), for which we just need STM, and the heavier stuff (actually loading objects from either memory or disk) that might require m. But even this justification doesn't work anymore (for both certs and votes) as the methods in the snapshot are not using m at all...

I don't think we ever intend to store votes in memory, so we could probably remove PerasCertSnapshot and simplify this part. And in a second time, it might be worth revisiting the interface of PerasCertDB too.

What do you think @nbacquey @agustinmista

Copy link
Contributor

@agustinmista agustinmista Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we ever intend to store votes in memory, so we could probably remove PerasCertSnapshot and simplify this part. And in a second time, it might be worth revisiting the interface of PerasCertDB too.

I would've thought that a first implementation would be completely in memory. Testing would also be much easier if we don't have to mock up disk IO. Perhaps you meant:

"I don't think we ever intend to store votes on disk"?

But to your second point, yes, I would like both the CertDB and the VoteDB to share a common API as much as possible. If that means dropping support for snapshots (unless those are strictly necessary even for a testnet), then I'm all for it.

deriving stock Generic
deriving newtype (Enum, Eq, Ord, Num, Bounded, NoThunks, Serialise)

newtype PerasVoteStake = PerasVoteStake {unPerasVoteStake :: Rational}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's just me wanting to be fancy, but is there a type for rationals between 0 and 1?

PerasTargetVoteState blk 'Candidate ->
PerasTargetVoteState blk 'Loser
candidateToLoser cfg (PerasTargetVoteCandidate tally) =
if voteTallyAboveQuorum cfg tally
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check tally > (1-quorum) to trigger an error instead?

Either
(PerasForgeErr blk)
(Either (PerasTargetVoteState blk 'Candidate) (PerasTargetVoteState blk 'Winner))
updateCandidateVoteState cfg vote oldState =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the function would be a bit more readable if we used the Monad instance of the outer Either with do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also introduce an ad-hoc type with three variants and flatten this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is an issue if in a single place in the code we have a nested Either. I think we already have many ad-hoc types in this module, and this new one wouldn't be very meaningful IMO. But I agree with @nbacquey suggestion, we could use do for error-handling and Right/Left explicit use for possible returned variants

Copy link
Contributor

@agustinmista agustinmista Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

If you're feeling fancy, you could also pull PerasTargetVoteState out of the inner Either and return and NS over the two possible return variants of the same indexed type:

Either 
  (PerasForgeErr blk)
  (NS (PerasTargetVoteState blk) '[Candidate, Winner])

This makes it more evident that you're returning a new vote state constrained to only 2 of the 3 existing variants.

updateLoserVoteState cfg vote oldState =
let newVoteTally = updateTargetVoteTally vote (ptvsVoteTally oldState)
aboveQuorum = voteTallyAboveQuorum cfg newVoteTally
in if aboveQuorum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other parts of the code, we raise an error if the loser gets above the tally (or 1-tally). Why don't we do that here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other parts of the code we raise an error as it is not supposed to happen (i.e. the error is here to enforce an internal invariant of the implementation, something that shouldn't ever fail except if we made a mistake in the impl).

Here however, we are in a place where a loser could theoretically go above quorum even if our implementation is valid and bug-free. That's why we return an Either where the Left case indicates an error that is not our responsibility. E.g., if the vote validation logic is faulty, a loser could go above quorum even with a perfectly valid implementation on our end (as voteDB implementers).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g., if the vote validation logic is faulty

If we assume that the vote validation logic could be faulty. Wouldn't we need to turn errors into Eithers in other places as well? Or is this the only place where that problem could manifest? Looks a bit like a slippery slope of responsibility delegation to me 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place where we depends on correctness of "external" input (here, data provided by other modules), that's why we have proper error handling using Either. In other places where I used error, we are not depending on external factors, only on proper implementation on our side.

Typically, if we check if a {candidate,loser} gets above the quorum every time we update it, then it shouldn't be possible to have a candidate with more than quorum stake in our list of candidates. The first situation is dealt with using Either, the latter is dealt with using error (as it cannot happened except if we have a faulty implem in this current module).

@tbagrel1 tbagrel1 changed the base branch from peras-staging to peras/main-pr/votedb-api December 12, 2025 08:24
@tbagrel1 tbagrel1 changed the title Add Vote and VoteDB for Peras Add VoteDB and Vote aggregation structure for Peras Dec 12, 2025
@agustinmista agustinmista force-pushed the peras/main-pr/votedb-api branch 4 times, most recently from fd401fc to 0006522 Compare December 16, 2025 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants