Historic sync duties queries fail for recently-activated validators #4717

Open
@michaelsproul

Description

There's a bug in the calculation of historic sync committee duties, whereby the following error is returned:

curl -X POST -H "Content-Type: application/json" --data '["155654"]' "http://localhost:5052/eth/v1/validator/duties/sync/688802"
{"code":500,"message":"UNHANDLED_ERROR: SyncDutiesError(UnknownValidator(155654))","stacktraces":[]}

(This example is from Gnosis chain, reported by one of our users)

It occurs because there's an optimisation in the endpoint that tries to avoid expensive state loads:

// Load the state at the start of the *previous* sync committee period.
// This is sufficient for historical duties, and efficient in the case where the head
// is lagging the current epoch and we need duties for the next period (because we only
// have to transition the head to start of the current period).
//
// We also need to ensure that the load slot is after the Altair fork.
let load_slot = max(
    chain.spec.epochs_per_sync_committee_period * sync_committee_period.saturating_sub(1),
    altair_fork_epoch,
)
.start_slot(T::EthSpec::slots_per_epoch());

Although validator 155654 exists on-chain at epoch 688802, the state loaded is from epoch (688802 // 512 - 1) * 512 = 688128, where it does not yet exist. Using the state from the start of the current period isn't sufficient either, because that would yield the state at epoch 688640, where the validator also doesn't exist.
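The arithmetic above can be checked with a small standalone sketch. The function names and plain `u64` epochs are illustrative assumptions, not Lighthouse's actual types, and the Altair fork epoch is passed in as a parameter rather than taken from any real chain spec.

```rust
/// Epoch whose state the endpoint would load for `request_epoch`: the start
/// of the *previous* sync committee period, clamped to the Altair fork epoch.
/// (Hypothetical helper mirroring the `load_slot` snippet, in epochs.)
fn load_epoch(request_epoch: u64, epochs_per_period: u64, altair_fork_epoch: u64) -> u64 {
    let period = request_epoch / epochs_per_period;
    let previous_period_start = epochs_per_period * period.saturating_sub(1);
    previous_period_start.max(altair_fork_epoch)
}

/// First epoch of the sync committee period containing `request_epoch`.
/// (Hypothetical helper, for comparison with `load_epoch`.)
fn current_period_start(request_epoch: u64, epochs_per_period: u64) -> u64 {
    (request_epoch / epochs_per_period) * epochs_per_period
}
```

With a period length of 512 epochs, `load_epoch(688802, 512, 0)` evaluates to 688128 and `current_period_start(688802, 512)` to 688640, matching the two epochs discussed above.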

In some sense this optimisation is still sound: the validator cannot actually be part of a sync committee at this point, since sync committees are determined 512 epochs in advance. One way to fix it would be to just avoid erroring out here:

let pubkey = self.get_validator(validator_index as usize)?.pubkey;
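As a rough illustration of what "avoid erroring out" could look like, here is a minimal sketch on simplified stand-in types. `MiniState`, `SyncDuty` and `sync_duties` are hypothetical and do not reflect Lighthouse's real `BeaconState` API; the sketch only assumes that an index missing from the loaded state can be reported as "no duty" instead of `UnknownValidator`.

```rust
/// Hypothetical, simplified duty record (not Lighthouse's real type).
#[derive(Debug, Clone, PartialEq)]
struct SyncDuty {
    validator_index: u64,
    pubkey: String,
    committee_positions: Vec<usize>,
}

/// Hypothetical stand-in for the loaded state: the validator registry
/// (indexed by validator index) and the sync committee's pubkeys.
struct MiniState {
    validator_pubkeys: Vec<String>,
    sync_committee: Vec<String>,
}

/// Returns one entry per requested index. An index that is not yet in the
/// registry yields `None` rather than an error, since such a validator
/// cannot be in a committee that was selected before it existed.
fn sync_duties(state: &MiniState, indices: &[u64]) -> Vec<Option<SyncDuty>> {
    indices
        .iter()
        .map(|&validator_index| {
            // Unknown validator: `?` short-circuits this entry to `None`.
            let pubkey = state.validator_pubkeys.get(validator_index as usize)?;
            let committee_positions: Vec<usize> = state
                .sync_committee
                .iter()
                .enumerate()
                .filter(|(_, member)| *member == pubkey)
                .map(|(position, _)| position)
                .collect();
            if committee_positions.is_empty() {
                None
            } else {
                Some(SyncDuty {
                    validator_index,
                    pubkey: pubkey.clone(),
                    committee_positions,
                })
            }
        })
        .collect()
}
```

Whether silently returning no duty is the right behaviour for the real endpoint is exactly the open question here; the sketch only shows that the lookup failure does not have to propagate as an error.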

I need to think more about the best way to fix it.

Version

Lighthouse v4.4.1

Additional Info

There may also be another bug lurking in the duties-from-head codepath, as the user reports the error occurring "live" while trying to request duties for the current epoch. I haven't spotted the bug there (yet), but will link it here if I find one.

Labels: HTTP-API, bug
