Description
If an error is thrown at any point while submitting graph updates to the chain, the graphUpdate job is marked as failed and is retried the next time it comes up in the queue (if the maximum number of retries has not been exceeded).
On the retry attempt, if some transactions were submitted to the chain (while others threw an error), we detect that and do not try to re-update the graph, because we could be stepping on our own pending updates and get a StalePage error. Instead, we just queue a child job to await the completion of the submitted transactions.
While this avoids burning capacity with likely StalePage errors, it could result in an incomplete user graph, as the transactions that failed to be submitted will not be recreated.
☝🏻 Note that this would be a very rare scenario, as a user would need to be following > 7k in order for us to submit more than a single capacity batch transaction for their graph.
(Once we get a provider that supports private friendship, that number will be much lower, since far fewer PRIds fit in a graph page.) Still, it's unlikely that a user will require > 10 graph pages.
Discussion
One possible solution is to track this in the job state. If we detect pending transactions for a user graph, we should still await them, but record a special state in the job that indicates we need to re-verify/re-update the user graph. In this state, instead of awaiting the child job, we'll simply add a delay and re-try once all pending transactions are resolved. If some transactions were missed, when we fetch the graph again we'll detect a delta and submit another update; if there's no delta, the job will successfully complete as a no-op.
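A minimal sketch of the decision logic proposed above, in TypeScript. Every name here (`GraphUpdateJobState`, `needsReverify`, `nextStep`, the step kinds) is hypothetical and not taken from the actual service code. The graph is reduced to a list of connection ids, and the delta only covers missing additions, which is a simplification.

```typescript
// Hypothetical job state: records that pending transactions were seen,
// so the graph must be re-verified before the job can complete.
interface GraphUpdateJobState {
  needsReverify: boolean;
}

// Hypothetical outcomes of one processing attempt.
type StepResult =
  | { kind: "delay"; state: GraphUpdateJobState }
  | { kind: "submit"; delta: string[]; state: GraphUpdateJobState }
  | { kind: "complete"; reverified: boolean; state: GraphUpdateJobState };

function nextStep(
  state: GraphUpdateJobState,
  pendingTxCount: number,
  desired: readonly string[],
  onChain: readonly string[],
): StepResult {
  // Transactions for this graph are still in flight: do not touch the
  // graph (avoids StalePage), flag the job, and retry after a delay.
  if (pendingTxCount > 0) {
    return { kind: "delay", state: { needsReverify: true } };
  }

  // Nothing pending: compare the desired graph with the freshly fetched one.
  const existing = new Set(onChain);
  const delta = desired.filter((id) => !existing.has(id));

  if (delta.length > 0) {
    // Some updates were missed on the earlier attempt: submit them now.
    return { kind: "submit", delta, state: { needsReverify: false } };
  }

  // No delta: the job completes as a no-op.
  return {
    kind: "complete",
    reverified: state.needsReverify,
    state: { needsReverify: false },
  };
}
```

In this sketch, a retry that sees pending transactions returns `delay` with the flag set. A later attempt with no pending transactions either submits the missing connections or completes as a no-op. How the delay is scheduled and how pending transactions are counted depend on the queue and chain client in use and are left out.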