Obedient Wool Canary
Medium
maybeResendFromStore
in relayer.go
in vigilante module
, currently resends the same checkpoint tx again, no matter what the error returned from the json-rpc function GetRawTransactionFunc
is, which in some situations could obstruct the correct flow, as we could end up submitting the same transaction twice.
Bitcoin rpc errors: https://github.com/btcsuite/btcd/blob/bb52d7d78d9cf335e0611b9ae06ad8c77e75de0b/btcjson/jsonrpcerr.go#L177
When submitting a checkpoint tx to bitcoin, we use maybeResendFromStore
which has a fail-safe mechanism which only proceeds with sending the transaction if there is an error when calling GetRawTransactionFunc
, this is to prevent sending already send transactions, if for example the submitter service is restarted.
However there is a flaw, currently we do that no matter what the returned error is, this poses a problem as there is possibility that error could be returned due to problems with the rpc( rate limits exceeded, network error & etc ), or other json-rpc error codes that are not transaction not found
.
func (rl *Relayer) SendCheckpointToBTC(ckpt *ckpttypes.RawCheckpointWithMetaResponse) error {
.........
if rl.shouldSendCompleteCkpt(ckptEpoch) || rl.shouldSendTx2(ckptEpoch) {
@>> hasBeenProcessed, err := maybeResendFromStore(
ckptEpoch,
rl.store.LatestCheckpoint,
rl.GetRawTransaction,
rl.sendTxToBTC,
)
if err != nil {
return err
}
if hasBeenProcessed {
return nil
}
}
....
}
@>> // maybeResendFromStore - checks if we need to resubmit txns from a store
@>> // in case "submitter" service was restarted, we want to ensure that we don't send txns again for a checkpoint
@>> // that has already been processed.
@>> // Returns true if the first transactions are in the mempool (no resubmission needed),
@>> // and false if any transaction was re-sent from the store.
func maybeResendFromStore(
epoch uint64,
getLatestStoreCheckpoint GetLatestCheckpointFunc,
getRawTransaction GetRawTransactionFunc,
sendTransaction SendTransactionFunc,
) (bool, error) {
storedCkpt, exists, err := getLatestStoreCheckpoint()
if err != nil {
return false, err
} else if !exists {
return false, nil
}
if storedCkpt.Epoch != epoch {
return false, nil
}
maybeResendFunc := func(tx *wire.MsgTx) error {
txID := tx.TxHash()
@>> _, err = getRawTransaction(&txID) // todo(lazar): check for specific not found err
@>> if err != nil {
@>> _, err := sendTransaction(tx)
if err != nil {
@>> return err //we could end up here
}
// we know about this tx, but we needed to resend it from already constructed tx from db
return nil
}
// tx exists in mempool and is known to us
return nil
}
@>> if err := maybeResendFunc(storedCkpt.Tx1); err != nil {
return false, err
}
@>> if err := maybeResendFunc(storedCkpt.Tx2); err != nil {
return false, err
}
return true, nil
}
A few bad scenarios could happen here:
- If in reality
storedCkpt.Tx2
needs to be resend, but we failed onmaybeResendFunc(storedCkpt.Tx1)
because we falsely send the same tx1 again, we would not be able to re-submitstoredCkpt.Tx2
. - The
MaybeResubmitSecondCheckpointTx
flow will not be executed because we callcontinue
.
func (s *Submitter) processCheckpoints() {
defer s.wg.Done()
quit := s.quitChan()
for {
select {
case ckpt := <-s.poller.GetSealedCheckpointChan():
s.logger.Infof("A sealed raw checkpoint for epoch %v is found", ckpt.Ckpt.EpochNum)
if err := s.relayer.SendCheckpointToBTC(ckpt); err != nil {
s.logger.Errorf("Failed to submit the raw checkpoint for %v: %v", ckpt.Ckpt.EpochNum, err)
s.metrics.FailedCheckpointsCounter.Inc()
@>> continue
}
if err := s.relayer.MaybeResubmitSecondCheckpointTx(ckpt); err != nil {
s.logger.Errorf("Failed to resubmit the raw checkpoint for %v: %v", ckpt.Ckpt.EpochNum, err)
s.metrics.FailedCheckpointsCounter.Inc()
}
s.metrics.SecondsSinceLastCheckpointGauge.Set(0)
case <-quit:
// We have been asked to stop
return
}
}
}
- getRawTransaction returns an error, due to network connection ( for example )
- sendTransaction(tx) is called, however it fails with error as its already submitted to the bitcoin chain
- Error is propagated from
relayer.go maybeResendFromStore
tosubmitter.go
- As result even though checkpoint 2 needed to be re-submitted its not. ( Either with the
maybeResendFunc(storedCkpt.Tx2)
flow, or with the bump fee flow in MaybeResubmitSecondCheckpointTx )
If sendTx is called for the 1st checkpoint tx, when the tx is already on bitcoin chain, the operation will fail and will obstruct the flow, possibly leading to not correctly submitting the 2nd tx if it needed to be resubmitted.
The correct behaviours would be to handle separately RPC/curl request error or any json-rpc error code different from "transaction not found".