Improve Stability in the Payment Process #598

Open
0 of 7 issues completed
@utkarshg6

Description

Bert and I identified a couple of problems that are likely responsible for the frequent delinquency bans in the network.

The current mechanism doesn't seem reliable enough: pending payments are evaluated at a rather chaotic success rate, and the results often disagree with what the common online blockchain explorers show. We probably shouldn't strive to pinpoint every little eventuality of this process. Among other things, the behavior manifests differently at the individual Node runners participating in QA tests, so it seems to be quite variable in nature. What we have been able to conclude, though, is that our old design can be greatly improved by implementing the following ideas:

  • Throughout the research it became more and more obvious that our code lacks the ability to resubmit transactions that, for some reason, have never completed. It's a crucial feature and probably a gold standard in the web3 ecosystem. If we implement a reasonable time limit for transactions to complete, it makes sense to retry with a slightly higher gas price than the one used previously. That may imply the need to remember more information about pending transactions than we do at the moment. In particular, we should now also keep their nonce and gas price.
    It's unclear how important it is to keep the gas price, because with every retry we will probably query the blockchain for the recommended average gas price, which should be more meaningful than what we remember about the previously submitted transactions. Of course, if we found that the recommended price had dropped below the one used previously, something would be quite wrong and we'd have to take care of it. Therefore, let's check with an if statement that this always holds, just to be sure the code is well prepared for reality.
    Although we do try to get the most relevant gas price through the RPC call, it's recommended that we add some extra on top to ensure the transaction will be attractive to the miners.

  • From what we found in our research, our choice of the eth_getTransactionReceipt method from the Web3 JSON-RPC standard has been suboptimal or even wrong. The big difference from eth_getTransactionByHash is that the latter can also tell whether the supposedly pending transaction is still waiting in the transaction pool or has actually disappeared, in which case it becomes a fool's errand to keep checking on its completion. It's commonly stated that transactions can be dropped from the transaction pool under certain circumstances. We think we witnessed cases of this kind in the collected logs: a transaction that could not be found by the third-party search tools was still regarded as pending, which resulted in a high number of attempts to query its receipt.
    Therefore, replacing the method with one that gives more general results is vital. We need to be able to sense when we should definitely submit a certain transaction anew in this situation. (This will require some consideration of how to arrange it so that it can share code with the case where a transaction has truly been pending, but for too long, which might be an hour, and we also order its resubmission.)

  • There is a serious design issue which we had been completely unaware of until now. It involves the use of eth_getTransactionCount every time we are about to compose and sign raw transactions and pack them in a batch.
    As long as everything goes right and this set of payments is found complete before the next sending of payments, no issues occur. However, if we still have viable pending transactions (which we do believe will complete) and we proceed to the next step (basically we move from the PendingPayableScanner towards the PayableScanner), we suddenly commit a blunder by querying the transaction count, reasoning that the returned value, plus one, should be used for the nonce of the first composed transaction. This is wrong because we already have a transaction in the pending state holding that nonce and, above all, that transaction is meant for a different creditor, with a payment relevant only to that particular creditor, along with other unrelated information.
    The consequences are severe, as this attempt will be understood by the blockchain nodes as if we had just asked them to replace the pending transactions. It doesn't matter that the new ones point to a completely different address. Therefore, the former transactions will be doomed (there is no chance that they will ever complete after this point). While we have just wiped out some transactions which we did intend to let complete, it all happened on the blockchain side. Besides potentially prolonging the payment gap for these particular creditors, we unfortunately don't even reflect it on our side, and we still keep the records in the pending_payable table. If we removed them (though we should cope with the problem differently, not this way either), we could at least reduce some of the damage we will suffer.
    One of the effects is that these already overridden transactions will never succeed and will probably stay in this table until we consider them to have been there for too long (this time limit isn't in place in the old design). Secondly, if we detect these entries for certain wallets, it blocks further payments for those wallets, implying they will for sure see no payments over this period of time. (In the old design, it basically means never, because even the hard limit, maybe about two days, would only mark these as failures. There has been no mechanism in the code that could fix this kind of failed transaction, so it's left to the user.)

  • We realized during the debugging work that it was tough for us to reconstruct the history of payments and their final status (not to mention whether they had been registered in the transaction pool or whether that was just our illusion). Following from this, we proposed that it would be great to keep a history of the attempted payments, both completed and pending, possibly all together within a single table. It might be necessary, though, to cap it at e.g. 5000 entries.
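
The first two ideas in the list above (resubmission and the switch to eth_getTransactionByHash) could share one decision routine. The following is a minimal Python sketch under stated assumptions, not the Node's actual code: `PendingTx`, `Resubmission`, `classify`, `plan_resubmission`, the one-hour limit and the 10% bump are all hypothetical names and values chosen for illustration. It relies only on the standard behavior that eth_getTransactionByHash returns null for a transaction the queried node doesn't know, and that eth_getTransactionReceipt returns null until the transaction has been mined.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TxState(Enum):
    CONFIRMED = "confirmed"  # mined and succeeded
    FAILED = "failed"        # mined but reverted
    PENDING = "pending"      # still known to the node, not mined yet
    DROPPED = "dropped"      # the node no longer knows the transaction


@dataclass
class PendingTx:
    # Hypothetical record; nonce and gas price are the extra fields the issue proposes to keep.
    tx_hash: str
    nonce: int
    gas_price_wei: int
    submitted_at: float  # Unix timestamp of the last submission


@dataclass
class Resubmission:
    nonce: int
    gas_price_wei: int


def classify(tx_by_hash: Optional[dict], receipt: Optional[dict]) -> TxState:
    """Combine the results of eth_getTransactionByHash and eth_getTransactionReceipt."""
    if receipt is not None:
        # "status" is a hex string: "0x1" for success, "0x0" for failure.
        return TxState.CONFIRMED if int(receipt["status"], 16) == 1 else TxState.FAILED
    if tx_by_hash is None:
        return TxState.DROPPED
    return TxState.PENDING


def plan_resubmission(
    tx: PendingTx,
    state: TxState,
    now: float,
    recommended_gas_price_wei: int,
    timeout_s: float = 3600.0,  # assumed limit; the text mentions "might be an hour"
    bump_percent: int = 10,     # assumed margin added on top of the price
) -> Optional[Resubmission]:
    """Return what to resubmit, or None if the transaction should be left alone."""
    timed_out = state is TxState.PENDING and now - tx.submitted_at >= timeout_s
    if state is not TxState.DROPPED and not timed_out:
        return None
    # Guard for the case where the recommended price is below the one used before.
    floor = max(recommended_gas_price_wei, tx.gas_price_wei)
    bumped = floor + floor * bump_percent // 100
    # Reuse the same nonce so the new transaction takes the place of the old one.
    return Resubmission(nonce=tx.nonce, gas_price_wei=bumped)
```

Both the dropped and the timed-out case end in the same `Resubmission`, which is the code sharing the second idea asks for. Whether a dropped transaction really needs a bumped price is a design choice left open here.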

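For the nonce problem described in the third idea above, one possible approach is to derive the nonces of a new batch from both the node's answer and the nonces we have recorded for transactions that are still pending. This is only a rough Python sketch: `next_nonces` and its parameters are hypothetical, and it assumes the pending records already carry their nonce, as proposed in the first idea.

```python
from typing import Iterable, List


def next_nonces(base_nonce: int, pending_nonces: Iterable[int], batch_size: int) -> List[int]:
    """Pick nonces for a new batch without reusing one held by a pending transaction.

    base_nonce: the first nonce we would have used based on eth_getTransactionCount alone.
    pending_nonces: nonces recorded for transactions we still expect to complete.
    """
    taken = set(pending_nonces)
    # Start above everything still pending so that nothing gets replaced by accident.
    start = max([base_nonce] + [n + 1 for n in taken])
    return list(range(start, start + batch_size))
```

With `base_nonce=5` and pending nonces 5 and 6, a batch of two gets nonces 7 and 8 instead of overwriting the pending payments. Querying eth_getTransactionCount with the "pending" block parameter is another option, but it only reflects the pool of the node being asked, so local bookkeeping may still be needed.
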
There is, however, also room for improvement on the receiving side. We could eliminate some delays between booking a received payment and lifting a delinquency ban for this subject if there has been one. There is a card referring to this issue, #GH-759: Making delinquency banning fairer. We recommend playing this card along with the others to maximize the effect.
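
Returning to the payment-history idea from the last bullet in the list above, a capped table could look roughly like this. The sketch uses Python's built-in sqlite3 module. The table name `payment_history`, its columns and the helpers `open_history` and `record_attempt` are invented for illustration; only the 5000-entry cap comes from the text.

```python
import sqlite3

MAX_ENTRIES = 5000  # cap suggested in the text


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) history table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS payment_history (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               tx_hash TEXT NOT NULL,
               creditor_wallet TEXT NOT NULL,
               nonce INTEGER NOT NULL,
               gas_price_wei INTEGER NOT NULL,
               status TEXT NOT NULL
           )"""
    )
    return conn


def record_attempt(conn, tx_hash, creditor_wallet, nonce, gas_price_wei, status,
                   max_entries=MAX_ENTRIES):
    """Insert one attempted payment and trim the table to the newest max_entries rows."""
    with conn:  # insert and trim are committed together
        conn.execute(
            "INSERT INTO payment_history "
            "(tx_hash, creditor_wallet, nonce, gas_price_wei, status) "
            "VALUES (?, ?, ?, ?, ?)",
            (tx_hash, creditor_wallet, nonce, gas_price_wei, status),
        )
        conn.execute(
            "DELETE FROM payment_history WHERE id NOT IN "
            "(SELECT id FROM payment_history ORDER BY id DESC LIMIT ?)",
            (max_entries,),
        )
```

This version trims strictly by age. A real design would probably want to exempt rows that are still pending from trimming, which is left out here for brevity.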

    Projects

    • Status

      🏗 Development In Progress
