@davidrichards-da
Contributor

No description provided.

Signed-off-by: davidrichards-da <89472028+davidrichards-da@users.noreply.github.com>
@davidrichards-da requested review from a team as code owners January 6, 2026 16:10
Signed-off-by: davidrichards-da <89472028+davidrichards-da@users.noreply.github.com>
Contributor

@PHOL-DA left a comment

I'm hesitant about adding "HA and DR" to the wallet integration section. This seems more like something that belongs in the operate sections (and most of these cases are already covered in the exchange integration docs, just under different naming).

Recommended Architecture
~~~~~~~~~~~~~~~~~~~~~~~~

* **Redundant Validators**: Run 2 validators behind a single gateway.
Contributor

Do you mean a wallet gateway or a load balancer?

Contributor Author

Load balancer. I'll change it.


* **Redundant Validators**: Run 2 validators behind a single gateway.
* **Confirming Rights**: Host parties on both validators with confirming rights.
* **Threshold Configuration**: Implement a confirming threshold of 1/2. This ensures that if one validator goes offline, the remaining node can still authorize transactions.
Contributor

Could you add a code snippet showing how to do this using the Wallet SDK?

Contributor Author

I can't. From what I can see, we don't set a confirming threshold in our multi-hosting parties example either.
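
For illustration only, the intent of the 1/2 threshold can be modelled without any SDK. The sketch below is not Wallet SDK code. The function `can_confirm` and the node names are hypothetical, and it only assumes that a transaction is authorized once at least `threshold` of the party's confirming nodes are reachable.

```python
def can_confirm(threshold: int, hosting_nodes: set, online_nodes: set) -> bool:
    """Return True if enough confirming nodes are online to meet the threshold.

    Hypothetical model of a confirming threshold, not an SDK or ledger API.
    """
    reachable = hosting_nodes & online_nodes
    return len(reachable) >= threshold


hosting = {"validator-a", "validator-b"}

# Threshold 1 of 2: one validator may be offline.
one_down = can_confirm(1, hosting, {"validator-b"})

# Threshold 2 of 2: the same outage blocks confirmation.
strict = can_confirm(2, hosting, {"validator-b"})
```

With these inputs `one_down` is `True` and `strict` is `False`, which is the availability trade-off the bullet describes.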

* **Node Scaling**: Host more validators with a low confirming threshold.

----------------------
Disaster Recovery (DR)
Contributor

Would it not make more sense to simply link to:
https://docs.dev.sync.global/validator_operator/validator_disaster_recovery.html

You seem to be linking to it in all three subsections anyway.

Contributor Author

Yes, but a link alone doesn't put into perspective what can be recovered.

@@ -0,0 +1,73 @@
=======================================
High Availability and Disaster Recovery
Contributor

You could make a strong argument that this does not fit well within the Integrate sections and should rather be in the Operate sections of the docs.

Contributor Author

We could, but as discussed in the meeting, we won't.

Signed-off-by: davidrichards-da <89472028+davidrichards-da@users.noreply.github.com>
Signed-off-by: davidrichards-da <89472028+davidrichards-da@users.noreply.github.com>
Recommended Architecture
~~~~~~~~~~~~~~~~~~~~~~~~

* **Redundant Validators**: Run 2 validators behind a load balancer.

This doesn't work as written, because offsets are node-local. Failover needs some client-side handling.
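
One possible shape for that client-side handling, as a minimal sketch rather than a documented procedure: keep a separate read position per node and deduplicate on an identifier that is the same on every node. The names `Update`, `FailoverCursor`, and the `update_id` field are hypothetical, and the sketch assumes each update carries such a node-independent identifier.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    offset: int      # position in ONE node's local stream; not valid on another node
    update_id: str   # assumed to be identical on every node that sees the update


@dataclass
class FailoverCursor:
    """Tracks read progress per node and deduplicates updates across nodes."""
    offsets: dict = field(default_factory=dict)  # node name -> last offset read there
    seen: set = field(default_factory=set)       # update ids already processed

    def resume_offset(self, node: str) -> int:
        # Never reuse another node's offset; a node not read before starts at 0.
        return self.offsets.get(node, 0)

    def accept(self, node: str, update: Update) -> bool:
        """Record progress on `node`; return True only for updates not yet processed."""
        self.offsets[node] = max(self.offsets.get(node, 0), update.offset)
        if update.update_id in self.seen:
            return False
        self.seen.add(update.update_id)
        return True


cursor = FailoverCursor()
cursor.accept("validator-a", Update(10, "tx-1"))
cursor.accept("validator-a", Update(11, "tx-2"))

# validator-a goes down; validator-b numbers the same updates differently.
start = cursor.resume_offset("validator-b")                    # 0, not 11
replayed = cursor.accept("validator-b", Update(57, "tx-2"))    # False, already processed
fresh = cursor.accept("validator-b", Update(58, "tx-3"))       # True
```

The `seen` set grows without bound in this sketch; a real client would have to prune or persist it, which is outside the scope of the comment above.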

~~~~~~~~~~~~~~~~~~~~~~~~

* **Redundant Validators**: Run 2 validators behind a load balancer.
* **Confirming Rights**: Host parties on both validators with confirming rights.

This probably refers to the hosted parties, but there's also the matter of the provider party for CC preapprovals. Unfortunately the renewal automation only works if the preapproval provider is a node admin party. And if the node admin party is down, the preapproval doesn't work anymore so incoming transfers time out.

So the node admin party of one node has to be replicated to another node in confirming mode, which is a thus far undocumented procedure.


Disaster Recovery is the process of recovering from a scenario where a validator is lost completely and its immediate failovers are unavailable.

Backup Best Practices

4 participants