Description
Some operations within Nexus are implemented as "manage the state that may exist within my local rack". These include:
- Awaiting handoff from RSS
- Ensuring a rack-wide CRDB instance exists
- Ensuring sufficient redundancy for services exists within a rack
Longer-term, however, we would ideally migrate many of these operations to be "fleet-wide" instead of "rack-wide". A single Nexus could then control multiple racks simultaneously, ensure that CRDB nodes are distributed within an AZ, and ensure that service redundancy is sufficient to survive multi-rack failure scenarios.
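To make the difference between the two scopes concrete, here is a minimal sketch of a rack-scoped redundancy check next to a fleet-scoped one. All names (`RackId`, `ServiceInstance`, `rack_has_redundancy`, `fleet_survives_rack_loss`) are hypothetical and are not taken from Omicron. The fleet-scoped policy shown, "enough instances remain after losing the worst-case single rack", is only one assumed interpretation of what multi-rack redundancy might mean.

```rust
use std::collections::BTreeMap;

/// Hypothetical rack identifier (an assumption, not an Omicron type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct RackId(pub u32);

/// Hypothetical record of one running service instance.
#[derive(Clone, Debug)]
pub struct ServiceInstance {
    pub kind: String,
    pub rack: RackId,
}

/// Rack-scoped check: does `rack` alone run at least `min` instances of `kind`?
pub fn rack_has_redundancy(
    instances: &[ServiceInstance],
    rack: RackId,
    kind: &str,
    min: usize,
) -> bool {
    instances
        .iter()
        .filter(|i| i.rack == rack && i.kind == kind)
        .count()
        >= min
}

/// Fleet-scoped check: after losing whichever single rack hosts the most
/// instances of `kind`, do at least `min` instances remain across the fleet?
pub fn fleet_survives_rack_loss(
    instances: &[ServiceInstance],
    kind: &str,
    min: usize,
) -> bool {
    // Count instances of `kind` per rack.
    let mut per_rack: BTreeMap<RackId, usize> = BTreeMap::new();
    for i in instances.iter().filter(|i| i.kind == kind) {
        *per_rack.entry(i.rack).or_insert(0) += 1;
    }
    let total: usize = per_rack.values().sum();
    // The worst case is losing the rack with the most instances.
    let worst_loss = per_rack.values().copied().max().unwrap_or(0);
    total - worst_loss >= min
}
```

Under these assumptions, three CRDB instances on one rack pass the rack-scoped check but fail the fleet-scoped one, while one instance on each of three racks does the reverse. That is the kind of placement decision a rack-local view cannot make.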
For additional context, see: https://github.com/oxidecomputer/omicron/pull/1234/files/28d87f51ab88cce3d8ff2560a8996904e8c78f81#diff-5a93a4691987ea1b28d848375a2728abcb26cec85d477d051243cb1863198392