
[nexus] Managing local rack -> managing local fleet #1276

@smklein

Some operations within Nexus are implemented as "manage the state which may exist within my local rack". This includes:

  • Awaiting handoff from RSS
  • Ensuring a rack-wide CRDB instance exists
  • Ensuring sufficient redundancy for services exists within a rack

However, longer-term, we would ideally migrate many of these operations to be "fleet-wide" instead of "rack-wide". This way, a single Nexus could control multiple racks simultaneously, ensure that CRDB nodes are distributed within an AZ, and ensure that service redundancy is sufficient to survive multi-rack failure scenarios.
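
To make the "multi-rack failure" goal concrete, here is a minimal sketch of what a fleet-wide redundancy check could look like. It is not taken from Nexus: `RackId`, `surviving_instances`, and `tolerates_rack_loss` are hypothetical names, and the sketch assumes each service instance is described only by the rack it runs on.

```rust
use std::collections::HashMap;

/// Hypothetical rack identifier; not a type from Omicron.
type RackId = u32;

/// Returns how many instances of a service remain after the worst-case loss
/// of `racks_lost` racks, i.e. losing the racks that host the most instances.
/// `placements` holds one entry per instance: the rack that instance runs on.
fn surviving_instances(placements: &[RackId], racks_lost: usize) -> usize {
    // Count instances per rack.
    let mut per_rack: HashMap<RackId, usize> = HashMap::new();
    for rack in placements {
        *per_rack.entry(*rack).or_insert(0) += 1;
    }

    // Sort the per-rack counts in descending order, then drop the largest
    // `racks_lost` of them to model the worst case.
    let mut counts: Vec<usize> = per_rack.into_values().collect();
    counts.sort_unstable_by(|a, b| b.cmp(a));
    counts.into_iter().skip(racks_lost).sum()
}

/// Returns true if at least `min_instances` survive the worst-case loss of
/// `racks_lost` racks.
fn tolerates_rack_loss(
    placements: &[RackId],
    racks_lost: usize,
    min_instances: usize,
) -> bool {
    surviving_instances(placements, racks_lost) >= min_instances
}
```

With this model, three instances on one rack (`[1, 1, 1]`) do not tolerate the loss of a single rack, while the same three instances spread across racks `[1, 2, 3]` keep two instances alive. A rack-local view cannot express that distinction, which is the motivation for moving the check to the fleet level.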

For additional context, see: https://github.com/oxidecomputer/omicron/pull/1234/files/28d87f51ab88cce3d8ff2560a8996904e8c78f81#diff-5a93a4691987ea1b28d848375a2728abcb26cec85d477d051243cb1863198392

Labels: customer (for any bug reports or feature requests tied to customer requests)
