| 1 | +# Why a Message Bus - Stateless Async |
| 2 | + |
| 3 | +At gigawatt scale, polling breaks down quickly. If every application asks every other application for possible updates every second, little room is left for the work that actually matters. |
| 4 | + |
| 5 | +The bus model pushes signals on change and on a cadence, then lets every |
| 6 | +interested consumer receive the same signal. A producer does not need to know |
| 7 | +who is listening, and a consumer does not need to know who else cares. |
| 8 | + |
| 9 | +## Design Principles |
| 10 | + |
| 11 | +- **Push, Don’t Poll**: React quickly at very large scale without building a polling |
| 12 | + mesh. |
| 13 | +- **Publish Once, Fan Out Many**: One state publication reaches all interested |
| 14 | + consumers without per-consumer producer load. |
| 15 | +- **Decouple Producers From Consumers**: Producers publish state. Consumers |
| 16 | + independently decide how to react. |
| 17 | +- **Converge on Current State**: Consistency comes from publishing current |
| 18 | + state, not preserving a perfect message stream. Failure is corrected by later |
| 19 | + publications, so the system self-heals. |
| 20 | + |
| 21 | +## Polling at 1 GW |
| 22 | + |
| 23 | +Leak detection shows why polling breaks down. A liquid rack leak needs fast |
| 24 | +reaction, and more than one consumer may care. Host management, workload |
| 25 | +migration, facility response, alerting, and analysis may all need it. Waiting |
| 26 | +for the next minute or five-minute poll is too slow, and having every consumer |
| 27 | +poll the BMS every second is wasteful. |
| 28 | + |
| 29 | +The BMS publishes the leak state when it changes. The bus fans that |
| 30 | +publication out to the consumers that need it. |
| 31 | + |
| 32 | +```mermaid |
| 33 | +flowchart LR |
| 34 | + BMS["BMS leak state"] -->|publish on detection| Bus["Bus"] |
| 35 | + Bus --> NICo["Host management"] |
| 36 | + Bus --> Workload["Workload migration"] |
| 37 | + Bus --> Facilities["Facility response"] |
| 38 | + Bus --> Alerting["Alerting"] |
| 39 | + Bus --> Analysis["Analysis"] |
| 40 | +``` |
| 41 | + |
| 42 | +| Polling mesh | Stateless async bus | |
| 43 | +| :----------- | :------------------ | |
| 44 | +| Every consumer polls each producer | Producer publishes each message once | |
| 45 | +| Fast reaction needs tight polling | Fast reaction comes from push delivery | |
| 46 | +| Adding consumers adds producer load | Adding consumers adds bus load | |
| 47 | + |
| 48 | +Push delivery removes the need for every application to poll |
| 49 | +every other application, and it shortens reaction times. |
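The comparison in the table can be made concrete with rough arithmetic. The sketch below is illustrative only: the application counts, the one-second poll interval, and the change rate are assumed numbers, not measurements from a real deployment.

```python
def polling_requests_per_second(producers: int, consumers: int, interval_s: float) -> float:
    """Requests per second when every consumer polls every producer."""
    return producers * consumers / interval_s


def bus_publications_per_second(producers: int, changes_per_producer_per_s: float) -> float:
    """Publications per second when each producer publishes each change once."""
    return producers * changes_per_producer_per_s


# Hypothetical numbers, chosen only for illustration.
producers, consumers = 100, 5

polling_load = polling_requests_per_second(producers, consumers, interval_s=1.0)
bus_load = bus_publications_per_second(producers, changes_per_producer_per_s=0.5)

print(polling_load)  # 500.0 requests/s, and it grows with every added consumer
print(bus_load)      # 50.0 publications/s, independent of the consumer count
```

In the polling case the producer-side load scales with the number of consumers. In the bus case the producer-side load depends only on how often state changes, and fan-out cost moves to the bus.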
| 50 | + |
| 51 | +## Stateless by Default |
| 52 | + |
| 53 | +DSX Exchange carries live, current state events. It is not a database for every |
| 54 | +application's state. The source application owns its state, publishes current |
| 55 | +state when it changes, and periodically republishes current state at a cadence |
| 56 | +it can sustain. |
| 57 | + |
| 58 | +Consumers converge on current state. Messages should carry _current values_, not |
| 59 | +deltas. A temperature message should say the current temperature is 24, not that |
| 60 | +the temperature changed by 2. Consumers apply those current values idempotently, |
| 61 | +so repeated messages are safe. |
| 62 | + |
| 63 | +The normal flow has three parts: |
| 64 | + |
| 65 | +1. Publish when state changes. |
| 66 | +2. Periodically republish current state even when it did not change. |
| 67 | +3. Consumers process messages idempotently. |
| 68 | + |
| 69 | +```mermaid |
| 70 | +sequenceDiagram |
| 71 | + participant Source as Source application |
| 72 | + participant Bus |
| 73 | + participant Consumer |
| 74 | + |
| 75 | + Source->>Bus: Publish current value on change |
| 76 | + Bus->>Consumer: Deliver current value |
| 77 | + Consumer->>Consumer: Apply current value idempotently |
| 78 | + loop Periodic current state republish |
| 79 | + Source->>Source: Republish interval |
| 80 | + Source->>Bus: Republish current value |
| 81 | + Bus->>Consumer: Deliver current value |
| 82 | + Consumer->>Consumer: Apply current value idempotently |
| 83 | + end |
| 84 | +``` |
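The flow in the diagram can be sketched from the consumer side. This is a minimal illustration, not DSX Exchange code: the `Consumer` class and the topic name are hypothetical. It shows that overwriting with the current value makes duplicate deliveries harmless and lets a later republish repair a missed update.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    """Keeps a local view by overwriting with the latest current value."""

    view: dict[str, float] = field(default_factory=dict)

    def apply(self, topic: str, current_value: float) -> None:
        # Overwrite, never accumulate: applying the same message twice
        # leaves the view unchanged.
        self.view[topic] = current_value


consumer = Consumer()
topic = "rack42/temperature"  # hypothetical topic name

consumer.apply(topic, 22.0)
# Suppose the change to 24.0 is published but lost before it reaches the
# consumer. The periodic republish carries the same current value and
# repairs the local view.
consumer.apply(topic, 24.0)
consumer.apply(topic, 24.0)  # a duplicate delivery is harmless

print(consumer.view[topic])  # 24.0
```

Had the messages carried deltas instead, the lost "+2" could not be recovered from any later message, and a duplicate would be applied twice.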
| 85 | + |
| 86 | +Failure can mean a missed update, producer bug, broker problem, network issue, |
| 87 | +consumer bug, bad local state, data corruption, power loss across too many |
| 88 | +high-availability domains, or any other problem that leaves a consumer with the |
| 89 | +wrong state. The next update or periodic current state republish gives the |
| 90 | +consumer the current value again. |
| 91 | + |
| 92 | +This design gives the system eventual consistency and self-reconciliation. |
| 93 | +Change publications provide fast reaction, while slower scheduled republishes |
| 94 | +provide reconciliation. If local state drifts, the next changed state message or |
| 95 | +scheduled publish brings it back. |
| 96 | + |
| 97 | +Keeping the bus stateless is both a correctness choice and a performance choice. |
| 98 | +_Correctness_ comes from convergence on the source's next current-state |
| 99 | +publication: the source publishes the current value again and consumers apply it idempotently, which repairs a missed message, a stale consumer cache, or an incorrect local value. |
| 100 | +_Performance_ comes from keeping high-rate state |
| 101 | +on the live message path without turning each publication into replicated |
| 102 | +persistent state. |
| 103 | + |
| 104 | +### The Startup Problem |
| 105 | + |
| 106 | +Bootstrapping consumers is where the stateless event flow needs help. A new |
| 107 | +consumer starts with no local view. For fast-changing values, the normal stream |
| 108 | +is enough. Consumers subscribe and wait for the next current state publication. |
| 109 | + |
| 110 | +Slow-changing context is different. For example, BMS metadata barely changes, so |
| 111 | +a new consumer could wait too long to learn the context needed to interpret live |
| 112 | +values. Without a bootstrap path, the consumer may be connected and receiving |
| 113 | +live values but unable to use them correctly. |
| 114 | + |
| 115 | +In MQTT, use retained messages for this startup case. DSX Exchange persists the |
| 116 | +retained set so a new consumer can build its first view immediately. The |
| 117 | +retained set should stay small and slow-changing. Retained messages should be |
| 118 | +used as an optimization, not for correctness. The source application still owns |
| 119 | +the data and is responsible for republishing for reconciliation. Assume retained |
| 120 | +data can eventually be lost. At gigawatt scale, unlikely events happen. |
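To illustrate the startup case, here is a toy in-memory model of retained-message behavior. It is not an MQTT client and not the DSX Exchange implementation. `ToyBroker` and the topic names are hypothetical, and topics are matched exactly, without wildcards. The sketch only models the rule that a broker keeps the last retained message per topic and delivers it to a new subscriber right away.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, str], None]


class ToyBroker:
    """Toy model: keeps the last retained payload per topic, nothing else."""

    def __init__(self) -> None:
        self.retained: dict[str, str] = {}
        self.subscribers: dict[str, list[Handler]] = defaultdict(list)

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            self.retained[topic] = payload  # broker state, stored per topic
        for handler in self.subscribers[topic]:
            handler(topic, payload)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self.subscribers[topic].append(handler)
        if topic in self.retained:
            # A new subscriber receives the retained message immediately.
            handler(topic, self.retained[topic])


broker = ToyBroker()
broker.publish("bms/metadata/cdu1", '{"unit": "C"}', retain=True)  # slow-changing
broker.publish("bms/value/cdu1", "24.0")  # live value, not retained

seen: dict[str, str] = {}


def on_message(topic: str, payload: str) -> None:
    seen[topic] = payload


# The consumer starts after both publications were sent.
broker.subscribe("bms/metadata/cdu1", on_message)
broker.subscribe("bms/value/cdu1", on_message)

print(seen)  # {'bms/metadata/cdu1': '{"unit": "C"}'}
```

The late subscriber learns the metadata at once but has to wait for the next publication of the live value, which matches the split described above.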
| 121 | + |
| 122 | +This is a compromise. Retained messages _are_ broker state. Even in memory, |
| 123 | +that state has to be stored and replicated, so it has lower maximum throughput |
| 124 | +than the stateless live message path. Keep high-rate live values on the |
| 125 | +stateless path and recover missed, stale, or incorrect live values through the |
| 126 | +next current state publication. |
| 127 | + |
| 128 | +## Decoupled Intent - Remodeling a Sync Request as Async |
| 129 | + |
| 130 | +Traditional synchronous requests can often be modeled as async to gain scaling, |
| 131 | +decoupling, and self-healing benefits. |
| 132 | + |
| 133 | +A requester publishes intent when it wants another application to change |
| 134 | +something. The state owner uses its own rules to decide what to do with that intent. |
| 135 | + |
| 136 | +The state owner keeps publishing current state or status on its normal stream. |
| 137 | +That stream does not depend on an intent message. It is the ongoing source of |
| 138 | +truth for every consumer. |
| 139 | + |
| 140 | +If the owner accepts the intent, the next state or status it publishes shows the |
| 141 | +accepted value. If the owner ignores it, clamps it, or falls back, the stream |
| 142 | +shows that result instead. The requester confirms the outcome by reading the |
| 143 | +same stream as every other consumer. |
| 144 | + |
| 145 | +The state owner does not need a response topic, callback address, or connection |
| 146 | +back to the requester. |
| 147 | + |
| 148 | +```mermaid |
| 149 | +sequenceDiagram |
| 150 | + participant Requester as Requester application |
| 151 | + participant Bus |
| 152 | + participant StateOwner as State owner |
| 153 | + |
| 154 | + loop Normal state or status stream |
| 155 | + StateOwner->>Bus: Publish current state or status |
| 156 | + Bus->>Requester: Deliver current state or status |
| 157 | + end |
| 158 | + |
| 159 | + Requester->>Bus: Publish intent |
| 160 | + Bus->>StateOwner: Deliver intent |
| 161 | + StateOwner->>StateOwner: Apply local rules |
| 162 | + StateOwner->>Bus: Publish current state or status |
| 163 | + Bus->>Requester: Deliver current state or status |
| 164 | +``` |
| 165 | + |
| 166 | +## BMS Setpoint Example |
| 167 | + |
| 168 | +Even straightforward synchronous requests such as "set the target temperature |
| 169 | +for a CDU" can be modeled as async. With this model, multiple integrating systems |
| 170 | +can see and direct BMS state without increasing the load on the BMS. |
| 171 | + |
| 172 | +A CDU liquid temperature control loop has three values in the bus model: |
| 173 | + |
| 174 | +- The current temperature is what the BMS measures and publishes. |
| 175 | + |
| 176 | +```text |
| 177 | +BMS/v1/PUB/Value/CDU/LiquidTemperature/{currentTemperatureTagPath} |
| 178 | +``` |
| 179 | + |
| 180 | +- The target setpoint is the value the BMS is trying to hold. The BMS |
| 181 | +  publishes it as BMS state. |
| 182 | + |
| 183 | +```text |
| 184 | +BMS/v1/PUB/Value/CDU/LiquidTemperature/{targetSetpointTagPath} |
| 185 | +``` |
| 186 | + |
| 187 | +- The requested target setpoint is what the integration wants the BMS to use. |
| 188 | + The integration publishes it on the topic the BMS listens to. |
| 189 | + |
| 190 | +```text |
| 191 | +BMS/v1/{integration}/Value/CDU/LiquidTemperatureSpRequest/{requestTagPath} |
| 192 | +``` |
| 193 | + |
| 194 | +The requested target setpoint is intent. The BMS may apply it, ignore it, clamp |
| 195 | +it to a configured range, or fall back to a local default. The integration does |
| 196 | +not get a callback from the BMS. |
| 197 | + |
| 198 | +Confirmation comes from the BMS-published target setpoint. If the BMS accepts |
| 199 | +the request, the target setpoint changes to the accepted value. If the BMS |
| 200 | +clamps or falls back, the target setpoint shows the value the BMS actually |
| 201 | +chose. The current temperature remains the measured value. |
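The request and confirmation path can be sketched without any messaging library. In this minimal illustration the `SetpointOwner` class, the clamping rule, and the numeric range are assumptions. The text above only says the BMS may apply, ignore, clamp, or fall back.

```python
from dataclasses import dataclass


@dataclass
class SetpointOwner:
    """Stands in for the state owner (the BMS in the example above)."""

    target_setpoint: float
    min_setpoint: float  # assumed configured range
    max_setpoint: float

    def on_request(self, requested: float) -> None:
        # Local rule (an assumption for this sketch): clamp to the range.
        self.target_setpoint = max(self.min_setpoint, min(self.max_setpoint, requested))

    def publish_state(self) -> float:
        # The owner publishes what it chose, not what was asked.
        return self.target_setpoint


bms = SetpointOwner(target_setpoint=24.0, min_setpoint=18.0, max_setpoint=30.0)

requested = 35.0
bms.on_request(requested)        # intent goes in, no reply comes back
published = bms.publish_state()  # the requester reads the normal state stream

print(published)               # 30.0
print(published == requested)  # False: the request was clamped
```

The requester learns the outcome by comparing the published target setpoint with what it asked for, the same way any other consumer would observe the change.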
| 202 | + |
| 203 | +## When Not to Use a Bus |
| 204 | + |
| 205 | +Async is the right approach for live state, fan-out, and decoupled integration, but it is not the right approach for every workflow. |
| 206 | + |
| 207 | +Use a direct API when one caller needs an immediate response from one known |
| 208 | +owner before it can continue. Provisioning a machine, rebooting a known host, |
| 209 | +creating a VPC, or changing a setting that requires acknowledgement of that |
| 210 | +exact request are better as direct requests. |
| 211 | + |
| 212 | +DSX Exchange can still carry the resulting resource state after the direct API |
| 213 | +call creates or changes the resource. That gives interested consumers a native |
| 214 | +async state stream without making the bus part of the synchronous request path. |
| 215 | + |
| 216 | +## Practical Checklist for AsyncAPI Design |
| 217 | + |
| 218 | +- Publish the current value when it changes. |
| 219 | +- Republish the current value periodically at a cadence the source can sustain. |
| 220 | +- Make message handling idempotent so repeated current-value messages are safe to process. |
| 221 | +- Prefer current values over deltas. |
| 222 | +- Include a timestamp for when the value was observed or created. |
| 223 | +- Include enough identifying information in the topic, metadata, or subject for |
| 224 | + consumers to know which value is being updated. |
| 225 | +- Include a correlation field if a later status message must tie back to an intent |
| 226 | + or request. |
| 227 | +- Retain infrequently-changing metadata needed at startup. |
| 228 | +- Do not retain frequently-changing live values. |
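Several checklist items describe what a message should carry. The sketch below shows one possible payload shape. The field names (`subject`, `value`, `observed_at`, `correlation_id`) are hypothetical and are not taken from the DSX Exchange schemas.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CurrentValueMessage:
    subject: str      # which value this message updates
    value: float      # the current value, not a delta
    observed_at: str  # ISO 8601 timestamp of the observation
    correlation_id: Optional[str] = None  # set only when tying back to an intent


def encode(message: CurrentValueMessage) -> str:
    """Serialize to JSON with stable key order so repeats are byte-identical."""
    return json.dumps(asdict(message), sort_keys=True)


msg = CurrentValueMessage(
    subject="cdu1/liquid-temperature",  # hypothetical identifier
    value=24.0,
    observed_at=datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
)
payload = encode(msg)
print(payload)
```

Because the message states the full current value along with its identity and observation time, a consumer can process a repeat of it without any extra bookkeeping.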
| 229 | + |
| 230 | +## Related Docs |
| 231 | + |
| 232 | +- [Architecture](architecture.md) |
| 233 | +- [BMS Integration](bms-integration.md) |
| 234 | +- [BMS Event Bus Schema](schema-bms.mdx) |
| 235 | +- [NICo Host State Schema](schema-nico.mdx) |
| 236 | +- [Power Management Schema](schema-power-management.mdx) |