Description
Recently, some accounts exceeded their quotas to such a degree that it caused severe service degradation for other users. To prevent this, a "soft" form of quota enforcement should be implemented.
Fleet Manager should be able to slow down data ingestion for a specific account by "holding" (i.e. delaying the response to) batch requests containing failed messages, saga data, and audited messages for 10-20 seconds after having forwarded the data. Since these batch request types account for the vast majority of the consumed resources, I expect this to be very effective at limiting the ingestion rate.
The "slow-down-mode" should be activated once Fleet Manager detects that the projected usage for a quota is greater than 150% of that quota, and it should be lifted again once the projected usage drops below 120%. Since the quota projection is not calculated until 24 hours of each quota period have elapsed, "slow-down-mode" cannot be activated within the first 24 hours of a quota period.
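The two thresholds form a hysteresis band: between 120% and 150% the mode simply stays as it was, which keeps it from flapping. A minimal sketch of that rule in Python, chosen here only for illustration; `next_mode` and the constant names are hypothetical, and leaving the mode unchanged while no projection exists is an assumption rather than something the description specifies:

```python
from typing import Optional

ACTIVATE_ABOVE = 1.50  # enter slow-down mode when projected usage > 150% of quota
LIFT_BELOW = 1.20      # leave slow-down mode when projected usage < 120% of quota


def next_mode(currently_slow: bool, projected_usage: Optional[float]) -> bool:
    """Return True if the account should be in slow-down mode.

    projected_usage is a fraction of the quota (1.5 == 150%), or None while
    no projection exists yet (the first 24 hours of a quota period).
    """
    if projected_usage is None:
        # Assumption: without a projection the mode is left untouched,
        # so it can never be *activated* during the first 24 hours.
        return currently_slow
    if not currently_slow and projected_usage > ACTIVATE_ABOVE:
        return True
    if currently_slow and projected_usage < LIFT_BELOW:
        return False
    # Inside the 120%-150% band (or already in the right mode): no change.
    return currently_slow
```

An account at 130% therefore stays in whichever mode it was already in; only crossing 150% upwards or 120% downwards flips it.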
- UI that is capable of displaying a bar at the top of the screen with an appropriate warning
- Background poller in UI capable of querying the AccountController for the current status ("normal"/"slow-mode")
- New controller method that gets the current mode for the current account
- Persistence (possibly part of AccountRepository?) that remembers the current mode
- Background service in backend that monitors projected quota usage and manages the current mode
- Persistence in command db that is capable of storing account IDs currently in slow-mode
- Background checker in API that queries the command db for account IDs currently in slow-mode
- API modification in batch handlers that checks: IF we are in slow-mode for the current account AND the batch contained failed messages, audited messages, or saga snapshot bodies THEN wait 15 s before returning OK to the client
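The batch handler check in the last item could look roughly like the following. This is a sketch under assumptions, not the actual handler: `Batch`, `should_hold`, `handle_batch`, and the `forward` callback are hypothetical names, and the set of slow-mode account IDs is assumed to be kept up to date by the background checker that polls the command db.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Set

HOLD_SECONDS = 15.0  # within the 10-20 s range from the description


@dataclass
class Batch:
    account_id: str
    failed_messages: int = 0
    audited_messages: int = 0
    saga_snapshots: int = 0


def should_hold(batch: Batch, slow_mode_accounts: Set[str]) -> bool:
    """True if the account is in slow-mode AND the batch carries heavy data."""
    has_heavy_data = (
        batch.failed_messages > 0
        or batch.audited_messages > 0
        or batch.saga_snapshots > 0
    )
    return batch.account_id in slow_mode_accounts and has_heavy_data


async def handle_batch(
    batch: Batch,
    slow_mode_accounts: Set[str],
    forward: Callable[[Batch], Awaitable[None]],
    hold_seconds: float = HOLD_SECONDS,
) -> str:
    # Forward first, so no data is lost or delayed on the server side...
    await forward(batch)
    # ...then hold the response to throttle the client's send rate.
    if should_hold(batch, slow_mode_accounts):
        await asyncio.sleep(hold_seconds)
    return "OK"
```

Because the delay comes after forwarding, the data is ingested as usual and only the client's next request is pushed back.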
That ought to do it 🙂