Open
Description
Justification
See this comment - #1086 (comment)
Milestones
V0
- Create a new "ApiAvailability" measurement
- This measurement should periodically probe the apiserver's /healthz endpoint and record whether api was available or not
- The measurement should output the following stats
- availability percentage (e.g. % of OK responses over all responses)
- Longest consecutively unavailability period
- ...
V1
- Make the measurement fail if apiserver is consecutively not available in the last XX min (e.g. 30min).
- Make the error critical to make sure the test execution is stopped in such case
V2
- Come up with exact SLI/SLO definition, make it available in perf-dash for further analysis
- Make it a WIP Scalability SLO
- Evaluate and promote to official SLO
/good-first-issue