docs: complete SLO guardrails documentation across all references (#307)
* docs: add slo:<name> revert reason across all documentation
Add slo:<name> as a valid revert reason to every location that
enumerates revert reasons (metrics.md, troubleshooting.md, cli.md,
api.md, first-30-days.md, canary-rollout.md, quickstart.md, index.md,
safety.md). The SLO guardrails feature was fully implemented in code
and tested (PR #306) but the documentation predated the feature and
did not list the new reason type.
Also adds an SLO guardrails subsection to docs/architecture/safety.md
explaining the fail-open design, evaluation window behavior, and
mitigation guidance.
Minor: add Deployment metadata labels to the SLO guardrails E2E test
for consistency with other E2E tests.
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* docs: add SLO guardrails to SPEC.md auto-revert triggers and fix stale code comments
- SPEC.md: add SLO guardrail breach as 5th auto-revert trigger in
  section 7.2, add sloGuardrails and safetyObservationPeriod to the
  updateStrategy spec example
- internal/safety/monitor.go: update SafetyVerdict.Reason comment to
  include slo:<name>, update CheckPod docstring to list all 6 checks
  in actual execution order (including throttle and SLO guardrails)
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* docs: complete safety check enumeration in remaining doc locations
Update 4 locations that still listed incomplete safety check types:
- resize-api.md Mermaid diagram label
- SPEC.md directory tree comment
- SPEC.md Phase 3 checklist
- why-attune.md comparison table
All now list all 5 check types: OOMKill, throttle, restart, NotReady,
SLO guardrails.
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* docs: add 4 missing updateStrategy fields to inheritable fields table
Add safetyObservationPeriod, sloGuardrails, canary, and initialSizing
to the Inheritable UpdateStrategy Fields table in configuration.md.
These fields are part of the UpdateStrategy struct and were correctly
listed in the namespace defaults summary row but missing from the
dedicated table.
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
- **slo:<name>**: an SLO guardrail query breached its threshold after resize. Review the guardrail's PromQL query and threshold in `updateStrategy.sloGuardrails`.