Status: Work-In-Progress
ToDo: add metrics, event exporters
Every pipeline core emits Kubernetes-style conditions alongside its phase:
- Accepted: `True` once admission/config validation succeeds, `False`/`Unknown` otherwise.
- Ready: `True` when the core is healthy enough to process telemetry (policy-driven), `False`/`Unknown` otherwise.
Each condition carries:
- status: `True`, `False`, or `Unknown`.
- reason: a strongly-typed `ConditionReason` enum (e.g. `ConfigValid`, `Pending`, `QuorumNotMet`, `ForceDeleting`, `Unknown(String)`).
- message: human-readable context.
- last_update: timestamp of the most recent transition.
Pipeline-level conditions are synthesized from the per-core conditions (respecting quorum policies), so API consumers no longer have to infer health from legacy "phase" strings.
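As a rough illustration of the condition shape described above, the following Rust sketch models a single condition. The type names `ConditionStatus` and `ConditionKind`, the `Condition` struct layout, and the `observe` helper are assumptions made for this example; only the field names and the `ConditionReason` variants come from the text, and the real enum may well have more variants.

```rust
use std::time::SystemTime;

/// Tri-state status, following the Kubernetes convention.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConditionStatus {
    True,
    False,
    Unknown,
}

/// Which aspect of the core a condition describes (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConditionKind {
    Accepted,
    Ready,
}

/// Only the variants named in the text are listed here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConditionReason {
    ConfigValid,
    Pending,
    QuorumNotMet,
    ForceDeleting,
    Unknown(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Condition {
    pub kind: ConditionKind,
    pub status: ConditionStatus,
    pub reason: ConditionReason,
    pub message: String,
    pub last_update: SystemTime,
}

impl Condition {
    /// Records a new observation (hypothetical helper).
    /// `last_update` moves only when the status actually changes, so it
    /// stays the timestamp of the most recent transition.
    pub fn observe(
        &mut self,
        status: ConditionStatus,
        reason: ConditionReason,
        message: &str,
        now: SystemTime,
    ) {
        if self.status != status {
            self.last_update = now;
        }
        self.status = status;
        self.reason = reason;
        self.message = message.to_string();
    }
}
```

Passing `now` explicitly rather than calling `SystemTime::now()` inside the helper keeps the transition logic deterministic and easy to test.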
- Pending: Exists but not admitted; awaiting a decision.
- Starting: Admitted; provisioning-initialization in progress.
- Running: Serving traffic normally.
- Updating: Applying a new spec version in a controlled manner.
- RollingBack: Reverting after update failure.
- Draining: Quiescing; no new work; finishing in-flight.
- Stopped: Cleanly stopped; can be restarted with re-admission.
- Rejected(AdmissionError|ConfigRejected): Input was invalid or disallowed; fix inputs.
- Failed(RuntimeError|DrainError|RollbackFailed|DeleteError): Unrecoverable runtime/teardown failure.
- Deleting(Graceful|Forced): Teardown in progress (forced may drop in-flight work).
- Deleted: All resources removed; terminal.
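The phase list above maps naturally onto a Rust enum with payloads for the parameterized phases. This is a minimal sketch, not the project's actual definition: the payload type names (`RejectReason`, `FailReason`, `DeleteMode`) and the three helper methods are hypothetical and exist only to show how callers might query a phase.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RejectReason {
    AdmissionError,
    ConfigRejected,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailReason {
    RuntimeError,
    DrainError,
    RollbackFailed,
    DeleteError,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeleteMode {
    Graceful,
    Forced,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Starting,
    Running,
    Updating,
    RollingBack,
    Draining,
    Stopped,
    Rejected(RejectReason),
    Failed(FailReason),
    Deleting(DeleteMode),
    Deleted,
}

impl Phase {
    /// Only `Deleted` is described as terminal in the list above.
    pub fn is_terminal(self) -> bool {
        matches!(self, Phase::Deleted)
    }

    /// Phases that will not make progress without outside action
    /// (fixing inputs, or handling an unrecoverable failure).
    pub fn needs_intervention(self) -> bool {
        matches!(self, Phase::Rejected(_) | Phase::Failed(_))
    }

    /// A forced teardown may drop in-flight work.
    pub fn may_drop_in_flight(self) -> bool {
        matches!(self, Phase::Deleting(DeleteMode::Forced))
    }
}
```

Carrying the reason inside the variant means a consumer can distinguish, say, `Failed(DrainError)` from `Failed(RollbackFailed)` without parsing strings.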
First-class Kubernetes probes (/livez, /readyz):
- /livez: fails only when a non-benign `Accepted=False/Unknown` condition is observed (e.g. `AdmissionError`, `ConfigRejected`, `RuntimeError`). Pipelines with no observed runtimes are ignored.
- /readyz: fails when any tracked pipeline reports `Ready=False/Unknown` and the configured ready quorum is not met (reason surfaced via the aggregate condition).
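The probe rules above can be sketched as two small decision functions. Everything here is an assumption for illustration: the `PipelineView` snapshot type, the `accepted_benign` flag (which stands in for whatever logic classifies a reason as benign), and the choice of HTTP 200 and 503 as the success and failure status codes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    True,
    False,
    Unknown,
}

/// Hypothetical per-pipeline snapshot consumed by the probe handlers.
#[derive(Debug, Clone)]
pub struct PipelineView {
    /// Number of runtimes observed for this pipeline.
    pub observed_runtimes: usize,
    pub accepted: Status,
    /// Assumption: the caller has already decided whether a non-True
    /// `Accepted` is benign (for example, still pending admission).
    pub accepted_benign: bool,
    pub ready: Status,
}

/// Liveness: fail only on a non-benign Accepted=False/Unknown,
/// ignoring pipelines with no observed runtimes.
pub fn livez(pipelines: &[PipelineView]) -> u16 {
    let failing = pipelines
        .iter()
        .filter(|p| p.observed_runtimes > 0)
        .any(|p| p.accepted != Status::True && !p.accepted_benign);
    if failing { 503 } else { 200 }
}

/// Readiness: fail when some pipeline is not ready AND the configured
/// ready quorum (computed elsewhere) is not met.
pub fn readyz(pipelines: &[PipelineView], ready_quorum_met: bool) -> u16 {
    let any_not_ready = pipelines.iter().any(|p| p.ready != Status::True);
    if any_not_ready && !ready_quorum_met { 503 } else { 200 }
}
```

Note the asymmetry: liveness tolerates pending or unobserved pipelines so that a slow start does not trigger a restart, while readiness is gated by the quorum so that traffic is withheld until enough pipelines are healthy.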
- ProbePolicy:
  - live_if: phases considered alive (default: all except `Deleted`).
  - ready_if: phases considered ready (default: `Running` and optionally `Updating`).
- Quorum: All | AtLeast(n) | Percent(p) (of non-Deleted cores).
- AggregationPolicy:
  - core_probe: ProbePolicy
  - live_quorum (default AtLeast(1))
  - ready_quorum (default All; popular alternative: Percent(80))
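To make the interaction between the probe policy and the quorums concrete, here is a minimal Rust sketch of the aggregation step. It rests on several assumptions: `Phase` is reduced to a payload-free subset, `live_if` and `ready_if` are modeled as function pointers rather than phase sets, the `is_met` and `evaluate` helpers are hypothetical, and the behavior for a pipeline with zero non-Deleted cores (where `All` and `Percent` are vacuously met) is a choice made for this example, not something the text specifies.

```rust
/// Simplified phase set for this sketch (no payloads).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Starting,
    Running,
    Updating,
    Draining,
    Stopped,
    Deleted,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quorum {
    All,
    AtLeast(usize),
    Percent(u8),
}

impl Quorum {
    /// `matching` cores out of `total` non-Deleted cores.
    pub fn is_met(self, matching: usize, total: usize) -> bool {
        match self {
            Quorum::All => matching == total,
            Quorum::AtLeast(n) => matching >= n,
            // Integer form of matching / total >= p / 100.
            Quorum::Percent(p) => matching * 100 >= total * p as usize,
        }
    }
}

pub struct ProbePolicy {
    pub live_if: fn(Phase) -> bool,
    pub ready_if: fn(Phase) -> bool,
}

impl Default for ProbePolicy {
    fn default() -> Self {
        ProbePolicy {
            // Default: every phase except Deleted counts as alive.
            live_if: |p: Phase| p != Phase::Deleted,
            // Default: only Running counts as ready (Updating is opt-in).
            ready_if: |p: Phase| p == Phase::Running,
        }
    }
}

pub struct AggregationPolicy {
    pub core_probe: ProbePolicy,
    pub live_quorum: Quorum,
    pub ready_quorum: Quorum,
}

impl Default for AggregationPolicy {
    fn default() -> Self {
        AggregationPolicy {
            core_probe: ProbePolicy::default(),
            live_quorum: Quorum::AtLeast(1),
            ready_quorum: Quorum::All,
        }
    }
}

impl AggregationPolicy {
    /// Returns `(live, ready)` for a pipeline given its per-core phases.
    pub fn evaluate(&self, cores: &[Phase]) -> (bool, bool) {
        // Quorums are computed over non-Deleted cores only.
        let active: Vec<Phase> = cores
            .iter()
            .copied()
            .filter(|p| *p != Phase::Deleted)
            .collect();
        let total = active.len();
        let live = active.iter().filter(|p| (self.core_probe.live_if)(**p)).count();
        let ready = active.iter().filter(|p| (self.core_probe.ready_if)(**p)).count();
        (
            self.live_quorum.is_met(live, total),
            self.ready_quorum.is_met(ready, total),
        )
    }
}
```

For example, with four `Running` cores, one `Starting` core, and one `Deleted` core, the defaults yield live but not ready, because `All` requires five of five. Switching `ready_quorum` to `Percent(80)` makes the same pipeline ready, since four of five is exactly 80%.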