Summary
Conduit does not handle SIGTERM. pkg/conduit/entrypoint.go:70 registers only os.Interrupt (SIGINT):
signal.Notify(signalChan, os.Interrupt)
docker stop, kubectl delete pod, and systemctl stop all send SIGTERM, which hits Go's default disposition — the process dies immediately with zero drain. This violates documented data-integrity invariant 7 ("Shutdown is graceful by default. SIGTERM drains in-flight records and checkpoints before exit").
Related gaps (same fix)
- A second SIGINT calls a bare os.Exit(exitCodeInterrupt) (entrypoint.go:77-78) that bypasses the graceful-stop path entirely.
- registerCleanupV2 force-stop escalation (pkg/conduit/runtime.go:499-519) force-stops after a fixed fraction of exitTimeout regardless of checkpoint completion; ForceStop cancels the connector context without verifying no un-acked record was already forwarded downstream (invariant-1 adjacency).
Severity
With at-least-once (inv. 3) and crash-safe positions (inv. 2) holding, an un-drained SIGTERM behaves like a crash — recoverable on restart, not silent data loss. But it violates a documented invariant, produces duplicate storms and unclean checkpoints on every Kubernetes pod recycle, and undermines the container/12-factor deployment story. Pre-existing (not a v0.15.0 regression).
Fix (v0.15.1, Tier 1 — data path)
- Design doc first (docs/design-documents/): signal set, drain sequence, grace deadline, force-stop-respects-checkpoint semantics.
- Register SIGTERM; make the drain path the default; remove the second-signal os.Exit bypass; fix V2 force-stop to escalate only after checkpoint or a hard deadline, never mid-ack.
- Regression + chaos test (seeds tests/chaos, which doesn't exist yet): SIGTERM/SIGKILL at random points under load → assert no double-delivery beyond at-least-once and no lost/torn checkpoint.
Tier
Tier 1 (data path) — requires human sign-off + failure-mode analysis per CLAUDE.md.
Scoped in the Phase 1 execution plan (#2518), §0.1.