Problem
awf-cli-proxy fails the entire agent run when the external DIFC proxy is slow to accept connections during startup. The current liveness probe uses only 2 attempts with 1s between them — too tight for transient host-level DIFC proxy startup contention. On 2026-06-10 this knocked out two scheduled workflows (Auto-Triage #27261698585, Sub-Issue Closer #27261373050) with 0 agent turns executed before the firewall aborted.
Context
Source issue: github/gh-aw#38309
The probe resolves localhost to IPv6 [::1] but the tunnel listener may bind IPv4-only at 127.0.0.1, adding a second failure mode.
Root Cause
awf-cli-proxy entrypoint fail-fast logic: 2 liveness probe attempts at 1s intervals → immediate fatal abort on connection refused. No exponential backoff. Two concurrent scheduled jobs hit the same DIFC proxy at 07:46 UTC, exhausted the probe window before the proxy finished binding.
Proposed Solution
- Replace 2-attempt/1s fail-fast with exponential backoff (~5 attempts, ~15–30s total) in containers/api-proxy/ cli-proxy startup logic.
- Pin the localhost:18443 tunnel listener and probe to the same address family (prefer 127.0.0.1 explicitly) to eliminate IPv4/IPv6 mismatch.
Success criteria: scheduled runs survive transient DIFC proxy startup slowness; no "awf-cli-proxy could not connect" fatal in the next 7-day window.
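The retry policy proposed above, sketched as POSIX shell functions. This is an added example, not code from the awf-cli-proxy entrypoint; the function names, the environment variables and the use of nc for the connect check are assumptions.

```shell
#!/bin/sh
# Added example: bounded exponential backoff for a startup liveness probe.
# PROBE_HOST is a literal IPv4 address, so the probe never resolves
# "localhost" to [::1] while the listener is bound to 127.0.0.1 only.
PROBE_HOST="${PROBE_HOST:-127.0.0.1}"
PROBE_PORT="${PROBE_PORT:-18443}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"   # hypothetical knob
BASE_DELAY="${BASE_DELAY:-1}"       # seconds; doubles after each failure
MAX_DELAY="${MAX_DELAY:-8}"         # cap on any single wait
SLEEP_CMD="${SLEEP_CMD:-sleep}"     # overridable so tests do not wait

# One TCP connect attempt. Assumes a netcat that supports -z and -w.
tcp_probe() {
    nc -z -w 2 "$PROBE_HOST" "$PROBE_PORT" >/dev/null 2>&1
}

# Worst-case seconds spent waiting between MAX_ATTEMPTS probes.
backoff_budget() {
    total=0; delay=$BASE_DELAY; i=1
    while [ "$i" -lt "$MAX_ATTEMPTS" ]; do
        total=$((total + delay))
        delay=$((delay * 2))
        if [ "$delay" -gt "$MAX_DELAY" ]; then delay=$MAX_DELAY; fi
        i=$((i + 1))
    done
    echo "$total"
}

# wait_for_proxy [PROBE_FN]: returns 0 on the first successful probe and
# 1 once MAX_ATTEMPTS probes have failed. ATTEMPTS_USED holds the count.
wait_for_proxy() {
    probe_fn=${1:-tcp_probe}
    delay=$BASE_DELAY
    ATTEMPTS_USED=0
    while [ "$ATTEMPTS_USED" -lt "$MAX_ATTEMPTS" ]; do
        ATTEMPTS_USED=$((ATTEMPTS_USED + 1))
        if "$probe_fn"; then return 0; fi
        # No sleep after the final attempt: fail as soon as the budget is spent.
        if [ "$ATTEMPTS_USED" -ge "$MAX_ATTEMPTS" ]; then break; fi
        echo "probe $ATTEMPTS_USED/$MAX_ATTEMPTS to $PROBE_HOST:$PROBE_PORT failed; retry in ${delay}s" >&2
        "$SLEEP_CMD" "$delay"
        delay=$((delay * 2))
        if [ "$delay" -gt "$MAX_DELAY" ]; then delay=$MAX_DELAY; fi
    done
    return 1
}
```

With the defaults the waits are 1, 2, 4 and 8 seconds, so the entrypoint gives up after about 15 seconds of backoff plus connect timeouts. Logging each failed attempt with the literal address makes an address-family mismatch visible in the run log.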
Generated by Firewall Issue Dispatcher · 157.4 AIC · ⊞ 27.8K