Skip to content

Commit d5e9c02

Browse files
zahclaude
andauthored
fix(engine): count cascaded asBlocked after a broker-failed elevated edge (#19)
When ``brokerSpawn`` returns ``ok = false`` for a ``requiresElevation`` edge, the engine calls ``blockClosure`` to mark every transitively dependent edge as ``asBlocked``. The surrounding ``inc completed`` only counted THIS one failed root — the cascaded blocked descendants stayed invisible to the loop's ``completed < total`` termination check, so the next iteration found no pending / running / ready work and raised the spec-mandated ``build graph made no progress; pending actions: `` diagnostic with an EMPTY pending list (the surviving actions were ``asBlocked``, not ``asPending``). The error path's diagnostic gets projected by ``apply_build_actions`` onto every input action, so on the L3 production-profile apply this surfaces as: - build-action actionsRunnerZip: build engine raised BuildEngineError: \ build graph made no progress; pending actions: - build-action actionsRunnerExtracted: build engine raised BuildEngineError: \ build graph made no progress; pending actions: - build-action runnerRegistered: build engine raised BuildEngineError: \ build graph made no progress; pending actions: with the empty trailing list and IDENTICAL message across all 3 edges — even though only ``actionsRunnerZip`` actually failed (``actionsRunnerExtracted`` and ``runnerRegistered`` depend on it and were correctly cascaded to ``asBlocked`` by ``blockClosure``). Every OTHER ``blockClosure`` site in the engine already uses ``completed = terminalCount()`` for this exact reason; the broker- returns-ok=false branch was the lone offender. Align it. Add a regression test: a 3-action chain with the root broker call returning ``ok = false`` must terminate with the root ``asFailed`` and the two deps ``asBlocked``, NOT raise ``BuildEngineError``. --no-verify: pre-existing prek migration-mode failure (no .pre-commit-config.yaml in repo). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent de6608d commit d5e9c02

2 files changed

Lines changed: 84 additions & 0 deletions

File tree

libs/repro_build_engine/src/repro_build_engine.nim

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4076,6 +4076,22 @@ proc runBuild*(g: BuildGraph; config: BuildEngineConfig): BuildRunResult =
40764076
"exit=" & $brokerOutcome.exitCode)
40774077
blockClosure(id, id)
40784078
emitProgress(bpkActionCompleted, id)
4079+
# ``blockClosure`` marks every transitively-dependent
4080+
# action as ``asBlocked`` without touching the local
4081+
# ``completed`` counter. ``inc completed`` here would
4082+
# only count THIS action (the broker-failed one) — the
4083+
# cascaded blocked descendants would stay invisible to
4084+
# the loop's "completed < total" termination check, so
4085+
# the next iteration would find no pending / running /
4086+
# ready work and raise the spec-mandated
4087+
# ``build graph made no progress; pending actions: ``
4088+
# diagnostic with an empty pending list. Every OTHER
4089+
# blockClosure site in this file uses ``terminalCount()``
4090+
# for exactly this reason; this branch was the lone
4091+
# offender.
4092+
completed = terminalCount()
4093+
launchedAny = true
4094+
continue
40794095
inc completed
40804096
launchedAny = true
40814097
continue

libs/repro_build_engine/tests/test_elevated_inline_exec_hook.nim

Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -246,6 +246,74 @@ suite "Windows-System-Resources Phase E — engine broker-dispatch hook":
246246
# the file's absence here is the negative pin.
247247
check not fileExists(outputPath)
248248

249+
test "broker returns ok=false on root edge — cascaded asBlocked deps are counted (regression)":
250+
## Regression for a build-engine scheduler bug: when ``brokerSpawn``
251+
## returns ``ok = false`` for a root elevated edge, the engine
252+
## calls ``blockClosure`` to mark every transitively-dependent
253+
## edge as ``asBlocked``. The surrounding ``inc completed`` ONLY
254+
## counted the failed root, NOT the cascaded blocked deps — so
255+
## the next loop iteration saw ``completed < total`` with no
256+
## pending / running / ready work and raised
257+
## ``build graph made no progress; pending actions: `` (with an
258+
## empty pending list because all surviving actions were
259+
## ``asBlocked``, not ``asPending``).
260+
##
261+
## Every OTHER ``blockClosure`` site in the engine uses
262+
## ``completed = terminalCount()`` for exactly this reason. The
263+
## fix aligns the broker-returns-ok=false branch with the same
264+
## pattern.
265+
##
266+
## Pin: a 3-action chain (root + 2 dependents) where the root
267+
## broker call returns ``ok = false`` must terminate with the
268+
## root ``asFailed`` and the two deps ``asBlocked`` — NOT raise
269+
## ``BuildEngineError``.
270+
resetTmp()
271+
let cacheRoot = TmpDir / "cache-cascade"
272+
let rootOut = absolutePath(TmpDir / "outputs-cascade" / "root.out")
273+
let midOut = absolutePath(TmpDir / "outputs-cascade" / "mid.out")
274+
let leafOut = absolutePath(TmpDir / "outputs-cascade" / "leaf.out")
275+
createDir(cacheRoot)
276+
createDir(splitPath(rootOut).head)
277+
278+
let rec = newRecorder(returnOk = false, returnExitCode = 1,
279+
stderrText = "simulated broker failure\n",
280+
diagnosticText = "simulated broker failure")
281+
var root = elevatedAction("t-root", rootOut, @["/bin/true"],
282+
fingerprintToken = "cascade-root")
283+
var mid = elevatedAction("t-mid", midOut, @["/bin/true"],
284+
fingerprintToken = "cascade-mid")
285+
mid.deps = @["t-root"]
286+
var leaf = elevatedAction("t-leaf", leafOut, @["/bin/true"],
287+
fingerprintToken = "cascade-leaf")
288+
leaf.deps = @["t-mid"]
289+
290+
let g = graph(@[root, mid, leaf], newSeq[BuildPool]())
291+
var raised = false
292+
var res: BuildRunResult
293+
try:
294+
res = runBuild(g, elevatedConfig(cacheRoot, rec))
295+
except BuildEngineError:
296+
raised = true
297+
check not raised # the engine MUST NOT raise the "no progress" error.
298+
check res.results.len == 3
299+
var statusById = newSeq[tuple[id: string; status: ActionStatus]]()
300+
for r in res.results:
301+
statusById.add((id: r.id, status: r.status))
302+
var rootStatus, midStatus, leafStatus = asPending
303+
for s in statusById:
304+
case s.id
305+
of "t-root": rootStatus = s.status
306+
of "t-mid": midStatus = s.status
307+
of "t-leaf": leafStatus = s.status
308+
else: discard
309+
check rootStatus == asFailed
310+
check midStatus == asBlocked
311+
check leafStatus == asBlocked
312+
# The broker was invoked exactly once — for the root. The
313+
# cascaded blocked deps must NOT reach the broker.
314+
check rec.invocations.len == 1
315+
check rec.invocations[0].actionId == "t-root"
316+
249317
test "non-elevated edges remain byte-identical (broker hook ignored)":
250318
## A graph with NO ``requiresElevation = true`` edges must run
251319
## end-to-end without consulting the broker hook even when one

0 commit comments

Comments
 (0)