
chore(smoke-nuke): parallel stack deletes + 6h timeout #355

Merged
chrisns merged 1 commit into main from chore/smoke-nuke-parallel on May 21, 2026


@chrisns chrisns commented May 21, 2026

Previous nuke serialized deletes; 9+ orphans × 30-90 min each blew past the 180m cap. Rewrite to fire every delete-stack in parallel and poll all in one loop. Bumps timeout to 360m as headroom for worst case.

Previous nuke runs serialized force_delete_stack per orphan, and a
worst-case orphan took 60+60+30 min across its attempts. 9+ orphans ×
30-90 min each blows past the 180-min workflow cap: nuke #4 was
cancelled mid-delete on the second stuck stack, with the umbrella + 9
stacks already gone but two PaperlessNgx/BopsPlanning orphans not yet
down.

Rewrite PHASE 1 to fire delete-stack on every matching stack
simultaneously, then poll all of them in one loop. CFN's per-stack
delete is what's slow, not the API; running 16 in parallel is bounded
by the slowest single stack rather than the sum.
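
For reviewers, a minimal sketch of the fire-then-poll shape described above, written in Python rather than in whatever the workflow script actually uses. `start_delete` and `get_status` are hypothetical callables standing in for the CloudFormation delete and describe calls; the function name, the poll interval, the `TIMED_OUT` label and the convention that `get_status` returns `None` once a stack no longer exists are all assumptions, not taken from the real script.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional

# CloudFormation statuses at which a delete has stopped making progress.
TERMINAL = {"DELETE_COMPLETE", "DELETE_FAILED"}


def delete_all_parallel(
    stacks: Iterable[str],
    start_delete: Callable[[str], None],          # hypothetical: issues the delete
    get_status: Callable[[str], Optional[str]],   # hypothetical: None == stack gone
    poll_seconds: float = 30.0,                   # assumed poll interval
    deadline_seconds: float = 90 * 60,            # 90-min cap from the description
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    pending: List[str] = list(stacks)

    # Fire every delete up front; the request returns quickly even though
    # the stack itself may take a long time to go away.
    for name in pending:
        start_delete(name)

    results: Dict[str, str] = {}
    started = clock()

    # One loop polls every stack that is still deleting, so wall-clock
    # time tracks the slowest stack rather than the sum of all of them.
    while pending and clock() - started < deadline_seconds:
        still_pending: List[str] = []
        for name in pending:
            status = get_status(name)
            if status is None:
                results[name] = "DELETE_COMPLETE"
            elif status in TERMINAL:
                results[name] = status
            else:
                still_pending.append(name)
        pending = still_pending
        if pending:
            sleep(poll_seconds)

    # Anything left when the deadline passes is reported, not retried.
    for name in pending:
        results[name] = "TIMED_OUT"
    return results
```

Injecting `sleep` and `clock` keeps the loop testable without waiting on real deletes; in the sketch a `DELETE_FAILED` stack simply leaves the pending set, whereas the PR handles it inside the same iteration.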

Each iteration also handles DELETE_FAILED inline (retain-retry, then
force-retain-everything) so we don't need a separate sequential pass.
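
The script itself is not shown here, so the following is only one possible reading of "retain-retry, then force-retain-everything", sketched in Python as a pure decision function. It assumes the first retry retains just the resources that failed to delete and the second retains every resource not yet deleted; `StackState`, `plan_retry`, the `give-up` action and the example resource names are hypothetical, and how the retained IDs are handed to CloudFormation is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StackState:
    # Number of retries already issued for this stack after DELETE_FAILED.
    attempts: int = 0


def plan_retry(
    state: StackState, resource_statuses: Dict[str, str]
) -> Tuple[str, List[str]]:
    """Pick the next escalation step for a stack that is in DELETE_FAILED.

    resource_statuses maps logical resource ID -> resource status.
    Returns (action, logical IDs to retain on the next delete attempt).
    """
    failed = sorted(
        rid for rid, status in resource_statuses.items()
        if status == "DELETE_FAILED"
    )
    remaining = sorted(
        rid for rid, status in resource_statuses.items()
        if status != "DELETE_COMPLETE"
    )

    if state.attempts == 0:
        # First escalation: retain only what refused to delete.
        state.attempts += 1
        return ("retain-retry", failed)
    if state.attempts == 1:
        # Second escalation: retain everything that is still present.
        state.attempts += 1
        return ("force-retain-everything", remaining)
    # Assumed end state: stop retrying and surface the stack to a human.
    return ("give-up", [])
```

Because the function only inspects per-stack state, it can be called from inside the shared polling loop each time a stack reports `DELETE_FAILED`, which is what removes the need for a separate sequential pass.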

Bump workflow timeout-minutes 180 → 360 so even worst-case parallel
deletes (where every stack hits its 90-min internal cap simultaneously)
can complete.
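
The timing argument can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch using only figures quoted in this description (9 orphans, 30-90 min per stack, 180 and 360 min caps); the helper names are made up, and 9 is the lower bound of "9+".

```python
ORPHANS = 9                            # "9+ orphans" (lower bound)
PER_STACK_MIN, PER_STACK_MAX = 30, 90  # minutes per stack delete
OLD_CAP, NEW_CAP = 180, 360            # workflow timeout-minutes, before/after


def serial_minutes(count: int, per_stack: int) -> int:
    # Serialized deletes: each stack waits for the previous one.
    return count * per_stack


def parallel_minutes(count: int, per_stack: int) -> int:
    # Parallel deletes: bounded by the slowest single stack.
    return per_stack if count else 0


serial_best = serial_minutes(ORPHANS, PER_STACK_MIN)        # 270
serial_worst = serial_minutes(ORPHANS, PER_STACK_MAX)       # 810
parallel_worst = parallel_minutes(ORPHANS, PER_STACK_MAX)   # 90

# Even the best serial case exceeds the old cap.
print(serial_best > OLD_CAP)      # True
# The worst parallel case fits within the new cap.
print(parallel_worst <= NEW_CAP)  # True
# Headroom left under the new cap, in minutes.
print(NEW_CAP - parallel_worst)   # 270
```
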
@chrisns chrisns requested a review from a team as a code owner May 21, 2026 08:50
@chrisns chrisns added this pull request to the merge queue May 21, 2026
Merged via the queue into main with commit e858dfb May 21, 2026
6 of 7 checks passed
@chrisns chrisns deleted the chore/smoke-nuke-parallel branch May 21, 2026 08:50