
fix(skill): line-budget batching and split-on-output-limit retry #202

Closed
AsimRaza10 wants to merge 1 commit into Lum1104:main from AsimRaza10:fix/output-limit-batch-sizing

Conversation

@AsimRaza10
Contributor

Summary

Fixes #159: "Frequently seeing 'Batch X failed again (output limit). Retrying with minimal output mode.' on Opus + Bedrock for a 100-file repo. Can we split into smaller batches?"

Two tweaks to understand-anything-plugin/skills/understand/SKILL.md. Both are prompt-only — no code changes, no schema changes.

1. Cap batches by line count, not just file count

The current batching guidance says "20-30 files each, aim for ~25." That's fine for typical mixes, but on repos where files run 300-500 lines (web frameworks, generated config, fat test files) a 25-file batch can carry 10K+ source lines. Output tokens scale with input size, so on output-constrained models — Opus on Bedrock is the canonical case from the issue — those batches blow past max_tokens before the response finishes.

Added: a ~2,500 total-source-line cap that closes a batch early when hit, regardless of file count. The 20-30 file target becomes a ceiling rather than a quota. $FILE_LIST already carries sizeLines from Phase 1, so no new data needed at batching time.
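The rule above can be sketched in a few lines of Python. This is an illustration only: the actual change is prompt text in SKILL.md, and `plan_batches`, `FileEntry`, and the two constants are hypothetical names. Only the `sizeLines` field and the 30-file / 2,500-line figures come from this PR. The sketch assumes a batch is closed just before the file that would push it past the budget, which is one reading of "closes a batch early when hit".

```python
from dataclasses import dataclass

MAX_FILES_PER_BATCH = 30      # file-count ceiling from the existing guidance
MAX_LINES_PER_BATCH = 2_500   # line budget proposed in this PR


@dataclass
class FileEntry:
    path: str
    sizeLines: int  # mirrors the sizeLines field carried by $FILE_LIST


def plan_batches(files, max_files=MAX_FILES_PER_BATCH, max_lines=MAX_LINES_PER_BATCH):
    """Greedily pack files, closing a batch when either cap would be exceeded."""
    batches, current, running_lines = [], [], 0
    for entry in files:
        over_lines = running_lines + entry.sizeLines > max_lines
        over_files = len(current) >= max_files
        # Close the open batch first; a single oversized file still gets its own batch.
        if current and (over_lines or over_files):
            batches.append(current)
            current, running_lines = [], 0
        current.append(entry)
        running_lines += entry.sizeLines
    if current:
        batches.append(current)
    return batches


# 25 files of 400 lines each: one batch by file count, but 10,000 source lines.
big_files = [FileEntry(f"src/mod_{i}.py", 400) for i in range(25)]
batches = plan_batches(big_files)
print([len(b) for b in batches])  # prints [6, 6, 6, 6, 1]
```

With small files the line budget never triggers and the file-count ceiling alone decides the batch boundaries, which matches the intent that typical repos see no change.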

2. Split-on-output-limit retry (instead of identical retry)

The Error Handling section currently retries any subagent failure once with the same prompt. For output-limit failures specifically, that's a wasted retry — the prompt is what caused the overflow, so a verbatim retry will overflow the same way.

Added: when the failure signal is output-token overflow (e.g. output limit exceeded, max_tokens reached, truncated JSON), split the failing batch in half by file list and dispatch each half as a fresh batch with a new batchIndex (avoids batch-<index>.json collisions). Recurse if a half still hits the cap. Only fall back to skip-and-continue after the batch has been shrunk to a single file that still fails.

This means a transient output overflow gets healed by smaller batches rather than burning the retry budget and dropping that batch's contribution to the graph.
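As a rough sketch of the split-retry branch, assuming a hypothetical `dispatch(batch_index, files)` callable that raises a hypothetical `BatchFailed` error: none of these names exist in the plugin, where the orchestration is done by the main session following SKILL.md. The marker strings are illustrative and not an exhaustive list of failure signals.

```python
import itertools

OUTPUT_LIMIT_MARKERS = ("output limit", "max_tokens", "truncated json")


class BatchFailed(Exception):
    """Hypothetical error raised by the dispatch callable when a subagent fails."""


def is_output_limit_failure(error_message):
    """Heuristic check for an output-token overflow signal."""
    lowered = error_message.lower()
    return any(marker in lowered for marker in OUTPUT_LIMIT_MARKERS)


def run_with_split_retry(files, dispatch, next_index):
    """Run a batch; on output-limit failure, halve the file list and recurse.

    next_index() hands out fresh batchIndex values so output files never collide.
    Returns (completed_batch_indices, skipped_files).
    """
    index = next_index()
    try:
        dispatch(index, files)
        return [index], []
    except BatchFailed as err:
        if not is_output_limit_failure(str(err)):
            raise  # other failures keep the existing retry-once path (not shown)
        if len(files) == 1:
            return [], list(files)  # a single file still overflows: skip and continue
        mid = len(files) // 2
        done_l, skipped_l = run_with_split_retry(files[:mid], dispatch, next_index)
        done_r, skipped_r = run_with_split_retry(files[mid:], dispatch, next_index)
        return done_l + done_r, skipped_l + skipped_r


# Toy dispatcher: any batch above 2,500 source lines "overflows".
def fake_dispatch(index, files):
    if sum(lines for _, lines in files) > 2500:
        raise BatchFailed("output limit exceeded")


counter = itertools.count(10)
files = [("a.py", 1500), ("b.py", 1200), ("c.py", 3000), ("d.py", 200)]
done, skipped = run_with_split_retry(files, fake_dispatch, lambda: next(counter))
print(done, skipped)  # prints [12, 13, 16] [('c.py', 3000)]
```

In the toy run, indices 10, 11, 14, and 15 are consumed by attempts that overflowed, so every surviving batch has a unique index and only the single file that cannot fit is skipped.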

Why this is prompt-only

Both behaviours are already orchestrated by the main session reading SKILL.md — there's no separate scheduler to teach. The batching loop in Phase 2 needs one extra line of bookkeeping (running line sum); the retry logic in Error Handling needs the new branch.

What this doesn't change

  • Default file-count target (20-30) — still the headline number for users without large files.
  • Concurrency (5 batches) — unchanged.
  • Merge / dedupe / normalization path — unaffected (the merge script already handles arbitrary batchIndex values, including split-batch ones).
  • No new commands, flags, or config knobs. A future PR could expose --max-batch-lines if the maintainer wants a user override.

Files changed

| File | Change |
| --- | --- |
| understand-anything-plugin/skills/understand/SKILL.md | +3 lines: line-budget note in Phase 2 and split-retry rule in Error Handling. |

Test plan

  • Diff is prompt text only — no scripts or types touched
  • Cap value (2,500 lines) chosen to fit a typical Opus-on-Bedrock output budget with headroom for the JSON wrapper; maintainer may tune
  • Maintainer to verify on a 100-file repo with mixed file sizes that the line cap doesn't fragment small batches unnecessarily
  • Maintainer to verify split-retry behaviour on a deliberately oversized batch

Closes #159

Two changes to /understand orchestration so users on output-constrained
models (Opus on Bedrock in particular) stop hitting "Batch X failed
again (output limit). Retrying with minimal output mode." repeatedly:

1. **Cap batches by line count, not just file count.** The existing
   20-30 file target is fine for typical batches but blows up when those
   25 files are 400 lines each — output tokens scale with input size,
   not file count. Add a ~2,500 source-line cap that closes batches
   early when hit.

2. **Split-on-output-limit retry.** The Error Handling section currently
   retries any subagent failure once with the same prompt. For
   output-limit failures that just wastes a retry (the prompt is what
   caused the overflow). Instead, split the batch in half by file list
   and dispatch each half as a fresh batch with new batchIndex values.
   Recursively split if a half-batch still hits the cap; only skip after
   shrinking to a single file that still fails.

Closes Lum1104#159
@AsimRaza10
Contributor Author

@Lum1104 friendly ping when you get a chance — this one's prompt-only (no code/schema changes) and addresses #159, which a few folks have hit on Opus + Bedrock. Happy to tune the 2,500-line cap, split the two changes into separate PRs, or rework anything if either part is a sticking point. Thanks!

@AsimRaza10
Contributor Author

Closing this — PR #204 (merged just now) fixes #159 via a more comprehensive route: a programmatic compute-batches.mjs for semantic batching, plus an agent-side output-splitting protocol (batch-<i>-part-<k>.json) that lets file-analyzer self-chunk when a response would overflow. That covers the same user-visible symptom my two prompt-only tweaks were chasing, just at a different layer.
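To illustrate how part files named with the `batch-<i>-part-<k>.json` pattern could be collected per batch, here is a small sketch. It assumes `<i>` and `<k>` are decimal integers and that parts should be read in numeric order; the real handling in #204 is not described in this thread, and `group_part_files` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed filename shape, taken from the pattern quoted above.
PART_PATTERN = re.compile(r"^batch-(\d+)-part-(\d+)\.json$")


def group_part_files(filenames):
    """Map batch index -> part filenames, ordered by numeric part number."""
    parts = defaultdict(list)
    for name in filenames:
        match = PART_PATTERN.match(name)
        if match:  # unsplit outputs such as batch-4.json are ignored here
            parts[int(match.group(1))].append((int(match.group(2)), name))
    return {
        index: [name for _, name in sorted(entries)]
        for index, entries in sorted(parts.items())
    }


names = [
    "batch-3-part-2.json",
    "batch-3-part-10.json",
    "batch-3-part-1.json",
    "batch-4.json",
    "batch-7-part-1.json",
]
grouped = group_part_files(names)
print(grouped[3])  # prints ['batch-3-part-1.json', 'batch-3-part-2.json', 'batch-3-part-10.json']
```

Sorting on the parsed integer rather than the filename keeps part 10 after part 2, which a plain string sort would get wrong.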

My input-side line cap and orchestrator-level split-retry aren't present in #204, but they're not the right shape to graft onto the new architecture — the line cap would belong inside compute-batches.mjs rather than the prompt, and the split-retry is largely redundant with agent-side output splitting. Better as a fresh, narrowly-scoped PR if real-world traces on Opus/Bedrock still show overflow after #204 lands in users' hands.

Thanks @Lum1104 for the thorough fix on #159. Happy to revisit if you want a line-aware secondary cap added to compute-batches.mjs once there's evidence the file-count caps aren't enough on output-constrained models.

AsimRaza10 closed this on May 24, 2026
@Lum1104
Owner

Lum1104 commented May 24, 2026 via email


Development

Successfully merging this pull request may close these issues.

Frequently seeing output limit exceeded

2 participants