Description
Add a --max-tokens <N> (or --token-budget <N>) CLI/config flag that fails the run with a clear error when the packed output would exceed N tokens, so CI pipelines and model-aware workflows can guarantee the output fits a target model's context window before spending time generating a multi-megabyte file.
Problem
Today, when repomix packages a large codebase, the output can balloon past an LLM's context window. The only built-in mitigation is --split-output, which silently splits the output into multiple files; it does not fail fast. A pipeline that ships a 1.2M-token packed file to a 200k-context model will hit a runtime error far from the cause, after minutes of packing and writing.
This is a recurring pain point in CI and agent harnesses where the packed file is the only input an LLM sees, and the token budget is a hard contract.
Proposed solution
Introduce an output.tokenBudget config option, exposed on the CLI as --max-tokens <N>, that:
- Counts tokens incrementally during packing (reusing the existing tokenCountTree / tokenCount.encoding machinery).
- Aborts the run as soon as the running total exceeds N, with a clear RepomixError like: ✖ Packed output exceeds token budget: <actual> > <limit> tokens. Use --compress, refine --include/--ignore, or --split-output.
- Is surfaced as a corresponding output.tokenBudget field in repomix.config.json so it can be pinned per-project.
- Exits with a non-zero code (a new NangoCliExitCode-style constant, e.g. TokenBudgetExceeded) so CI can detect it.
This is distinct from --split-output (which chunks the output after packing) and from --compress (which reduces its size). --max-tokens is a guard, not a transform.
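The guard behaviour described above could be sketched roughly as follows. This is a minimal illustration, not repomix's actual code: TokenBudgetExceededError, SourceFile, countTokens and packWithBudget are hypothetical names, and the whitespace-based countTokens is only a placeholder for the existing token-counting machinery.

```typescript
// All names below are hypothetical stand-ins, not repomix's real API.

class TokenBudgetExceededError extends Error {
  readonly actual: number;
  readonly limit: number;

  constructor(actual: number, limit: number) {
    super(
      `Packed output exceeds token budget: ${actual} > ${limit} tokens. ` +
        "Use --compress, refine --include/--ignore, or --split-output.",
    );
    this.name = "TokenBudgetExceededError";
    this.actual = actual;
    this.limit = limit;
  }
}

interface SourceFile {
  path: string;
  content: string;
}

// Placeholder tokenizer (assumption): counts whitespace-separated words.
// A real implementation would call the configured token encoding instead.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Accumulates a running token total file by file and throws as soon as the
// total exceeds the budget. An undefined budget means "unlimited".
function packWithBudget(files: SourceFile[], tokenBudget?: number): number {
  let runningTotal = 0;
  for (const file of files) {
    runningTotal += countTokens(file.content);
    if (tokenBudget !== undefined && runningTotal > tokenBudget) {
      throw new TokenBudgetExceededError(runningTotal, tokenBudget);
    }
  }
  return runningTotal;
}
```

One consequence of aborting early: the <actual> figure in the message is the running total at the moment of the abort, so it is a lower bound on the full packed size rather than the final count. A real implementation would presumably also need to count the non-file parts of the output (headers, directory structure), which this sketch ignores.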
Use case
- A repomix step in a CI pipeline for a 200k-token-context target model: repomix --max-tokens 180000 fails fast instead of producing a 600k-token file that breaks the downstream LLM call.
- A Claude Code / Codex / Cursor workflow that calls repomix before each turn: pin the budget in repomix.config.json so over-sized projects are rejected with an actionable error.
- A pre-PR bot that runs repomix on a diff branch and gates the PR on the packed file fitting the team's review-model context window.
Why this is small
The token-counting path already exists (output.tokenCountTree, output.tokenCount.encoding, TOKEN_ENCODINGS). The change is: read a new optional numeric field, accumulate the running total, throw on overflow, write a config schema entry, plumb one CLI flag. No new dependencies. Backward compatible (default = unlimited, preserves current behavior).
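To illustrate how small the option-plumbing part could be, here is a sketch of resolving the budget from the CLI flag and the config field. The names BudgetSources, parseBudget and resolveTokenBudget are hypothetical, and two behaviours are assumptions not stated above: the CLI flag taking precedence over the config value, and only positive integers being accepted.

```typescript
// Hypothetical shape: raw values as they might arrive from the CLI parser
// and from repomix.config.json before validation.
interface BudgetSources {
  cliMaxTokens?: string; // value passed to --max-tokens, if any
  configTokenBudget?: unknown; // output.tokenBudget from the config file, if any
}

// Validates one raw value. Returns undefined when the option is absent and
// throws when it is present but not a positive integer.
function parseBudget(raw: unknown, source: string): number | undefined {
  if (raw === undefined || raw === null) {
    return undefined;
  }
  const value = typeof raw === "string" ? Number(raw) : raw;
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    throw new Error(`${source} must be a positive integer, got: ${String(raw)}`);
  }
  return value;
}

// Assumed precedence: CLI flag first, then config file, else unlimited
// (undefined), which preserves current behaviour.
function resolveTokenBudget(sources: BudgetSources): number | undefined {
  return (
    parseBudget(sources.cliMaxTokens, "--max-tokens") ??
    parseBudget(sources.configTokenBudget, "output.tokenBudget")
  );
}
```

Representing "unlimited" as undefined rather than a sentinel such as 0 or Infinity keeps the default path identical to today's behaviour: when neither source is set, no guard is installed at all.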