[Feature] Add --max-tokens guard to fail-fast when packed output exceeds a token budget #1616

@fuleinist

Description

Add a --max-tokens <N> (or --token-budget <N>) CLI/config flag that fails the run with a clear error when the packed output would exceed N tokens. CI pipelines and model-aware workflows can then guarantee that the output fits a target model's context window before spending time generating a multi-megabyte file.

Problem

Today, when repomix packs a large codebase, the output can balloon past an LLM's context window. The only built-in mitigation is --split-output, which silently splits the output into multiple files; it does not fail fast. A pipeline that ships a 1.2M-token packed file to a 200k-context model will hit a runtime error far from the cause, after minutes of packing and writing.

This is a recurring pain point in CI and agent harnesses where the packed file is the only input an LLM sees, and the token budget is a hard contract.

Proposed solution

Introduce a top-level output.tokenBudget / --max-tokens <N> option that:

  1. Counts tokens incrementally during packing (reusing the existing tokenCountTree / tokenCount.encoding machinery).
  2. Aborts the run as soon as the running total exceeds N, with a clear RepomixError such as: ✖ Packed output exceeds token budget: <actual> > <limit> tokens. Use --compress, refine --include/--ignore, or --split-output.
  3. Surfaces a corresponding output.tokenBudget field in repomix.config.json so it can be pinned per project.
  4. Exits with a non-zero code (a new NangoCliExitCode-style constant, e.g. TokenBudgetExceeded) so CI can detect the failure.
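
As a rough illustration of steps 1 and 2, here is a minimal sketch of an incremental guard. Every name in it (`TokenBudgetGuard`, `TokenBudgetExceededError`, `CountTokens`, `countWords`) is hypothetical and not taken from repomix's codebase, and the whitespace-based counter only stands in for a real tokenizer encoding.

```typescript
// Hypothetical sketch: none of these names are repomix internals.

/** Error raised when the running token total passes the budget. */
class TokenBudgetExceededError extends Error {
  readonly actual: number;
  readonly limit: number;

  constructor(actual: number, limit: number) {
    super(
      `Packed output exceeds token budget: ${actual} > ${limit} tokens. ` +
        "Use --compress, refine --include/--ignore, or --split-output.",
    );
    this.name = "TokenBudgetExceededError";
    this.actual = actual;
    this.limit = limit;
  }
}

/** Assumed tokenizer signature; a real encoding would be plugged in here. */
type CountTokens = (text: string) => number;

/** Accumulates token counts and throws as soon as the budget is exceeded. */
class TokenBudgetGuard {
  private total = 0;
  private readonly countTokens: CountTokens;
  private readonly limit: number | undefined;

  // An undefined `limit` means "unlimited", which preserves current behavior.
  constructor(countTokens: CountTokens, limit?: number) {
    this.countTokens = countTokens;
    this.limit = limit;
  }

  /** Adds one chunk of packed text and returns the new running total. */
  add(text: string): number {
    this.total += this.countTokens(text);
    if (this.limit !== undefined && this.total > this.limit) {
      throw new TokenBudgetExceededError(this.total, this.limit);
    }
    return this.total;
  }
}

// Crude whitespace-based counter, for illustration only (not a real encoding).
const countWords: CountTokens = (text) =>
  text.split(/\s+/).filter((word) => word.length > 0).length;
```

Calling `add` once per file as it is packed means the run stops at the first file that pushes the total over the limit, rather than after the whole output has been written.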

This is distinct from --split-output (which chunks the output after packing) and from --compress (which reduces the size of the packed content): --max-tokens is a guard, not a transform.

Use case

  • A repomix step in a CI pipeline for a 200k-token-context target model: repomix --max-tokens 180000 fails fast instead of producing a 600k-token file that breaks the downstream LLM call.
  • A Claude Code / Codex / Cursor workflow that calls repomix before each turn: pin the budget in repomix.config.json so oversized projects are rejected with an actionable error.
  • A pre-PR bot that runs repomix on a diff branch and gates the PR on the packed file fitting the team's review-model context window.
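
For the CI and bot cases above, what the pipeline actually observes is the process exit status. The sketch below shows one way a budget overflow could be mapped to a dedicated non-zero code. The `ExitCode` values, `TokenBudgetExceededError`, and `runGuarded` are all assumptions made for illustration; repomix's real exit handling may look different.

```typescript
// Hypothetical exit codes; the numeric values are arbitrary choices for this sketch.
const ExitCode = {
  Success: 0,
  GenericError: 1,
  TokenBudgetExceeded: 2,
} as const;

type ExitCodeValue = (typeof ExitCode)[keyof typeof ExitCode];

/** Hypothetical error type signalling that the token budget was exceeded. */
class TokenBudgetExceededError extends Error {
  readonly actual: number;
  readonly limit: number;

  constructor(actual: number, limit: number) {
    super(`Packed output exceeds token budget: ${actual} > ${limit} tokens.`);
    this.name = "TokenBudgetExceededError";
    this.actual = actual;
    this.limit = limit;
  }
}

/** Runs a packing step and translates its outcome into an exit code. */
function runGuarded(pack: () => void): ExitCodeValue {
  try {
    pack();
    return ExitCode.Success;
  } catch (error) {
    // Match on `name` so the check does not depend on prototype chains.
    if (error instanceof Error && error.name === "TokenBudgetExceededError") {
      return ExitCode.TokenBudgetExceeded;
    }
    return ExitCode.GenericError;
  }
}

// A CLI entry point could then set `process.exitCode = runGuarded(...)`.
```

A distinct code lets a CI step tell "the repository is too large for the target model" apart from an ordinary crash and react differently, for example by retrying with --compress.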

Why this is small

The token-counting path already exists (output.tokenCountTree, output.tokenCount.encoding, TOKEN_ENCODINGS). The change is: read a new optional numeric field, accumulate the running total, throw on overflow, write a config schema entry, plumb one CLI flag. No new dependencies. Backward compatible (default = unlimited, preserves current behavior).
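
To make the "read a new optional numeric field" and "plumb one CLI flag" parts concrete, here is a small sketch of resolving the effective budget. The helper names (`parseBudget`, `resolveTokenBudget`) are hypothetical, and the precedence (CLI flag over config file) is an assumption of this sketch, not something the proposal specifies.

```typescript
/**
 * Validates a raw budget value. All names here are hypothetical.
 * Returns undefined when no value was supplied, meaning "unlimited".
 */
function parseBudget(raw: unknown, source: string): number | undefined {
  if (raw === undefined || raw === null) return undefined;
  // CLI values arrive as strings; config values are expected to be numbers.
  const value = typeof raw === "string" ? Number(raw) : raw;
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    throw new Error(`${source} must be a positive integer, got: ${String(raw)}`);
  }
  return value;
}

/**
 * Resolves the effective token budget.
 * Assumed precedence: --max-tokens first, then output.tokenBudget.
 */
function resolveTokenBudget(
  cliValue?: string,
  configValue?: unknown,
): number | undefined {
  const fromCli = parseBudget(cliValue, "--max-tokens");
  if (fromCli !== undefined) return fromCli;
  return parseBudget(configValue, "output.tokenBudget");
}
```

When neither source supplies a value, the result is `undefined`, so existing setups keep running without any limit.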
