[Feature] Add --max-tokens guard to fail-fast when packed output exceeds a token budget

## Description

Add a `--max-tokens <N>` (or `--token-budget <N>`) CLI/config flag that fails the run with a clear error when the packed output would exceed N tokens. CI pipelines and model-aware workflows can then guarantee that the output fits a target model's context window before spending time generating a multi-megabyte file.

**Problem**

Today, when `repomix` packages a large codebase, the output can balloon past an LLM's context window. The only built-in mitigation is `--split-output`, which silently splits the output into multiple files; it does not fail fast. A pipeline that ships a 1.2M-token packed file to a 200k-context model will hit a runtime error far away from the cause, after minutes of packing and writing.

This is a recurring pain point in CI and agent harnesses where the packed file is the only input an LLM sees, and the token budget is a hard contract.

**Proposed solution**

Introduce an `output.tokenBudget` config field and a matching `--max-tokens <N>` CLI flag that:

1. Counts tokens incrementally during packing (reusing the existing `output.tokenCountTree` / `output.tokenCount.encoding` machinery).
2. Aborts the run as soon as the running total exceeds N, with a clear `RepomixError` like: `✖ Packed output exceeds token budget: <actual> > <limit> tokens. Use --compress, refine --include/--ignore, or --split-output.`
3. Can be pinned per-project through the `output.tokenBudget` field in `repomix.config.json`.
4. Exits with a non-zero code (a new named exit-code constant, e.g. `TokenBudgetExceeded`) so CI can detect it.

This is distinct from `--split-output`, which chunks the output after packing, and from `--compress`, which reduces its size. `--max-tokens` is a guard, not a transform.
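To make the guard behavior concrete, here is a minimal sketch of the accumulate-and-abort loop. Every name in it (`TokenBudgetGuard`, `TokenBudgetExceededError`, `packWithBudget`, `countTokensApprox`) is hypothetical and not taken from the Repomix codebase, and the whitespace-based counter is only a stand-in for the real encoding-aware tokenizer.

```typescript
// Hypothetical sketch: none of these names come from the Repomix codebase.

/** Error raised when the running token total passes the budget. */
class TokenBudgetExceededError extends Error {
  readonly actual: number;
  readonly limit: number;

  constructor(actual: number, limit: number) {
    super(
      `Packed output exceeds token budget: ${actual} > ${limit} tokens. ` +
        "Use --compress, refine --include/--ignore, or --split-output.",
    );
    this.name = "TokenBudgetExceededError";
    this.actual = actual;
    this.limit = limit;
  }
}

/** Stand-in counter: whitespace-separated words, NOT a real tokenizer. */
const countTokensApprox = (text: string): number =>
  text.split(/\s+/).filter((word) => word.length > 0).length;

/** Accumulates per-file counts and throws on the first overflow. */
class TokenBudgetGuard {
  private total = 0;
  private readonly limit: number;

  // No limit given means unlimited, which matches today's behavior.
  constructor(limit: number = Number.POSITIVE_INFINITY) {
    this.limit = limit;
  }

  add(fileTokens: number): number {
    this.total += fileTokens;
    if (this.total > this.limit) {
      throw new TokenBudgetExceededError(this.total, this.limit);
    }
    return this.total;
  }
}

/** Walks the file contents in order and stops at the first overflow. */
function packWithBudget(files: string[], limit?: number): number {
  const guard = new TokenBudgetGuard(limit);
  let total = 0;
  for (const content of files) {
    total = guard.add(countTokensApprox(content));
  }
  return total;
}
```

Because the check runs after each file is counted, the run stops at the first file that pushes the total over the limit rather than after the whole output has been written.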

**Use case**

- A `repomix` step in a CI pipeline for a 200k-token-context target model: `repomix --max-tokens 180000` fails fast instead of producing a 600k-token file that breaks the downstream LLM call.
- A Claude Code / Codex / Cursor workflow that calls `repomix` before each turn: pin the budget in `repomix.config.json` so over-sized projects are rejected with an actionable error.
- A pre-PR bot that runs `repomix` on a diff branch and gates the PR on the packed file fitting the team's review-model context window.

**Why this is small**

The token-counting path already exists (`output.tokenCountTree`, `output.tokenCount.encoding`, `TOKEN_ENCODINGS`). The change is to read a new optional numeric field, accumulate the running total, throw on overflow, add a config schema entry, and plumb one CLI flag. It needs no new dependencies and is backward compatible (default = unlimited, which preserves current behavior).
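As a rough illustration of the "read a new optional numeric field" step, the sketch below resolves the effective budget from the two proposed sources. The `BudgetSources` shape and `resolveTokenBudget` function are hypothetical, and the rule that the CLI flag overrides the config file is an assumption, not something this proposal specifies.

```typescript
// Hypothetical sketch: option names mirror the proposal, not an existing Repomix API.

interface BudgetSources {
  cliMaxTokens?: string; // raw value of the proposed --max-tokens flag
  configTokenBudget?: number; // proposed output.tokenBudget from repomix.config.json
}

/** Returns the effective budget, or undefined when no limit is configured. */
function resolveTokenBudget(sources: BudgetSources): number | undefined {
  // Assumption: the CLI flag takes precedence over the config file.
  const raw =
    sources.cliMaxTokens !== undefined
      ? Number(sources.cliMaxTokens)
      : sources.configTokenBudget;

  if (raw === undefined) {
    return undefined; // default: unlimited, preserving current behavior
  }
  if (!Number.isInteger(raw) || raw <= 0) {
    throw new Error(`Token budget must be a positive integer, got: ${raw}`);
  }
  return raw;
}
```

Returning `undefined` for the unset case keeps the default path identical to a run without the option, so existing configurations would not need to change.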



[Feature] Add --max-tokens guard to fail-fast when packed output exceeds a token budget #1616
