Every token counts.
Cloud CLIs were built for humans to skim. Your AI agent has to read every character.
`aws ec2 describe-instances` returns 14,766 tokens.
Your agent needed 8,594.
You paid for 14,766.
Nulls. Empty arrays. Base64 blobs. Epoch timestamps. Duplicate IDs repeated across every object. None of it matters to your agent. All of it costs tokens.
tokensieve strips it out before your agent reads it.
```
[TokenSieve] Original: 14766 tok | Compressed: 8594 tok | Saved: 6172 (41.8%)
```
No changes to your agent. No config files. Five commands to install.
17 real AWS API calls. No cherry-picking.
| Command | Original | Compressed | Saved |
|---|---|---|---|
| `eks describe-cluster` | 1,785 tok | 599 tok | 66.4% |
| `ec2 describe-security-groups` (5) | 3,108 tok | 1,410 tok | 54.6% |
| `ec2 describe-subnets` (8) | 2,639 tok | 1,265 tok | 52.1% |
| `ec2 describe-vpcs` | 2,714 tok | 1,375 tok | 49.3% |
| `ec2 describe-instances` (6) | 14,766 tok | 8,594 tok | 41.8% |
| `logs describe-log-groups` (10) | 1,053 tok | 738 tok | 29.9% |
| `lambda list-functions` (5) | 2,125 tok | 1,592 tok | 25.1% |
Across all 17 calls: 40,483 tokens in → 21,487 tokens out → 46.9% savings
The EKS number (66%) is mostly one thing. Every EKS cluster response embeds a PEM
certificate as a JSON string. ~800 tokens of base64. tokensieve detects it by
content — no field-name hints, no per-tool config — and replaces it with
`<base64 1476 chars>`. Four tokens.
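A minimal way to see the content-based detection fire, assuming the blob clears tokensieve's size threshold (the key name and the blob here are invented, not a real certificate):

```
echo '{"data":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwK"}' \
  | tokensieve
# expect roughly: data: <base64 N chars>, with N the blob's length
```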
Full per-stage breakdowns: `docs/stress-tests.md`
Requires Rust (stable).

```
git clone https://github.com/YOUR_USERNAME/tokensieve
cd tokensieve
cargo build --release
```

Register the tools you want intercepted:

```
mkdir -p ~/.tokensieve/bin
ln -sf $(pwd)/target/release/tokensieve ~/.tokensieve/bin/aws
ln -sf $(pwd)/target/release/tokensieve ~/.tokensieve/bin/kubectl
ln -sf $(pwd)/target/release/tokensieve ~/.tokensieve/bin/databricks
export PATH="$HOME/.tokensieve/bin:$PATH"   # add to ~/.zshrc or ~/.bashrc
```

Verify:

```
which aws                 # → ~/.tokensieve/bin/aws
aws ec2 describe-vpcs     # compressed output + receipt on stderr
```

Your agent calls `aws ...`. The symlink intercepts it. tokensieve finds the real binary further down `$PATH`, runs it, compresses the output, and returns it.
agent → aws (symlink) → tokensieve → real aws → compressed output → agent
Non-JSON output passes through untouched. Exit codes are preserved. The agent cannot tell it's there.
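Two quick checks of those guarantees (illustrative; any non-JSON-producing invocation works):

```
aws ec2 help | head -n 3                       # non-JSON passes through byte-for-byte
aws ec2 describe-vpcs --no-such-flag; echo $?  # the real aws exit code, preserved
```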
Run multiple commands concurrently and compress the merged result once:

```
printf "databricks grants get catalog prod\ndatabricks grants get catalog staging\n" \
  | tokensieve fetch
```

Or pipe any saved response through directly:

```
cat response.json | tokensieve
```

Six stages on every response:
| Stage | What it does |
|---|---|
| Scrub | Strip ANSI escape codes |
| Gate | Non-JSON passes through at zero cost |
| Sieve | Remove nulls, empty values, base64 blobs |
| Dedupe | Drop epoch timestamps; first-seen-wins scalar deduplication |
| Route | Schema-YAML for dense arrays; PVFN for everything else |
| Emit | Compressed payload to stdout, receipt to stderr |
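To watch the Sieve and Dedupe stages on a toy payload (field names invented; your receipt numbers will differ):

```
# The null, the empty array, and the epoch timestamp should all be dropped
echo '{"Id":"i-1","Note":null,"Tags":[],"LaunchTime":1717171717}' | tokensieve
```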
Schema-YAML emits keys once, values as compact rows. No pipes, no separator lines.
PVFN (Path-Value Flattened Notation) flattens nested JSON to dot-notation paths, abbreviates long repeated key names, and inlines Schema-YAML blocks for dense sub-arrays.
The router picks based on fill ratio. No configuration.
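A rough sketch of the two shapes, going by the descriptions above (syntax and field names are illustrative, not tokensieve's verbatim output):

```
# Dense array → Schema-YAML: keys emitted once, then compact value rows
Subnets [SubnetId, CidrBlock, AvailabilityZone]:
  subnet-a1, 10.0.0.0/24, us-east-1a
  subnet-b2, 10.0.1.0/24, us-east-1b

# Sparse nested object → PVFN: dot-notation paths, one value per line
Vpc.VpcId: vpc-123
Vpc.CidrBlock: 10.0.0.0/16
Vpc.Tags.Name: prod-core
```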
Full design doc: `docs/ARCHITECTURE.md`
Issues, PRs, and compression reports welcome.
Where the headroom is:

- New CLI coverage — Tested against GCP, Azure, Terraform, `gh`, or `docker`? Open an issue with a sanitized sample and measured savings.
- Compression improvements — Two changes would push EC2 from ~40% to ~65%+: auto-unwrapping nested wrappers (`Reservations` → `Instances`; see the sketch below) and recursive compression of embedded JSON strings. Details in `docs/stress-tests.md`.
- Bug reports — Output garbled, or savings negative on a real payload? File an issue with a sanitized sample.
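As a hypothetical preview of that unwrap, strip the wrapper yourself before compressing (`instances.json` stands in for a saved `describe-instances` response):

```
# Collect all instances out of their Reservations wrappers, then compress
jq '[.Reservations[].Instances[]]' instances.json | tokensieve
```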
```
git clone https://github.com/YOUR_USERNAME/tokensieve
cd tokensieve
cargo test
```