rtk-ai/rtk gave me the clean CLI proxy. contextzip folded in the session and stacktrace compactors I kept reaching for. Tirith gave me a real shell-syntax gate. I was meant to just use them. Instead I keep bolting more crap on: supply-chain gate, discover command, web extractor, session manager. I genuinely cannot stop.
The fluffy ragdoll up top is my recurring mascot, same one on the blog, same one anywhere I need a logo. Hat tip to Matt Dinniman (Dungeon Crawler Carl) for the recent-reading inspiration behind the "Dammit exec()!" line.
Thanks rtk, contextzip and Tirith for the bones. Sorry upstream for the bolt-ons. Not sorry for the cat.
Warning
Active development. Might work, might not. Use at your own risk.
This is a fast-moving downstream fork by one person. Before depending on it: build it yourself, test it against your own workflow, read the diff on top of upstream rtk, and run the code through your favourite LLM for a second opinion (why not). Don't trust me. Verify. Bug reports welcome; expectations of stability shouldn't be.
ContextCrawler is a CLI proxy for AI coding agents (Claude Code, Cursor, Copilot, Gemini, …) that does two things:
- Compresses noisy command output before it eats your LLM context window.
- Gates risky shell commands and supply-chain installs before any auto-approval reaches the agent.
One binary, one name: contextcrawler. Since 0.4.0 the same crate is
also a small Rust library (see Use as a library).
Built from rtk-ai/rtk (the core CLI proxy and 60+ command filters, tracked by rebase), jee599/contextzip (the session compactor, stacktrace compressor, and HTML extractor, carried over with per-file SPDX headers), and Tirith (a shell-syntax security gate, invoked subprocess-only). The supply-chain gate is built in-tree. See Architecture for the lineage diagram.
Make AI coding agents both cheaper and safer without changing how you work: compress noisy output before it eats your context window, and run agent-proposed commands past two optional, opt-in gates (shell-syntax inspection, pre-install supply-chain checks) before auto-approving them.
Everything is one binary; the full command and filter reference lives in
docs/guide/commands.md.
- Token-saving filters: 60+ command filters (git, cargo, npm, kubectl, docker, …) plus per-language stacktrace trimming and HTML chrome stripping. Typically 60-90% fewer tokens per command.
- Security gate (optional): routes auto-allow rewrites through Tirith; block-level findings downgrade the verdict to Ask. Fail-open by default.
- Supply-chain control (opt-in): pre-install age-of-release + OSV CVE lookup for npm/pnpm/yarn and pip/uv/poetry/pipx installs.
- Analytics: contextcrawler gain for token-savings stats and contextcrawler discover for filtering opportunities you missed.
- Library API: apply the filters to text you already have, from Rust, without spawning the CLI.
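To make the security-gate bullet concrete, here is a minimal sketch of the verdict logic it describes: a block-level finding downgrades an auto-allow to Ask, and an unreachable gate fails open. Every name here (Verdict, Severity, GateResult, apply_gate) is hypothetical and not taken from the crate, and the fail-closed branch is an assumption about what turning fail-open off would mean.

```rust
/// Hypothetical verdict for an agent-proposed command (not the crate's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Allow,
    Ask,
}

/// Hypothetical finding severities reported by the external gate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Info,
    Warn,
    Block,
}

/// What came back from the gate subprocess. `Unavailable` stands in for
/// "binary missing, timed out, or crashed".
enum GateResult {
    Findings(Vec<Severity>),
    Unavailable,
}

/// Combine the proposed verdict with the gate's answer.
fn apply_gate(proposed: Verdict, gate: GateResult, fail_open: bool) -> Verdict {
    match gate {
        // Any block-level finding downgrades the verdict to Ask.
        GateResult::Findings(found) if found.contains(&Severity::Block) => Verdict::Ask,
        // Lower-severity findings leave the proposed verdict alone.
        GateResult::Findings(_) => proposed,
        // Fail-open: a gate that cannot answer does not change the verdict.
        GateResult::Unavailable if fail_open => proposed,
        // Assumed fail-closed behaviour: hand the decision back to the human.
        GateResult::Unavailable => Verdict::Ask,
    }
}
```

With fail_open set to true, apply_gate(Verdict::Allow, GateResult::Unavailable, true) stays Allow, which is the trade-off "fail-open by default" implies: a missing gate never blocks your workflow, but it also never protects it.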
For the gates (env knobs, false positives, the gate-safe network-fetch
pattern, tirith trust), see
docs/security/working-with-the-gate.md.
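The age-of-release half of the supply-chain gate can be pictured roughly like this. It is a sketch under assumptions, not the in-tree implementation: InstallDecision and check_release_age are made-up names, the minimum age is a caller-supplied parameter rather than a documented default, and the OSV CVE lookup is left out entirely.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical outcome of a pre-install check.
#[derive(Debug, PartialEq, Eq)]
enum InstallDecision {
    Allow,
    /// Hand the install back to the human, with a reason.
    Ask(String),
}

/// Decide whether a package release is old enough to auto-approve.
/// `published` would come from the registry metadata; `now` is passed in
/// so the function stays deterministic and easy to test.
fn check_release_age(
    published: SystemTime,
    now: SystemTime,
    min_age: Duration,
) -> InstallDecision {
    match now.duration_since(published) {
        Ok(age) if age >= min_age => InstallDecision::Allow,
        Ok(age) => InstallDecision::Ask(format!(
            "released {}h ago, minimum is {}h",
            age.as_secs() / 3600,
            min_age.as_secs() / 3600
        )),
        // A publish time later than `now` is suspicious, so never auto-approve it.
        Err(_) => InstallDecision::Ask("publish time is in the future".to_string()),
    }
}
```

The idea behind an age check is that freshly published releases are where hijacked packages tend to show up, so a release that has not been out long enough goes back to you as Ask instead of being installed on the agent's say-so.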
Since 0.4.0 the crate publishes a small curated Rust API so a downstream tool can apply ContextCrawler's filters to text it already has, no CLI subprocess.
Warning
The public API is experimental and NOT yet semver-guaranteed. It may change between 0.x releases. There are no pre-built crates, so depend on it from source and pin an exact tag.
[dependencies]
contextcrawler = { git = "https://github.com/thehoff/contextcrawler", tag = "v0.4.0" }
The curated entry points (re-exported from the crate root) are
filter_output, auto_filter_output, available_filters,
summarize_command_output (with CommandOutputSummaryOptions), and
no_bloat. The filtering helpers are panic-safe and exit-blind (text only,
never the command's exit code). Full surface, signatures, and examples:
docs/guide/library.md and the rustdoc
(cargo doc --open).
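For a feel of what "panic-safe and exit-blind" could mean in practice, here is a standalone sketch. It does not call the real API (check docs/guide/library.md for the actual signatures); filter_or_passthrough and drop_blank_lines are hypothetical helpers, and falling back to the unfiltered text on a panic is an assumption about the behaviour, not a documented guarantee.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical wrapper: run `filter` over `raw` and, if the filter panics,
/// return the raw text unchanged instead of taking the caller down.
/// It is "exit-blind" in the sense that it only ever sees text, never an exit code.
fn filter_or_passthrough<F>(raw: &str, filter: F) -> String
where
    F: Fn(&str) -> String,
{
    match panic::catch_unwind(AssertUnwindSafe(|| filter(raw))) {
        Ok(filtered) => filtered,
        // Assumed fallback: unfiltered output beats no output.
        Err(_) => raw.to_string(),
    }
}

/// Toy stand-in for a real filter: drop blank lines.
fn drop_blank_lines(text: &str) -> String {
    text.lines()
        .filter(|line| !line.trim().is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}
```

Note that catch_unwind only helps when the build unwinds on panic; with panic = "abort" in the profile there is nothing to catch.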
Requires a Rust toolchain (rustup, stable, 1.80+). There are no
pre-built binaries (single-maintainer fork); you build from source.
Installation is always through Cargo, never by copying a binary around.
# From a clone (recommended, read the diff first):
git clone https://github.com/thehoff/contextcrawler.git
cd contextcrawler && git checkout v0.4.0
cargo install --path .
# Or straight from git:
cargo install --git https://github.com/thehoff/contextcrawler --tag v0.4.0 --locked
Then wire up the agent hook. Run contextcrawler init -g for Claude Code,
or the per-agent flag for the others. Full walkthrough (the rtk/contextzip
migration step, every agent, the optional Tirith and supply-chain gates):
Installation and
Supported agents.
Config lives under ~/.config/ctxcrl/, savings history at
~/.local/share/ctxcrl/history.db. Environment variables use the CTXCRL_*
prefix; legacy RTK_* names are still honoured via a shim. See
Configuration for the full list.
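The prefix shim can be pictured as a two-step lookup. This is a sketch, not the shipped code: lookup_setting is a hypothetical helper, the EXAMPLE suffix in the usage note is a placeholder rather than a real variable, and letting CTXCRL_* win when both names are set is an assumption.

```rust
/// Look up a setting by suffix, trying CTXCRL_<SUFFIX> first and the
/// legacy RTK_<SUFFIX> second. The environment is passed in as a closure
/// so the lookup can be tested without touching the real process env.
fn lookup_setting<F>(suffix: &str, get: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    get(&format!("CTXCRL_{suffix}")).or_else(|| get(&format!("RTK_{suffix}")))
}
```

Against the real environment you would call it as lookup_setting("EXAMPLE", |key| std::env::var(key).ok()).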
- Documentation hub: start here
- Installation · Quick start · Configuration · Supported agents
- Command & filter reference: every command, what it strips, the savings
- Use as a library: the experimental Rust API
- Architecture: lib+bin split, the hook gate, diagrams
- Working with the security gate: false positives, tirith trust, the network-fetch pattern
- Tracking & analytics: the savings data model and schema
The downstream parts of this repository are MIT.
- Upstream rtk content remains under its original license terms (see the root LICENSE). Note that upstream rtk's repo is internally inconsistent (LICENSE says Apache-2.0; Cargo.toml says MIT). We preserve those upstream files as-is.
- Source files we add or carry over carry per-file SPDX-License-Identifier headers citing their origin (jee599/contextzip MIT for ported modules; ContextCrawler contributors MIT for new additions).
- Tirith is AGPL-3.0 and is only invoked via subprocess; no statically linked AGPL code in this distribution.
- rtk-ai/rtk: upstream base. Active, 47K stars, current release v0.39.0. ContextCrawler tracks their tagged releases.
- jee599/contextzip: source of the session compactor, stacktrace compressor, and HTML extractor. Each carried-over file has a per-file SPDX header citing this upstream.
- sheeki03/tirith: invoked via subprocess for the optional defense-in-depth gate.
v0.4.0 is the library pivot: the binary is now a thin shim over the
contextcrawler library crate, which also exposes the experimental
filter/summary API above. See CHANGELOG.md.
