prompt-compression

Here are 76 public repositories matching this topic...

14-stage Fusion Pipeline for LLM token compression: reversible compression, AST-aware code analysis, and intelligent content routing. Zero LLM inference cost. MIT licensed.

  • Updated Apr 1, 2026
  • Python

Local proxy that compresses your LLM API requests so you pay less, with no change to the answers. It trims wasted tokens from prompts, history, tool output, and code before they are sent, with a reported reduction of 31% on input and 74% on output, measured live. Works with any provider and makes no extra model calls. Also available as an MCP server and an embeddable library (Rust, Python, Ruby, Kotlin, Swift, JS/TS).

  • Updated Jun 22, 2026
  • Rust

A curated list of strategies, tools, papers, and resources for reducing LLM token costs and improving efficiency in production.

  • Updated Jun 21, 2026
