prompt-compression

Here are 76 public repositories matching this topic...

open-compress / claw-compactor

14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed.

Updated Apr 1, 2026
Python

jia-gao / leanctx

Drop-in prompt compression for production LLM apps. Cut your token bill 40-60% without changing your code. Python SDK, LLMLingua-2, MIT.

python gemini openai cost-optimization rag llm langchain anthropic llm-inference prompt-compression langgraph llmlingua

Updated Jun 14, 2026
Python

gglucass / headroom-desktop

Unlock 2x more Claude Code and Codex usage

react macos rust typescript ai proxy openai developer-tools codex tauri menu-bar-app llm anthropic prompt-compression claude-code token-optimization

Updated Jun 22, 2026
Rust

fkiene / llmtrim

Local proxy that compresses your LLM API requests so you pay less, with no change to the answers. Trims wasted tokens from prompts, history, tool output, and code before they're sent: -31% input / -74% output, measured live. Any provider, no extra model calls. Also an MCP server and embeddable library (Rust, Python, Ruby, Kotlin, Swift, JS/TS).

rust ai proxy mcp prompt openai developer-tools mitm-proxy llm prompt-engineering cost-reduction llmops anthropic prompt-compression claude-code token-optimization agentic-coding

Updated Jun 22, 2026
Rust

atjsh / llmlingua-2-js

JavaScript/TypeScript implementation of LLMLingua-2 (Experimental)

nodejs javascript typescript web tensorflow transformers webgpu hf tensorflowjs prompt-engineering transformer-js prompt-compression llmlingua

Updated Sep 14, 2025
TypeScript

chappyasel / meta-kb

A self-improving knowledge base about LLM agent infrastructure

markdown machine-learning ai artificial-intelligence multi-agent knowledge-graph knowledge-base self-learning ai-agents rag autonomous-research llm anthropic prompt-compression agent-skills agent-memory claude-code context-engineering openclaw

Updated Apr 9, 2026
TypeScript

centminmod / or-cli

Python command-line tool for interacting with AI models through the OpenRouter API, Cloudflare AI Gateway, or a local self-hosted Ollama instance. Optionally supports Microsoft LLMLingua prompt token compression.

openai linkup opik rag openai-api txtai llms llm-inference openrouter ollama cloudflare-ai ollama-api prompt-compression structured-outputs openai-api-client openrouter-api cloudflare-ai-gateway ai-rag llmlingua

Updated Dec 28, 2025

sriinnu / clipforge-PAKT

Lossless-first prompt compression for JSON, YAML, CSV, and Markdown. Library, CLI, MCP server, desktop app, and browser extension.

markdown cli yaml json csv mcp developer-tools lossless-compression llm pakt prompt-compression token-compression coding-agent

Updated Jun 16, 2026
TypeScript

KathanModh259 / latent-gate

VL-JEPA inspired pipeline — compress images/text locally via Ollama, send compact payloads to any LLM API. Cut token costs by ~80%.

python ai computer-vision python3 gemini openai embedding claude multimodal vision-language cost-reduction local-llm ollama llm-pipeline prompt-compression token-optimization selective-decode vl-jepa api-cost

Updated Jun 22, 2026
Python

NodeNestor / claude-rolling-context

Rolling context compression for Claude Code — never hit the context wall. Auto-compresses old messages while keeping recent context verbatim. Zero config, zero latency. Works as a Claude Code plugin.

claude ai-agent anthropic context-window context-management prompt-compression context-compression llm-context ai-coding claude-code claude-code-plugin claude-code-extension rolling-context

Updated Jun 2, 2026
Python

bladysh / exprompt

Reverse T9 for LLMs. Free, open-source prompt compressor for your AI prompts and agents.

cli golang openai developer-tools agents codex text-compression claude llm prompt-engineering llms chatgpt anth prompt-compression

Updated May 17, 2026
Go

pleasedodisturb / awesome-llm-token-optimization

A curated list of strategies, tools, papers, and resources for reducing LLM token costs and improving efficiency in production.

Updated Jun 21, 2026

napmany / cutia

CUTIA: compress prompts while preserving quality

dspy prompt-engineering prompt-compression

Updated Feb 2, 2026
Python

g-akshay / ClaudeShrink

A Claude Code skill that shrinks massive prompts and files using LLMLingua to save tokens.

skills developer-tools claude ai-tools context-window prompt-compression llmlingua claude-code token-optimization claude-skills

Updated Apr 25, 2026
Python

congvmit / awesome-llm-token-reduction

A curated list of techniques, tools, and research for reducing LLM token usage. Optimize context for Claude Code, Copilot, Cursor, and Aider.

awesome awesome-list github-copilot openai-codex llm token-reduction prompt-compression context-optimization claude-code ai-coding-assistant

Updated Jun 13, 2026

kaistAI / GenPI

Official implementation of Generative Context Distillation.

agent distillation prompt-injection prompt-compression prompt-internalization context-distillation

Updated May 10, 2025
Python

gladehq / claude-shorthand

LLMLingua-2 prompt compression hook for Claude Code — cut token usage by ~55%

macos linux cli developer-tools token claude prompt-tuning llm prompt-engineering prompt-compression llmlingua token-optimization claudecode claudecode-hooks claudecode-plugin

Updated Mar 16, 2026
Python

Jiangnan0522 / ComprExIT

ComprExIT: context compression via explicit information transmission over frozen LLM hidden states (official implementation)

nlp transformers pytorch llama optimal-transport large-language-models llm long-context prompt-compression context-compression

Updated Jun 19, 2026
Python

therohanparmar / t3-toon

TOON for TYPO3 — a compact, human-readable, and token-efficient data format for AI prompts & LLM contexts. Perfect for ChatGPT, Gemini, Claude, Mistral, and OpenAI integrations (JSON ⇄ TOON).

Updated Jun 15, 2026
PHP

ingridtoulotte / context-compressor

Lossless-first semantic compression for LLM context windows. Shrink context 60-80% and prove nothing important was lost. Deterministic, extractive, stdlib-only, validated.

python deterministic ai-agents cost-optimization llm prompt-engineering llmops context-window llm-compression prompt-compression token-optimization context-engineering

Updated Jun 14, 2026
Python
