Closed
254 commits
7ca629b
Fix asyncio event loop error in Python 3.10+ tests
chopratejas Jan 31, 2026
b97c5cc
Add global httpx.ReadTimeout handler for all tests
chopratejas Jan 31, 2026
13973c0
Add MkDocs GitHub Pages documentation site
chopratejas Jan 31, 2026
b8e500f
Add documentation badge to README
chopratejas Jan 31, 2026
abed6c8
Add dashboard UI and improve proxy observability
chopratejas Jan 31, 2026
db5e4de
Update README with dashboard info and installation recommendations
chopratejas Jan 31, 2026
c815708
feat: Add AWS Strands Agents SDK integration
chopratejas Jan 31, 2026
51c7f44
Merge pull request #16 from chopratejas/feature/strands-integration
chopratejas Jan 31, 2026
2c47826
Add DiffCompressor and fix hnswlib SIGILL crash on CI
chopratejas Jan 31, 2026
4622516
Add HTMLExtractor for web content extraction with OSS benchmarks
chopratejas Jan 31, 2026
a0b695f
Fix mypy errors in html_extraction.py
chopratejas Jan 31, 2026
a0a22e6
Add OpenRouter backend support with provider registry pattern
chopratejas Jan 31, 2026
7423294
feat(compression): add exclude_tools to bypass compression for specif…
Jan 31, 2026
f4b111b
fix(tests): skip HTML extractor tests when trafilatura not installed
Jan 31, 2026
908fe84
fix(memory): use subprocess probe to avoid hnswlib SIGILL crash on CI
Jan 31, 2026
20fd711
fix(tests): skip memory tests when hnswlib not available
Feb 1, 2026
317e192
fix(tests): add missing skip decorator and tests for exclude_tools
Feb 1, 2026
9a11b59
fix(config): remove web tools from default exclusion list
Feb 1, 2026
095affd
Merge pull request #18 from smartwatermelon/feature/exclude-tools-com…
chopratejas Feb 1, 2026
8c57500
Add question-aware compression and enable ContentRouter by default
chopratejas Feb 1, 2026
1a22ce5
Merge remote-tracking branch 'origin/main'
chopratejas Feb 1, 2026
96cd737
fix: Handle Anthropic format tool_use/tool_result as atomic units
prakersh Feb 1, 2026
b68389c
Merge pull request #19 from prakersh/dev/prakersh/fix-anthropic-tool-…
chopratejas Feb 1, 2026
d5fb9a5
fix: Extend Anthropic format tool protection to IntelligentContextMan…
chopratejas Feb 1, 2026
3b68a11
Merge pull request #20 from chopratejas/fix/intelligent-context-anthr…
chopratejas Feb 1, 2026
27fb871
Add multi-provider memory system with auto-detection
chopratejas Feb 1, 2026
4ce2c92
Add memory observability system (Phase 1)
chopratejas Feb 2, 2026
4487b03
Merge pull request #23 from chopratejas/feature/memory-observability
chopratejas Feb 2, 2026
5f0ceb2
Add SQLiteGraphStore for bounded, persistent graph storage
chopratejas Feb 2, 2026
c22ffc4
Merge pull request #24 from chopratejas/feature/memory-observability
chopratejas Feb 2, 2026
9d5f7b0
Add bounded memory support to HNSWVectorIndex with LRU eviction
chopratejas Feb 2, 2026
a22d765
Add SQLiteVectorIndex using sqlite-vec for bounded vector search
chopratejas Feb 2, 2026
b9f7b5f
Add SQLite vector backend as default and fix HNSW test skipping
chopratejas Feb 2, 2026
40a6bc6
Fix test to expect VectorBackend.AUTO as default
chopratejas Feb 2, 2026
fc5858a
Add centralized MLModelRegistry to share ML model instances
chopratejas Feb 2, 2026
c473c0c
Fix test_router_is_available_with_models after MLModelRegistry refactor
chopratejas Feb 2, 2026
faed38a
Fix mypy error: add type annotation for SentenceTransformer return
chopratejas Feb 2, 2026
252ddb5
Fix: Include dashboard HTML templates in wheel package
chopratejas Feb 2, 2026
940ca13
Bump version to 0.3.1
chopratejas Feb 2, 2026
8388b67
Add centralized ML model configuration
chopratejas Feb 2, 2026
7026098
fix: Increase MAX_REQUEST_BODY_SIZE from 10MB to 100MB to support ima…
prakersh Feb 2, 2026
971315f
test: Update LLMLingua default model test to match new default
prakersh Feb 2, 2026
b1e94c9
Merge pull request #25 from prakersh/fix/increase-request-body-size-l…
chopratejas Feb 2, 2026
e7b11e8
fix: Handle BaseModelOutputWithPooling from SigLIP get_*_features()
prakersh Feb 2, 2026
80a354d
Merge pull request #26 from prakersh/dev/prakersh/fix
chopratejas Feb 2, 2026
b1ecdaf
Add MCP CLI for Claude Code subscription users
chopratejas Feb 2, 2026
e8e13a6
Fix MCP tests to work when MCP SDK is not installed
chopratejas Feb 2, 2026
5913e1d
Use AWS API to dynamically fetch Bedrock inference profiles
chopratejas Feb 2, 2026
82447a1
Fix inflated token savings stats from cache hits
chopratejas Feb 2, 2026
6e146ba
Fix token savings metrics mismatch between pipeline and server
chopratejas Feb 3, 2026
1a895e4
Add LiteLLM backend routing for OpenAI endpoint and Magika content de…
chopratejas Feb 4, 2026
599f1e2
Fix security vulnerabilities in memory and CCR systems
chopratejas Feb 4, 2026
126a4a0
Adding any-llm as a backed provider
angpt Feb 6, 2026
63b555f
Update anyllm.py
angpt Feb 6, 2026
5cb44c5
Add adaptive compression sizing and fix cost tracking bugs
chopratejas Feb 11, 2026
d2d5e58
Bump version to 0.3.2
chopratejas Feb 11, 2026
c3e4e80
Update anyllm.py
angpt Feb 11, 2026
1b25ada
Fix streaming crashes under concurrent sessions
chopratejas Feb 11, 2026
73c77e0
Add tests for streaming resilience and concurrent session safety
chopratejas Feb 11, 2026
09717f0
Add pluggable adapter hooks for CCR, Storage, and TOIN backends
chopratejas Feb 16, 2026
e71388d
Fixing the DockerFile to have hnswlib working
chopratejas Feb 17, 2026
318b205
Add compression summaries, multi-provider headers, Dockerfile fix
chopratejas Feb 19, 2026
695505e
Add Query Echo: re-inject user question after compressed tool outputs
chopratejas Feb 19, 2026
eebc3e6
Bump version to 0.3.6
chopratejas Feb 19, 2026
e2b4bb0
Update anyllm.py
angpt Feb 19, 2026
af1c5e8
Add Compression Hooks — extension points for SaaS and advanced custom…
chopratejas Feb 19, 2026
31f1adb
Add one-function compress() API, ASGI middleware, LiteLLM callback
chopratejas Feb 19, 2026
cbfddc3
Rewrite README + add Integration Guide
chopratejas Feb 19, 2026
b92c49e
README: proxy as the hero quickstart, compress() for Python, integrat…
chopratejas Feb 19, 2026
f46c5f0
README: add back proof points — needle-in-haystack demo, benchmarks, …
chopratejas Feb 19, 2026
9784c96
README: replace plain text architecture with Mermaid diagrams
chopratejas Feb 19, 2026
0aad82c
Fix: guard starlette imports in test_compress_api.py for CI without p…
chopratejas Feb 19, 2026
2a38fef
Fix CI: guard starlette imports, asyncio.run(), deprecate datetime.ut…
chopratejas Feb 19, 2026
5d7ecf0
Bump version to 0.3.7
chopratejas Feb 19, 2026
2379454
Update CHANGELOG.md
angpt Feb 23, 2026
5cacb7f
Merge pull request #33 from angpt/angpt/any-llm-integration
chopratejas Feb 23, 2026
b31a754
Add OSS evaluation suite, universal JSON crush, latency benchmarks
chopratejas Feb 24, 2026
6bbc85f
Fixing the ruff linting
chopratejas Feb 24, 2026
984e878
Add Memory Bridge: bidirectional sync between markdown files and Head…
chopratejas Feb 24, 2026
ac2eae1
Add Read Lifecycle: event-driven stale/superseded Read detection
chopratejas Feb 28, 2026
d9e3932
Fix LangChain tool_call argument handling for varied message formats
chopratejas Feb 28, 2026
ec18659
Add Cloud mode to ASGI middleware and LiteLLM callback
chopratejas Feb 28, 2026
2b6f754
Add headroom learn: offline failure learning for coding agents
chopratejas Feb 28, 2026
74baa49
Add multi-agent support, quality gates, and integration tests for hea…
chopratejas Mar 1, 2026
445ef3d
fix: remove hnswlib from core deps and fix package name in docs
longtngo Mar 2, 2026
f5c4415
revert: restore hnswlib to core dependencies
longtngo Mar 2, 2026
b5ffcb2
Merge pull request #37 from longtngo/fix/docker-install-hnswlib-core-dep
chopratejas Mar 2, 2026
20a4a06
fix(docs): fix Docker example and document hnswlib C++ build requirement
longtngo Mar 2, 2026
b16dcdf
Merge pull request #38 from longtngo/fix/docker-example-build-deps
chopratejas Mar 2, 2026
02185c7
fix(ci): add sqlite-vec to dev deps to fix memory tests on CI
longtngo Mar 2, 2026
f509a0f
Merge pull request #39 from longtngo/fix/dev-deps-sqlite-vec
chopratejas Mar 2, 2026
dd95d3e
feat(code): add semantic symbol importance to CodeAwareCompressor
chopratejas Mar 3, 2026
3933c10
feat(router): adaptive compression with Read lifecycle and context-pr…
chopratejas Mar 6, 2026
e0f0378
docs: update README demo GIF to headroom_recording.gif
chopratejas Mar 6, 2026
0c8dcb9
Revert "docs: update README demo GIF to headroom_recording.gif"
chopratejas Mar 6, 2026
b7e6dec
fix(mcp): use claude mcp add for install and fix startup import
Mar 6, 2026
ea7e0a7
fix(mcp): fix uninstall symmetry, move subprocess import, add cli-pat…
Mar 6, 2026
a3e1a12
Merge pull request #40 from smartwatermelon/claude/fix-mcp-startup-an…
chopratejas Mar 6, 2026
cfbc522
fix: harden edge cases, expand tool exclusions, and fix mypy errors
chopratejas Mar 7, 2026
e8ec643
feat: add `headroom perf` CLI and rewrite `headroom learn` to use LLM…
chopratejas Mar 7, 2026
2ebd063
fix(learn): pass explicit model in tests to avoid API key requirement
chopratejas Mar 7, 2026
7b6621c
style: ruff format parser.py, content_router.py, smart_crusher.py
chopratejas Mar 7, 2026
03588ae
Fixing the build AND the perf logs
chopratejas Mar 7, 2026
9906182
docs: add headroom learn demo GIF to README
chopratejas Mar 8, 2026
ca54567
fix: dashboard metrics, TTFB tracking, eager LLMLingua loading, and m…
chopratejas Mar 8, 2026
85a716e
feat: provider-aware prefix cache tracking and combined savings dashb…
chopratejas Mar 9, 2026
83244a5
Updating the README to reflect the right positioning for Headroom
chopratejas Mar 10, 2026
3ff2676
Add Kompress: ModernBERT token compressor replacing LLMLingua-2
chopratejas Mar 11, 2026
138a933
Kompress: use argmax for token selection, remove hardcoded 0.5 threshold
chopratejas Mar 11, 2026
54efd07
Introducing headroom wrap
chopratejas Mar 12, 2026
2082218
Fix proxy backend bugs: env vars, tool forwarding, and provider support
chopratejas Mar 13, 2026
67713a4
Fix pricing lookup for retired Claude models
chopratejas Mar 13, 2026
b465a0a
Format prefix_cache_benchmark.py for ruff 0.15 compatibility
chopratejas Mar 13, 2026
2296292
Fix mypy no-any-return errors in kompress_compressor.py
chopratejas Mar 13, 2026
4b258a6
Fixing headroom wrap claude
chopratejas Mar 13, 2026
5e41680
fix(wrap): route through proxy, prevent pipe deadlock, bump to 0.4.2
chopratejas Mar 13, 2026
b7c2025
Slim core dependencies: 2.5GB → 195MB install size
chopratejas Mar 13, 2026
caf6698
Commit message:
chopratejas Mar 13, 2026
489c377
Bump version to 0.4.3
chopratejas Mar 13, 2026
9b232fc
Add wrap commands for Codex/Cursor/Aider with rtk instructions, fix s…
chopratejas Mar 14, 2026
6c2aa68
Bump version to 0.4.4
chopratejas Mar 14, 2026
373c059
Fix proxy crash when torch not installed (kompress lazy imports)
chopratejas Mar 15, 2026
0b7b80e
Fixing tests
chopratejas Mar 15, 2026
6e0861d
Enhanced MCP server with compress/stats tools, fix proxy torch crash
chopratejas Mar 17, 2026
1e59d46
Updated README and mcp docs
chopratejas Mar 17, 2026
8cd96ba
Add SharedContext for multi-agent, rewrite README, fix proxy cleanup
chopratejas Mar 17, 2026
78b7809
fix: add claude-opus-4-6 (1M context) to ANTHROPIC_CONTEXT_LIMITS
chopratejas Mar 19, 2026
46fa84c
feat: add CompressionCache with LRU eviction for token headroom mode
chopratejas Mar 19, 2026
fc8b257
feat: add frozen count, apply_cached, update_from_result to Compressi…
chopratejas Mar 19, 2026
2fcfd2a
feat: export CompressionCache from cache package
chopratejas Mar 19, 2026
7c4cefa
feat: add HEADROOM_MODE config to ProxyConfig
chopratejas Mar 19, 2026
d0532d7
feat: configure pipeline based on HEADROOM_MODE at startup
chopratejas Mar 19, 2026
4df9dc9
feat: add token_headroom branch to Anthropic handler
chopratejas Mar 19, 2026
71155be
feat: add token_headroom branch to OpenAI handler
chopratejas Mar 19, 2026
891aa1b
feat: add compression_cache stats to /stats endpoint
chopratejas Mar 19, 2026
7fa68e4
test: add integration tests for token_headroom mode
chopratejas Mar 19, 2026
789079b
fix: prevent stat inflation in compute_frozen_count, fix token estimate
chopratejas Mar 19, 2026
67bdaa4
feat: add clean session summary to /stats and MCP headroom_stats
chopratejas Mar 19, 2026
9d6bd4f
Merge feat/token-headroom-mode: dual-mode optimization + clean stats
chopratejas Mar 19, 2026
956acbf
chore: bump version to 0.5.0
chopratejas Mar 19, 2026
22bdbca
fix: capture Zone 1 + Zone 2 savings in token_headroom stats
chopratejas Mar 19, 2026
5d995d3
chore: bump version to 0.5.1
chopratejas Mar 19, 2026
6e6b086
.
gglucass Mar 20, 2026
bf9d948
fix: propagate HEADROOM_MODE through wrap CLI and drop cache write pr…
chopratejas Mar 20, 2026
065c4f8
fix: pass through unknown CLI flags to wrapped tools (claude, codex, …
chopratejas Mar 20, 2026
83af411
refactor: centralize error/priority pattern detection into error_dete…
chopratejas Mar 20, 2026
d9d628e
feat: make token_headroom the default mode + fix Gemini handler bug
chopratejas Mar 20, 2026
b0fa7f0
fix: use cache-aware pricing from LiteLLM for dashboard cost display
chopratejas Mar 20, 2026
62da087
chore: bump version to 0.5.2
chopratejas Mar 20, 2026
96b05c8
feat: add live traffic learning + cross-agent memory writers (--learn…
chopratejas Mar 20, 2026
c3144ad
fix: memory handler overwrites system prompt when it's a list of cont…
chopratejas Mar 22, 2026
ca98ea3
chore: gitignore headroom-saas (lives in separate private repo)
chopratejas Mar 22, 2026
3a1bf4b
feat: add license key validation + phone-home usage reporter
chopratejas Mar 22, 2026
1fecfa6
fix: close TOIN feedback loop — retrievals now flow back for learning
chopratejas Mar 22, 2026
f4b6e53
Add path "." compatibility for headroom learn
gglucass Mar 23, 2026
1cc9df3
.
gglucass Mar 23, 2026
50bf923
Fix tests
gglucass Mar 23, 2026
f11ef81
Fix failing test for hidden folders
gglucass Mar 23, 2026
62dd835
fix(bedrock): add EU/AP region support with graceful fallback
Lyt060814 Mar 23, 2026
7566365
Merge pull request #53 from Lyt060814/fix/bedrock-eu-region-support
chopratejas Mar 23, 2026
cf67872
Merge pull request #52 from gglucass/fix/learn-with-dots-in-paths
chopratejas Mar 23, 2026
efa6f27
Pin litellm to v1.82.3 and bump version to 0.5.3
chopratejas Mar 24, 2026
763a07f
Add telemetry beacon, fix Starlette 0.41+ crash, bump to 0.5.4
chopratejas Mar 24, 2026
48d84b7
Fix ruff lint errors in test files
chopratejas Mar 24, 2026
bb52403
Fix ruff format for litellm, wrap, test_scanner
chopratejas Mar 24, 2026
389d278
Add headroom_read MCP tool with session caching (feature flag)
chopratejas Mar 25, 2026
e69ff0b
Reduce compression latency: cache serializations, eager-load all comp…
chopratejas Mar 25, 2026
4205a12
Skip telemetry when no requests have been processed
chopratejas Mar 25, 2026
47651a8
fix: forward upstream ratelimit headers in streaming responses
jocel1 Mar 25, 2026
8764fc9
test: verify ratelimit headers are forwarded in streaming responses
jocel1 Mar 25, 2026
7cf1e32
Dedupe telemetry beacon: upsert per session instead of appending
chopratejas Mar 25, 2026
d91a007
Merge pull request #57 from Softizy/fix/forward-ratelimit-headers-str…
chopratejas Mar 25, 2026
50fe05e
Reduce proxy latency, enrich telemetry, protect Bash output, bump to …
chopratejas Mar 25, 2026
68b7621
Fix test failures: update hash test for MD5, handle unicode surrogates
chopratejas Mar 25, 2026
696d787
Fix docs to match implementation: remove false claims, add Strands guide
chopratejas Mar 26, 2026
c1c46c2
Revamp docs site: new theme, comprehensive landing page, complete nav
chopratejas Mar 26, 2026
35ab68e
Fix context-blind compression: pass user query to SmartCrusher releva…
chopratejas Mar 26, 2026
04f12f7
Pass user query context through proxy pipeline (9 call sites)
chopratejas Mar 26, 2026
8e911a7
Diversity-aware SmartCrusher: keep unique items, compress text within
chopratejas Mar 26, 2026
4a170fb
Use realistic verbose RAG chunks in demo — triggers Kompress within-item
chopratejas Mar 26, 2026
57198ae
Add notebook for langchain-ai/how_to_fix_your_context PR
chopratejas Mar 26, 2026
bb723df
Protect workflow XML tags from text compression
chopratejas Mar 26, 2026
ff49cb0
Remove LLMLingua: Kompress is the sole text compressor
chopratejas Mar 26, 2026
3501099
Fix API integration test: use tool output not user message for tags
chopratejas Mar 26, 2026
09aca32
Fix CI test, bump to 0.5.7
chopratejas Mar 26, 2026
4089429
Add TypeScript SDK (headroom-ai npm package)
chopratejas Mar 26, 2026
c732daa
Fix RTK binary path on Windows (.exe extension)
chopratejas Mar 26, 2026
62d80bd
Make compress() format-aware: auto-detect OpenAI, Anthropic, Vercel, …
chopratejas Mar 27, 2026
06b28e0
Fix Vercel AI SDK v6 format: use input/output instead of args/result
chopratejas Mar 27, 2026
bca4e8b
feat: allow overriding proxy telemetry sdk via HEADROOM_SDK
gglucass Mar 27, 2026
011fdd9
feat: persist proxy savings history
gglucass Mar 27, 2026
6bdab6c
fix: move extract_user_query import out of conditional scope
matgasser Mar 27, 2026
3d14e54
Merge pull request #62 from matgasser/fix/extract-user-query-closure-…
chopratejas Mar 27, 2026
89c0b4c
Fix Windows Unicode crash and Bedrock Claude 4.6 model IDs
chopratejas Mar 27, 2026
41b4289
Bump to 0.5.8: fix Windows Unicode crash, fix Bedrock Claude 4.6 mode…
chopratejas Mar 27, 2026
0e084d1
style: fix ruff format in server.py
chopratejas Mar 27, 2026
e02a3c2
fix: remove Bearer auth header for Supabase publishable key (telemetr…
chopratejas Mar 28, 2026
79f4bd7
fix: restore Authorization header in telemetry beacon
chopratejas Mar 28, 2026
2b697b6
Bump to 0.5.9: fix telemetry beacon (restore auth header, requires ta…
chopratejas Mar 28, 2026
a62dcb9
Revert version bump: table schema fix resolves telemetry for all vers…
chopratejas Mar 28, 2026
ea22602
feat: add OpenClaw ContextEngine plugin (@headroom-ai/openclaw)
chopratejas Mar 28, 2026
1a1d4eb
OpenClaw plugin fixes, telemetry fix, cost tracker improvement, token…
chopratejas Mar 29, 2026
88851f7
Merge feat/openclaw-plugin: OpenClaw ContextEngine plugin + telemetry…
chopratejas Mar 29, 2026
60ff7ba
Bump to 0.5.9
chopratejas Mar 29, 2026
94b9a67
Bump to 0.5.10 (0.5.9 had partial PyPI upload)
chopratejas Mar 29, 2026
98fcc81
Rename plugin to headroom-openclaw (npm scope not available)
chopratejas Mar 29, 2026
ffa5b61
fix: split Supabase anon key to avoid GitGuardian false positive
chopratejas Mar 29, 2026
016aad7
docs: add OpenClaw plugin to README, integration guide, index, and TS…
chopratejas Mar 29, 2026
e12449b
feat: add LangGraph compress_tool_messages node for ToolMessage compr…
KunalLohtia Mar 30, 2026
02eb609
Merge pull request #68 from KunalLohtia/feat/langgraph-compress-tool-…
chopratejas Mar 30, 2026
92490e3
Merge pull request #60 from gglucass/feat/add-headroom-sdk
chopratejas Mar 30, 2026
736fcaa
refactor: move dashboard mode switch to header
gglucass Mar 30, 2026
be29b36
refactor: remove dashboard storage path label
gglucass Mar 30, 2026
3c8de19
fix: restore anthropic compression when hooks enabled
gglucass Mar 30, 2026
f240094
Update savings_tracker.py
gglucass Mar 30, 2026
0cc7507
test: cover savings tracker branches
gglucass Mar 30, 2026
df5e137
Merge branch 'main' into feat/persist-savings-history
gglucass Mar 30, 2026
b3d6198
Fix Windows drive letter path decoding in headroom learn (fixes #69)
chopratejas Mar 30, 2026
b58bfbe
fix: remove unused imports in test_langgraph.py (ruff F401)
chopratejas Mar 30, 2026
83ee979
Fix cost calculation: don't mix cache savings into Headroom savings
chopratejas Mar 30, 2026
79fbbc2
Bump to 0.5.11: fix cost calculation, Windows path decoding, unused i…
chopratejas Mar 30, 2026
593bd49
Switch telemetry to proxy_telemetry_v2 table, bump to 0.5.12
chopratejas Mar 30, 2026
acbfcfb
Fix streaming tool calls, compressed request bodies, beacon field nam…
chopratejas Mar 30, 2026
61e219f
Fix zstd streaming decompression, add zstandard dep; bump to 0.5.14
chopratejas Mar 30, 2026
2ccb824
Fix tool_result ordering: no stray user message between tool_calls an…
chopratejas Mar 30, 2026
3b700f5
Add compression for OpenAI Responses API (/v1/responses)
chopratejas Mar 31, 2026
1e54ccc
Bump to 0.5.15: Responses API compression, tool_result ordering fix
chopratejas Mar 31, 2026
5b3eb08
Remove dedupe_telemetry.py from repo, add scripts/ to .gitignore
chopratejas Mar 31, 2026
1145aa7
Add WebSocket proxy for /v1/responses (Codex gpt-5.4+)
chopratejas Mar 31, 2026
6a30583
Update Discord invite link across all docs
chopratejas Mar 31, 2026
7bb3f9d
Kompress ONNX INT8: text compression without torch dependency
chopratejas Mar 31, 2026
7b41ddc
Bump to 0.5.16: WebSocket proxy, Kompress ONNX, Discord link update
chopratejas Mar 31, 2026
7084c2f
feat: add historical rollups and exports
gglucass Mar 31, 2026
b4e76a0
Update test_proxy_savings_history.py
gglucass Mar 31, 2026
f0623c8
style: format savings history tests
gglucass Mar 31, 2026
5d6bf6f
Merge remote-tracking branch 'origin/main' into feat/persist-savings-…
gglucass Mar 31, 2026
5a3cb5a
Update dashboard.html
gglucass Mar 31, 2026
e3e7eba
Add Anthropic url overrides via ANTHROPIC_TARGET_URL env var and --an…
Mar 31, 2026
5743032
Fix WebSocket proxy: forward all headers + required OpenAI-Beta header
chopratejas Mar 31, 2026
5fe92a4
Merge pull request #61 from gglucass/feat/persist-savings-history
chopratejas Mar 31, 2026
32a564a
Fix dashboard cost savings: use model list price, not moving average
chopratejas Mar 31, 2026
789d7b1
Fix multi-worker beacon spam, cost dashboard, upsert telemetry; bump …
chopratejas Mar 31, 2026
e04527f
Merge pull request #81 from dorphalsig/main
chopratejas Mar 31, 2026
47071cc
Add transforms_summary to TransformResult and /v1/compress response
chopratejas Mar 31, 2026
c5bc385
Refactor: extract data models to headroom/proxy/models.py (Step 1/9)
chopratejas Mar 31, 2026
187cedb
Fix Codex WS proxy (Issue #86), LiteLLM metrics, and backend forwarding
chopratejas Apr 1, 2026
be8c2b0
Disable CacheAligner: it inflates tokens and breaks prefix caching
chopratejas Apr 1, 2026
3f32db0
Fix WebSocket SSL on Windows: use native websockets SSL handling
chopratejas Apr 1, 2026
653303f
feat(docker): production-optimized multi-stage Dockerfile
pratikbin Apr 2, 2026
f7ddfce
Merge branch 'main' into feat/production-dockerfile
pratikbin Apr 2, 2026
44 changes: 43 additions & 1 deletion .dockerignore
@@ -1,15 +1,57 @@
# VCS
.git
.github
.gitignore

# Python artifacts
__pycache__
*.pyc
*.pyo
*.egg-info
dist/
build/

# Dev/test tooling
.pytest_cache
.coverage
.mypy_cache
.ruff_cache
.pre-commit-config.yaml
.venv
venv

# Tests & docs (not needed in image)
tests/
docs/
mkdocs.yml
CHANGELOG.md
LICENSE
NOTICE

# JS/TS artifacts (dashboard, SDK — not part of proxy image)
apps/
sdk/
plugins/
node_modules/
*.tgz

# Secrets & local config
.env
.env.*
*.log

# IDE
.vscode
.idea
*.swp

# Docker
Dockerfile
docker-compose*.yml
.dockerignore

# Misc
.pi-lens/
.superpowers/
examples/
node-compile-cache/
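A note on how the patterns above are matched. The sketch below approximates `.dockerignore` matching for simple patterns, assuming Docker's documented behavior that patterns are anchored at the build-context root and that excluding a directory excludes everything beneath it. The helper name `is_ignored` and the sample `PATTERNS` list are hypothetical, and `**` wildcards and `!` exceptions are deliberately not modeled.

```python
from fnmatch import fnmatchcase

# A few patterns copied from the .dockerignore in this diff.
PATTERNS = ["tests/", "*.pyc", ".env.*", "__pycache__"]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Approximate .dockerignore matching (no `**`, no `!` exceptions)."""
    path_parts = path.strip("/").split("/")
    for pattern in patterns:
        pat_parts = pattern.strip("/").split("/")
        if len(pat_parts) > len(path_parts):
            continue
        # Compare segment by segment from the context root; a match on a
        # leading directory excludes every path below that directory.
        if all(fnmatchcase(p, q) for p, q in zip(path_parts, pat_parts)):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("tests/test_proxy.py", "cache.pyc", "headroom/cache.pyc"):
        print(candidate, is_ignored(candidate, PATTERNS))
```

Under this model, a bare pattern such as `*.pyc` or `__pycache__` only matches at the top level of the context, so nested bytecode under `headroom/` would still be sent to the builder unless a `**/` prefix is used. That is worth checking against a real build before relying on it.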
67 changes: 67 additions & 0 deletions .github/workflows/docker.yml
@@ -0,0 +1,67 @@
name: Docker

on:
  workflow_dispatch:
  release:
    types: [published]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

permissions:
  contents: read
  packages: write
  attestations: write
  id-token: write

jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v4

      - name: Log in to GHCR
        uses: docker/login-action@v4
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix=sha-

      - name: Build and push
        id: push
        uses: docker/build-push-action@v7
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          provenance: true
          sbom: true

      - name: Attest build provenance
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true
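To make the `tags:` rules in the workflow easier to review, here is a rough sketch of the image tags the three templated rules would be expected to yield for a published release. It assumes the usual `docker/metadata-action` behavior (a leading `v` is stripped for `{{version}}`, and `type=sha` uses a 7-character short SHA). The function name `derived_tags`, the example tag `v0.5.16`, and the commit SHA are hypothetical. Pre-release versions and any extra tags the action may add on its own (such as `latest` under its default flavor) are not modeled.

```python
import re


def derived_tags(git_tag: str, commit_sha: str) -> list[str]:
    """Model the semver and sha tag rules for a plain X.Y.Z release tag."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", git_tag)
    if match is None:
        raise ValueError(f"not a plain semver release tag: {git_tag!r}")
    major, minor, patch = match.groups()
    return [
        f"{major}.{minor}.{patch}",   # type=semver,pattern={{version}}
        f"{major}.{minor}",           # type=semver,pattern={{major}}.{{minor}}
        f"sha-{commit_sha[:7]}",      # type=sha,prefix=sha-
    ]


if __name__ == "__main__":
    # Hypothetical release tag and commit SHA, for illustration only.
    print(derived_tags("v0.5.16", "0123456789abcdef0123456789abcdef01234567"))
```

Since the workflow only triggers on `workflow_dispatch` and `release`, the `type=ref,event=pr` rule would presumably never fire; the sketch leaves the two `type=ref` rules out for that reason.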
58 changes: 51 additions & 7 deletions Dockerfile
@@ -1,8 +1,52 @@
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends curl build-essential g++ && rm -rf /var/lib/apt/lists/*
COPY . /app
WORKDIR /app
RUN pip install -e .[proxy]
# ---- Build stage: compile native extensions, build wheel ----
FROM python:3.11-slim AS builder

RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
g++ \
&& rm -rf /var/lib/apt/lists/*

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /build

# Layer 1: install deps only (cached unless pyproject.toml/uv.lock change)
COPY pyproject.toml uv.lock README.md ./
# Stub package so uv can resolve the local ".[proxy]" without full source
RUN mkdir -p headroom && touch headroom/__init__.py
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --system ".[proxy]"

# Layer 2: copy real source, reinstall only headroom-ai (no deps)
COPY headroom/ headroom/
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --system --no-deps --reinstall-package headroom-ai .

# ---- Runtime stage: minimal image with only what's needed ----
FROM python:3.11-slim AS runtime

RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*

RUN groupadd --gid 1000 headroom && \
useradd --uid 1000 --gid headroom --create-home headroom

COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY --from=builder /usr/local/bin/headroom /usr/local/bin/headroom

RUN mkdir -p /data /home/headroom/.headroom && \
chown -R headroom:headroom /data /home/headroom/.headroom

USER headroom
WORKDIR /home/headroom

ENV HEADROOM_HOST=0.0.0.0 \
PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1

EXPOSE 8787
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 CMD curl -f http://localhost:8787/health || exit 1
ENTRYPOINT ["headroom", "proxy"]

ENTRYPOINT ["headroom", "proxy"]
CMD ["--host", "0.0.0.0", "--port", "8787"]
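For anyone who wants to exercise the health endpoint outside Docker, the sketch below is a rough Python equivalent of the `curl -f http://localhost:8787/health || exit 1` probe in the `HEALTHCHECK` line. The helper name `is_healthy` is hypothetical, the default timeout simply mirrors `--timeout=10s`, and the redirect handling differs slightly from `curl` (urllib follows redirects), so treat it as an approximation rather than the image's actual check.

```python
import sys
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8787/health", timeout: float = 10.0) -> bool:
    """Return True when the endpoint answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # HTTP 4xx/5xx, connection refused, DNS failure, or timeout.
        return False


if __name__ == "__main__":
    # Exit code 0 when healthy, 1 otherwise, like the shell probe.
    sys.exit(0 if is_healthy() else 1)
```

A probe written this way would not need `curl` in the runtime stage, though whether that trade is worthwhile depends on how the image is used.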