Hecate — AI Gateway · Agent Runtime


Hecate is an open-source AI gateway, coding-agent console, and agent-task runtime. It routes OpenAI- and Anthropic-compatible traffic across cloud and local models, supervises external coding agents as local adapters, controls spend, and keeps agent work behind policy, approvals, and OpenTelemetry.

Status: public alpha. Core gateway is usable; agent runtime, desktop app, ACP bridge, and sandbox are still evolving. Read docs/known-limitations.md before depending on it.


Why Hecate

AI systems are becoming more than model calls. A useful agent now chooses between cloud and local models, calls tools, edits files, retries flaky providers, spends real money, and leaves behind a trail operators need to understand.

Hecate sits at that crossroads. It is a self-hosted runtime layer between clients, model providers, coding-agent CLIs, and local tools — built to make agent work cheaper, easier to inspect, and safer to run.

  • One gateway for cloud and local models — OpenAI, Anthropic, DeepSeek, Gemini, Groq, Mistral, Perplexity, Together AI, xAI, Ollama, LM Studio, LocalAI, llama.cpp-compatible servers, and custom OpenAI-compatible endpoints.
  • One console for models and coding agents — use model chat or supervise Codex, Claude Code, and Cursor Agent sessions from Chats.
  • Cost and routing control where requests happen — balances, pricebook, rate limits, model/provider selection, failover, and route reports sit on the hot path instead of in a separate spreadsheet.
  • OpenTelemetry-first visibility — traces, metrics, logs, request IDs, route decisions, cache paths, provider health, and cost metadata are emitted as runtime signals.
  • Agent execution with guardrails — queued tasks, approvals, controlled shell/file/git execution, patch artifacts, resumable runs, and MCP integration.
  • Local-first distribution — run it as a desktop app, Docker container, or hecate binary with the operator UI embedded; protocol-specific companions such as hecate-acp stay separate where needed.

Quick Start

Path Best for
Desktop app Personal use on your laptop. No terminal, no Docker.
Docker Local container, scripted local deploys.

Desktop app

Download from the latest release:

Platform Bundle
macOS (Apple Silicon) Hecate_0.1.0-alpha.17_aarch64.dmg
Linux x86_64 Hecate_0.1.0-alpha.17_amd64.deb or Hecate_0.1.0-alpha.17_amd64.AppImage
Windows x86_64 Hecate_0.1.0-alpha.17_x64_en-US.msi

Open the bundle and launch Hecate. The app starts the gateway sidecar, waits for it to become healthy, and opens the embedded operator UI automatically. State lives in the platform data dir (~/Library/Application Support/io.github.chicoxyzzy.hecate/ on macOS, %APPDATA%\io.github.chicoxyzzy.hecate\ on Windows, ~/.local/share/io.github.chicoxyzzy.hecate/ on Linux).

Bundles are not yet code-signed. On macOS, the first launch needs right-click → Open (Gatekeeper will block a plain double-click). On Windows, click More info → Run anyway on the SmartScreen warning. Subsequent launches work normally. Full footguns and roadmap in docs/desktop-app.md.

Skip to Add a provider once it's running.

Docker

docker run --rm -p 127.0.0.1:8765:8765 -v hecate-data:/data \
  ghcr.io/chicoxyzzy/hecate:0.1.0-alpha.17

Open http://127.0.0.1:8765. The UI loads with no further setup.
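If you want a scriptable check that the gateway is answering on that address, a minimal stdlib sketch could look like this; the URL is the one above, while the helper name and return convention are ours:

```python
import urllib.request
import urllib.error

def gateway_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP on base_url (the embedded UI port)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            # Any non-error response means the gateway and embedded UI are serving.
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused / timed out: the container is not up (or not published here).
        return False

if __name__ == "__main__":
    print("gateway reachable:", gateway_up("http://127.0.0.1:8765"))
```
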

The container intentionally publishes only on 127.0.0.1. Hecate has no built-in auth layer; same-origin checks protect browser traffic, but they are not a network security boundary. Do not expose it to the network without your own auth, firewall, or reverse proxy in front.

Pinned image tags, binary tarballs (linux/darwin × amd64/arm64), checksums, compose examples, and storage notes live in docs/deployment.md. Local development setup lives in docs/development.md.

Add a provider

On first boot, Chats is already available. If Hecate detects a local runtime such as Ollama or LM Studio, setup can take a single click: choose Add detected providers and Hecate registers the detected local endpoints with their preset defaults.

Chats first-run state — detected local providers and one-click Add detected providers setup

You can still configure providers manually from Providers → Add provider:

  • Cloud providers need an API key.
  • Local providers need a running local server URL, usually the preset default.
  • Custom OpenAI-compatible endpoints can be added from the same modal when the preset catalog is not enough.

After a provider is saved, Hecate discovers models and the Chats model picker becomes routable. The full preset catalog, env bootstrapping, custom-endpoint walk-through, and credential rotation live in docs/providers.md.
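Since the status table below lists model discovery on the OpenAI-compatible surface, a client can presumably enumerate the routable models once a provider is saved. This is a sketch under that assumption: the standard OpenAI `/v1/models` path is assumed rather than confirmed here, and the function names are ours.

```python
import json
import urllib.request

HECATE = "http://127.0.0.1:8765"  # default local address from the Docker quick start

def parse_model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style list response: {"data": [{"id": ...}]}."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url: str = HECATE) -> list:
    # Path assumed from OpenAI compatibility; see docs/providers.md for the real surface.
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return parse_model_ids(json.load(resp))
```
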

Talk to it

Chats is the primary day-to-day surface. It explains missing setup before you send a request, then lets you choose between model traffic and local coding-agent sessions.

Chats workspace talking to a local Ollama llama3.1:8b model with sessions sidebar and inline runtime metadata

Chats workspace with an external-agent file-write approval waiting for operator review

Agent approval modal with ACP options, scope choices, and audit note

There are two chat targets:

  • Agent — select Codex, Claude Code, or Cursor Agent, choose a workspace, and run a supervised local ACP session with approval prompts, guardrails, raw diagnostics, and Git diff review.
  • Model — select a configured provider/model and send OpenAI-compatible Chat Completions or Anthropic Messages traffic through Hecate's router.

Model turns record route, cost, cache, and trace metadata. Agent turns record normalized transcript, raw output, status, timing, trace IDs, workspace branch, approval decisions, and captured Git diffs that can be inspected or reverted from Chats. External agents are not providers and do not appear in the provider/model picker. See docs/external-agent-adapters.md for install checks and troubleshooting.
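A Model turn can also be driven from any OpenAI-compatible client instead of Chats. The sketch below assumes the standard `/v1/chat/completions` path and response shape (an assumption based on the gateway's stated OpenAI compatibility, not confirmed here); the model name is the one shown in the screenshot above, but any routable model works.

```python
import json
import urllib.request

HECATE = "http://127.0.0.1:8765"  # default local address from the Docker quick start

def build_chat_request(model: str, prompt: str) -> dict:
    """Standard OpenAI Chat Completions payload; Hecate routes it to the chosen provider."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(model: str, prompt: str, base_url: str = HECATE) -> str:
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",  # assumed OpenAI-compatible path
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires a running gateway and a configured provider):
# chat("llama3.1:8b", "Say hello")
```
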

Architecture

The gateway runs as one Go process on one local HTTP port. Inside it: a chat/messages gateway that routes traffic to upstream model providers, an external-agent adapter layer that supervises coding-agent CLIs, and a task runtime that queues native agent work, drives approvals, and shells out through a sandbox boundary. The React operator UI is embedded into the gateway and served from the same port; hecate-acp is a separate stdio bridge for ACP-aware editor clients.

flowchart LR
    Console["Operator UI"]
    APIClients["OpenAI / Anthropic clients"]
    Editors["ACP editors"]

    Gateway["Model gateway"]
    AgentAdapters["External agent adapters"]
    Runtime["Agent task runtime"]
    ACP["hecate-acp"]

    Console --> Gateway
    Console --> AgentAdapters
    Console --> Runtime
    APIClients --> Gateway
    Editors --> ACP
    ACP --> Runtime

    Gateway --> Providers["Cloud + local providers"]
    AgentAdapters --> Agents["Codex / Claude Code / Cursor Agent"]
    Runtime --> Tools["Sandboxed tools + MCP"]
    Gateway --> OTel["OpenTelemetry Collector"]
    Runtime --> OTel

For deeper internals, read docs/architecture.md, docs/runtime-api.md, docs/events.md, and docs/telemetry.md.

Operator UI

The embedded UI is a runtime console for the operator.

  • Chats — talk to model providers or external coding agents, inspect per-turn route/cost metadata, agent activity, raw output, and captured diffs.
  • Providers — manage provider credentials, defaults, model discovery, base URLs, and health.
  • Tasks — create and manage native Hecate agent_loop runs, task approvals, retries, resumes, and streamed tool output.
  • Observability — inspect requests, route candidates, skip reasons, failover, costs, traces, metrics, logs, and local trace events.
  • Costs — balance, top-up / reset, usage table.
  • Settings — pricebook, retention, external-agent readiness checks, and durable approval grants.
Various UI screenshots

Observability view — request ledger and route-report drilldown

Empty Providers tab — Add provider CTA

Add provider modal — local preset catalog with detected runtime status

Providers workspace — configured cloud and local providers with health, endpoint, credentials, and models

Tasks workspace — task list with run state and approval queue

Costs workspace — balance card and usage table

Settings → Pricing — model catalog with priced / unpriced / deprecated filters

Settings → Retention — pruning windows and last-run history

Settings → External agents — adapter readiness checks and durable approval grants

What Works Today

Hecate is public-alpha software. The core gateway path is usable; the agent runtime and sandbox are intentionally still evolving.

Area State Notes
OpenAI-compatible gateway Usable Chat Completions, streaming, vision, model discovery, custom OpenAI-compatible endpoints
Anthropic-compatible gateway Usable Messages API shape, streaming translation, Claude Code support
Provider catalog Usable Built-in presets, credentials, health, routing readiness
Local providers Usable Ollama, LM Studio, LocalAI, llama.cpp-compatible servers
Local default address Usable Defaults to 127.0.0.1:8765; same-origin enforced for browser requests; no built-in auth
Budgets and rate limits Usable Balances, warning thresholds, pricebook, 429 rate-limit headers
OpenTelemetry Usable OTLP traces, metrics, logs, response headers, local trace view
Storage tiers Usable Memory or SQLite, selected per subsystem. GATEWAY_CHAT_SESSIONS_BACKEND=sqlite covers the full agent-chat bundle (sessions, messages, approvals, grants); orphaned pending approvals are reconciled on startup
Operator UI Usable Main workflows are present; chat/debugging ergonomics are still improving
Desktop app Alpha Native .dmg, .deb, .AppImage, and .msi bundles run Hecate as a sidecar. Bundles are unsigned
External agent adapters Usable Stable enough for alpha use when you accept the trusted-subprocess model: Codex, Claude Code, and Cursor Agent discovery, long-lived ACP sessions, prompt-first approvals, grants, adapter health/version checks, cancel, guardrails, raw diagnostics, and Git diff inspect/revert
ACP bridge Alpha hecate-acp supports initialize, session new/prompt/cancel, continuation, run-event forwarding, and approval round-trip; registry/editor packaging is not done
Agent task runtime Alpha Native Hecate task runs: queue/lease execution, approvals, resumable agent_loop, MCP integration, streamed output, and periodic stale-run recovery
Execution isolation Alpha Per-call subprocess + env sanitization + output cap + wall-clock timeout; bwrap (Linux) / sandbox-exec (macOS) wrapping where available. Not container-level — see docs/sandbox.md
Homebrew distribution Not shipped A CLI formula/cask is planned later. Homebrew helps installation, but it does not replace Apple Developer ID signing/notarization for a smooth macOS desktop-app launch
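As a concrete storage example, the SQLite backend for the agent-chat bundle is selected with the environment variable named in the table. This is a sketch of an env-file entry; GATEWAY_CHAT_SESSIONS_BACKEND is the only knob spelled out in this README, and the full set lives in .env.example:

```
# Switch the agent-chat bundle (sessions, messages, approvals, grants) to SQLite
GATEWAY_CHAT_SESSIONS_BACKEND=sqlite
```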

Read docs/known-limitations.md before treating Hecate as production-stable.

Documentation

Full index lives at docs/README.md, organized by reader role. The most-reached-for pages:

Running Hecate

  • Deployment — Docker, image pinning, binary install, storage tiers, rate limits.
  • Desktop app — native bundles, first-launch footguns, platform data dirs, roadmap.
  • Providers — preset catalog, OpenAI-compatible custom endpoints, credentials, health, circuit breaking.
  • Known limitations — plain-language list of what's still alpha.

Building against Hecate

  • Runtime API — task lifecycle, approvals, queue/lease execution, SSE streaming.
  • Agent runtime — agent_loop mechanics, tools, stdout/stderr handling, cost ceilings, retry-from-turn.
  • External agent adapters — Hecate as an ACP client/operator: use Codex, Claude Code, and Cursor Agent from Chats.
  • ACP bridge — Hecate as an ACP agent for editor panels such as Zed and JetBrains.
  • Events — every event type, payload shape, when each fires.
  • MCP integration — Hecate as MCP server + attaching external MCP servers as tools.

Observability and internals

  • Telemetry — OTLP traces / metrics / logs, response headers, local trace view.
  • Architecture — gateway request flow, task-runtime queue / lease / sandbox boundary.
  • Development — source-build toolchain, local dev, the test ladder, screenshot tooling.
  • Release — cutting a tag, verification gate, recovery if CI fails.

First-run environment knobs live in .env.example.

Contributing

See CONTRIBUTING.md. If you work with an AI assistant, start with AGENTS.md; the vendor-neutral agent instruction layer lives in docs-ai/.

License

MIT. See LICENSE.

Third-party notices live in NOTICE.md, including LiteLLM pricing-data attribution and vendored splash-font licenses.
