| name | hecate-backend |
|---|---|
| description | Use when working on the Hecate Go backend — gateway, agent runtime, providers, sandbox, storage. Keeps backend work aligned with Hecate's "operator-grade control plane, runtime-aware" thesis. |

Use this skill for any work outside `ui/`. The React UI has its own skill at `../ui/SKILL.md`. For the `internal/providers/` package specifically, also reach for `../providers/SKILL.md`, which owns the api↔providers boundary and the seven-step "add a wire field" chain.
Don't duplicate. This skill is the backend lens; the rules themselves live in:
- `../../core/project-context.md`: repo layout, rings, storage tiers, toolchain pins, risky areas.
- `../../core/engineering-standards.md`: field-shape rules, parallel-struct rule, anti-patterns.
- `../../core/workflow.md`: operating loop, planning triggers, commit etiquette.
- `../../core/verification.md`: verification ladders, race-suite floor, done criteria.

The backend should feel like:
- A single-process gateway control plane.
- A deny-by-default policy enforcer.
- A runtime-aware proxy that explains its decisions.
- A debugging surface — every request leaves a trace, every cost is itemized, every approval is logged.
It should not feel like:
- A thin pass-through with marketing on top.
- A configurable framework where you bring your own everything.
- A research demo that works in one provider's happy path.
Default to operator confidence: clear status, clear errors, deterministic state, no surprises on restart.
Calm, durable, and explicit. Code should age well — the runtime is supposed to live for years, not iterations.
Prefer one gateway process, one port, embedded UI (`//go:embed ui/dist`); deterministic startup with env-driven config; backend tier choice surfaced as a config knob, never inferred; explicit error wrapping with cause chains; standard library first, well-known third party second, novel deps last.
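
A minimal sketch of "tier as a config knob, never inferred" combined with explicit error wrapping. The variable name `HECATE_STORAGE_TIER`, the `memory` default, the accepted values, and `ErrInvalidTier` are all assumptions made for illustration; they are not taken from Hecate's actual config surface.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInvalidTier is a hypothetical sentinel for unknown tier values.
var ErrInvalidTier = errors.New("invalid storage tier")

// storageTier resolves the tier from an env-style lookup. lookup has the
// same shape as os.LookupEnv, so the real environment can be passed in.
// HECATE_STORAGE_TIER is an assumed name, not a documented Hecate knob.
func storageTier(lookup func(string) (string, bool)) (string, error) {
	const key = "HECATE_STORAGE_TIER"
	v, ok := lookup(key)
	if !ok {
		// Fixed default (assumed here); the host is never probed to guess.
		return "memory", nil
	}
	switch v {
	case "memory", "sqlite":
		return v, nil
	default:
		// Wrap the sentinel so callers can match it with errors.Is.
		return "", fmt.Errorf("config %s=%q: %w", key, v, ErrInvalidTier)
	}
}

func main() {
	env := map[string]string{"HECATE_STORAGE_TIER": "auto"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }

	_, err := storageTier(lookup)
	fmt.Println(err, errors.Is(err, ErrInvalidTier))
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps startup deterministic under test and makes the rejected value visible in the error message.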
Every endpoint, every config knob, every error message should answer:
- What did the gateway just decide?
- Why did it decide that?
- What did it cost / how long did it take?
- What happens if it fails next time — retry, fallback, fail?
- How do I find the trace for this in OTel?
When choosing between "elegant" and "operationally explicit," choose explicit.
- No auth layer. Every request is processed as the operator, and the gateway binds to `127.0.0.1` by default. Do not add token/tenant assumptions back into new endpoints.
- Sandbox is per-call subprocess, applied inline. Shell, file, git tool calls spawn a fresh `sh` from inside the gateway after policy validation + env sanitisation + output cap + wall-clock timeout. On Linux with `bwrap` installed and on macOS, the call is additionally wrapped by `bwrap`/`sandbox-exec` for fs+net confinement (auto-detected at startup, exposed on `/healthz` under `sandbox.os_isolation`). No separate sandbox daemon, no per-call rlimits; operators who want CPU/FD/memory caps run the gateway under systemd or in a container with `--cpus`/`--memory` flags. New tools follow the same `internal/sandbox/` shape.
- Approvals are blocking. Pre-execution and mid-loop approvals halt the run; the run record persists in `awaiting_approval` until resolved. New gates use the same `TaskApproval` shape.
- Events are appended, not mutated. Every state transition writes a `run_event` with a monotonic sequence. The SSE stream replays from `after_sequence`. New event types must follow the event-protocol v1 taxonomy (`run.*`, `turn.*`, `tool.*`, `policy.*`, `gap.*`, `error.*`) and be documented in `docs/events.md`.
- Cost is in micro-USD. All money is `int64` in micro-USD (`1_000_000` = $1). Never `float64` for money: pricebook lookups, budgets, ledger entries all stay integer.
- OTel is first-class. Every request gets a trace ID surfaced in the response header (`X-Trace-Id`) and persisted on the run record. New code paths add spans, not just log lines.
- Metric labels are guarded. Record metrics through `internal/telemetry` helpers and normalizers. Closed-set dimensions collapse unknown values to `other`; free-form dimensions must reject control characters and oversized labels. Put raw commands, paths, stdout/stderr snippets, and adapter diagnostics in spans, logs, or persisted events, never metric labels.
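
The micro-USD rule can be sketched as follows. The `MicroUSD` type, the `tokenCost` helper, the per-million-token price shape, and the round-up policy are assumptions for illustration, not Hecate's actual pricebook API.

```go
package main

import "fmt"

// MicroUSD is money as an integer count of millionths of a dollar
// (1_000_000 = $1). The type name is an assumption for this sketch.
type MicroUSD int64

// tokenCost prices tokens at perMillion micro-USD per one million tokens
// using integer math only. It rounds up so non-zero usage never costs zero.
// The rounding direction is an assumption, not documented Hecate policy.
func tokenCost(tokens int64, perMillion MicroUSD) MicroUSD {
	const million = 1_000_000
	return MicroUSD((tokens*int64(perMillion) + million - 1) / million)
}

// String renders a non-negative amount for display only; all arithmetic
// stays in int64.
func (m MicroUSD) String() string {
	return fmt.Sprintf("$%d.%06d", int64(m)/1_000_000, int64(m)%1_000_000)
}

func main() {
	// 1,500 tokens at $3.00 per million tokens.
	cost := tokenCost(1_500, 3_000_000)
	fmt.Println(int64(cost), cost) // prints "4500 $0.004500"
}
```

Formatting to dollars happens only at the display edge; budgets and ledger sums compare and add the raw `int64` values, so repeated additions never accumulate floating-point error.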

The seven-step chain spans `pkg/types/` → `internal/api/` → `internal/providers/` and tests at every layer. Canonical version: `../providers/SKILL.md`. Forgetting to plumb the field into the streaming `wireReq` is the most common bug.

`internal/mcp/tools.go`:

- Append a `s.RegisterTool(...)` call in `RegisterDefaultTools` with `Annotations` set (`ReadOnlyHint`, `DestructiveHint`, `IdempotentHint` as appropriate).
- Add a `<name>Handler` returning `ToolHandler` further down.
- Update the `docs/mcp.md` tool table.
- Tests in `internal/mcp/tools_test.go` using the `fakeGateway` helper.

Agent Chat has two persistence layers:
- `internal/agentchat` stores the Hecate transcript and native ACP session id in memory or sqlite.
- `internal/agentadapters` owns the live ACP/process session manager.

When changing this path:
- Keep `docs/external-agent-adapters.md` aligned for operator-visible behavior such as launchers, env sanitisation, persistence, raw diagnostics, guardrails, auth/readiness probes, and troubleshooting.
- Keep `docs/acp.md` aligned only when changing the separate `hecate-acp` editor bridge.
- Add focused tests in `internal/agentadapters/*_test.go` for ACP/process protocol behavior and `internal/api/server_test.go` for HTTP/session persistence behavior. Guardrail changes should cover both the HTTP 422 envelope and the session snapshot fields the UI consumes.
- If the change touches approval/grant durability, startup reconcile, or `cmd/hecate` store wiring, add or run the binary e2e approval smokes: `go test -tags e2e -run 'TestApproval' ./e2e`.
- Run the race suite. Long-lived adapter sessions are runtime code, not just a UI convenience.

When adding a new run event type:

- Pick an event-protocol name from the existing taxonomy before adding a new dotted name. Prefer generic families such as `tool.*`, `policy.*`, `gap.*`, and `error.*` with specific details in `data` over subsystem-specific names.
- In `internal/orchestrator/runner.go`, call `r.emitRunEvent(ctx, taskID, runID, "your.event.type", ..., extraDataMap)` at the right lifecycle moment. Emit the event before handing off to the queue; see the emit-before-enqueue gotcha below.
- Document the event and its payload in `docs/events.md`.
- If high-cardinality, wire into `internal/retention/retention.go` as a new subsystem (see `turn_events` for the pattern).
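
The emit-before-enqueue ordering can be sketched with stand-in types. `eventLog`, `syncQueue`, and `startRun` are hypothetical and only model the behavior described here (an append-only log and a queue that dispatches synchronously); they are not the orchestrator's real types.

```go
package main

import "fmt"

// eventLog is an append-only stand-in for the run_event store. Each append
// receives the next monotonic sequence number.
type eventLog struct{ events []string }

// emit appends an event and returns its 1-based sequence.
func (l *eventLog) emit(eventType string) int {
	l.events = append(l.events, eventType)
	return len(l.events)
}

// syncQueue models an in-memory queue that dispatches synchronously: the
// worker runs inside enqueue, before enqueue returns.
type syncQueue struct{ worker func() }

func (q *syncQueue) enqueue() { q.worker() }

// startRun writes the run.queued transition first, then hands off. With the
// two lines swapped, run.started would be recorded before run.queued.
func startRun(log *eventLog, q *syncQueue) {
	log.emit("run.queued")
	q.enqueue()
}

func main() {
	log := &eventLog{}
	q := &syncQueue{worker: func() { log.emit("run.started") }}
	startRun(log, q)
	fmt.Println(log.events) // prints "[run.queued run.started]"
}
```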
For errors that should surface before a run is created (bad config, missing required field):
- Define a sentinel error in `internal/orchestrator/runner.go`: `var ErrMyThing = errors.New("my_thing")`.
- Return it (wrapped is fine; use `errors.Is`) from `startTaskWithOptions` before any run is created.
- In `internal/api/handler_tasks.go` `HandleStartTask`, add an `errors.Is(err, orchestrator.ErrMyThing)` branch that returns `apiError(http.StatusUnprocessableEntity, "my_thing", err.Error())`.
- Add the error code to `internal/api/error_mapping.go` if it has an OTel span status implication.
- Test via `tasks.mustRequestStatus(http.StatusUnprocessableEntity, ...)` in `internal/api/server_test.go`.
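
The sentinel-to-422 flow in the steps above can be sketched in one file. `startTask` and `statusFor` are simplified stand-ins for `startTaskWithOptions` and the `HandleStartTask` branch; the empty-prompt rule, the `internal_error` code, and the 200 success status are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrMyThing mirrors the sentinel from the steps above.
var ErrMyThing = errors.New("my_thing")

// startTask stands in for startTaskWithOptions: it rejects bad input
// before any run would be created. The validation rule is made up.
func startTask(prompt string) error {
	if prompt == "" {
		// Wrapping with %w keeps the sentinel matchable via errors.Is.
		return fmt.Errorf("start task: empty prompt: %w", ErrMyThing)
	}
	return nil
}

// statusFor stands in for the errors.Is branch in the HTTP handler. It
// returns the status and the stable error code for the response envelope.
func statusFor(err error) (int, string) {
	switch {
	case err == nil:
		return http.StatusOK, ""
	case errors.Is(err, ErrMyThing):
		return http.StatusUnprocessableEntity, "my_thing"
	default:
		return http.StatusInternalServerError, "internal_error"
	}
}

func main() {
	status, code := statusFor(startTask(""))
	fmt.Println(status, code) // prints "422 my_thing"
}
```

Matching with `errors.Is` rather than `==` is what allows the orchestrator to add context to the error without breaking the handler's mapping.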

| Helper | File | Use for |
|---|---|---|
| `testRoundTripperFunc` | `internal/providers/provider_test_helpers_test.go` | Stub HTTP transport for provider tests |
| `newAnthropicTestProvider` | `internal/providers/tooluse_test.go` | Anthropic provider with cached caps (skips discovery) |
| `newTestHTTPHandler` / `*WithConfig` / `*ForProviders` | `internal/api/server_test.go` | In-process gateway handler |
| `fakeUpstreamCapturing` | `e2e/gateway_test.go` | E2E: capture what gateway forwarded to upstream |
| `hecateServer` | `e2e/gateway_test.go` | E2E: spawn the real binary on a free port |
| `startHecateProcess` | `e2e/ollama_test.go` | E2E: shared hecate binary for the Ollama suite (TestMain-driven) |
| `autoPreconfiguredEnv` | `e2e/gateway_test.go` | Inject `PROVIDER_<NAME>_PRECONFIGURED=1` for every `PROVIDER_<NAME>_*` env var; both spawn helpers call it so test sites don't repeat the gate |

- Emit run events before enqueue, not after. The in-memory queue dispatches synchronously: calling `enqueueRun` can cause a worker to claim the job and emit `run.started` before `run.queued` is persisted if the emit comes after. Always write the transition event first, then hand off to the queue (see `StartTask` in `internal/orchestrator/runner.go`).
- modernc/sqlite TIME-as-text format. The driver writes `time.Time` using Go's default `time.Time.String()` format (`2026-04-28 02:37:38.4524 +0000 UTC`), which doesn't lex-compare with RFC3339Nano cutoffs and breaks the retention sweep silently. Always write timestamps as `t.UTC().Format(time.RFC3339Nano)` explicitly when the column is TEXT (see `internal/taskstate/sqlite.go` `AppendRunEvent`).
- Capability cache seeding for provider tests. See `../providers/SKILL.md` for the snippet. Without it the discovery path panics on a nil request body.
- Pricebook preflight. Cloud-kind providers in tests trigger a pricebook lookup. `PROVIDER_FAKE_KIND=local` bypasses it for synthetic models in e2e.
- Env-PRECONFIGURED gate for e2e providers. Env-supplied provider credentials (`PROVIDER_<NAME>_API_KEY` / `_BASE_URL`) only auto-import into the CP store when `PROVIDER_<NAME>_PRECONFIGURED=1` is also set. Both e2e spawn helpers funnel through `autoPreconfiguredEnv` so tests don't have to repeat it. New e2e helpers that bypass `hecateServer` / `startHecateProcess` need the same call; otherwise routed requests 400 with `no provider supports model …`.
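
The TIME-as-text gotcha is easy to reproduce with the standard library alone. In this sketch, `expired` is a hypothetical stand-in for a retention predicate of the form `stored < cutoff` on a TEXT column; the timestamps are arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// An event written at 23:00 UTC and a retention cutoff at 01:00 UTC on the
// same day: the event is newer than the cutoff and must be kept.
var (
	event  = time.Date(2026, 4, 28, 23, 0, 0, 0, time.UTC)
	cutoff = time.Date(2026, 4, 28, 1, 0, 0, 0, time.UTC)
)

// expired mimics a string comparison against an RFC3339Nano cutoff, as a
// sweep over a TEXT timestamp column would do.
func expired(stored string, cutoff time.Time) bool {
	return stored < cutoff.UTC().Format(time.RFC3339Nano)
}

func main() {
	// Default String() format: the space after the date sorts below the 'T'
	// in the cutoff, so the newer event wrongly compares as older.
	fmt.Println(expired(event.String(), cutoff)) // prints "true"

	// Explicit RFC3339Nano: string order matches time order for these values.
	fmt.Println(expired(event.UTC().Format(time.RFC3339Nano), cutoff)) // prints "false"
}
```

In the first case the sweep would delete a row that is still inside the retention window, and nothing errors, which is why the failure is silent.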

See `../../core/verification.md`. Race suite is the floor for runtime/backend work, not a nice-to-have.