Stateful LLM agents in Go. Each session is an actor: it stays put on one node, frees memory when idle, and survives node loss.
Concurrent callers to the same session queue in order.
Backpressure is a typed error.
The same call works whether you run on one node or a cluster.
Atto is built for people shipping agents. You wire tool.Func, agent.NewLLM, and atto.New. You stream session.Event values like any other Go iterator.
Below that surface, each session is an actor. Turns happen in order even when two HTTP handlers touch the same session at the same time. If a session gets flooded you see ErrSessionBacklogFull, not silent corruption or an OOM. The same API runs on one node or across a cluster on top of GoAkt; sticky routing and storage reload live in the runtime.
If actors are new to you: think of each session actor as the framework enforcing one goroutine's worth of discipline per user. You don't have to maintain a map[string]*sync.Mutex or wonder what happens when two HTTP handlers touch the same chat history.
If you already work with actors, atto plugs in the LLM pieces on top: native OpenAI, Anthropic, and Gemini adapters, schema-validated tools from plain Go functions, optimistic concurrency on persisted snapshots, and the same atto.Runtime API single-node or clustered.
- Same user, two concurrent HTTP requests: turns run one after another; history stays coherent.
- Different users: independent session actors run in parallel.
- A caller floods one session: the runtime returns session.ErrSessionBacklogFull when the bounded stash is full.
- The agent needs more information mid-task: it calls request_user_input to pause; the next message resumes the same task from history.
- Idle chats pile up in RAM: configurable passivation frees memory; the next message reloads from your store.
- Rolling restart or node loss: a shared store.SessionStore plus sticky placement keeps session state across process boundaries.
- Race-free persistence: Snapshot.Version provides optimistic concurrency; conflicting writes come back as store.ErrConcurrentWrite.
Status: The public API in this release is the stability target. Breaking changes are still permitted between v0.x releases; the module path is stable through v1.
```sh
go get github.com/tochemey/atto
```

Requires Go 1.26.2+.
The root atto package is the user-facing entry point; the rest of the surface lives under agent, session, tool, store/..., and llm/....
- One front door, always actor-backed. atto.New(ctx, model, build) builds and starts an internal goakt actor system, registers atto's runtime extension, spawns the per-process model actor, and returns a *atto.Runtime ready for invocations. The build closure receives the model-actor-backed llm.LLM; the agent it returns uses that instance and inherits retry, passivation and (in cluster mode) placement transparently.
- Session actors and backpressure. Behind the front door, a SessionActor per session owns the conversation; per-invocation RunWorker goroutines stream events through a buffered channel (atto.WithEventBufferSize, default 64). While a turn is in flight, extra work for that session queues in the actor stash up to atto.WithStashBound (default 32); beyond that you get session.ErrSessionBacklogFull. Idle sessions passivate after atto.WithPassivationAfter (default 15 minutes) and re-hydrate from the configured store on next use. A failed Save on commit rolls the turn back in memory before replying.
- Model actor. Every completion runs through a singleton ModelActor: streaming via pipe-to tasks, exponential backoff retries for transient failures (caps configured via atto.WithModelMaxRetries, atto.WithModelBaseBackoff, atto.WithModelMaxBackoff), and cancellation tied to the caller so one stream does not abort another.
- Cluster mode in one option. atto.WithCluster(provider, opts...) builds a cluster-enabled actor system: it registers atto's actor kinds, wires the remote serialisable types, and resolves dependencies through atto.Extension so actors can relocate without captured globals. provider is any atto/discovery.Provider; tune ClusterBind, ClusterQuorum, ClusterReplicaCount, ClusterPartitions, and ClusterBootstrapTimeout as needed. For goakt features outside the option set, build the actor system yourself and adopt it via atto.WithActorSystem(sys); atto.RegisterClusterPrerequisites and atto.RegisterRemotePrerequisites merge atto's kinds and wire types into the config you own.
- Persistence contract with optimistic concurrency. store.SessionStore saves Snapshot values: history, state, UpdatedAt, and Version. Save rejects stale versions with store.ErrConcurrentWrite. Load and Delete round out the interface. Implementations live under store/inmemory, store/bolt, and store/postgres, all checked against store/storetest.
- Pause for clarification. agent.WithClarification() registers a reserved request_user_input tool. When the model calls it, the agent loop intercepts before dispatch, emits session.EventInputRequired carrying the question and a synthesised assistant message, and ends the invocation. The question commits to history through the same path as a final message, so the paused conversation survives passivation and cluster relocation. The next Runtime.Run against the same session sees the question in history and continues naturally.
- Atomic state delta. Each agent.Invocation carries a mutable session.State plus a session.StateDelta. Ops recorded in the delta commit atomically with the assistant message for that turn; partial turns leave no half-applied state behind.
- Streaming events. Runtime.Run yields iter.Seq2[*session.Event, error]: text deltas, tool call and result notices, input-required pauses, the final assistant message, and terminal errors (session.EventKind).
- LLM agent loop. agent.NewLLM streams completions and runs tool round trips until the model answers without more tool calls, or until WithMaxIterations stops the loop (default 10). Configure WithInstruction, WithTemperature, WithTools, WithName (shown as the event author), and any llm.LLM adapter.
- Typed tools. tool.Func reflects the argument struct into JSON Schema (unless you pass WithSchema), attaches an optional description, and wraps ordinary Go functions. The registry validates JSON arguments before dispatch.
- First-party model adapters. Streaming llm.LLM implementations for OpenAI-compatible HTTP APIs (llm/openai), native Anthropic (llm/anthropic, including Request.CacheKey for prompt caching), native Gemini (llm/gemini, AI Studio and Vertex), plus helpers for Azure, Ollama, and vLLM.
- Structured logging. atto.WithLogger(*slog.Logger) forwards every goakt runtime message (cluster bootstrap, peer discovery, actor lifecycle, supervision, remoting) through your slog.Handler. The default is silent. The handler's level is the single source of truth: a slog.LevelDebug handler surfaces gossip and placement traffic while a slog.LevelWarn handler stays quiet.
- A2A bridge. a2a.NewServer(rt, ...) exposes the runtime over A2A JSON-RPC + SSE; a2a.RemoteAgent calls remote A2A peers as agent.Agent sub-agents. Per-task SSE topics, cross-pod resubscribe, and input-required pause-and-resume survive node loss. See the A2A section below.
- Test doubles. llm.NewFake replays scripted chunks; tool.Fake records calls. Combine them with store/inmemory for end-to-end tests without the network or goakt.
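The typed-tools item above says an argument struct is reflected into JSON Schema. As a rough illustration of what that kind of reflection involves, the sketch below derives a small schema object from a flat struct using only the standard library. It is not tool.Func's code: schemaFor and jsonType are hypothetical helpers, only scalar fields are handled, and treating every field without omitempty as required is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// weatherArgs is an example argument struct with one optional field.
type weatherArgs struct {
	City string `json:"city"`
	Days int    `json:"days,omitempty"`
}

// schemaFor builds a minimal JSON-Schema-like object for a flat struct.
// Fields without omitempty are listed as required (a sketch-level rule).
func schemaFor(v any) (map[string]any, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("schemaFor: want a struct value, got %T", v)
	}
	props := map[string]any{}
	required := []string{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		name, opts, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "-" {
			continue
		}
		if name == "" {
			name = f.Name
		}
		typ, err := jsonType(f.Type.Kind())
		if err != nil {
			return nil, fmt.Errorf("field %s: %w", f.Name, err)
		}
		props[name] = map[string]any{"type": typ}
		if !strings.Contains(opts, "omitempty") {
			required = append(required, name)
		}
	}
	return map[string]any{"type": "object", "properties": props, "required": required}, nil
}

// jsonType maps scalar Go kinds to JSON Schema type names.
func jsonType(k reflect.Kind) (string, error) {
	switch k {
	case reflect.String:
		return "string", nil
	case reflect.Bool:
		return "boolean", nil
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return "integer", nil
	case reflect.Float32, reflect.Float64:
		return "number", nil
	default:
		return "", fmt.Errorf("unsupported kind %s", k)
	}
}

func main() {
	schema, err := schemaFor(weatherArgs{})
	if err != nil {
		panic(err)
	}
	out, _ := json.MarshalIndent(schema, "", "  ")
	fmt.Println(string(out))
}
```

A schema like this is what lets a registry reject malformed model arguments before the Go function is ever called.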
```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/tochemey/atto"
	"github.com/tochemey/atto/agent"
	"github.com/tochemey/atto/llm"
	"github.com/tochemey/atto/session"
	"github.com/tochemey/atto/tool"
)

type weatherArgs struct{ City string `json:"city"` }

func main() {
	ctx := context.Background()

	weather := tool.Func("get_weather",
		func(_ context.Context, a weatherArgs) (string, error) {
			return fmt.Sprintf("18C and cloudy in %s.", a.City), nil
		},
		tool.WithDescription("Look up the weather in a city."),
	)

	// Swap for llm/openai, llm/anthropic, or llm/gemini when you have a key.
	model := llm.NewFake(
		llm.Script{Chunks: []*llm.Chunk{{
			ToolCalls: []session.ToolCall{{
				ID:        "call-1",
				Name:      "get_weather",
				Arguments: json.RawMessage(`{"city":"Lagos"}`),
			}},
			Done: true,
		}}},
		llm.Script{Chunks: []*llm.Chunk{
			{Delta: "The weather in Lagos is 18C and cloudy."},
			{Done: true},
		}},
	)

	build := func(m llm.LLM) agent.Agent {
		return agent.NewLLM(
			agent.WithModel(m),
			agent.WithInstruction("You are a helpful assistant."),
			agent.WithTools(weather),
		)
	}

	rt, err := atto.New(ctx, model, build)
	if err != nil {
		log.Fatal(err)
	}
	defer rt.Stop(ctx)

	for ev, err := range rt.Run(ctx, "user-123", session.UserText("What's the weather in Lagos?")) {
		if err != nil {
			log.Fatal(err)
		}
		if ev.Kind == session.EventTextDelta {
			fmt.Print(ev.TextDelta)
		}
	}
}
```

The surface area stays small: Agent, LLM, Tool, Event, Runtime. Session lifecycle, mailboxes, stash limits, passivation, and cluster serialisation live inside the runtime. You do not subclass actors or spawn a goroutine per chat by hand.
Go has real agent frameworks now. LangChainGo and Google's Agent Development Kit for Go are the most widely cited, and many smaller modules ship on pkg.go.dev.
Across those projects the dominant pattern is composition inside one process: wire models, tools, and storage with the library's abstractions, then solve per-session ordering, restart survival, load shedding, and multi-replica deployment yourself. Clustering and sticky sessions are usually application architecture (databases, queues, load balancers), not one shared runtime primitive in the toolkit.
Atto puts those concerns on an actor system. GoAkt already provides the runtime primitives: cluster-aware placement, supervision, scheduling, bounded mailboxes, passivation. Atto maps sessions and completions onto actors, so Runtime.Run looks the same whether atto built the actor system or you handed it a clustered one. The public surface stays small: Agent, LLM, Tool, Event, Runtime, plus adapters and stores. The runtime does the work.
atto.New is the façade. It builds and starts a private goakt ActorSystem, registers atto's runtime extension with your llm.LLM and store.SessionStore, spawns the per-process model actor, and returns a *atto.Runtime. The build closure receives the model-actor-backed LLM and hands it to the agent that the runtime drives. Per-invocation Worker goroutines talk to a SessionActor for history and commits and a ModelActor for retried completions.
internal/actor.SessionActor loads and saves through store.SessionStore, applies persisted turn deltas (CommitTurn), and uses goakt's stash so overlapping snapshot requests wait in line or error cleanly when the stash overflows.
The model-actor bridge wraps your raw llm.LLM so streaming completions go through ModelActor, where centralised retry and backoff live. That matters most on a cluster. The wiring is internal; the agent only ever sees llm.LLM.
store.SessionStore is the contract behind inmemory, bolt, and postgres. The shared storetest suite keeps every backend honest.
Single-process callers stay on the default atto.New(ctx, model, build). For cluster mode, pass atto a discovery.Provider and a shared store. Atto registers the actor kinds, wire messages, and runtime extension internally; your code never imports goakt:
```go
rt, _ := atto.New(ctx, model, build,
	atto.WithStore(postgresStore),
	atto.WithCluster(provider,
		atto.ClusterQuorum(2),
		atto.ClusterBind("0.0.0.0", 3320, 3321, 3322),
	),
)
```

provider implements atto/discovery.Provider: four methods (ID, Start, DiscoverPeers, Stop) over a context.Context-aware lifecycle. For any backend goakt supports, a custom discovery.Provider is typically a 50 to 60 line wrapper around goakt's existing implementation. examples/cluster shows that shape against goakt/v4/discovery/kubernetes, keeping the rest of user code goakt-free.
atto.WithCluster requires an explicit atto.WithStore: the in-memory default is private per node and would silently break session affinity, so atto returns ErrClusterNeedsSharedStore if you forget. Pair it with a backend every node can reach: store/postgres, or the durable cluster-friendly backends under atto-stores.
Some goakt features aren't exposed through atto's option set: custom logger, custom serdes, multi-data-centre placement. To use them, build the actor system yourself and adopt it via atto.WithActorSystem(sys). atto.RegisterClusterPrerequisites and atto.RegisterRemotePrerequisites merge atto's kinds and wire types into your config without disturbing your own:
```go
clusterCfg := gactor.NewClusterConfig().
	WithDiscovery(myProvider).
	WithKinds(myKinds...).
	WithMinimumPeersQuorum(2)
atto.RegisterClusterPrerequisites(clusterCfg)

remoteCfg := remote.NewConfig("0.0.0.0", 3330,
	remote.WithSerializables(myMessages...),
)
atto.RegisterRemotePrerequisites(remoteCfg)

sys, _ := gactor.NewActorSystem("agents",
	gactor.WithRemote(remoteCfg),
	gactor.WithCluster(clusterCfg),
	gactor.WithExtensions(atto.Extension(
		atto.ExtensionWithStore(postgresStore),
		atto.ExtensionWithLLM(model),
	)),
)
_ = sys.Start(ctx)

rt, _ := atto.New(ctx, model, build, atto.WithActorSystem(sys))
```

Both Register… helpers are additive: kinds and serialisables you supply survive untouched. Dependencies resolve in PreStart through atto.Extension, so actors relocate without holding stale pointers.
a2a.NewServer(rt, ...) puts the same *atto.Runtime behind the A2A protocol: one JSON-RPC handler, one SSE stream per task, and an AgentCard auto-generated from the tools you register. The agent does not change shape; the bridge dispatches into the same SessionActor the runtime drives without A2A.
The cluster guarantees carry across:
- Per-task SSE topics on a goakt TopicActor. tasks/resubscribe reattaches from any pod. When a reconnect lands on a different node from the one that ran the turn, a TaskRegistryActor resolves taskID → sessionID, replays a synthetic Task from the persisted projection, then forwards live events through the cluster topic. Late reconnectors past the grace window (a2a.WithResubscribeGrace, default 60s) receive ErrTaskNotFound.
- input-required survives pod loss. A paused turn commits through the same CommitTurn path as a final assistant message. The next message/send referencing the same taskId rehydrates history from the session store and resumes; the resume can land on a different pod than the pause.
- Task state in the session snapshot. Each A2A task is a projection stored under a reserved __a2a_task_<taskID> key inside session.State. Optimistic concurrency, store reload, and cross-pod placement reuse the same machinery as a normal turn; there is no parallel task-store implementation to keep in sync.
- a2a.RemoteAgent satisfies agent.Agent. A remote A2A endpoint plugs into a local orchestrator as a sub-agent. Local A → remote B → local C composes; the AgentCard is fetched once and cached at construction time.
- Auth at the bridge boundary. auth.JWT and auth.APIKey middleware sit in front of the JSON-RPC handler. The authenticated principal flows through context.Context to a ContextIDResolver (default principal/contextID), so two tenants cannot reach each other's sessions even when they reuse a contextId.
```go
rt, _ := atto.New(ctx, model, build,
	atto.WithStore(postgresStore),
	atto.WithCluster(provider, atto.ClusterQuorum(2)),
)

srv, _ := a2a.NewServer(rt,
	a2a.WithSkillFromTool(weather, a2a.SkillTags("weather", "live")),
	a2a.WithAuth(auth.JWT(jwtCfg)),
)
http.Handle("/", srv.Handler())
```

examples/a2a-cluster runs this bridge on a three-pod kind cluster; the demo script kills the pod that just served a turn and asserts the resume lands on a peer.
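The tenant isolation described in the auth item of the list above comes from deriving the session identifier from both the authenticated principal and the client-supplied context ID. Here is a minimal sketch of that idea, assuming a hypothetical sessionKey helper. It is not the bridge's ContextIDResolver, and the escaping step is an addition of this sketch to keep a "/" inside either part from blurring the boundary.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// sessionKey combines the authenticated principal with the client-supplied
// context ID. Each part is escaped so a "/" inside one part cannot make two
// different (principal, contextID) pairs collide.
func sessionKey(principal, contextID string) (string, error) {
	if principal == "" {
		return "", errors.New("sessionKey: unauthenticated request")
	}
	if contextID == "" {
		return "", errors.New("sessionKey: empty context ID")
	}
	return url.PathEscape(principal) + "/" + url.PathEscape(contextID), nil
}

func main() {
	a, _ := sessionKey("tenant-a", "chat-1")
	b, _ := sessionKey("tenant-b", "chat-1")
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println(a == b) // false: same contextId, different sessions
}
```

Because the principal comes from the verified credential and not from the request body, a client cannot pick its way into another tenant's session by guessing a context ID.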
| Example | What it shows |
|---|---|
| quickstart | Runnable tour with a scripted fake model. No API key required. |
| clarification | Pause-and-resume across two turns of the same session. No API key required. |
| single-agent | Live Gemini plus a real HTTP tool. |
| multi-agent | Coordinator delegates to a specialist ("agent as tool"). |
| cluster | Three-pod atto cluster on a local kind cluster. |
| a2a-cluster | A2A bridge on a three-pod kind cluster; demo proves pod-loss survival. |
| candidate-sourcing | Multi-agent recruiter on docker-compose: clustered front-door, two A2A sub-agents wired through a2a.AgentAsTool, dnssd peer discovery, Postgres-backed sessions, web chat UI. |
llm.NewFake replays scripted chunks; tool.Fake records invocations. Together with store/inmemory, they give you deterministic, fast tests. New store backends plug into storetest.Run(t, factory) against the shared contract.
The five-concept public API is the stability target through v0.x. v0.1.0 shipped the runtime: sessions, cluster mode, three native model adapters, three in-tree stores, the model actor. v0.2 adds the A2A bridge — serve A2A over JSON-RPC + SSE, call remote agents through a2a.RemoteAgent, and resume across input-required pauses on a different pod. Details live in CHANGELOG.md. v0.3 is the next milestone: push-notification webhooks, gRPC transport, OAuth2 / mTLS auth, OTel spans across A2A hops, and RemoteAgent resume across remote pauses.
See SECURITY.md for the disclosure process.
Bug fixes, adapters, and stores are welcome. Atto uses Conventional Commits and runs go test -race, go vet, and golangci-lint as in CI. See CONTRIBUTING.md.
atto is Italian for "act", and also the metric prefix 10⁻¹⁸.