Long-running Agents must be able to resume at any time, fork, and be audited. KODE SDK implements a unified persistence protocol at the kernel level that covers messages, tool calls, Todo, events, breakpoints, and lineage.

| Concept | Description |
|---|---|
| Metadata | Serializes template, tool descriptors, permissions, Todo, sandbox config, breakpoints, lineage |
| Safe-Fork-Point (SFP) | Every user message or tool result creates a recoverable node for snapshot/fork |
| BreakpointState | Marks current execution phase (READY → PRE_MODEL → ... → POST_TOOL) |
| Auto-Seal | When a crash occurs during tool execution, Resume auto-seals the incomplete call with a `tool_result` |
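
To make the Safe-Fork-Point row concrete, the sketch below scans a message history and marks every user message or tool result as a recoverable node. The `ChatMessage` shape and the `safeForkPoints` function are hypothetical stand-ins rather than SDK types, and numbering the points as `sfp-{index}` in order of appearance is an assumption.

```typescript
// Hypothetical message shape; the SDK's real types may differ.
type ChatRole = 'user' | 'assistant' | 'tool_result';

interface ChatMessage {
  role: ChatRole;
  text: string;
}

interface ForkPoint {
  id: string;           // assumed naming: sfp-{index}
  messageIndex: number; // position of the message that closed the node
}

// A user message or a tool result closes a unit of work, so the history up to
// and including it is treated as a consistent place to snapshot or fork from.
function safeForkPoints(messages: ChatMessage[]): ForkPoint[] {
  const points: ForkPoint[] = [];
  messages.forEach((msg, messageIndex) => {
    if (msg.role === 'user' || msg.role === 'tool_result') {
      points.push({ id: `sfp-${points.length}`, messageIndex });
    }
  });
  return points;
}

const demoMessages: ChatMessage[] = [
  { role: 'user', text: 'List the open TODOs' },
  { role: 'assistant', text: 'Calling the todo tool' },
  { role: 'tool_result', text: '3 items' },
  { role: 'assistant', text: 'You have 3 open items.' },
];

console.log(safeForkPoints(demoMessages).map((p) => p.id));
```

Assistant turns are skipped in this model because a crash in the middle of one leaves work that still has to be sealed or replayed.
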
```typescript
import { Agent } from '@shareai-lab/kode-sdk';

// `deps` is the application's shared AgentDependencies instance, created elsewhere.

// Resume with an explicit config
const agent = await Agent.resume('agt-demo', {
  templateId: 'repo-assistant',
  modelConfig: {
    provider: 'anthropic',
    model: process.env.ANTHROPIC_MODEL_ID ?? 'claude-sonnet-4-20250514',
    apiKey: process.env.ANTHROPIC_API_KEY!,
  },
  sandbox: { kind: 'local', workDir: './workspace', enforceBoundary: true },
}, deps, {
  strategy: 'crash', // Auto-seal incomplete tools
  autoRun: true,     // Continue processing queue after resume
});
```

```typescript
// Resume from the Store, overriding part of the persisted metadata
const agent = await Agent.resumeFromStore('agt-demo', deps, {
  overrides: {
    modelConfig: {
      provider: 'anthropic',
      model: process.env.ANTHROPIC_MODEL_ID ?? 'claude-sonnet-4-20250514',
      apiKey: process.env.ANTHROPIC_API_KEY!,
    },
  },
});
```

| Option | Values | Description |
|---|---|---|
| `strategy` | `'manual' \| 'crash'` | `crash` auto-seals incomplete tools |
| `autoRun` | `boolean` | Continue processing message queue after resume |
| `overrides` | `Partial<AgentConfig>` | Override metadata (model upgrade, permission changes, etc.) |

Important: You must re-bind event listeners after Resume (Control/Monitor callbacks are not auto-restored).
| Capability | SDK | Application |
|---|---|---|
| Template, tools, sandbox restore | Auto-rebuild | Not needed |
| Messages, tool records, Todo, Lineage | Auto-load | Not needed |
| FilePool watching | Auto-restore | Not needed |
| Hooks | Auto-register | Not needed |
| Control/Monitor listeners | Not handled | Must re-bind after Resume |
| Approval flows, alerts | Not handled | Integrate with business systems |
| Dependency singleton management | Not handled | Ensure store/registry global reuse |
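
Because listeners are the application's responsibility, one way to keep them consistent is to wire them in a single function that every create, resume, and fork path calls. The sketch below assumes only an `on(event, handler)` method like the `agent.on(...)` calls used in this document; `bindListeners`, `Listenable`, and `FakeAgent` are hypothetical names introduced for illustration.

```typescript
// Minimal surface this sketch relies on. It mirrors the agent.on(...) calls
// used in this document, not the SDK's full Agent type.
interface AgentEvent {
  call?: { name: string };
  message?: string;
}
type AgentHandler = (event: AgentEvent) => void;

interface Listenable {
  on(event: string, handler: AgentHandler): void;
}

// Hypothetical helper: one place that wires every Control/Monitor listener,
// so the create, resume, and fork paths cannot drift apart.
function bindListeners(agent: Listenable, log: string[]): void {
  agent.on('tool_executed', (e) => log.push(`tool:${e.call?.name ?? 'unknown'}`));
  agent.on('error', (e) => log.push(`error:${e.message ?? ''}`));
}

// Stand-in for an Agent instance, used only to exercise the helper.
class FakeAgent implements Listenable {
  private handlers = new Map<string, AgentHandler[]>();

  on(event: string, handler: AgentHandler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  emit(event: string, payload: AgentEvent): void {
    for (const handler of this.handlers.get(event) ?? []) handler(payload);
  }
}

// A resumed instance starts with no listeners, so events are lost until bound.
const resumedAgent = new FakeAgent();
const auditLog: string[] = [];
resumedAgent.emit('error', { message: 'lost' }); // nobody is listening yet
bindListeners(resumedAgent, auditLog);
resumedAgent.emit('error', { message: 'disk full' });
resumedAgent.emit('tool_executed', { call: { name: 'fs_read' } });
console.log(auditLog);
```

Only events emitted after `bindListeners` runs reach the log, which is why binding should happen immediately after Resume and before the agent continues.
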
```typescript
// Create snapshot at current point
const bookmarkId = await agent.snapshot('pre-release-audit');
```

```typescript
// Fork from a snapshot
const forked = await agent.fork(bookmarkId);

// Fork from latest point
const forked2 = await agent.fork();

// Use forked Agent
await forked.send('This is a new task forked from the original conversation.');
```

- `snapshot(label?)` returns `SnapshotId` (default: `sfp-{index}`)
- `fork(sel?)` creates a new Agent: inherits tools/permissions/lineage, copies messages to a new Store namespace
- Forked Agent needs independent event binding

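
The effect of copying messages into a new Store namespace can be illustrated with a small in-memory model. `MemoryStore` and its methods are hypothetical and say nothing about how the SDK's Store is implemented; the point is only that the fork and the original diverge independently after the copy.

```typescript
// Hypothetical in-memory model of a Store; names are illustrative only.
class MemoryStore {
  private namespaces = new Map<string, string[]>();

  append(agentId: string, message: string): void {
    const messages = this.namespaces.get(agentId) ?? [];
    messages.push(message);
    this.namespaces.set(agentId, messages);
  }

  // Returns a copy so callers cannot mutate stored history.
  read(agentId: string): string[] {
    return [...(this.namespaces.get(agentId) ?? [])];
  }

  // Copy the first `upTo` messages of `sourceId` into a fresh namespace.
  // Omitting `upTo` forks from the latest point.
  fork(sourceId: string, targetId: string, upTo?: number): void {
    if (this.namespaces.has(targetId)) {
      throw new Error(`namespace ${targetId} already exists`);
    }
    const source = this.read(sourceId);
    this.namespaces.set(targetId, source.slice(0, upTo ?? source.length));
  }
}

const memoryStore = new MemoryStore();
memoryStore.append('agt-demo', 'user: draft the release notes');
memoryStore.append('agt-demo', 'assistant: here is a draft');

memoryStore.fork('agt-demo', 'agt-demo-fork'); // fork from the latest point
memoryStore.append('agt-demo-fork', 'user: try a shorter version');

console.log(memoryStore.read('agt-demo').length);      // 2
console.log(memoryStore.read('agt-demo-fork').length); // 3
```
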
When a crash occurs during any of these phases, Resume automatically writes a compensating `tool_result`:

| Phase | Seal Info | Recommended Action |
|---|---|---|
| `PENDING` | Tool not executed | Validate params and retry |
| `APPROVAL_REQUIRED` | Waiting for approval | Re-trigger approval or manually complete |
| `APPROVED` | Ready to execute | Confirm input still valid and retry |
| `EXECUTING` | Execution interrupted | Check side effects, manual confirm if needed |

Auto-seal triggers:

- `monitor.agent_resumed`: Contains `sealed` list and `strategy`
- `progress.tool:end`: Adds failed `tool_result` with `recommendations`
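
The sealing step can be pictured as a lookup from the interrupted phase to a compensating result. The following is a minimal sketch under stated assumptions: `ToolCallRecord`, `SealedResult`, `autoSeal`, and the terminal `COMPLETED` phase are hypothetical, and only the four open phases and their seal texts come from the phase table in this section.

```typescript
// Phases in which an interrupted tool call needs a compensating result.
type OpenPhase = 'PENDING' | 'APPROVAL_REQUIRED' | 'APPROVED' | 'EXECUTING';
// 'COMPLETED' is an assumed terminal phase, added so the sketch has
// something that must not be sealed.
type ToolPhase = OpenPhase | 'COMPLETED';

interface ToolCallRecord {
  id: string;
  name: string;
  phase: ToolPhase;
}

interface SealedResult {
  toolCallId: string;
  isError: true;
  note: string;
  recommendation: string;
}

// Seal info and recommended action per phase, copied from the phase table.
const SEAL_TABLE: Record<OpenPhase, { note: string; recommendation: string }> = {
  PENDING: { note: 'Tool not executed', recommendation: 'Validate params and retry' },
  APPROVAL_REQUIRED: {
    note: 'Waiting for approval',
    recommendation: 'Re-trigger approval or manually complete',
  },
  APPROVED: { note: 'Ready to execute', recommendation: 'Confirm input still valid and retry' },
  EXECUTING: {
    note: 'Execution interrupted',
    recommendation: 'Check side effects, manual confirm if needed',
  },
};

// Produce one failed result for every call that never reached a terminal phase.
function autoSeal(records: ToolCallRecord[]): SealedResult[] {
  const sealed: SealedResult[] = [];
  for (const record of records) {
    if (record.phase === 'COMPLETED') continue;
    const info = SEAL_TABLE[record.phase];
    sealed.push({ toolCallId: record.id, isError: true, ...info });
  }
  return sealed;
}

const interrupted: ToolCallRecord[] = [
  { id: 'call-1', name: 'fs_read', phase: 'COMPLETED' },
  { id: 'call-2', name: 'bash_run', phase: 'EXECUTING' },
  { id: 'call-3', name: 'fs_write', phase: 'APPROVAL_REQUIRED' },
];

console.log(autoSeal(interrupted));
```

Sealing keeps the message history well-formed, since every tool call ends up paired with a result even when the process died before the tool finished.
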
```typescript
const agent = await Agent.resumeFromStore('agt-demo', deps);

// Re-bind Control/Monitor event listeners
agent.on('tool_executed', (event) => {
  console.log('Tool executed:', event.call.name);
});

agent.on('error', (event) => {
  console.error('Error:', event.message);
});

agent.on('permission_required', async (event) => {
  await event.respond('allow');
});

// For Progress events, use subscribe()
const progressSubscription = (async () => {
  for await (const envelope of agent.subscribe(['progress'])) {
    if (envelope.event.type === 'text_chunk') {
      process.stdout.write(envelope.event.delta);
    }
    if (envelope.event.type === 'done') break;
  }
})();

// Continue processing
await agent.run();
await progressSubscription;
```

- Singleton Dependencies: Create `AgentDependencies` at module level to avoid multiple instances writing to the same Store directory
- Event Re-binding: Call event binding immediately after every `resume`
- Concurrency Control: The same AgentId should only run in a single instance; use external locks or queues
- Persistence Directory: `JSONStore` works for single-machine or shared-disk environments. For distributed deployments, implement a custom Store (e.g., S3 + DynamoDB)
- Observability: Listen to `monitor.state_changed` and `monitor.error` for quick issue identification
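
For the concurrency point, a per-AgentId queue inside one process might look like the sketch below. `AgentRunQueue` is a hypothetical helper, not part of the SDK, and it only serializes work within a single process; multiple processes or machines still need an external lock or queue.

```typescript
// Hypothetical in-process guard that runs tasks for the same agentId one at a
// time. Tasks for different ids are independent of each other.
class AgentRunQueue {
  private tails = new Map<string, Promise<void>>();

  run<T>(agentId: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(agentId) ?? Promise.resolve();
    // Start the task only after the previous one for this id has settled.
    const result = previous.then(task);
    // The tail never rejects, so one failed task cannot block later ones.
    const tail: Promise<void> = result.then(() => undefined, () => undefined);
    this.tails.set(agentId, tail);
    // Drop the entry once the queue for this id is idle again.
    void tail.then(() => {
      if (this.tails.get(agentId) === tail) this.tails.delete(agentId);
    });
    return result;
  }
}

const runQueue = new AgentRunQueue();

async function demoSerializedRuns(): Promise<string[]> {
  const order: string[] = [];
  await Promise.all([
    runQueue.run('agt-demo', async () => {
      order.push('first:start');
      await Promise.resolve(); // yield, as a real agent.run() would
      await Promise.resolve();
      order.push('first:end');
    }),
    runQueue.run('agt-demo', async () => {
      order.push('second:start');
      order.push('second:end');
    }),
  ]);
  return order;
}

demoSerializedRuns().then((order) => console.log(order));
```

In an application, the task passed to `run` would wrap the resume, re-bind, and `agent.run()` sequence so that two requests for the same AgentId never overlap.
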

| Symptom | Investigation |
|---|---|
| `AGENT_NOT_FOUND` on Resume | Store directory missing or not persisted. Check `store.baseDir` mount |
| `TEMPLATE_NOT_FOUND` on Resume | Template not registered at startup; ensure template ID matches metadata |
| Missing tools | `ToolRegistry` not registered; built-in tools need manual registration |
| FilePool not restored | Custom Sandbox not implementing `watchFiles`; disable watch or complete implementation |
| Event listeners not working | Not calling `agent.on(...)` after Resume |

```typescript
import { Agent, createExtendedStore } from '@shareai-lab/kode-sdk';

// NOTE: `createDependencies` is not imported above. It is assumed to be an
// application-level helper that builds AgentDependencies from the store.

async function resumeAgent(agentId: string) {
  const store = await createExtendedStore();
  const deps = createDependencies({ store });

  // Check if Agent exists
  const exists = await store.exists(agentId);
  if (!exists) {
    throw new Error(`Agent ${agentId} not found`);
  }

  // Resume from store; autoRun is off so listeners can be bound first
  const agent = await Agent.resumeFromStore(agentId, deps, {
    strategy: 'crash',
    autoRun: false,
  });

  // Re-bind Monitor event listeners (on() only supports Control/Monitor events)
  agent.on('tool_executed', (e) => console.log('Tool:', e.call.name));
  agent.on('agent_resumed', (e) => {
    if (e.sealed.length > 0) {
      console.log('Auto-sealed tools:', e.sealed);
    }
  });
  agent.on('error', (e) => console.error('Error:', e.message));

  // For Progress events, use subscribe()
  const progressTask = (async () => {
    for await (const env of agent.subscribe(['progress'])) {
      if (env.event.type === 'text_chunk') {
        process.stdout.write(env.event.delta);
      }
      if (env.event.type === 'done') break;
    }
  })();

  // Continue processing, then wait for the progress stream to finish
  await agent.run();
  await progressTask;

  return agent;
}
```