Update dependency agents to ^0.16.0 #8
Open
renovate[bot] wants to merge 1 commit into
This PR contains the following updates:
^0.0.98 → ^0.16.0
Release Notes
cloudflare/agents (agents)
v0.16.0 (Compare Source)
Minor Changes
#1656
4c2d1a7 Thanks @cjol! - Rebuild agents/browser on the codemode connector runtime (experimental).
The browser tool surface is now a single durable tool, browser_execute: the model writes sandboxed code against a cdp connector (cdp.send, cdp.attachToTarget, cdp.spec, cdp.getDebugLog, …) instead of picking from several flat tools. Executions are recorded on a CodemodeRuntime Durable Object facet with abort-and-replay, so a run can pause for approval and resume with its browser session, tabs, and cookies intact.
BrowserConnector: a CodemodeConnector (name cdp) that owns CDP sockets keyed by execution id. Sockets are released at the end of every execution pass (onPassEnd); browser sessions are torn down on terminal status (disposeExecution), never on pause.
Session modes: one-shot (default, fresh session per execution), reuse (named shared session), and dynamic (starts one-shot; the model can promote with cdp.startSession() after e.g. logging in). Shared sessions are tracked in durable storage and survive hibernation; connector.sweep() reclaims expired ones from a scheduled task.
maxExecIdleMs (default 24h, matching the runtime's paused TTL), so a run awaiting approval keeps its browser. A swept entry leaves a tombstone so a later resume fails with a clear "expired or was swept" error instead of silently continuing in a fresh browser. Concurrent CDP calls share one in-flight socket connect instead of leaking the loser's WebSocket. Session-store locks wrap storage operations only: liveness probes and session create/delete happen outside the lock (with a commit re-check; a racing create's redundant session is deleted), so a hung Browser Rendering call can't serialize other session operations.
cdp.attachToTarget returns { sessionId } where the id is a stable handle bound to the target (not a raw CDP session id), so handles recorded before a pause still work after the resume reconnects. The object shape mirrors the real Target.attachToTarget response, which is what models expect. send without a sessionId explains that page-scoped commands need cdp.attachToTarget first, and a missing targetId explains how to list/create targets.
createBrowserTools({ ctx, browser, loader, session? }) (AI SDK and TanStack AI variants) now requires the hosting Durable Object's ctx and returns { browser_execute }; createBrowserRuntime additionally exposes the runtime handle and connector for host-side wiring (approvals, sessionInfo / closeSession / sweep). The previous browser_search / flat-tool surface and createBrowserProvider are removed.
export { CodemodeRuntime } from "agents/browser".
agents/chat gains pausedExecutionUpdate, a tool-part update that replaces a paused execution's output in the transcript with its resolved outcome (completed / rejected / paused again). This is the transcript-side half of human-in-the-loop approvals for durable executions.
#1746
e45b5ec Thanks @threepointone! - Fix RPC calls hanging forever during connection churn (#1738).
useAgent's RPC layer now survives socket replacement. usePartySocket creates a brand-new socket whenever connection options change (async query refresh, enabled toggle, path change). Previously, a call issued against a stale agent reference was buffered inside the permanently-closed old socket and its promise never settled, and a call transmitted just before replacement lost its response with no rejection either.
agent.call() (and agent.stub / agent.setState) now route through the live socket, so stale references captured by mount-time effects keep working.
Transmitted calls are rejected with Connection closed when their socket closes or is replaced (the response is connection-bound and can never arrive). Calls in flight on a newer socket are no longer spuriously rejected by a stale close event from an old socket.
If the connection target changes (agent, name, basePath, or path props change) before a queued call could be transmitted, the call is rejected instead of executing against an instance it wasn't composed for.
AgentClient similarly keeps buffered (untransmitted) calls pending across transient disconnects (PartySocket re-sends them on reconnect) and only rejects calls the server actually received.
Call timeouts: defaultCallTimeout (0 disables) on useAgent / AgentClient, or per call via the existing timeout option (timeout: 0 opts out). Streaming calls are exempt.
console.warn instead of being silently discarded.
Patch Changes
#1742
4b201a9 Thanks @threepointone! - Fix duplicated assistant text parts when a stream resume is replayed twice (#1733).
The server intentionally sends CF_AGENT_STREAM_RESUMING for the same request from both onConnect and its CF_AGENT_STREAM_RESUME_REQUEST handler. When both offers reached the useAgentChat fallback path (e.g. the transport's resume handshake had already timed out), the client ACKed both, the full chunk buffer was replayed twice into the same accumulator, and the streaming reply rendered as two stacked text blocks until refresh.
useAgentChat now fallback-ACKs a given resume offer at most once per socket (reset on close/reconnect). A repeated offer is still handed to a waiting transport resume handshake first, so a fallback-observed stream can become transport-owned. It also resets the matching trailing assistant message on every replayed non-continuation start, not only while the resume request id is still pending.
start, making replay idempotent under any number of replays.
continuation: true for continuation streams (persisted in stream metadata and restored after hibernation), so a replayed continuation appends to the existing assistant message instead of being mistaken for a fresh turn.
#1740
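Sketch for the stream-resume fix (#1742): the two ideas in that entry, ACKing a resume offer at most once per socket and resetting the accumulator on a replayed start, can be illustrated in isolation. ResumeAckGuard, ReplayChunk, and applyReplay are hypothetical names for this illustration, not the useAgentChat internals.

```typescript
// Hypothetical guard: remembers which resume offers were already
// fallback-ACKed on the current socket. Not the SDK's implementation.
class ResumeAckGuard {
  private acked = new Set<string>();

  // True only the first time a request id is offered on this socket.
  shouldAck(requestId: string): boolean {
    if (this.acked.has(requestId)) return false;
    this.acked.add(requestId);
    return true;
  }

  // Call on close/reconnect so a new socket may ACK the same offer again.
  reset(): void {
    this.acked.clear();
  }
}

// Hypothetical chunk shape; real stream chunks carry more fields.
type ReplayChunk = { type: "start" } | { type: "text"; delta: string };

// A replayed `start` clears the accumulated text, so replaying the same
// buffer any number of times converges on the same result.
function applyReplay(current: string, chunks: ReplayChunk[]): string {
  let text = current;
  for (const chunk of chunks) {
    text = chunk.type === "start" ? "" : text + chunk.delta;
  }
  return text;
}
```

Either safeguard alone prevents the stacked-text symptom in this toy model; the entry applies both, which keeps replay safe even if a duplicate ACK slips through.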
6c9de59 Thanks @threepointone! - Defer one-shot scheduled callbacks (and chat-recovery give-ups) on platform transients instead of consuming them mid-deploy (#1730).
A mid-execution Durable Object code-update reset surfaces storage failures in two shapes: the verbatim reset/supersede messages (already deferred) and SqlError: SQL query failed: Network connection lost. (a wrapper that drops the CF retryable flag and dodges the reset matcher). The second shape burned the in-process retry budget inside the same few-seconds reset window (which outlasts the retry schedule by design) and then consumed the one-shot row on exhaustion, freezing the turn for minutes until incident re-detection; in the reported production capture, storage was healthy again 15 ms after the final attempt.
agents: new cause-aware isPlatformTransientError classifier (exported, alongside isDurableObjectCodeUpdateReset): reset/supersede messages, retryable-flagged platform errors (excluding overloaded), and "Network connection lost.", looked up through wrapper cause chains. _executeScheduleCallback keeps in-process retries for connection-lost transients (a genuine blip heals fast) but on exhaustion of a one-shot row it now re-throws instead of swallowing, so the row survives and the alarm re-runs it in the healthy window that follows. Genuine application errors are still abandoned after maxAttempts exactly as before.
@cloudflare/think: _handleRecoveryCallbackError now defers (re-throws) on any platform transient instead of terminalizing through a give-up whose own seal needs the storage that is down; the bookkeeping write on the defer path is best-effort. The defer path no longer marks the recovered submission error (which made the deferred re-run skip with submission_not_running, a self-defeating defer); it stays running for the re-run to pick up. The give-up now seals the incident exhausted only after the terminal writes succeed, so a transient mid-seal defers the whole give-up for an idempotent re-run instead of half-sealing.
@cloudflare/ai-chat: same give-up seal ordering: the incident is sealed only after _exhaustChatRecovery (incl. the durable terminal record) succeeds, so a transient mid-seal preserves the one-shot row and the give-up re-runs in full on a healthy isolate.
#1745
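Sketch for the platform-transient deferral (#1740): a cause-aware classifier walks the wrapper chain instead of inspecting only the outermost error. The function below is a hypothetical re-implementation for illustration; the exported helper is isPlatformTransientError, and its real matching rules (including the verbatim reset/supersede messages, which are not quoted in the notes and are omitted here) may differ. The retryable and overloaded property names are assumptions about the error shape.

```typescript
// Message fragment quoted in the release note.
const CONNECTION_LOST = "Network connection lost.";

// Hypothetical classifier: true when the error, or anything in its `cause`
// chain, looks like a platform transient. Depth is bounded to guard against
// cyclic cause chains.
function looksPlatformTransient(err: unknown, maxDepth = 8): boolean {
  let current: unknown = err;
  for (let depth = 0; depth < maxDepth && current != null; depth++) {
    if (typeof current !== "object") return false;
    const e = current as {
      message?: unknown;
      retryable?: unknown; // assumed flag name
      overloaded?: unknown; // assumed flag name
      cause?: unknown;
    };
    // Retryable platform errors count, except overload signals.
    if (e.retryable === true && e.overloaded !== true) return true;
    // The wrapper drops the flag, so the message text is matched as well.
    if (typeof e.message === "string" && e.message.includes(CONNECTION_LOST)) {
      return true;
    }
    current = e.cause;
  }
  return false;
}
```

A caller in the position of the schedule executor would re-throw when this returns true for a one-shot row, leaving the row in place for the next alarm, and abandon the row only for errors that fail the check.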
99c9326 Thanks @cjol! - Make agent teardown reliable when the initiating request is already canceled (#1625).
The MCP Streamable-HTTP session-DELETE handler ran agent.destroy() via the request's ctx.waitUntil. By the time the DELETE lands the client is usually gone, the runtime gives a canceled request's trailing work little to no grace, and the multi-step teardown (drop tables, delete alarm, delete all storage, dispose connections) was routinely cut short, leaving half-deleted session DOs whose tables the constructor silently recreated on the next wake. (The associated "waitUntil() tasks did not complete" log warning itself originates inside workerd's WebSocket handling and is unaffected by this change.)
Teardown is now deferred to the agent's own alarm invocation. The DELETE handler awaits two fast storage writes (a durable "condemned" marker plus an immediate alarm) and responds 204; the alarm then runs the real destroy() with a fresh execution budget. The marker is removed only by the final deleteAll(), so it survives any interruption: alarm() checks it before any other work (including onStart) and finishes the teardown instead of resuming normal operation on a condemned agent, and _scheduleNextAlarm() keeps the destroy alarm armed rather than deleting it as "no work pending". destroy() itself now writes the marker first, so a direct destroy that gets interrupted converges the same way.
New internal API: Agent._cf_scheduleDestroy() (used by the MCP handler; unlike destroy() it does not abort the isolate, so callers don't need to swallow an abort error). No public API or storage-schema changes; the marker is a single internal KV record (cf_agents_destroy_pending).
#1729
1c8fdf5 Thanks @threepointone! - Fix runFiber recovery starving when a recovery scan leaves work behind.
_scheduleNextAlarm() only armed a follow-up alarm for active keepAlive leases, due schedules, and facet runs, never for orphaned cf_agents_runs rows (or interrupted/pending managed ledger fibers) still awaiting recovery. Because orphaned fibers hold no keepAlive ref, a scan that yielded on fiberRecoveryScanDeadlineMs (or a pass that retained a repeatedly-throwing unmanaged hook for retry) would never get another alarm, so the remaining fibers were never recovered. The scheduler now arms a follow-up alarm whenever fiber recovery work is still outstanding, so multi-pass recovery resumes and eventually drains every fiber (and ages out poison rows via fiberRecoveryMaxAgeMs).
The follow-up alarm uses exponential backoff (capped at 5 minutes) while scans make no forward progress, so a repeatedly-throwing recovery hook, or a fiberRecoveryMaxAgeMs: 0 ("retain forever") row whose hook keeps throwing, no longer wakes the Durable Object every keepAliveIntervalMs. A scan that recovers any fiber (including a scan-deadline yield that drained part of a large batch) resets the backoff, so legitimate multi-pass draining stays prompt.
#1737
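Sketch for the fiber-recovery backoff (#1729): the follow-up alarm delay grows while scans make no progress, is capped at 5 minutes, and resets as soon as any fiber is recovered. Only the cap comes from the release note; the base delay, the doubling factor, and the names RecoveryBackoff and nextRecoveryDelayMs are assumptions made for this illustration.

```typescript
// Cap stated in the release note.
const MAX_BACKOFF_MS = 5 * 60 * 1000;

// Assumed schedule: the delay doubles per stalled scan, up to the cap.
function nextRecoveryDelayMs(baseMs: number, stalledScans: number): number {
  return Math.min(baseMs * 2 ** stalledScans, MAX_BACKOFF_MS);
}

// Hypothetical tracker for consecutive scans that recovered nothing.
class RecoveryBackoff {
  private stalledScans = 0;
  private readonly baseMs: number;

  constructor(baseMs: number) {
    this.baseMs = baseMs;
  }

  // Report one scan outcome; returns the delay before the follow-up alarm.
  recordScan(recoveredAny: boolean): number {
    this.stalledScans = recoveredAny ? 0 : this.stalledScans + 1;
    return nextRecoveryDelayMs(this.baseMs, this.stalledScans);
  }
}
```

With an assumed 30-second base, four stalled scans in a row yield 60 s, 120 s, 240 s, and then the 300 s cap; a single productive scan returns the delay to 30 s.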
bc43133 Thanks @cjol! - Fix the two remaining #1575 gaps in how in-band stream errors ({type: "error", errorText} chunks inside an otherwise-healthy provider stream) are observed after the fact.
Errored-stream replay (partial content was lost on reconnect). A client reconnecting after an in-band error received the terminal error frame (#1645) but not the content the model streamed before the error: the replay path only served status = 'completed' streams, so an errored stream's buffered chunks were unreachable, and the server pushes no messages on connect. ResumableStream gains replayErroredChunksByRequestId, and the resume-ACK terminal replay (_replayTerminalOnAck in both AIChatAgent and Think) now replays the errored stream's stored chunks before the done: true, error: true frame, so a reconnecting client observes the same sequence a live client did. No wire-format or schema changes: replayed chunks reuse the existing replay: true frame shape and the error text still comes from the durable terminal record.
Agent-tool error attribution (cross-run contamination). When an in-band error frame was broadcast on a child agent and the active run was unknown, the error was stamped onto every tailed run, so an unrelated turn's failure (or one of several overlapping runs) could mark healthy runs as error, and capture depended on a tailer being attached at the right moment. Frames are now attributed by the request id they carry: each agent-tool run is bound to its turn's request id when the turn starts (persisted on the run row at start rather than at terminal, so attribution survives a DO restart mid-run), and only the owning run's error/progress state is updated. Frame inspection also no longer requires an attached tailer, so error capture is independent of tailer timing.
#1707
d96a17c Thanks @threepointone! - Fix keepAlive() leaving a stale 30s heartbeat alarm after the lease is released. Previously the dispose returned by keepAlive() (and used by keepAliveWhile()) only decremented the in-memory ref count and never rescheduled the alarm, so a short-lived lease could permanently bump the next alarm to now + keepAliveIntervalMs with nothing to pull it back. The dispose now recomputes the alarm from persistent state when the last lease is released (mirroring the facet release path), clearing the heartbeat when no other work needs it. Fixes #1704 (root cause behind #1703).
#1724
c18a446 Thanks @whoiskatrin! - Fix SQLite memory amplification in AgentSessionProvider.getHistory() and add byte-budgeted history reads (#1710).
The history path query previously selected m.* inside its recursive CTE, so every message blob was materialized in SQLite's recursion queue AND its ORDER BY sorter: 2-3 transient copies of the entire transcript inside the SQLite allocator, which in workerd shares the isolate's memory budget with the JS heap. On large media-heavy sessions this exhausted the allocator and surfaced as SQLITE_NOMEM on every wake. The CTE now recurses over (id, parent_id, depth) only and content is fetched separately in bounded chunks via json_each, which streams without materializing the result set. Leaf detection similarly no longer drags content blobs through its sorter.
New session APIs for hosts that need to bound wake-time memory:
Session.getRecentHistory(maxContentBytes, minRecentMessages?): returns the most recent messages on the active path that fit a byte budget (always at least the leaf, and at least minRecentMessages rows when provided; rows are individually capped at write time, so the floor keeps memory bounded), plus truncated and totalContentBytes. Backed by the optional SessionProvider.getRecentHistory(); falls back to a full read for providers that don't implement it, reporting the real serialized size and warning once that the budget cannot be enforced.
Session.getHistoryRowStats(): per-row stored sizes AND roles for the active path WITHOUT loading content (optional SessionProvider.getHistoryRowStats()), so oversized rows can be found and processed one at a time.
Session.internal_rewriteMessage(): maintenance write path that skips the full-history token-estimate status broadcast of a public updateMessage(), for framework passes (media eviction) that rewrite many rows with bounded memory.
Bounded init reads: the init-time loaded-skill restore scan is now skipped entirely when no skill-capable context provider is configured, and when one is, it reads row stats and fetches assistant messages ONE AT A TIME instead of materializing the full transcript (full-read fallback for providers without row stats). Content hydration chunks are additionally bounded by cumulative stored bytes (4MB), not just row count, removing the 50-near-cap-rows worst case.
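The byte-budget rule described for Session.getRecentHistory above can be sketched as a pure selection over per-row sizes: walk from the leaf backwards, keep rows while they fit, and never return fewer than the floor. This is a minimal illustration under assumptions, not the provider's SQL-backed implementation; StoredRow, RecentWindow, and pickRecentWithinBudget are hypothetical names, and totalContentBytes is assumed here to mean the size of the whole active path.

```typescript
// Hypothetical per-row stats: id plus stored content size in bytes.
type StoredRow = { id: string; contentBytes: number };

type RecentWindow = {
  ids: string[]; // selected rows, oldest to newest
  truncated: boolean; // true when older rows were left out
  totalContentBytes: number; // assumed: bytes across the whole path
};

// `rows` is ordered oldest to newest; the last row is the leaf.
function pickRecentWithinBudget(
  rows: StoredRow[],
  maxContentBytes: number,
  minRecentMessages = 1
): RecentWindow {
  // The leaf is always returned, so the floor is at least one row.
  const floor = Math.max(1, minRecentMessages);
  const picked: string[] = [];
  let used = 0;
  for (const row of [...rows].reverse()) {
    const overBudget = used + row.contentBytes > maxContentBytes;
    // Stop at the budget only once the floor has been satisfied.
    if (overBudget && picked.length >= floor) break;
    picked.push(row.id);
    used += row.contentBytes;
  }
  picked.reverse();
  const totalContentBytes = rows.reduce((sum, r) => sum + r.contentBytes, 0);
  return { ids: picked, truncated: picked.length < rows.length, totalContentBytes };
}
```

Because the floor can exceed the budget, the window is bounded by the larger of the two; that is tolerable in this model only because, as the entry notes, individual rows are capped at write time.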
Also adds chat:onstart:degraded, chat:hydration:windowed, and chat:media:evicted observability event types emitted by @cloudflare/think.
#1748
4ec3b07 Thanks @threepointone! - Ignore RPC responses when the WebSocket has already closed.
Async callable methods can finish after a client disconnects. The server now treats that closed-socket response delivery as a no-op instead of surfacing an uncaught "WebSocket send() after close()" error from the Workers runtime.
#1712
835e7b0 Thanks @threepointone! - Reclaim resumable-stream buffers from an alarm so idle chats don't leak storage (#1706)
Resumable-stream chunk buffers (cf_ai_chat_stream_*) were only swept lazily when a subsequent stream completed. A chat that received a single turn and then went idle never triggered that sweep, so its buffers lingered in the Durable Object's SQLite for the lifetime of the DO.
AIChatAgent and Think now arm a scheduled cleanup alarm whenever a stream starts and whenever it finishes (completes or errors). Arming on start guarantees that a stream whose DO is evicted mid-flight and never reaches a finish still gets a future sweep instead of leaking. This is the safety net for the non-durable path (e.g. chatRecovery: false, the AIChatAgent default): those turns don't run inside runFiber, so there's no leftover keepAlive alarm and no fiber-recovery scan, and if the client never reconnects nothing else wakes the DO. (Durable runFiber turns already self-heal: the keepAlive alarm survives eviction, wakes the DO, and recovery finalizes the stream, which arms cleanup. Arming on start is belt-and-suspenders there.) The alarm sweeps aged buffers via the retention windows below and re-arms only while reclaimable rows remain, so a fully-swept DO stops waking itself. Arming is idempotent so high-turn-count chats never accumulate cleanup schedules; the in-callback re-arm uses a fresh (non-idempotent) row so it survives the one-shot deletion of the firing schedule. No per-turn Durable Object and no change to the session DO lifecycle are required.
Retention is now split into two short, purpose-specific windows instead of a single 24h threshold: completed/errored buffers are kept for a brief 10-minute reconnect-and-replay grace (the assistant message is persisted separately, so the buffer is only needed to replay a just-finished stream or deliver a terminal error frame to a reconnecting client), while abandoned in-flight (streaming) rows are kept for 1 hour so an interrupted turn has ample time to be resumed or recovered before its buffer is presumed dead. The abandoned-row sweep keys off last chunk activity rather than stream start time, so a long-running stream that is still emitting chunks is never reclaimed mid-flight.
ResumableStream gains cleanup(now?) (force a sweep, bypassing the lazy interval gate) and hasReclaimableStreams() to support alarm-driven cleanup.
#1713
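Sketch for the buffer-retention windows (#1712): the two windows reduce to one predicate per row. The 10-minute and 1-hour values come from the release note; the StreamRow shape, its field names, the "error" state label, and isReclaimable are assumptions for illustration and do not mirror the actual cf_ai_chat_stream_* schema.

```typescript
const FINISHED_GRACE_MS = 10 * 60 * 1000; // completed/errored buffers
const ABANDONED_GRACE_MS = 60 * 60 * 1000; // in-flight (streaming) buffers

// Hypothetical row shape.
type StreamRow = {
  state: "streaming" | "completed" | "error";
  lastChunkAt: number; // ms timestamp of the most recent chunk
  finishedAt?: number; // ms timestamp, set once the stream ends
};

function isReclaimable(row: StreamRow, nowMs: number): boolean {
  if (row.state === "streaming") {
    // Keyed off last chunk activity, not stream start, so a long stream
    // that is still emitting chunks is never reclaimed mid-flight.
    return nowMs - row.lastChunkAt > ABANDONED_GRACE_MS;
  }
  // Finished streams only need a short reconnect-and-replay grace.
  const finishedAt = row.finishedAt ?? row.lastChunkAt;
  return nowMs - finishedAt > FINISHED_GRACE_MS;
}
```

An alarm-driven sweep in this model would delete the rows for which the predicate holds and decide whether to re-arm based on the rows that are left.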
18c438b Thanks @threepointone! - Support client tools on the Think sub-agent chat() RPC path (#1709)
ChatOptions now accepts clientTools (the same ClientToolSchema[] carried over the WebSocket chat protocol) and an onClientToolCall executor. This lets a parent agent that drives a Think sub-agent over chat() expose client-defined tools to the sub-agent and complete the tool round trip within the same turn.
Without onClientToolCall, the schemas are still registered and the model's call is surfaced through the stream callback (execute-less), matching the WebSocket behavior. With it, the call is resolved inline so the turn can continue to completion; the RPC stream callback has no inbound result channel of its own.
Unlike the WebSocket path, the schemas and executor are kept per-turn and are NOT persisted: the executor is a live RPC reference that cannot survive an eviction, and there is no SPA to replay a tool-result. This keeps chat recovery correct: an eviction-interrupted client-tool call is repaired like a server tool (the model proceeds) rather than being mistaken for a pending human interaction and parking forever.
agents/chat's createToolsFromClientSchemas gains an optional { execute } delegate (and exports a new ClientToolExecutor type) to build the executable variant. Both additions are backward-compatible.
Updated dependencies [b2b6762, 4c2d1a7, 4c2d1a7]:
v0.15.0 (Compare Source)
Minor Changes
#1701
6caa6e8 Thanks @mattzcarey! - Refactor WorkerTransport to extend the official MCP SDK's WebStandardStreamableHTTPServerTransport instead of being a hand-rolled implementation.
The wrapper is now a thin subclass that layers Workers-specific concerns on top of the SDK transport:
CORS (corsOptions).
MCPStorageApi adapter. sessionId, initialized, and initializeParams are snapshotted after each request and replayed on cold start so client capabilities are restored without a fresh initialize round-trip.
SSE keepalive: KEEPALIVE_FRAME (: keepalive\n\n) at KEEPALIVE_INTERVAL_MS (25s) from sse-keepalive.ts. Keepalive is unconditional on POST response streams and disabled on the standalone GET stream when an eventStore is configured (clients recover idle drops via Last-Event-ID instead).
Everything else (session validation, SSE streaming, protocol-version negotiation, event-store resumability, send/close lifecycle) is delegated to the SDK transport. Net: ~500 fewer lines of code to maintain.
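The keepalive frame mentioned above is an SSE comment: a line that starts with a colon carries no event, so clients skip it while the bytes still keep the connection from idling out. The sketch below shows this with a deliberately minimal reader that handles only data: fields. readSseData is a hypothetical helper, not part of the SDK or of sse-keepalive.ts, and real clients implement the full event-stream parsing rules.

```typescript
// Frame text as quoted in the release note.
const KEEPALIVE_FRAME = ": keepalive\n\n";

// Minimal SSE reader (illustration only): returns the data payload of each
// event and ignores comment lines, which begin with ":".
function readSseData(raw: string): string[] {
  const payloads: string[] = [];
  for (const block of raw.split("\n\n")) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""));
    // A block with no data lines (such as a comment) dispatches no event.
    if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
  }
  return payloads;
}
```

Feeding the reader a stream with a keepalive frame between two events yields only the two payloads, which is why the frame can be written unconditionally on POST response streams without confusing clients.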
The exported shape is unchanged: WorkerTransport, WorkerTransportOptions, MCPStorageApi, and TransportState keep the same names, and WorkerTransportOptions now also extends the SDK's transport options. The default createMcpHandler path (a fresh transport per request) is unaffected.
There are, however, a few observable behaviour changes for callers who used WorkerTransport directly or relied on its previous quirks:
handleRequest's second argument is now { parsedBody?, authInfo? } (the SDK shape) instead of a positional parsedBody. createMcpHandler and McpAgent don't pass it, but callers invoking transport.handleRequest(request, parsedBody) directly must wrap it as transport.handleRequest(request, { parsedBody }).
retryInterval priming now follows the SDK contract. Previously a retry: priming frame was written to any GET SSE stream whenever retryInterval was set. The SDK only writes a priming event when an eventStore is configured and the negotiated protocol version is >= 2025-11-25 (older clients can't parse the empty-data: priming frame), and on POST streams rather than the standalone GET stream. retryInterval is still accepted but only affects that SDK priming event.
onerror now fires on client/protocol validation failures. The SDK invokes onerror for responses such as 400/405/406/415 and session-not-found. The old transport only surfaced internal errors, so handlers that log onerror will now see normal client mistakes.
onsessionclosed fires before the underlying close() (and therefore before onclose) on DELETE, instead of after. Ordering only; the session id is still passed.
started is now read-only. It was a writable instance field and is now a getter backed by the SDK's internal _started flag. Reading it (e.g. createMcpHandler's reconnect guard) is unchanged; assigning to it is no longer supported.
createMcpHandler now forwards SDK transport options. Because WorkerTransportOptions extends the SDK options, the handler passes through everything except its own route / authContext / transport fields, including eventStore, retryInterval, onsessionclosed, and the SDK DNS-rebinding options (enableDnsRebindingProtection, allowedHosts, allowedOrigins). The previous handler silently dropped these.
The SDK dependency is pinned exactly (@modelcontextprotocol/sdk 1.29.0, no caret) because the wrapper relies on a handful of SDK internals for state restore and keepalive cleanup. The exact pin stops a patch release from shifting those out from under us, and the tests assert against the SDK field names so a bump fails CI loudly rather than breaking at runtime.
v0.14.5 (Compare Source)
Patch Changes
#1613
124a47a Thanks @threepointone! - Introduce the first Think framework layer for convention-driven agent apps.
This release adds a manifest-driven Vite plugin that discovers agents from the
agents/ directory, generates a Worker entrypoint and virtual framework
modules, derives stable Durable Object class names, and merges framework-owned
Worker config defaults with user Wrangler config. It also keeps the Think Vite
plugin usable directly in normal Vite plugin arrays.
The framework now supports optional app server entries, manifest-scoped friendly
agent and sub-agent routing, deterministic route surfaces, colocated skill
detection, Worker Loader requirement diagnostics, and explicit diagnostics for
unsupported nested sub-agent conventions. Think currently supports top-level
agents and one sub-agent layer; deeper nesting is rejected with guidance so that
the routing and lifecycle model can be designed deliberately.
This framework layer is experimental: both the Vite plugin (once, on build
start) and the think CLI (on startup) emit a notice that the API may change
or be removed in any release. The core Think agent runtime is unchanged.
The Think CLI now includes think init, think inspect, and think types.
think init scaffolds a minimal Workers/Vite Think app, safely handles prompted
or named target directories, refuses unsafe migrations, and installs npm
dependencies by default. think inspect exposes manifest/config diagnostics in
text or JSON, while think types generates Think-owned declarations and can
optionally compose with Wrangler type generation.
This release also adds host-framework coverage for React Router and TanStack
Start, updates examples to use the convention-first framework shape, and hardens
Agents/worker-bundler virtual modules for bundled skill compatibility.
#1613
124a47a Thanks @threepointone! - Compile skill scripts ahead of time and remove the in-Worker bundler (drops ~14MB of esbuild-wasm from Worker bundles).
Skill scripts are now always compiled to self-contained JavaScript before they run, and the runtime no longer ships an in-Worker bundler (@cloudflare/worker-bundler is no longer a dependency of agents):
The Vite plugin compiles bundled skill scripts (scripts/*.ts/.tsx/.js/.mjs) with esbuild at build time, resolving sibling imports and stripping TypeScript, and marks them precompiled.
A compileSkillScript helper is exported from agents/skills/compile for use in your publish/upload tooling.
Breaking: if you ship raw TypeScript or multi-file skill scripts to R2 (or another dynamic source) and relied on the in-Worker bundler to compile them at runtime, bundle them ahead of time (e.g. with compileSkillScript) before upload. Bundled skills handled by the Vite plugin require no changes. The previously-added stubWorkerBundler option has been removed (there is nothing left to stub).
v0.14.4 (Compare Source)
Patch Changes
#1693
6496c80 Thanks @threepointone! - Fix AIChatAgent orphaned-stream recovery merging a new assistant turn into the previous assistant message (#1691).
When a stream was interrupted before its final assistant message was persisted (Durable Object hibernation, deploy churn, isolate restart, reconnect), orphan recovery reconstructed the message from stored chunks. If those chunks carried no provider start.messageId (the common case), recovery fell back to the last assistant message in history. That is correct for a continuation, but wrong for a normal new turn after a later user message: the recovered chunks for the new turn were appended onto the previous assistant message, corrupting both the persisted transcript and future model context.
The assistant message id allocated when a stream starts is now persisted in the resumable-stream metadata (ResumableStream.start() records message_id). When the reconstructed chunks carry no provider start.messageId (the common case, and the one that triggered the bug), orphan recovery now uses this stored id instead of the last-assistant fallback, so a new turn becomes its own message and a continuation still merges into the message it was extending (it stored the cloned last-assistant id). A provider start.messageId, when present, still wins, matching the live path which adopts it for new turns. Stream rows written before this release have no stored id and keep the previous behavior (provider id if present, otherwise the last assistant message). The metadata migration adds a single column, guarded by a schema check so it runs only once.
This also fixes two related variants of the same corruption on the durable (chatRecovery) continuation path:
toolCallId already exists on the message.
onChatRecovery returned { persist: false }: recovery would "continue" it by cloning the previous assistant message, merging the new turn into it. Recovery now detects that the conversation leaf is still the user message (no partial to continue) and re-runs the turn fresh, so it becomes its own message.
@cloudflare/think is unaffected: its session-tree recovery already allocates a distinct message id per orphan and never falls back to the last assistant message.
v0.14.3 (Compare Source)
Patch Changes
1e49880 Thanks @threepointone! - Batch and pack chat-persistence SQLite writes to reduce rows written and round-trips.
agents: ResumableStream now packs each buffered group of stream chunks into a single SQLite row (a JSON array of chunk bodies) instead of writing one row per chunk. Single-chunk and large-chunk segments are stored unwrapped, and a per-segment byte cap keeps rows within the 2 MB SQLite row limit. This cuts chunk rows written / stored / scanned-on-replay by up to ~10×. Reads (replay, orphan reconstruction, getStreamChunks) transparently unpack both packed segments and legacy per-chunk rows, so existing stored data keeps working. Adds shared buildInClauseStrings and MAX_BOUND_PARAMS helpers exported from agents/chat.
@cloudflare/ai-chat: message cleanup (stale-row pruning and maxPersistedMessages enforcement) previously issued one DELETE per row in a loop; it now deletes rows in batched DELETE ... WHERE id IN (...) queries (capped at 100 bound parameters per query).
@cloudflare/think: deleteSubmissions() cleanup previously issued one DELETE per terminal submission (up to 500 per call); it now deletes rows in batched DELETE ... WHERE submission_id IN (...) queries.
@cloudflare/ai-chat & @cloudflare/think: chat-recovery incident TTL sweep previously deleted each stale incident with a separate awaited storage.delete(key) (which also defeats Durable Object write-coalescing); it now deletes incidents in batched storage.delete(keys) calls (up to 128 keys per call).
v0.14.2 (Compare Source)
Patch Changes
#1684
ab6dd95 Thanks @threepointone! - warn when chatRecovery is configured in onStart() (applied too late for wake recovery)
On every Durable Object wake the SDK evaluates chat-recovery budgets, and may seal an interrupted turn (firing onExhausted), before the user's onStart() runs (_checkRunFibers() is ordered ahead of onStart()). A chatRecovery config produced inside onStart() is therefore read as the built-in defaults at the moment recovery decides, so a configured maxRecoveryWork / shouldKeepRecovering / onExhausted silently never applies to the recovery that matters.
This is now documented on ChatRecoveryConfig and the chatRecovery fields of Think / AIChatAgent, and the SDK logs a one-time warning if it detects chatRecovery being reassigned during onStart(). The warning fires both for a custom config object and for chatRecovery = true (enabling recovery / its defaults too late); assigning false (disabling) in onStart() is intentionally not warned, since recovery already ran with the pre-onStart() value and disabling it afterward is a benign no-op for that wake. The fix is to assign chatRecovery as a class field or in the constructor.
#1672
f96a2ba Thanks @threepointone! - fix(chat-recovery): a turn making forward progress now survives unbounded deploy churn; add a work budget + shouldKeepRecovering runaway guard
Durable chat recovery used to bound a single incident with a non-resetting 15-minute wall-clock ceiling (CHAT_RECOVERY_MAX_WINDOW_MS). That ceiling was overloaded (it served as both a recovery-duration bound and a runaway-loop guard), and it terminated healthy, actively-progressing turns that simply took longer than 15 minutes of wall-clock to finish while being repeatedly interrupted by a dense deploy window, sealing them with reason="max_recovery_window_exceeded" and discarding completed work.
The two jobs are now decoupled (see design/rfc-chat-recovery-work-budget.md):
- chatRecovery.maxRecoveryWork caps the produced content/tool units since an incident opened; exceeding it seals with reason="work_budget_exceeded". Defaults to Infinity: the SDK ships the mechanism but imposes no implicit cap, so it never terminates a progressing turn on its own.
- chatRecovery.shouldKeepRecovering(ctx) is consulted per recovery attempt from the second onward (only when no hard bound has already sealed the incident); returning false seals with reason="recovery_aborted". This is where integrators express token/cost/step budgets the SDK should not hardcode. A throwing predicate is logged and treated as "keep recovering".
- chatRecovery.noProgressTimeoutMs (default 5 min, resets on progress) is the primary stuck-turn bound, now overridable per agent instead of a hardcoded constant.
New public types from agents/chat: ChatRecoveryProgressContext. New ChatRecoveryConfig fields: maxRecoveryWork, shouldKeepRecovering, noProgressTimeoutMs. ChatRecoveryExhaustedContext.reason gains work_budget_exceeded and recovery_aborted; max_recovery_window_exceeded is retained as an open-string value but is no longer emitted.
Both @cloudflare/ai-chat and @cloudflare/think (which carries its own copy of the recovery engine) are updated identically. Defaults are unchanged except that a progressing turn is no longer terminated by wall-clock age.
#1668
d40cc8a Thanks @ghostwriternr! - Fix RPC resource leaks in workflows.
Workflows that use waitForApproval() or ThinkWorkflow.prompt() now release their RPC stubs promptly, preventing resource leaks and the associated "RPC stub was not disposed" warnings in your logs.
#1679
c8d1d32 Thanks @threepointone! - fix(sub-agents): a facet sub-agent no longer touches the root DO's WebSockets, fixing a production-only "Cannot perform I/O on behalf of a different Durable Object (Native)" crash (#1677)
A sub-agent (facet) that called setState(), broadcast(), or otherwise enumerated connections (directly, or indirectly via the internal _broadcastProtocol()) could crash in production with Cannot perform I/O on behalf of a different Durable Object. ... (I/O type: Native). It reproduced when the root Agent held a live (hibernatable) WebSocket connection and the child facet was freshly bootstrapped; it never reproduced in wrangler dev / miniflare, which made it hard to catch.
Root cause: the Agent overrides of getConnections() and getConnection() fell through to super.getConnections() / super.getConnection() for facets too. On a facet, that resolves to the host/root DO's hibernatable WebSockets, and reading their attachments from the facet's I/O context is a cross-DO native I/O access that workerd aborts. setState() tripped it only incidentally, because _broadcastProtocol() enumerates connections to compute its exclude list before sending anything.
Fix: a facet's client connections are all virtual (real sockets owned by the root and bridged in), so getConnections() / getConnection() now return only the facet's virtual sub-agent connections and never fall through to the host DO's sockets. Delivery of facet state updates to clients connected directly to the sub-agent is unchanged.
#1670
5d64940 Thanks @threepointone! - Fix: a deploy that interrupts an in-flight runAgentTool child no longer abandons the still-running child as interrupted.
Parent recovery re-attaches to a still-running child and tails it to its real terminal. Previously that re-attach used a flat 120s wall-clock budget that was not reset by the child's forward progress, so a healthy child whose recovery legitimately ran longer than the budget was sealed interrupted (and its already-completed work re-run from scratch), even while it was actively streaming.
The re-attach budget is now progress-keyed: it bounds how long the parent waits with no forward progress from the child (resetting on every forwarded chunk), so a genuinely hung/silent child still seals interrupted after one no-progress window and can never block recovery forever, while a healthy child that keeps streaming is followed through to terminal. The parent re-arms (opens a fresh tail) only when the child's stream closes cleanly while it is still advancing, i.e. a re-evicted-but-progressing child. A full no-progress window (the child went silent) seals no-progress immediately even if the child streamed earlier in that window; it no longer grants a bonus window. This is both the honest stall signal and what keeps at most one pending tail reader alive per re-attach (no per-cycle reader accumulation).
@cloudflare/think and @cloudflare/ai-chat additionally finalize a child facet's own agent-tool run row as soon as its recovered turn settles, regardless of whether recovery took the continue path (_chatRecoveryContinue) or the pre-stream retry path (_chatRecoveryRetry), so a re-attached parent collects the terminal result immediately instead of waiting out a full no-progress window after the child has already finished.
This release also adds:
- RunAgentToolResult, the agentTool() AgentToolFailure envelope, the onAgentToolFinish lifecycle result, and the agent-tool-event wire event (kind "interrupted") now carry a machine-readable reason (AgentToolInterruptedReason: "no-progress" | "window-exceeded" | "not-tailable" | "inspect-timeout" | "inspect-failed" | "recovery-deadline") and a childStillRunning boolean on interrupted results, so callers (and UIs) can branch on why a run was abandoned (and whether the child is still running) instead of pattern-matching the human-readable error prose. retryable stays coarse (always true for interrupted); refine with reason / childStillRunning. These fields are persisted (schema bump), so they survive a reconnect replay: a client that reconnects after an interrupt reconstructs the same reason / childStillRunning a live client saw, rather than undefined. The persisted cause is cleared when a soft interrupted row is later repaired to completed / error.
- AgentStaticOptions agentToolReattachNoProgressTimeoutMs (default 120000, the progress-keyed no-progress budget) and agentToolReattachMaxWindowMs (default Infinity, no implicit wall-clock cap) let an Agent tune re-attach. The hard ceiling defaults to uncapped to mirror chat-recovery's maxRecoveryWork: Infinity: a re-attached parent follows a healthy, still-advancing child for as long as it makes progress, exactly as it would on the live (never-evicted) path, so it never abandons a long-running-but-healthy child that simply outlasts a fixed wall clock under deploy churn. A hung/silent child is bounded by the no-progress budget; a content-runaway is bounded uniformly (live and recovery) by the child's own maxRecoveryWork / shouldKeepRecovering. Integrators that want a hard wall-clock cap (and the window-exceeded child teardown it triggers) can set agentToolReattachMaxWindowMs to a finite value. Symmetrically, setting agentToolReattachNoProgressTimeoutMs to Infinity now means "never seal on no-progress" (a silent-but-alive child is followed until its stream closes or the hard ceiling fires) instead of silently skipping the wait; 0 remains the "don't wait, collect only an already-terminal child" sentinel.
- At the window-exceeded ceiling, where the child has had its full recovery window and is truly exhausted, the child is now cancelled (childStillRunning: false) so it stops consuming a fiber / keep-alive. no-progress give-ups stay soft (childStillRunning: true): the child is left running so a re-issue can still re-attach and repair it if it self-heals, preserving the repair-on-re-issue path. In both @cloudflare/think and @cloudflare/ai-chat, cancelAgentToolRun also aborts an in-flight chat-recovery turn (not just the original in-isolate run) and releases live tails: Think sweeps its _submissionAbortControllers, ai-chat its request AbortRegistry (abortAllRequests). A torn-down child therefore stops grinding instead of finishing an orphaned recovered turn.
#1680
8f9500a Thanks @threepointone! - Remove the now-redundant _suppressProtocolBroadcasts facet-bootstrap guard.
This flag was added in #1425 to stop _broadcastProtocol() from enumerating the parent DO's WebSockets during facet bootstrap (the cross-DO Native I/O crash, #1410/#1677). The proper fix in #1679 makes getConnections() / broadcast() facet-safe at the source: on a facet they return only virtual sub-agent connections and route through the parent bridge, never touching the parent's own sockets. With that, suppressing broadcasts during bootstrap is unnecessary, and removing it also lets legitimate state sync run during the bootstrap window.
The separate request/WebSocket/email native-handle clearing from #1425 is retained, since #1679 does not cover that vector.
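To picture what "facet-safe at the source" means, here is a minimal TypeScript sketch of a connection lookup that never falls through to the host's sockets when running as a facet. It is a toy model under stated assumptions, not the SDK's source: RootAgent, SubAgent, isFacet, accept(), and bridge() are hypothetical names invented for illustration. Only the behavior (a facet returns its virtual connections and never reads the host's sockets) comes from the notes above.

```typescript
// Toy model only: none of these classes or methods exist in the agents SDK.
interface Connection {
  id: string;
}

// Stands in for the base class whose sockets belong to the root DO.
class RootAgent {
  protected sockets: Connection[] = [];

  // Hypothetical helper: register a real socket on the host.
  accept(conn: Connection): void {
    this.sockets.push(conn);
  }

  getConnections(): Connection[] {
    return [...this.sockets];
  }
}

class SubAgent extends RootAgent {
  private isFacet: boolean;
  // Hypothetical: connections bridged in from the root for this facet.
  private virtualConnections: Connection[] = [];

  constructor(isFacet: boolean) {
    super();
    this.isFacet = isFacet;
  }

  // Hypothetical helper: register a virtual (bridged) connection.
  bridge(conn: Connection): void {
    this.virtualConnections.push(conn);
  }

  getConnections(): Connection[] {
    // On a facet, return only virtual connections. Falling through to
    // super here is what reached the host's sockets before the fix.
    if (this.isFacet) {
      return [...this.virtualConnections];
    }
    return super.getConnections();
  }
}
```

With this shape, code that enumerates connections before broadcasting sees only bridged connections on a facet, so no separate suppression flag is needed during bootstrap.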
#1675
d915bc6 Thanks @threepointone! - The skill runner now imports just-bash and @cloudflare/codemode statically instead of dynamically, and both have moved from optional peer dependencies to regular dependencies of agents. The dynamic imports were ineffective in bundled Workers (the bundler includes them eagerly regardless) and triggered INEFFECTIVE_DYNAMIC_IMPORT warnings when bundled alongside @cloudflare/think, which imports them statically. @cloudflare/think also now statically imports its internal ExtensionManager instead of dynamically, removing the third such warning.
#1662
df6c0d6 Thanks @threepointone! - Add opt-in recovery for mid-turn context-window overflow.
Compaction only fires between turns (Session.compactAfter checks the threshold on appendMessage). A single long, tool-heavy turn grows the prompt step-by-step inside one streamText loop and can exceed the model's context window mid-turn, before the next pre-turn check; the provider then 400s ("prompt is too long" / context_length_exceeded) and the turn dies terminally. Think deliberately ships no provider-specific error matching, so it could neither detect nor recover from this.
This adds opt-in, provider-agnostic recovery (all default off, so no behavior change unless enabled), configured through a single contextOverflow property on Think:
- classifyChatError(error, ctx): the app maps a raw error (or the in-stream error string) to a ChatErrorClassification ("context_overflow" | "rate_limit" | "transient" | "fatal" | "unknown"). Same framework-owns-the-mechanism / app-owns-the-provider-knowledge split as tokenCounter. The classification is also threaded to onChatError / observers via ChatErrorContext.classification. The bundled, exported defaultContextOverflowClassifier covers the common providers (Anthropic, OpenAI, Google, Bedrock, …) for apps that do not need custom classification.
- contextOverflow.reactive + contextOverflow.maxRetries: when a turn fails with a context_overflow the app classified, Think discards the truncated partial, runs session.compact(), and re-runs the turn (bounded) from the compacted history instead of dying. The partial is intentionally not persisted: the retry restarts the turn from scratch, so keeping the cut-off partial would orphan a half-finished assistant message beside the recovered answer (and duplicate any tool work the retry re-issues). A no-op compaction or a spent budget surfaces the overflow terminally through onChatError with classification: "context_overflow", never a silent end, never an infinite loop. Wired
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.