PraisonAI PR #2155 (closes issue #2153) added integrity checks to the gateway's resume-after-disconnect protocol. These are wire-level additions to the joined acknowledgement and to every outbound event — they are not yet documented in PraisonAIDocs. Client integrators reading docs/features/gateway.mdx today would still believe a resumed: true ack means they are caught up, when in fact the gateway can now explicitly signal that they must drop local state and resync from a fresh snapshot.
The PR added:
oldest_cursor on the joined ack — floor of what can be replayed (oldest event still in the buffer).
resync_required: bool on the joined ack — true when the requested since cursor is below oldest_cursor (i.e. the event buffer was trimmed past it). Tells the client: drop local state, take the snapshot.
New snapshot message type — when resync_required=true, the gateway sends a {"type": "snapshot", "state": {...}} frame instead of a partial replay stream. The snapshot carries session_id, agent_id, state, full messages history, and current event_cursor.
Top-level seq field on every delivered event — monotonic sequence so clients can detect a mid-stream gap cheaply (without digging into nested event.data.cursor). Applied to response, message, stream_end, error, token_stream, and tool_call_stream event types.
since parameter validation — non-integer since now returns an {"type": "error", "message": "Invalid 'since' cursor. Must be an integer."} frame instead of being silently ignored.
Three new session-level helpers (get_oldest_cursor(), check_resync_required(), get_snapshot()) drive the above on the server.
These additions are backward-compatible (additive only) — older clients that ignore the new fields keep working — but new clients should read them to avoid silent data loss.
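A minimal sketch of how a client might branch on the new ack fields, assuming the ack has already been decoded from JSON into a dict. The helper name `plan_after_join` and its return labels are hypothetical; only the field names come from the list above. The default for a missing `resync_required` covers an older gateway that does not send it.

```python
from typing import Any, Dict


def plan_after_join(ack: Dict[str, Any]) -> str:
    """Return "snapshot" if local state must be dropped, else "replay".

    Hypothetical helper: only the field names are taken from the
    joined ack described above; everything else is illustrative.
    """
    if ack.get("type") != "joined":
        raise ValueError("not a joined ack")
    # An older gateway may omit resync_required; treat that as False.
    if ack.get("resync_required", False):
        return "snapshot"
    return "replay"


old_style = {"type": "joined", "resumed": True, "cursor": 40}
trimmed = {"type": "joined", "resumed": True, "cursor": 2400,
           "oldest_cursor": 400, "resync_required": True}
print(plan_after_join(old_style))
print(plan_after_join(trimmed))
```

Note that `resumed` is true in both sample acks, which is why it cannot be used on its own to decide whether local state is still valid.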
Where to put the docs
Per AGENTS.md §1.8, AI agents must not touch docs/concepts/. All new content goes under docs/features/.
There are two reasonable shapes. Pick option A unless the maintainer signals otherwise — it keeps the resume contract co-located with the gateway page that already explains sessions and events.
Option A (recommended) — extend docs/features/gateway.mdx
Add a new top-level section ## Resume Integrity to docs/features/gateway.mdx, placed after ## Event Types and before ## Common Patterns. Content guidance below.
Also:
Add an oldest_cursor row + a resync_required row + a snapshot row to the existing event/message reference area.
Add a <Tip> near the top of the page pointing at the new section: "Reconnecting after a disconnect? See Resume Integrity before trusting a resumed: true ack."
Option B (only if section grows past ~150 lines) — new page docs/features/gateway-resume-integrity.mdx
If the content is large enough to deserve its own page, create docs/features/gateway-resume-integrity.mdx and add it to docs.json next to the other docs/features/gateway* entries (same Features group as gateway, gateway-overview, gateway-cli, gateway-error-handling). Then leave a one-line <Tip> + Related card on gateway.mdx pointing at it.
Do not modify docs/concepts/, docs/sdk/reference/** (auto-generated), docs/js/, or docs/rust/.
SDK ground truth (read before writing)
Per AGENTS.md §1.2 "SDK-First Documentation Cycle", read these files first — the doc must mirror them exactly:
| What | Path in MervinPraison/PraisonAI |
| --- | --- |
| Protocol envelope: wire-format docstring for joined/snapshot/seq | src/praisonai-agents/praisonaiagents/gateway/protocols.py (around the GatewayEvent dataclass; the PR added a "Wire Protocol Extensions" + "Resume Protocol" block to the docstring) |
| Session helpers get_oldest_cursor, check_resync_required, get_snapshot | src/praisonai/praisonai/gateway/server.py (on the GatewaySession class, around the add_event/get_events_since block, ~lines 127–170 after the PR) |
| since validation, oldest_cursor/resync_required emission, snapshot-vs-replay branch | src/praisonai/praisonai/gateway/server.py (_handle_client_message → join branch, ~lines 1347–1430 after the PR) |
| Top-level seq emission on outbound events | src/praisonai/praisonai/gateway/server.py (_send_to_client, ~lines 1626–1660 after the PR); note the expanded set of event types that now trigger session tracking (response, message, stream_end, error, token_stream, tool_call_stream) |
| Resume helper note about callers having to check check_resync_required | src/praisonai/praisonai/gateway/server.py (resume_or_create_session docstring, ~line 1938 after the PR) |
| Default buffer bound (why this matters at all) | src/praisonai-agents/praisonaiagents/gateway/config.py (max_messages = 1000; buffer rolls past ~2000 events) |
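To see why the default buffer bound noted above matters, here is a small sketch of a bounded buffer rolling past a client's last cursor. It uses a plain `collections.deque` with a hypothetical capacity of 2000, which only echoes the "~2000 events" figure; the gateway's real trimming policy should be read from config.py and server.py, not inferred from this.

```python
from collections import deque


def oldest_after(total_events: int, capacity: int) -> int:
    """Oldest cursor left in a bounded buffer after `total_events` appends.

    Illustrative only: cursors are assumed to start at 1 and the
    buffer is assumed to drop its oldest entry once full.
    """
    if total_events < 1:
        raise ValueError("need at least one event")
    buf = deque(maxlen=capacity)
    for cursor in range(1, total_events + 1):
        buf.append(cursor)
    return buf[0]


client_since = 400
oldest = oldest_after(total_events=2500, capacity=2000)
print(oldest, client_since < oldest)
```

With 2500 events and room for 2000, the oldest retained cursor is 501, so a client that last saw cursor 400 falls below the floor and would need a snapshot rather than a replay.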
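Required content for the Resume Integrity section
Follow AGENTS.md §2 "Page Structure Template" for tone/layout. Concrete guidance below.
One-sentence intro
Something like: "When the gateway's event buffer has rolled past your last cursor, the joined ack now tells you so, and hands you a full snapshot instead of a partial replay, so a reconnecting client can never silently miss events."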
Hero Mermaid (use AGENTS.md §3.1 colors, white text)
Suggested — a decision diagram covering both happy and resync paths:
graph TB
Join[📨 join + since=N] --> Check{🔍 N < oldest_cursor?}
Check -->|No, in buffer| Replay[📜 joined + replay events N+1..head]
Check -->|Yes, trimmed| Resync[⚠ joined + resync_required=true]
Resync --> Snapshot[📦 snapshot with full state]
Replay --> Done[✅ Client up to date]
Snapshot --> Done
classDef request fill:#6366F1,stroke:#7C90A0,color:#fff
classDef decision fill:#F59E0B,stroke:#7C90A0,color:#fff
classDef happy fill:#10B981,stroke:#7C90A0,color:#fff
classDef warn fill:#8B0000,stroke:#7C90A0,color:#fff
class Join request
class Check decision
class Replay,Done happy
class Resync,Snapshot warn
How the resume contract works now (sequence diagram)
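Two sequence diagrams (or one with an alt block). Both start with a client sending {"type": "join", "agent_id": "support", "since": N}.
Happy path (since still in buffer):
Server replies with {"type": "joined", "cursor": HEAD, "oldest_cursor": OLDEST, "resync_required": false, "sequence": ..., ...}.
Server sends {"type": "replay", "event": {...}, "seq": <cursor>} frames for each missed event.
Resync path (since < oldest_cursor):
Server replies with {"type": "joined", "cursor": HEAD, "oldest_cursor": OLDEST, "resync_required": true, ...}.
Server sends a single {"type": "snapshot", "state": {session_id, agent_id, state, messages, event_cursor}} frame.
Client discards local state and rebuilds from state.messages + state.event_cursor.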
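A toy model of the join branch may help when drawing the diagrams. It is a sketch under stated assumptions, not the gateway's code: `frames_for_join` is a hypothetical name, events are reduced to dicts with a `cursor` key, and the resync test mirrors the `since < oldest_cursor` rule as worded in this issue.

```python
from typing import Any, Dict, List


def frames_for_join(since: int, buffered: List[Dict[str, Any]],
                    snapshot_state: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Frames a server would send for a join, in order (toy model).

    `buffered` holds {"cursor": int, ...} events, oldest first.
    """
    head = buffered[-1]["cursor"] if buffered else since
    oldest = buffered[0]["cursor"] if buffered else since
    resync = since < oldest
    ack = {"type": "joined", "cursor": head, "oldest_cursor": oldest,
           "resync_required": resync}
    if resync:
        # Buffer was trimmed past `since`: one snapshot, no replay.
        return [ack, {"type": "snapshot", "state": snapshot_state}]
    replay = [{"type": "replay", "event": ev, "seq": ev["cursor"]}
              for ev in buffered if ev["cursor"] > since]
    return [ack] + replay


buf = [{"cursor": c} for c in range(5, 10)]   # cursors 5..9 still buffered
print([f["type"] for f in frames_for_join(7, buf, {})])
print([f["type"] for f in frames_for_join(3, buf, {"event_cursor": 9})])
```

The first call yields a joined ack followed by two replay frames (cursors 8 and 9); the second yields a joined ack followed by a single snapshot frame.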
joined ack — wire fields table
Extracted from _handle_client_message in server.py after the PR. Mark the new fields clearly:
| Field | Type | New in #2155 | Description |
| --- | --- | --- | --- |
| type | "joined" | | Frame type |
| session_id | str | | Session UUID; pass back on next reconnect |
| agent_id | str | | Agent the client joined |
| resumed | bool | | true if an existing session was resumed |
| cursor | int | | Current head cursor on the server |
| oldest_cursor | int | ✅ | Oldest event still in the replay buffer |
| resync_required | bool | ✅ | true when since < oldest_cursor; drop local state |
| sequence | int \| null | | Sequence aligned with replay events |
| protocol_version | int | | Negotiated protocol version |
| server_min_version | int | | |
| server_max_version | int | | |
| presence | list[dict] | | Presence snapshot |
| health | dict | | Gateway health |
snapshot frame — wire fields table
Extracted from GatewaySession.get_snapshot():
| Field | Type | Description |
| --- | --- | --- |
| type | "snapshot" | Frame type |
| state.session_id | str | Session UUID |
| state.agent_id | str | Agent ID |
| state.state | dict | Free-form session state |
| state.messages | list[dict] | Full message history (content, sender_id, session_id, message_id, timestamp, metadata) |
| state.event_cursor | int | Current event cursor; use as your next since |
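One way a client could apply such a frame is sketched below. `LocalSession` and `apply_snapshot` are hypothetical names for illustration; the only things taken from the table are the frame's field names and the advice to use `event_cursor` as the next `since`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class LocalSession:
    """Hypothetical client-side mirror of one gateway session."""
    messages: List[Dict[str, Any]] = field(default_factory=list)
    state: Dict[str, Any] = field(default_factory=dict)
    since: int = 0


def apply_snapshot(local: LocalSession, frame: Dict[str, Any]) -> None:
    """Discard local state and rebuild it from a snapshot frame."""
    if frame.get("type") != "snapshot":
        raise ValueError("expected a snapshot frame")
    snap = frame["state"]
    local.messages = list(snap["messages"])   # replace, do not merge
    local.state = dict(snap["state"])
    local.since = int(snap["event_cursor"])   # cursor for the next reconnect


local = LocalSession(messages=[{"content": "stale"}], since=12)
apply_snapshot(local, {
    "type": "snapshot",
    "state": {"session_id": "s-1", "agent_id": "support", "state": {},
              "messages": [{"content": "hi"}], "event_cursor": 2400},
})
print(local.since, len(local.messages))
```

Replacing rather than merging is the point: after a resync the snapshot is authoritative, so any locally cached messages are dropped.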
Top-level seq on every event
Document that, in addition to the cursor nested in event.data.cursor, every outbound frame of these types now carries a top-level seq:
response
message
stream_end
error
token_stream (new in #2155; previously not session-tracked)
tool_call_stream (new in #2155; previously not session-tracked)
Show a 3-line client snippet for cheap gap detection:
expected = last_seq + 1
if event["seq"] != expected:
    # gap: resume from last good cursor or request snapshot
    await client.resync(since=last_seq)
last_seq = event["seq"]
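The same check can be packaged so it runs without a live connection. This is a sketch, assuming (as the snippet above does) that consecutive events carry consecutive `seq` values; `SeqTracker` is a hypothetical helper, and what to do on a gap is left to the caller.

```python
from typing import Optional


class SeqTracker:
    """Tracks the last seen top-level seq (hypothetical helper)."""

    def __init__(self, last_seq: Optional[int] = None) -> None:
        self.last_seq = last_seq

    def observe(self, event: dict) -> bool:
        """Record event["seq"]; return True if a gap preceded it."""
        seq = event["seq"]
        gap = self.last_seq is not None and seq != self.last_seq + 1
        self.last_seq = seq
        return gap


tracker = SeqTracker()
for frame in ({"type": "message", "seq": 5},
              {"type": "token_stream", "seq": 6},
              {"type": "response", "seq": 9}):
    if tracker.observe(frame):
        print("gap before seq", frame["seq"])
```

The first event never counts as a gap because there is nothing to compare it with; a client that persists its last cursor across reconnects could seed `last_seq` with that value instead.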
since validation
A one-paragraph note: since must be an integer. The gateway now rejects non-integer values with {"type": "error", "message": "Invalid 'since' cursor. Must be an integer."} and drops the join. Useful for catching off-by-string-type bugs in client code.
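A client can catch the string-typed cursor bug before the frame leaves the process. The sketch below is deliberately strict and is an assumption about good client hygiene, not a description of the gateway: whether the server would accept a numeric string such as "42" is not stated here, so `require_int_since` and `build_join` (both hypothetical names) reject anything that is not a real int.

```python
import json


def require_int_since(value: object) -> int:
    """Return `value` if it is a genuine int cursor, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"since must be an int, got {type(value).__name__}")
    return value


def build_join(agent_id: str, since: object) -> str:
    """Serialize a join frame, validating the cursor first."""
    return json.dumps({"type": "join", "agent_id": agent_id,
                       "since": require_int_since(since)})


print(build_join("support", 42))
try:
    build_join("support", "42")   # cursor read back from storage as text
except TypeError as exc:
    print("rejected:", exc)
```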
User interaction flow (AGENTS.md §1.1 item 11)
Short narrative describing what the user actually sees:
Your Telegram bot stays connected through a 20-minute commute on flaky LTE. The session ID survives. On reconnect, normally the gateway just streams the dozen messages that arrived while you were offline. But if the buffer rolled (very long disconnect + busy session), the gateway sends a one-shot snapshot instead of a partial replay — your client UI rebuilds from authoritative state in a single round trip. Either way, no message is silently missed.
Best Practices (<AccordionGroup>)
3–4 accordions. Suggestions:
"Don't trust resumed: true alone — check resync_required." — Old clients that only read resumed will silently diverge from gateway state if the buffer rolls. New field is the explicit signal.
"Persist event_cursor from the snapshot as your next since." — After applying a snapshot, the next reconnect should pass since=state.event_cursor, not the old pre-snapshot value.
"Use the top-level seq for cheap mid-stream gap detection." — Cheaper than digging into event.data.cursor; same value, exposed at the envelope.
"Raise max_messages if your sessions are bursty." — Default is 1000 in praisonaiagents/gateway/config.py, so the buffer rolls after ~2000 events. Bump it (or expect resyncs) for high-volume sessions.
Required edits to the existing docs/features/gateway.mdx
(Only the additions — do not rewrite the existing page.)
Add a <Tip> near the top, right after the hero Mermaid:
"Reconnecting after a disconnect? See Resume Integrity — the joined ack now signals when you must take a snapshot instead of trusting partial replay."
Append the new ## Resume Integrity section between ## Event Types and ## Common Patterns (see content guidance above).
Extend the "Event Types" table with a note that token_stream and tool_call_stream are now also session-tracked (so they carry top-level seq and survive resume).
Extend the Related CardGroup with a card pointing at the new section anchor — or, in Option B, a card pointing at gateway-resume-integrity.mdx.
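docs.json change (only if Option B)
If we end up creating docs/features/gateway-resume-integrity.mdx, insert one line in the same Features group that already lists the other docs/features/gateway* pages:
  "docs/features/gateway",
  "docs/features/gateway-overview",
+ "docs/features/gateway-resume-integrity",
  "docs/features/gateway-cli",
  "docs/features/gateway-error-handling",
Verify the file remains valid JSON after editing (AGENTS.md §1.9 rule 8).
For Option A (extend gateway.mdx), no docs.json change is needed.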
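Out of scope
GatewayClient change in this PR: GatewayClient's gap callback already exists from #2131 (see open docs issue #701, "docs: add Gateway Reconnecting Client + Protocol Version Negotiation (PraisonAI PR #2131 / Issue #2130)"). Don't change GatewayClient's docs here; only describe the wire contract a client should now read.
TypeScript and Rust SDKs are not touched by #2155. Their reference pages under docs/sdk/reference/typescript/** and docs/sdk/reference/rust/** will be regenerated by the parity tooling; do not hand-edit them.
Migration guide for old clients is not required: the fields are additive and old clients keep working (they just lose the new safety). A <Note> in Best Practices flagging this is enough.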
Reference PR: MervinPraison/PraisonAI#2155
Reference issue: MervinPraison/PraisonAI#2153
The PR diff is the most direct view of the surface area: https://github.com/MervinPraison/PraisonAI/pull/2155/files
Do not invent fields. Document only what exists in those files.
Common Patterns (<Tabs> recommended)
[…] asyncio loop that handles joined, replay, snapshot correctly.
[…] resync_required, clears local state, rebuilds from state.messages.
[…] seq to detect mid-stream gaps and triggers resync().
Imports must stay friendly per AGENTS.md §6.1. (Don't introduce from praisonai.gateway.server import GatewaySession examples; that's internal.)
Related (<CardGroup cols={2}>)
docs/features/gateway.mdx (or the section anchor if doing Option A)
docs/features/push-notifications.mdx: sibling delivery story
[…] docs/features/gateway-client.mdx lands first) link to it as the client-side counterpart
Quality checklist for the agent picking this up
Before opening the PR, confirm every item from AGENTS.md §9:
[…] protocols.py, server.py, and config.py first; no guessed field names or types.
[…] oldest_cursor, resync_required, snapshot frame, top-level seq, since validation error.
joined ack table lists every field (existing + new) with the new ones flagged.
snapshot.state sub-fields match GatewaySession.get_snapshot() exactly (session_id, agent_id, state, messages, event_cursor).
[…] token_stream and tool_call_stream are now session-tracked.
since validation error message documented verbatim.
[…] classDef blocks.
[…] from praisonai.gateway.server import … in user-facing examples.
[…] your-key-here placeholders.
docs.json is still valid JSON; new slot sits inside the existing Features group (not Concepts, not auto-generated SDK reference).
[…] docs/concepts/, docs/sdk/reference/**, docs/js/, docs/rust/.
cc @MervinPraison