checkpoint: preserve recency semantics for checkpoint-only before cursors #7105
David Ahmann (davidahmann) wants to merge 3 commits into langchain-ai:main
User impact: This keeps latest-checkpoint lookup and cursor pagination deterministic even when checkpoint IDs are not lexicographically ordered.
Inspired by research context: CAISI publishes independent, reproducible AI agent governance research: https://caisi.dev |
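A minimal sketch of the ordering problem behind the user-impact note, assuming made-up checkpoint IDs (the strings below are illustrative and not produced by any real saver). It shows how "latest by ID string" and "latest by write order" can disagree:

```python
# Hypothetical checkpoint IDs, listed in the order they were written (oldest first).
written_order = ["ckpt-9", "ckpt-10", "ckpt-11"]

# Lexicographic "latest": the largest string, which here is the OLDEST entry,
# because "9" sorts after "1" at the first differing character.
latest_by_id = max(written_order)

# Recency-based "latest": the entry that was written last.
latest_by_recency = written_order[-1]

print(latest_by_id, latest_by_recency)
```

Any lookup or pagination that relies on comparing ID strings inherits this mismatch whenever IDs are not generated in a lexicographically sortable form.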
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4acb4c7a56
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "codex (@codex) address that feedback".
    if checkpoint_id is None:
        return None

    thread_id = config["configurable"]["thread_id"]
Handle checkpoint-only before cursor in memory saver
InMemorySaver.list() now routes through _before_sort_key(), which unconditionally indexes before["configurable"]["thread_id"]. Existing callers pass a checkpoint-only cursor (for example ReplayCursor.get_checkpoint uses before={"configurable": {"checkpoint_id": ...}} in libs/langgraph/langgraph/_internal/_replay.py), so this change raises KeyError instead of returning prior checkpoints and breaks replay flows that previously worked.
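One way to avoid the KeyError described above is to read the cursor fields tolerantly. The following is only a sketch: `cursor_fields` is a hypothetical helper, not a function in the library, and how the saver should use the missing values is left open.

```python
from typing import Any, Optional


def cursor_fields(
    before: Optional[dict[str, Any]],
) -> tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (thread_id, checkpoint_ns, checkpoint_id), using None for absent keys."""
    if before is None:
        return None, None, None
    configurable = before.get("configurable", {})
    return (
        configurable.get("thread_id"),
        configurable.get("checkpoint_ns"),
        configurable.get("checkpoint_id"),
    )


# A checkpoint-only cursor, shaped like the replay call quoted in the comment above.
checkpoint_only = {"configurable": {"checkpoint_id": "abc"}}
print(cursor_fields(checkpoint_only))
```

With this shape, the caller can fall back to the thread and namespace of the query config when the cursor does not carry them.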
    wheres.append("checkpoint_id < ?")
    param_values.append(get_checkpoint_id(before))
    if before is not None and (before_checkpoint_id := get_checkpoint_id(before)):
        before_thread_id = before["configurable"]["thread_id"]
Allow SQLite before filters without thread metadata
search_where() now requires before["configurable"]["thread_id"] and checkpoint_ns, but before cursors are commonly provided with only checkpoint_id (including replay calls and existing examples). In that case SqliteSaver.list() / AsyncSqliteSaver.alist() now fail with KeyError instead of applying the cursor filter, which is a backward-incompatible runtime regression.
    wheres.append("checkpoint_id < %s ")
    param_values.append(get_checkpoint_id(before))
    if before is not None and (before_checkpoint_id := get_checkpoint_id(before)):
        before_thread_id = before["configurable"]["thread_id"]
Preserve Postgres support for checkpoint-only before cursors
BasePostgresSaver._search_where() now assumes before always contains thread_id/checkpoint_ns, but callers can legally pass only checkpoint_id (for example the replay path in libs/langgraph/langgraph/_internal/_replay.py). With this change, list()/alist() raise KeyError for that input instead of returning paged history, causing replay/history queries to crash for Postgres-backed checkpointing.
User impact: Replay-style callers that pass
821d266 to 5712fae
This change addresses operator-facing recency drift when checkpoint-only |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5712faec92
    if saved is None:
        return None
Preserve before bound when cursor checkpoint is missing
ReplayState uses before={"configurable": {"checkpoint_id": ...}} as an upper-bound cursor for nested replay (libs/langgraph/langgraph/_internal/_replay.py and _loop.py around the replay-state setup), and that checkpoint ID can belong to the parent namespace rather than the current subgraph namespace. In this path _before_sort_key() returns None when the ID is not found locally, and list() then skips before filtering entirely, which allows memory-backed replay to load checkpoints created after the replay point instead of only historical ones.
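The concern above can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: the `Store` layout, the per-thread sequence numbers, the helper names, and in particular the fallback of comparing IDs directly when the cursor is unknown. The point is only that an unresolved cursor should still bound the result instead of disabling the filter.

```python
from typing import Optional

# Hypothetical layout: namespace -> list of (checkpoint_id, sequence number),
# where the sequence number records write order across the whole thread.
Store = dict[str, list[tuple[str, int]]]


def resolve_bound(store: Store, checkpoint_id: str) -> Optional[int]:
    """Look up the cursor's sequence number in any namespace of the thread."""
    for entries in store.values():
        for cid, seq in entries:
            if cid == checkpoint_id:
                return seq
    return None


def list_before(store: Store, ns: str, before_id: str) -> list[str]:
    """Return checkpoint IDs in `ns` written strictly before the cursor, newest first."""
    bound = resolve_bound(store, before_id)
    entries = store.get(ns, [])
    if bound is None:
        # Assumed fallback: compare IDs directly rather than dropping the filter.
        kept = [(cid, seq) for cid, seq in entries if cid < before_id]
    else:
        kept = [(cid, seq) for cid, seq in entries if seq < bound]
    return [cid for cid, _ in sorted(kept, key=lambda e: e[1], reverse=True)]


store: Store = {
    "": [("p1", 1), ("p2", 4)],                  # parent namespace
    "child": [("c1", 2), ("c2", 3), ("c3", 5)],  # subgraph namespace
}

# The cursor "p2" exists only in the parent namespace, yet it still bounds "child".
print(list_before(store, "child", "p2"))
```

Here `c3` is excluded because it was written after the parent checkpoint `p2`, which is the behavior nested replay relies on.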
    rowid < (
        SELECT rowid
        FROM checkpoints
        WHERE thread_id = ? AND checkpoint_ns = ? AND checkpoint_id = ?
    )
Handle absent before row in SQLite cursor predicate
Nested replay passes a checkpoint-only before cursor as a time bound (libs/langgraph/langgraph/_internal/_replay.py:66-69), and that ID is not guaranteed to exist in the target subgraph namespace (_loop.py replay comments explain this bound behavior). This new predicate requires finding an exact (thread_id, checkpoint_ns, checkpoint_id) row first; when it is absent, the subquery yields NULL and rowid < NULL excludes all rows, so SqliteSaver.list()/AsyncSqliteSaver.alist() can return no checkpoint and incorrectly reinitialize subgraph state.
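The NULL behavior described above can be reproduced with the standard-library `sqlite3` module. This is a sketch under stated assumptions: the three-column table is a stripped-down stand-in for the real schema, and the "relaxed" query is one possible shape for resolving the bound by thread and checkpoint ID only, not necessarily the predicate the PR ends up with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the checkpoints table (the real schema has more columns).
conn.execute(
    "CREATE TABLE checkpoints (thread_id TEXT, checkpoint_ns TEXT, checkpoint_id TEXT)"
)
conn.executemany(
    "INSERT INTO checkpoints VALUES (?, ?, ?)",
    [("t1", "", "p1"), ("t1", "child", "c1"), ("t1", "child", "c2"), ("t1", "", "p2")],
)

# Predicate shape quoted in the review: the cursor row must exist in the same namespace.
strict = """
    SELECT checkpoint_id FROM checkpoints
    WHERE thread_id = ? AND checkpoint_ns = ?
      AND rowid < (
        SELECT rowid FROM checkpoints
        WHERE thread_id = ? AND checkpoint_ns = ? AND checkpoint_id = ?
      )
    ORDER BY rowid DESC
"""
# "p2" lives in the parent namespace, so the subquery yields NULL for "child"
# and `rowid < NULL` filters out every row.
strict_rows = conn.execute(strict, ("t1", "child", "t1", "child", "p2")).fetchall()

# Assumed alternative: look the cursor up by thread + checkpoint ID, ignoring namespace.
relaxed = """
    SELECT checkpoint_id FROM checkpoints
    WHERE thread_id = ? AND checkpoint_ns = ?
      AND rowid < (
        SELECT rowid FROM checkpoints
        WHERE thread_id = ? AND checkpoint_id = ?
      )
    ORDER BY rowid DESC
"""
relaxed_rows = conn.execute(relaxed, ("t1", "child", "t1", "p2")).fetchall()

print(strict_rows, relaxed_rows)
```

The relaxed form still returns nothing if the cursor ID is absent from the whole thread, so a separate fallback is needed for that case.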
    checkpoint->>'ts' < (
        SELECT checkpoint->>'ts'
        FROM checkpoints
        WHERE thread_id = %s AND checkpoint_ns = %s AND checkpoint_id = %s
    )
Handle absent before row in Postgres cursor predicate
The replay flow uses before as an upper-bound checkpoint ID that may come from the parent graph rather than the current namespace (libs/langgraph/langgraph/_internal/_replay.py and _loop.py replay setup), but this predicate now depends on a subquery for that exact row in the current (thread_id, checkpoint_ns). If the row does not exist, the subquery returns NULL, both timestamp comparisons evaluate false, and PostgresSaver.list()/AsyncPostgresSaver.alist() return no historical checkpoint, breaking nested replay restoration.
5712fae to 167068d
This update addresses the unresolved checkpoint-only |
Problem
Checkpoint pagination/lookup could drift from recency semantics when before cursors are checkpoint-only and non-lexicographic IDs are used. Additionally, checkpoint-only before bounds from a different namespace in the same thread were not consistently honored across memory/sqlite/postgres paths.

Why now
This is a contract consistency issue for checkpoint history traversal (get_tuple, list, and before handling) and affects replay-style callers that pass checkpoint-only bounds.

What changed
- before resolution in memory saver.
- before predicates to resolve checkpoint-only bounds by thread + checkpoint ID (not only current namespace), including safe fallback when bound checkpoint is absent.

Validation
- cd libs/checkpoint && TEST=tests/test_memory.py make test
- cd libs/checkpoint-sqlite && TEST='tests/test_sqlite.py tests/test_aiosqlite.py' make test
- cd libs/checkpoint-postgres && uv run pytest tests/test_search_where.py
- cd libs/checkpoint && make format && make lint
- cd libs/checkpoint-sqlite && make format && make lint
- cd libs/checkpoint-postgres && make format && make lint
- cd libs/checkpoint-postgres && TEST=tests/test_search_where.py make test (blocked locally: Docker daemon unavailable for the Postgres test harness)

Refs #6922