langchain-ai · Sydney Runkle (sydney-runkle) · Mar 9, 2026 · Mar 4, 2026 · Mar 4, 2026 · Mar 5, 2026
diff --git a/src/oss/langgraph/persistence.mdx b/src/oss/langgraph/persistence.mdx
@@ -4,7 +4,7 @@ title: Persistence
 
 
 
-LangGraph has a built-in persistence layer, implemented through checkpointers. When you compile a graph with a checkpointer, the checkpointer saves a `checkpoint` of the graph state at every super-step. Those checkpoints are saved to a `thread`, which can be accessed after graph execution. Because `threads` allow access to graph's state after execution, several powerful capabilities including human-in-the-loop, memory, time travel, and fault-tolerance are all possible. Below, we'll discuss each of these concepts in more detail.
+LangGraph has a built-in persistence layer that saves graph state as checkpoints. When you compile a graph with a checkpointer, a snapshot of the graph state is saved at every super-step and organized into threads. This enables human-in-the-loop workflows, conversational memory, time-travel debugging, and fault-tolerant execution.
 
 ![Checkpoints](/oss/images/checkpoints.jpg)
 
@@ -13,7 +13,19 @@ LangGraph has a built-in persistence layer, implemented through checkpointers. W
 When using the [Agent Server](/langsmith/agent-server), you don't need to implement or configure checkpointers manually. The server handles all persistence infrastructure for you behind the scenes.
 </Info>
 
-## Threads
+## Why use persistence
+
+Persistence is required for the following features:
+
+- **Human-in-the-loop**: Checkpointers facilitate [human-in-the-loop workflows](/oss/langgraph/interrupts) by allowing humans to inspect, interrupt, and approve graph steps. Checkpointers are needed for these workflows as the person has to be able to view the state of a graph at any point in time, and the graph has to be able to resume execution after the person has made any updates to the state. See [Interrupts](/oss/langgraph/interrupts) for examples.
+- **Memory**: Checkpointers allow for ["memory"](/oss/concepts/memory) between interactions. In repeated human interactions (like conversations), follow-up messages can be sent to the same thread, which retains its memory of previous ones. See [Add memory](/oss/langgraph/add-memory) for information on how to add and manage conversation memory using checkpointers.
+- **Time travel**: Checkpointers allow for ["time travel"](/oss/langgraph/use-time-travel), letting users replay prior graph executions to review and/or debug specific graph steps. In addition, checkpointers make it possible to fork the graph state at arbitrary checkpoints to explore alternative trajectories.
+- **Fault-tolerance**: Checkpointing provides fault tolerance and error recovery: if one or more nodes fail at a given super-step, you can restart your graph from the last successful step.
+- **Pending writes**: When a graph node fails mid-execution at a given [super-step](#super-steps), LangGraph stores pending checkpoint writes from any other nodes that completed successfully at that super-step. When you resume graph execution from that super-step, you don't re-run the successful nodes.
+
+## Core concepts
+
+### Threads
 
 A thread is a unique ID or thread identifier assigned to each checkpoint saved by a checkpointer. It contains the accumulated state of a sequence of [runs](/langsmith/assistants#execution). When a run is executed, the [state](/oss/langgraph/graph-api#state) of the underlying graph of the assistant will be persisted to the thread.
 
@@ -39,15 +51,13 @@ A thread's current and historical state can be retrieved. To persist state, a th
 
 The checkpointer uses `thread_id` as the primary key for storing and retrieving checkpoints. Without it, the checkpointer cannot save state or resume execution after an [interrupt](/oss/langgraph/interrupts), since the checkpointer uses `thread_id` to load the saved state.
 
-## Checkpoints
+### Checkpoints
 
-The state of a thread at a particular point in time is called a checkpoint. Checkpoint is a snapshot of the graph state saved at each super-step and is represented by `StateSnapshot` object with the following key properties:
+The state of a thread at a particular point in time is called a checkpoint. A checkpoint is a snapshot of the graph state saved at each [super-step](#super-steps) and is represented by a `StateSnapshot` object (see [StateSnapshot fields](#statesnapshot-fields) for the full field reference).
 
-* `config`: Config associated with this checkpoint.
-* `metadata`: Metadata associated with this checkpoint.
-* `values`: Values of the state channels at this point in time.
-* `next` A tuple of the node names to execute next in the graph.
-* `tasks`: A tuple of `PregelTask` objects that contain information about next tasks to be executed. If the step was previously attempted, it will include error information. If a graph was interrupted [dynamically](/oss/langgraph/interrupts#pause-using-interrupt) from within a node, tasks will contain additional data associated with interrupts.
+#### Super-steps
+
+LangGraph creates a checkpoint at each **super-step** boundary. A super-step is a single "tick" of the graph where all nodes scheduled for that step execute (potentially in parallel). For a sequential graph like `START -> A -> B -> END`, there are separate super-steps for the input, node A, and node B, and a checkpoint is produced after each one. Understanding super-step boundaries is important for [time travel](/oss/langgraph/use-time-travel), because you can only resume execution from a checkpoint (i.e., a super-step boundary).
 
 Checkpoints are persisted and can be used to restore the state of a thread at a later time.
 
@@ -145,6 +155,40 @@ After we run the graph, we expect to see exactly 4 checkpoints:
 Note that the `bar` channel values contain outputs from both nodes as we have a reducer for the `bar` channel.
 :::
 
+#### Checkpoint namespace
+
+Each checkpoint has a `checkpoint_ns` (checkpoint namespace) field that identifies which graph or subgraph it belongs to:
+
+- **`""`** (empty string): The checkpoint belongs to the parent (root) graph.
+- **`"node_name:uuid"`**: The checkpoint belongs to a subgraph invoked as the given node. For nested subgraphs, namespaces are joined with `|` separators (e.g., `"outer_node:uuid|inner_node:uuid"`).
+
+You can access the checkpoint namespace from within a node via the config:
+
+:::python
+```python
+from langchain_core.runnables import RunnableConfig
+
+def my_node(state: State, config: RunnableConfig):
+    checkpoint_ns = config["configurable"]["checkpoint_ns"]
+    # "" for the parent graph, "node_name:uuid" for a subgraph
+```
+:::
+
+:::js
+```typescript
+import { RunnableConfig } from "@langchain/core/runnables";
+
+function myNode(state: typeof State.Type, config: RunnableConfig) {
+  const checkpointNs = config.configurable?.checkpoint_ns;
+  // "" for the parent graph, "node_name:uuid" for a subgraph
+}
+```
+:::
+
+See [Subgraphs](/oss/langgraph/use-subgraphs) for more details on working with subgraph state and checkpoints.
+
+## Get and update state
+
 ### Get state
 
 :::python
@@ -227,6 +271,36 @@ StateSnapshot {
 ```
 :::
 
+#### StateSnapshot fields
+
+:::python
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `values` | `dict` | State channel values at this checkpoint. |
+| `next` | `tuple[str, ...]` | Node names to execute next. Empty `()` means the graph is complete. |
+| `config` | `dict` | Contains `thread_id`, `checkpoint_ns`, and `checkpoint_id`. |
+| `metadata` | `dict` | Execution metadata. Contains `source` (`"input"`, `"loop"`, or `"update"`), `writes` (node outputs), and `step` (super-step counter). |
+| `created_at` | `str` | ISO 8601 timestamp of when this checkpoint was created. |
+| `parent_config` | `dict \| None` | Config of the previous checkpoint. `None` for the first checkpoint. |
+| `tasks` | `tuple[PregelTask, ...]` | Tasks to execute at this step. Each task has `id`, `name`, `error`, `interrupts`, and optionally `state` (subgraph snapshot, when using `subgraphs=True`). |
+
+:::
+
+:::js
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `values` | `object` | State channel values at this checkpoint. |
+| `next` | `string[]` | Node names to execute next. Empty `[]` means the graph is complete. |
+| `config` | `object` | Contains `thread_id`, `checkpoint_ns`, and `checkpoint_id`. |
+| `metadata` | `object` | Execution metadata. Contains `source` (`"input"`, `"loop"`, or `"update"`), `writes` (node outputs), and `step` (super-step counter). |
+| `createdAt` | `string` | ISO 8601 timestamp of when this checkpoint was created. |
+| `parentConfig` | `object \| null` | Config of the previous checkpoint. `null` for the first checkpoint. |
+| `tasks` | `PregelTask[]` | Tasks to execute at this step. Each task has `id`, `name`, `error`, `interrupts`, and optionally `state` (subgraph snapshot, when using `subgraphs: true`). |
+
+:::
+
 ### Get state history
 
 :::python
@@ -420,142 +494,74 @@ In our example, the output of `getStateHistory` will look like this:
 
 ![State](/oss/images/get_state.jpg)
 
-### Replay
-
-It's also possible to play-back a prior graph execution. If we `invoke` a graph with a `thread_id` and a `checkpoint_id`, then we will _re-play_ the previously executed steps _before_ a checkpoint that corresponds to the `checkpoint_id`, and only execute the steps _after_ the checkpoint.
+#### Find a specific checkpoint
 
-* `thread_id` is the ID of a thread.
-* `checkpoint_id` is an identifier that refers to a specific checkpoint within a thread.
-
-You must pass these when invoking the graph as part of the `configurable` portion of the config:
+You can filter the state history to find checkpoints matching specific criteria:
 
 :::python
 ```python
-config = {"configurable": {"thread_id": "1", "checkpoint_id": "0c62ca34-ac19-445d-bbb0-5b4984975b2a"}}
-graph.invoke(None, config=config)
-```
-:::
-
-:::js
-```typescript
-const config = {
-  configurable: {
-    thread_id: "1",
-    checkpoint_id: "0c62ca34-ac19-445d-bbb0-5b4984975b2a",
-  },
-};
-await graph.invoke(null, config);
-```
-:::
-
-Importantly, LangGraph knows whether a particular step has been executed previously. If it has, LangGraph simply _re-plays_ that particular step in the graph and does not re-execute the step, but only for the steps _before_ the provided `checkpoint_id`. All of the steps _after_ `checkpoint_id` will be executed (i.e., a new fork), even if they have been executed previously. See this [how to guide on time-travel to learn more about replaying](/oss/langgraph/use-time-travel).
-
-![Replay](/oss/images/re_play.png)
-
-### Update state
-
-:::python
-In addition to re-playing the graph from specific `checkpoints`, we can also _edit_ the graph state. We do this using @[`update_state`]. This method accepts three different arguments:
-:::
-
-:::js
-In addition to re-playing the graph from specific `checkpoints`, we can also _edit_ the graph state. We do this using `graph.updateState()`. This method accepts three different arguments:
-:::
-
-#### `config`
+history = list(graph.get_state_history(config))
 
-The config should contain `thread_id` specifying which thread to update. When only the `thread_id` is passed, we update (or fork) the current state. Optionally, if we include `checkpoint_id` field, then we fork that selected checkpoint.
+# Find the checkpoint before a specific node executed
+before_node_b = next((s for s in history if s.next == ("node_b",)), None)
 
-#### `values`
+# Find a checkpoint by step number
+step_2 = next((s for s in history if s.metadata["step"] == 2), None)
 
-These are the values that will be used to update the state. Note that this update is treated exactly as any update from a node is treated. This means that these values will be passed to the [reducer](/oss/langgraph/graph-api#reducers) functions, if they are defined for some of the channels in the graph state. This means that @[`update_state`] does NOT automatically overwrite the channel values for every channel, but only for the channels without reducers. Let's walk through an example.
+# Find checkpoints created by update_state
+forks = [s for s in history if s.metadata["source"] == "update"]
 
-Let's assume you have defined the state of your graph with the following schema (see full example above):
-
-:::python
-```python
-from typing import Annotated
-from typing_extensions import TypedDict
-from operator import add
-
-class State(TypedDict):
-    foo: int
-    bar: Annotated[list[str], add]
+# Find the checkpoint where an interrupt occurred
+interrupted = next(
+    (s for s in history if any(t.interrupts for t in s.tasks)),
+    None,
+)
 ```
 :::
 
 :::js
 ```typescript
-import { StateSchema, ReducedValue } from "@langchain/langgraph";
-import * as z from "zod";
+const history: StateSnapshot[] = [];
+for await (const state of graph.getStateHistory(config)) {
+  history.push(state);
+}
 
-const State = new StateSchema({
-  foo: z.number(),
-  bar: new ReducedValue(
-    z.array(z.string()).default(() => []),
-    {
-      inputSchema: z.array(z.string()),
-      reducer: (x, y) => x.concat(y),
-    }
-  ),
-});
-```
-:::
+// Find the checkpoint before a specific node executed
+const beforeNodeB = history.find((s) => s.next.includes("nodeB"));
 
-Let's now assume the current state of the graph is
+// Find a checkpoint by step number
+const step2 = history.find((s) => s.metadata.step === 2);
 
-:::python
-```
-{"foo": 1, "bar": ["a"]}
-```
-:::
+// Find checkpoints created by updateState
+const forks = history.filter((s) => s.metadata.source === "update");
 
-:::js
-```typescript
-{ foo: 1, bar: ["a"] }
+// Find the checkpoint where an interrupt occurred
+const interrupted = history.find(
+  (s) => s.tasks.length > 0 && s.tasks.some((t) => t.interrupts.length > 0)
+);
 ```
 :::
 
-If you update the state as below:
+### Replay
 
-:::python
-```python
-graph.update_state(config, {"foo": 2, "bar": ["b"]})
-```
-:::
+Replay re-executes steps from a prior checkpoint. Invoke the graph with a prior `checkpoint_id` to re-run nodes after that checkpoint. Nodes before the checkpoint are skipped (their results are already saved). Nodes after the checkpoint re-execute, including any LLM calls, API requests, or [interrupts](/oss/langgraph/interrupts) — which are always re-triggered during replay.
 
-:::js
-```typescript
-await graph.updateState(config, { foo: 2, bar: ["b"] });
-```
-:::
+See [Time travel](/oss/langgraph/use-time-travel) for full details and code examples on replaying past executions.
+
+![Replay](/oss/images/re_play.png)
 
-Then the new state of the graph will be:
+### Update state
 
 :::python
-```
-{"foo": 2, "bar": ["a", "b"]}
-```
+You can edit the graph state using @[`update_state`]. This creates a new checkpoint with the updated values — it does not modify the original checkpoint. The update is treated the same as a node update: values are passed through [reducer](/oss/langgraph/graph-api#reducers) functions when defined, so channels with reducers _accumulate_ values rather than overwrite them.
 
-The `foo` key (channel) is completely changed (because there is no reducer specified for that channel, so @[`update_state`] overwrites it). However, there is a reducer specified for the `bar` key, and so it appends `"b"` to the state of `bar`.
+You can optionally specify `as_node` to control which node the update is treated as coming from, which affects which node executes next. See [Time travel: `as_node`](/oss/langgraph/use-time-travel#control-which-node-runs-next-with-as_node) for details.
 :::
 
 :::js
-```typescript
-{ foo: 2, bar: ["a", "b"] }
-```
+You can edit the graph state using `graph.updateState()`. This creates a new checkpoint with the updated values — it does not modify the original checkpoint. The update is treated the same as a node update: values are passed through [reducer](/oss/langgraph/graph-api#reducers) functions when defined, so channels with reducers _accumulate_ values rather than overwrite them.
 
-The `foo` key (channel) is completely changed (because there is no reducer specified for that channel, so `updateState` overwrites it). However, there is a reducer specified for the `bar` key, and so it appends `"b"` to the state of `bar`.
-:::
-
-#### `as_node`
-
-:::python
-The final thing you can optionally specify when calling @[`update_state`] is `as_node`. If you provided it, the update will be applied as if it came from node `as_node`. If `as_node` is not provided, it will be set to the last node that updated the state, if not ambiguous. The reason this matters is that the next steps to execute depend on the last node to have given an update, so this can be used to control which node executes next. See this [how to guide on time-travel to learn more about forking state](/oss/langgraph/use-time-travel).
-:::
-
-:::js
-The final thing you can optionally specify when calling `updateState` is `asNode`. If you provide it, the update will be applied as if it came from node `asNode`. If `asNode` is not provided, it will be set to the last node that updated the state, if not ambiguous. The reason this matters is that the next steps to execute depend on the last node to have given an update, so this can be used to control which node executes next. See this [how to guide on time-travel to learn more about forking state](/oss/langgraph/use-time-travel).
+You can optionally specify `asNode` to control which node the update is treated as coming from, which affects which node executes next. See [Time travel: `asNode`](/oss/langgraph/use-time-travel#control-which-node-runs-next-with-as_node) for details.
 :::
 
 ![Update](/oss/images/checkpoints_full_story.jpg)
@@ -1159,24 +1165,3 @@ checkpointer.setup()
 When running on LangSmith, encryption is automatically enabled whenever `LANGGRAPH_AES_KEY` is present, so you only need to provide the environment variable. Other encryption schemes can be used by implementing @[`CipherProtocol`] and supplying it to @[`EncryptedSerializer`].
 
 :::
-## Capabilities
-
-### Human-in-the-loop
-
-First, checkpointers facilitate [human-in-the-loop workflows](/oss/langgraph/interrupts) by allowing humans to inspect, interrupt, and approve graph steps. Checkpointers are needed for these workflows as the human has to be able to view the state of a graph at any point in time, and the graph has to be to resume execution after the human has made any updates to the state. See [the how-to guides](/oss/langgraph/interrupts) for examples.
-
-### Memory
-
-Second, checkpointers allow for ["memory"](/oss/concepts/memory) between interactions. In the case of repeated human interactions (like conversations) any follow up messages can be sent to that thread, which will retain its memory of previous ones. See [Add memory](/oss/langgraph/add-memory) for information on how to add and manage conversation memory using checkpointers.
-
-### Time travel
-
-Third, checkpointers allow for ["time travel"](/oss/langgraph/use-time-travel), allowing users to replay prior graph executions to review and / or debug specific graph steps. In addition, checkpointers make it possible to fork the graph state at arbitrary checkpoints to explore alternative trajectories.
-
-### Fault-tolerance
-
-Lastly, checkpointing also provides fault-tolerance and error recovery: if one or more nodes fail at a given superstep, you can restart your graph from the last successful step. Additionally, when a graph node fails mid-execution at a given superstep, LangGraph stores pending checkpoint writes from any other nodes that completed successfully at that superstep, so that whenever we resume graph execution from that superstep we don't re-run the successful nodes.
-
-#### Pending writes
-
-Additionally, when a graph node fails mid-execution at a given superstep, LangGraph stores pending checkpoint writes from any other nodes that completed successfully at that superstep, so that whenever we resume graph execution from that superstep we don't re-run the successful nodes.
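
The reducer behavior described in the new Update state text (channels with reducers accumulate values, channels without one are overwritten, and the original checkpoint is left unmodified) can be illustrated with a minimal pure-Python sketch. It does not call LangGraph: `apply_update` is a hypothetical helper written only for illustration, and the `foo`/`bar` schema is borrowed from the worked example this diff removes.

```python
from operator import add

# Hypothetical helper, NOT part of LangGraph: mimics how an update is merged
# into existing state when some channels declare a reducer.
def apply_update(state, update, reducers):
    new_state = dict(state)  # copy, so the original "checkpoint" is untouched
    for key, value in update.items():
        reducer = reducers.get(key)
        if reducer is not None and key in new_state:
            # Channel with a reducer: combine the old and new values.
            new_state[key] = reducer(new_state[key], value)
        else:
            # Channel without a reducer: overwrite.
            new_state[key] = value
    return new_state

# Assumed schema: `foo` has no reducer, `bar` uses list concatenation.
reducers = {"bar": add}
before = {"foo": 1, "bar": ["a"]}
after = apply_update(before, {"foo": 2, "bar": ["b"]}, reducers)

print(after)   # {'foo': 2, 'bar': ['a', 'b']}
print(before)  # {'foo': 1, 'bar': ['a']}
```

Here `foo` is replaced while `bar` grows, which is why an update on a reducer-backed channel appends rather than resets.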