fix(langgraph): add backpressure to async durability checkpoint writes #7107

Open
pawel-twardziak wants to merge 2 commits into langchain-ai:main from pawel-twardziak:fix/async-checkpoint-backpressure
@pawel-twardziak
Add backpressure to durability="async" checkpoint writes to prevent unbounded memory growth.

With durability="async" (the default), each superstep submits a checkpoint write as a background task. When checkpoint writes have latency (as with any real database), tasks accumulate faster than they complete, and each one holds a full copy_checkpoint() result in its coroutine frame. Memory therefore grows in proportion to the superstep count.
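To illustrate the accumulation, here is a minimal asyncio sketch, not the LangGraph code itself. `run_without_backpressure` and `put_after_previous` are hypothetical names, the write is simulated with `asyncio.sleep`, and the checkpoint is a plain dict standing in for the copied checkpoint.

```python
import asyncio
from typing import Optional


async def run_without_backpressure(steps: int, latency: float = 0.05) -> int:
    """Return the peak number of checkpoint writes pending at once."""
    in_flight = 0
    peak = 0

    async def put_after_previous(prev: Optional[asyncio.Task], checkpoint: dict) -> None:
        # Hypothetical stand-in for a chained checkpoint write: wait for the
        # previous write, then "write" with simulated latency.
        nonlocal in_flight
        try:
            if prev is not None:
                await prev
            await asyncio.sleep(latency)
        finally:
            in_flight -= 1

    prev: Optional[asyncio.Task] = None
    for step in range(steps):
        await asyncio.sleep(0)  # stand-in for a superstep that is fast relative to the write
        checkpoint = {"step": step, "values": list(range(1000))}  # copy kept alive by the task
        in_flight += 1
        # Nothing waits on the previous write here, so pending tasks pile up.
        prev = asyncio.create_task(put_after_previous(prev, checkpoint))
        peak = max(peak, in_flight)

    if prev is not None:
        await prev  # drain the chain before returning
    return peak
```

In this sketch, when the simulated write latency dominates the superstep time, the peak number of pending writes tracks the number of supersteps, and every pending task keeps its own checkpoint copy alive.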

This fix ensures that if the previous checkpoint write hasn't completed by the time a new superstep finishes, the loop waits before proceeding. This bounds pending checkpoint writes to at most 1, while preserving the async overlap benefit when writes are fast.
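The idea can be sketched as follows, again with plain asyncio rather than the actual loop code. `run_with_backpressure` and `put` are hypothetical names and the write is simulated with `asyncio.sleep`; the only part meant to mirror the change is the `not prev.done()` check before submitting the next write.

```python
import asyncio
from typing import Optional


async def run_with_backpressure(steps: int, latency: float = 0.02) -> int:
    """Return the peak number of checkpoint writes pending at once."""
    in_flight = 0
    peak = 0

    async def put(checkpoint: dict) -> None:
        # Hypothetical stand-in for a checkpointer write with latency.
        nonlocal in_flight
        try:
            await asyncio.sleep(latency)
        finally:
            in_flight -= 1

    prev: Optional[asyncio.Task] = None
    for step in range(steps):
        await asyncio.sleep(0)  # stand-in for the superstep's own work
        # Backpressure: block only if the previous write is still in flight.
        if prev is not None and not prev.done():
            await prev
        checkpoint = {"step": step, "values": list(range(1000))}
        in_flight += 1
        prev = asyncio.create_task(put(checkpoint))
        peak = max(peak, in_flight)

    if prev is not None:
        await prev  # make sure the final write lands
    return peak
```

Because the loop never submits a new write while the previous one is unfinished, at most one write is pending at any time in this sketch. When a write finishes before the next superstep ends, the `await` is skipped and the overlap is kept.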

Behavior per durability mode:

  • "sync": always awaits (unchanged)
  • "async": only awaits if the previous write is still in-flight (not fut.done()); no wait when writes complete within one superstep
  • "exit": _put_checkpoint_fut is never set, so no await (unchanged)
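The per-mode rules above can be summarized as a small predicate. This is only an illustration: `must_wait` is a hypothetical helper, not a function in the codebase, and it uses `concurrent.futures.Future` purely because it exposes `done()` without needing an event loop.

```python
from concurrent.futures import Future
from typing import Optional


def must_wait(durability: str, fut: Optional[Future]) -> bool:
    """Decide whether the loop should wait on the previous checkpoint write."""
    if fut is None:
        # "exit": no checkpoint future is ever set, so there is nothing to wait on.
        return False
    if durability == "sync":
        # "sync": always wait for the write.
        return True
    if durability == "async":
        # "async": wait only while the previous write is still in flight.
        return not fut.done()
    return False
```

For example, `must_wait("async", fut)` is true for a future that has not completed and becomes false once `fut.set_result(None)` has been called.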

Potentially closes #7094

Development

Successfully merging this pull request may close these issues.

Memory leak: _checkpointer_put_after_previous coroutine chains accumulate with default durability="async"
