feat(core, anthropic, openai): aclose() lifecycle + latched fallbacks#37718
Draft
Bagatur (baskaryan) wants to merge 3 commits into
Plumb an explicit resource-lifecycle contract through `BaseChatModel`, `RunnableWithFallbacks`, and the two largest partner integrations. Adds an opt-in `FallbackLatch` so `with_fallbacks(...)` can short-circuit the primary after a failure.

Motivation: provider SDKs (`anthropic`, `openai`) back their clients with httpx connection pools that the SDKs only release best-effort from `__del__`: `asyncio.get_running_loop().create_task(self.aclose())` with a bare `except Exception: pass`. Long-lived workers that construct chat models per request (multi-tenant LangGraph deployments, agents-as-services) silently accumulate pools and leak memory + file descriptors. The fix today requires reaching into private attributes (`_async_client`, `root_async_client`, ...) on each provider. This PR makes teardown a first-class part of the chat-model API.

## langchain-core

- `BaseChatModel.close()` / `aclose()`: default no-ops that subclasses override. `aclose()` dispatches to `close()` so async teardown works for sync-only subclasses. Adds `__enter__`/`__exit__`/`__aenter__`/`__aexit__` so models can be used as context managers.
- `RunnableWithFallbacks.close()` / `aclose()`: walks `runnable` and `fallbacks`, calling each one's lifecycle method. Per-runnable failures are suppressed so one bad close doesn't prevent the others from running.
- `FallbackLatch` + `with_fallbacks(..., latch=...)`: opt-in circuit-breaker. Once the primary raises a handled exception, the latch trips and subsequent calls (on this wrapper, or any wrapper sharing the same latch instance) skip the primary. Useful when a primary failure is unlikely to recover within the wrapper's lifetime (wrong API key, sustained outage), so the default `try-primary-on-every-call` doesn't waste a round-trip on every retry. `latch.reset()` re-enables the primary. The latch propagates through `__getattr__` rebinds (e.g. `wrapper.bind_tools([...])`) so tool-bound and bare wrappers share one circuit.
- Default-latch behaviour is unchanged: passing no `latch` retains the existing "retry primary on every call" semantics.

## langchain-anthropic

- `ChatAnthropic.close()` / `aclose()`: closes `_client` (sync) and `_async_client` (async). Both are `cached_property` slots; guarded via `__dict__` so we don't materialize an uninstantiated cached client just to immediately close it. Idempotent.

## langchain-openai

- `BaseChatOpenAI.close()` / `aclose()`: closes `root_client` and `root_async_client`, then clears the corresponding `client` / `async_client` attributes so the model can't be used after teardown. Idempotent. Tolerates the API-key-missing case where one client is `None`.

Note: `BaseChatOpenAI`'s eager construction of both sync + async clients in its `model_validator` (even for async-only use) is a related inefficiency but not addressed here. It's a fixed per-instance cost rather than the per-request leak that `aclose()` solves.

## Tests

- 7 new latch + propagation tests in `test_fallbacks.py`
- 4 new lifecycle tests in `test_base.py` for `BaseChatModel`
- 5 new tests in `test_chat_models.py` for `ChatAnthropic`
- 5 new tests in `test_base.py` for `BaseChatOpenAI`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
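The latch semantics described under langchain-core can be illustrated with a minimal, standalone sketch. It does not use LangChain at all: `FallbackLatch` here is a hypothetical stand-in written for illustration, and `call_with_fallbacks`, `flaky_primary`, and `backup` are invented names, so the real signatures in this PR may differ.

```python
import threading


class FallbackLatch:
    """Hypothetical stand-in for the PR's latch: a thread-safe boolean flag."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tripped = False

    @property
    def tripped(self) -> bool:
        with self._lock:
            return self._tripped

    def trip(self) -> None:
        with self._lock:
            self._tripped = True

    def reset(self) -> None:
        with self._lock:
            self._tripped = False


def call_with_fallbacks(primary, fallbacks, arg, latch=None, handled=(Exception,)):
    """Try primary then fallbacks; skip primary while the shared latch is tripped."""
    skip_primary = latch is not None and latch.tripped
    candidates = list(fallbacks) if skip_primary else [primary, *fallbacks]
    last_error = None
    for fn in candidates:
        try:
            return fn(arg)
        except handled as exc:
            last_error = exc
            # Only a handled failure of the primary trips the latch.
            if fn is primary and latch is not None:
                latch.trip()
    if last_error is None:
        raise RuntimeError("no runnable was available to call")
    raise last_error


calls = []


def flaky_primary(text):
    calls.append("primary")
    raise ConnectionError("primary is down")


def backup(text):
    calls.append("backup")
    return text.upper()


latch = FallbackLatch()
first = call_with_fallbacks(flaky_primary, [backup], "hello", latch=latch)
second = call_with_fallbacks(flaky_primary, [backup], "world", latch=latch)
# The second call goes straight to the backup because the latch is tripped.
```

Passing `latch=None` in this sketch reproduces the default behaviour: the primary is retried on every call. Calling `latch.reset()` makes the next call try the primary again.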
Merging this PR will not alter performance
mypy flags the `# type: ignore[override]` on the test subclasses' `close()` / `aclose()` methods as unused — `BaseChatModel.close` / `aclose` are concrete (non-abstract) defaults, so overriding them in a subclass does not need an override suppression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…se()
The first cut of close()/aclose() unconditionally closed the SDK
client's underlying httpx pool. But both integrations back their
clients with a PROCESS-WIDE SHARED pool via @lru_cache
(`_get_default_*httpx_client`): every model with the same
base_url/timeout/proxy reuses one pool by design. Closing it from a
single model's teardown broke every other live model in the process —
observed in a long-lived worker as:

    RuntimeError: Cannot send a request, as the client has been closed.
    -> anthropic.APIConnectionError: Connection error.

Fix: close()/aclose() now release the underlying httpx client ONLY when
the model privately owns it; the shared cached pool and user-supplied
clients are left intact.
- anthropic: `ChatAnthropic` always wraps the shared cached pool (it has
no http_client field), so close()/aclose() are effectively no-ops for
the pool. An identity check against the lru-cache getter
(`_wraps_shared_httpx`) guards a hypothetical future private-client
path. `_http_client_params()` is factored out so the cached_property
builders and the identity check stay in sync.
- openai: ownership is computed in `validate_environment` and stored on
`_owns_sync_http_client` / `_owns_async_http_client`. A client is owned
iff the model built it privately — the unhashable-`httpx.Timeout`
fresh-client path or an `openai_proxy` transport — and the user did not
supply their own `http_client` / `http_async_client`. Default (shared
cache) and user-supplied clients are never closed.
Tests rewritten to pin the invariant: a regression test builds two
default models, closes one, and asserts the other's shared pool is still
open; plus owned-path and user-injected-not-closed cases.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
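
The ownership rule in this commit can be sketched without either SDK. The code below is an illustration only: `Pool` is a toy stand-in for an httpx client, and `get_shared_pool`, `Model`, and the `private` flag are hypothetical names, not the integrations' real helpers. It shows the invariant the regression test pins: closing one default model leaves the shared cached pool open for its siblings.

```python
from functools import lru_cache


class Pool:
    """Toy stand-in for an httpx client; it only tracks whether it was closed."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        self.is_closed = False

    def close(self) -> None:
        self.is_closed = True


@lru_cache(maxsize=None)
def get_shared_pool(base_url: str) -> Pool:
    """Process-wide cache: every caller with the same base_url gets one Pool."""
    return Pool(base_url)


class Model:
    """Hypothetical model that records whether it privately owns its pool."""

    def __init__(self, base_url: str, http_client=None, private: bool = False) -> None:
        if http_client is not None:
            # User-supplied client: the caller manages its lifetime.
            self.pool = http_client
            self._owns_pool = False
        elif private:
            # Built privately for this model only, so the model owns it.
            self.pool = Pool(base_url)
            self._owns_pool = True
        else:
            # Default path: shared cached pool, never closed by one model.
            self.pool = get_shared_pool(base_url)
            self._owns_pool = False

    def close(self) -> None:
        # Idempotent, and a no-op unless the pool is privately owned.
        if self._owns_pool and not self.pool.is_closed:
            self.pool.close()


URL = "https://api.example.test"  # placeholder URL for the sketch

a, b = Model(URL), Model(URL)
a.close()  # must not close the pool that b still uses

owned = Model(URL, private=True)
owned.close()  # privately owned, so it is released

user_pool = Pool(URL)
injected = Model(URL, http_client=user_pool)
injected.close()  # user-supplied, so it is left open
```

Storing the ownership decision at construction time, as this sketch does with `_owns_pool`, keeps `close()` trivial and avoids re-deriving which path built the client at teardown.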
## Summary

Adds an explicit resource-lifecycle contract to chat models, an opt-in `FallbackLatch` circuit breaker for `with_fallbacks(...)`, and `close()`/`aclose()` propagation through `RunnableWithFallbacks`.

### langchain-core

- `BaseChatModel.close()`/`aclose()`: default no-ops that subclasses override. `aclose()` dispatches to `close()` so async teardown works for sync-only subclasses. Adds `__enter__`/`__exit__`/`__aenter__`/`__aexit__` for context-manager use.
- `RunnableWithFallbacks.close()`/`aclose()`: walk `runnable` + `fallbacks`, calling each one's lifecycle method; per-runnable failures are suppressed so one bad close doesn't block the rest.
- `FallbackLatch` + `with_fallbacks(..., latch=...)`: opt-in circuit breaker. Once the primary raises a handled exception, the latch trips and subsequent calls (on this wrapper, or any wrapper sharing the latch) skip the primary. The latch propagates through `__getattr__` rebinds (`bind_tools`, `bind`, …) so tool-bound and bare wrappers share one circuit. `latch.reset()` re-enables the primary. Default (no latch) behavior is unchanged.

### langchain-anthropic / langchain-openai

`close()`/`aclose()` release the underlying httpx client only when the model privately owns it.

This is the important subtlety: both integrations back their SDK clients with a process-wide shared httpx pool via `@lru_cache` (`_get_default_*httpx_client`). Every model with the same `base_url`/timeout/proxy reuses one pool, by design. An earlier revision of this PR closed that shared pool on teardown, which broke every other live model in the process with `RuntimeError: Cannot send a request, as the client has been closed.`

So ownership is now tracked explicitly:

- `ChatAnthropic` always wraps the shared cached pool (no `http_client` field), so `close()`/`aclose()` are no-ops for the pool, guarded by an identity check against the lru-cache getter (defensive against any future private-client path).
- `BaseChatOpenAI` computes ownership in `validate_environment`. A client is owned only if the model built it privately (the unhashable-`httpx.Timeout` fresh-client path or an `openai_proxy` transport) and the user didn't supply their own `http_client`/`http_async_client`. Shared-cache and user-supplied clients are never closed.

This makes `aclose()` safe to call after every use without disturbing sibling models or pools the caller owns. It is most useful for the privately-owned and user-injected-and-managed cases, and as a uniform, provider-agnostic teardown hook for frameworks (LangGraph runtimes, agent orchestrators) that previously had to reach into private SDK attributes.

## Release Note

none

## Test Plan

- Lifecycle tests for `BaseChatModel`; latch trip/reset/shared/propagation + close/aclose propagation tests for `RunnableWithFallbacks`.
- `libs/partners/anthropic` (106), `libs/partners/openai` `chat_models/test_base` (201). Lint + mypy clean.

🤖 Generated with Claude Code
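
For readers who want the lifecycle contract from the summary in concrete form, here is a minimal standalone sketch. It is not the PR's code: `ChatModelBase`, `FallbackWrapper`, `Recording`, and `Broken` are hypothetical names, and having `aclose()` call `close()` directly is an assumption about what "dispatches to `close()`" means.

```python
import asyncio


class ChatModelBase:
    """Hypothetical stand-in for the lifecycle surface described for BaseChatModel."""

    def close(self) -> None:
        """Default no-op; subclasses release their resources here."""

    async def aclose(self) -> None:
        # Dispatch to close() so sync-only subclasses still get async teardown.
        self.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()


class FallbackWrapper:
    """Hypothetical wrapper that propagates teardown to every wrapped runnable."""

    def __init__(self, runnable, fallbacks) -> None:
        self.runnable = runnable
        self.fallbacks = list(fallbacks)

    def close(self) -> None:
        for runnable in [self.runnable, *self.fallbacks]:
            try:
                runnable.close()
            except Exception:
                # Suppressed so one bad close does not block the rest.
                pass

    async def aclose(self) -> None:
        for runnable in [self.runnable, *self.fallbacks]:
            try:
                await runnable.aclose()
            except Exception:
                pass


class Recording(ChatModelBase):
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Broken(ChatModelBase):
    def close(self) -> None:
        raise RuntimeError("close failed")


first, second = Recording(), Recording()
FallbackWrapper(Broken(), [first, second]).close()  # Broken raises, others still close

with Recording() as sync_model:
    pass  # close() runs on exit


async def demo() -> bool:
    async with Recording() as model:
        pass  # aclose() runs on exit and dispatches to close()
    return model.closed


async_closed = asyncio.run(demo())
```

A framework holding an arbitrary model or fallback wrapper could then call `aclose()` uniformly instead of reaching into provider-specific private attributes.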