fix: handle lambda, postponed annotations, and subclass types in trim_messages token_counter #35640
Conversation
The annotation check in `trim_messages` used `is BaseMessage`, which only matched the exact `BaseMessage` class. This broke for:

- Lambdas (no annotation → `inspect.Parameter.empty`)
- Postponed annotations (`from __future__ import annotations` → string)
- Subclass annotations (e.g. `HumanMessage` is not `BaseMessage`)

Extract the detection logic into `_is_per_message_counter()`, which handles all three cases correctly.

Fixes langchain-ai#35629
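The three failure modes can be demonstrated in a self-contained sketch. The `BaseMessage`/`HumanMessage` classes below are stand-ins for the real `langchain_core.messages` classes, and `first_param_annotation` is a hypothetical helper for illustration:

```python
import inspect


class BaseMessage:
    """Stand-in for langchain_core.messages.BaseMessage (not the real class)."""


class HumanMessage(BaseMessage):
    """Stand-in subclass."""


def first_param_annotation(fn):
    """Return the annotation of fn's first parameter, as `inspect` reports it."""
    params = list(inspect.signature(fn).parameters.values())
    return params[0].annotation


# Case 1: a lambda has no annotation at all.
counter_lambda = lambda msg: 1
assert first_param_annotation(counter_lambda) is inspect.Parameter.empty

# Case 2: with postponed annotations the annotation is the *string* 'BaseMessage'.
def counter_postponed(msg: "BaseMessage") -> int:
    return 1

assert first_param_annotation(counter_postponed) == "BaseMessage"

# Case 3: a subclass annotation is a different class object than BaseMessage.
def counter_subclass(msg: HumanMessage) -> int:
    return 1

assert first_param_annotation(counter_subclass) is HumanMessage

# The buggy identity check is False in every case, so each counter was
# wrongly treated as a list-level counter.
for fn in (counter_lambda, counter_postponed, counter_subclass):
    assert not (first_param_annotation(fn) is BaseMessage)
```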
Merging this PR will improve performance by 39.56%
| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| ⚡ WallTime | test_import_time[tool] | 587.4 ms | 474.1 ms | +23.9% |
| ⚡ WallTime | test_async_callbacks_in_sync | 25.6 ms | 18.3 ms | +39.56% |
| ⚡ WallTime | test_import_time[LangChainTracer] | 485.8 ms | 407.6 ms | +19.19% |
| ⚡ WallTime | test_import_time[RunnableLambda] | 537.2 ms | 441.4 ms | +21.71% |
| ⚡ WallTime | test_import_time[HumanMessage] | 279.1 ms | 235.4 ms | +18.55% |
| ⚡ WallTime | test_import_time[Runnable] | 538 ms | 443.9 ms | +21.21% |
| ⚡ WallTime | test_import_time[InMemoryVectorStore] | 650.8 ms | 536.3 ms | +21.36% |
| ⚡ WallTime | test_import_time[ChatPromptTemplate] | 682.8 ms | 542.4 ms | +25.89% |
| ⚡ WallTime | test_import_time[PydanticOutputParser] | 583 ms | 473.2 ms | +23.21% |
| ⚡ WallTime | test_import_time[Document] | 198 ms | 166.3 ms | +19.1% |
| ⚡ WallTime | test_import_time[CallbackManager] | 341.3 ms | 282.4 ms | +20.85% |
| ⚡ WallTime | test_import_time[InMemoryRateLimiter] | 186 ms | 154.5 ms | +20.38% |
| ⚡ WallTime | test_import_time[BaseChatModel] | 573.8 ms | 478.8 ms | +19.83% |
Comparing atian8179:fix/trim-messages-lambda-counter (e97b42f) with master (29134dc)[^2]

Footnotes

[^1]: 23 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
[^2]: No successful run was found on master (27651d9) during the generation of this report, so 29134dc was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
Hi! Creator of the issue here :) Thanks for working on this; however, I already had a PR open to fix this when I opened the issue. Please try to avoid duplicates :)
ccurme (ccurme) left a comment
Duplicated with #35630.
Problem
`trim_messages` breaks when `token_counter` is a lambda, uses `from __future__ import annotations`, or has a `BaseMessage` subclass annotation (e.g. `HumanMessage`). The root cause is the annotation check:
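The code block with the check did not survive extraction; the following is a hedged reconstruction of the pattern described (the function name `is_per_message_buggy` and the stand-in classes are illustration-only assumptions, not the library source):

```python
import inspect


class BaseMessage:  # stand-in for langchain_core.messages.BaseMessage
    pass


def is_per_message_buggy(token_counter) -> bool:
    """Reconstruction of the old check: identity comparison with BaseMessage."""
    params = list(inspect.signature(token_counter).parameters.values())
    return bool(params) and params[0].annotation is BaseMessage


def exact(msg: BaseMessage) -> int:
    return 1


# Only an exact `BaseMessage` annotation passes; everything else falls
# through to the list-level branch, and calling the counter there later
# raises the TypeError described below.
assert is_per_message_buggy(exact)
assert not is_per_message_buggy(lambda msg: 1)
```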
This uses identity comparison (`is`), which only matches the exact `BaseMessage` class. It fails for:

- Lambdas: no annotation, so the annotation is `inspect.Parameter.empty`
- Postponed annotations (`from __future__ import annotations`): the annotation is the string `'BaseMessage'`
- Subclass annotations (e.g. `HumanMessage`): `HumanMessage is BaseMessage` is `False`

In all cases, the function incorrectly treats the callable as a list-level counter, causing a `TypeError`.

Solution
Extract the detection logic into a `_is_per_message_counter()` helper that handles all three cases:

- `inspect.Parameter.empty` → assume per-message (lambdas, bare defs)
- A `str` annotation containing "message" → postponed annotations
- A `type` that is a subclass of `BaseMessage` → subclass annotations

Fixes #35629