fix(core): handle float type in merge_dicts, merge_obj, and _dict_int_op #36053
nevanwang wants to merge 2 commits into langchain-ai:master from nevanwang:fix/core-merge-dicts-float-support
- merge_dicts(): expand isinstance check from int to (int, float)
- merge_obj(): add numeric addition for int/float after equality check
- _dict_int_op(): expand isinstance check to support float values
- Update docstrings and error messages to reflect float support
- Add regression tests for float values in all three functions

Fixes langchain-ai#36011
Fixes langchain-ai#36015
Merging this PR will improve performance by 31.71%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | WallTime | test_import_time[LangChainTracer] | 480.7 ms | 387.4 ms | +24.08% |
| ⚡ | WallTime | test_import_time[InMemoryRateLimiter] | 181.2 ms | 141.8 ms | +27.77% |
| ⚡ | WallTime | test_import_time[HumanMessage] | 277.3 ms | 214.2 ms | +29.41% |
| ⚡ | WallTime | test_async_callbacks_in_sync | 25.2 ms | 19.7 ms | +28.06% |
| ⚡ | WallTime | test_import_time[Runnable] | 522.8 ms | 403.1 ms | +29.7% |
| ⚡ | WallTime | test_import_time[ChatPromptTemplate] | 677.3 ms | 531 ms | +27.56% |
| ⚡ | WallTime | test_import_time[tool] | 582.2 ms | 459.2 ms | +26.78% |
| ⚡ | WallTime | test_import_time[PydanticOutputParser] | 565.5 ms | 463.1 ms | +22.11% |
| ⚡ | WallTime | test_import_time[Document] | 200.2 ms | 154.8 ms | +29.32% |
| ⚡ | WallTime | test_import_time[InMemoryVectorStore] | 643.7 ms | 509.7 ms | +26.28% |
| ⚡ | WallTime | test_import_time[RunnableLambda] | 522.3 ms | 402 ms | +29.92% |
| ⚡ | WallTime | test_import_time[CallbackManager] | 336.1 ms | 258.3 ms | +30.11% |
| ⚡ | WallTime | test_import_time[BaseChatModel] | 573.4 ms | 435.4 ms | +31.71% |
Comparing nevanwang:fix/core-merge-dicts-float-support (3d20b09) with master (07fa576) [2]

Footnotes
1. 23 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
2. No successful run was found on master (a81203b) during the generation of this report, so 07fa576 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
This PR has been automatically closed because you are not assigned to the linked issue. External contributors must be assigned to an issue before opening a PR for it. Please:
Description
`merge_dicts()` and `merge_obj()` in `libs/core/langchain_core/utils/_merge.py`, as well as `_dict_int_op()` in `libs/core/langchain_core/utils/usage.py`, handle `int` types but not `float`. When these functions encounter `float` values, they fall through to error branches and raise `TypeError` or `ValueError`.

This affects streaming scenarios where LLM providers return float fields in `generation_info`, `additional_kwargs`, or `UsageMetadata` (e.g., `logprob`, `score`, safety scores, cost fields). During chunk aggregation via `ChatGenerationChunk.__add__()` → `merge_dicts()`, or `UsageMetadata.__add__()` → `_dict_int_op()`, these float fields cause the stream to crash.

Changes

`libs/core/langchain_core/utils/_merge.py`
- `merge_dicts()`: Changed `isinstance(merged[right_k], int)` to `isinstance(merged[right_k], (int, float))` so that float values are summed (or preserved for special keys like `index`, `created`, `timestamp`) instead of raising `TypeError`.
- `merge_obj()`: Added an `isinstance(left, (int, float))` check after the equality check, so unequal numeric values are summed instead of raising `ValueError`. The ordering is important: equal values (including `True == True`, `42 == 42`, `3.14 == 3.14`) are handled by the equality branch first, and only unequal numeric values reach the addition branch.

`libs/core/langchain_core/utils/usage.py`
- `_dict_int_op()`: Changed `isinstance(..., int)` to `isinstance(..., (int, float))` so that float values in `UsageMetadata` (e.g., cost fields, timing data) are correctly handled during aggregation.
- Updated type annotations (`Callable[[int, int], int]` → `Callable[[float, float], float]`), docstrings, and error messages to reflect float support.

`libs/core/tests/unit_tests/utils/test_utils.py`
- `test_merge_dicts`: float cases (summing, preserved special keys like `index`/`created`/`timestamp`).
- `test_merge_obj`: numeric addition cases (`int` and `float`).
- `test_merge_obj_unmergeable_values`: updated to use `set` instead of different integers (since different integers are now correctly summed).

`libs/core/tests/unit_tests/utils/test_usage.py`
- `test_dict_int_op_float_add`: tests pure float addition.
- `test_dict_int_op_mixed_int_float`: tests mixed int/float addition.
- `test_dict_int_op_nested_float`: tests nested dictionaries with float values.

Design Note

As Jairooh correctly pointed out, `isinstance(x, int)` returns `True` for `bool` in Python. In `merge_obj()`, this is safely handled because `bool` equality (`True == True`) is caught by the `left == right` branch before reaching the numeric addition branch. For `merge_dicts()`, the same ordering applies. This is a key design choice: the competing PR #36048 placed the numeric check before the equality check, causing `42 + 42 = 84` and `True + True = 2` failures.

Regarding floating-point precision concerns: the current approach mirrors the existing `int` addition semantics. For use cases where last-value or max semantics are preferred over summation, that would be a separate feature/configuration beyond this bug fix.

Issue

Fixes #36011
Fixes #36015
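
For reviewers who want to see the ordering argument from the Design Note in isolation, here is a minimal standalone sketch. It is not the code in `_merge.py`: `merge_values` and `merge_values_numeric_first` are hypothetical names, and the real `merge_obj()` also handles strings, dicts, and lists, which are left out here.

```python
from typing import Any


def merge_values(left: Any, right: Any) -> Any:
    """Hypothetical, simplified stand-in: equality check BEFORE numeric addition."""
    if left is None or right is None:
        return left if left is not None else right
    # Equal values (True/True, 42/42, 3.14/3.14) are returned unchanged.
    if left == right:
        return left
    # Only unequal numbers reach this branch and are summed.
    # Note: bool is a subclass of int, so unequal bools would also be summed here.
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right
    msg = f"Unable to merge {type(left).__name__} and {type(right).__name__}"
    raise ValueError(msg)


def merge_values_numeric_first(left: Any, right: Any) -> Any:
    """Counter-example (assumed ordering): numeric check BEFORE equality check."""
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right  # equal values get doubled, bools become ints
    if left == right:
        return left
    msg = f"Unable to merge {type(left).__name__} and {type(right).__name__}"
    raise ValueError(msg)
```

With the equality-first ordering, `merge_values(42, 42)` stays `42` and `merge_values(True, True)` stays `True`, while `merge_values(1.5, 2.25)` is summed. With the numeric-first ordering, the same equal inputs come back as `84` and `2`.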
Related
- not_planned `int` support but omitted `float`
- `_dict_int_op` raises ValueError for float values in UsageMetadata during streaming aggregation #36015: `_dict_int_op` raises `ValueError` for float values in `UsageMetadata`
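
The nested-dictionary behaviour at issue in #36015 can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the code in `usage.py`: `dict_numeric_op` is a hypothetical name, and the `cost` and `latency` keys in the usage note are made-up example fields.

```python
from typing import Any, Callable, Dict, Union

Number = Union[int, float]


def dict_numeric_op(
    left: Dict[str, Any],
    right: Dict[str, Any],
    op: Callable[[float, float], float],
    *,
    default: Number = 0,
    depth: int = 0,
    max_depth: int = 100,
) -> Dict[str, Any]:
    """Hypothetical sketch: apply `op` key by key, recursing into nested dicts."""
    if depth >= max_depth:
        raise ValueError(f"max_depth={max_depth} exceeded, unable to combine dicts.")
    combined: Dict[str, Any] = {}
    for key in set(left) | set(right):
        left_num, right_num = left.get(key, default), right.get(key, default)
        left_sub, right_sub = left.get(key, {}), right.get(key, {})
        # Accepting (int, float) rather than int alone is the point of the fix.
        if isinstance(left_num, (int, float)) and isinstance(right_num, (int, float)):
            combined[key] = op(left_num, right_num)
        elif isinstance(left_sub, dict) and isinstance(right_sub, dict):
            combined[key] = dict_numeric_op(
                left_sub, right_sub, op,
                default=default, depth=depth + 1, max_depth=max_depth,
            )
        else:
            types = [type(left.get(key)).__name__, type(right.get(key)).__name__]
            raise ValueError(
                f"Unknown value types: {types}. "
                "Only dict, int and float values are supported."
            )
    return combined
```

For example, combining `{"input_tokens": 10, "cost": 0.5, "details": {"latency": 0.25}}` with `{"input_tokens": 5, "cost": 0.25, "details": {"latency": 0.5, "cached": 2}}` using `operator.add` sums the matching numeric fields at both levels and treats the missing `cached` key on the left as `0`.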