fix: off-by-one error in RecursiveJsonSplitter.split_json #35649
Ethan T. (gambletan) wants to merge 1 commit into langchain-ai:master
Two issues fixed:
1. Changed `size < remaining` to `size <= remaining` so items that fit
exactly at the boundary are added to the current chunk instead of
being pushed to a new one.
2. Added explicit handling for empty dicts and leaf values when they
don't fit in the current chunk. Previously, recursing into an empty
dict `{}` caused the for loop to have zero iterations, silently
dropping the key-value pair from all chunks.
Fixes langchain-ai#29153
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
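The boundary condition in the first item can be illustrated with a deliberately simplified sketch. This is not the LangChain implementation: `pack` and its `inclusive` flag are hypothetical names, and item sizes are plain integers rather than serialized JSON lengths. It only shows how a strict `<` comparison pushes an item that fits exactly into a new chunk.

```python
from __future__ import annotations


def pack(items: list[tuple[str, int]], capacity: int, inclusive: bool) -> list[list[str]]:
    """Greedily pack named items into chunks holding at most `capacity` units.

    Hypothetical helper for illustration only; `inclusive` switches between
    the `<=` comparison (fixed) and the `<` comparison (buggy).
    """
    chunks: list[list[str]] = [[]]
    used = 0
    for name, size in items:
        remaining = capacity - used
        fits = size <= remaining if inclusive else size < remaining
        if not fits and chunks[-1]:
            # Start a new chunk only if the current one already holds something.
            chunks.append([])
            used = 0
        chunks[-1].append(name)
        used += size
    return chunks


items = [("a", 4), ("b", 6)]  # "b" fills a 10-unit chunk exactly
strict = pack(items, capacity=10, inclusive=False)
fixed = pack(items, capacity=10, inclusive=True)
print(strict)  # [['a'], ['b']]
print(fixed)   # [['a', 'b']]
```

With the strict comparison the exact fit is rejected and the data ends up in two chunks, which is the behavior the commit message describes.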
Orb Code Review (powered by GLM 5.1 on Orb Cloud)
PR #35649: fix: off-by-one error in RecursiveJsonSplitter.split_json
Findings
1. Off-by-one fix
2. Leaf value / empty dict handling ✅ Looks correct
The problem: if
The fix correctly handles this by checking whether
Minor observations
Summary: A clean, focused fix addressing two real issues: an off-by-one boundary condition and silent data loss for empty dict values during JSON splitting.
Assessment: approve
Summary
Fixes #29153
Two bugs in `RecursiveJsonSplitter._json_split()` that cause data loss at chunk boundaries:
1. Off-by-one boundary check: Changed `size < remaining` to `size <= remaining` so items that fit exactly at the chunk size boundary are added to the current chunk instead of being pushed to a new one.
2. Empty dict/leaf value loss: When a value doesn't fit in the current chunk and a new chunk is started, the code recursed into the value. For empty dicts `{}`, the `for key, value in data.items()` loop has zero iterations, silently dropping the key-value pair from all chunks. Added explicit handling to directly set leaf values and empty dicts instead of recursing.
Reproduction
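A minimal, self-contained sketch of the empty-dict loss, assuming a simplified recursive splitter modeled on the description above. It is not the PR author's reproduction or the library code: `split`, `set_nested`, and the `keep_empty` flag are hypothetical, and sizes are measured as `len(json.dumps(...))`.

```python
from __future__ import annotations

import json
from typing import Any


def set_nested(target: dict, path: list[str], value: Any) -> None:
    """Write `value` at the nested key path, creating intermediate dicts."""
    for key in path[:-1]:
        target = target.setdefault(key, {})
    target[path[-1]] = value


def split(data: Any, max_size: int, path: list[str] | None = None,
          chunks: list[dict] | None = None, keep_empty: bool = True) -> list[dict]:
    """Simplified recursive splitter (illustrative assumption, not LangChain's).

    keep_empty=False mimics the described bug: a value that does not fit is
    always recursed into, so an empty dict contributes nothing.
    """
    path = path or []
    chunks = chunks if chunks is not None else [{}]
    if isinstance(data, dict):
        for key, value in data.items():
            new_path = [*path, key]
            used = len(json.dumps(chunks[-1]))
            size = len(json.dumps({key: value}))
            if size <= max_size - used:
                set_nested(chunks[-1], new_path, value)
                continue
            if chunks[-1]:
                chunks.append({})  # current chunk is full, start a new one
            is_leaf_or_empty = not isinstance(value, dict) or not value
            if keep_empty and is_leaf_or_empty:
                # Set leaves and empty dicts directly instead of recursing.
                set_nested(chunks[-1], new_path, value)
            else:
                split(value, max_size, new_path, chunks, keep_empty)
    else:
        set_nested(chunks[-1], path, data)
    return chunks


data = {"a": "x" * 10, "b": {}}
print(split(data, 24, keep_empty=False))  # [{'a': 'xxxxxxxxxx'}, {}]
print(split(data, 24, keep_empty=True))   # [{'a': 'xxxxxxxxxx'}, {'b': {}}]
```

In the `keep_empty=False` run the key `"b"` appears in no chunk and a stray empty chunk is left behind; setting the value directly preserves it.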
Test plan
🤖 Generated with Claude Code