perf: cache inspect.signature and UntrackedValue isinstance scan#7121
Open
John Kennedy (jkennedyvz) wants to merge 2 commits intomainfrom
Open
perf: cache inspect.signature and UntrackedValue isinstance scan#7121John Kennedy (jkennedyvz) wants to merge 2 commits intomainfrom
John Kennedy (jkennedyvz) wants to merge 2 commits intomainfrom
Conversation
The `put_writes` method was scanning all channels with `any(isinstance(ch, UntrackedValue) ...)` on every call. For the sequential_1000 benchmark this produced 1M+ isinstance calls through the ABC machinery, consuming 40% of total runtime. Cache the result as `_has_untracked_channels` once in __enter__ and __aenter__, replacing both scan sites (put_writes and checkpoint sanitization). This yields a 1.8-2.3x speedup on sequential_1000. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Avoids repeated signature introspection for the same function (e.g. 1000 ChannelWrite instances all inspecting the same _write/_awrite methods). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
Author
WARNING: the benchma
Notice: +-----------------------------------------+---------+-----------------------+
| Benchmark | main | changes |
+=========================================+=========+=======================+
| sequential_1000_sync | 371 ms | 233 ms: 1.59x faster |
+-----------------------------------------+---------+-----------------------+
| sequential_1000 | 438 ms | 299 ms: 1.47x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_9x1200_checkpoint | 70.6 ms | 63.6 ms: 1.11x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x | 33.5 ms | 30.6 ms: 1.10x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint | 35.2 ms | 32.3 ms: 1.09x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300_checkpoint_sync | 40.6 ms | 37.6 ms: 1.08x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300 | 23.8 ms | 22.1 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_15x600_checkpoint | 79.4 ms | 74.2 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_sync | 10.4 ms | 9.77 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| wide_dict_25x300_sync | 10.3 ms | 9.61 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300_checkpoint | 44.8 ms | 42.2 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_sync | 13.5 ms | 12.8 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_sync | 13.3 ms | 12.6 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_sync | 297 ms | 295 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_sync | 22.2 ms | 22.2 ms: 1.00x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint_sync | 313 ms | 313 ms: 1.00x slower |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x | 26.8 ms | 26.8 ms: 1.00x slower |
+-----------------------------------------+---------+-----------------------+
| serde_allowlist_large | 183 us | 184 us: 1.00x slower |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint | 53.5 ms | 54.0 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| sequential_10 | 2.38 ms | 2.40 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| sequential_10_sync | 1.74 ms | 1.76 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_checkpoint_sync | 47.6 ms | 48.1 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint_sync | 47.8 ms | 48.4 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_checkpoint | 52.9 ms | 53.6 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| serde_allowlist_small | 39.0 us | 39.6 us: 1.01x slower |
+--------------------------------------- |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
inspect.signatureresults inRunnableCallable.__init__via a module-level_SIGNATURE_CACHEdict, avoiding repeated signature introspection for the same function (e.g. 1000ChannelWriteinstances all inspecting the same_write/_awritemethods)any(isinstance(ch, UntrackedValue) ...)scan as_has_untracked_channelsonce in__enter__/__aenter__, replacing two per-call scan sites input_writesand checkpoint sanitization — eliminates 1M+isinstancecalls through the ABC machinery forsequential_1000Split out from #6969 to keep cache changes separate from other optimizations.
Test plan
make testinlibs/langgraph)🤖 Generated with Claude Code