perf: cache inspect.signature and UntrackedValue isinstance scan #7121

Open
John Kennedy (jkennedyvz) wants to merge 2 commits into main from jk/cache-optimizations

@jkennedyvz
Contributor

Summary

  • Cache inspect.signature results in RunnableCallable.__init__ via a module-level _SIGNATURE_CACHE dict, avoiding repeated signature introspection for the same function (e.g. 1000 ChannelWrite instances all inspecting the same _write/_awrite methods)
  • Cache the any(isinstance(ch, UntrackedValue) ...) scan as _has_untracked_channels once in __enter__/__aenter__, replacing two per-call scan sites in put_writes and checkpoint sanitization — eliminates 1M+ isinstance calls through the ABC machinery for sequential_1000

Split out from #6969 to keep cache changes separate from other optimizations.

Test plan

  • Existing test suite passes (make test in libs/langgraph)
  • Benchmark confirms no regression

🤖 Generated with Claude Code

The `put_writes` method was scanning all channels with
`any(isinstance(ch, UntrackedValue) ...)` on every call. For the
sequential_1000 benchmark this produced 1M+ isinstance calls through
the ABC machinery, consuming 40% of total runtime.

Cache the result as `_has_untracked_channels` once in __enter__ and
__aenter__, replacing both scan sites (put_writes and checkpoint
sanitization). This yields a 1.8-2.3x speedup on sequential_1000.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
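
Cache `inspect.signature` results in RunnableCallable.__init__ via a
module-level `_SIGNATURE_CACHE` dict. This avoids repeated signature
introspection for the same function (e.g. 1000 ChannelWrite instances
all inspecting the same _write/_awrite methods).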

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jkennedyvz
Contributor Author

From CI

WARNING: the benchma
+-----------------------------------------+---------+-----------------------+
| Benchmark                               | main    | changes               |
+=========================================+=========+=======================+
| sequential_1000_sync                    | 371 ms  | 233 ms: 1.59x faster  |
+-----------------------------------------+---------+-----------------------+
| sequential_1000                         | 438 ms  | 299 ms: 1.47x faster  |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_9x1200_checkpoint        | 70.6 ms | 63.6 ms: 1.11x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x                  | 33.5 ms | 30.6 ms: 1.10x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint       | 35.2 ms | 32.3 ms: 1.09x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300_checkpoint_sync   | 40.6 ms | 37.6 ms: 1.08x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300                   | 23.8 ms | 22.1 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_15x600_checkpoint        | 79.4 ms | 74.2 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_sync                  | 10.4 ms | 9.77 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| wide_dict_25x300_sync                   | 10.3 ms | 9.61 ms: 1.07x faster |
+-----------------------------------------+---------+-----------------------+
| pydantic_state_25x300_checkpoint        | 44.8 ms | 42.2 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_sync                  | 13.5 ms | 12.8 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_sync                   | 13.3 ms | 12.6 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_sync            | 297 ms  | 295 ms: 1.01x faster  |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_sync                    | 22.2 ms | 22.2 ms: 1.00x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint_sync | 313 ms  | 313 ms: 1.00x slower  |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x                         | 26.8 ms | 26.8 ms: 1.00x slower |
+-----------------------------------------+---------+-----------------------+
| serde_allowlist_large                   | 183 us  | 184 us: 1.00x slower  |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint            | 53.5 ms | 54.0 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| sequential_10                           | 2.38 ms | 2.40 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| sequential_10_sync                      | 1.74 ms | 1.76 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_checkpoint_sync        | 47.6 ms | 48.1 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint_sync       | 47.8 ms | 48.4 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| wide_dict_15x600_checkpoint             | 52.9 ms | 53.6 ms: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
| serde_allowlist_small                   | 39.0 us | 39.6 us: 1.01x slower |
+-----------------------------------------+---------+-----------------------+
