Commit 9a85e4a
test: add KV cache subgraph tests with dynamic shapes and generation loops
Add tests/py/dynamo/hlo/test_kv_cache.py covering five KV cache patterns
common in LLM inference:
- DynamicCache: growing cache via torch.cat
- StaticCache: fixed-size cache with index_copy_ writes
- StaticScatterCache: scatter-based position-indexed writes
- SlidingWindowCache: fixed-window rolling cache via cat+slice
- RoPEDynamicCache: dynamic cache with rotary position embeddings
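For orientation, a minimal sketch of the first four cache-write patterns listed above. This is not code from test_kv_cache.py: the function names and the (batch, heads, seq, head_dim) layout are assumptions made for illustration, and the RoPE variant is left out.

```python
import torch

# Assumed layout for all tensors: (batch, heads, seq, head_dim).
# Function names are hypothetical, not taken from the test file.

def dynamic_update(cache, new):
    """Growing cache: append new entries along the sequence dim."""
    return torch.cat([cache, new], dim=2)

def static_update(cache, new, pos):
    """Fixed-size cache: in-place write at sequence positions `pos`."""
    # pos is a 1-D LongTensor with one index per new token
    return cache.index_copy_(2, pos, new)

def scatter_update(cache, new, pos):
    """Fixed-size cache: scatter-based write at sequence positions `pos`."""
    # scatter_ takes an index tensor shaped like the source
    idx = pos.view(1, 1, -1, 1).expand_as(new)
    return cache.scatter_(2, idx, new)

def sliding_update(cache, new, window):
    """Rolling cache: append, then keep only the last `window` positions."""
    return torch.cat([cache, new], dim=2)[:, :, -window:, :]
```

The dynamic and sliding variants return a new tensor on every step, while the static and scatter variants mutate a preallocated buffer, so only the first two change the sequence dimension during a generation loop.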
Each test class exports its module with torch.export using dynamic
shape dims, compiles it with torch_tensorrt.dynamo.compile, runs a
multi-step generation loop (8–20 steps), and validates the TRT output
against a PyTorch reference across 6 configurations (varying batch
size, head count, hidden dim, and fp16/fp32 precision).
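The loop-and-compare shape of such a test can be sketched as follows. This is an assumption-laden illustration, not the test file's code: compare_over_steps and attend are hypothetical names, the torch.export / torch_tensorrt.dynamo.compile step is not shown, and PyTorch's built-in scaled_dot_product_attention stands in for the compiled module so the sketch runs on CPU without TensorRT.

```python
import torch
import torch.nn.functional as F

def attend(q, k, v):
    """Reference single-token attention over the cached keys and values."""
    scores = q @ k.transpose(-1, -2) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

def compare_over_steps(ref_fn, cand_fn, steps=8, batch=2, heads=4, dim=16, seed=0):
    """Run a generation loop and check cand_fn against ref_fn at every step.

    The cache grows by one position per step, so the sequence dimension
    seen by both functions changes on each call.
    """
    gen = torch.Generator().manual_seed(seed)
    k_cache = torch.zeros(batch, heads, 0, dim)
    v_cache = torch.zeros(batch, heads, 0, dim)
    max_err = 0.0
    for _ in range(steps):
        # One new query/key/value token per step
        q, k, v = (torch.randn(batch, heads, 1, dim, generator=gen) for _ in range(3))
        k_cache = torch.cat([k_cache, k], dim=2)
        v_cache = torch.cat([v_cache, v], dim=2)
        ref = ref_fn(q, k_cache, v_cache)
        cand = cand_fn(q, k_cache, v_cache)
        # Tolerances here are illustrative, not the values used in the tests
        torch.testing.assert_close(cand, ref, rtol=1e-4, atol=1e-4)
        max_err = max(max_err, (cand - ref).abs().max().item())
    return max_err

# Stand-in candidate: in a real test this would be the compiled module.
worst = compare_over_steps(attend, F.scaled_dot_product_attention, steps=8)
```

Checking at every step rather than only at the end localizes a mismatch to the sequence length at which it first appears.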
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent df261d5 commit 9a85e4a
2 files changed
Lines changed: 913 additions & 0 deletions
Whitespace-only changes.