Commit 7aa7818
[None][feat] Add triton paged attention for AutoDeploy (NVIDIA#12642)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
1 parent 4c97a03 commit 7aa7818
File tree
6 files changed: +1886 -2 lines changed
- tensorrt_llm/_torch/auto_deploy/custom_ops/attention
- tests
  - integration
    - defs/accuracy
    - test_lists/test-db
  - unittest/auto_deploy/singlegpu/custom_ops/attention
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 22 | 22 | |
| 23 | 23 | |
| 24 | 24 | |
| 25 | | - |
| | 25 | + |
| 26 | 26 | |
| 27 | 27 | |
| 28 | 28 | |
| | | |
| 34 | 34 | |
| 35 | 35 | |
| 36 | 36 | |
| | 37 | + |
| 37 | 38 | |
| 38 | 39 | |