Commit f7fd3b5
[webgpu] Register GQA based on graph capture (microsoft#26384)
This pull request conditionally registers GQA with
`total_sequence_length` on the GPU or on the CPU, depending on whether
graph capture is enabled. It resolves the issue that a MemcpyToHost node
is generated when graph capture is enabled (see microsoft#25868). This is
the last piece of functionality needed to support graph capture in the
WebGPU EP in ORT.
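
The conditional registration described above can be pictured with a minimal sketch. This is not the ORT kernel registration API: `KernelDefSketch`, `MakeGqaKernelDef`, and `NeedsMemcpyToHost` are hypothetical names used only for illustration, and the input index 6 for `total_sequence_length` is an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a kernel definition; NOT the ORT KernelDefBuilder API.
struct KernelDefSketch {
  std::string op_name;
  std::set<int> cpu_inputs;  // indices of inputs the kernel expects in CPU memory
};

// Assumed input index of total_sequence_length in GroupQueryAttention.
constexpr int kTotalSequenceLengthInput = 6;

// Builds the kernel definition, declaring total_sequence_length as a CPU input
// only when graph capture is disabled.
KernelDefSketch MakeGqaKernelDef(bool graph_capture_enabled) {
  KernelDefSketch def{"GroupQueryAttention", {}};
  if (!graph_capture_enabled) {
    // Without graph capture the kernel may read the scalar on the host.
    def.cpu_inputs.insert(kTotalSequenceLengthInput);
  }
  // With graph capture the input stays on the GPU, so no host copy is requested.
  return def;
}

// A device-to-host copy is needed only if a GPU-produced value feeds an input
// that the kernel declared as CPU-resident.
bool NeedsMemcpyToHost(const KernelDefSketch& def, int input_index, bool produced_on_gpu) {
  assert(input_index >= 0);
  return produced_on_gpu && def.cpu_inputs.count(input_index) > 0;
}
```

In this model, `NeedsMemcpyToHost(MakeGqaKernelDef(true), kTotalSequenceLengthInput, true)` is false, which mirrors the goal of avoiding the MemcpyToHost node under graph capture.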
The main changes ensure that when graph capture is enabled, sequence
length information is read from GPU buffers instead of CPU memory, and
shader code generation adapts accordingly. This enables more efficient
execution and compatibility with graph-captured models.
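
As a rough illustration of how shader code generation might adapt, the sketch below emits a different WGSL statement depending on whether graph capture is enabled. The function name `EmitTotalSequenceLength`, the uniform name `uniforms.total_sequence_length`, and the assumption that the sequence-length buffer stores the total length minus one are all illustrative and are not taken from the actual change.

```cpp
#include <cassert>
#include <string>

// Returns a WGSL statement that defines `total_sequence_length`.
// Hypothetical helper for illustration; not the ORT shader helper API.
std::string EmitTotalSequenceLength(bool graph_capture_enabled,
                                    const std::string& seqlen_buffer_name) {
  assert(!seqlen_buffer_name.empty());
  if (graph_capture_enabled) {
    // Read the value from a GPU storage buffer inside the shader.
    // Assumption: the buffer holds (total sequence length - 1).
    return "let total_sequence_length = u32(" + seqlen_buffer_name + "[0]) + 1u;";
  }
  // Otherwise the host already knows the value and passes it as a uniform.
  return "let total_sequence_length = uniforms.total_sequence_length;";
}
```

Calling `EmitTotalSequenceLength(true, "seqlen_k")` yields a statement that indexes the `seqlen_k` buffer on the GPU, while the `false` case relies on a value the CPU supplied at dispatch time.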
In this PR, we still get the total sequence length from the `seqlen_k`
tensor rather than the `total_seqlen_tensor` tensor, to stay consistent
with other parts of the code. In the next PR, we can refactor all places
to use `total_seqlen_tensor` directly instead of `seqlen_k` when graph
capture is enabled.

1 parent 3a6a4c2 commit f7fd3b5
File tree
9 files changed, +104 -57 lines changed
- onnxruntime
  - contrib_ops
    - cpu/bert
    - webgpu
      - bert
  - core/providers/webgpu
Lines changed: 14 additions & 14 deletions
Lines changed: 27 additions & 12 deletions
Lines changed: 23 additions & 12 deletions
Lines changed: 2 additions & 0 deletions