Commit e042c48
[zephyr] Fix PR review feedback: streaming reads, secondary sort, vectorized boundaries
- Stream Arrow run files via iter_batches() instead of whole-table read to
avoid OOM on large shards that trigger external sort
- Add promote_options="default" to concat_tables in external sort for
schema-evolved items
- Include _sort_secondary in external sort keys and merge key so within-group
order is preserved
- Vectorize _find_group_boundaries using pc.not_equal diff instead of
per-element .as_py() calls — matters for high-cardinality keys
- Store has_sort as a field on _ShardBuffer set at construction instead of
fragile sorts[0] detection
- Add assertion guard in _arrow_reduce_gen for pickled shards

1 parent 05deaf3 commit e042c48
3 files changed
Lines changed: 37 additions & 12 deletions