- Measure the penalty of indirect indexed reads relative to sequential access.
- How much does gather performance depend on index coherence and locality?
- Index distributions: sequential, block_coherent, random, clustered_random.
- Read from a source buffer through a host-generated index buffer with controlled distributions.
- Keep output writes and arithmetic fixed so the result is dominated by read-side access behavior.
- Median GPU time by index distribution.
- Relative slowdown vs the sequential baseline.
- Throughput and useful-payload GB/s.
- Gather cost should track how much locality survives the indirection pattern.
- This is the baseline for later systems that depend on neighbor lookups, sparse reads, or visibility indirection.
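
The host-side setup and the reported metrics can be sketched as follows. This is a minimal illustration in Python, not the experiment's actual code. The helper names (`make_indices`, `summarize`), the `block` and `window` parameters, and the exact definitions of block_coherent and clustered_random are assumptions made for the example. Every variant is built as a permutation of the same index range, so all distributions read the same set of elements and differ only in order. The useful-payload figure here counts only the gathered source bytes, which is also an assumption.

```python
import random
import statistics


def make_indices(kind, n, block=32, window=1024, seed=0):
    """Build n gather indices into an n-element source buffer (hypothetical helper)."""
    rng = random.Random(seed)
    if kind == "sequential":
        # Baseline: index i reads element i.
        return list(range(n))
    if kind == "block_coherent":
        # Assumed definition: shuffle whole blocks, keep each block contiguous.
        starts = list(range(0, n, block))
        rng.shuffle(starts)
        return [i for s in starts for i in range(s, min(s + block, n))]
    if kind == "clustered_random":
        # Assumed definition: shuffle only within fixed-size windows.
        out = []
        for s in range(0, n, window):
            chunk = list(range(s, min(s + window, n)))
            rng.shuffle(chunk)
            out.extend(chunk)
        return out
    if kind == "random":
        # Full permutation: no locality is preserved on purpose.
        out = list(range(n))
        rng.shuffle(out)
        return out
    raise ValueError(f"unknown index distribution: {kind}")


def summarize(times_ms, n, bytes_per_element=4):
    """Reduce per-distribution GPU timings (ms) to the reported metrics.

    times_ms maps a distribution name to a list of timing samples and
    must contain a "sequential" entry, which serves as the baseline.
    """
    baseline = statistics.median(times_ms["sequential"])
    rows = {}
    for kind, samples in times_ms.items():
        median_ms = statistics.median(samples)
        seconds = median_ms / 1e3
        rows[kind] = {
            "median_ms": median_ms,
            "slowdown": median_ms / baseline,
            "elements_per_s": n / seconds,
            # Assumption: payload counts gathered source bytes only,
            # not index reads or output writes.
            "useful_gb_per_s": n * bytes_per_element / seconds / 1e9,
        }
    return rows
```

In use, each index list would be uploaded once as the index buffer, and the kernel would perform the same `out[i] = src[idx[i]]` gather for every variant, so only the read pattern changes between runs.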