Experiment 09: vec3, vec4, and Padding Costs

1. Focus

  • Compare common vector-friendly storage choices against tighter scalarized layouts.

2. Question

  • How much performance is lost when vec3- and vec4-style layouts carry padding that the kernel does not need?

3. Variants

  • split_scalars
  • vec3_padded
  • vec4
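
As a rough sketch of how the three variants might differ in memory, the Python snippet below uses the standard `struct` module to compute per-record byte counts. It assumes each record carries three 32-bit floats of useful payload and that `vec3_padded` and `vec4` both occupy 16 bytes per element (as an array of vec3 does under std430-style rules). The `LAYOUTS` table and the helper names are hypothetical and are not taken from the experiment's source.

```python
import struct

# Assumed useful payload per record: three float32 values (12 bytes).
USEFUL_BYTES = struct.calcsize("<3f")

# Hypothetical per-record formats; "x" is one pad byte in struct notation.
LAYOUTS = {
    "split_scalars": "<3f",    # x, y, z stored tightly (e.g. three float arrays)
    "vec3_padded":   "<3f4x",  # xyz plus 4 pad bytes to reach a 16-byte stride
    "vec4":          "<4f",    # xyzw, with w assumed unused by the kernel
}

def stride_bytes(name):
    """Bytes moved per record for the named layout."""
    return struct.calcsize(LAYOUTS[name])

def padding_fraction(name):
    """Share of each record's bytes that the kernel does not need."""
    stride = stride_bytes(name)
    return (stride - USEFUL_BYTES) / stride
```

Under these assumptions the padded variants move a third more bytes per record than the scalarized one, which gives an upper bound on the slowdown to expect if the kernel is purely bandwidth-bound.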

4. Method

  • Use the same logical per-record values and the same arithmetic in every variant.
  • Change only the storage representation and validate all outputs against the same CPU reference.
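
The method above can be sketched on the CPU side as follows: pack the same logical records into each storage representation, read them back, run identical arithmetic, and compare against one reference. This is only an illustration under stated assumptions. The `kernel` function is placeholder arithmetic, records are assumed to be three floats each, and `vec3_padded` and `vec4` are assumed to share the same 16-byte byte layout; none of these names come from the experiment's code.

```python
import math
import struct

def kernel(x, y, z):
    # Placeholder arithmetic; the experiment's real kernel is not shown here.
    return x * 2.0 + y - z

def pack_split(records):
    # One tightly packed float32 buffer per component (x, y, z).
    n = len(records)
    return [struct.pack(f"<{n}f", *(r[i] for r in records)) for i in range(3)]

def unpack_split(buffers):
    cols = [struct.unpack(f"<{len(b) // 4}f", b) for b in buffers]
    return list(zip(*cols))

def pack_strided(records):
    # 16-byte stride: three floats followed by four unused pad bytes.
    return b"".join(struct.pack("<3f4x", *r) for r in records)

def unpack_strided(buffer):
    return list(struct.iter_unpack("<3f4x", buffer))

def validate(records):
    """Check every layout against the same CPU reference."""
    reference = [kernel(*r) for r in records]
    strided = unpack_strided(pack_strided(records))
    variants = {
        "split_scalars": unpack_split(pack_split(records)),
        "vec3_padded": strided,
        "vec4": strided,  # same bytes under the assumption above
    }
    return {
        name: all(
            math.isclose(kernel(*r), ref, rel_tol=1e-5, abs_tol=1e-6)
            for r, ref in zip(recs, reference)
        )
        for name, recs in variants.items()
    }
```

The tolerance accounts for float32 rounding at pack time. In a real run the comparison would use the GPU's output buffer rather than values unpacked on the CPU.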

5. Outputs

  • Median GPU time by layout.
  • Useful-payload GB/s by layout.
  • Padding overhead relative to the scalarized baseline.
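
One way the three outputs could be derived from raw timings is sketched below. It assumes GPU times are collected in milliseconds per layout, that useful payload counts only the bytes the kernel needs (12 bytes per record here, an assumption), and that padding overhead is defined as the relative increase in median time over `split_scalars`. The function names and the dictionary shape are hypothetical.

```python
import statistics

def useful_gbps(n_records, useful_bytes_per_record, seconds):
    """Throughput counting only the bytes the kernel actually needs."""
    return n_records * useful_bytes_per_record / seconds / 1e9

def summarize(times_ms_by_layout, n_records,
              useful_bytes_per_record=12, baseline="split_scalars"):
    """Median time, useful-payload GB/s, and overhead vs. the baseline layout."""
    medians = {name: statistics.median(times)
               for name, times in times_ms_by_layout.items()}
    base = medians[baseline]
    return {
        name: {
            "median_ms": med,
            "useful_gbps": useful_gbps(n_records, useful_bytes_per_record,
                                       med / 1e3),
            # Assumed definition: extra median time relative to the baseline.
            "padding_overhead": med / base - 1.0,
        }
        for name, med in medians.items()
    }
```

Because the numerator of the GB/s figure is the same for every layout, a padded variant that takes longer reports a lower useful-payload rate even if its raw memory traffic is saturating the bus.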

6. Interpretation

  • Vector-friendly layouts and alignment padding can cost memory bandwidth when the shader fetches padded bytes it never uses.
  • This experiment helps decide when explicit scalar packing is worth the added code complexity.