Core Experiment Plans Index (Lecture Notes Track)

This index points to the detailed, lecture-note-style plan for each core experiment.

Memory Optimization Sequence

Core Experiment Plans

Experiment 01: Dispatch Basics: Minimal Vulkan compute dispatch, correctness path, and baseline GPU timing.
Experiment 02: Local Size Sweep: Workgroup sizing and execution efficiency tradeoffs.
Experiment 03: Memory Copy Baseline: Raw buffer read/write/copy throughput characterization.
Experiment 04: Sequential Indexing: Ideal contiguous thread-to-data mapping as a good-path baseline.
Experiment 05: Global ID Mapping Variants: Direct, offset, and grid-stride mapping behavior.
Experiment 06: AoS vs SoA: Array-of-Structures versus Structure-of-Arrays layout efficiency.
Experiment 07: AoSoA or Blocked Layout: Hybrid layout balancing vector locality and contiguous field access.
Experiment 08: std430 vs std140 vs Packed: Shader buffer layout standards and padding cost.
Experiment 09: vec3, vec4, and Padding Costs: Impact of vector shape choice on storage efficiency and bandwidth.
Experiment 10: Scalar Type Width Sweep: Precision-width tradeoffs across 32-bit, 16-bit, and narrower storage.
Experiment 11: Coalesced vs Strided Access: Contiguous and strided load behavior.
Experiment 12: Gather Access Pattern: Indirect indexed reads through an index buffer.
Experiment 13: Scatter Access Pattern: Indirect indexed writes and contention behavior.
Experiment 14: Read Reuse and Cache Locality: Temporal locality and reuse-distance effects.
Experiment 15: Bandwidth Saturation Sweep: Scaling data volume until practical bandwidth plateau.
Experiment 16: Shared or Workgroup Memory Tiling: Staging data in on-chip memory for reuse.
Experiment 17: Tile Size Sweep: Tradeoff between reuse, shared-memory pressure, and occupancy.
Experiment 18: Register Pressure Proxy Study: Effect of increased per-thread temporary state.
Experiment 19: Branch Divergence: Control-flow divergence within warp or wave execution.
Experiment 20: Barrier and Synchronization Cost: Synchronization overhead characterization.
Experiment 21: Parallel Reduction: Reduction patterns from naive to tree-based and shared-memory-optimized variants.
Experiment 22: Prefix Sum or Scan: Inclusive/exclusive scan as a foundational parallel primitive.
Experiment 23: Histogram and Atomic Contention: Atomic update contention and privatization strategies.
Experiment 24: Stream Compaction: Flag, scan, and compact-write pipeline.
Experiment 25: Spatial Binning or Clustered Culling Capstone: Rendering-style compute pipeline combining prior primitives.
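
As a quick orientation for Experiments 22 and 24, the flag, scan, and compact-write pipeline can be sketched on the CPU. This is a minimal reference sketch in Python, not the plan's shader code; the names `exclusive_scan` and `compact` are hypothetical and exist only for illustration. A reference like this may be useful as a correctness oracle for the GPU versions.

```python
from itertools import accumulate

def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    inclusive = list(accumulate(values))
    return ([0] + inclusive[:-1]) if values else []

def compact(data, keep):
    """Stream compaction in three passes: flag, scan, compact-write.

    `keep` is a hypothetical predicate chosen by the caller.
    """
    # Pass 1 (flag): 1 where the element survives, 0 otherwise.
    flags = [1 if keep(x) else 0 for x in data]
    # Pass 2 (scan): each survivor's output slot is the count of survivors before it.
    offsets = exclusive_scan(flags)
    # Total survivors = last offset + last flag.
    total = (offsets[-1] + flags[-1]) if data else 0
    out = [None] * total
    # Pass 3 (compact-write): scatter survivors to their computed slots.
    for i, x in enumerate(data):
        if flags[i]:
            out[offsets[i]] = x
    return out
```

On a GPU each pass would typically map to one or more dispatches, with the scan being the part that needs cross-thread cooperation; the sequential loop here only shows the data flow.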

Priority Extensions Beyond Core 25

Experiment 26: Warp-Level Coalescing Alignment: Aligned vs misaligned contiguous accesses at warp granularity.
Experiment 27: Cache Thrashing, Random vs Sequential: Healthy locality versus deliberate cache defeat.
Experiment 28: Device-Local vs Host-Visible Heap Placement: Dispatch-only and end-to-end cost of host-visible buffers versus staged device-local placement.
Experiment 29: Shared Memory Bank Conflict Study: Stride-driven shared-memory bank conflicts and the padding fix.
Experiment 30: Subgroup Reduction Variants: Compare shared-tree reduction with subgroup-assisted reduction.
Experiment 31: Subgroup Scan Variants: Block-local inclusive scan using shared memory versus subgroup intrinsics.
Experiment 32: Subgroup Stream Compaction Variants: Per-workgroup compaction using shared atomics versus subgroup ballot ranking.
Experiment 33: 2D Locality and Transpose Study: Row-major copy versus naive and tiled transpose access patterns.
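
The padding fix named in Experiment 29 can be illustrated with a small bank-index model. This is a sketch under stated assumptions: 32 banks, each one 4-byte word wide, and 32 lanes reading one column of a tile. Real hardware may differ in bank count, bank width, and broadcast behavior. The function `conflict_degree` is a hypothetical helper, not part of any API.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, one 4-byte word each

def conflict_degree(row_pitch, column=0, lanes=32, num_banks=NUM_BANKS):
    """Worst-case number of lanes that hit the same bank.

    Lane i reads the word at (row i, column) of a tile whose rows are
    `row_pitch` words apart in shared memory.
    """
    banks = Counter((lane * row_pitch + column) % num_banks
                    for lane in range(lanes))
    return max(banks.values())

# Column read of a 32-word-wide tile: every lane maps to the same bank.
unpadded = conflict_degree(row_pitch=32)
# Padding each row by one word spreads the lanes across all banks.
padded = conflict_degree(row_pitch=33)
```

Under this model, a row pitch of 32 words serializes all 32 lanes on one bank, while a pitch of 33 words gives each lane its own bank, at the cost of one unused word per row.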
