feat[gpu_prover]: try acting on 10 columns (depth-first) instead of 1 in hypercube bench#210

Closed
mcarilli wants to merge 1 commit into rr/hypercube-monomials from mc-butlerian-jihad

@mcarilli

What ❔

"Augments" hypercube bench to act on 10 columns depth-first, instead of 1 column repeatedly.

Why ❔

Possibly more realistic, and eliminates spurious L2 residency of a single column across bench iterations.
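
A CPU-side sketch of one possible reading of "depth-first": every pass runs on one column before the loop moves to the next, so no single column is reused across whole bench iterations. The names `pass` and `bench_iteration_depth_first`, the column-major layout, and the `u32` element type are assumptions for illustration, not the code in this PR.

```rust
/// Number of columns touched per bench iteration (10, per the description above).
const COLS: usize = 10;

/// Hypothetical stand-in for one pass of the per-column kernel.
/// It adds 1 to every element so the result is easy to check.
fn pass(column: &mut [u32]) {
    for x in column.iter_mut() {
        *x = x.wrapping_add(1);
    }
}

/// Depth-first: finish all `passes` on one column before moving to the next.
/// `data` is assumed column-major: column `c` is `data[c * rows..(c + 1) * rows]`.
/// `rows` must be non-zero.
fn bench_iteration_depth_first(data: &mut [u32], rows: usize, passes: usize) {
    for column in data.chunks_exact_mut(rows) {
        for _ in 0..passes {
            pass(column);
        }
    }
}

fn main() {
    let rows = 1 << 4;
    let mut data = vec![0u32; rows * COLS];
    bench_iteration_depth_first(&mut data, rows, 3);
    // Every element of every column has seen exactly 3 passes.
    assert!(data.iter().all(|&x| x == 3));
}
```

On a GPU the same loop structure would launch kernels per column; whether the working set then exceeds L2 depends on `rows`, the element size, and the device.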

Is this a breaking change?

  • Yes
  • No

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted.

let mut d_src = DeviceAllocation::alloc(rows * COLS)?;
let d_dst = DeviceAllocation::alloc(rows * COLS)?;

// Fill once to avoid benchmarking uninitialized memory reads.
Member

Believe me, you can end up benchmarking the allocator here, since it's zeroed memory and may be allocated lazily.

Author

It's a good point that I should ensure all first-touch overhead happens before the benchmark iterations, but this particular spot allocates GPU memory, which doesn't have the same lazy first-touch behavior as CPU RAM.

Anyway, RobertGPT fixed the whole issue a different way here, so my patch is no longer needed.
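
For reference, the first-touch concern from this thread can be sketched on the CPU side as a harness that runs untimed warm-up iterations before the timed ones. This is a minimal illustration, not the fix referenced above; `workload` and `bench` are hypothetical names, and the lazy-zero-page behavior of `vec![0u8; n]` depends on the allocator and OS.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical workload: increment every byte and return the sum.
/// Writing to every element forces each page to be touched.
fn workload(buf: &mut [u8]) -> u64 {
    let mut sum = 0u64;
    for x in buf.iter_mut() {
        *x = x.wrapping_add(1);
        sum += *x as u64;
    }
    sum
}

/// Runs `warmup` untimed iterations so first-touch costs fall outside the
/// measurement, then returns one duration per timed iteration.
fn bench(buf: &mut [u8], warmup: usize, iters: usize) -> Vec<Duration> {
    for _ in 0..warmup {
        black_box(workload(buf));
    }
    (0..iters)
        .map(|_| {
            let start = Instant::now();
            black_box(workload(buf));
            start.elapsed()
        })
        .collect()
}

fn main() {
    // A large zeroed Vec may be backed by lazily mapped pages (assumption:
    // allocator- and OS-dependent), so the first write can incur page faults.
    let mut buf = vec![0u8; 1 << 20];
    let timings = bench(&mut buf, 1, 5);
    println!("{:?}", timings);
}
```

With `warmup` set to 0, the first timed iteration would absorb any page-fault cost, which is the effect the reviewer warns about.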

@mcarilli
Author

mcarilli commented Mar 3, 2026

Fixed by Robert in 7ab528b

@mcarilli mcarilli closed this Mar 3, 2026
@mcarilli mcarilli deleted the mc-butlerian-jihad branch March 3, 2026 13:17
2 participants