Conversation

@cwschilly
Contributor

@cwschilly cwschilly commented Jun 6, 2025

Fixes #10
Fixes #12 (wip)

Adds both double and Kokkos::complex<double> benchmarks for:

  • Level 1, 2, and 3 BLAS kernels
  • DPOTRF

Driver Changes

Instead of passing the dimensions of the matrices used for the benchmarks, specify the number of floating-point operations that each benchmark should perform:

./slownode <iters> <flops>

Each benchmark then chooses its matrix/vector dimensions so that the operation performs approximately this number of floating-point operations.
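
As a rough illustration of that sizing step, here is a minimal Python sketch (the benchmarks themselves are C++/Kokkos). It assumes the textbook flop counts for representative kernels: 2N for a dot product, 2N^2 for a matrix-vector product, 2N^3 for a matrix-matrix product, and N^3/3 for Cholesky (DPOTRF), with complex arithmetic costing about 4x the real flops. The function name `dimension_for_flops` and the choice of kernels are assumptions for illustration; the PR may use different kernels or counts.

```python
def dimension_for_flops(benchmark: str, flops: float, is_complex: bool = False) -> int:
    """Pick a problem dimension N so the kernel performs roughly `flops` operations.

    Hypothetical helper: the flop models below are standard textbook counts,
    not taken from the PR's source code.
    """
    # A complex multiply-add costs about 8 real flops versus 2 for real data.
    per_real = flops / (4.0 if is_complex else 1.0)

    if benchmark == "level1":      # dot product: ~2N flops
        n = per_real / 2.0
    elif benchmark == "level2":    # matrix-vector product: ~2N^2 flops
        n = (per_real / 2.0) ** 0.5
    elif benchmark == "level3":    # matrix-matrix product: ~2N^3 flops
        n = (per_real / 2.0) ** (1.0 / 3.0)
    elif benchmark == "dpotrf":    # Cholesky factorization: ~N^3/3 flops
        n = (3.0 * per_real) ** (1.0 / 3.0)
    else:
        raise ValueError(f"unknown benchmark: {benchmark}")

    # Never return a dimension smaller than 1.
    return max(1, round(n))
```

Under this model a single flop budget yields very different dimensions per level (for example, N = 1000 for level 3 but N = 1,000,000 for level 1 at 2e9 flops).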

Detection Script Changes

When running the detection script, specify which benchmark you want to analyze with the following options:

-b (--benchmark): [level1, level2, level3, dpotrf]
-d (--datatype):  [double, complex]

If these flags are not set, the script defaults to level3 and double.
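
For reference, the flags and defaults described above could be expressed with `argparse` roughly as follows. This is only a sketch: it assumes the detection script is written in Python, and `build_parser` is a hypothetical name, not necessarily what the script uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the -b/--benchmark and -d/--datatype options."""
    parser = argparse.ArgumentParser(description="Slow-node detection (sketch)")
    parser.add_argument(
        "-b", "--benchmark",
        choices=["level1", "level2", "level3", "dpotrf"],
        default="level3",  # default stated in the PR description
        help="benchmark to analyze",
    )
    parser.add_argument(
        "-d", "--datatype",
        choices=["double", "complex"],
        default="double",  # default stated in the PR description
        help="datatype of the benchmark to analyze",
    )
    return parser
```

Calling `build_parser().parse_args([])` returns the defaults, and an unsupported value such as `-b level4` is rejected by `argparse` itself.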

@cwschilly cwschilly linked an issue Jun 6, 2025 that may be closed by this pull request
@cwschilly cwschilly marked this pull request as ready for review June 10, 2025 13:51
@cwschilly cwschilly requested review from lifflander and nlslatt June 10, 2025 15:39
@nlslatt

nlslatt commented Jun 11, 2025

The final value on some gather lines seems highly suspicious.

gather: 95 (nodename): 1.25574: breakdown: 0.125147 0.132657 0.125616 0.131049 0.124445 0.124678 0.125 0.124691 0.126191 0.116261 3.16202e-322 
gather: 91 (nodename): 2.6874e-05: breakdown: 2.507e-06 2.253e-06 2.24e-06 2.489e-06 2.552e-06 2.523e-06 4.249e-06 2.522e-06 2.689e-06 2.85e-06 1.83819
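
One way to check lines like these is to compare the reported total against the sum of the breakdown values with the last one left out. The Python sketch below assumes the format `gather: <rank> (<node>): <total>: breakdown: <values...>` as seen above; the helper names are hypothetical. If the leading values already add up to the total, the trailing value is not part of the per-iteration timings.

```python
import math


def parse_gather(line: str) -> tuple[float, list[float]]:
    """Split a gather line into (total, breakdown values)."""
    head, _, tail = line.partition(": breakdown:")
    total = float(head.rsplit(":", 1)[1])  # text after the last ':' in the head
    values = [float(tok) for tok in tail.split()]
    return total, values


def leading_sum_matches(line: str, rel_tol: float = 1e-3) -> bool:
    """True if all breakdown values except the last already sum to the total."""
    total, values = parse_gather(line)
    return math.isclose(sum(values[:-1]), total, rel_tol=rel_tol)
```

For both lines quoted above, the first ten values sum to the reported total (1.25574 and 2.6874e-05), so the eleventh value looks like a stray extra entry.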

@nlslatt

nlslatt commented Jun 11, 2025

We also need to be able to specify the $N$ for different BLAS levels independently. I have 83 seconds for level 3 complex and 1e-5 for level 1 double.

@cwschilly
Contributor Author

Thanks @nlslatt, these issues should be fixed now.

@cwschilly cwschilly marked this pull request as draft June 16, 2025 14:13
@cwschilly
Contributor Author

Converting to draft while I work on #12

@cwschilly cwschilly marked this pull request as ready for review July 22, 2025 17:19

Development

Successfully merging this pull request may close these issues.

Set problem size based on desired number of flops
Add more benchmarks

3 participants