Conversation

@jeremylt
Member

@jeremylt jeremylt commented Jun 3, 2025

This PR adds AtPoints assembly kernels to */gen

  • Cuda Diagonal
  • Hip Diagonal

And then for the next PR I'll do

  • Cuda Full
  • Hip Full

@jeremylt jeremylt self-assigned this Jun 3, 2025
@jeremylt jeremylt force-pushed the jeremy/assemble-op-gen-at-points branch 10 times, most recently from 58b80f9 to 55b5b14 on June 9, 2025 16:48
@jeremylt jeremylt added 1-In Review and removed 0-WIP labels Jun 9, 2025
@jeremylt
Member Author

jeremylt commented Jun 9, 2025

Marked as ready for review - this passes the Ratel test suite

cc @zatkins-dev because you were interested in the code generation - happy to step through the content with you in a screen share or otherwise

@jeremylt jeremylt force-pushed the jeremy/assemble-op-gen-at-points branch 4 times, most recently from b548db9 to 94d00b3 on June 10, 2025 20:30
@jeremylt
Member Author

Note - we should also test somewhere with Ratel + this branch + HIP before merging

@jeremylt
Member Author

Note to me - I forgot to generalize to multiple active bases. It's a small, easy fix:

  1. array const CeedInt num_active_in[2] = {1, 3}; (or however many there are, with whichever indices are active)
  2. array const CeedInt active_r_e_in[2] = {r_e_in_1, r_e_in_3};
  3. loop for (CeedInt b = 0; b < num_active_in; b++) {
  4. update n loop as needed
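
The steps above can be sketched roughly as follows. This is a minimal host-side illustration, not the generated kernel code: `SumOverActiveInputs` and `active_sizes` are hypothetical names, `CeedInt` and `CeedScalar` are stand-in typedefs, and `num_active_in` is taken to be the count of active inputs, as the loop bound in step 3 suggests.

```cpp
using CeedInt    = int;     // assumption: stand-in for libCEED's CeedInt
using CeedScalar = double;  // assumption: stand-in for libCEED's CeedScalar

// Hypothetical sketch: visit every entry of every active input.
// active_r_e_in[b] points at the data for the b-th active input (step 2),
// active_sizes[b] is how many entries that input has (assumed bound for the n loop).
CeedScalar SumOverActiveInputs(const CeedScalar *const *active_r_e_in, const CeedInt *active_sizes, CeedInt num_active_in) {
  CeedScalar total = 0.0;
  for (CeedInt b = 0; b < num_active_in; b++) {     // step 3: loop over active inputs
    for (CeedInt n = 0; n < active_sizes[b]; n++) {  // step 4: n loop bounded per input
      total += active_r_e_in[b][n];
    }
  }
  return total;
}
```

The point of the sketch is only the loop structure: the outer loop index `b` is a runtime value, so anything inside the loop that depends on `b` has to be selectable at run time.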

@jeremylt
Member Author

Ok, the plan above won't actually work because of templates. I'm restricting this to the case where we have a single active basis. That means we'll get the best performance if we fieldsplit when we have multiple active bases, which is the plan anyway so that we can use pMG on the displacement block.
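
The thread does not spell out the template issue. One common way such a constraint arises in C++, shown here as a generic illustration with hypothetical names (`NodesTouched`, `SingleActiveBasis`, `TwoActiveBasesUnrolled`) rather than the actual libCEED templates, is that template arguments must be compile-time constants, so a runtime loop index cannot select them:

```cpp
using CeedInt = int;  // assumption: stand-in for libCEED's CeedInt

// Hypothetical templated helper: the basis size P is fixed at compile time.
template <int P>
CeedInt NodesTouched() {
  return P;
}

// A single active basis works: the template argument is a literal constant.
CeedInt SingleActiveBasis() { return NodesTouched<3>(); }

// With several active bases of different sizes, a runtime loop would need
//   NodesTouched<p_in[b]>();  // does not compile: b is only known at run time
// so each basis would have to be written out by hand instead of looped over.
CeedInt TwoActiveBasesUnrolled() { return NodesTouched<2>() + NodesTouched<3>(); }
```

Under this reading, restricting the kernel to one active basis avoids having to generate a separate hand-unrolled call per basis.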

@zatkins-dev
Collaborator

In that case, I think this is good!

@jeremylt
Member Author

Do you have a machine you can check this branch with Ratel on quickly?

@jeremylt jeremylt mentioned this pull request Jun 11, 2025
@jeremylt jeremylt force-pushed the jeremy/assemble-op-gen-at-points branch from 2376943 to 8663e86 on June 17, 2025 15:37
@jeremylt jeremylt force-pushed the jeremy/assemble-op-gen-at-points branch from 8663e86 to 217761a on June 17, 2025 18:16
@jeremylt jeremylt merged commit 0183ed6 into main Jun 17, 2025
29 checks passed
@jeremylt jeremylt deleted the jeremy/assemble-op-gen-at-points branch June 17, 2025 19:20
3 participants