Add GPU kernel for `calc_sources!` by MarcoArtiano · Pull Request #3012 · trixi-framework/Trixi.jl

MarcoArtiano · 2026-05-17T20:38:10Z

I first naively implemented the approach following the other existing kernel, with one GPU thread per element. However, it was as slow as computing the surface fluxes. Here, one GPU thread is launched per quadrature node (i, j, element), which makes the kernel roughly 6 times faster (see the timings below).

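Timings of the `source terms` section:

| Kernel | ncalls | time | %tot | avg | alloc | %tot | avg |
|---|---|---|---|---|---|---|---|
| Per element | 19.2k | 4.89s | 12.0% | 254μs | 143MiB | 10.5% | 7.62KiB |
| Per quadrature node | 19.2k | 792ms | 2.0% | 41.2μs | 104MiB | 8.1% | 5.55KiB |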
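
For readers unfamiliar with the two launch strategies compared above, here is a minimal CPU-only sketch in plain Julia. It is not the kernel from this PR: `source`, `sources_per_element!`, and `sources_per_node!` are hypothetical names, the loops only stand in for GPU threads, and the `(i, j, element)` array layout for a single variable is an assumption.

```julia
# Hypothetical stand-in for a pointwise source term (not Trixi.jl's API).
source(u) = -2 * u

# Per-element variant: one work item per element,
# each work item loops over all nodes of its element.
function sources_per_element!(du, u)
    for element in axes(u, 3)              # on a GPU: one thread per element
        for j in axes(u, 2), i in axes(u, 1)
            du[i, j, element] += source(u[i, j, element])
        end
    end
    return du
end

# Per-node variant: one work item per quadrature node,
# so no inner loop remains in the body.
function sources_per_node!(du, u)
    for I in CartesianIndices(u)           # on a GPU: one thread per (i, j, element)
        du[I] += source(u[I])
    end
    return du
end

u = rand(4, 4, 8)                          # 4x4 nodes, 8 elements (assumed sizes)
du_element = sources_per_element!(zeros(size(u)), u)
du_node = sources_per_node!(zeros(size(u)), u)
```

Both variants produce the same `du`; only the amount of work per work item differs. In 2D the per-node form exposes `nnodes(dg)^2` times as many independent work items, which is presumably why it keeps the GPU busier.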

ps: it may also be worth trying `(nnodes(dg)*nnodes(dg), nelements(dg, cache))` and reconstructing the indices `i` and `j` in the kernel.
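
A sketch of what reconstructing the indices could look like, assuming a 1-based flattened node index `k` in column-major order (`i` runs fastest). `node_indices` is a hypothetical helper for illustration, not part of Trixi.jl or of this PR.

```julia
# Hypothetical helper (not part of Trixi.jl): recover the node indices (i, j)
# from a flattened 1-based index k, assuming k = i + (j - 1) * n.
@inline function node_indices(k, n)
    j0, i0 = divrem(k - 1, n)   # zero-based (j, i)
    return i0 + 1, j0 + 1
end

n = 4                           # stands in for nnodes(dg)
recovered = [node_indices(k, n) for k in 1:(n * n)]
```

The same mapping is available from Base as `Tuple(CartesianIndices((n, n))[k])`; the explicit `divrem` version just makes the arithmetic visible.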

github-actions · 2026-05-17T20:38:20Z

codecov · 2026-05-17T21:56:14Z

Codecov Report

❌ Patch coverage is 81.69014% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.09%. Comparing base (a70bfa2) to head (8fda9ee).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/solvers/dgsem_p4est/dg_3d_gpu.jl | 75.00% | 11 Missing ⚠️ |
| src/solvers/dgsem_p4est/dg_2d_gpu.jl | 86.67% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3012      +/-   ##
==========================================
- Coverage   97.09%   97.09%   -0.01%     
==========================================
  Files         630      631       +1     
  Lines       48881    48911      +30     
==========================================
+ Hits        47461    47487      +26     
- Misses       1420     1424       +4

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `97.09% <81.69%> (-0.01%)` | ⬇️ |

ranocha · 2026-05-18T05:09:45Z

> ps: it may also be worth trying `(nnodes(dg)*nnodes(dg), nelements(dg, cache))` and to reconstruct the indices `i` and `j` in the kernel.

Would you expect performance improvements coming from this, @vchuravy?

MarcoArtiano · 2026-05-18T12:04:37Z

> > ps: it may also be worth trying `(nnodes(dg)*nnodes(dg), nelements(dg, cache))` and to reconstruct the indices `i` and `j` in the kernel.
>
> Would you expect performance improvements coming from this, @vchuravy?

Comment Flux Diff GPU: here I tested these two cases and didn't notice any major differences. I'm not sure whether one option scales better than the other.

ranocha · 2026-05-18T15:13:18Z

The GPU CI job on buildkite fails. Could you please check what is going on there? Please ping me when your PR improving the performance of applying the Jacobian has been merged and this PR has been updated according to the new structure.

MarcoArtiano · 2026-05-18T17:30:11Z

This PR will be ready for review once #3013 is merged and the conflicts are resolved.

MarcoArtiano · 2026-05-18T21:42:56Z

@ranocha this is ready for another round of review.

ranocha · 2026-05-19T18:49:09Z

Can you please adapt the allocation check? https://github.com/trixi-framework/Trixi.jl/actions/runs/26111095627/job/76788248592?pr=3012#step:7:1211

MarcoArtiano · 2026-05-20T10:32:15Z

Is the code coverage expected to fail, even if we are running KA with CPU backend?

benegee · 2026-05-20T12:13:30Z

> Is the code coverage expected to fail, even if we are running KA with CPU backend?

I think you are good!
The KA tests were explicitly added to get coverage reports for the `backend::Backend` specializations. However, some parts like `@index` will still not be detected.

ranocha

Thanks!

JoshuaLampert

Thanks! I have two questions regarding the tests.

ranocha

Thanks! I just have a few suggestions to prepare my PR adapting equations as well.

ranocha

Thanks for your patience!

ranocha · 2026-05-21T18:35:38Z

Codecov upload failed, all tests passed (except known buildkite issues on the AMD server); I will merge this PR.

add source term gpu kernel

d50532f

add dispatch for no source terms

453f04d

MarcoArtiano added the gpu label May 17, 2026

add 1d and 3d source gpu kernels

5b7e0e8

MarcoArtiano changed the title ~~WIP: Add 2D GPU kernel for calc_sources!~~ WIP: Add GPU kernel for calc_sources! May 17, 2026

specify signature

158e481

add tests

8f8b0ef

MarcoArtiano marked this pull request as ready for review May 18, 2026 12:11

MarcoArtiano changed the title ~~WIP: Add GPU kernel for calc_sources!~~ Add GPU kernel for calc_sources! May 18, 2026

MarcoArtiano mentioned this pull request May 18, 2026

GPU porting #2822 (open, 31 tasks)

JoshuaLampert requested changes May 18, 2026

Comment thread test/test_amdgpu_2d.jl Outdated

JoshuaLampert and others added 2 commits May 18, 2026 17:17

Update test/test_amdgpu_2d.jl

5add261

Update test/test_amdgpu_2d.jl

0a55992

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

MarcoArtiano changed the base branch from main to ma/jacobian_gpu May 18, 2026 16:43

MarcoArtiano changed the base branch from ma/jacobian_gpu to main May 18, 2026 16:43

MarcoArtiano added 3 commits May 18, 2026 18:54

refactoring

244504d

format

d897a73

refactor also 3D kernels

74e140d

Merge branch 'main' into ma/source_gpu

7ecf686

github-actions Bot reviewed May 18, 2026

Comment thread src/solvers/dgsem_p4est/dg_2d_gpu.jl Outdated

Comment thread src/solvers/dgsem_p4est/dg_2d_gpu.jl Outdated

MarcoArtiano added 2 commits May 18, 2026 22:18

format

f1209fe

fix tests

c978036

fix tests

5a5948f

Merge branch 'main' into ma/source_gpu

6d8c9d0

MarcoArtiano added 3 commits May 19, 2026 21:21

fix allocations

031528d

fix allocations 3D

6044e49

fix allocations

e19fe5f

ranocha requested a review from JoshuaLampert May 21, 2026 06:08

ranocha previously approved these changes May 21, 2026

Merge branch 'main' into ma/source_gpu

8fda9ee

JoshuaLampert requested changes May 21, 2026

Apply suggestions from code review

f3e650d

Co-authored-by: Joshua Lampert <51029046+JoshuaLampert@users.noreply.github.com>

MarcoArtiano dismissed ranocha’s stale review via f3e650d May 21, 2026 12:01

MarcoArtiano commented May 21, 2026

Comment thread test/test_amdgpu_3d.jl Outdated

MarcoArtiano added 2 commits May 21, 2026 14:02

Update test/test_amdgpu_3d.jl

b15439e

Merge branch 'main' into ma/source_gpu

594eefd

JoshuaLampert requested changes May 21, 2026

Comment thread test/test_cuda_3d.jl Outdated

Comment thread test/test_kernelabstractions.jl Outdated

Apply suggestions from code review

7f62e6e

Co-authored-by: Joshua Lampert <51029046+JoshuaLampert@users.noreply.github.com>

JoshuaLampert previously approved these changes May 21, 2026

Merge branch 'main' into ma/source_gpu

a743ff5

MarcoArtiano commented May 21, 2026

Comment thread examples/p4est_2d_dgsem/elixir_euler_source_terms.jl

Comment thread examples/p4est_3d_dgsem/elixir_euler_source_terms.jl

add comments about storage_type and real_type choices

d4f6ff9

Co-authored-by: Marco Artiano <57838732+MarcoArtiano@users.noreply.github.com>

MarcoArtiano dismissed JoshuaLampert’s stale review via d4f6ff9 May 21, 2026 14:59

Merge branch 'main' into ma/source_gpu

d66b690

ranocha reviewed May 21, 2026

Apply suggestions from code review

34215b6

Co-authored-by: Hendrik Ranocha <ranocha@users.noreply.github.com>

ranocha approved these changes May 21, 2026

ranocha merged commit de4ab97 into main May 21, 2026
35 of 39 checks passed

ranocha deleted the ma/source_gpu branch May 21, 2026 18:35

github-actions Bot commented May 17, 2026

Review checklist

Purpose and scope

Code quality

Documentation

Testing

Performance

Verification
