Releases: trixi-gpu/TrixiCUDA.jl
TrixiCUDA v0.1.0-beta.6
What's New
- Now compatible with CUDA.jl 5.3.3 for Julia 1.10 but still not compatible with Julia 1.11
What's Changed
- Adapt `DGSEM` to enable caching of its data on GPU by @huiyuxie in #117
- Fuse setting diagonal elements to zeros in `derivative_split` into kernels by @huiyuxie in #119
- Optimize volume integral kernels for shock capturing (less common use) by @huiyuxie in #120
- Optimize volume integral kernels for shock capturing (frequent use) by @huiyuxie in #121
- Small optimization patch for flux differencing kernels by @huiyuxie in #122
- Enable double caching: CPU cache and GPU cache by @huiyuxie in #123
- Bump compat for Trixi to 0.9.15 by @huiyuxie in #124
- Update README.md by @huiyuxie in #125
- Optimize volume integral kernels for shock capturing (frequent use) by @huiyuxie in #126
- Add check for shared memory size per thread block by @huiyuxie in #127
- Separate GPU kernels and host functions in different files by @huiyuxie in #128
Full Changelog: v0.1.0-beta.5...v0.1.0-beta.6
TrixiCUDA v0.1.0-beta.5
What's Changed
- Small changes regarding variable name on tests by @huiyuxie in #88
- Add scripts for benchmarking and profiling workflows by @huiyuxie in #92
- Refactor solver and tests to support `Float32` computations by @huiyuxie in #94
- Add one more example to benchmark by @huiyuxie in #95
- Update README.md by @huiyuxie in #96
- Combine similar kernels using cooperative groups by @huiyuxie in #97
- Relax inbounds checking within GPU kernels by @huiyuxie in #99
- Fuse `reset_du!` function into volume integral kernels by @huiyuxie in #100
- Relax inbounds checking in minor GPU kernels by @huiyuxie in #101
- Optimize volume integral kernels by @huiyuxie in #102
- Update README.md by @huiyuxie in #103
- Load package with device property querying by @huiyuxie in #104
- Optimize volume integral kernel for flux differencing by @huiyuxie in #105
- Bump crate-ci/typos from 1.28.1 to 1.29.0 by @dependabot in #106
- Switch to less parallelism to avoid redundant computation by @huiyuxie in #107
- Remove comments for `reset_du!` function in tests by @huiyuxie in #108
- Update some critical comments by @huiyuxie in #111
- Adapt `wrap_array` for GPU arrays by @huiyuxie in #112
- Optimize volume integral kernels for flux differencing by @huiyuxie in #114
- Optimization patch for volume integral kernels by @huiyuxie in #115
- Optimize volume integral kernels for larger arrays (less common use) by @huiyuxie in #116
Full Changelog: v0.1.0-beta.4...v0.1.0-beta.5
TrixiCUDA v0.1.0-beta.4
What's Changed
- Enhance old macros by @huiyuxie in #61
- Optimize kernel configurators by @huiyuxie in #62
- Bump crate-ci/typos from 1.24.6 to 1.25.0 by @dependabot in #65
- Add dependencies to documentation by @huiyuxie in #67
- Enable documentation build again by @huiyuxie in #69
- Bump crate-ci/typos from 1.25.0 to 1.27.3 by @dependabot in #75
- Bump codecov/codecov-action from 4 to 5 by @dependabot in #76
- Refactor CI workflow for sanity check and readability by @huiyuxie in #78
- CompatHelper: bump compat for Trixi to 0.9, (keep existing compat) by @github-actions in #79
- Clean docs build and add docs to CompatHelper by @huiyuxie in #80
- CompatHelper: add new compat entry for Documenter at version 1 for package docs, (keep existing compat) by @github-actions in #81
- Remove macOS tests in CI by @huiyuxie in #82
- Add recent updates to README.md by @huiyuxie in #83
- Fix small typos from README.md by @huiyuxie in #84
- Bump crate-ci/typos from 1.27.3 to 1.28.1 by @dependabot in #85
- Parallelization of compute coefficients functions on GPU by @huiyuxie in #63
- Set version bounds for Trixi.jl to ensure compatibility by @huiyuxie in #86
- Extend margin in JuliaFormatter by @huiyuxie in #87
Full Changelog: v0.1.0-beta.3...v0.1.0-beta.4
TrixiCUDA v0.1.0-beta.3
The next step is to implement the initialization of DG elements, interfaces, boundaries, and mortars. The indicator failure that occurs when coupling the GPU cache in the function call also needs to be checked.
What's Changed
- Update README.md by @huiyuxie in #47
- Rename package as TrixiCUDA.jl by @huiyuxie in #49
- Bump crate-ci/typos from 1.24.5 to 1.24.6 by @dependabot in #48
- Enable documentation for package by @huiyuxie in #50
- Fix README.md by @huiyuxie in #51
- Add more materials to documentation by @huiyuxie in #52
- Create cache for GPU arrays to optimize data transfer by @huiyuxie in #53
- Add new logo to documentation by @huiyuxie in #55
- Change logo for documentation by @huiyuxie in #56
- Finalize package logo by @huiyuxie in #58
- Create cache to store GPU arrays by @huiyuxie in #57
- Refactor tests based on new GPU cache by @huiyuxie in #60
Full Changelog: v0.1.0-beta.2...v0.1.0-beta.3
TrixiCUDA v0.1.0-beta.2
All kernels for tree mesh with DGSEM are complete. Tasks for the next release:
- Implement cache to store the arrays that are frequently used on GPU
- Move data transfer outside of the iterative solver
- Start the process of kernel optimization on GPU
- *Start the implementation for structured mesh with DGSEM on GPU
- *Start the implementation for tree mesh initialization on GPU
The last two items (marked with *) still need to be discussed.
What's Changed
- Bump crate-ci/typos from 1.24.3 to 1.24.5 by @dependabot in #37
- Add shock capturing with `nonconservative_terms::True` by @huiyuxie in #36
- Macro for testing approximate equality for GPU and CPU arrays by @huiyuxie in #38
- Fix boundary flux kernel with multiple dispatches by @huiyuxie in #39
- Flux differencing for 3D with `nonconservative_terms::True` by @huiyuxie in #40
- Interface flux for 3D with `nonconservative_terms::True` by @huiyuxie in #41
- Mortar flux with `nonconservative_terms::True` by @huiyuxie in #42
- Shock capturing with `nonconservative_terms::True` by @huiyuxie in #46
Full Changelog: v0.1.0-beta...v0.1.0-beta.2
TrixiCUDA v0.1.0-beta
Kernels left to be implemented:
- `nonconservative_terms::True` for flux differencing, interface flux, mortar flux (only 3D, and wait for mutable structs MHD to be fixed)
- `nonconservative_terms::True` for shock capturing (1D, 2D, 3D)
Implementation for `VolumeIntegralPureLGLFiniteVolume` is not necessary as @ranocha suggested.
What's Changed
- Add mortar flux kernel with `nonconservative_terms::False` to 2D and 3D by @huiyuxie in #24
- Update README.md by @huiyuxie in #25
- Bump crate-ci/typos from 1.24.1 to 1.24.3 by @dependabot in #28
- Remove unused arguments from kernel function by @huiyuxie in #27
- Use math expression to enhance performance by @huiyuxie in #30
- Refactor and add more tests for DGSEM solver with tree mesh by @huiyuxie in #31
- Add boundary flux kernel with `nonconservative_terms::True` to 1D and 2D by @huiyuxie in #32
- Add volume integral kernel with `volume_integral::VolumeIntegralShockCapturingHG` by @huiyuxie in #34
- Add more compatible examples and update docs by @huiyuxie in #35
Full Changelog: v0.1.0-alpha...v0.1.0-beta
TrixiCUDA v0.1.0-alpha
Here are some kernels left to be implemented for `TreeMesh` with `DGSEM`:
- `calc_mortar_flux!`
- `calc_boundary_flux!` - `nonconservative_terms::True`
- `calc_volume_integral!` - `volume_integral::VolumeIntegralShockCapturingHG` and `volume_integral::VolumeIntegralPureLGLFiniteVolume`
What's Changed
- CompatHelper: add new compat entry for StaticArrays at version 1, (keep existing compat) by @github-actions in #17
- CompatHelper: add new compat entry for StrideArrays at version 0.1, (keep existing compat) by @github-actions in #18
- CompatHelper: add new compat entry for SciMLBase at version 2, (keep existing compat) by @github-actions in #19
- CompatHelper: add new compat entry for SimpleUnPack at version 1, (keep existing compat) by @github-actions in #20
- Drop unpack and use standard destructuring syntax by @ErikQQY in #21
- Bump crate-ci/typos from 1.23.6 to 1.24.1 by @dependabot in #23
New Contributors
- @github-actions made their first contribution in #17
- @ErikQQY made their first contribution in #21
- @dependabot made their first contribution in #23
Full Changelog: https://github.com/czha/TrixiGPU.jl/commits/v0.1.0-alpha