Description
This issue was found in #112. When running the examples, a scalar indexing error was encountered during integration. Specifically, it can be traced back to the integrate function (see analysis_dg1d.jl, analysis_dg2d.jl, and analysis_dg3d.jl in the callbacks_step directory).
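For context, the integration in those analysis callbacks visits the solution node by node, so single entries of the GPU array end up being read from the host. A minimal sketch of that access pattern (the array layout, names, and loop structure below are illustrative, not the actual Trixi source):

```julia
using CUDA
CUDA.allowscalar(false)

# Hypothetical solution array with layout (variables, nodes, elements)
u = CuArray(rand(Float64, 1, 4, 16))

function integrate_nodewise(u)
    total = zero(eltype(u))
    for element in axes(u, 3), i in axes(u, 2)
        # Reading a single entry of a CuArray from the host is scalar indexing,
        # so this read raises the error once CUDA.allowscalar(false) is set.
        total += u[1, i, element]
    end
    return total
end

integrate_nodewise(u)  # errors with "Scalar indexing is disallowed"
```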
An MWE can be reproduced easily by setting CUDA.allowscalar(false) at the top of any example script.
Also, it can be further narrowed down to this example (updated):
using CUDA
CUDA.allowscalar(false)
using Trixi
using TrixiCUDA
# Set the precision
RealT = Float64
advection_velocity = 1.0
equations = LinearScalarAdvectionEquation1D(advection_velocity)
solver_gpu = DGSEMGPU(polydeg = 3, surface_flux = flux_lax_friedrichs, RealT = RealT)
coordinates_min = -1.0
coordinates_max = 1.0
mesh = TreeMesh(coordinates_min, coordinates_max,
                initial_refinement_level = 4,
                n_cells_max = 30_000)
semi_gpu = SemidiscretizationHyperbolicGPU(mesh, equations, initial_condition_convergence_test,
                                           solver_gpu)
tspan_gpu = (0.0, 1.0)
t_gpu = 0.0
# Semi on GPU
equations_gpu, mesh_gpu, solver_gpu = semi_gpu.equations, semi_gpu.mesh, semi_gpu.solver
cache_gpu, cache_cpu = semi_gpu.cache_gpu, semi_gpu.cache_cpu
boundary_conditions_gpu, source_terms_gpu = semi_gpu.boundary_conditions, semi_gpu.source_terms
# ODE on GPU
ode_gpu = semidiscretizeGPU(semi_gpu, tspan_gpu)
u_gpu_ = copy(ode_gpu.u0)
du_gpu_ = similar(u_gpu_)
u_gpu = TrixiCUDA.wrap_array(u_gpu_, mesh_gpu, equations_gpu, solver_gpu, cache_gpu)
du_gpu = TrixiCUDA.wrap_array(du_gpu_, mesh_gpu, equations_gpu, solver_gpu, cache_gpu)
# This call is the root cause of the scalar indexing error
Trixi.get_node_vars(u_gpu, equations_gpu, solver_gpu, 1, 1)
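The call above fails because get_node_vars assembles an SVector from individual entries of u, so every variable is fetched with a single scalar read on the host. Roughly (a simplified stand-in, not the exact Trixi method, which is specialized on the solver and equations):

```julia
using StaticArrays

# Simplified stand-in for Trixi.get_node_vars: one scalar read per variable.
# On a CuArray-backed u, each u[v, indices...] is host-side scalar indexing.
function get_node_vars_sketch(u, nvars, indices...)
    return SVector(ntuple(v -> u[v, indices...], nvars))
end

u_cpu = rand(Float64, 1, 4, 16)
get_node_vars_sketch(u_cpu, 1, 1, 1)  # fine on a CPU array, errors on u_gpu
```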
Currently, this issue is worked around by setting CUDA.allowscalar(true).
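If flipping the global flag is too broad, one interim option (just a suggestion on my side, not something TrixiCUDA currently does) would be to keep CUDA.allowscalar(false) and scope the scalar access with CUDA.@allowscalar at the call site, continuing from the MWE above:

```julia
using CUDA

# Assumption: only this call needs scalar access; the global
# allowscalar(false) setting stays in effect everywhere else.
node_vars = CUDA.@allowscalar Trixi.get_node_vars(u_gpu, equations_gpu, solver_gpu, 1, 1)
```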