CUDA.@profile: DataFrames post-processing needs to be optimized #2567

Open
@Technici4n

Describe the bug

Wrapping a computation that takes 20 seconds in CUDA.@profile results in several minutes of post-processing in my case.

Trace:

Thread 1 Task 0x00007f36053fc010 Total snapshots: 542. Utilization: 29%
   ╎542 @Base/client.jl:541; _start()
   ╎ 542 @Base/client.jl:567; repl_main
   ╎  542 @Base/client.jl:430; run_main_repl(interactive::Bool, quiet::Bool, banner::Symbol, history_file::Bool, color_set::Bool)
   ╎   542 @Base/essentials.jl:1052; invokelatest
   ╎    542 @Base/essentials.jl:1055; #invokelatest#2
   ╎     542 @Base/client.jl:446; (::Base.var"#1139#1141"{Bool, Symbol, Bool})(REPL::Module)
   ╎    ╎ 542 @REPL/src/REPL.jl:469; run_repl(repl::REPL.AbstractREPL, consumer::Any)
   ╎    ╎  542 @REPL/src/REPL.jl:483; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool, backend::Any)
   ╎    ╎   542 @REPL/src/REPL.jl:324; kwcall(::NamedTuple, ::typeof(REPL.start_repl_backend), backend::REPL.REPLBackend, consumer::Any)
   ╎    ╎    542 @REPL/src/REPL.jl:327; start_repl_backend(backend::REPL.REPLBackend, consumer::Any; get_module::Function)
   ╎    ╎     542 @REPL/src/REPL.jl:342; repl_backend_loop(backend::REPL.REPLBackend, get_module::Function)
   ╎    ╎    ╎ 542 @REPL/src/REPL.jl:245; eval_user_input(ast::Any, backend::REPL.REPLBackend, mod::Module)
   ╎    ╎    ╎  542 @Base/boot.jl:430; eval
   ╎    ╎    ╎   542 @CUDA/src/profile.jl:267; profile_internally(f::Function)
   ╎    ╎    ╎    542 @CUDA/src/profile.jl:303; profile_internally(f::var"##309#profiled_code"; concurrent::Bool, kwargs::Base.Pairs{Symbol, …
   ╎    ╎    ╎     542 @CUDA/src/profile.jl:360; capture(cfg::CUDA.CUPTI.ActivityConfig)
   ╎    ╎    ╎    ╎ 12  @CUDA/lib/cupti/wrappers.jl:297; process(f::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8"…
  5╎    ╎    ╎    ╎  12  @CUDA/lib/utils/call.jl:34; cuptiActivityGetNextRecord
   ╎    ╎    ╎    ╎   7   @CUDA/lib/cupti/libcupti.jl:24; check
   ╎    ╎    ╎    ╎    7   @CUDA/src/memory.jl:434; retry_reclaim(f::CUDA.CUPTI.var"#142#143"{Ptr{UInt8}, UInt64, Base.RefValue{Ptr{CUDA.CUPT…
   ╎    ╎    ╎    ╎     7   @CUDA/lib/utils/call.jl:35; #142
   ╎    ╎    ╎    ╎    ╎ 6   @CUDA/lib/cupti/libcupti.jl:2585; macro expansion
   ╎    ╎    ╎    ╎    ╎  6   @CUDA/lib/cudadrv/libcuda.jl:21; initialize_context
   ╎    ╎    ╎    ╎    ╎   2   @CUDA/lib/cudadrv/state.jl:94; prepare_cuda_state()
  1╎    ╎    ╎    ╎    ╎    1   @CUDA/lib/cudadrv/state.jl:69; task_local_state!()
   ╎    ╎    ╎    ╎    ╎    1   @CUDA/lib/cudadrv/state.jl:72; task_local_state!()
   ╎    ╎    ╎    ╎    ╎     1   @CUDA/lib/cudadrv/state.jl:61; validate_task_local_state(state::CUDA.TaskLocalState)
   ╎    ╎    ╎    ╎    ╎    ╎ 1   @CUDA/lib/cudadrv/context.jl:73; isvalid(ctx::CuContext)
   ╎    ╎    ╎    ╎    ╎    ╎  1   @CUDA/lib/cudadrv/libcuda.jl:3354; unchecked_cuCtxGetId
  1╎    ╎    ╎    ╎    ╎    ╎   1   @CUDA/lib/utils/call.jl:214; macro expansion
   ╎    ╎    ╎    ╎    ╎   4   @CUDA/lib/cudadrv/state.jl:98; prepare_cuda_state()
   ╎    ╎    ╎    ╎    ╎    4   @CUDA/lib/utils/call.jl:34; cuCtxGetCurrent
   ╎    ╎    ╎    ╎    ╎     4   @CUDA/lib/cudadrv/libcuda.jl:35; check
   ╎    ╎    ╎    ╎    ╎    ╎ 4   @CUDA/lib/utils/call.jl:35; #191
   ╎    ╎    ╎    ╎    ╎    ╎  4   @CUDA/lib/cudadrv/libcuda.jl:3336; macro expansion
  4╎    ╎    ╎    ╎    ╎    ╎   4   @CUDA/lib/utils/call.jl:214; macro expansion
   ╎    ╎    ╎    ╎    ╎ 1   @CUDA/lib/cupti/libcupti.jl:2586; macro expansion
  1╎    ╎    ╎    ╎    ╎  1   @CUDA/lib/utils/call.jl:214; macro expansion
   ╎    ╎    ╎    ╎ 3   @CUDA/lib/cupti/wrappers.jl:301; process(f::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8"…
   ╎    ╎    ╎    ╎  3   @Base/dict.jl:477; getindex
  3╎    ╎    ╎    ╎   3   @Base/essentials.jl:399; getindex
  1╎    ╎    ╎    ╎ 5   @CUDA/lib/cupti/wrappers.jl:302; process(f::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8"…
  4╎    ╎    ╎    ╎  4   @Base/pointer.jl:30; convert
  1╎    ╎    ╎    ╎ 520 @CUDA/lib/cupti/wrappers.jl:304; process(f::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8"…
   ╎    ╎    ╎    ╎  3   @CUDA/src/profile.jl:370; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   3   @CUDA/lib/cupti/libcupti.jl:330; unchecked_cuptiGetCallbackName
   ╎    ╎    ╎    ╎    3   @CUDA/lib/cudadrv/libcuda.jl:21; initialize_context
   ╎    ╎    ╎    ╎     2   @CUDA/lib/cudadrv/state.jl:94; prepare_cuda_state()
   ╎    ╎    ╎    ╎    ╎ 2   @CUDA/lib/cudadrv/state.jl:72; task_local_state!()
   ╎    ╎    ╎    ╎    ╎  2   @CUDA/lib/cudadrv/state.jl:61; validate_task_local_state(state::CUDA.TaskLocalState)
   ╎    ╎    ╎    ╎    ╎   2   @CUDA/lib/cudadrv/context.jl:73; isvalid(ctx::CuContext)
   ╎    ╎    ╎    ╎    ╎    2   @CUDA/lib/cudadrv/libcuda.jl:3354; unchecked_cuCtxGetId
  2╎    ╎    ╎    ╎    ╎     2   @CUDA/lib/utils/call.jl:214; macro expansion
   ╎    ╎    ╎    ╎     1   @CUDA/lib/cudadrv/state.jl:98; prepare_cuda_state()
   ╎    ╎    ╎    ╎    ╎ 1   @CUDA/lib/utils/call.jl:34; cuCtxGetCurrent
   ╎    ╎    ╎    ╎    ╎  1   @CUDA/lib/cudadrv/libcuda.jl:35; check
   ╎    ╎    ╎    ╎    ╎   1   @CUDA/lib/utils/call.jl:35; #191
   ╎    ╎    ╎    ╎    ╎    1   @CUDA/lib/cudadrv/libcuda.jl:3336; macro expansion
  1╎    ╎    ╎    ╎    ╎     1   @CUDA/lib/utils/call.jl:214; macro expansion
   ╎    ╎    ╎    ╎  1   @CUDA/src/profile.jl:373; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   1   @Base/strings/cstring.jl:62; unsafe_string
  1╎    ╎    ╎    ╎    1   @Base/strings/string.jl:104; unsafe_string
   ╎    ╎    ╎    ╎  1   @CUDA/src/profile.jl:377; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   1   @Base/strings/io.jl:189; string
   ╎    ╎    ╎    ╎    1   @Base/strings/io.jl:148; print_to_string(xs::CUDA.CUPTI.CUpti_driver_api_trace_cbid_enum)
   ╎    ╎    ╎    ╎     1   @CEnum/src/CEnum.jl:28; print(io::IOBuffer, x::CUDA.CUPTI.CUpti_driver_api_trace_cbid_enum)
   ╎    ╎    ╎    ╎    ╎ 1   @Base/show.jl:289; print
  1╎    ╎    ╎    ╎    ╎  1   @Base/io.jl:879; write
   ╎    ╎    ╎    ╎  14  @CUDA/src/profile.jl:396; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   14  @DataFrames/src/dataframe/insertion.jl:887; push!(df::DataFrames.DataFrame, row::@NamedTuple{id::UInt32, start::Flo…
   ╎    ╎    ╎    ╎    14  @DataFrames/src/dataframe/insertion.jl:887; #push!#345
   ╎    ╎    ╎    ╎     1   @DataFrames/src/dataframe/insertion.jl:985; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTuple…
  1╎    ╎    ╎    ╎    ╎ 1   @Base/range.jl:908; iterate
  4╎    ╎    ╎    ╎     4   @DataFrames/src/dataframe/insertion.jl:1051; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTupl…
   ╎    ╎    ╎    ╎     3   @DataFrames/src/dataframe/insertion.jl:1055; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTupl…
  3╎    ╎    ╎    ╎    ╎ 3   @Base/namedtuple.jl:168; getindex
  2╎    ╎    ╎    ╎     2   @DataFrames/src/dataframe/insertion.jl:1058; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTupl…
  2╎    ╎    ╎    ╎     2   @DataFrames/src/dataframe/insertion.jl:1059; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTupl…
  2╎    ╎    ╎    ╎     2   @DataFrames/src/dataframe/insertion.jl:1060; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTupl…
   ╎    ╎    ╎    ╎  1   @CUDA/src/profile.jl:434; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   1   @Base/strings/substring.jl:236; string
   ╎    ╎    ╎    ╎    1   @Base/strings/substring.jl:254; _string(::String, ::Vararg{String})
  1╎    ╎    ╎    ╎     1   @Base/tuple.jl:71; iterate
  5╎    ╎    ╎    ╎  488 @CUDA/src/profile.jl:455; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   14  @Base/abstractdict.jl:193; copy(a::Base.EnvDict)
   ╎    ╎    ╎    ╎    6   @Base/abstractdict.jl:223; merge!(d::Dict{String, String}, others::Base.EnvDict)
   ╎    ╎    ╎    ╎     1   @Base/env.jl:234; length
   ╎    ╎    ╎    ╎    ╎ 1   @Base/env.jl:217; iterate
   ╎    ╎    ╎    ╎    ╎  1   @Base/env.jl:226; iterate(::Base.EnvDict, i::Int64)
   ╎    ╎    ╎    ╎    ╎   1   @Base/strings/string.jl:507; getindex
   ╎    ╎    ╎    ╎    ╎    1   @Base/strings/string.jl:131; pointer
  1╎    ╎    ╎    ╎    ╎     1   @Base/pointer.jl:317; -
   ╎    ╎    ╎    ╎     5   @Base/env.jl:236; length
  1╎    ╎    ╎    ╎    ╎ 1   @Base/env.jl:216; iterate(::Base.EnvDict, i::Int64)
  1╎    ╎    ╎    ╎    ╎ 1   @Base/env.jl:220; iterate(::Base.EnvDict, i::Int64)
   ╎    ╎    ╎    ╎    ╎ 3   @Base/env.jl:226; iterate(::Base.EnvDict, i::Int64)
   ╎    ╎    ╎    ╎    ╎  2   @Base/strings/string.jl:506; getindex
  2╎    ╎    ╎    ╎    ╎   2   @Base/strings/string.jl:109; _string_n
   ╎    ╎    ╎    ╎    ╎  1   @Base/strings/string.jl:507; getindex
   ╎    ╎    ╎    ╎    ╎   1   @Base/array.jl:268; unsafe_copyto!
  1╎    ╎    ╎    ╎    ╎    1   @Base/cmem.jl:28; memmove
   ╎    ╎    ╎    ╎    1   @Base/abstractdict.jl:226; merge!(d::Dict{String, String}, others::Base.EnvDict)
  1╎    ╎    ╎    ╎     1   @Base/dict.jl:354; setindex!(h::Dict{String, String}, v0::String, key::String)
   ╎    ╎    ╎    ╎    7   @Base/abstractdict.jl:227; merge!(d::Dict{String, String}, others::Base.EnvDict)
  1╎    ╎    ╎    ╎     1   @Base/boot.jl:0; iterate(::Base.EnvDict, i::Int64)
  4╎    ╎    ╎    ╎     4   @Base/env.jl:218; iterate(::Base.EnvDict, i::Int64)
  1╎    ╎    ╎    ╎     2   @Base/env.jl:226; iterate(::Base.EnvDict, i::Int64)
   ╎    ╎    ╎    ╎    ╎ 1   @Base/strings/string.jl:506; getindex
  1╎    ╎    ╎    ╎    ╎  1   @Base/strings/string.jl:109; _string_n
   ╎    ╎    ╎    ╎   4   @Base/cmd.jl:23; kwcall(::@NamedTuple{env::Dict{String, String}}, ::Type{Cmd}, cmd::Cmd)
   ╎    ╎    ╎    ╎    4   @Base/cmd.jl:31; _#859
   ╎    ╎    ╎    ╎     4   @Base/cmd.jl:248; byteenv(env::Dict{String, String})
   ╎    ╎    ╎    ╎    ╎ 4   @Base/strings/basic.jl:265; *
   ╎    ╎    ╎    ╎    ╎  4   @Base/strings/substring.jl:236; string
  1╎    ╎    ╎    ╎    ╎   1   @Base/strings/substring.jl:240; _string(::String, ::Vararg{String})
   ╎    ╎    ╎    ╎    ╎   3   @Base/strings/substring.jl:255; _string(::String, ::Vararg{String})
  3╎    ╎    ╎    ╎    ╎    3   @Base/strings/string.jl:109; _string_n
  1╎    ╎    ╎    ╎   5   @Base/essentials.jl:1052; invokelatest(::Any, ::Any, ::Vararg{Any})
  4╎    ╎    ╎    ╎    4   @Base/essentials.jl:1055; invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, @…
   ╎    ╎    ╎    ╎   460 @Base/process.jl:491; read
   ╎    ╎    ╎    ╎    61  @Base/process.jl:480; read(cmd::Cmd)
   ╎    ╎    ╎    ╎     61  @Base/process.jl:378; open(cmds::Cmd, mode::String, stdio::Base.DevNull)
   ╎    ╎    ╎    ╎    ╎ 61  @Base/process.jl:397; open
   ╎    ╎    ╎    ╎    ╎  61  @Base/process.jl:407; open(cmds::Cmd, stdio::Base.DevNull; write::Bool, read::Bool)
   ╎    ╎    ╎    ╎    ╎   61  @Base/process.jl:148; _spawn
   ╎    ╎    ╎    ╎    ╎    1   @Base/process.jl:231; setup_stdios(f::Base.var"#883#884"{Cmd}, stdios::Vector{Union{RawFD, Base.FileRedirect,…
   ╎    ╎    ╎    ╎    ╎     1   @Base/bitarray.jl:403; falses
   ╎    ╎    ╎    ╎    ╎    ╎ 1   @Base/bitarray.jl:405; falses
   ╎    ╎    ╎    ╎    ╎    ╎  1   @Base/bitarray.jl:71; BitArray
   ╎    ╎    ╎    ╎    ╎    ╎   1   @Base/bitarray.jl:37; BitArray
  1╎    ╎    ╎    ╎    ╎    ╎    1   @Base/boot.jl:579; Array
  3╎    ╎    ╎    ╎    ╎    11  @Base/process.jl:234; setup_stdios(f::Base.var"#883#884"{Cmd}, stdios::Vector{Union{RawFD, Base.FileRedirect,…
   ╎    ╎    ╎    ╎    ╎     5   @Base/process.jl:249; setup_stdio(stdio::Base.PipeEndpoint, child_readable::Bool)
  5╎    ╎    ╎    ╎    ╎    ╎ 5   @Base/stream.jl:836; link_pipe
   ╎    ╎    ╎    ╎    ╎     1   @Base/process.jl:251; setup_stdio(stdio::Base.PipeEndpoint, child_readable::Bool)
  1╎    ╎    ╎    ╎    ╎    ╎ 1   @Base/stream.jl:807; open_pipe!
  1╎    ╎    ╎    ╎    ╎     2   @Base/tuple.jl:159; indexed_iterate(t::Tuple{Base.DevNull, Bool}, i::Int64, state::Int64)
  1╎    ╎    ╎    ╎    ╎    ╎ 1   @Base/tuple.jl:159; indexed_iterate
   ╎    ╎    ╎    ╎    ╎    49  @Base/process.jl:236; setup_stdios(f::Base.var"#883#884"{Cmd}, stdios::Vector{Union{RawFD, Base.FileRedirect,…
   ╎    ╎    ╎    ╎    ╎     49  @Base/process.jl:149; #883
   ╎    ╎    ╎    ╎    ╎    ╎ 49  @Base/process.jl:157; _spawn
 49╎    ╎    ╎    ╎    ╎    ╎  49  @Base/process.jl:119; _spawn_primitive(file::String, cmd::Cmd, stdio::Memory{Union{RawFD, Base.SyncCloseFD…
  1╎    ╎    ╎    ╎    399 @Base/process.jl:481; read(cmd::Cmd)
   ╎    ╎    ╎    ╎     398 @Base/stream.jl:953; read(stream::Base.PipeEndpoint)
   ╎    ╎    ╎    ╎    ╎ 398 @Base/stream.jl:416; wait_readnb(x::Base.PipeEndpoint, nb::Int64)
   ╎    ╎    ╎    ╎    ╎  398 @Base/condition.jl:125; wait
   ╎    ╎    ╎    ╎    ╎   398 @Base/condition.jl:130; wait(c::Base.GenericCondition{Base.Threads.SpinLock}; first::Bool)
   ╎    ╎    ╎    ╎    ╎    389 @Base/task.jl:1021; wait()
383╎    ╎    ╎    ╎    ╎     389 @Base/task.jl:1012; poptask(W::Base.IntrusiveLinkedListSynchronized{Task})
   ╎    ╎    ╎    ╎    ╎    ╎ 2   @Base/stream.jl:642; uv_alloc_buf(handle::Ptr{Nothing}, size::UInt64, buf::Ptr{Nothing})
  2╎    ╎    ╎    ╎    ╎    ╎  2   @Base/stream.jl:62; getproperty
   ╎    ╎    ╎    ╎    ╎    ╎ 3   @Base/stream.jl:711; uv_readcb(handle::Ptr{Nothing}, nread::Int64, buf::Ptr{Nothing})
  3╎    ╎    ╎    ╎    ╎    ╎  3   @Base/stream.jl:680; (::Base.var"#readcb_specialized#829")(stream::Base.PipeEndpoint, nread::Int64, nreque…
   ╎    ╎    ╎    ╎    ╎    9   @Base/task.jl:1023; wait()
  7╎    ╎    ╎    ╎    ╎     9   @Base/libuv.jl:125; process_events
  2╎    ╎    ╎    ╎    ╎    ╎ 2   @Base/process.jl:70; uv_return_spawn(p::Ptr{Nothing}, exit_status::Int64, termsignal::Int32)
   ╎    ╎    ╎    ╎  11  @CUDA/src/profile.jl:457; (::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8", DataFrames.D…
   ╎    ╎    ╎    ╎   11  @DataFrames/src/dataframe/insertion.jl:887; kwcall(::@NamedTuple{cols::Symbol}, ::typeof(push!), df::DataFrames.Dat…
   ╎    ╎    ╎    ╎    11  @DataFrames/src/dataframe/insertion.jl:887; #push!#345
  5╎    ╎    ╎    ╎     5   @DataFrames/src/dataframe/insertion.jl:952; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTuple…
   ╎    ╎    ╎    ╎     3   @DataFrames/src/dataframe/insertion.jl:964; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTuple…
  3╎    ╎    ╎    ╎    ╎ 3   @Base/namedtuple.jl:388; get
  2╎    ╎    ╎    ╎     2   @DataFrames/src/dataframe/insertion.jl:966; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTuple…
  1╎    ╎    ╎    ╎     1   @DataFrames/src/dataframe/insertion.jl:967; _row_inserter!(df::DataFrames.DataFrame, loc::Int64, row::@NamedTuple…
  2╎    ╎    ╎    ╎ 2   @CUDA/lib/cupti/wrappers.jl:306; process(f::CUDA.Profile.var"#7#9"{VersionNumber, CUDA.Profile.var"#as_memory_kind#8"…
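
Reading the trace above: 488 of the 542 snapshots fall under profile.jl:455, and 460 of those are inside `read(cmd::Cmd)` (61 spawning the process, 399 waiting on the pipe). The two DataFrames `push!` call sites account for 14 and 11 snapshots. If that external command is run once per record, one possible mitigation is to cache its result per distinct input. The following is only a sketch of that idea, not CUDA.jl's code: `run_tool`, `cached_tool!`, the record names, and the use of `echo` as the command are all hypothetical stand-ins, since the trace does not show which command is executed.

```julia
# Hypothetical stand-in for the external command seen in the trace.
# `echo` is used only so the sketch runs on a typical Linux/macOS system.
run_tool(name::AbstractString) = String(chomp(read(`echo $name`, String)))

# Memoize per distinct input, so each unique name spawns at most one process.
function cached_tool!(cache::Dict{String,String}, spawns::Ref{Int}, name::String)
    get!(cache, name) do
        spawns[] += 1      # count the processes that are actually spawned
        run_tool(name)
    end
end

# Assumed workload: 300 records that share only 3 distinct names.
record_names = repeat(["kernel_a", "kernel_b", "kernel_c"], 100)

cache = Dict{String,String}()
spawns = Ref(0)
results = [cached_tool!(cache, spawns, n) for n in record_names]

println("records: ", length(results), ", processes spawned: ", spawns[])
```

With this assumed workload the loop handles 300 records but spawns only 3 processes. Whether that helps in practice depends on how often the real inputs repeat; batching all inputs into a single invocation would be an alternative if the command supports it.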

To reproduce

N/A

Version info

Details on Julia:

Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × AMD Ryzen Threadripper PRO 5975WX 32-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)

Details on CUDA:

CUDA runtime 11.8, artifact installation
CUDA driver 12.6
NVIDIA driver 535.183.1

CUDA libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 2022.3.0 (API 18.0.0)
- NVML: 12.0.0+535.183.1

Julia packages: 
- CUDA: 5.5.2
- CUDA_Driver_jll: 0.10.4+0
- CUDA_Runtime_jll: 0.15.5+0

Toolchain:
- Julia: 1.11.1
- LLVM: 16.0.6

Preferences:
- CUDA_Runtime_jll.version: 11.8

1 device:
  0: NVIDIA RTX A4500 (sm_86, 19.252 GiB / 19.990 GiB available)

Labels: enhancement, help wanted, performance