Memory leak with unified memory? #3013

@Tortar

Describe the bug

When using a CuArray backed by unified memory, running a loop many times makes host (CPU) memory usage grow until the program is killed. This seems related to the GC not handling this case well.

To reproduce

using CUDA

N = 10_000_000

data_cpu = rand(Float32, N)
uni_arr = cu(data_cpu; unified=true)

function f(a)
    CUDA.@sync a = sin.(a)
end

# Executing this loop causes an OOM error in CPU RAM
for _ in 1:1000
    @time f(uni_arr)
end

Manifest.toml

Status `~/.julia/environments/v1.10/Project.toml`
⌃ [46ada45e] Agents v6.0.9
  [56664e29] Ark v0.3.0 `~/.julia/dev/Ark`
  [6e4b80f9] BenchmarkTools v1.6.3
  [336ed68f] CSV v0.10.15
  [052768ef] CUDA v5.9.6
  [324d7699] CategoricalArrays v1.0.2
  [0ca39b1e] Chairmarks v1.3.1
  [944b1d66] CodecZlib v0.7.8
  [a93c6f00] DataFrames v1.8.1
⌅ [864edb3b] DataStructures v0.18.22
  [85a47980] Dictionaries v0.4.6
⌃ [a0c0ee7d] DifferentiationInterface v0.7.12
⌃ [31c24e10] Distributions v0.25.122
  [e30172f5] Documenter v1.16.1
⌃ [7da242da] Enzyme v0.13.109
  [ff5a1669] FieldViews v0.3.3
  [41ab1584] InvertedIndices v1.3.1
  [da2b9cff] Mooncake v0.4.193 `~/.julia/dev/Mooncake`
  [d96e819e] Parameters v0.12.3
⌃ [91a5bcdd] Plots v1.41.2
  [21216c6a] Preferences v1.5.1
⌃ [295af30f] Revise v3.12.3
  [ac92255e] Speculator v0.3.0
  [90137ffa] StaticArrays v1.9.16
⌃ [2913bbd2] StatsBase v0.34.8
  [ff63dad9] StreamSampling v0.7.6 `~/.julia/dev/StreamSampling`
  [09ab397b] StructArrays v0.7.2
  [2fbcfb34] UniqueVectors v1.2.0
  [02edb123] WeightVectors v0.1.0
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`

Expected behavior

No memory growth, as is the case when using unified=false.
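
To help narrow this down, here is a minimal sketch of a variant of the reproducer that releases each temporary explicitly instead of waiting for the GC. It has not been verified on the setup above. It assumes the result of `sin.(a)` is also allocated in unified memory, and `f_free` is a hypothetical name introduced only for this illustration. If host memory stays flat with this variant, that would suggest the growth comes from unreachable temporaries that are not being collected in time.

```julia
using CUDA

N = 10_000_000
uni_arr = cu(rand(Float32, N); unified=true)

# Hypothetical variant of `f`: compute the result, then free it eagerly
# rather than leaving the temporary for the garbage collector.
function f_free(a)
    b = CUDA.@sync sin.(a)
    CUDA.unsafe_free!(b)  # release the temporary's memory right away
    return nothing
end

for i in 1:1000
    f_free(uni_arr)
    if i % 100 == 0
        # Report free host memory in GiB to see whether it keeps shrinking
        free_gib = round(Sys.free_memory() / 2^30; digits=2)
        println("iteration ", i, ": free host memory = ", free_gib, " GiB")
    end
end
```

Calling `GC.gc()` every few iterations in the original loop is another way to test the same hypothesis without changing `f`.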

Version info

Details on Julia:

Julia Version 1.10.10
Commit 95f30e51f41 (2025-06-27 09:51 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × AMD Ryzen 5 5600H with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)

Details on CUDA:

CUDA toolchain: 
- runtime 13.0, artifact installation
- driver 580.95.5 for 13.0
- compiler 13.1

CUDA libraries: 
- CUBLAS: 13.1.0
- CURAND: 10.4.0
- CUFFT: 12.0.0
- CUSOLVER: 12.0.4
- CUSPARSE: 12.6.3
- CUPTI: 2025.3.1 (API 13.0.1)
- NVML: 13.0.0+580.95.5

Julia packages: 
- CUDA: 5.9.6
- GPUArrays: 11.3.2
- GPUCompiler: 1.7.5
- KernelAbstractions: 0.9.39
- CUDA_Driver_jll: 13.1.0+0
- CUDA_Compiler_jll: 0.3.0+1
- CUDA_Runtime_jll: 0.19.2+0

Toolchain:
- Julia: 1.10.10
- LLVM: 15.0.7

1 device:
  0: NVIDIA GeForce GTX 1650 (sm_75, 1.743 GiB / 4.000 GiB available)

Labels: bug (Something isn't working)
