Unreasonable memory usage with M4 GPU

Dearests,

My fight to make reasonable use of my M4 GPU continues.

Metal.versioninfo()

macOS 15.1.1, Darwin 24.1.0

Toolchain:

  • Julia: 1.11.2
  • LLVM: 16.0.6

Julia packages:

  • Metal.jl: 1.4.2
  • GPUArrays: 10.3.1
  • GPUCompiler: 0.27.8
  • KernelAbstractions: 0.9.31
  • ObjectiveC: 3.1.0
  • LLVM: 9.1.3
  • LLVMDowngrader_jll: 0.3.0+2

1 device:

  • Apple M4 Pro (48.953 MiB allocated)

I set up a simple optimization problem (more of an MWE than what I actually need to do), and I observe an explosion in memory usage. Before giving you the not-so-minimal example, let me explain what I see.

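function trainmodel!(model::Model; nepochs=100, verbose=true)
    # Optimiser state for all trainable parameters of `model`
    opt = Flux.setup(Flux.Optimisers.Adam(0.1), model)
    for it in 1:nepochs
        grads = Flux.gradient(model) do m
            TestMetal.losslinearalgebra(m)
        end
        # `norm` comes from LinearAlgebra
        verbose && println("it = $it |grad| =  $(norm(grads[1].msa))")
        Flux.Optimise.update!(opt, model, grads[1])
        # Force a collection every epoch; without it, memory use explodes with MtlArrays
        GC.gc()
    end
end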

The crux of the problem is that, without the GC.gc() call after update!, memory usage explodes when I use MtlArrays (explodes = the computer becomes unresponsive because of the large memory usage). With normal Arrays there is no problem.
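To make the workaround concrete, here is a minimal sketch of a loop that triggers a collection only every few iterations instead of on every one. It rests on an assumption I have not verified: that the growth comes from temporary GPU buffers waiting for Julia's garbage collector to finalize them. The helper `run_with_periodic_gc` and its `gc_every` and `full` keywords are hypothetical names made up for illustration, not part of Flux or Metal.jl, and the step function is a CPU-only stand-in for the real training step.

```julia
# Hypothetical helper (not part of Flux or Metal.jl): call `step(it)` for
# `nepochs` iterations and trigger a garbage collection every `gc_every` iterations.
function run_with_periodic_gc(step, nepochs::Integer; gc_every::Integer=10, full::Bool=false)
    gc_every > 0 || throw(ArgumentError("gc_every must be positive"))
    ncollections = 0
    for it in 1:nepochs
        step(it)
        if it % gc_every == 0
            # full=false requests an incremental collection, full=true a full one
            GC.gc(full)
            ncollections += 1
        end
    end
    return ncollections
end

# CPU-only stand-in for a training step: allocate a temporary array each iteration.
acc = Ref(0.0)
n = run_with_periodic_gc(100; gc_every=10) do it
    tmp = rand(Float32, 1_000)
    acc[] += sum(tmp)
end
println("collections triggered: ", n)
```

Whether an incremental collection (`full=false`) is enough to release the GPU buffers, and how large `gc_every` can be before memory use grows too much, is something that would have to be measured on the actual model.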

If you want to run the full thing, it is a bit complicated but doable. I created the gist below:

To use it, run:

julia> include("testmetal.jl"); using .TestMetal
julia> q,L,M = 21,53,10_000; modelgpu=TestMetal.Model(q,L,M,gpu=true); modelcpu=TestMetal.Model(q,L,M,gpu=false);
julia> TestMetal.trainmodel!(modelgpu,nepochs=100,verbose=true) # beware that this is where my computer becomes unresponsive

Is this worth reporting upstream to Metal.jl?
Thanks
A

You may want to keep an eye out for the caching allocator then :slight_smile:

Thx! I’ll keep an eye on your PR.