Why such large memory allocations?

I ran into a weird problem with the following trivial code:

using BenchmarkTools

const p=150
dd = randn(p,p)

function d2(ψ::Matrix{Float64})::Matrix{Float64}
    ψr = similar(ψ)
    ddψ = dd*ψ
    @time @inbounds for i in eachindex(ddψ,ψr)
        ψr[i]=ddψ[i]
    end
    return ψr
end

ψ=randn(p,p*p)
@btime d2(ψ);

The result is:

  0.394772 seconds (13.50 M allocations: 257.476 MiB, 4.49% gc time)
  0.396507 seconds (13.50 M allocations: 257.476 MiB, 5.61% gc time)
  0.397738 seconds (13.50 M allocations: 257.476 MiB, 3.80% gc time)
  0.396423 seconds (13.50 M allocations: 257.476 MiB, 4.15% gc time)
  0.430600 seconds (13.50 M allocations: 257.476 MiB, 7.25% gc time)
  0.410795 seconds (13.50 M allocations: 257.476 MiB, 6.39% gc time)
  0.586821 seconds (13.50 M allocations: 257.476 MiB, 26.65% gc time)
  0.486760 seconds (13.50 M allocations: 257.476 MiB, 20.09% gc time)
  0.380750 seconds (13.50 M allocations: 257.476 MiB, 3.97% gc time)
  0.404442 seconds (13.50 M allocations: 257.476 MiB, 5.99% gc time)
  0.402110 seconds (13.50 M allocations: 257.476 MiB, 5.73% gc time)
  0.411570 seconds (13.50 M allocations: 257.476 MiB, 3.88% gc time)
  0.406931 seconds (13.50 M allocations: 257.476 MiB, 6.19% gc time)
  0.404919 seconds (13.50 M allocations: 257.476 MiB, 6.29% gc time)
  0.419291 seconds (13.50 M allocations: 257.476 MiB, 7.73% gc time)
  0.442829 seconds (13.50 M allocations: 257.476 MiB, 9.08% gc time)
  0.456120 seconds (13.50 M allocations: 257.476 MiB, 7.63% gc time)
  0.395785 seconds (13.50 M allocations: 257.476 MiB, 4.61% gc time)
  0.419890 seconds (13.50 M allocations: 257.476 MiB, 3.97% gc time)
  0.405614 seconds (13.50 M allocations: 257.476 MiB, 6.46% gc time)
  0.383479 seconds (13.50 M allocations: 257.476 MiB, 5.23% gc time)
  0.412925 seconds (13.50 M allocations: 257.476 MiB, 6.41% gc time)
  0.410666 seconds (13.50 M allocations: 257.476 MiB, 6.52% gc time)
  0.404238 seconds (13.50 M allocations: 257.476 MiB, 7.23% gc time)
  0.420667 seconds (13.50 M allocations: 257.476 MiB, 6.23% gc time)
  0.445725 seconds (13.50 M allocations: 257.476 MiB, 3.58% gc time)
  0.405995 seconds (13.50 M allocations: 257.476 MiB, 6.12% gc time)
  0.413399 seconds (13.50 M allocations: 257.476 MiB, 7.18% gc time)
  415.830 ms (13499133 allocations: 308.99 MiB)

So it’s incredibly slow, and the memory allocation is unreasonably large. Why is that? I expected the for loop in the function to have no memory allocation at all.

I see a 30x gain by just making dd a const.
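
For reference, the only change needed is in the declaration of dd; a minimal sketch, keeping the rest of the original script exactly as posted:

const p = 150
const dd = randn(p, p)   # const lets the compiler infer the type of dd*ψ inside d2

With dd const, the loop inside d2 compiles down to a plain copy and the per-element allocations disappear.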


I was going to post something very similar - an alternative to const is to pass dd as a function argument (it’s generally not a good idea for functions to depend on global variables):

julia> function d2(ψ, dd)
           ψr = similar(ψ)
           ddψ = dd*ψ
           @inbounds for i in eachindex(ddψ,ψr)
               ψr[i] = ddψ[i]
           end
           ψr
       end
d2 (generic function with 3 methods)

julia> @btime d2($ψ, $dd);
  29.738 ms (4 allocations: 51.50 MiB)

See: Performance Tips · The Julia Language


Yes, you are right: a 30x gain and no memory allocation just by making dd a const. Just curious: why does the heavy consumption occur in the for loop rather than in the step “ddψ = dd*ψ”? The for loop seems to have no direct relation to dd.

In Julia, type specialization (the secret sauce that makes it faster than Python while still dynamic) happens at function calls. Nothing about the type of ddψ can be inferred at compile time, since it depends on a non-constant global. The for loop can’t regain knowledge of the type, because it lives in the same function as the point where disaster struck.
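
One way to check this yourself (a suggested diagnostic, not from the original post) is @code_warntype, which prints the types the compiler inferred for each variable in d2:

using InteractiveUtils   # only needed in a script; the REPL loads it by default
@code_warntype d2(ψ)     # with dd a non-const global, ddψ is inferred as Any

Because ddψ is Any, every ddψ[i] access inside the loop is dispatched dynamically and returns a boxed value, which is what shows up as those millions of small allocations.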

Things would be different if the for loop were in its own function (see this part of the performance tips).
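
A minimal sketch of that function-barrier version, for illustration only (the helper name copy_into! is made up here, and dd is still the global from the first post):

function copy_into!(dest, src)
    @inbounds for i in eachindex(dest, src)
        dest[i] = src[i]
    end
    return dest
end

function d2_barrier(ψ::Matrix{Float64})
    ψr = similar(ψ)
    ddψ = dd*ψ            # still type-unstable while dd is a non-const global...
    copy_into!(ψr, ddψ)   # ...but this call re-specializes on the concrete types
    return ψr
end

The dynamic dispatch at the copy_into! call is paid once per call to d2_barrier, instead of once per element inside the loop.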


Thanks!