Moving source line moves allocation?

I am trying to fix allocation issues in a function and I see behaviour that seems strange to me. What follows are two outputs of julia --track-allocation

        - function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64},hineq::SubArray{Float64})
      432     f[1]=x[1]^2+(x[2]-1.0)^2
        0     heq[1]=x[2]-x[1]^2
        -     hineq=nothing
        0     nothing
        - end

        - function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64},hineq::SubArray{Float64})
      432     heq[1]=x[2]-x[1]^2
        0     f[1]=x[1]^2+(x[2]-1.0)^2
        -     hineq=nothing
        0     nothing
        - end

The first oddity is that the allocation is attributed to f[1] in the first case and to heq[1] in the second, i.e. it follows whichever line comes first. Furthermore, since f and heq are in fact views of preallocated Arrays, I would expect no allocation to take place at all. I am surely missing something, but what?

Compilation of a function allocates memory, which will show up in your .mem file unless you’re careful. From the manual:

More significantly, JIT-compilation also adds to allocation counts, because much of Julia’s compiler is written in Julia (and compilation usually requires memory allocation). The recommended procedure is to force compilation by executing all the commands you want to analyze, then call Profile.clear_malloc_data() to reset all allocation counters. Finally, execute the desired commands and quit Julia to trigger the generation of the .mem files.

And note that you need to do all of that within the same Julia session, so:

f()
Profile.clear_malloc_data()
f()

The two logs I showed were obtained exactly as you suggest. What follows is a complete, self-contained example.

function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64},hineq::SubArray{Float64})
    f[1]=x[1]^2+(x[2]-1.0)^2
    heq[1]=x[2]-x[1]^2
    hineq=nothing
    nothing
end

function testfun()
    nind=2
    nobj=1
    neq=1
    nineq=0
    f=Array{Float64}(nobj,nind)
    heq=Array{Float64}(neq,nind)
    hineq=Array{Float64}(nineq,nind)

    for i=1:nind
        x=rand(Float64,2)
        @views g11!(x,f[:,i],heq[:,i],hineq[:,i])
    end

end

testfun()
Profile.clear_malloc_data()
testfun()

Looking at the complete example shows another oddity:

      240     f=Array{Float64}(nobj,nind)
       96     heq=Array{Float64}(neq,nind)

Why do the two arrays, which have the same number of rows and columns, have different allocation counts?

Interesting! Thanks for posting the full example. I wonder if the fact that the type of hineq changes to Void is relevant?

No. In the following example I only have f and heq, and the behaviour is the same.

        - function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64})
      192     f[1]=x[1]^2+(x[2]-1.0)^2
        0     heq[1]=x[2]-x[1]^2
        0     nothing
        - end
        - 
        - function testfun()
        -     nind=2
        -     nobj=1
        -     neq=1
      240     f=Array{Float64}(nobj,nind)
       96     heq=Array{Float64}(neq,nind)
        - 
        0     for i=1:nind
        0         x=rand(Float64,2)
        0         @views g11!(x,f[:,i],heq[:,i])
        -     end
        - 
        - end
        - 
        - testfun()
        - Profile.clear_malloc_data()
        - testfun()

At this point I'm just guessing, but perhaps the allocation of x=rand(Float64, 2) in testfun() is being misattributed to the first line of g11!?

I don't think so, as the next example shows. Now x is allocated outside the loop. Still, f and heq have mysteriously (to me) different allocation counts, and the first line of g11! reports an allocation while the second does not, even if I swap the assignments of f and heq in g11!

        - function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64})
      192     f[1]=x[1]^2+(x[2]-1.0)^2
        0     heq[1]=x[2]-x[1]^2
        0     nothing
        - end
        - 
        - function testfun()
        -     nind=2
        -     nobj=1
        -     neq=1
      240     f=Array{Float64}(nobj,nind)
       96     heq=Array{Float64}(neq,nind)
       96     x=ones(Float64,2)
        0     x[1]=2.0
        0     x[2]=3.0
        0     for i=1:nind
        0         @views g11!(x,f[:,i],heq[:,i])
        -     end
        - 
        - end
        - 
        - testfun()
        - Profile.clear_malloc_data()
        - testfun()

I think in this case nind, nobj, and neq are being optimized out, so the line with f is still the ‘first line of the function’, to which some allocations are likely being misattributed.

Edit: yeah, if you add a line println("bla") before f=..., the number of allocations for f becomes 96.
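
In other words, a throwaway first statement gives the tracker somewhere else to put those counts. A minimal sketch of the workaround (the println("bla") is arbitrary; any statement that cannot be optimized out should do):

function testfun()
    println("bla")                  # absorbs the misattributed per-call costs
    nind=2
    nobj=1
    neq=1
    f=Array{Float64}(nobj,nind)     # now reports 96 bytes, matching heq
    heq=Array{Float64}(neq,nind)
    # ... rest as before
end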


Yes! You are indeed right. If I allocate something before f, the allocation counts of f and heq coincide. And the same happens for the first line of g11!. Can I therefore conclude that --track-allocation wrongly adds the allocation of function pointers/stuff/whatever to the first line of the function instead of assigning it to the function itself?
Furthermore, the following example shows a probably related problem:

        - function g11!(x::Array{Float64},f::SubArray{Float64},heq::SubArray{Float64})
 19200000     f[1]=x[1]^2+(x[2]-1.0)^2
        0     heq[1]=x[2]-x[1]^2
        0     nothing
        - end
        - 
        - function testfun()
        -     nind=2
        -     nobj=1
        -     neq=1
      240     f=Array{Float64}(nobj,nind)
       96     heq=Array{Float64}(neq,nind)
        0     for k=1:100000
        0         for i=1:nind
        0             x=rand(Float64,2)
        0             @views g11!(x,f[:,i],heq[:,i])
        -         end
        -     end
        - end
        - 
        - testfun()
        - Profile.clear_malloc_data()
        - testfun()
        - 

If g11! is called repeatedly, a lot of memory gets allocated for the function call: 19200000 bytes over 100000×2 = 200000 calls works out to 96 bytes per call. Is this expected? It seems to be related to using views, because if I use standard Arrays no memory is allocated:

        - function g11!(x::Array{Float64},f::Array{Float64},heq::Array{Float64})
        0     f[1]=x[1]^2+(x[2]-1.0)^2
        0     heq[1]=x[2]-x[1]^2
        0     nothing
        - end
        - 
        - function testfun()
        -     nind=2
        -     nobj=1
        -     neq=1
      240     f=Array{Float64}(nobj)
       96     heq=Array{Float64}(neq)
        0     for k=1:100000
        0         for i=1:nind
        0             x=rand(Float64,2)
        0             g11!(x,f,heq)
        -         end
        -     end
        - end
        - 
        - testfun()
        - Profile.clear_malloc_data()
        - testfun()

(In this example I could of course use plain Arrays, but in the real code I must use views.)
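
One can also check directly that it is the construction of the SubArray wrappers that allocates, by measuring a single view. A minimal sketch (consume is a made-up helper that keeps the view from being optimized away; exact byte counts depend on the Julia version):

@noinline consume(v) = v[1]        # prevents the view from being elided

function viewcost(A)
    consume(view(A, :, 1))         # warm up, so compilation is not counted
    @allocated consume(view(A, :, 1))
end

A = zeros(1, 2)
println(viewcost(A))               # nonzero on 0.6, where the wrapper is heap-allocated

Two such wrappers per call to g11! would account for the 96 bytes per call seen above.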
Sorry for abusing everybody’s patience and kindness.

In the end I found your UnsafeVectorView, modified it to work with SharedArrays (which was my ultimate goal), and now everything is fine: the allocations are more or less what I would expect. What I also plan to do is modify it further so that when I repeatedly create new views of the same array (which I do all the time) I do not create a new view but "recycle" the existing one by updating offset and len (ptr does not change). It surprises me that such a thing is not intrinsically part of Julia, since what you describe as "extremely cheap views within the inner loop of an algorithm" is surely an extremely common need in many algorithms.
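
For the record, the basic idea of such an unsafe view looks roughly like the following (an illustrative sketch, not the actual UnsafeVectorView source; it is only safe while the parent array is kept alive, because the GC does not see the raw pointer):

struct UnsafeVectorView{T} <: AbstractVector{T}
    ptr::Ptr{T}   # raw pointer into the parent array's data
    len::Int
end

# offset is in elements, 0-based; the caller must keep A alive
UnsafeVectorView(A::DenseArray{T}, offset::Integer, len::Integer) where {T} =
    UnsafeVectorView{T}(pointer(A) + offset*sizeof(T), len)

Base.size(v::UnsafeVectorView) = (v.len,)
Base.getindex(v::UnsafeVectorView, i::Int) = unsafe_load(v.ptr, i)    # no bounds checking, by design
Base.setindex!(v::UnsafeVectorView, x, i::Int) = unsafe_store!(v.ptr, x, i)
Base.IndexStyle(::Type{<:UnsafeVectorView}) = IndexLinear()

Since the struct is immutable and contains only bits types, constructing a fresh one allocates nothing, which may make the "recycling" above unnecessary in practice.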

The performance issues with views are certainly a known pain point, and I believe that much of the recent work on the compiler has been done to help avoid allocations in cases like this. I’m far from an expert on the compiler, though.

I would suggest trying out your code on the latest v0.7 nightlies (without resorting to UnsafeVectorView) as you may see some substantial improvements.

Thanks for the suggestion. I tried it, and the .mem files look OK, but @time still reports the same amount of allocated memory. It has always confused me that adding up the memory reported in the .mem files does not match what @time reports. Is this normal?