Way to show where memory allocations occur?

@dmbates, thanks for the answer! I tried your method, but I am still having difficulty getting the information I want.

As a simple test, I created testalloc.jl, whose contents are

function test(A)
    θ = linspace(0, 2π, 50)
    X = [cos(θ) sin(θ)]'
    Z = A*X

    return Z
end

After starting Julia (v0.5) with julia --track-allocation=user, I executed the following commands in the REPL:

julia> include("testalloc.jl")
test (generic function with 1 method)

julia> A = rand(2,2);

julia> test(A);

julia> Profile.clear_malloc_data()

julia> test(A);

This created testalloc.jl.mem in the current directory. I then quit Julia, restarted it without the --track-allocation=user option, and executed the following commands in the REPL:

julia> using Coverage

julia> analyze_malloc(".")
3-element Array{Coverage.MallocInfo,1}:
 Coverage.MallocInfo(0,"./testalloc.jl.mem",4)
 Coverage.MallocInfo(0,"./testalloc.jl.mem",6)
 Coverage.MallocInfo(144368,"./testalloc.jl.mem",3)

In comparison, I get the following benchmark result:

julia> using BenchmarkTools

julia> A = rand(2,2);

julia> include("testalloc.jl");

julia> @benchmark test($A)
BenchmarkTools.Trial:
  memory estimate:  3.91 KiB
  allocs estimate:  13

Now, here are my questions:

  1. The analyze_malloc() result reports a 0-byte allocation on line 4 of testalloc.jl, which is Z = A*X. I don’t think that is true, because this line clearly allocates memory for the array Z. How can I make sense of this result?

  2. The @benchmark result reports 13 allocations while running test(A). I would like to know exactly how these 13 allocations are distributed over the lines of the function test. I guess each of the first three lines inside the function accounts for a portion of these allocations because the variables θ, X, and Z are created there, but how many of the 13 allocations are used to create each of θ, X, and Z? Is there a way to know these details?
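To make the first question concrete, here is a small sketch that lines up the (bytes, line) pairs from the analyze_malloc() output with the source text of testalloc.jl. It only rearranges numbers already shown in this post; the variable names src and reported are made up for illustration.

```julia
# Source of testalloc.jl, copied from the listing in this post.
src = """
function test(A)
    θ = linspace(0, 2π, 50)
    X = [cos(θ) sin(θ)]'
    Z = A*X

    return Z
end
"""

# (bytes, line) pairs copied from the analyze_malloc(".") output in this post.
reported = [(0, 4), (0, 6), (144368, 3)]

lines = split(src, '\n')

# Print each reported entry next to the source line it refers to.
for (bytes, ln) in sort(reported, by = last)
    println(lpad(bytes, 8), "  line ", ln, ": ", strip(lines[ln]))
end
```

Read this way, all of the reported bytes are attributed to the line that builds X, and none to the line that builds Z.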

(Edited) Later, I figured out that I might be able to answer the 2nd question above by commenting out lines of my test() function and then using @benchmark. Specifically,

  • If I comment out all the lines inside the body of test() except for the first line (where θ is created), then @benchmark would report the number of allocations used in the first line.
  • Then, if I uncomment the second line (where X is created) and use @benchmark, then @benchmark would report the number of allocations used in the first and second lines.
  • By subtracting the former from the latter, I would be able to obtain the number of allocations used in the second line.
  • Repeat this procedure to get the number of allocations used in the subsequent lines.
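The same subtraction idea can be sketched without editing the file by hand, using Base's @allocated macro on cumulative prefixes of the function. This is only a rough sketch under several assumptions: it is written in Julia 1.x syntax (range and broadcasting dots instead of the v0.5 linspace and cos(θ)), the helper names step1, step2, and step3 are hypothetical, and @allocated reports bytes rather than the allocation counts that @benchmark shows. The numbers it prints will not necessarily match the v0.5 results in this post.

```julia
# Hypothetical helpers: each one runs the body of test() up to a given line.
step1() = range(0, stop=2π, length=50)

function step2()
    θ = range(0, stop=2π, length=50)
    return [cos.(θ) sin.(θ)]'
end

function step3(A)
    θ = range(0, stop=2π, length=50)
    X = [cos.(θ) sin.(θ)]'
    return A * X
end

A = rand(2, 2)

# Call everything once first so that compilation is not measured.
step1(); step2(); step3(A)

# Bytes allocated by each cumulative prefix.
b1 = @allocated step1()
b2 = @allocated step2()
b3 = @allocated step3(A)

println("up to θ: ", b1, " bytes")
println("up to X: ", b2, " bytes (difference: ", b2 - b1, ")")
println("up to Z: ", b3, " bytes (difference: ", b3 - b2, ")")
```

Because the macro is called at global scope here, a few bytes of overhead for returning values may be included, so the differences are best read as estimates.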

The comment-out procedure indeed revealed that no allocations were used in creating θ and X, and that all 13 allocations were consumed in creating Z! I can understand the result for θ, because it is an instance of LinSpace (a lazy range type) rather than Array, but I still don’t understand why creating X uses zero allocations…

Also, I don’t understand why the third line of the body of test() (where Z is created) consumes as many as 13 allocations. Don’t we need just one allocation to create Z and fill its contents with A*X?
