@dmbates, thanks for the answer! I tried your method, but I am still having difficulty getting the information I want.
As a simple test, I created testalloc.jl, whose contents are
function test(A)
    θ = linspace(0, 2π, 50)
    X = [cos(θ) sin(θ)]'
    Z = A*X
end
After starting Julia (v0.5) with julia --track-allocation=user, I executed the following commands in the REPL:
julia> include("testalloc.jl")
test (generic function with 1 method)
julia> A = rand(2,2);
This left a file named testalloc.jl.mem in the current directory. Then I quit Julia, restarted it without the --track-allocation=user option, and executed the following commands in the REPL:
julia> using Coverage
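A typical tracking session follows the pattern sketched below; the exact commands may differ from the ones used here. This is only a sketch under stated assumptions: it is written for a recent Julia 1.x (so linspace becomes range, and cos and sin are broadcast with a dot), and it defines test inline instead of including testalloc.jl. Profile.clear_malloc_data() is the standard-library call that discards the counts recorded so far, so that compiling test is not attributed to its lines.

```julia
using Profile          # standard library; provides clear_malloc_data
using LinearAlgebra    # standard library; adjoint and matrix product

# Julia 1.x spelling of the function in testalloc.jl
# (assumption: the post itself uses v0.5 syntax).
function test(A)
    θ = range(0, 2π; length=50)
    X = [cos.(θ) sin.(θ)]'
    Z = A * X
    return Z
end

A = rand(2, 2)
test(A)                        # first call: triggers compilation
Profile.clear_malloc_data()    # reset counters so compilation is not recorded
Z = test(A)                    # second call: the one whose allocations matter

# When Julia was started with --track-allocation=user, per-line byte counts
# are written to a .mem file next to the source file when the process exits.
```

Without the clear_malloc_data() step, the numbers in the .mem file would also include whatever the compiler allocated during the first call.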
In comparison, I get the following benchmark result:
julia> using BenchmarkTools
julia> A = rand(2,2);
julia> @benchmark test($A)
memory estimate: 3.91 KiB
allocs estimate: 13
Now, here are my questions:
1. The analyze_malloc() result reports a 0-byte allocation in line 4 of testalloc.jl, which is Z = A*X. I don’t think that is true, because this line clearly allocates memory for the array Z. How can I make sense of this result?
2. The @benchmark result reports 13 allocations while running test(A). I would like to know exactly how these 13 allocations are distributed over the lines of the function test. I guess each of the first three lines inside the function consumes a portion of these allocations because the variables θ, X, and Z are created there, but how many of the 13 allocations are used to create each of θ, X, and Z? Is there a way to know these details?
(Edited) Later, I figured that I might be able to get an answer to the 2nd question above by commenting out the lines of my test() function and then using @benchmark:
- If I comment out all the lines inside the body of test() except for the first line (where θ is created), then @benchmark would report the number of allocations used in the first line.
- Then, if I uncomment the second line (where X is created) and use @benchmark again, it would report the number of allocations used in the first and second lines.
- By subtracting the former from the latter, I would be able to obtain the number of allocations used in the second line.
- Repeat this procedure to get the number of allocations used in the subsequent lines.
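The subtraction procedure in the list above can be sketched without editing the function by hand. This is a rough illustration, not a definitive method: the helper names upto_line1, upto_line2 and upto_line3 are hypothetical, the code uses Julia 1.x syntax (range and dotted cos/sin instead of the v0.5 forms), and it measures bytes with the built-in @allocated macro rather than allocation counts with @benchmark.

```julia
using LinearAlgebra    # standard library; adjoint and matrix product

# Hypothetical helpers: each one runs the body of test() up to a given line.
function upto_line1(A)
    θ = range(0, 2π; length=50)
    return θ
end

function upto_line2(A)
    θ = range(0, 2π; length=50)
    X = [cos.(θ) sin.(θ)]'
    return X
end

function upto_line3(A)
    θ = range(0, 2π; length=50)
    X = [cos.(θ) sin.(θ)]'
    Z = A * X
    return Z
end

A = rand(2, 2)

# Call everything once so that compilation is not counted.
upto_line1(A); upto_line2(A); upto_line3(A)

# Bytes allocated by the first line, the first two lines, and all three lines.
cumulative = [@allocated(upto_line1(A)),
              @allocated(upto_line2(A)),
              @allocated(upto_line3(A))]

# Subtract each total from the next to attribute bytes to individual lines.
per_line = [cumulative[1]; diff(cumulative)]
println("cumulative bytes: ", cumulative)
println("bytes per line:   ", per_line)
```

The numbers are only approximate: calling from global scope can add a small amount for boxing the returned value, and the compiler may drop work whose result is unused, which is one reason a truncated function can report fewer allocations than expected.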
This procedure indeed revealed that no allocations were used in creating θ and X, and that all 13 allocations were consumed in creating Z! I can understand the result for θ, because it is a lazy range object (a LinSpace in v0.5) rather than an Array, but I still don’t understand why creating X uses 0 allocations…
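The point about θ can be checked directly. The sketch below assumes Julia 1.x, where range(0, 2π; length=50) replaces linspace(0, 2π, 50); the concrete range type differs between Julia versions, so only the abstract type is examined.

```julia
θ = range(0, 2π; length=50)

println(typeof(θ))              # concrete type depends on the Julia version
println(θ isa AbstractRange)    # a range keeps a few parameters, not 50 values
println(θ isa Array)            # it is not a materialized array

v = collect(θ)                  # materializes the 50 values into a Vector{Float64}
println(sizeof(v))              # 50 elements times 8 bytes each
```

Only the collect call has to reserve storage for all 50 elements, which is consistent with the line that creates θ showing no array allocation.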
Also, I don’t understand why the third line of the body of test() (where Z is created) consumes as many as 13 allocations. Don’t we need just one allocation to create
Z and fill its contents with