Why the apparent difference between the @time and @btime macros?
What are the 368 bytes (and 208 bytes) measuring? I’ve created an array of 15 Float64 elements, each taking 8 bytes. So shouldn’t the allocation be 8*15 = 120 bytes? Even accounting for “overhead”, 368 bytes is more than twice that number.
Is this what is meant by heap allocated? If this were stack allocated, it would show as “0 allocations”, right?
Put things into functions if you want accurate allocation measurements (or, as you saw, use BenchmarkTools):
julia> function allocation_test()
           A = Vector{Float64}(undef, 15)
           return A
       end
allocation_test (generic function with 1 method)
julia> f() = @time allocation_test()
f (generic function with 1 method)
julia> f()
0.000001 seconds (1 allocation: 208 bytes)
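As for what the 208 bytes measure: a Vector is heap allocated and carries a header and metadata in addition to its 15 × 8 = 120 bytes of element data, which is why more than 120 bytes are reported. A hedged sketch using Base’s @allocated macro (the exact byte count varies by Julia version and platform):

```julia
# @allocated reports the bytes allocated by the expression alone.
# For a 15-element Float64 vector this is at least the 120 bytes of
# data, plus the array's header/metadata.
bytes = @allocated Vector{Float64}(undef, 15)
println(bytes)  # at least 120; 208 on the version used above
```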
I don’t understand point 1. I am calling a function. Are you saying that there are issues even if @time is run from global scope? So the allocations I saw were allocations @time was making for its own purposes?
I am having a hard time profiling my code because I don’t know how many “functions” to nest before calling @time.
Global variables typically cause allocations. The @time macro expands to many variables, so if @time is called in global scope there will be a few extra allocations coming from the macro itself.
julia> @macroexpand @time a = 1+1
quote
    local #9#stats = (Base.gc_num)()
    local #11#elapsedtime = (Base.time_ns)()
    local #10#val = (a = 1 + 1)
    #11#elapsedtime = (Base.time_ns)() - #11#elapsedtime
    local #12#diff = (Base.GC_Diff)((Base.gc_num)(), #9#stats)
    (Base.time_print)(#11#elapsedtime, (#12#diff).allocd, (#12#diff).total_time, (Base.gc_alloc_count)(#12#diff))
    (Base.println)()
    #10#val
end
The 5 allocations you saw come from running the @time macro itself in global scope. If we instead run the timing inside a function, we can see that indeed no allocations are performed:
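The demonstration itself appears to have been lost from this post; a minimal sketch of the kind of example meant (the printed time is illustrative; the point is the absence of an allocation report):

```julia
# Wrapping the timed expression in a function: @time's bookkeeping
# variables become locals, so they no longer allocate.
function g()
    @time 1 + 1
end

g()  # prints something like "0.000000 seconds" with no allocation count
```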
Either use BenchmarkTools or make sure @time is not called from global scope. The overhead is only in the hundreds of bytes, so for most measurements it doesn’t matter.
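A sketch of the BenchmarkTools route, using $-interpolation so that a global variable is treated as a local during benchmarking (variable names here are illustrative; assumes BenchmarkTools is installed):

```julia
using BenchmarkTools

x = rand(15)

@btime sum(x)   # x is an untyped global: extra allocations are reported
@btime sum($x)  # interpolated: no spurious allocations from the global
```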