Hi,
I am a little confused by some results that I obtain when comparing @time, @elapsed and @benchmark. I am aware of some differences between these macros, but I suppose that I do not really understand what is happening here, or what I should consider a good estimate.
I have the following code:
using SparseArrays, FFTW, BenchmarkTools
function gen_matrix(n,m)
    x = rand(1:n,m); y = rand(1:n,m)
    z = ones(m)
    sparse(x,y,z,n,n)
end
function test_fft(n,m)
    A = gen_matrix(n,m)
    @elapsed fft(A)
end
Essentially, I want to run fft on some sparse matrices. The function gen_matrix produces a random sparse matrix for testing, whereas test_fft creates such a matrix and measures the time taken by fft using @elapsed.
In what follows, all measurements are given in seconds.
Now, if I run
julia> A = gen_matrix(50,1000)
julia> @time fft(A)
I obtain consistent measurements of around 1e-4 seconds (the same happens if @time is replaced by @elapsed). I am aware of a certain overhead due to the handling of the global variable A, so I also tested
julia> @benchmark fft($A)
and, alternatively
julia> @benchmark fft(B) setup=(B=gen_matrix(50,1000))
In both cases I obtain a pretty narrow histogram with a median around 5e-5, i.e. half the time reported by @time. So far, this looks reasonable.
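The comparison between timing through a global variable and timing with a local one can also be reproduced with @elapsed alone. The following is only a rough sketch under stated assumptions: min_elapsed and work are hypothetical names that do not appear in my code above, and work is a stand-in workload so that the snippet runs without FFTW (with FFTW loaded, fft can be passed instead).

```julia
using SparseArrays

# Same generator as in the post
function gen_matrix(n, m)
    x = rand(1:n, m); y = rand(1:n, m)
    z = ones(m)
    sparse(x, y, z, n, n)
end

# Stand-in workload (assumption): replace with fft once FFTW is loaded
work(A) = sum(abs2, Matrix(A))

# Hypothetical helper: smallest of `reps` timings of f(A).
# A is a function argument here, so it is a local variable.
function min_elapsed(f, A; reps = 50)
    f(A)  # warm-up call, so that compilation is not included in the timings
    minimum([(@elapsed f(A)) for _ in 1:reps])
end

A = gen_matrix(50, 1000)

work(A)  # warm-up at global scope
# Timing through the non-constant global A
t_global = minimum([(@elapsed work(A)) for _ in 1:50])
# Timing with A passed as an argument
t_local = min_elapsed(work, A)

println("global: ", t_global, " s, local: ", t_local, " s")
```

Taking the minimum over several repetitions is one common way to reduce noise; whether the two numbers differ noticeably will depend on the workload and the machine.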
The weird thing, at least for me, is that I assumed that test_fft would report something similar to @benchmark. My reasoning is that @elapsed does the same thing that @time does, but now it is running inside the local scope of test_fft. However, this does not happen: test_fft consistently returns times of around 2e-4, i.e. twice the time reported by @time. I even ran test_fft in a loop to check this.
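For reference, the loop check looks roughly like the sketch below. It is a simplified version rather than my exact code: test_timing mirrors test_fft but takes the timed function as an argument, repeat_test is a hypothetical helper name, and work is a stand-in workload so that the snippet runs without FFTW (pass fft instead when FFTW is loaded).

```julia
using SparseArrays, Statistics

# Same generator as in the post
function gen_matrix(n, m)
    x = rand(1:n, m); y = rand(1:n, m)
    z = ones(m)
    sparse(x, y, z, n, n)
end

# Stand-in workload (assumption): replace with fft once FFTW is loaded
work(A) = sum(abs2, Matrix(A))

# Mirrors test_fft: build a fresh matrix, then time a single call of f on it
function test_timing(f, n, m)
    A = gen_matrix(n, m)
    @elapsed f(A)
end

# Hypothetical helper: repeat the measurement and summarise the results
function repeat_test(f, n, m; reps = 100)
    test_timing(f, n, m)  # warm-up run, so that compilation is not counted
    times = [test_timing(f, n, m) for _ in 1:reps]
    (min = minimum(times), med = median(times), max = maximum(times))
end

stats = repeat_test(work, 50, 1000)
println(stats)
```

Each repetition times exactly one call on a freshly generated matrix, so the spread between min, med and max gives an idea of how noisy a single @elapsed measurement is.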
However, if I run:
julia> @time test_fft(50,1000)
I get something even weirder: @time reports times of around 1e-4, but the output of test_fft is now around 9e-5. So the same call to @elapsed returns values that differ by a factor of 2 depending on whether it is run alone or under the @time macro.
Evidently there is something that I am missing.
Can someone explain this behaviour?
And more importantly: what should I take as the most reliable measurement for a realistic situation, where the matrix is produced or loaded inside a function that also runs fft?
Thanks!