Hi,
I am a little confused by some results that I obtain when comparing @time, @elapsed and @benchmark. I am aware of some differences between these macros, but I suspect I do not really understand what is happening here, or what I should consider a good estimate.
I have the following code:
using SparseArrays, FFTW, BenchmarkTools
function gen_matrix(n,m)
    x = rand(1:n,m); y = rand(1:n,m)
    z = ones(m)
    sparse(x,y,z,n,n)
end
function test_fft(n,m)
    A = gen_matrix(n,m)
    @elapsed fft(A)
end
Essentially, I want to run fft on some sparse matrices. The function gen_matrix produces a random sparse matrix for testing, whereas test_fft creates such a matrix and times fft on it using @elapsed.
In what follows, all measurements are given in seconds.
Now, if I run
julia> A = gen_matrix(50,1000)
julia> @time fft(A)
I obtain consistent measurements of around 1e-4 seconds (the same happens if @time is replaced by @elapsed). I am aware that handling the global variable A adds some overhead, so I also tested
julia> @benchmark fft($A)
and, alternatively
julia> @benchmark fft(B) setup=(B=gen_matrix(50,1000))
In both cases I obtain a fairly narrow histogram with a median around 5e-5, i.e., half the time reported by @time. So far, this looks reasonable.
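For reference, here is a minimal sketch of how a median in seconds can be read off a @benchmark result. BenchmarkTools stores its raw sample times in nanoseconds, so they need to be divided by 1e9 to compare with @time and @elapsed. The names b and median_seconds, and the seconds=1 time budget, are my own choices for illustration.

```julia
using SparseArrays, FFTW, BenchmarkTools, Statistics

function gen_matrix(n, m)
    x = rand(1:n, m); y = rand(1:n, m)
    z = ones(m)
    sparse(x, y, z, n, n)
end

A = gen_matrix(50, 1000)

# Interpolate the global with $ so that access to A is not part of the timing.
# seconds=1 only limits how long the benchmark may run (an arbitrary choice here).
b = @benchmark fft($A) seconds=1

# b.times holds one entry per sample, in nanoseconds.
median_seconds = median(b.times) / 1e9
println("median: ", median_seconds, " s over ", length(b.times), " samples")
```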
The weird thing, at least for me, is that I assumed test_fft would report something similar to @benchmark. My reasoning is that @elapsed does the same thing as @time, but now it runs inside the local scope of test_fft. However, this does not happen: test_fft consistently returns times of around 2e-4, i.e., twice the time reported by @time. I even ran test_fft in a loop to check this.
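A loop of this kind can be sketched as follows. This is only an illustration of the check, not necessarily the exact code I ran: the helper name collect_timings and the default of 20 repetitions are hypothetical.

```julia
using SparseArrays, FFTW, Statistics

function gen_matrix(n, m)
    x = rand(1:n, m); y = rand(1:n, m)
    z = ones(m)
    sparse(x, y, z, n, n)
end

function test_fft(n, m)
    A = gen_matrix(n, m)
    @elapsed fft(A)
end

# Hypothetical helper: call test_fft `reps` times and keep every timing.
function collect_timings(n, m; reps = 20)
    test_fft(n, m)                      # warm-up call, so compilation is not timed
    [test_fft(n, m) for _ in 1:reps]    # each entry is one @elapsed result, in seconds
end

timings = collect_timings(50, 1000)
println("minimum: ", minimum(timings), " s, median: ", median(timings), " s")
```

Looking at both the minimum and the median makes it easier to see whether the values are consistently high or whether a few slow runs dominate.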
However, if I run:
julia> @time test_fft(50,1000)
I get something even weirder: @time reports times of around 1e-4, but the output of test_fft is now around 9e-5. So the same call to @elapsed returns different values (by a factor of 2) depending on whether it is run alone or under the macro @time.
Evidently there is something that I am missing.
Can someone explain this behaviour?
And more importantly: what should I take as the most reliable measurement for a realistic situation, where the matrix is produced or loaded inside a function that also runs fft?
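To make the realistic situation concrete, this is a sketch of the two measurements I have in mind. The function make_and_fft is a hypothetical stand-in for the real code, and seconds=1 is an arbitrary time budget. The first benchmark times matrix generation plus fft together, while the second times only fft on a freshly generated matrix for every sample (evals=1 makes the setup run before each single evaluation).

```julia
using SparseArrays, FFTW, BenchmarkTools

function gen_matrix(n, m)
    x = rand(1:n, m); y = rand(1:n, m)
    z = ones(m)
    sparse(x, y, z, n, n)
end

# Hypothetical stand-in for a function that produces the matrix and runs fft on it.
make_and_fft(n, m) = fft(gen_matrix(n, m))

# Whole function: generation and fft are timed together.
b_total = @benchmark make_and_fft(50, 1000) seconds=1

# fft only: a new matrix is built in setup before every sample and is not timed.
b_fft = @benchmark fft(B) setup=(B = gen_matrix(50, 1000)) evals=1 seconds=1

# Sample times are stored in nanoseconds; convert to seconds.
println("whole function, minimum: ", minimum(b_total.times) / 1e9, " s")
println("fft only, minimum:       ", minimum(b_fft.times) / 1e9, " s")
```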
Thanks!