I try to benchmark multiplication between matrices and multilication between one matrix and one number. Here’s the code:
using LinearAlgebra
BLAS.set_num_threads(1)
A = rand(ComplexF64, 1000, 1000)
B = rand(ComplexF64, 1000, 1000)
function foo(A, B)
for i in 1:10
1.0*A*B
end
return nothing
end
function bar(A, B)
for i in 1:10
A*B
end
end
However, I found inconsistencies between the result of the @btime
macro from BenchmarkTools
and the profiler.
using BenchmarkTools
@btime foo($A, $B)
# 1.808 s (40 allocations: 305.18 MiB)
@btime bar($A, $B)
# 1.764 s (20 allocations: 152.59 MiB)
which indicates the two functions take roughly the same amount of time. But if I use the profiler
function foobar(A, B)
foo(A, B)
bar(A, B)
return nothing
end
foobar(A, B) # compile once
using Profile
@profile foobar(A, B)
Profile.print(mincount=60)
gives
168 ./task.jl:268; (::getfield(Revise, Symbol("##80#82")){REPL.R...
168 ...es/Revise/439di/src/Revise.jl:975; run_backend(::REPL.REPLBackend)
168 .../stdlib/v1.2/REPL/src/REPL.jl:86; eval_user_input(::Any, ::REPL.REPLBackend)
168 ./boot.jl:330; eval(::Module, ::Any)
106 ./REPL[57]:2; foobar(::Array{Complex{Float64},2}, ::Array{...
106 ./REPL[3]:3; foo(::Array{Complex{Float64},2}, ::Array{Co...
106 ./operators.jl:529; *
67 ...LinearAlgebra/src/matmul.jl:143; *
67 ...inearAlgebra/src/matmul.jl:145; mul!
67 ...inearAlgebra/src/matmul.jl:464; gemm_wrapper!(::Array{Complex{Float64},2...
67 ...LinearAlgebra/src/blas.jl:1131; gemm!(::Char, ::Char, ::Complex{Float64...
62 ./REPL[57]:3; foobar(::Array{Complex{Float64},2}, ::Array{...
62 ./REPL[4]:3; bar
61 ...LinearAlgebra/src/matmul.jl:143; *
61 ...LinearAlgebra/src/matmul.jl:145; mul!
61 ...inearAlgebra/src/matmul.jl:464; gemm_wrapper!(::Array{Complex{Float64},2}...
61 ...LinearAlgebra/src/blas.jl:1131; gemm!(::Char, ::Char, ::Complex{Float64}...
which implies a ~ 40% percent time difference between the two functions.
I tend to believe the result of @btime
is correct since @time
gives consistent results. What happens to the profiler then?
Edit: the behavior is the same if I run the profiler twice.