function tax_labor(τ0_y::Real,τ1_y::Real,ρ_τ::Real,barρ::Real,earn::Real,rent::Real)
    τ0_y*(earn - min(ρ_τ*rent,barρ))^(1.0-τ1_y)
end
If I do @btime, I get 9.6 ns. Now, if I do the following:
function tax_labor(τ0_y::Real,τ1_y::Real,ρ_τ::Real,barρ::Real,earn::Real,rent::Real)
    return τ0_y*(earn)^(1.0-τ1_y)
end
I get 1.20 ns. So apparently, one extra multiplication makes this function roughly eight times slower. Does this make sense to you? Am I doing something wrong? This might seem trivial, but I call this function a lot, so its runtime matters a great deal for the performance of the rest of the code. Thanks
You also have a min() call in the first function, which does a comparison and selects the smaller value, plus an extra subtraction, so the difference is not just “1 multiplication”.
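To make that concrete, here is the first function decomposed step by step (a sketch; the helper name is hypothetical):

```julia
# Relative to the second function, the extra work is a multiplication,
# a min (compare-and-select), and a subtraction — not just one multiplication.
function tax_labor_steps(τ0_y, τ1_y, ρ_τ, barρ, earn, rent)
    cap  = ρ_τ * rent        # extra multiplication
    ded  = min(cap, barρ)    # extra compare-and-select
    base = earn - ded        # extra subtraction
    return τ0_y * base^(1.0 - τ1_y)
end
```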
Benchmarking is tricky to do right. It seems you may have benchmarked your functions with the inputs as compile-time constants, which gave you a misleading result. Here’s a more representative benchmark (assuming none of the function arguments are known at compile time):
function tax_labor1(τ0_y::Real,τ1_y::Real,ρ_τ::Real,barρ::Real,earn::Real,rent::Real)
    τ0_y*(earn - min(ρ_τ*rent,barρ))^(1.0-τ1_y)
end
function tax_labor2(τ0_y::Real,τ1_y::Real,ρ_τ::Real,barρ::Real,earn::Real,rent::Real)
    return τ0_y*(earn)^(1.0-τ1_y)
end
let
    # Wrap each value in a Ref, interpolate with $, and dereference with []
    # so the compiler cannot treat the arguments as constants.
    a, b, c, d, e, f = Ref.(rand(6))
    @btime tax_labor1($a[], $b[], $c[], $d[], $e[], $f[])
    @btime tax_labor2($a[], $b[], $c[], $d[], $e[], $f[])
end;
18.967 ns (0 allocations: 0 bytes)
17.465 ns (0 allocations: 0 bytes)
As you can see here, the difference in runtimes is, proportionally, not so great.
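For comparison, this is roughly what the misleading version of the benchmark looks like (a sketch; literal arguments are visible to the compiler, which may constant-fold part or all of the call and report an unrealistically small time):

```julia
using BenchmarkTools

# Literal arguments are compile-time constants: the compiler may fold the
# whole call away, so the reported time can be meaninglessly small.
@btime tax_labor1(0.3, 0.1, 0.5, 1.0, 2.0, 3.0)

# The Ref-and-interpolate pattern hides a value from the compiler,
# so the benchmark measures a real runtime call.
e = Ref(2.0)
@btime tax_labor1(0.3, 0.1, 0.5, 1.0, $e[], 3.0)
```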
Timings are also value-dependent. I don’t see the difference you are getting between the functions, but plugging in some made-up numbers, I can see 10:1 timing differences depending on those values.
EDIT: I wasn’t clear. I’m not seeing a 10:1 difference between the functions, but I can make both go faster or slower depending on the arguments. I’m guessing that I’m triggering different paths in the exponentiation function.
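That guess can be probed by timing `^` alone with different exponents (a sketch; which exponent values hit which internal branches is implementation-dependent):

```julia
using BenchmarkTools

# Float64 ^ Float64 takes different internal paths depending on the values,
# so the same expression can be faster or slower with different inputs.
x  = Ref(1.7)
p1 = Ref(2.0)    # exponent values like this may hit a cheap special case
p2 = Ref(0.73)   # a generic exponent typically goes through the full pow path
@btime $x[] ^ $p1[]
@btime $x[] ^ $p2[]
```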
Just want to point out that even if the timing difference is 10:1 here, the slow version still only takes 10 ns (or, more likely, 20 ns, as Mason showed). Even if you call this function millions of times, I am 99% sure it is not the bottleneck in your code. If your code is slow, start with changes that can yield meaningful performance improvements first (reducing allocations, removing type instabilities, etc.).
Ok, I confirm this is the case in my own code. Thanks. Can you tell me where I can read up on this? Like, what is a compile-time constant? What is going on under the hood here?
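The “compile-time constant” effect can be observed directly with Julia’s introspection tools (a sketch, using a hypothetical wrapper `g` whose arguments are all literals):

```julia
# When every argument is a literal, the compiler can propagate the constants
# and potentially evaluate the whole body at compile time (constant folding).
g() = 0.3 * (2.0 - min(0.5 * 3.0, 1.0))^(1.0 - 0.1)
@code_typed g()   # if folding happened, the body is just `return <constant>`
```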