Multiplication performance

Hey guys,

Here is a very simple function I’m using:

```julia
function tax_labor(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    τ0_y * (earn - min(ρ_τ * rent, barρ))^(1.0 - τ1_y)
end
```

If I time it with `@btime`, I get 9.6 ns. Now, if I do the following instead:

```julia
function tax_labor(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    return τ0_y * earn^(1.0 - τ1_y)
end
```

I get 1.20 ns. So apparently, doing 1 multiplication increases the computation time of this function about 8-fold (9.6 ns vs. 1.20 ns). Does this make sense to you? Am I doing something wrong? This might seem trivial, but I call this function a lot, so computation times are very important for the performance of the rest of the code. Thanks

You also have a `min()` call in the first function, which does a comparison and selects the smaller value, plus an extra `-`, so the difference is more than just “1 multiplication”.
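To make the extra work concrete, here is a sketch that splits the first version into named steps. The names `tax_labor_steps`, `deduction` and `taxable` are made up for illustration; the arithmetic is unchanged.

```julia
# Same arithmetic as the first tax_labor, split into named steps.
# The names tax_labor_steps, deduction and taxable are illustrative only.
function tax_labor_steps(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    deduction = min(ρ_τ * rent, barρ)     # extra: one multiplication, one compare-and-select
    taxable = earn - deduction            # extra: one subtraction
    return τ0_y * taxable^(1.0 - τ1_y)    # shared with the second version: one power, one multiplication
end
```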

Benchmarking is tricky to do right. It seems you might have benchmarked your functions with the inputs as compile-time constants, which gave you a misleading result. Here’s a more representative benchmark (assuming none of the function arguments are known at compile time):

```julia
using BenchmarkTools

function tax_labor1(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    τ0_y * (earn - min(ρ_τ * rent, barρ))^(1.0 - τ1_y)
end

function tax_labor2(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    return τ0_y * earn^(1.0 - τ1_y)
end

let
    # Wrap each input in a Ref so the compiler cannot treat it as a constant
    a, b, c, d, e, f = Ref.(rand(6))

    @btime tax_labor1($a[], $b[], $c[], $d[], $e[], $f[])
    @btime tax_labor2($a[], $b[], $c[], $d[], $e[], $f[])
end;

# Results:
#   18.967 ns (0 allocations: 0 bytes)
#   17.465 ns (0 allocations: 0 bytes)
```

As you can see here, the difference in runtimes is, proportionally, not so great.

Timings are also value dependent. I don’t get the difference you are getting between the functions but I just made up some numbers to plug in and I can see 10:1 timing differences depending on those values.

EDIT: I wasn’t clear. I’m not seeing a 10:1 difference between the functions, but I can make both go faster or slower depending on the arguments. I’m guessing that I’m triggering different paths in the exponentiation function.
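If you want to check this on your own machine, a rough sketch like the following times `^` for a few exponents using only Base. `time_pow` is a hypothetical helper, not part of any package, and a hand-rolled loop like this is much cruder than `@btime`, so treat the numbers as indicative only. Which exponents (if any) hit a faster path depends on the Julia version and hardware.

```julia
# Hypothetical helper: average time in ns of x^p over the bases in xs.
# p is passed in a Ref so its value is only read at run time.
function time_pow(xs::Vector{Float64}, p::Base.RefValue{Float64})
    acc = 0.0                      # accumulate results so the loop is not optimized away
    t0 = time_ns()
    for x in xs
        acc += x^p[]
    end
    elapsed = time_ns() - t0
    return elapsed / length(xs), acc
end

xs = rand(10^6) .+ 0.5             # positive bases, so fractional powers are well defined
time_pow(xs, Ref(0.9))             # warm-up call, so compilation is not part of any timing

for p in (0.9, 0.5, 1.0, 2.0)
    ns, _ = time_pow(xs, Ref(p))
    println("exponent ", p, ": ", round(ns; digits = 2), " ns per call")
end
```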

@Joao_Barata if you haven’t already, I’d recommend reading https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md#understanding-compiler-optimizations

Just want to point out that even if the timing difference is 10:1 here, the slow version still only takes 10ns (or likely 20ns as Mason showed). Even if you call this millions of times, I am 99% sure this is not the bottleneck in your code. If your code is slow, start with something where you can generate meaningful performance improvements first (reducing allocations, removing type instabilities, etc.).
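As a starting point for that kind of check, here is a minimal sketch using only Base reflection. `unstable_tax` and the global `rate` are made-up names that illustrate a type instability; interactively, `@code_warntype` gives the same information in more detail.

```julia
# Copy of the simplified function from the thread.
function tax_labor2(τ0_y::Real, τ1_y::Real, ρ_τ::Real, barρ::Real, earn::Real, rent::Real)
    return τ0_y * earn^(1.0 - τ1_y)
end

rate = 0.1                               # non-const global: its type may change at any time
unstable_tax(earn) = earn^(1.0 - rate)   # hypothetical function that reads the global

args = ntuple(_ -> Float64, 6)
println(Base.return_types(tax_labor2, args))          # a concrete type means inference succeeded
println(Base.return_types(unstable_tax, (Float64,)))  # Any signals a type instability

tax_labor2(0.9, 0.1, 0.2, 0.5, 2.0, 1.0)                      # warm-up call (compiles the method)
println(@allocated tax_labor2(0.9, 0.1, 0.2, 0.5, 2.0, 1.0))  # bytes allocated by one call
```

Making `rate` a `const`, or passing it as an argument, is the usual way to remove this particular instability.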

OK, I can confirm this is the case in my own code. Thanks. Can you tell me where I can read up on this? For example, what is a compile-time constant? What is going on under the hood here?

EDIT: saw your recommendation. Thanks

Good point. In this case, we are talking about something that is called hundreds of millions of times. Possibly more. But still, I take your point.

You can check out the docs of BenchmarkTools.jl, the ref-trick is mentioned in the “Quick Start” section in the readme: https://github.com/JuliaCI/BenchmarkTools.jl
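For a rough picture of what “compile-time constant” means, here is a small sketch using only Base. `scale`, `with_literal` and `with_ref` are made-up names; the printed IR differs between Julia versions, so no output is shown.

```julia
scale(x) = 2.0 * x + 1.0

# The argument is a literal, so the compiler knows it while compiling
# and is free to evaluate the whole call ahead of time.
with_literal() = scale(3.0)

# The argument is loaded from a Ref, so its value is only known at run time
# and the arithmetic has to be performed on every call.
with_ref(r::Base.RefValue{Float64}) = scale(r[])

# Inspect the optimized code for each version.
println(code_typed(with_literal, Tuple{}))
println(code_typed(with_ref, Tuple{Base.RefValue{Float64}}))
```

If the first listing collapses to a bare constant while the second still contains the multiply and add, that is the effect the `Ref` trick in the benchmark earlier in the thread is meant to rule out: timing a call whose answer was already computed during compilation.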
