# Help with performance improvements with repeated computations

Guys,

I have a question about how “smart” the Julia compiler is. Sorry if this has already been answered; I could not find it, maybe because I do not even know the English term that describes what I want to say. In my code, I sometimes have a lot of repeated operations like:

```julia
g211 =    3.6160 -  13.2470*e_0 + 16.29000*e_0^2
g310 = - 19.3020 + 117.3900*e_0 - 228.4190*e_0^2 + 156.5910*e_0^3
g322 = - 18.9068 + 109.7927*e_0 - 214.6334*e_0^2 + 146.5816*e_0^3
g410 = - 41.1220 + 242.6940*e_0 - 471.0940*e_0^2 + 313.9530*e_0^3
g422 = - 146.407 + 841.8800*e_0 - 1629.014*e_0^2 + 1083.435*e_0^3
g520 = - 532.114 + 3017.977*e_0 - 5740.032*e_0^2 + 3708.276*e_0^3
```

Question: is the compiler smart enough to see that all those `^` are the same operation, or should I compute the power once and store it in a variable?
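By “store in a variable” I mean something like this sketch of the first two coefficients above (the value of `e_0` is arbitrary, just for illustration):

```julia
# Sketch: hoist the shared powers of e_0 into variables and reuse them.
e_0 = 0.1                # arbitrary eccentricity, for illustration only
e2  = e_0^2              # computed once
e3  = e_0*e2             # reuses e2
g211 =    3.6160 -  13.2470*e_0 + 16.29000*e2
g310 = - 19.3020 + 117.3900*e_0 - 228.4190*e2 + 156.5910*e3
```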

The same happens for a lot of `sin` and `cos`.

Many times the compiler will put this in another form that reduces the unnecessary computations. But that isn’t really what you’re looking for here: this way of computing polynomials is not numerically stable, so you need to be careful, especially when some of the coefficients sum to almost zero. You’ll want to use `@evalpoly` instead.
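For example, a minimal sketch of the first coefficient from the snippet above rewritten with `@evalpoly` (which expands to Horner’s method):

```julia
# Sketch: g211 as a Horner-form polynomial in e_0 via @evalpoly.
# @evalpoly(x, c0, c1, c2) evaluates c0 + c1*x + c2*x^2 without
# recomputing powers of x.
g211(e_0) = @evalpoly(e_0, 3.6160, -13.2470, 16.29000)
```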

3 Likes

Here’s a reference on the computation reduction: it’s called the common subexpression elimination pass in LLVM.

http://transit.iut2.upmf-grenoble.fr/cgi-bin/dwww/usr/share/doc/llvm-3.8-doc/html/tutorial/OCamlLangImpl4.html#llvm-optimization-passes

1 Like

Thanks @ChrisRackauckas, you were very helpful!

Well, it seems that Julia does not fully optimize those repeated expressions. For example:

```julia
function test1(a::Number)
    b = 1 + a + a^2 + a^3 + a^4
    c = 2 + 2a + 2a^2 + 2a^3 + 2a^4
    d = 3 + 3a + 3a^2 + 3a^3 + 3a^4
end

function test2(a::Number)
    a2 = a*a
    a3 = a*a*a
    a4 = a*a*a*a
    b = 1 + a + a2 + a3 + a4
    c = 2 + 2a + 2a2 + 2a3 + 2a4
    d = 3 + 3a + 3a2 + 3a3 + 3a4
end

function test3(a::Number)
    b = @evalpoly a 1 1 1 1 1
    c = @evalpoly a 2 2 2 2 2
    d = @evalpoly a 3 3 3 3 3
end
```

We have

```
julia> @benchmark test1(10.0)
BenchmarkTools.Trial:
memory estimate:  0 bytes
allocs estimate:  0
--------------
minimum time:     20.082 ns (0.00% GC)
median time:      20.165 ns (0.00% GC)
mean time:        20.371 ns (0.00% GC)
maximum time:     83.892 ns (0.00% GC)
--------------
samples:          10000
evals/sample:     997

julia> @benchmark test2(10.0)
BenchmarkTools.Trial:
memory estimate:  0 bytes
allocs estimate:  0
--------------
minimum time:     3.195 ns (0.00% GC)
median time:      3.465 ns (0.00% GC)
mean time:        3.513 ns (0.00% GC)
maximum time:     26.934 ns (0.00% GC)
--------------
samples:          10000
evals/sample:     1000

julia> @benchmark test3(10.0)
BenchmarkTools.Trial:
memory estimate:  0 bytes
allocs estimate:  0
--------------
minimum time:     3.237 ns (0.00% GC)
median time:      3.267 ns (0.00% GC)
mean time:        3.330 ns (0.00% GC)
maximum time:     33.811 ns (0.00% GC)
--------------
samples:          10000
evals/sample:     1000
```

Hence, I will use `@evalpoly` for polynomials, but I have to store the repeated computations for complex expressions like:

```julia
C2 = QOMS2T*ξ^4*nll_0*aux1*
     ( all_0*( 1 + (3/2)η^2 + 4*e_0*η + e_0*η^3 ) +
       3/2*(k_2*ξ)/aux0*(-1/2 + (3/2)θ²)*(8 + 24η^2 + 3η^4) )

C1 = bstar*C2

C3 = (e_0 > 1e-4) ? QOMS2T*ξ^5*A_30*nll_0*AE*sin_i_0/(k_2*e_0) : T(0)

C4 = 2nll_0*QOMS2T*aux2*
     ( 2η*(1 + e_0*η) + (1/2)e_0 + (1/2)η^3 -
       2*k_2*ξ/(all_0*aux0)*
       ( 3*(1 - 3θ²)*(1 + (3/2)η^2 - 2*e_0*η - (1/2)e_0*η^3) +
         3/4*(1 - θ²)*(2η^2 - e_0*η - e_0*η^3)*cos(2*ω_0) ) )
```

In this case, it is better to store the repeated computations.
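A minimal sketch of the hoisting pattern I mean, with hypothetical placeholder values for the orbital elements (only the pattern matters, not the numbers):

```julia
# Sketch: hoist the subexpressions shared by C2 and C4 before using them.
η, e_0 = 0.1, 0.01      # hypothetical values, for illustration only
η²   = η^2              # appears in both C2 and C4
η³   = η*η²
η⁴   = η²*η²
e_0η = e_0*η            # e_0*η also repeats in both expressions
# ...then C2 and C4 are written in terms of η², η³, η⁴, and e_0η.
```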

Am I right or am I missing something?

Do note that `a*a*a` and `a^3` are not equivalent: `a^3` is guaranteed to be exact to machine precision, while `a*a*a` is not, so the latter can be faster than the former. Use `@fastmath` to give the compiler license to perform substitutions that are equivalent only in infinite precision.
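A quick sketch of the difference (the function names are mine, just to illustrate):

```julia
# Sketch: three ways to cube a number.
cube_pow(a)  = a^3              # correctly-rounded power
cube_mul(a)  = a*a*a            # plain multiplications, may differ in the last bit
cube_fast(a) = @fastmath a^3    # compiler may rewrite into multiplications
```

For most inputs all three agree, but `@fastmath` only promises equivalence in infinite precision, so the last bits can differ.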

3 Likes

I might be wrong, but I remember a professor saying (a long time ago) never to use `-ffast-math` for scientific programming. Is that still true today?

Nice!!! Using the `^` makes `test1` and `test2` have the same benchmarks. The compiler is very smart!

If you’re feeling adventurous, you could try out my common-subexpression-elimination tool: https://github.com/rdeits/CommonSubexpressions.jl . It aggressively removes repeated computations by making wild assumptions about your code (specifically, that every function call is pure). It can sometimes do better than the compiler because it’s much easier to assume purity than it is for the compiler to prove purity. Of course, if that assumption is wrong, you will get the wrong answer. Buyer beware.
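If I read the README correctly, usage looks something like this sketch (it assumes the package is installed and that `@cse` is the exported macro):

```julia
using CommonSubexpressions  # assumes the package is installed

x = 0.5

# Sketch: @cse should hoist the repeated sin(x) into a temporary,
# on the assumption that every call is pure.
@cse begin
    b = 1 + sin(x) + sin(x)^2
    c = 2 + sin(x)^2
end
```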

6 Likes

Good! This will be awesome to use in some expressions! I will do some tests here. Thanks for the info!