Guys,
I have a question about how “smart” the Julia compiler is. Sorry if this has already been answered; I could not find it, maybe because I do not even know the English term for what I want to ask.
In my code, sometimes I have a lot of repeated operations like:
            g211 =    3.6160 -  13.2470*e_0 +   16.2900*e_0^2
            g310 = - 19.3020 + 117.3900*e_0 -  228.4190*e_0^2 +  156.5910*e_0^3
            g322 = - 18.9068 + 109.7927*e_0 -  214.6334*e_0^2 +  146.5816*e_0^3
            g410 = - 41.1220 + 242.6940*e_0 -  471.0940*e_0^2 +  313.9530*e_0^3
            g422 = - 146.407 + 841.8800*e_0 - 1629.014*e_0^2  + 1083.435*e_0^3
            g520 = - 532.114 + 3017.977*e_0 - 5740.032*e_0^2  + 3708.276*e_0^3
Question: is the compiler smart enough to see that all those ^ are the same operation, or should I compute the value once beforehand and store it in a variable?
The same happens for a lot of sin and cos.
Many times the compiler will put this in another form that reduces the unnecessary computations. But that isn’t really what you’re looking for: this way of computing polynomials is not numerically stable, so you need to be careful, especially when some of the coefficients sum to almost zero. You’ll want to use @evalpoly instead.
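A minimal sketch with the first polynomial from your post (same coefficients, x standing in for e_0): @evalpoly(x, c0, c1, c2) evaluates c0 + c1*x + c2*x^2 with Horner’s scheme, so no explicit powers are formed.

x = 0.1  # placeholder value

# Naive form: each term computes a power of x separately.
p_naive = 3.6160 - 13.2470*x + 16.2900*x^2

# Horner form: @evalpoly(x, c0, c1, c2) computes c0 + x*(c1 + x*c2),
# which is cheaper and more numerically stable.
p_horner = @evalpoly(x, 3.6160, -13.2470, 16.2900)

p_naive ≈ p_horner  # true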
              Thanks @ChrisRackauckas, you were very helpful!
              Well, it seems that Julia does not fully optimize those repeated expressions. For example:
function test1(a::Number)
    # Powers written out explicitly in every term.
    b = 1 + a + a^2 + a^3 + a^4
    c = 2 + 2a + 2a^2 + 2a^3 + 2a^4
    d = 3 + 3a + 3a^2 + 3a^3 + 3a^4
end
function test2(a::Number)
    # Repeated powers hoisted into local variables by hand.
    a2 = a*a
    a3 = a*a*a
    a4 = a*a*a*a
    b = 1 + a + a2 + a3 + a4
    c = 2 + 2a + 2a2 + 2a3 + 2a4
    d = 3 + 3a + 3a2 + 3a3 + 3a4
end
function test3(a::Number)
    # Horner evaluation with @evalpoly.
    b = @evalpoly a 1 1 1 1 1
    c = @evalpoly a 2 2 2 2 2
    d = @evalpoly a 3 3 3 3 3
end
Benchmarking with BenchmarkTools.jl, we have:
julia> @benchmark test1(10.0)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     20.082 ns (0.00% GC)
  median time:      20.165 ns (0.00% GC)
  mean time:        20.371 ns (0.00% GC)
  maximum time:     83.892 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     997
julia> @benchmark test2(10.0)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     3.195 ns (0.00% GC)
  median time:      3.465 ns (0.00% GC)
  mean time:        3.513 ns (0.00% GC)
  maximum time:     26.934 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
julia> @benchmark test3(10.0)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     3.237 ns (0.00% GC)
  median time:      3.267 ns (0.00% GC)
  mean time:        3.330 ns (0.00% GC)
  maximum time:     33.811 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
Hence, I will use @evalpoly for polynomials, but I have to store the repeated computations for complex expressions like:
    C2 = QOMS2T*ξ^4*nll_0*aux1*
         ( all_0*( 1 + (3/2)*η^2 + 4*e_0*η + e_0*η^3) +
           3/2*(k_2*ξ)/aux0*(-1/2 + (3/2)*θ²)*(8 + 24η^2 + 3η^4) )
    C1 = bstar*C2
    C3 = (e_0 > 1e-4) ? QOMS2T*ξ^5*A_30*nll_0*AE*sin_i_0/(k_2*e_0) : T(0)
    C4 = 2nll_0*QOMS2T*aux2*
         ( 2η*(1+e_0*η) + (1/2)*e_0 + (1/2)*η^3 -
           2*k_2*ξ/(all_0*aux0)*
                (3*(1-3θ²)*(1 + (3/2)*η^2 - 2*e_0*η - (1/2)*e_0*η^3) +
                3/4*(1-θ²)*(2η^2 - e_0*η - e_0*η^3)*cos(2*ω_0)))
In this case, it is better to store the repeated computations.
Am I right or am I missing something?
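For concreteness, here is a rough sketch of the manual hoisting I mean (the names follow the snippet above, so it is not runnable on its own; it just shows the pattern):

# Hoist the repeated subexpressions into locals once, then reuse them.
η²  = η*η
η³  = η²*η
η⁴  = η²*η²
e0η = e_0*η   # e_0*η appears in several terms

C2 = QOMS2T*ξ^4*nll_0*aux1*
     ( all_0*(1 + (3/2)*η² + 4*e0η + e_0*η³) +
       3/2*(k_2*ξ)/aux0*(-1/2 + (3/2)*θ²)*(8 + 24η² + 3η⁴) )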
Do note that a*a*a and a^3 are not equivalent: a^3 is guaranteed to be correctly rounded to machine precision, while a*a*a is not, and so the latter is faster than the former. Use @fastmath to give the compiler license to perform substitutions that are equivalent in infinite precision.
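For example, a minimal sketch (the same test1 as above, just wrapped in @fastmath), assuming the relaxed floating-point semantics are acceptable for your application:

function test1_fast(a::Number)
    # @fastmath lets the compiler turn a^2, a^3, a^4 into repeated
    # multiplications and share them between terms, trading bit-exact
    # reproducibility for speed.
    @fastmath begin
        b = 1 + a + a^2 + a^3 + a^4
        c = 2 + 2a + 2a^2 + 2a^3 + 2a^4
        d = 3 + 3a + 3a^2 + 3a^3 + 3a^4
    end
end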
I might be wrong, but I remember a professor saying (a long time ago) to never use -ffast-math for scientific programming. Is that still true today?
Nice!!! Using @fastmath with the ^ version makes test1 and test2 give the same benchmarks. The compiler is very smart!
              If you’re feeling adventurous, you could try out my common-subexpression-elimination tool: https://github.com/rdeits/CommonSubexpressions.jl . It aggressively removes repeated computations by making wild assumptions about your code (specifically that every function call is pure). It can sometimes do better than the compiler because it’s much easier to assume purity than for the compiler to prove purity. Of course, if that assumption is wrong you will get the wrong answer. Buyer beware.
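A minimal sketch of the usage (going from the package README; @cse is the exported macro, and sin(x) is assumed to be pure here):

using CommonSubexpressions

function cse_demo(x)
    # @cse rewrites the block so that the repeated subexpression
    # sin(x) is computed once and reused.
    @cse begin
        a = sin(x) + 1
        b = sin(x) * 2
        a + b
    end
end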
Good! This will be awesome to use in some expressions! I will run some tests here. Thanks for the info!