It’s not that Julia never does CSE (common subexpression elimination), but it’s not very good at it right now.
julia> code_llvm(x -> exp(abs(x)) / (1+exp(abs(x))), Tuple{Float64}; debuginfo=:none)
define double @"julia_#36_2009"(double %0) #0 {
top:
%1 = call double @llvm.fabs.f64(double %0)
%2 = call double @j_exp_2011(double %1) #0
%3 = call double @j_exp_2011(double %1) #0
%4 = fadd double %3, 1.000000e+00
%5 = fdiv double %2, %4
ret double %5
}
Notice that the call to abs was CSE’d, but the call to exp was not. I believe that the compiler is not yet able to recognize what exp can and can’t do to the program state. Absent that understanding, it must call it twice to ensure any side effects occur. This might improve with ongoing work on the effects system, but really this is all a bit beyond me to fully understand (much less explain). In particular, abs and exp have the same inferred effects, so I can’t tell you why one was CSE’d and the other was not.
julia> Base.infer_effects(abs, Tuple{Float64})
(+c,+e,+n,+t,+s,+m,+i)
julia> Base.infer_effects(exp, Tuple{Float64})
(+c,+e,+n,+t,+s,+m,+i)
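For contrast, here is a rough sketch of the same query on a function that obviously does have a side effect. The name noisy_identity is made up for illustration, and Base.infer_effects is an internal, unexported helper whose printed format changes between Julia versions, so I am not showing expected output.

```julia
# `noisy_identity` is a hypothetical function: it prints, so it is not side-effect free.
noisy_identity(x) = (println(x); x)

# Internal helper; the exact printed form depends on the Julia version.
effects_abs   = Base.infer_effects(abs, Tuple{Float64})
effects_noisy = Base.infer_effects(noisy_identity, Tuple{Float64})

# Display both so the flags can be compared by eye.
@show effects_abs
@show effects_noisy
```

The point is only that a printing function should come back with weaker effects than abs, whereas abs and exp come back identical.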
If we inspect the un-optimized LLVM:
julia> code_llvm(x -> exp(abs(x)) / (1+exp(abs(x))), Tuple{Float64}; debuginfo=:none, optimize=false)
define double @"julia_#58_2531"(double %0) #0 {
top:
%1 = call {}*** @julia.get_pgcstack()
%2 = bitcast {}*** %1 to {}**
%current_task = getelementptr inbounds {}*, {}** %2, i64 -13
%3 = bitcast {}** %current_task to i64*
%world_age = getelementptr inbounds i64, i64* %3, i64 14
%4 = call double @llvm.fabs.f64(double %0)
%5 = call double @j_exp_2533(double %4) #0
%6 = call double @llvm.fabs.f64(double %0)
%7 = call double @j_exp_2533(double %6) #0
%8 = fadd double 1.000000e+00, %7
%9 = fdiv double %5, %8
ret double %9
}
we see that the second abs call is still there. My wild speculation is that CSE is handled by LLVM and that Julia’s effects analysis is not yet (as of v1.9) integrated with LLVM, so LLVM does not currently recognize the CSE opportunity with exp.
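Until the compiler does this on its own, the practical workaround is to do the CSE by hand. A minimal sketch, with f_naive and f_hoisted being names I made up for this post:

```julia
# Original form: exp(abs(x)) is written (and, per the LLVM above, called) twice.
f_naive(x) = exp(abs(x)) / (1 + exp(abs(x)))

# Manually hoisted form: the shared subexpression is computed once and reused.
function f_hoisted(x)
    e = exp(abs(x))      # single call to exp
    return e / (1 + e)
end
```

Both versions should return the same value for any Float64 input. You can run code_llvm on f_hoisted to check how many exp calls remain on your Julia version.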
EDIT: I now see that the issue linked in the original post has a bit more discussion of this topic.
As for code_lowered, it is just a reorganized version of the source; no type inference or optimization has run at that stage. For example, the following calls a function nonexistent that doesn’t even exist. How could lowering know to eliminate the second call without even knowing whether nonexistent is a real function? We could define nonexistent to mutate the global RNG, print, change processor flags, spawn tasks, or have all sorts of other side effects.
julia> code_lowered(x -> nonexistent(abs(x)) / (1+nonexistent(abs(x))), Tuple{Float64})
1-element Vector{Core.CodeInfo}:
CodeInfo(
1 ─ %1 = Main.abs(x)
│ %2 = Main.nonexistent(%1)
│ %3 = Main.abs(x)
│ %4 = Main.nonexistent(%3)
│ %5 = 1 + %4
│ %6 = %2 / %5
└── return %6
)
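To make the side-effect point concrete, here is a small sketch where eliminating the second call would change what the program does. The names CALLS, counted, and g are hypothetical, chosen just for this example.

```julia
# Global counter that records how many times `counted` has run.
const CALLS = Ref(0)

# Hypothetical stand-in for `nonexistent`: returns its argument,
# but also bumps the counter as a visible side effect.
function counted(x)
    CALLS[] += 1
    return x
end

# Same shape as the expression above, with `counted` in place of `nonexistent`.
g(x) = counted(abs(x)) / (1 + counted(abs(x)))

result = g(-2.0)   # 2.0 / (1 + 2.0)
ncalls = CALLS[]   # 2: both calls ran
```

If the two calls were merged into one, the counter would end at 1 instead of 2, so the transformation is only valid once the compiler has proven that the callee has no such effects.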